DSA_Troubleshooting_Course_Student_Guide_Mar89 DSA Troubleshooting Course Student Guide Mar89
Page Count: 644
DSA TROUBLESHOOTING COURSE STUDENT GUIDE

This manual is used as a student guide for the DSA Troubleshooting course. Much of the material contained herein may be used as reference material by the field engineer while troubleshooting problems in the field. The field engineer, at his/her discretion, may remove any or all parts of this manual and incorporate them into other resource documents or notebooks.

This manual is the property of DIGITAL EQUIPMENT CORPORATION and is considered for DIGITAL INTERNAL USE ONLY.

Digital Equipment Corporation makes no representation that use of its products with those of other manufacturers will not infringe existing or future patent rights. The descriptions contained herein do not imply the granting of a license to make, use, or sell equipment or software as described in this manual.

Digital Equipment Corporation assumes no responsibility or liability for the proper performance of other manufacturers' products used with its products.

Digital Equipment Corporation believes that information in this publication is accurate as of its publication date. Such information is subject to change without notice. Digital Equipment Corporation is not responsible for any inadvertent errors.

Class A Computing Devices:

NOTICE: This equipment generates, uses, and may emit radio frequency energy. It has been tested and found to comply with the limits for a Class A computing device pursuant to Subpart J of Part 15 of FCC rules for operation in a commercial environment. This equipment, when operated in a residential area, may cause interference to radio/TV communications. In such event the user (owner), at his/her own expense, may be required to take corrective measures.

Revision/Update Information: Version 1.0, March 1989

This is the first document release from CX/CSSE and supersedes all previous versions used in preliminary DSA Troubleshooting Courses.
All revisions and known error corrections up to March 1989 have been incorporated into this manual.

CX/CSSE 1st Preliminary, 1986
CX/CSSE Version 1, March 1989

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license.

No responsibility is assumed for the use or reliability of software on equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.

Copyright © March 1989 by Digital Equipment Corporation. All Rights Reserved. Printed in U.S.A.

The postpaid READER'S COMMENTS form on the last page of this document requests the user's critical evaluation to assist in preparing future documentation.

The following are trademarks of Digital Equipment Corporation:

DEC, DEC/CMS, DEC/MMS, DECnet, DECsystem-10, DECSYSTEM-20, DECUS, DECwriter, DIBOL, EduSystem, IAS, MASSBUS, PDP, UNIBUS, VAX, VAXcluster, VMS, VT, PDT, RSTS, RSX, and the DIGITAL logo.

The following are also trademarks of Digital Equipment Corporation:

CI, DDCMP, DDIF, DEBET, DSA, DECconnect, DECdirect, DECdisk, DECmail, DECmat, DECmate, DECnet/E, DECnet-RT, DECnet-ULTRIX, DECserver, DECservice, DECtape, DELNI, DELUA, DEMPR, DEQNA, DESTA, DEUNA, OMS, DRB32, DSRVB-M, HSC, IVIS, KA10, KD11, KDA50-Q, KDB50-A, KDB50-B, KI, KL10, KS10, LA50, LN01, LN03, MicroPDP-11, MicroVAX, MicroVMS, MSCP, PDP-11, Q-bus, RA60, RA70, RA80, RA81, RA82, RA90, RC25, RQDX3, RMS-11, RSX-11, RSX-11M, RSX-11S, RX33, SA482, SA550, SA600, SA650, TA78-81, TMS-11, TK50, TOPS-10, TOPS-20, ULTRIX-32, TU78:81, UDA50, UETP, ULTRIX, ULTRIX-11, VAXELN, VAX/VMS, VAXsimPLUS, VMS.

This document was prepared using VAX DOCUMENT, Version 1.0.

DSA TROUBLESHOOTING COURSE
Lab Exercise #1
BLOCK.COM
Digital Internal Use Only

BLOCK.COM Lab Exercise 1

  1.  Log into your STUDENTx account.
  2.  $ SET DEF [STUDENTx.MISC]        (x = your student account #)
  3.  $ SET TERMINAL/INQUIRE
  4.  $ @BLOCK

STEP  BLOCK.COM Program Prompt                            Your Response

  5   Would you like help ?                               Y
  6   Create an output file of your conversion Y[N] ?     Y
  7   What type of DISK would you like ?                  RA87
        Note the response and "allowed selections."
  8   What type of DISK would you like ?                  RA81
  9   What formatted mode would you like to convert ?     H
        Note that help may be obtained as needed.
 10   What formatted mode would you like to convert ?     16
 11   What TYPE of block would you like to convert ?      H
 12   What TYPE of block would you like to convert ?      LBN
 13   Select Group Offset ?                               RETURN
        Pressing RETURN selects the normal (default) group offset for the
        selected drive type. Changing the group offset is normally only
        done during disk engineering development and new product testing.
 14   Would you like a Status Map displayed Y[N] ?        Y
 15   Select the desired LBN number(s) ?                  2498
        Compare the results with the sample calculation shown in the DSDF
        section of the Student Reference Manual for this LBN on a 16-bit
        RA81. See LBN conversion example #2.
 16   Select the desired LBN number(s) ?                  2499
        Compare the results with the sample calculation shown in the DSDF
        section of the Student Reference Manual for this LBN on a 16-bit
        RA81. See LBN conversion example #3.
 17   Select the desired LBN number(s) ?                  2498,2499
        Be sure a comma separates the numbers.
 18   Select the desired LBN number(s) ?                  2490:2500
        Be sure a colon separates the numbers.
 19   Select the desired LBN number(s) ?                  2500:2490
        Note that the conversions are performed in ascending order
        regardless of the order in which the block numbers are entered
        into the program.
 20   Select the desired LBN number(s) ?                  H
        Note the various ways that block numbers may be entered into the
        program. Entries may contain a mix of numbers using commas and/or
        colons.
 21   Select the desired LBN number(s) ?                  40:60
        Watch the progression of numbers in the SECTOR # column and the
        SECTOR # FROM INDEX column (physical sector). Note that when the
        HEAD number changes from 0 to 1 there is a 14-sector shift in the
        SECTOR # FROM INDEX column. This is due to the group offset
        (RA81 = 14) when switching groups (heads in an RA81).
 22   Select the desired LBN number(s) ?                  RETURN
        Press RETURN until you are prompted to select the disk type.
 23   Select RA60, 16-bit, LBN 6000.
        Compare the results with the sample calculation shown in the DSDF
        section of the Student Reference Manual for this LBN on a 16-bit
        RA60. See LBN conversion example #5.
 24   Select LBNs in the RCT and HOST areas (use MAP).
        The map may be obtained by typing "M" in response to the prompt
        that requests the block numbers for the type of block you selected
        (LBN, RBN, etc.).
 25   Select RA82. (Note that the RA82 is only capable of 16-bit mode.)
 26   Select other disk types and modes.
 27   Select RBN. Look at the map while RBN is selected. Enter some numbers
      and review the results.
 28   Select DBN. Look at the map while DBN is selected. Enter some numbers
      and review the results. Note the cylinders that contain these blocks.
 29   Select XBN. Look at the map while XBN is selected. Enter some numbers
      and review the results.
 30   Use the H (help) feature for various prompts.
 31   Deliberately enter erroneous information in some of the responses:
        enter an invalid mode number
        enter an invalid block type
        enter block numbers that are invalid or too large
 32   Exit the program.                                   EXIT
 33   $ TYPE BLOCK.DAT
        Note that errors are not included in the output and that the header
        page is different wherever you changed parameters while executing
        the program.
 34   Print the file BLOCK.DAT to obtain a hardcopy for review.
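Steps 17 through 21 above exercise two ideas: how block numbers may be entered (single numbers, comma lists, colon ranges, always processed in ascending order) and how the group offset shifts the physical sector when the head changes. The Python sketch below illustrates both. It is not BLOCK.COM and does not claim to reproduce its calculation. Only the RA81 group offset of 14 comes from the exercise text; the other geometry values (51 LBNs per track, 52 physical sectors per track, 14 heads per cylinder) and the way the offset is applied are assumptions made for illustration and should be checked against the DSDF section of the Student Reference Manual. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    lbns_per_track: int       # ASSUMED value, not from the exercise text
    sectors_per_track: int    # ASSUMED physical sectors per track
    heads_per_cylinder: int   # ASSUMED value
    group_offset: int         # RA81 = 14 according to step 21


# Hypothetical 16-bit RA81 parameters; verify before relying on them.
RA81_16BIT = Geometry(lbns_per_track=51, sectors_per_track=52,
                      heads_per_cylinder=14, group_offset=14)


def parse_selection(text):
    """Expand entries such as '2498,2499' or '2500:2490' into an
    ascending list of unique block numbers."""
    blocks = set()
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            first, last = (int(part) for part in item.split(":", 1))
            low, high = sorted((first, last))
            blocks.update(range(low, high + 1))
        else:
            blocks.add(int(item))
    return sorted(blocks)


def convert_lbn(lbn, geo=RA81_16BIT):
    """Return (cylinder, head, logical sector, sector from index).

    ASSUMPTION: the physical position is the logical sector shifted by
    head * group_offset, wrapped at the physical sectors per track.
    """
    track, sector = divmod(lbn, geo.lbns_per_track)
    cylinder, head = divmod(track, geo.heads_per_cylinder)
    from_index = (sector + head * geo.group_offset) % geo.sectors_per_track
    return cylinder, head, sector, from_index


if __name__ == "__main__":
    for lbn in parse_selection("51:49, 2499, 2498"):
        print(lbn, convert_lbn(lbn))
```

Under these assumed values the head changes between LBN 50 and LBN 51, and the sector-from-index value for LBN 51 is 14, which is the kind of shift step 21 asks you to look for.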
BLOCK.COM NOTES

Versions of BLOCK.COM prior to V3.5 are obsolete and have some calculation errors.

BLOCK.COM is distributed with VAXSIM-Plus using the file name VAXSIM$LBN.COM. The versions of this conversion utility that were released with VAXSIM-Plus versions 1.0 and 1.1 do not support the RA90. Version 1.2 of VAXSIM-Plus contains VAXSIM$LBN.COM version 3.5, which has the latest corrections and will support the RA90. In the meantime, use BLOCK.COM version 3.5 to support RA90 troubleshooting.

DSA TROUBLESHOOTING COURSE
Lab Exercise #2
DKUTIL - For execution on an RA70 Disk Drive -

DKUTIL Execution on an RA70 Disk Drive, Lab Exercise 2

  1   Save hardcopy of all your activities for reference and questions
      during subsequent discussion in lecture.
  2   RUN DD1:DKUTIL (HSC50)
      RUN DKUTIL (HSC70)
        Select the target disk that you have been assigned. Version 350 of
        HSC code will prompt you for the disk device number (Dxx). Version
        370 and later HSC code will provide you with a message indicating
        that you will need to enter an additional command (GET Dxx) to
        select the disk unit number.
  3   Verify that the mode is correct and the FCT is VALID from the display.

Use your DKUTIL user guide (in the Student Manual) as reference for the following steps.

STEP  Command(s) To Enter
        Notes

  4   DISPLAY CHARA DISK
        Compare the results to the drive characteristics in your Student
        Manual for this drive type.
  5   DISPLAY ERRORS
        On RA80/81/82, this displays the contents of the internal error
        silo, which contains the last 16 drive-detected error codes. For
        later drives (RA70, RA90, etc.) this command will dump and format
        the drive internal error log. For the RA60, an error message will
        result since the RA60 does not support internal error logging or
        error silos.
  6   DISPLAY RCT
        Note the LBNs that are replaced and which RBNs they are revectored
        to. The asterisk (*) next to an entry indicates nonprimary
        replacement. Otherwise, the replacement is primary.
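The error silo described in step 5 above behaves like a fixed-depth buffer that keeps only the most recent entries. The Python sketch below illustrates that behavior with a 16-entry buffer. The class, its method names, and the sample codes are hypothetical, and the sketch makes no claim about how the drive firmware actually implements the silo.

```python
from collections import deque


class ErrorSilo:
    """Illustrative fixed-depth log: once full, the oldest code is dropped."""

    def __init__(self, depth=16):
        # deque(maxlen=...) discards from the opposite end when it overflows
        self._codes = deque(maxlen=depth)

    def record(self, code):
        self._codes.append(code)

    def contents(self):
        # Oldest surviving entry first, newest last
        return list(self._codes)


silo = ErrorSilo()
for code in range(1, 21):   # 20 made-up error codes, numbered 1..20
    silo.record(code)

print(silo.contents())      # only the 16 most recent codes remain
```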
  7   DISPLAY RCT/FULL
        Note the additional information that is displayed from RCT block 0
        when using the /FULL modifier.
  8   DISP FCT
        Note the PBNs (physical block numbers) and which logical blocks
        they describe that should be considered bad and replaced when
        formatting the disk. The codes in parentheses represent the reasons
        why the block is bad and how it is to be treated. This information
        is in the FCT section of the Student Manual.
  9   DISPLAY ALL
        The information provided here is the total accumulation of data
        that would be obtained by individually entering each of the
        previous DISPLAY commands.
 10   DISPLAY CHARA LBN 100
        Note the header information that is supplied. Consult your
        instructor if you have any questions concerning the format of the
        header information. This display provides translation of an LBN
        address (100 in this example) into cylinder, group, track, and
        position. Position is the physical sector from index.
 11   DISPLAY CHARA DBN 2
 12   DISPLAY CHARA RBN 24
 13   DISPLAY CHARA XBN 400
 14   DUMP LBN 100/ALL
        Note the contents of the data, the four copies of the header, and
        the calculated EDC difference. Note the header code.
 15   DUMP/ALL DBN 123
        Note the contents of the data, the four copies of the header, and
        the calculated EDC difference. Note the header code and how it
        differs from that of an LBN.

Using the RCT display obtained from step 6 above, select an RBN number that is not being used for replacement and use that RBN as part of the following command:

 16   DUMP/ALL RBN xxxx
        Note the contents of the data, the four copies of the header, and
        the calculated EDC difference. Note the header code. Note that the
        EDC is inverted, indicating a forced error. Note the data pattern.
        This is the DEC Standard Format Data Pattern.
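Step 16 above notes that an inverted EDC indicates a forced error. One way to picture the "calculated EDC difference" is as the XOR of the EDC read from the block and the EDC recomputed from the data: zero when the two agree, all ones when the stored value is the exact inverse. The Python sketch below classifies a 16-bit difference on that basis. Both the XOR reading of "difference" and the 16-bit width are assumptions, and `toy_edc` is a stand-in checksum invented for the example; it is not the DSA EDC algorithm.

```python
EDC_MASK = 0xFFFF  # ASSUMED 16-bit EDC field


def toy_edc(words):
    """Stand-in checksum (XOR of all words). NOT the real DSA EDC."""
    edc = 0
    for word in words:
        edc ^= word & EDC_MASK
    return edc


def edc_difference(stored_edc, words):
    """XOR of the stored EDC and the EDC recomputed from the data."""
    return (stored_edc ^ toy_edc(words)) & EDC_MASK


def classify(diff):
    if diff == 0:
        return "EDC correct"
    if diff == EDC_MASK:
        return "EDC inverted (forced error)"
    return "EDC error"


data = [0x1234, 0x00FF]                  # made-up block contents
good = toy_edc(data)
inverted = good ^ EDC_MASK               # what a forced error would store

print(classify(edc_difference(good, data)))
print(classify(edc_difference(inverted, data)))
print(classify(edc_difference(good ^ 0x0001, data)))
```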
Using the RCT display obtained from step 6 above, select an RBN number that is being used for replacement and use that RBN as part of the following command:

 17   DUMP/ALL RBN xxxx
        Note the contents of the data, the four copies of the header, and
        the calculated EDC difference. Note the header code. In this case,
        the RBN should contain valid data from some LBN and a correctly
        written EDC (EDC diff = 0).
 18   DUMP/ALL XBN 0
        Note the contents of the data, the four copies of the header, and
        the calculated EDC difference. Note the header code and how it is
        different from the other block types. This is the FCT control
        block. Use your Student Manual to find the mode byte, the FK bit,
        and their contents.
 19   DUMP/ALL FCT BLOCK 1 COPY 1
        What is different about this from the information obtained in the
        previous step? Have the instructor clarify this if it is unclear.
 20   DUMP/ALL XBN 1
        This is the first block in the FCT that contains PBN descriptors.
 21   DUMP/ALL FCT BLOCK 2 COPY 1
        What is different about the information displayed here from the
        previous step?
 22   DUMP/ALL RCT BLOCK 1 COPY 1
        This is the RCT control block, often referred to as block 0 of the
        RCT (accessed as block 1 when using DKUTIL). Use your Student
        Manual to decode the contents of RCT block 0.
 23   DUMP/ALL LBN 547041
        What is different about the contents here compared to the contents
        from the previous step?
 24   DUMP/ALL LBN 100
 25   MODIFY 32 1111 2222 3333 4444 5555 6666

      NOTE
      An invalid command error message indicates the write patch is not
      installed. Install it at this time, if necessary. After installing
      the patch and re-running DKUTIL, continue this exercise starting with
      step 24.

 26   DUMP/ALL BUFFER
        Note the differences that occurred after the modify. It probably does not appear the way you expected.
        The next step should provide clarification.
 27   MOD 32 O1111 O2222 O3333 O4444 O5555 O6666 O7777
        Use the letter O (for octal) and not zero (0).
 28   DUMP/ALL BUFFER
        Now notice the changes that occurred.
 29   WRITE LBN 547040
        This will write the contents of the buffer (which you have
        modified) to LBN 547040.
 30   DUMP LBN 547040
        This is to verify the block was correctly written with the desired
        modifications.
 31   REVECTOR 547040
        This is actually a command that forces a replacement; i.e., LBN
        547040 will be replaced.
 32   DISP/FULL RCT
        Verify that the LBN was replaced and remember which RBN was used.
 33   DUMP/ALL LBN 547040
 34   DUMP/ALL RBN xxxx
        xxxx is the RBN number you obtained from step 32 above.
 35   DUMP/ALL/RAW LBN 547040
        Compare the results of step 33 through step 35. Do you understand
        what is happening? Consult the instructor if you are unclear.
 36   DUMP RCT BLOCK 2
        This shows the contents of the RCT caching block, which contains a
        buffered copy of the host data during the last replacement
        operation. Notice that this is the same data that was in LBN
        547040, which you previously replaced manually.
 37   DUMP RCT BLOCK 3
        This is the first block in the RCT that contains descriptors. It is
        mostly zeroes (if not all zeroes).
 38   DUMP RCT BLOCK x
 39   DUMP RCT BLOCK 9999
        Notice that you received an error message. The message will tell
        you the maximum LBN that you can enter for the RCT table. This is a
        convenient technique to locate the LBN number of the last RCT
        descriptor block in the table for this particular type of disk
        drive.
 40   DUMP RCT BLOCK xxxx
        Substitute the maximum LBN number you obtained in step 39 for xxxx
        in this step. This will display the last RCT descriptor block
        (copy 1 in this case). This descriptor block should contain
        descriptors with descriptor codes of 10, indicating the end of a
        valid RCT table copy.
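Steps 25 through 28 above turn on how the MODIFY values are read: a value prefixed with the letter O is taken as octal. The Python sketch below shows why 1111 and O1111 end up as different words in the buffer. It assumes that an unprefixed value is read as decimal and that the result is viewed in octal; both are assumptions to confirm against the DKUTIL user guide. `parse_value` is a hypothetical helper written for this illustration, not part of DKUTIL.

```python
def parse_value(token):
    """Read 'O1111' as octal and '1111' as decimal (ASSUMED default radix)."""
    token = token.strip().upper()
    if token.startswith("O"):
        return int(token[1:], 8)   # letter O prefix: octal digits follow
    return int(token, 10)          # no prefix: treated as decimal here


for token in ("1111", "O1111"):
    value = parse_value(token)
    # Show each value as a six-digit octal word
    print(f"{token:>6} -> octal {value:06o}")
```

Under these assumptions, decimal 1111 appears as octal 002127, while O1111 appears as 001111, which is one plausible reason the first dump does not look the way you expected.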
Notes for step 38 (DUMP RCT BLOCK x): Substitute 4, 5, 6, etc., for the value x and continue looking at RCT descriptor blocks until you encounter a block that contains something other than all zeros. The non-zero values will be due to an RBN descriptor entry. Use the RCT section of your Student Guide and see if you can decode a descriptor and compare it to the corresponding entry in the RCT table obtained in step 6. Ask your instructor for assistance, as needed. You should also see a descriptor entry for the RBN that was allocated for LBN 547040 when you performed the REVECTOR command in step 31.

Save all the material you obtained in this exercise for review later in class.

TABLE OF CONTENTS

CHAPTER 1  GOALS/OBJECTIVES
  1.1 WHAT'S COVERED
  1.2 WHAT'S NOT COVERED
  1.3 COURSE MAP

CHAPTER 2  DSDF FOR RA60/70/80/81/82/90
  2.1 MEDIA COMPONENTS (Physical Geometry, Head Disk Assemblies)
    2.1.1 RA80/81/82
    2.1.2 RA70
    2.1.3 RA60
    2.1.4 RA90
  2.2 SERVO INFORMATION
  2.3 DATA INFORMATION
  2.4 PHYSICAL SECTOR
    2.4.1 Header
    2.4.2 Data
    2.4.3 Error Detecting Code (EDC)
    2.4.4 Error Correcting Code (ECC)
  2.5 PHYSICAL TRACK
  2.6 PHYSICAL CYLINDER
  2.7 LOGICAL DISK ADDRESSING
    2.7.1 Logical Block
    2.7.2 Logical Track
    2.7.3 Logical Group
    2.7.4 Logical Cylinder
    2.7.5 Implementation of Logical Addressing
      2.7.5.1 RA80 Logical Addressing
      2.7.5.2 RA81 Logical Addressing
      2.7.5.3 RA82 Logical Addressing
      2.7.5.4 RA70 Logical Addressing
      2.7.5.5 RA60 Logical Addressing
      2.7.5.6 RA90 Logical Addressing
  2.8 … and LOGICAL BLOCKS
    2.8.1 Host Application Area (LBNs)
    2.8.2 Replacement Block Area (RBNs)
    2.8.3 Replacement Control Table (RCT) Area (LBNs)
    2.8.4 Format Control Tables (FCT) Area (XBNs, External Blocks)
    2.8.5 Diagnostic Area (Diagnostic Block Numbers)
  2.9 IMPLEMENTATION OF LOGICAL AREAS
    2.9.1 Drive Topology Maps
  2.10 DRIVE INTERNAL DIAGNOSTIC AREA
  2.11 BAD BLOCK REPLACEMENT (BBR) and REVECTORING
    2.11.1 Why is BBR Performed?
    2.11.2 When is BBR Invoked?
    2.11.3 Who Detects and Performs BBR?
    2.11.4 How is BBR Performed?
    2.11.5 Types of Replacement and Revectoring
  2.12 HARDWARE ERROR RECOVERY
    2.12.1 RA82 Error Recovery
    2.12.2 What are the RA82 Error Recovery Circuits?
    2.12.3 When are the Error Recovery Circuits Activated?
    2.12.4 RA70 Hardware Error Recovery
    2.12.5 How is Error Recovery Used…

CHAPTER 3  …
  3.1 RA60 Common Characteristics
  3.2 RA60 Subunit Characteristics
  3.3 RA70 Common Characteristics
  3.4 RA70 Subunit Characteristics
  3.5 RA80 Common Characteristics
  3.6 RA80 Subunit Characteristics
  3.7 RA81 Common Characteristics
  3.8 RA81 Subunit Characteristics
  3.9 RA82 Common Characteristics
  3.10 RA82 Subunit Characteristics
  3.11 RA90 Common Characteristics
  3.12 RA90 Subunit Characteristics
  3.13 Student Exercises

CHAPTER 4  BLOCKS AND HEADERS
  4.1 Simplified Summary of Header Codes

CHAPTER 5  DBN AREA AND R/W DATA PATHS

CHAPTER 6  REPLACEMENT CONTROL TABLE (RCT)
  6.1 THE REPLACEMENT CONTROL TABLE
  6.2 RBN DESCRIPTOR FORMAT
  6.3 PHYSICAL LAYOUT OF THE RCT

CHAPTER 7  FORMAT CONTROL TABLE (FCT)
  7.1 FCT STRUCTURE
  7.2 VOLUME INFORMATION BLOCK DETAILS

CHAPTER 8  STANDARD DISK INTERFACE (SDI)
  8.1 INTRODUCTION
  8.2 OBJECTIVES
  8.3 SDI BUS
    8.3.1 SDI Lines
    8.3.2 SDI Bus Encoding
  8.4 DRIVE STATES
  8.5 RTCS FORMAT (Real Time Controller State)
  8.6 RTDS FORMAT (Real Time Drive State)
  8.7 COMMAND FORMATS on the WRT/CMD LINE
  8.8 LEVEL 1 COMMANDS
  8.9 LEVEL 2 COMMANDS
    8.9.1 Command Formats on the WRT/CMD Line
    8.9.2 Response Formats on the Read/Response Line
  8.10 INITIATE SEEK COMMAND (Level 2 Command Example)
  8.11 SDI READ OPERATION
  8.12 SEEK followed by a SELECT TRACK AND READ
  8.13 SDI WRITE OPERATION
  8.14 SEEK followed by SELECT TRACK AND WRITE
  8.15 EXERCISES

CHAPTER 9  LEVEL 2 SDI COMMANDS
  9.1 INTRODUCTION
  9.2 CHANGE MODE Command
  9.3 CHANGE CONTROLLER FLAGS Command
  9.4 DIAGNOSE Command
  9.5 DISCONNECT Command
  9.6 DRIVE CLEAR Command
  9.7 ERROR RECOVERY Command
  9.8 GET COMMON CHARACTERISTICS Command
  9.9 GET SUBUNIT CHARACTERISTICS Command
  9.10 GET STATUS Command
  9.11 INITIATE SEEK Command
  9.12 ON-LINE Command
  9.13 RUN Command
  9.14 READ MEMORY Command
  9.15 RECALIBRATE Command
  9.16 TOPOLOGY Command
  9.17 WRITE MEMORY Command

CHAPTER 10  DECODING DRIVE STATUS BYTES
  10.1 RA60 DRIVE STATUS DECODE
  10.2 RA70 DRIVE STATUS DECODE
  10.3 RA80 DRIVE STATUS DECODE
  10.4 RA81 DRIVE STATUS DECODE
  10.5 RA82 DRIVE STATUS DECODE
  10.6 RA90 DRIVE STATUS DECODE
  10.7 Status Error Decoding Sample 1
  10.8 Status Error Decoding Sample 2
  10.9 Status Error Decoding Sample 3
  10.10 Status Error Decoding Sample 4
  10.11 VMS V4.4 ERROR LOG ENTRY FORMATTER - Problem with RA Disks on UDA/KDA/KDB50 Controllers
    10.11.1 Drive-Detected Errors
    10.11.2 How to use the Dump Utility
    10.11.3 Summary
    10.11.4 Example of an ANALYZE/ERROR_LOG Output for Entry 17

CHAPTER 11  VAXSIMPLUS
  11.1 VAXsimPLUS OVERVIEW
  11.2 VAXsimPLUS PHONE NUMBERS
  11.3 VAXsimPLUS RESOURCES
  11.4 VAXsimPLUS MESSAGE EXAMPLES

CHAPTER 12  DSA DSDF/BBR
  12.1 INTRODUCTION
  12.2 OVERVIEW MATERIAL for UNDERSTANDING BBR
    12.2.1 LBN and RBN Association (Disk Organization for BBR)
    12.2.2 Disk Addressing
    12.2.3 How Header Codes are Used
    12.2.4 Special Uses of the Header Code Field
    12.2.5 EDC Protection
    12.2.6 ECC Detection and Correction
    12.2.7 ECC Thresholding
  12.3 BBR PROCESS OVERVIEW
    12.3.1 Notification that a Block Needs to be Replaced
      12.3.1.1 Host BBR
      12.3.1.2 Controller BBR
    12.3.2 Executing Bad Block Replacement
    12.3.3 Restarting BBR
      12.3.3.1 Host BBR
      12.3.3.2 Controller BBR
  12.4 TROUBLESHOOTING BBR
  12.5 REVECTORING
  12.6 QUESTIONS + ANSWERS

CHAPTER 13  TUTORIAL ON FORMATTING RA DRIVES
  13.1 INTRODUCTION
  13.2 BASIC FORMATTER FUNCTIONALITY REVIEW
  13.3 SCRUBBER, FORMATTER, HSC50/70 - WHAT REPLACES BLOCKS?
  13.4 WHEN TO USE THE FORMATTER
  13.5 WHEN NOT TO USE THE FORMATTER
  13.6 ITEMS OF FORMATTER INTEREST
  13.7 FORMATTING SUMMARY

CHAPTER 14  DRIVE ERROR TOLERANCE
  14.1 ILEXER SAMPLE 1
  14.2 ILEXER SAMPLE 2
  14.3 ACCEPTABLE DRIVE ERROR RATES

CHAPTER 15  HSC50/70 DKUTIL USER GUIDE
  15.1 INTRODUCTION
  15.2 INITIATING DKUTIL
  15.3 COMMAND SYNTAX
  15.4 MODIFIERS
  15.5 SAMPLE SESSION
  15.6 DETAILED COMMAND DESCRIPTIONS
    15.6.1 CHECK Command
    15.6.2 DEFAULT Command
    15.6.3 DISPLAY Command
    15.6.4 DUMP Command
    15.6.5 EXIT Command
    15.6.6 GET Command
    15.6.7 MODIFY Command
    15.6.8 POP Command
    15.6.9 PUSH Command
    15.6.10 REVECTOR Command - (Manual LBN Replacement)
    15.6.11 SET Command
      15.6.11.1 SET CSSE_WRITE_ON
    15.6.12 WRITE Command
  15.8 … and INFORMATION MESSAGES
    15.8.1 DKUTIL-S CTRL/Y or CTRL/C Abort!
    15.8.2 DKUTIL-F Insufficient resources to RUN!
    15.8.3 DKUTIL-F Drive went OFFLINE!
    15.8.4 DKUTIL-F I/O request was rejected!
    15.8.5 DKUTIL-E Illegal response to start-up question.
    15.8.6 DKUTIL-E Nonexistent unit number.
    15.8.7 DKUTIL-E Unit is not available.
    15.8.8 DKUTIL-E Cannot ONLINE unit.
    15.8.9 DKUTIL-E Invalid decimal number.
    15.8.10 DKUTIL-E Invalid octal number.
    15.8.11 DKUTIL-E Missing parameter.
    15.8.12 DKUTIL-E There is no buffer to dump.
    15.8.13 DKUTIL-E Missing modifier (only a slash (/) was specified).
    15.8.14 DKUTIL-E SDI command was unsuccessful.
    15.8.15 DKUTIL-E n is an invalid par number, maximum is n.
    15.8.16 DKUTIL-E "text" is an invalid pam.
    15.8.17 DKUTIL-E Invalid block number for xBN space.
    15.8.18 DKUTIL-E Copy n of xCT Block n (xBN n) is bad.
    15.8.19 DKUTIL-E All copies of xCT Block n are bad.
    15.8.20 DKUTIL-E Could not write xBN n, MSCP Status: status
    15.8.21 DKUTIL-E Invalid sector size; only 512 and 576 are legal.
    15.8.22 DKUTIL-E Revector for LBN n failed, MSCP Status:
    15.8.23 DKUTIL-E CHECK READ for LBN n failed, MSCP Status:
    15.8.24 DKUTIL-E CHECK WRITE for LBN n failed, MSCP Status:
  15.9 DKUTIL Lab Samples

CHAPTER 16  RAUTIL USER GUIDE
  16.1 OVERVIEW
    16.1.1 Restrictions
  16.2 GETTING STARTED
    16.2.1 Compile RAUTIL.MAR
    16.2.2 Invoke RAUTIL.EXE
  16.3 COMMAND SUMMARY
  16.4 COMMAND DETAILS and EXAMPLES
    16.4.1 ANALYZE
    16.4.2 Manual Bad Block Replacement (BBR)
    16.4.3 DD - Display Drive
    16.4.4 D…
. . . . . . . . . . . . . . . . . . . .. 16.4.5 Ex:rr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16.4.6 liEAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16.4.7 liEU . . . . . . . . . . . . . . . . . . ; . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16.4.8 MODIFY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-4 16-5 16-7 16-7 16-8 16-9 16-9 16-10 16-11 16.4.9 NEXT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-12 16.4.10 SCRUB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-13 16.4.11 SU"MMARY. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-14 16.4.12 TL - TRANSLATE LBN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-14 16.4.13 TR-TRANSLATE RBN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-15 16.4.14 WRI'I'E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-15 16.5 TROUBLESHOOTING and USING RAUTIL 16.5.1 Radial Scratches . . . . . . . . . . . . . . . 16.5.2 Forced Errors. . . . . . . . . . . . . . . . . 16.5.3 Circular Defects . . . . . . . . . . . . . . . 16.5.4 Summary Analysis. . . . . . . . . . . . . . 16.5.5 EDC Errorsestrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ". .. 17-2 17-2 17.2 SELECTION PARAMETERS. 17.2.1 Input File. . . . . . . . . . 17.2.2 Output File . . . . . . . . . 17.2.3 Device(s) and Type(s) . . . 17.2.4 Event Codes . . . . . . . . 17.2.5 Afterix 17.2.6 Before. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 17.2.7 Report . . . . . . . . . . . . . . . . '. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
17.2.7.1 Physical Report (P)
17.2.7.2 Geographic Report (G)
17.2.7.3 Summary Report (S)
17.2.7.4 Verbose Report (V)
17.2.7.5 Time Report (T)
17.2.8 Using the Selection Process
17.3 MANUAL TRANSLATION of DSA BLOCK NUMBERS

CHAPTER 18 FAKDSK (ON HSC)
18.1 FAKDSK (on HSC V300/V350)
18.2 FAKDSK (on HSC V370 and up)
18.3 SUMMARY (HSC Version 370 and up)

EXAMPLES
2-1 RA70/80/81/82/90 LBN to Physical and Logical Parameters
2-2 RA81 16-Bit HDA LBN = 2498
2-3 RA81 16-Bit HDA LBN = 2499
2-4 RA60 LBN to Physical and Logical Parameters
2-5 RA60 16-Bit HDA LBN = 6000
2-6 Quick RA60 Head Algorithm
16-1 ANALYZE Command
16-2 SCRUB Operation
16-3 SUMMARY Command
16-4 TL Command
16-5 TR Command
16-6 WRITE Command

FIGURES
1-1 Course Map
2-1 Basic RA80/81 HDA
2-2 Basic RA82 HDA
2-3 Basic RA70 HDA
2-4 Basic RA60
2-5 Basic RA90 HDA
2-6 Basic Track and Sector
2-7 Logical Disk Addressing
2-8 RA80 Logical Disk Addressing
2-9 RA80 Geometry
2-10 RA81 Logical Disk Addressing
2-11 RA81 Geometry
2-12 RA82 Logical Disk Addressing
2-13 RA82 Geometry
2-14 RA70 Logical Disk Addressing
2-15 RA70 Geometry
2-16 RA60 Logical Disk Addressing
2-17 RA60 Geometry
2-18 RA90 Logical Disk Addressing
2-19 RA90 Geometry
2-20 Basic Topology
2-21 RA80 Topology - 16-Bit Format
2-22 RA80 Topology - 18-Bit Format
2-23 RA81 Topology - 16-Bit Format
2-24 RA81 Topology - 18-Bit Format
2-25 RA82 Topology - 16-Bit Format
2-26 RA70 Topology
2-27 RA60 Topology
2-28 RA90 Topology - 16-Bit Format
2-29 Basic BBR Flow
2-30 Read/Write Error Recovery
4-1 LBN Sector
4-2 RBN Sector
4-3 DBN Sector
4-4 XBN Sector
5-1 RA81 Topology - 16 Bit
5-2 R/W Data Path External to Drive/RWDP
5-3 Simplified RA81 Block Diagram
5-4 RA81 Topology and Physical Distribution
6-1 Simplified Replacement and Control Table
6-2 Detailed Replacement and Control Table
6-3 Replacement Block Descriptor
6-4 RCT Structure
6-5 RCT Sector 0
7-1 FCT Layout
7-2 FCT Structure
7-3 FCT Sector 0 - (Volume Information Block)
7-4 Bad Block Descriptor
8-1 SDI Radial Bus
8-2 SDI Dual Port
8-3 SDI Bus
8-4 SDI Bus Encode
8-5 Drive Off Line
8-6 Drive Available
8-7 Drive On Line
8-8 RTCS
8-9 RTDS
8-10 SDI Command Frame
8-11 Level 1 Command Format
8-12 Level 1 Commands
8-13 Level 2 START Command Frame
8-14 Level 2 CONTINUE Command Frame
8-15 Level 2 END Command Frame
8-16 Level 2 Commands
8-17 SDI Response Frame Format
8-18 Level 2 Response Start Frame
8-19 INITIATE SEEK Command
8-20 Successful Response for SEEK Command
8-21 Unsuccessful Response for SEEK Command
8-22 Initiate Seek Simplified
8-23 SDI Select Track and Read Timing
8-24 Select Track and Read Flow
8-25 SEEK Command Followed by SELECT TRACK AND READ
8-26 SDI Select Track and Write Timing
8-27 Select Track and Write Flow
8-28 SEEK Command Followed by SELECT TRACK AND WRITE
9-1 CHANGE MODE
9-2 CHANGE CONTROLLER FLAGS
9-3 DIAGNOSE Command
9-4 DISCONNECT Command
9-5 DRIVE CLEAR Command
9-6 ERROR RECOVERY Command
9-7 GET COMMON CHARACTERISTICS Command and Response
9-8 GET SUBUNIT CHARACTERISTICS Command and Response
9-9 GET STATUS Command
9-10 INITIATE SEEK Command
9-11 ONLINE Command
9-12 RUN Command
9-13 READ MEMORY Command
9-14 RECALIBRATE Command
9-15 Topology Command
9-16 WRITE MEMORY Command
10-1 Summary of RA60 Drive Status Codes
10-2 RA60 Byte 1
10-3 RA60 Bytes 2-3
10-4 RA60 Byte 4 Request Byte
10-5 RA60 Byte 5 Mode Byte
10-6 RA60 Byte 6 Error Byte
10-7 RA60 Bytes 7-8
10-8 RA60 Bytes 9-15
10-9 Summary of RA70 Drive Status Codes
10-10 RA70 Byte 1
10-11 RA70 Bytes 2-3
10-12 RA70 Byte 4 Request Byte
10-13 RA70 Byte 5 Mode Byte
10-14 RA70 Byte 6 Error Byte
10-15 RA70 Bytes 7-8
10-16 RA70 Byte 9 Last Opcode
10-17 RA70 Byte 10 Drive-Detected SDI Error
10-18 RA70 Bytes 11-15
10-19 Summary of RA80 Drive Status Codes
10-20 RA80 Byte 1
10-21 RA80 Bytes 2-3
10-22 RA80 Byte 4 Request Byte
10-23 RA80 Byte 5 Mode Byte
10-24 RA80 Byte 6 Error Byte
10-25 RA80 Bytes 7-8
10-26 RA80 Byte 9 Last Opcode
10-27 RA80 Byte 10 Drive-Detected SDI Error
10-28 RA80 Bytes 11-15
10-29 Summary of RA81 Drive Status Codes
10-30 RA81 Byte 1
10-31 RA81 Bytes 2-3
10-32 RA81 Byte 4 Request Byte
10-33 RA81 Byte 5 Mode Byte
10-34 RA81 Byte 6 Error Byte
10-35 RA81 Bytes 7-8 Controller Byte
10-36 RA81 Byte 9 Last Opcode
10-37 RA81 Byte 10 Drive-Detected SDI Error
10-38 RA81 Bytes 11-15
10-39 RA82 Drive Status Decode
10-40 RA82 Byte 1
10-41 RA82 Bytes 2-3
10-42 RA82 Byte 4 Request Byte
10-43 RA82 Byte 5 Mode Byte
10-44 RA82 Byte 6 Error Byte
10-45 RA82 Bytes 7-8 Controller Byte
10-46 RA82 Byte 9 Last Opcode
10-47 RA82 Byte 10 Real-Time Drive Port Image
10-48 RA82 Bytes 11-15
10-49 Summary of RA90 Drive Status Codes
10-50 RA90 Byte 1
10-51 RA90 Bytes 2-3
10-52 RA90 Byte 4 Request Byte
10-53 RA90 Byte 5 Mode Byte
10-54 RA90 Byte 6 Error Byte
10-55 RA90 Bytes 7-8 Controller Byte
10-56 RA90 Byte 9 Last Opcode
10-57 RA90 Byte 10 HDA Revision Bits
10-58 RA90 Bytes 11-15
12-1 Disk Track and Sector Organization
12-2 Disk Header Format
12-3 ECC Symbols and Drive Threshold for BBR and Error Logging
12-4 Primary Replacement/Revector
12-5 Non-Primary Replacement/Revector
12-6 RCT Layout for an RA81
12-7 BBR FLOW
12-8 Typical Mount Flow

TABLES
2-1 Physical Sectors per Track
2-2 Values for RA70/80/81/82/90
6-1 RCT Block 0 Defined
9-1 Byte 2 C-Bits
9-2 DIAGNOSE Command TI/ST Bits
12-1 Operating Systems Revisions
16-1 Legend for ANALYZE Command
16-2 Legend for HEAD Command

CHAPTER 1
GOALS/OBJECTIVES

Digital Internal Use Only

1.1 WHAT'S COVERED

• DSDF disk format and structures
• SDI
• BBR
• Drive status decoding
• Basic error log decoding and review
• Special tools and diagnostics
• DKUTIL
• Block Conversion Utility
• RAUTIL HDA Analyzer
• DKRFCT FORMATTING
• Error Log Tools
• Remote Analysis Tools
• Miscellaneous HSC Utilities
• Disk Scrubbing
• Logically broken drives (versus) physically broken
• Emphasis will be VMS, HSC, and some tools/concepts in a two-board controller environment
• Troubleshooting information will be supplied throughout the discussions of the various tools, topics, and lab activities
• Lab exercises to familiarize the student with the usage of the tools and problem solving

1.2 WHAT'S NOT COVERED

• MSCP
• Specific controller repair
• Specific drive repair except HDA and communication references
• Emphasis on physically broken equipment
• Tape drive support

NOTE: ALL OF THE DOCUMENTATION MATERIALS AND SPECIAL TOOLS OBTAINED IN THIS COURSE ARE STRICTLY DIGITAL INTERNAL USE ONLY. PLEASE TREAT THIS MATERIAL ACCORDINGLY.
DO NOT LEAVE ANY OF THE SPECIAL SOFTWARE TOOLS OR DOCUMENTATION ON A SITE THAT IS NOT UNDER DIGITAL SERVICE CONTRACT AGREEMENTS OR A SITE THAT IS ACCESSIBLE BY 3RD PARTY MAINTENANCE. DO NOT ATTEMPT TO RE-TEACH THIS COURSE IN THE FIELD!

1.3 COURSE MAP

Figure 1-1: Course Map
(Diagram labels: labs on error log decoding, DKUTIL, RAUTIL, BLOCK.COM, the DSAERR error log tool, SET HOST/HSC + DKRFCT, FE/EDC isolation, troubleshooting bugs, and EVRLK (optional), plus a summary.)

CHAPTER 2
DSDF FOR RA60/70/80/81/82/90

Lesson 1

This document describes the location, specification, and function of the disk's internal storage hardware, such as platters, heads, and the positioner mechanism, as well as storage components such as cylinders, groups, tracks, and blocks. Bad block replacement, hardware error recovery, and forced errors are also described. This material stresses Digital Standard Disk Format (DSDF) characteristics unique to a variety of disk drives. It will help you understand the overall function of the drives, interpret error logs, and work with diagnostic information in the field.

2.1 MEDIA COMPONENTS (Physical Geometry, Head Disk Assemblies)

2.1.1 RA80/81/82

The RA80/81/82 HDA contains 4 storage platters attached to a spindle assembly. The 4 platters provide a total of 8 magnetic recording surfaces. A rotary positioner and motor assembly within the HDA contains 8 metal arms. Each metal arm contains 2 head assemblies for a total of 16 heads within the HDA. The heads and arms are attached to the positioner so that 2 heads are located over each of the 8 recording surfaces. The positioner motor is responsible for moving all 16 heads simultaneously across the media surfaces during a SEEK operation.
Fourteen data heads are used for R/W data operations to and from the disk surface in the RA80 and RA81. Fifteen of the data heads are used for R/W data operations to and from the disk surfaces in the RA82. The heads are numbered as shown in Figure 2-1 and Figure 2-2. The last head is a servo head used to read specially recorded servo information from the dedicated servo area of the disk surface.

Figure 2-1: Basic RA80/81 HDA
(Diagram labels: metal arm, read/write data head, positioner assembly, spindle, servo head, disk media storage platter.)

Figure 2-2: Basic RA82 HDA
(Diagram labels: metal arm, positioner assembly, read/write data heads 0 through 14, spindle, disk media storage platter.)

2.1.2 RA70

The RA70 HDA contains 6 storage platters attached to a spindle assembly. The six platters provide a total of 12 magnetic recording surfaces. A linear positioner and motor assembly within the HDA contains 12 metal arms. Each metal arm contains 1 head assembly for a total of 12 heads within the HDA. The heads and arms are attached to the positioner so that 1 head is located over each of the 12 recording surfaces. The positioner motor is responsible for moving all 12 heads simultaneously across the media surfaces during a SEEK operation.

Eleven data heads are used for R/W data operations to and from the disk surface in the RA70. The heads are numbered as shown in Figure 2-3. The last head is a servo head used to read specially recorded servo information from the dedicated servo area of the disk surface.

Figure 2-3: Basic RA70 HDA
(Diagram labels: metal arm, read/write data heads 0 through 10, positioner assembly, spindle, servo head, disk media storage platter.)
2.1.3 RA60

The RA60 uses a removable pack that contains 3 storage platters attached to a spindle. The 3 storage platters provide a total of 6 magnetic recording surfaces. A carriage assembly within the RA60 drive contains 6 replaceable head/arm assemblies. These assemblies are attached to the carriage so that 1 head will be positioned over each recording surface. The carriage assembly is responsible for moving all 6 heads simultaneously across the media surfaces during a SEEK operation.

All 6 data heads are used for R/W data operations to and from the disk surface in the RA60. The heads are numbered as shown in Figure 2-4. There is no dedicated servo surface in the RA60.

Figure 2-4: Basic RA60
(Diagram labels: read/write data heads 0 through 5, carriage assembly, spindle, disk media storage platter.)

2.1.4 RA90

The RA90 HDA contains 7 storage platters attached to a spindle assembly. The 7 platters provide a total of 14 magnetic recording surfaces. A positioner and motor assembly within the HDA contains 14 metal arms. Each metal arm contains 1 head assembly for a total of 14 heads within the HDA. The heads and arms are attached to the positioner so that 1 head is located over each of the 14 recording surfaces. The positioner motor is responsible for moving all 14 heads simultaneously across the media surfaces during a SEEK operation.

Thirteen data heads are used for R/W data operations to and from the disk surface in the RA90. The heads are numbered as shown in Figure 2-5. The last head is a servo head used to read specially recorded servo information from the dedicated servo area of the disk surface.
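The head counts given in sections 2.1.1 through 2.1.4 can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the course tools: the `HdaLayout` class, its field names, and the `HDA_LAYOUTS` table are hypothetical names introduced here, and the numbers are copied from the text above. The "unaccounted" count is simply total heads minus data heads minus servo heads; the text does not say what any remaining head position is used for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HdaLayout:
    """Platter and head counts for one drive type, as stated in the text."""
    platters: int
    heads_per_surface: int
    data_heads: int
    servo_heads: int

    @property
    def surfaces(self) -> int:
        # Every platter provides two recording surfaces.
        return 2 * self.platters

    @property
    def total_heads(self) -> int:
        return self.surfaces * self.heads_per_surface

    @property
    def unaccounted_heads(self) -> int:
        # Heads that are neither data heads nor the servo head.
        return self.total_heads - self.data_heads - self.servo_heads


# Values transcribed from sections 2.1.1 to 2.1.4 (hypothetical table name).
HDA_LAYOUTS = {
    "RA80": HdaLayout(platters=4, heads_per_surface=2, data_heads=14, servo_heads=1),
    "RA81": HdaLayout(platters=4, heads_per_surface=2, data_heads=14, servo_heads=1),
    "RA82": HdaLayout(platters=4, heads_per_surface=2, data_heads=15, servo_heads=1),
    "RA70": HdaLayout(platters=6, heads_per_surface=1, data_heads=11, servo_heads=1),
    "RA60": HdaLayout(platters=3, heads_per_surface=1, data_heads=6, servo_heads=0),
    "RA90": HdaLayout(platters=7, heads_per_surface=1, data_heads=13, servo_heads=1),
}

if __name__ == "__main__":
    for name, hda in HDA_LAYOUTS.items():
        print(f"{name}: {hda.surfaces} surfaces, {hda.total_heads} heads, "
              f"{hda.data_heads} data, {hda.servo_heads} servo, "
              f"{hda.unaccounted_heads} unaccounted")
```

Keeping the counts in one table makes it easy to spot a drive whose data and servo heads do not add up to the total number of heads mounted on the positioner.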
Figure 2-5: Basic RA90 HDA
(Diagram not reproduced.)

(An RA80 geometry diagram is not reproduced here. Its legible labels show group 1 equal to physical cylinder 3, and logical cylinder 1 made up of physical cylinders 2 and 3.)

The RA80 uses dedicated servo for both coarse and fine positioning control. This dedicated servo data is always in control of the positioning. When performing a head switch in the RA80, the drive can immediately begin reading data from the next sequential head without any significant latency to re-establish fine positioning. Since the latency required for a head switch is less than the intersector gap time, a group in the RA80 is equivalent to 14 tracks.

Switching from a group on one physical cylinder to a group on an adjacent physical cylinder in the RA80 requires the drive to switch heads and internally perform a one-cylinder seek. The RA80 can perform these actions simultaneously because little or no additional head settling time is required. Therefore, the RA80 is able to equate two groups (two physical cylinders) to one logical cylinder. This provides the ability to select adjacent physical cylinders using a SELECT GROUP command (which requires one SDI command frame from the controller) as opposed to a SEEK command (which requires seven SDI command frames from the controller).
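The RA80 relationship described above (one group per physical cylinder, two groups per logical cylinder) can be sketched in Python. The function and constant names are hypothetical, and the frame counts simply restate the one-frame SELECT GROUP and seven-frame SEEK figures from the text. The choice of command is a simplification that looks only at whether the logical cylinder changes.

```python
GROUPS_PER_LOGICAL_CYLINDER = 2               # RA80: two physical cylinders per logical cylinder
SDI_FRAMES = {"SELECT GROUP": 1, "SEEK": 7}   # command frames quoted in the text


def ra80_physical_cylinder(logical_cylinder, group):
    """Map an RA80 (logical cylinder, group) address to a physical cylinder."""
    if not 0 <= group < GROUPS_PER_LOGICAL_CYLINDER:
        raise ValueError("RA80 groups are numbered 0 and 1")
    return logical_cylinder * GROUPS_PER_LOGICAL_CYLINDER + group


def command_for_move(from_logical_cylinder, to_logical_cylinder):
    """Pick the SDI command needed to reach the target (simplified)."""
    if from_logical_cylinder == to_logical_cylinder:
        return "SELECT GROUP"   # stays inside the logical cylinder
    return "SEEK"               # crosses into another logical cylinder


print(ra80_physical_cylinder(1, 1))   # 3: group 1 of logical cylinder 1
cmd = command_for_move(1, 1)
print(cmd, SDI_FRAMES[cmd])           # SELECT GROUP 1
```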
You are probably wondering why three physical cylinders (3 groups) were not incorporated into one logical cylinder. One reason is simple: DSDF indicates that the time required to select any new group within a logical cylinder must be less than the time to select a new physical cylinder. Switching from a group 0 address to a group 2 address within a logical cylinder would really be a move of two physical cylinders in the RA80. Obviously, a two-cylinder seek requires more time than a one-cylinder seek.

2.7.5.2 RA81 Logical Addressing

The RA81 implements the following geometry for logical addressing:

  1 logical track = 1 physical track
  1 logical group = 1 logical track
  1 logical cylinder = 14 logical groups (1 physical cylinder)

Figure 2-10 and Figure 2-11 show the RA81 logical addressing and geometry.

Figure 2-10: RA81 Logical Disk Addressing
(Diagram not reproduced. It divides the total addressable storage space of the disk device into 1258 logical cylinders numbered 0 through 1257, 14 groups per cylinder numbered 0 through 13, 1 track (head) per group, and 52 blocks (sectors) per track numbered 0 through 51.)

Figure 2-11 (RA81 geometry; diagram not reproduced. It shows logical cylinder 0 equal to physical cylinder 0 and logical cylinder 1 equal to physical cylinder 1, along with the spindle, servo head, positioner assembly, and disk media.)

The RA81 has dedicated servo for coarse positioning. The increased data and cylinder densities in the RA81 require a more precise mechanism for fine positioning. Therefore, the RA81 incorporates embedded servo to accomplish a finer positioning scheme. The embedded servo exists between every sector on every track.
When performing a head switch, the RA81 servo logic must read several embedded servo bursts from a track to establish fine position before it can continue reading or writing. The time to accomplish a head switch is therefore greater than the intersector gap time, and the RA81 equates one track to one group.

Switching from a group on one physical cylinder to a group on an adjacent physical cylinder in the RA81 requires the drive to switch heads and internally perform a one-cylinder seek. The RA81 must then settle (fine position using embedded servo) after performing the seek. The time required to do this is greater than the seek alone. Therefore, the RA81 equates 14 groups to 1 physical and logical cylinder.

2.7.5.3 RA82 Logical Addressing

The RA82 implements the following geometry for logical addressing:

  1 logical track = 1 physical track
  1 logical group = 1 logical track
  1 logical cylinder = 15 logical groups (1 physical cylinder)

Figure 2-12 and Figure 2-13 illustrate the RA82 logical addressing and geometry.

Figure 2-12: RA82 Logical Disk Addressing
(Diagram not reproduced. It divides the total addressable storage space of the disk device into 1435 logical cylinders numbered 0 through 1434, 15 groups per cylinder numbered 0 through 14, 1 track (head) per group, and 58 blocks (sectors) per track.)

Figure 2-13: RA82 Geometry
(Diagram not reproduced. It shows logical cylinder 0 equal to physical cylinder 0 and logical cylinder 1 equal to physical cylinder 1, with heads 0 through 14, the servo head, positioner assembly, spindle, and disk media.)

The RA82 has dedicated servo for coarse positioning. The high data and cylinder densities in the RA82 require a more precise mechanism for fine positioning. Therefore, the RA82 incorporates embedded servo to accomplish a fine positioning scheme. The embedded servo exists between every sector on every track.

When performing a head switch, the RA82 servo logic must read several embedded servo bursts from a track to establish fine position before it can continue reading or writing. The time to accomplish a head switch is therefore greater than the intersector gap time, and the RA82 equates one track to one group.

Switching from a group on one physical cylinder to a group on an adjacent physical cylinder in the RA82 requires the drive to switch heads and internally perform a one-cylinder seek. The RA82 must then settle (fine position using embedded servo) after performing the seek. The time required to do this is greater than the seek alone. Therefore, the RA82 equates 15 groups to 1 physical and logical cylinder.

2.7.5.4 RA70 Logical Addressing

The RA70 implements the following geometry for logical addressing:

  1 logical track = 1 physical track
  1 logical group = 1 logical track
  1 logical cylinder = 11 logical groups (1 physical cylinder)

Figure 2-14 and Figure 2-15 show the RA70 logical addressing and geometry.
Figure 2-14: RA70 Logical Disk Addressing
(Diagram not reproduced. It divides the total addressable storage space of the disk device into 1517 logical cylinders numbered 0 through 1516, 11 groups per cylinder numbered 0 through 10, 1 track (head) per group, and 34 blocks (sectors) per track.)

Figure 2-15: RA70 Geometry
(Diagram not reproduced.)

The RA70 has dedicated servo for coarse positioning. The high data and cylinder densities in the RA70 require a more precise mechanism for fine positioning. Therefore, the RA70 incorporates embedded servo to accomplish a fine positioning scheme. The embedded servo exists between every sector on every track.

When performing a head switch, the RA70 servo logic must read several embedded servo bursts from a track to establish fine position before it can continue reading or writing. The time to accomplish a head switch is therefore greater than the intersector gap time, and the RA70 equates one track to one group.

Switching from a group on one physical cylinder to a group on an adjacent physical cylinder in the RA70 requires the drive to switch heads and internally perform a one-cylinder seek. The RA70 must then settle (fine position using embedded servo) after performing the seek. The time required to do this is greater than the seek alone. Therefore, the RA70 equates 11 groups to 1 physical and logical cylinder.
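For the three drives covered so far that use one track per group, the size of the total addressable space follows directly from the values in Figures 2-10, 2-12, and 2-14. A minimal Python sketch of that arithmetic follows; the table and function names are hypothetical. The product counts every sector in the addressable space (host, replacement, RCT, FCT, and diagnostic blocks alike), so it is not the host-usable capacity.

```python
from collections import namedtuple

Geometry = namedtuple(
    "Geometry", "logical_cylinders groups_per_cylinder sectors_per_track"
)

# Values read from Figures 2-10 (RA81), 2-12 (RA82), and 2-14 (RA70).
GEOMETRY = {
    "RA81": Geometry(1258, 14, 52),
    "RA82": Geometry(1435, 15, 58),
    "RA70": Geometry(1517, 11, 34),
}


def total_sectors(drive):
    """Sectors in the whole addressable space (one track per group)."""
    g = GEOMETRY[drive]
    return g.logical_cylinders * g.groups_per_cylinder * g.sectors_per_track


for name in GEOMETRY:
    print(name, total_sectors(name))
```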
2.7.5.5 RA60 Logical Addressing

The RA60 implements the following geometry for logical addressing:

  1 logical track = 1 physical track
  1 logical group = 1 logical track
  1 logical cylinder = 4 logical groups (4 physical tracks)

Figure 2-16 and Figure 2-17 show the RA60 logical addressing and geometry.

Figure 2-16: RA60 Logical Disk Addressing
(Diagram not reproduced. It divides the total addressable storage space of the disk device into 2400 logical cylinders numbered 0 through 2399 (1600 physical cylinders), 4 groups per logical cylinder numbered 0 through 3, 1 track per group, and 43 blocks (sectors) per track.)

Figure 2-17 (RA60 geometry; diagram not reproduced. Legible labels show heads 0 through 5, logical cylinders 0 and 6 on surface 0, and logical cylinders 5 and 11 on surface 5, each made up of groups 0 through 3.)

The RA60 does not incorporate a dedicated servo surface but relies upon embedded servo information to perform servo positioning. The embedded servo exists between every sector on every track. When performing a head switch, the RA60 servo logic must read several embedded servo bursts from a track to establish fine position before it can continue reading or writing. The time to accomplish a head switch is therefore greater than the intersector gap time, and the RA60 equates one track to one group.

Another unique characteristic of the RA60 is that it can seek faster than it can change head selection.
After performing a variety of design evaluations under different operating system environments, a unique physical geometry was established to allow optimum performance of the RA60. This is better understood by reviewing Figure 2-17. A logical cylinder consists of four adjacent physical tracks (groups) on the same surface, read by the same head. For example, logical cylinder 0 consists of groups 0, 1, 2, and 3, all on disk surface 0, read by R/W head number 0. Physically, this equates to the first four physical tracks on surface 0. Logical cylinder 1 consists of the first four tracks (group 0 through group 3) on physical disk surface 1, read by R/W head number 1. As you can see from studying the diagram, a logical cylinder is quite different from a physical cylinder in the RA60.

2.7.5.6 RA90 Logical Addressing

The RA90 implements the following geometry for logical addressing:

  1 logical track = 1 physical track
  1 logical group = 1 logical track
  1 logical cylinder = 13 logical groups (1 physical cylinder)

Figure 2-18 and Figure 2-19 show the RA90 logical addressing and geometry.

Figure 2-18: RA90 Logical Disk Addressing
(Diagram not reproduced. It divides the total addressable storage space of the disk device into 2656 logical cylinders numbered 0 through 2655, 13 groups per cylinder numbered 0 through 12, 1 track (head) per group, and 70 blocks (sectors) per track numbered 0 through 69.)

Figure 2-19: RA90 Geometry
(Diagram not reproduced.)

The RA90 has dedicated servo for coarse positioning. The high data and cylinder densities in the RA90 require a more precise mechanism for fine positioning. Therefore, the RA90 incorporates embedded servo to accomplish a fine positioning scheme. The embedded servo exists between every sector on every track. The time to accomplish a head switch is greater than the intersector gap time. Therefore, the RA90 equates one track to one group.

Switching from a group on one physical cylinder to a group on an adjacent physical cylinder in the RA90 requires the drive to switch heads and internally perform a one-cylinder seek. The RA90 must then settle (fine position using embedded servo) after performing the seek. The time required to do this is greater than the seek alone. Therefore, the RA90 equates 13 groups to 1 physical and logical cylinder.

2.8 LOGICAL AREAS and LOGICAL BLOCKS

Information on the data recording surfaces of the media is logically organized according to the Digital Standard Disk Format (DSDF) specification. This specification standardizes and defines how a Digital Storage Architecture (DSA) device appears to the host processor and the controller to which it is attached. The total data storage area of the disk media is divided into physical sectors. In the disks discussed in this document, a sector is equivalent to a logical block.
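As a recap of the address translation covered in Section 2.7.5.5, the RA60 arrangement can be sketched as a mapping from a logical cylinder and group to a head and a physical cylinder. This is a sketch under an assumption: that consecutive logical cylinders rotate through the six surfaces before moving on by four tracks, which is what the examples in the text and the legible labels of Figure 2-17 suggest. The function name is hypothetical and group offset is ignored.

```python
HEADS = 6                          # RA60 data heads, one per surface
GROUPS_PER_LOGICAL_CYLINDER = 4    # four adjacent tracks on one surface
LOGICAL_CYLINDERS = 2400           # from Figure 2-16


def ra60_locate(logical_cylinder, group):
    """Return (head, physical_cylinder) for an RA60 logical address."""
    if not 0 <= logical_cylinder < LOGICAL_CYLINDERS:
        raise ValueError("logical cylinder out of range")
    if not 0 <= group < GROUPS_PER_LOGICAL_CYLINDER:
        raise ValueError("group out of range")
    head = logical_cylinder % HEADS                 # surface rotates first
    band = logical_cylinder // HEADS                # then the four-track band
    physical_cylinder = band * GROUPS_PER_LOGICAL_CYLINDER + group
    return head, physical_cylinder


print(ra60_locate(0, 3))      # head 0, fourth track on surface 0
print(ra60_locate(1, 0))      # head 1, first track on surface 1
print(ra60_locate(2399, 3))   # last group of the last logical cylinder
```

Under this assumption the last logical address lands on head 5 at physical cylinder 1599, which agrees with the 1600 physical cylinders shown in Figure 2-16.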
We have already seen how the physical attributes of disk addressing are translated to logical attributes. The DSA architecture also provides for logical organization of the data blocks on the disk, regardless of the addressing attributes selected. The logical blocks are divided into logical areas. The following paragraphs describe these areas. Figure 2-20 illustrates the areas of the basic topology.

Figure 2-20: Basic Topology
(Diagram not reproduced. It shows, in order, the host application area (LBNs), the replacement control tables or RCT (4 copies, LBNs), the format control tables or FCT (4 copies, XBNs), and the controller diagnostic R/W area (DBNs), with the replacement block area (RBNs, for replacing host application LBNs only) shown beneath. Brackets mark which areas are accessible by host applications software, by host operating system software, and by the controller.)

2.8.1 Host Application Area (LBNs)

The host application area is the largest area. It contains data blocks for use by normal host applications as well as system operating software. This is the area where users store data files and/or programs. System files and system operating software are also stored here. This is the normal working area of the disk. Blocks in the host application area are addressed as logical block numbers (LBNs). This area is also sometimes referred to as the normal LBN area.

2.8.2 Replacement Block Area (RBNs)

Blocks within the replacement block area are used to replace defective blocks in the host application area. When a block in the normal LBN area becomes unusable, the host operating system or the controller may substitute a replacement block for the defective LBN in the host application area. This is accomplished by a process called Bad Block Replacement (BBR). BBR is discussed later in this course. Blocks in the replacement block area are addressed as RBNs. If an RBN becomes defective, another RBN may be used in its place.
RBNs are located in the last logical sector of every track in both the host application area and the RCT area.

2.8.3 Replacement Control Table (RCT) Area (LBNs)

Blocks within the replacement control table area contain information that allows the controller and/or the host operating system software to find blocks from the normal LBN area that have been replaced by blocks in the replacement area. The tables also contain information that identifies which RBNs are still available when performing bad block replacement. Blocks in the RCT are also addressed as LBNs. Blocks in this area are not available for access by normal host applications. These blocks are only accessible by the controller and/or host operating system software (VMS, RSTS, etc.).

Blocks in the RCT are not replaced by RBNs when they become defective. For this reason, multiple copies of the RCT tables are maintained in the RCT area. This redundancy permits backup protection in the event that any blocks in this area become unusable.

2.8.4 Format Control Tables (FCT) Area (XBNs, External Blocks)

Blocks within the format control table area contain the following information:

- Media serial number.
- Date of initial factory formatting.
- Date of most recent formatting.
- Mode in which the media/HDA was formatted. The HDA/pack is available from the factory in either 16-bit mode (512 8-bit bytes per sector) or 18-bit mode (576 8-bit bytes per sector). PDP-11 and VAX processors, for example, require 16-bit mode media, and DECsystem-10 and DECsystem-20 processors require 18-bit mode media. The format mode of an HDA/pack cannot be changed in the field.
- Information to indicate if the rest of the FCT structure contains any valid data.
- Location of the manufacturing-detected bad blocks (sectors).

NOTE
The RA60, RA80, and RA81 are available in both 16-bit and 18-bit configurations. The RA70, RA82, and RA90 are available in only the 16-bit configuration.
During the manufacture of an HDA, special factory scanners are used to locate defective blocks in the media. During factory formatting, this information is recorded into the FCT. Special formatter programs and/or utilities executing within the controller use the manufacturing-detected bad block information in the FCT to create, re-create, or verify the RCT and to replace the bad blocks known to exist when the HDA was manufactured. These special programs are either resident or loaded into the controller from the host for execution, but only upon manual request.

Blocks in the FCT are not replaced by RBNs when they become defective. For this reason, multiple copies of the FCT tables are maintained in the FCT area. This redundancy permits backup protection in the event that any blocks in this area become unusable.

Blocks in the FCT are addressed as external block numbers (XBNs). Blocks in the FCT are only accessible by the controller.

Most external blocks within the FCT contain data that is used to physically locate manufacturing-detected bad blocks (sectors) on the media. This special data within an XBN is referred to as a physical block number (PBN).

2.8.5 Diagnostic Area (Diagnostic Block Numbers)

Blocks in this area are used by the controller to perform read/write diagnostics to the disk drive. Blocks in this area are addressed as diagnostic block numbers (DBNs). Blocks in the DBN area are only accessible by the controller.

Blocks in this area are not replaced by RBNs when they become defective. Also, neither hardware provisions nor DSA specifications govern the protection against defective DBNs. It is the responsibility of the controller and its specially loaded diagnostic software to handle unusable DBNs. Reformatting the DBN area in the field may restore a previously defective DBN, but use this process with caution.
Read data, write data, or format operations to the DBN area require execution of special diagnostic/formatting microcode within the controller itself. Depending upon the controller type, this code may be resident or loaded into the controller from the host. Manual intervention is usually required to invoke these operations.

2.9 IMPLEMENTATION OF LOGICAL AREAS

2.9.1 Drive Topology Maps

Figure 2-21 through Figure 2-28 illustrate how the topology is implemented in the various disks. For example, refer to Figure 2-25 for a map of the RA82 topology. This diagram illustrates how the logical areas are mapped into the RA82 physical environment for a 16-bit formatted HDA. The physical cylinders are shown across the top of the diagram and the physical sectors down the left side. Physical cylinder 0 has been further expanded for clarification. All the sectors within track 0 under head 0 appear in the column marked Head 0. The logical block assignment for LBN 57 in the host application area, for example, appears at the intersection of sector 0 and head 1 (track 1), and so on.

NOTE
These figures are for training purposes only. They do not show the implementation of group offset. Illustrating group offset would require a more complex representation and result in confusion.

Notice that the last sector in each track is allocated as a replacement block (RBN). Even though the RBN area extends physically into the cylinders allocated for the RCT, RBNs are only used to replace bad LBN blocks in the host application area. The RBNs are assigned numbers independently of LBN numbers. Bad block replacement will be discussed later.
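The layout the maps describe, with the last sector of each track held back as an RBN, implies a simple relation between an LBN and its position on the disk. A minimal Python sketch for the 16-bit RA82 follows, using 58 sectors per track and 15 tracks per cylinder from Figure 2-12. Like the figures, it ignores group offset, and it also ignores any blocks that have been replaced. The function name is hypothetical.

```python
SECTORS_PER_TRACK = 58                   # RA82, 16-bit format (Figure 2-12)
LBNS_PER_TRACK = SECTORS_PER_TRACK - 1   # last sector of each track is an RBN
TRACKS_PER_CYLINDER = 15                 # one track per data head


def lbn_to_position(lbn):
    """Return (cylinder, head, sector) for an LBN, ignoring group offset."""
    track, sector = divmod(lbn, LBNS_PER_TRACK)
    cylinder, head = divmod(track, TRACKS_PER_CYLINDER)
    return cylinder, head, sector


print(lbn_to_position(57))   # (0, 1, 0): sector 0 under head 1, as in the text
```

Working the example from the text, LBN 57 is the first block past the 57 LBNs on track 0, so it falls on sector 0 of track 1.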
Figure 2-21 through Figure 2-28: Drive topology maps
(Diagrams not reproduced. Each map lays out physical cylinders, or logical cylinders for the RA60, from the outer guard band to the inner guard band across the top and physical sectors down the left side. In order, the cylinders are allocated to the host application area (LBNs), the RCT (LBNs), the FCT (XBNs), and the controller diagnostic R/W area (DBNs). The last sector of each track is marked as an RBN, and cylinders reserved for internal drive diagnostics are marked in the guard bands. Legible labels on the RA82 map place the host application area in cylinders 0 through 1422, the RCT in 1423 through 1426, the FCT in 1427 through 1430, the DBN area in 1431 through 1434, and the internal diagnostic cylinders at 1435 and 1436.)
2.10 DRIVE INTERNAL DIAGNOSTIC AREA

Many of the DSA drives provide special cylinders for use by the drive-specific internal diagnostics. These are also shown in Figure 2-21 through Figure 2-28 for the various drives. For example, refer to Figure 2-25. The RA82 provides two additional physical cylinders located within the inner guard band area of the disk. These additional cylinders are used for drive-internal diagnostics and are only available to the internal microcode of the drive itself. One cylinder is specially formatted and is used for internal read-only testing. A special utility resident within the RA82 permits reformatting the internal read-only cylinder should it become corrupt. The other cylinder is used for internal R/W testing.

Notice in Figure 2-26 that the RA70 includes special diagnostic cylinders in both the inner and outer guard band areas of the disk.
Controller commands may invoke the drive internal diagnostics; however, drive internal R/W diagnostics are only performed on the specially allocated internal diagnostic cylinders on the inner guard band. These internal diagnostic cylinders are not structured according to DSDF specifications. They do not contain special header codes, EDC, or ECC characters. They merely consist of sectors using special data patterns for internal drive testing purposes.

2.11 BAD BLOCK REPLACEMENT (BBR) and REVECTORING

Occasionally, defects in the disk storage media occur and cause sectors (blocks) to become bad. The header may become corrupt, causing header-not-found or header-compare errors. The data may become corrupt, causing ECC symbol errors. These conditions cause a block to become unusable and result in holes in the disk addressing space.

A technique known as Bad Block Replacement (BBR) was developed to permit replacing a bad logical block with a good replacement block. Once a block is replaced, further attempts to read or write to the bad block (sector) are transferred, or revectored, to the replacement block (sector). This revectoring process is automatic upon each access to the bad block. For this reason, the host always accesses data from a usable block. The disk drive appears to contain a set of contiguous, error-free blocks available to the users or the host.

Bad block replacement is the process of moving data from a bad sector (block) to another good sector (block) and reassigning the block's address from the bad sector to the replacement sector (block).

Revectoring is the process in which read data or write data operations to a block that is bad are rerouted to a replacement block during the read or write transfer operation.

2.11.1 Why Is BBR Performed?

To fill holes in the host applications area address space left by bad blocks (sectors).
To reduce the risk of failure due to progressive deterioration of sectors that have a high ECC symbol error count.

To improve the performance in applications where the error correction or error recovery mechanisms require more time than the revectoring mechanism.

In the current implementation of DSDF, only logical blocks in the host application area are replaced. Bad blocks in the replacement area can also be replaced by other good blocks in the replacement area. Bad blocks in the RCT, FCT, and DBN areas are not replaced. The RCT and FCT each contain multiple, redundant copies of information to provide protection in the event of detecting a bad block in these areas. The DBN area is currently not protected against bad block events.

2.11.2 When is BBR Invoked?

Bad Block Replacement (BBR) is invoked:

When a header becomes corrupt, resulting in header-compare or header-not-found errors.

When ECC errors are detected and the number of symbol errors equals or exceeds the threshold defined by the disk drive. The RA81 threshold is currently set at 6 symbol errors. For example, if a read operation to an RA81 disk drive resulted in an ECC error with 6 or more symbols in error, BBR would be invoked. If, on the other hand, the same read operation resulted in 5 or fewer symbols in error, BBR would not be invoked, the data would merely be corrected, and the data error would not be reported to the host.

When uncorrectable ECC errors occur. This occurs when the number of symbol errors exceeds the correction capability of the controller. UDA50, KDA/KDB50, HSC50, and HSC70 controllers, for example, can correct data with 1 to 8 ECC symbol errors maximum.

2.11.3 Who Detects and Performs BBR?

The controller is responsible for detecting ECC and header errors during read or write operations and, subsequently, setting the BBR flag. Here the term BBR flag means bad block request or bad block replacement request.
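The invocation rules in 2.11.2 can be summarized as a small decision function. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the default of 8 correctable symbols is simply the figure quoted above for UDA50, KDA/KDB50, HSC50, and HSC70 controllers.

```python
def bbr_flag_needed(header_error, symbol_errors, drive_threshold,
                    max_correctable=8):
    """Return True when a read result should set the BBR flag.

    header_error    -- True for header-compare or header-not-found errors
    symbol_errors   -- number of ECC symbols in error (0 for a clean read)
    drive_threshold -- threshold supplied by the drive (6 for the RA81)
    max_correctable -- assumed controller correction capability
    """
    if header_error:
        return True                      # corrupt header
    if symbol_errors > max_correctable:
        return True                      # uncorrectable ECC error
    return symbol_errors >= drive_threshold


if __name__ == "__main__":
    RA81_THRESHOLD = 6
    for symbols in (0, 5, 6, 9):
        print(symbols, bbr_flag_needed(False, symbols, RA81_THRESHOLD))
```

With the RA81 threshold of 6, a read with 5 symbols in error is merely corrected, while reads with 6 or 9 symbols in error both set the flag.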
UDA50 and KDA/KDB50 controllers, for example, do not perform the actual replacement process but, instead, pass the BBR flag to the host. The host system operating software is then responsible for performing the actual block replacement tasks.

HSC50 controllers running microcode Version 200 (or higher) and HSC70 controllers set the BBR flag and also perform the actual block replacement tasks. The host is not burdened with the additional tasks required to accomplish bad block replacement.

2.11.4 How is BBR Performed?

Once a decision has been made to invoke Bad Block Replacement (BBR flag set), bad blocks in the host operating area are replaced using the procedure described here. Refer to Figure 2-29 for a simplified flow diagram and the following numbered steps for a simplified description.

Figure 2-29: Basic BBR Flow

   1   Get data from suspected block
   2   Store data to temp block in RCT
   3   Suppress ECC and error recovery
   4   Test suspected block
   5   Restore error recovery
   6   Original block OK for reuse?
         YES:   7   Restore data to original block
         NO:    8   Locate available RBN
                9   Get data from temp block in RCT
                10  Store data into new RBN block
                11  Update RCT

Basic BBR Flow notes

1. The data is retrieved (reread) from the block suspected to be bad.
2. This data is temporarily stored (written) in a block in the RCT area.
3. The use of hardware-assisted error recovery and ECC correction is suppressed so that the suspect block is tested in its default state.
   NOTE: Any ECC error reported in this state is considered uncorrectable.
4. The block suspected to be bad is tested by performing read and write data operations with user data and inverted user data to verify that the block is indeed bad.
5. Hardware error recovery and ECC correction capability are restored for use by subsequent read and write operations.
6. If the test(s) pass, then the block is considered reusable.
7. Since the block is considered reusable, the original data is retrieved from the RCT (see step #2) and rewritten to the original logical block (sector).
8. If the block failed the test from step #4, the block must be replaced and the original data moved to the replacement block. In this step, an available replacement block (RBN) in the replacement area is located using information found in the RCT.
9. The original data from the bad block is retrieved from the temporary storage block in the RCT (see step #2).
10. The original data is now written into the new replacement block (RBN).
11. Information in the RCT is updated to reflect the replacement process that has just occurred. Future access to the old bad block may require this information to find the new replacement block during revectoring. The new RBN is no longer available for replacement of other blocks.

Bad block replacement cannot be performed if the disk drive is write protected. If a bad block is detected on a disk drive that is write protected, the BBR functions fail with a write-protect error. Also, if the drive becomes write protected after BBR has started, incomplete replacement and possible loss of data could result.

2.11.5 Types of Replacement and Revectoring

There are two types of replacement currently implemented by the SDI:

1. Primary Replacement
   When the selected RBN resides on the same track as the block being replaced, the replacement is called primary. During replacement, the first priority for locating an available RBN is to attempt to locate a primary RBN. This way, subsequent revectoring requires the least amount of time. Refer to Figure 2-29, step #8.

2. Non-Primary Replacement
   When the selected RBN resides on a track other than that containing the block being replaced, the replacement is called non-primary.
   If the primary RBN is not available during bad block replacement, the closest available RBN to the track containing the bad block is selected. Refer to Figure 2-29, step #8. The intent is to minimize the time required to revector to the replacement block during subsequent read or write data operations.

Some of the earlier documentation and utilities used the terms secondary and tertiary. Recent changes to the specifications made these terms obsolete. The current and proper term is non-primary.

2.12 HARDWARE ERROR RECOVERY

2.12.1 RA82 Error Recovery

The RA82 disk drive incorporates a feature known as hardware error recovery. This is implemented as part of the RA82 hardware circuitry. When activated, special circuits alter the characteristics of the read data circuits in the disk drive. Hardware error recovery is typically used to assist the controller during read operations when uncorrectable or unrecoverable errors are detected. This feature enhances the ability of a disk/controller subsystem to recover data that would otherwise be lost when specific media failures are detected.

2.12.2 What are the RA82 Error Recovery Circuits?

The RA82 hardware error recovery circuitry is currently divided into three functional areas. These are described as follows:

1. Decrease read threshold
   When activated, this circuitry decreases the threshold at which the read circuitry detects read pulses from the disk media. This makes the read circuits more sensitive to potentially weak signals from the HDA.

2. Hold-over one-shot
   When activated, this circuitry holds the VCO control voltage stable and prevents large phase errors from occurring due to a momentary loss of read pulses from the disk.

3. Skew read gate
   When enabled, this circuitry introduces a delay of one or two bytes of time between the moment the hybrid module receives the READ GATE signal from the SDI controller and the time the read/write module receives the READ GATE signal from the hybrid module. The amount of delay (skew) changes on each revolution of the disk when the index pulse is received. The skew time is one byte during the first revolution, two bytes during the second revolution, one byte during the third revolution, etc.

2.12.3 When are the Error Recovery Circuits Activated?

The RA82 error recovery circuits are activated only when the SDI controller issues an SDI ERROR RECOVERY command to the drive. When the controller issues the ERROR RECOVERY command to a disk, it also specifies an error recovery level number. This level number tells the disk which combination of error recovery circuits to activate. The controller is not aware of exactly what actions the disk will perform when the ERROR RECOVERY command is issued. It only knows that the disk will alter its R/W hardware characteristics.

The RA82 has seven different levels of error recovery. The circuits that are activated for each level are as follows:

   LEVEL 7   DECREASE READ THRESHOLD (usually the first level tried by the controller)
   LEVEL 6   HOLD-OVER ONE-SHOT
   LEVEL 5   SKEW READ GATE
   LEVEL 4   DECREASE READ THRESHOLD and HOLD-OVER ONE-SHOT
   LEVEL 3   DECREASE READ THRESHOLD and SKEW READ GATE
   LEVEL 2   HOLD-OVER ONE-SHOT and SKEW READ GATE
   LEVEL 1   DECREASE READ THRESHOLD and HOLD-OVER ONE-SHOT and SKEW READ GATE (usually the last level tried by the controller)
   LEVEL 0   NOP (This is the normal default state of the drive where none of the error recovery circuits are activated)

Different SDI disk types may have different levels depending upon the error recovery circuits available within the particular disk drive.
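The RA82 level table above is, in effect, a lookup from level number to a combination of three circuits. A minimal sketch of that mapping follows; the names Circuit, RA82_LEVELS, and skew_bytes are hypothetical and exist only for illustration, not in any drive or controller microcode.

```python
from enum import Flag


class Circuit(Flag):
    """The three RA82 error recovery circuits, one bit each (illustrative)."""
    NONE = 0
    DECREASE_READ_THRESHOLD = 1
    HOLD_OVER_ONE_SHOT = 2
    SKEW_READ_GATE = 4


# Level number -> circuits activated, transcribed from the table in 2.12.3.
RA82_LEVELS = {
    7: Circuit.DECREASE_READ_THRESHOLD,
    6: Circuit.HOLD_OVER_ONE_SHOT,
    5: Circuit.SKEW_READ_GATE,
    4: Circuit.DECREASE_READ_THRESHOLD | Circuit.HOLD_OVER_ONE_SHOT,
    3: Circuit.DECREASE_READ_THRESHOLD | Circuit.SKEW_READ_GATE,
    2: Circuit.HOLD_OVER_ONE_SHOT | Circuit.SKEW_READ_GATE,
    1: (Circuit.DECREASE_READ_THRESHOLD | Circuit.HOLD_OVER_ONE_SHOT
        | Circuit.SKEW_READ_GATE),
    0: Circuit.NONE,  # NOP, the normal default state
}


def skew_bytes(revolution):
    """Skew delay in bytes for a 1-based revolution count: 1, 2, 1, 2, ..."""
    return 1 if revolution % 2 == 1 else 2


if __name__ == "__main__":
    for level in sorted(RA82_LEVELS, reverse=True):
        print(level, RA82_LEVELS[level])
```

Representing each circuit as one bit makes it easy to see that levels 7 through 1 cover every non-empty combination of the three circuits exactly once.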
The disk drive provides the number of error recovery levels it has to the SDI controller during the response to a GET COMMON CHARACTERISTICS command. The RA82 provides the value seven since it supports up to seven levels of error recovery. The RA60, RA80, and RA81, however, do not have error recovery circuits and, therefore, only support error recovery level zero.

2.12.4 RA70 Hardware Error Recovery

The RA70 also incorporates hardware error recovery circuits. Ten error recovery levels can be performed via the controller error recovery commands. Each error recovery level command to the RA70 changes only one recovery parameter of the drive. All other recovery parameters are returned to their normal condition. The default (normal) parameters of the circuits are as follows:

   Normal data read gate is delayed by 3 bytes
   PLO fast lock time is 6.36 microseconds
   Lockup is on the header preamble only
   Read threshold is 50%

The RA70 error recovery levels are divided into two major categories:

1. Drive logic recovery operations which change the electrical characteristics of the read/write path circuits.

   Recovery Level   RA70 Operation Performed
   10               Reduce read threshold to 25%
   9                Shift/delay data read gate by 4 bytes
   8                Shift/delay data read gate by 2 bytes
   7                Shift/delay data read gate by 1 byte
   6                Shift PLO fast lock time to 4.31 usec
   5                Shift PLO fast lock time to 2.23 usec
   4                Shift PLO fast lock time to 8.45 usec
   3                Lockup on both header and data preambles

2. Drive servo error recovery operations which change the servo characteristics of the embedded servo centerline.

   Recovery Level   RA70 Operation Performed
   2                Shift the embedded centerline by -12%
   1                Shift the embedded centerline by +12%
   0                Return all error recovery to normal

2.12.5 How Is Error Recovery Used?
The following paragraphs explain how the error recovery feature is used in a disk subsystem during a read data operation. Refer to Figure 2-30.

Read/Write Error Recovery Flow Notes

First, the controller reads a block of data from the disk drive. If no ECC errors are detected, the data is sent back to the host operating system. If, however, ECC errors are detected, the controller determines if the number of ECC symbols in error equals or exceeds the recommended threshold supplied by the drive. In the case of the RA82, for example, the threshold is 6 symbols. This means that if 5 or fewer ECC symbols were in error for this block, the controller would merely correct the data and send it to the host. If 6 or more ECC symbols were detected in error, the controller would send an error to the host error log and set the BBR (bad block replacement/request) flag. The BBR process is actually invoked at a later time.

Next, the controller determines if the data is correctable. This depends upon the correction capability and the maximum number of symbols that can be corrected by the particular controller. If the data is uncorrectable, the controller usually retries the read data operation. In most cases, the number of retries depends upon the retry count recommended by the drive characteristics. With the RA82, for example, the recommended retry count is 5.

If the data is uncorrectable after all retry operations have been exhausted, the next step is to determine if the particular drive has any hardware error recovery capabilities. For the RA82, 7 levels of error recovery are available. In this case, the controller issues an ERROR RECOVERY command and usually starts with level 7. This causes the RA82 to activate the R/W error recovery circuits corresponding to level 7. The controller now repeats the entire read data block process previously described, including additional retry operations as necessary.
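The read flow of this section (retries first, then the drive's error recovery levels from the highest level down) can be sketched as a nested loop. This is a simplified model under stated assumptions, not controller microcode: read_with_recovery, read_once, set_level, and FakeDrive are hypothetical names; the sketch assumes one initial read plus retry_count retries at each level, assumes the BBR flag stays set once any read reaches the threshold, and assumes the drive is returned to level 0 afterwards.

```python
def read_with_recovery(read_once, set_level, retry_count, levels,
                       bbr_threshold, max_correctable=8):
    """Return (data_ok, bbr_flag, level_used) for one block read.

    read_once()      -- performs one read, returns ECC symbols in error
    set_level(level) -- issues an ERROR RECOVERY command to the drive
    levels           -- number of recovery levels the drive reports
    """
    bbr_flag = False
    # Level 0 is the normal state; then try levels N, N-1, ..., 1.
    for level in [0] + list(range(levels, 0, -1)):
        if level:
            set_level(level)
        for _attempt in range(1 + retry_count):
            symbols = read_once()
            if symbols >= bbr_threshold:
                bbr_flag = True              # error logged, BBR requested
            if symbols <= max_correctable:
                if level:
                    set_level(0)             # assumed: restore normal state
                return True, bbr_flag, level
    if levels:
        set_level(0)
    return False, True, 0                    # uncorrectable error to host


class FakeDrive:
    """Test double: reads are bad (9 symbols) except at one recovery level."""
    def __init__(self, good_level):
        self.level, self.good_level, self.history = 0, good_level, []

    def set_level(self, level):
        self.level = level
        self.history.append(level)

    def read_once(self):
        return 3 if self.level == self.good_level else 9


if __name__ == "__main__":
    drive = FakeDrive(good_level=5)
    print(read_with_recovery(drive.read_once, drive.set_level,
                             retry_count=5, levels=7, bbr_threshold=6))
    print(drive.history)
```

For a drive that reports zero recovery levels, the outer loop runs only once, which matches the case of drives without hardware error recovery: the operation ends when the retries are exhausted.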
If the data block is still uncorrectable after all retries are exhausted during level 7 of error recovery, the controller issues another ERROR RECOVERY command and specifies the next lower level number, or level 6 in this example. Again, the entire operation is repeated. This process continues with level 5, level 4, etc., until the data block is eventually read without an uncorrectable ECC error. If all levels of error recovery are exhausted and the data is still uncorrectable, the controller returns an error to the host.

For disk drives that do not support hardware error recovery, the operation is only performed to the point where all retry operations have been exhausted. At that point, the controller will also return an uncorrectable ECC error to the host.

This discussion on error recovery is a very simplified example of one way that this drive feature is used. Hardware error recovery is not limited to read errors due to ECC error detection. In fact, the controllers may also utilize drive hardware error recovery during read operations where header-related errors and other similar problems are detected.

Figure 2-30: Read/Write Error Recovery

[Flowchart. Box labels: Enter; Controller reads a block from disk; Send data to host; Correct the data; Send error to host, set BBR flag; Increment retry; (Re-read the block); Reset retry count; Set next drive error recovery level.]

2.13 FORCED ERROR

The forced error flag indicates to the host that incorrect data is correctly written into a sector.

When an uncorrectable ECC error is encountered in a block, several attempts are made to read and/or correct the data. If these attempts fail, the block causing the uncorrectable ECC error is assumed to be bad and becomes a candidate for replacement.
During the replacement process, the bad block is read again (including retries and error recovery) in an attempt to extract the data for relocation to the replacement block. If the data is still uncorrectable, the BBR process writes best-guess data into the replacement sector. The result is invalid data being correctly written to a good block. To inform the user that the data was at one time uncorrectable, the forced error flag is attached to the block.

The forced error is indicated by inverting the EDC character during a write to a sector on the disk. Refer to Figure 2-6.

It is the responsibility of the host software or the user to take the necessary steps to correct or replace the data and clear the forced error indicator. The actual methods used depend upon the particular operating system, but the following points should be remembered:

1. Rewriting the block with any data will clear the forced error indicator.
2. Performing a simple read of the block with the forced error and then merely rewriting the data back to the block will result in clearing the forced error flag, but the data in the block will still be invalid. It is the responsibility of the user to ensure that the data rewritten to the block is the desired data.

NOTE: The only reliable technique that should be used to recover from a forced error is to replace the file containing the forced error with a KNOWN GOOD COPY OF THAT FILE.

2.14 LOGICAL BLOCK NUMBER CONVERSION

Example 2-1: RA70/80/81/82/90 LBN to Physical and Logical Parameters

In the formulas below, a result written as Q.Rem denotes a quotient with whole part Q and fractional part .Rem.

   PC    Physical Cylinder             PC.PC_Rem = LBN / BPPC
   PH    Physical Head                 PH = (.PC_Rem * BPPC) / BPPT
   GP    Group (Logical)               GP.GP_Rem = (.PC_Rem * BPPC) / BPG
   TK    Track (Logical)               TK.TK_Rem = (.GP_Rem * BPG) / BPPT
   S     Sector (Logical)              S = .TK_Rem * BPPT   (rounded to nearest whole number)
   SFI   Physical Sector from Index    X.SFI_Rem = ((GP * GP_Offset) + S) / PSPT   (discard X)
                                       SFI = .SFI_Rem * PSPT   (rounded to nearest whole number)

Table 2-2: Values for RA70/80/81/82/90

   Parameter                                        Disk    16-bit   18-bit
   BPPC        Blocks (LBNs) per physical cylinder  RA70    363
                                                    RA80    434      406
                                                    RA81    714      644
                                                    RA82    855
                                                    RA90    897
   BPPT        Blocks (LBNs) per physical track     RA70    33
                                                    RA80    31       28
                                                    RA81    51       46
                                                    RA82    57
                                                    RA90    69
   BPG         Blocks (LBNs) per group              RA70    33
                                                    RA80    434      392
                                                    RA81    51       46
                                                    RA82    57
                                                    RA90    69
   GP_Offset   Group offset                         RA70    8
                                                    RA80    16       16
                                                    RA81    14       12
                                                    RA82    11
                                                    RA90    14
   PSPT        Physical sectors per track           RA70    34
                                                    RA80    32       29
                                                    RA81    52       47
                                                    RA82    58
                                                    RA90    70

Example 2-2: RA81 16-Bit HDA, LBN = 2498

   PC    Physical Cylinder             LBN / BPPC = 2498 / 714 = 3.498                      PC = 3
   PH    Physical Head                 (.PC_Rem * BPPC) / BPPT = (0.498 * 714) / 51 = 6.972   PH = 6
   GP    Group (Logical)               (.PC_Rem * BPPC) / BPG = (0.498 * 714) / 51 = 6.972    GP = 6
   TK    Track (Logical)               (.GP_Rem * BPG) / BPPT = (0.972 * 51) / 51 = 0.972     TK = 0
   S     Sector (Logical)              .TK_Rem * BPPT = 0.972 * 51 = 49.572                   S = 50 (rounded to nearest whole number)
   SFI   Physical Sector from Index    ((GP * GP_Offset) + S) / PSPT = ((6 * 14) + 50) / 52 = 2.576
                                       .SFI_Rem * PSPT = 0.576 * 52 = 29.952                  SFI = 30 (rounded to nearest whole number)

   SUMMARY:
      Physical Cylinder (PC)   = 3
      Physical Head (PH)       = 6
      Group (GP)               = 6
      Track (TK)               = 0
      Logical Sector (S)       = 50
      Phy Sector from Index    = 30

Example 2-3: RA81 16-Bit HDA, LBN = 2499

   PC    Physical Cylinder             LBN / BPPC = 2499 / 714 = 3.5                        PC = 3
   PH    Physical Head                 (.PC_Rem * BPPC) / BPPT = (0.5 * 714) / 51 = 7.0     PH = 7
   GP    Group (Logical)               (.PC_Rem * BPPC) / BPG = (0.5 * 714) / 51 = 7.0      GP = 7
   TK    Track (Logical)               (.GP_Rem * BPG) / BPPT = (0.0 * 51) / 51 = 0.0       TK = 0
   S     Sector (Logical)              .TK_Rem * BPPT = 0.0 * 51 = 0.0                      S = 0 (rounded to nearest whole number)
   SFI   Physical Sector from Index    ((GP * GP_Offset) + S) / PSPT = ((7 * 14) + 0) / 52 = 1.884
                                       .SFI_Rem * PSPT = 0.884 * 52 = 45.96                 SFI = 46 (rounded to nearest whole number)

   SUMMARY:
      Physical Cylinder (PC)   = 3
      Physical Head (PH)       = 7
      Group (GP)               = 7
      Track (TK)               = 0
      Logical Sector (S)       = 0
      Phy Sector from Index    = 46

Example 2-4: RA60 LBN to Physical and Logical Parameters

   LC    Logical Cylinder              LC.LC_Rem = LBN / BPLC
   GP    Group                         GP.GP_Rem = (.LC_Rem * BPLC) / BPG
   TK    Track (Logical)               TK.TK_Rem = (.GP_Rem * BPG) / BPPT
   S     Sector (Logical)              S = .GP_Rem * BPPT   (result rounded to nearest whole number)
         Physical Cylinder             CYL60 = LBN / (4 * BPPC)   (discard remainder)
                                       Physical Cylinder = (4 * CYL60) + GROUP
         Physical Head                 HEAD.Remainder = (LBN - (CYL60 * 4 * BPPC)) / BPLC   (discard remainder)
   SFI   Physical Sector from Index    X.SFI_Rem = ((GP * GP_Offset) + S) / PSPT   (discard X)
                                       SFI = .SFI_Rem * PSPT   (rounded to nearest whole number)

   Parameter                                        16-bit   18-bit
   BPLC        Blocks (LBNs) per logical cylinder   168      152
   BPPT        Blocks (LBNs) per physical track     42       38
   BPG         Blocks (LBNs) per group              42       38
   BPPC        Blocks (LBNs) per physical cylinder  252      228
   PSPT        Physical sectors per track           43       39
   GP_Offset   Group offset                         16       15

Example 2-5: RA60 16-Bit HDA, LBN = 6000

   LC    Logical Cylinder              LBN / BPLC = 6000 / 168 = 35.714                     LC = 35
   GP    Group                         (.LC_Rem * BPLC) / BPG = (0.714 * 168) / 42 = 2.856  GP = 2
   TK    Track (Logical)               (.GP_Rem * BPG) / BPPT = (0.856 * 42) / 42 = 0.856   TK = 0
   S     Sector (Logical)              .GP_Rem * BPPT = 0.856 * 42 = 35.952                 S = 36 (rounded to nearest whole number)
         Physical Cylinder             LBN / (4 * BPPC) = 6000 / (4 * 252) = 5.952          CYL60 = 5 (discard fraction)
                                       (4 * CYL60) + GROUP = (4 * 5) + 2 = 22               Phy Cyl = 22
         Physical Head                 (LBN - (CYL60 * 4 * BPPC)) / BPLC = (6000 - (5 * 4 * 252)) / 168 = 5.714   Physical Head = 5
   SFI   Physical Sector from Index    ((GP * GP_Offset) + S) / PSPT = ((2 * 16) + 36) / 43 = 1.581
                                       .SFI_Rem * PSPT = 0.581 * 43 = 24.98                 SFI = 25 (rounded to nearest whole number)
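The conversions worked through in Examples 2-2, 2-3, and 2-5 can be reproduced with integer division and remainders, which give the same results as the fraction-and-round method without the rounding step. The sketch below is illustrative: the function names and the dictionary layout are hypothetical, and only the 16-bit RA81 and RA60 values quoted in the tables above are filled in.

```python
# 16-bit values from Table 2-2 (RA81) and the RA60 table of Example 2-4.
RA81_16 = {"BPPC": 714, "BPPT": 51, "BPG": 51, "GP_OFFSET": 14, "PSPT": 52}
RA60_16 = {"BPLC": 168, "BPPT": 42, "BPG": 42, "BPPC": 252,
           "GP_OFFSET": 16, "PSPT": 43}


def lbn_to_position(lbn, g):
    """Example 2-1 procedure (RA70/80/81/82/90) in integer arithmetic."""
    pc, pc_rem = divmod(lbn, g["BPPC"])    # physical cylinder, blocks left over
    ph = pc_rem // g["BPPT"]               # physical head
    gp, gp_rem = divmod(pc_rem, g["BPG"])  # logical group
    tk, s = divmod(gp_rem, g["BPPT"])      # logical track and logical sector
    sfi = (gp * g["GP_OFFSET"] + s) % g["PSPT"]  # physical sector from index
    return {"PC": pc, "PH": ph, "GP": gp, "TK": tk, "S": s, "SFI": sfi}


def ra60_lbn_to_position(lbn, g):
    """Example 2-4 procedure (RA60) in integer arithmetic."""
    lc, lc_rem = divmod(lbn, g["BPLC"])    # logical cylinder
    gp, gp_rem = divmod(lc_rem, g["BPG"])  # group
    tk, s = divmod(gp_rem, g["BPPT"])      # logical track and logical sector
    cyl60 = lbn // (4 * g["BPPC"])
    pc = 4 * cyl60 + gp                    # physical cylinder
    ph = (lbn - cyl60 * 4 * g["BPPC"]) // g["BPLC"]  # physical head
    sfi = (gp * g["GP_OFFSET"] + s) % g["PSPT"]
    return {"PC": pc, "LC": lc, "PH": ph, "GP": gp, "TK": tk, "S": s,
            "SFI": sfi}


if __name__ == "__main__":
    print(lbn_to_position(2498, RA81_16))
    print(lbn_to_position(2499, RA81_16))
    print(ra60_lbn_to_position(6000, RA60_16))
```

For LBN 2498 and 2499 on the RA81 and LBN 6000 on the RA60, these functions return the same cylinder, head, group, track, sector, and sector-from-index values as the worked examples.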
Example 2-5 (Cont.): RA60 16-Bit HDA, LBN = 6000

   SUMMARY:
      Physical Cylinder (PC)   = 22
      Logical Cylinder         = 35
      Physical Head (PH)       = 5
      Group (GP)               = 2
      Track (TK)               = 0
      Logical Sector (S)       = 36
      Phy Sector from Index    = 25

Example 2-6: Quick RA60 Head Algorithm

If you know the LBN (Logical Block Number), first determine the logical cylinder:

   LBN / BPLC = Logical Cylinder.Fraction   (discard fraction)
   Logical Cylinder / 6 (heads) = XXX.YYY
   PHYSICAL HEAD = 6 * (.YYY)

   BPLC = Blocks Per Logical Cylinder = 168 (16-bit packs), 152 (18-bit packs)

Using the "quick" RA60 head algorithm for the previous RA60 sample:

   LBN / BPLC = 6000 / 168 = 35.714          LOGICAL CYLINDER = 35 (discard fraction)
   Logical Cylinder / 6 (heads) = 35 / 6 = 5.8333
   6 * 0.8333 (keep fraction) = 4.99         PHYSICAL HEAD = 5

2.15 EXERCISES

At this time complete the following exercises. You may use any reference material to answer the questions.

1. The term sector is used interchangeably with what other term?
   A. GROUP
   B. BLOCK [circled]
   C. TRACK
   D. LOGICAL CYLINDER

2. Which of the following perform bad block replacement?
   A. The host
   B. The drive
   C. The controller
   D. Either the host or the controller [circled]
   E. Either the drive or the controller

3. What would cause bad block replacement (BBR) to be invoked?
   A. ECC error
   B. EDC error
   C. Mis-seek error
   D. Header error
   E. A and D [circled]
   F. A and B

4. Where are the RBNs located on a disk?
   A. On the last sector of each track in the FCT area.
   B. In the outer guard band areas.
   C. On the last sector of each track in the host area.
   D. On the last sector of each track in the host and RCT areas. [circled]

5. Replacement blocks (RBNs) are used to replace logical blocks from which areas?
   A. Defective blocks in the host applications area. [circled]
   B. Defective blocks in the RCT area.
   C. Defective blocks in the FCT area.
   D. Defective blocks in the diagnostic block area.
   E. A and B.

6. How many logical groups are there in a logical cylinder on an RA82 disk?
   A. 8
   B. 15 [circled]
   C. 2
   D. 1

7. What is the forced error flag used for?
   A. Indicate that a block is bad.
   B. Indicate the data in a block was at one time uncorrectable. [circled]
   C. Indicate the block is being tested with forced errors.
   D. Indicate that the header in the block is bad.
   E. A and B.

8. How is drive hardware error recovery activated?
   A. By the drive when an ECC error is detected below drive threshold.
   B. By the controller when an uncorrectable error is detected after all retry operations have been attempted. [circled]
   C. By the drive when an uncorrectable error is detected after all retry operations have been attempted.
   D. By the controller when an ECC error is detected below drive threshold.

CHAPTER 3
DRIVE CHARACTERISTICS

Drive Compare Specifications

                                             RA90       RA70       RA82       RA81
                                             16-bit     16-bit     16-bit     16-bit
   Physical Specifications
      Heads per surface                      --         --         2          2
      Recording surfaces                     13         11         7.5        7.0
      Servo surfaces                         1          --         0.5        0.5
      Data heads                             13         11         15         14
      Blocks per track                       70         34         58         52
      Physical cylinders                     2661       1517       1435       1258
      Tracks per disk                        37254      16687      21525      17612
      Embedded servo                         yes        yes        yes        yes
   Data Specifications
      User blocks (host applications area)   2376153    547041     1216665    891072
      Megabytes per disk
         (host applications area)            1216       280        623        456
      Logical cylinders                      2661       1511       1435       1258
      Tracks per logical group               --         --         --         --
      Groups per logical cyl                 13         11         15         14
   Reserved Space
      Replacement blocks per disk            34463      16611      21405      17528
      Tracks for replacement control table   39         44         60         56
      Tracks for diagnostic use              26         22         60         28
   Recording density
      Tracks per inch (TPI)                  1750       1355       1063       960
      Bits per inch (BPI)                    22839      22437      12800      11400
   Transfer rate
      Burst (MHz)                            22.20      11.6       19.2       17.4
Drive Compare Specifications (continued)

                                             RA90       RA70       RA82       RA81
                                             16-bit     16-bit     16-bit     16-bit
   Positioner access time
      Single track seek (ms)                 5.5        5.5        6.0        7.0
      Average seek (ms)                      19.0       19.5       24.0       28.0
      Total full stroke (ms)                 30.0       35.0       38.0       50.0
      Head switch (ms)                       3.0        4.5        6.0        4

3.1 RA60 Common Characteristics

Returned during SDI Get Common Characteristics command.

   BYTE #   Hex byte            Description
   1        78                  Response Opcode
   2        33                  (3) SDI Version 3.0
                                (3) Short timeout = 8 seconds (2^3)
   3        9E                  Xfer Rate = 15.8 MHz (9E hex = 158 decimal)
   4        F7                  (F) 15 retries for data transfer operation
                                (7) 128 second long timeout (2^7)
   5        06                  6 Copies FCT/RCT, 512 byte mode only (one copy/head)
            86                  6 Copies FCT/RCT, 512 or 576 byte mode (one copy/head)
   6        00                  Error Recovery Level = 0 (0 levels)
   7        04                  ECC Threshold = 4 (# of ECC symbol errors to consider abnormal)
   8        xx                  Microcode Revision Level
   9        0x                  (0) IE=0, No Special Internal Error Log Available
                                (x) Hardware Rev from operator panel
   10-15    xx xx xx 00 00 00   S/N of Drive (byte 10 = Lo, byte 15 = Hi)
   16       04                  Drive type Identifier per MSCP Spec (RA60)
   17       3C                  60 Revs/Second (3C hex = 60 decimal)
   18-23    00 00 00 00 00 00   Error Recovery Threshold (not used in RA60)

3.2 RA60 Subunit Characteristics

Returned during SDI Get Subunit Characteristics command.

   BYTE #   Hex byte            Description
   1        77                  Response Opcode
   2-5      54 09 00 00         00000954 hex = 2388 decimal (LOGICAL) Cylinders in LBN Space
                                (Host Cyls + RCT Cyls, 0 thru 2387)
   6        04                  4 Groups/Cylinder
   7        00                  1st LBN = 0, 1st XBN = 0
   8        01                  1 Track/Group
   9        00                  1st RBN = 0, 1st DBN = 0
   10       81                  1 RBN/Track (RM bit 1, REMOVABLE Media)
   11       00                  Reserved
   12       0D                  13 Words DATA PREAMBLE (for 512-byte mode, 43 sectors)
   13       05                  5 Words HEADER PREAMBLE (for 512-byte mode, 43 sectors)
   14-17    3C 10 A4 22         Media Type Identifier RA60-DJ
   18-19    AC 00               FCT Copy Size - XBNs = 172 (00AC hex = 172 decimal = 4 x 43)

   512-BYTE MODE
   20       2A                  42 LBNs/Track
   21       10                  Group Offset = 16 decimal
   22-25    30 1B 06 00         # of HOST LBNs = 400,176 (= 00061B30 hex)
   26-27    A8 00               RCT Copy Size (LBNs) = 168 decimal (4 x 42) (00A8 hex)

   576-BYTE MODE
   28       26                  38 LBNs/Track
   29       10                  Group Offset = 16 decimal
   30-33    50 86 05 00         # of HOST LBNs = 362,064 (= 00058650 hex)
   34-35    98 00               RCT Copy Size - LBNs = 152 (4 x 38) (0098 hex)

   36-37    06 00               0006 = 6 LOGICAL Cylinders in XBN Space
   38       02                  Size of Diagnostic READ-ONLY DBN Area (Groups) = 2
   39       06                  6 LOGICAL Cylinders in DBN Area

3.3 RA70 Common Characteristics

Returned during SDI Get Common Characteristics command.

   BYTE #   Hex byte            Description
   1        78                  Response Opcode
   2        43                  (4) SDI Version 4.0
                                (3) Short timeout = 8 seconds (2^3)
   3        74                  Xfer Rate = 11.6 MHz (74 hex = 116 decimal)
   4        57                  (5) 5 retries for data transfer operation
                                (7) 128 second long timeout (2^7)
   5        07                  7 Copies FCT/RCT, 512 byte mode only
   6        0A                  Error Recovery Level = 10 decimal (10 levels)
   7        06                  ECC Threshold = 6 (# of ECC symbol errors to consider abnormal)
   8        xx                  Microcode Revision Level
   9        8x                  (1) (Bit 7, IE=1), Special Internal Error Log Available
                                (x) Hardware Rev from microcode
   10-15    xx xx xx 00 00 00   S/N of Drive (byte 10 = Lo, byte 15 = Hi)
   16       12                  Drive type Identifier per MSCP Spec (RA70) (18 decimal, 22 octal)
   17       43                  67 Revs/Second (43 hex = 67 decimal)
   18-23    00 00 00 00 00 00   Error Recovery Threshold (not used in RA70)

3.4 RA70 Subunit Characteristics

Returned during SDI Get Subunit Characteristics command.

   BYTE #   Hex byte            Description
   1        77                  Response Opcode
   2-5      E7 05 00 00         000005E7 hex = 1511 decimal (LOGICAL) Cylinders in LBN Space
                                (Host Cyls + RCT Cyls, 0 thru 1510)
   6        0B                  11 Groups/Cylinder
   7        00                  1st LBN = 0, 1st XBN = 0
   8        01                  1 Track/Group
   9        00                  1st RBN = 0, 1st DBN = 0
   10       01                  1 RBN/Track (RM bit 0, Non-removable Media)
   11       00                  Reserved
   12       0E                  14 Words DATA PREAMBLE (for 512-byte mode, 34 sectors)
   13       09                  9 Words HEADER PREAMBLE (for 512-byte mode, 34 sectors)
   14-17    46 10 64 25         Media Type Identifier
   18-19    CC 00               FCT Copy Size - XBNs = 204 (00CC hex = 204 decimal)

   512-BYTE MODE
   20       21                  33 LBNs/Track
   21       08                  Group Offset = 8 decimal
   22-25    E1 58 08 00         # of HOST LBNs = 547,041 (= 000858E1 hex)
                                (33 LBNs/Track x 11 Heads x 1507 LOGICAL Cylinders)
   26-27    C6 00               RCT Copy Size (LBNs) = 198 decimal (00C6 hex)

   576-BYTE MODE (18-bit HDAs NOT SUPPORTED by RA70)
   28       00                  LBNs/Track
   29       00                  Group Offset
   30-33    00 00 00 00         # of HOST LBNs
   34-35    00 00               RCT Copy Size

   36-37    04 00               0004 = 4 Cylinders in XBN Space
   38       0B                  Size of Diagnostic READ-ONLY DBN Area (Groups) = 11 decimal
   39       02                  2 Cylinders in DBN Area

3.5 RA80 Common Characteristics

Returned during SDI Get Common Characteristics command.

   BYTE #   Hex byte            Description
   1        78                  Response Opcode
   2        33                  (3) SDI Version 3.0
                                (3) Short timeout = 8 seconds (2^3)
   3        61                  Xfer Rate = 9.7 MHz (61 hex = 97 decimal)
   4        57                  (5) 5 retries for data transfer operation
                                (7) 128 second long timeout (2^7)
   5        04                  4 Copies FCT/RCT, 512 byte mode only (HDA jumper in)
            84                  4 Copies FCT/RCT, 512 or 576 byte mode (HDA jumper out)
   6        00                  Error Recovery Level = 0 (0 levels)
   7        02                  ECC Threshold = 2 (# of ECC symbol errors to consider abnormal)
   8        xx                  Microcode Revision Level
   9        0x                  (0) IE=0, No Special Internal Error Log Available
                                (x) Hardware Rev from operator panel
   10-15    xx xx xx 00 00 00   S/N of Drive (byte 10 = Lo, byte 15 = Hi)
   16       01                  Drive type Identifier per MSCP Spec (RA80)
   17       3C                  60 Revs/Second (3C hex = 60 decimal)
   18-23    00 00 00 00 00 00   Error Recovery Threshold (not used in RA80)

3.6 RA80 Subunit Characteristics

Returned during SDI Get Subunit Characteristics command.

   BYTE #   Hex byte            Description
   1        77                  Response Opcode
   2-5      13 01 00 00         00000113 hex = 275 decimal (LOGICAL) Cylinders in LBN Space
                                (Host Cyls + RCT Cyls, 0 thru 274 logical, 0 thru 558 physical)
   6        02                  2 Groups/Cylinder
   7        00                  1st LBN = 0, 1st XBN = 0
   8        0E                  14 Tracks/Group
   9        00                  1st RBN = 0, 1st DBN = 0
   10       01                  1 RBN/Track (RM bit 0, Non-removable Media)
   11       00                  Reserved
   12       0B                  11 Words DATA PREAMBLE (for 512-byte mode, 32 sectors)
   13       04                  4 Words HEADER PREAMBLE (for 512-byte mode, 32 sectors)
   14-17    50 10 64 25         Media Type Identifier DU RA80
   18-19    E0 01               FCT Copy Size - XBNs = 480 (01E0 hex = 480 decimal = 15 x 32)

   512-BYTE MODE
   20       1F                  31 LBNs/Track
   21       10                  Group Offset = 16 decimal
   22-25    9C 9E 03 00         # of HOST LBNs = 237,212 (= 00039E9C hex)
                                (31 LBNs/Track x 14 Heads x 2 Groups x 273 LOGICAL Cylinders
                                + 248 LBNs borrowed from RCT)
   26-27    D1 01               RCT Copy Size (LBNs) = 465 decimal (15 x 31) (01D1 hex)

   576-BYTE MODE
   28       1C                  28 LBNs/Track
   29       10                  Group Offset = 16 decimal
   30-33    F0 44 03 00         # of HOST LBNs = 214,256 (= 000344F0 hex)
                                (28 LBNs/Track x 14 Heads x 2 Groups x 273 LOGICAL Cylinders
                                + 224 LBNs borrowed from RCT)
   34-35    A4 01               RCT Copy Size - LBNs = 420 (15 x 28) (01A4 hex)

   36-37    02 00               0002 = 2 Cylinders in XBN Space
   38       01                  Size of Diagnostic READ-ONLY DBN Area (Groups) = 1
   39       02                  2 Cylinders in DBN Area

3.7 RA81 Common Characteristics

Returned during SDI Get Common Characteristics command.

   BYTE #   Hex byte            Description
   1        78                  Response Opcode
   2        33                  (3) SDI Version 3.0
                                (3) Short timeout = 8 seconds (2^3)
   3        AE                  Xfer Rate = 17.4 MHz (AE hex = 174 decimal)
   4        57                  (5) 5 retries for data transfer operation
                                (7) 128 second long timeout (2^7)
   5        04                  4 Copies FCT/RCT, 512 byte mode only (HDA jumper in)
            84                  4 Copies FCT/RCT, 512 or 576 byte mode (HDA jumper out)
   6        00                  Error Recovery Level = 0
   7        06                  ECC Threshold = 6 (# of ECC symbol errors to consider abnormal)
   8        xx                  Microcode Revision Level
   9        0x                  (0) IE=0, No Special Internal Error Log Available
                                (x) Hardware Rev from operator panel
   10-15    xx xx xx 00 00 00   S/N of Drive (byte 10 = Lo, byte 15 = Hi)
   16       05                  Drive type Identifier per MSCP Spec (RA81)
   17       3C                  60 Revs/Second (3C hex = 60 decimal)
   18-23    00 00 00 00 00 00   Error Recovery Threshold (not used in RA81)

3.8 RA81 Subunit Characteristics

Returned during SDI Get Subunit Characteristics command.

   BYTE #   Hex byte            Description
   1        77                  Response Opcode
   2-5      E4 04 00 00         000004E4 hex = 1252 decimal (LOGICAL) Cylinders in LBN Space
                                (Host Cyls + RCT Cyls, 0 thru 1251)
   6        0E                  14 Groups/Cylinder
   7        00                  1st LBN = 0, 1st XBN = 0
   8        01                  1 Track/Group
   9        00                  1st RBN = 0, 1st DBN = 0
   10       01                  1 RBN/Track (RM bit 0, Non-removable Media)
   11       00                  Reserved
   12       13                  19 Words DATA PREAMBLE (for 512-byte mode, 52 sectors)
   13       0C                  12 Words HEADER PREAMBLE (for 512-byte mode, 52 sectors)
   14-17    51 10 64 25         Media Type Identifier DU RA81
   18-19    0C 03               FCT Copy Size - XBNs = 780 (030C hex = 780 decimal =
15 x 52 ) 3-18 Digital Internal Use Only RA81 Subunit Characteristics During SDI Get Subunit Characteristics Command BYTE # Hex byte ********************* 512-BYTE MODE ************************** 20 33 51 LBNs/Track 21 OE Group Offset 22 23 24 25 CO 98 OD 00 26 FD 27 02 14 Decimal # of HOST LBNs 891,072 (= OOOD98CO hex) (51 LBNs/Track x 14 Heads 1248 LOGICAL Cylinders RCT Copy Size (LBNs) hex) 765 decimal (15x51) (02FD ********************* 576-BYTE MODE ************************** 28 2E 46 29 OC Group Offset 30 31 32 33 80 43 OC 00 # of HOST LBNs 34 35 B2 02 RCT Copy Size - LBNs LBNs/Track ..... 12 decimal 803,712 (= OOOC4380 hex) (46 LBNs/Track x 14 Heads X 1248 LOGICAL Cylinders 690 (15x46) (2B2 hex) ************************************************************* 36 37 04 00 0004 = 4 Cylinders in XBN Space 38 OE Size of Diagnostic READ-ONLY DBN Area (Groups) 39 02 2 Cylinders in DBN Area 14 Digital Internal Use Only 3-19 RA81 Subunit Characteristics During SDI Get Subunit Characteristics Command 3-20 Digital Internal Use Only RA82 Common Characteristics During SOl Get Common Characteristics Command 3.9 RA82 Common Characteristics BYTE # Hex byte 1 78 Response Opcode 2 43 (4) (3) 3 co Xfer Rate 4 57 (5) (7) 5 04 4 Copies FCT/RCT, 512 byte mode only, 6 07 Error Recovery Level 7 06 ECC Threshold = 6 8 xx Microcode Revision Level 9 Ox (0) SOl Version 4.0 Short timeout = 8 seconds (2 A 3) = 19.2 Mhz (CO hex = 192 Decimal) 5 retries for data transfer operation , 128 second long timeout (2 A 7) Cn1r:r-;( SOT Spet.j loy =7 (HDA jumper in) (7 levels) (# of ECC errors to consider abnormal) IE=O, No Special Internal Error Log Available (x) Hardware Rev from operator panel 10 11 12 13 14 15 xx xx 16 OB Drive type Identifier per MSCP Spec (11 decimal, 13 Octal, RA82) 17 3C 60 Revs/Second 18 19 20 21 22 23 00 00 00 00 00 00 xx 00 00 00 Lo 1 1--- SiN of Drive 1 I Hi - I 1--- Error Recovery Threshold (not used in RA82) 1 I Digital Internal Use Only 3-21 RA82 Subunit 
Characteristics During SDI Get Subunit Characteristics Command DSA SUPPORT SEMINAR 3.10 RA82 Subunit Characteristics BYTE * Hex byte 1 77 2 3 5 93 05 00 00 6 OF 15 Groups/Cylinder 7 00 1st XBN = 0 8 01 1 Track/Group 9 00 1st DBN 10 01 1 RBN/Track 11 00 Reserved 12 12 18 Words DATA PREAMBLE 13 06 6 Words HEADER PREAMBLE (for 512-byte mode, 58 sectors) 14 15 16 17 52 10 64 25 18 19 AO 03 4 Response Opcode 1--- 00000593 hex = 1427 Decimal LOGICAL Cylinders in LBN 1 Space (Host Cyls + RCT Cyls, 0 thru 1426) =0 1st LBN o 1st RBN o (RM bit 0, Non-removable Media) (for 512-byte mode, 58 sectors) 1--- Media Type Identifier DURA 82 1 Lo Hi FCT Copy Size - XBNs = 928 ( 03AO hex = 928 Decimal= 16 x 58 ) 3-22 Digital Internal Use Only RA82 Subunit Characteristics During SDI Get Subunit Characteristics Command BYTE #" Hex byte ********************* 512-BYTE MODE ************************** 20 39 57 LBNs/Track 21 OB Group Offset 22 23 24 25 99 90 12 00 #" of 26 27 90 03 RCT Copy Size (LBNs) HO~T 11 Decimal LBNs 1216665 (= 129099 hex) (57 LBNs/Track x 15 Heads x 1423 Cylinders) 912 (16x57) (390 hex» ********************* 576-BYTE MODE ************************** (There are no plans to implement actual 18-bit HDA's at the present.) 
28 33 51 LBNs/Track 29 OE Group Offset 30 31 32 33 53 9C 10 00 # of HOST LBNs 34 35 30 03 RCT Copy Size - LBNs 14 1,088,595 (= 109C53 hex) (51 LBNs/Track x 15 Heads x 1423 Cylinders) 816 (16x51) (330 hex) ************************************************************* = 36 37 04 00 38 OF Size of Diagnostic READ ONLY DBN Area (Groups) 39 04 4 Cylinders in DBN Area 0004 4 Cylinders in XBN Space 15 Digital Internal Use Only 3-23 RA82 Subunit Characteristics During SOl Get Subunit Characteristics Command 3-24 Digital Internal Use Only RA90 Common Characteristics During SOl Get Common Characteristics Command 3.11 RA90 Common Characteristics BYTE -it Hex byte 1 78 Response Opcode 2 43 (4) (3) 3 DD Xfer Rate = 22.198 Mhz ·4 57 (5) (7) 5 04 4 Copies FCT/RCT, 512 byte mode only, 6 OC 7 06 ECC Threshold abnormal) 8 xx Microcode Revision Level 9 xx Bit<0:3> Bit<4> Bit<5:6> Bit<7> 10 11 12 13 14 15 xx xx xx 00 00 00 Lo 16 13 Drive type Identifier per MSCP Spec 23 Octal, RA90) 17 3C 60 Revs/Second 18 19 20 21 22 23 00 00 00 00 00 00 * SDI Version 4.0 Short timeout 8 seconds (2"3) (DD hex = 221 Decimal) 5 retries for data transfer operation 128 second long timeout (2"7) (HDA jumper in) Error Recovery Level = TBD (12 as implemented to date 12-15-87, subject to change) Hi - = 6 (-it of ECC symbol errors to consider Hardware rev~s~on from switches 1 = Embedded Servo Enabled HDA revision bits IE=l, Special internal error log available (Drive Serial number is determined SIN of Drive by switches on the flex cable, a part of the chassis assy. 
(item 7 of 70-22941-01) Decode: CXO mfg drive Bit <19:18> = 00 Bit <17:00> 1 thru 262143 Bit <19:18> = 01 CXO mfg drive Bit <17:00> = 262144 thru 309,999 !{BO mfg drive Bit <19:18> = 10 Bit <17:00> 1 thru 262143 TBD plant mfg drive Bit <19:18> = 11 Bit <17:00> 1 thru 262143 (3C hex = (19 Decimal, 60 decimal) 1 1--- Error Recovery Threshold (not used in RA90) 1 1 Digital Internal Use Only 3-25 RA90 Subunit Characteristics During SOl Get Subunit Characteristics Command 3.12 RA90 Subunit Characteristics BYTE * Hex byte 1 77 2 4 S 5B OA 00 00 6 OD 13 Groups/Cylinder 7 00 1st XBN 8 01 1 9 00 1st DBN 10 3 Response Opcode 1-- 0000A5B hex = 2651 Decimal {LOGICAL} Cylinders in LBN Space (Host Cyls + RCT Cyls, 0 thru 2650) . 1 =0 1st LBN o =0 1st RBN o 01 1 RBN/Track (RM bit 0, Non-removable Media) 11 00 Reserved 12 OE 14 Words DATA PREAMBLE 13 05 5 Words HEADER PREAMBLE (for 512-byte mode, 70 sectors) 14 15 16 17 SA 10 64 25 18 19 76 02 Track/Group 1--- Media Type Identifier DURA 90 1 Lo Hi FCT Copy Size - XBNs ( 0276 hex 3-26 (for 512-byte mode, 70 sectors) Digital Internal Use Only = 630 = 630 Decimal) RA90 Subunit Characteristics During SOl Get Subunit Characteristics Command BYTE * He~: byte ********************* 512-BYTE MODE ************************** 20 45 69 LBNs/Track 21 OE Group Offset 22 23 24 25 D9 41 24 00 26 27 9E 01 14 Decimal * of HOST LBNs 2,376,153 (= 002441D9 hex) (69 LBNs/Track x 13 Heads x 2649 LOGICAL Cylinders RCT Copy Size (LBNs) 414 decimal (19E hex) ********************* 576-BYTE MODE ************************** 18-Bit HDA's NOT SUPPORTED by RA90 28 00 LBNs/Track 29 00 Group Offset 30 31 32 33 00 00 00 00 34 35 00 00 * of HOST LBNs RCT Copy Size ************************************************************* 36 37 03 00 38 00 Size of Diagnostic READ-ONLY DBN Area (Groups) 39 02 2 Cylinders in DBN Area 0003 = 3 Cylinders in XBN Space 13 decimal Digital Internal Use Only 3-27 RA90 Subunit Characteristics During SDI Get Subunit Characteristics Command 
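The multi-byte fields in the subunit characteristics tables are listed low-order byte first, and the "# of HOST LBNs" notes give the arithmetic behind each value. The following is a minimal sketch that cross-checks those notes for the RA70, RA81, RA82, and RA90. The dictionary layout and the function names are illustrative only and are not part of any DEC tool; the host cylinder counts are copied from the notes in the tables. The RA80 is left out because its count includes LBNs borrowed from the RCT.

```python
def le(byte_values):
    """Combine bytes listed low-order first (as in the tables) into one integer."""
    return int.from_bytes(bytes(byte_values), "little")


# Values copied from the subunit characteristics tables (512-byte mode).
# Small single-byte fields are written in decimal for readability.
DRIVES = {
    "RA70": {"cyl_bytes": [0xE7, 0x05, 0x00, 0x00], "groups_per_cyl": 11,
             "tracks_per_group": 1, "lbns_per_track": 33,
             "host_lbn_bytes": [0xE1, 0x58, 0x08, 0x00], "host_cyls": 1507},
    "RA81": {"cyl_bytes": [0xE4, 0x04, 0x00, 0x00], "groups_per_cyl": 14,
             "tracks_per_group": 1, "lbns_per_track": 51,
             "host_lbn_bytes": [0xC0, 0x98, 0x0D, 0x00], "host_cyls": 1248},
    "RA82": {"cyl_bytes": [0x93, 0x05, 0x00, 0x00], "groups_per_cyl": 15,
             "tracks_per_group": 1, "lbns_per_track": 57,
             "host_lbn_bytes": [0x99, 0x90, 0x12, 0x00], "host_cyls": 1423},
    "RA90": {"cyl_bytes": [0x5B, 0x0A, 0x00, 0x00], "groups_per_cyl": 13,
             "tracks_per_group": 1, "lbns_per_track": 69,
             "host_lbn_bytes": [0xD9, 0x41, 0x24, 0x00], "host_cyls": 2649},
}


def expected_host_lbns(drive):
    """LBNs/track x heads x host cylinders, where heads = groups x tracks/group."""
    heads = drive["groups_per_cyl"] * drive["tracks_per_group"]
    return drive["lbns_per_track"] * heads * drive["host_cyls"]


for name, drive in DRIVES.items():
    reported = le(drive["host_lbn_bytes"])
    print(name, "LBN-space cylinders:", le(drive["cyl_bytes"]),
          "host LBNs:", reported,
          "matches formula:", expected_host_lbns(drive) == reported)
```

A mismatch between the reported count and the formula is not necessarily a fault: as the RA80 table shows, a drive may borrow LBNs from the RCT area.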
DRIVE CHARACTERISTICS QUIZ

STUDENT QUIZ: DSDF Drive Characteristics

3.13 Student Exercises

1. ECC error handling is performed by which of the following?
   A. The drive
   B. The controller
   C. The host
   D. The drive if the ECC threshold is exceeded

2. The number of multiple copies of the RCT and the FCT for each drive is:
   A. 4 copies of RCT/FCT
   B. 6 copies of RCT/FCT
   C. 7 copies of RCT/FCT
   D. Depends upon drive type

3. ECC threshold value is derived from which of the following?
   A. The drive
   B. The controller
   C. The host hardware
   D. The host software if BBR is supported

4. The purpose of the ECC character is:
   A. Only detect disk transfer errors.
   B. Only detect controller internal data path errors.
   C. Detect disk transfer errors and provide for data correction.
   D. Detect controller internal data path errors and provide for data correction.

5. The purpose of the EDC character is:
   A. Only detect disk transfer errors.
   B. Only detect controller internal data path errors.
   C. Detect disk transfer errors and provide for data correction.
   D. Detect controller internal data path errors and provide for data correction.

6. Multiple copies of the RCT and FCT are located where?
   A. Usually on the same cylinder.
   B. Usually on the same track.
   C. Usually distributed across the same media surface.
   D. Usually distributed across different heads and cylinders.

7. What are the addressing characteristics of the RA70?
   A. 1 track per group and 11 groups per logical cylinder.
   B. 11 tracks per group and 1 group per logical cylinder.
   C. 1 track per group and 2 groups per logical cylinder.
   D. 11 tracks per group and 11 groups per logical cylinder.

8. A subsystem consisting of an HSC and an RA81 disk drive detects uncorrectable ECC errors. What is the error recovery technique used if the ECC error continues to be uncorrectable?
   A. The controller will retry the operation at least 14 times.
   B. The drive will retry the operation at least 14 times.
   C. The controller will retry the operation at least 5 times.
   D. The drive will retry the operation at least 5 times.

9. Which of the following statements is false?
   A. The ECC character protects both the data field and EDC character in a sector.
   B. The EDC character is used to detect internal controller parallel data path problems.
   C. The ECC character is used to detect serial read/write disk data path problems.
   D. The ECC character only protects the data field of a sector.

[Figures: sector format drawings for RBN (CXO-680B), DBN (CXO-2382A), and XBN (CXO-2383A) sectors. The drawings did not survive reproduction; the recoverable field sequence and legend text follow.]

Field sequence (from INDEX): HEADER PREAMBLE, HEADER SYNC, HEADER COPY 1, HEADER COPY 2, HEADER COPY 3, HEADER COPY 4, SLOP, SPLICE, DATA PREAMBLE, DATA SYNC, DATA, EDC, PAD (0s), ECC, DATA POSTAMBLE, WRITE-TO-READ RECOVERY, REINSTRUCT TIME.

HEADER PREAMBLE - Allows the drive's PLO to settle before header sync.
HEADER SYNC - Allows the PLO to sync up.
SLOP - Gives the controller time to switch between reading the header and writing the data preamble.
SPLICE - Time needed for transmission delays, header compare time, and PLO lock time.
DATA PREAMBLE - Allows time for the header compare and for the PLO to sync up.
DATA SYNC - Allows the PLO to sync up.
WRITE-TO-READ RECOVERY - Time necessary for write recovery plus 48 state bit times.
REINSTRUCT TIME - Time allotted while the controller is cleaning up the current sector transfer and sending the command for the next one.
PAD - 8 data-byte zeros for 512-byte mode; 6 data-byte zeros for 576-byte mode.

Header CODE values shown in each drawing:

RBN sector:  06 - Usable replacement sector (RBN)
             11 - Unusable sector (RBN)
DBN sector:  14 - Usable diagnostic sector (DBN)
             11 - Unusable sector (DBN)
XBN sector:  12 - Usable external sector (XBN)
             11 - Unusable sector (XBN)

NOTE: Only the factory scanner currently writes the XBN header codes.

Simplified Summary of Header Codes Lesson 3

4.1 Simplified Summary of Header Codes

The controller reads disk headers when it is searching for a block of data. Each header contains a 4-bit code and a 28-bit block address field. The 4-bit code field contains information to tell the controller where the data can be found. It is the controller's responsibility to determine from the header code where the data resides and to retrieve the data. Since the disk is divided into different areas (HOST/RCT, RBN, DBN, XBN), codes for each area are used to protect against invalid access.

4-Bit Header Codes for LBN (LBNs in Host Applications Area)

00  This is a usable LBN. This code directs the controller to access the data following the header information just read.

03  This LBN is unusable and has been replaced by a non-primary RBN. This code indicates to the controller that the data following the header just read is invalid and directs the controller to retrieve the data from an RBN that is located on a different track than the track containing this LBN. The controller will use the RCT information to determine exactly what RBN was used.

05  This LBN is unusable and has been replaced by a primary RBN. This code indicates to the controller that the data following the header just read is invalid and directs the controller to retrieve the data from the RBN at the end of the current track.
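The host-area header codes can be illustrated with a short sketch that splits a header into its 4-bit code and 28-bit block address and looks up the action the controller takes. This is a sketch under stated assumptions, not the controller's implementation: the text gives only the field widths, so placing the code in the top four bits of a 32-bit value is an assumption, and the names `split_header` and `host_lbn_action` are hypothetical.

```python
# Octal header codes for LBNs in the host applications area, as listed above.
HOST_LBN_CODES = {
    0o00: "usable LBN: read the data that follows this header",
    0o03: "replaced by a non-primary RBN: look up the RBN in the RCT",
    0o05: "replaced by the primary RBN: read the RBN at the end of this track",
}


def split_header(header):
    """Split a 32-bit header into (code, block address).

    ASSUMPTION: the 4-bit code occupies the top four bits and the 28-bit
    block address the remaining bits. The text states only the field widths.
    """
    code = (header >> 28) & 0xF
    block = header & 0x0FFFFFFF
    return code, block


def host_lbn_action(header):
    """Return the block address and a description of the controller's action."""
    code, block = split_header(header)
    action = HOST_LBN_CODES.get(code, "code not valid for a host-area LBN")
    return block, action


# Example: LBN 1234 whose header says it was revectored to its primary RBN.
example = (0o05 << 28) | 1234
print(host_lbn_action(example))
```

Returning a description for unknown codes, rather than raising an error, mirrors the point made in the text that the codes exist to guard against invalid access to an area.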
4-Bit Header Codes for LBN (LBNs in RCT Area)

00  This is a usable LBN. This code directs the controller to access the data following the header information just read.

11  This is an unusable LBN (not replaced). This code indicates to the controller that the data following the header just read is invalid and directs the controller to retrieve the data from the next copy of the RCT. If all copies of the RCT are unreadable, an uncorrectable error is reported.

4-Bit Header Codes for RBN

06  This is a usable RBN. This code directs the controller to access the data following the header information just read.

11  This is an unusable RBN. This code indicates to the controller that the data following the header just read is invalid and directs the controller to retrieve the data from another RBN that is located on a different track than the track containing this RBN. The controller will use the RCT information to determine exactly what RBN was used.

4-Bit Header Codes for DBN

14  This is a usable DBN. This code directs the controller to access the data following the header information just read.

11  This is an unusable DBN. This code indicates to the controller that the data following the header just read is invalid. There is no multi-copy protection for DBNs, and DBNs are not replaced if they become defective. The controller will report an uncorrectable error if the data is not retrievable.

4-Bit Header Codes for XBN

12  This is a usable XBN (usable external sector).

11  This is an unusable XBN.

[Figure: drive read/write data path block diagram (CXO-2385A). Recoverable block labels: R/W DATA CONTROL, TRANSLATE, CPU CONTROL, R/W ENCODE/DECODE, HDA AND PREAMP, SLAVE CPU CONTROL, R/W MODULE.]

[Figure: DBN Area and R/W Data Paths (CXO-2379A). Recoverable content:]

RA81 logical cylinder layout, from outer guard band (OGB) to inner guard band (IGB):

  LBNs   cylinders 0-1247
  RCTs   cylinders 1248-1251
  FCTs   cylinders 1252-1255
  DBNs   cylinders 1256-1257

  1 track per group, 14 groups per logical cylinder.
  456 Mbyte, 1258 cylinders/head, 52 sectors.
  Diagnostic cylinders in the guard band are for internal drive diagnostics only:
  cylinder 1261 = R/W, cylinder 1262 = read-only.

CHAPTER 6
REPLACEMENT CONTROL TABLE (RCT)

Replacement Control Table Lesson 5

Figure 6-1: Simplified Replacement and Control Table

  SECTOR 0   RCT control block
  SECTOR 1   Stored host LBN copy during BBR
  SECTOR 2   32-bit descriptors for RBN 0 through RBN 127
             (128 32-bit descriptors per sector, 1 descriptor per RBN)
  SECTOR 3   32-bit descriptors starting at RBN 128
  ...
  SECTOR N
  (CXO-2370A)

[Figure: Replacement Control Table overview (CXO-2366A). Recoverable content:]

  RCT_BLOCK 0              Replacement control information
  RCT_BLOCK 1              Replaced LBN image
  RCT_BLOCK 2              128 replacement block descriptors (descriptor 00 through 127)
  RCT_BLOCK 3              128 replacement block descriptors
  ...
  RCT_BLOCK RCT.MINIMUM-1  128 replacement block descriptors, with null entries for filler

  Each copy of the table is one of multiple copies (see note 3).

  Replacement control information (RCT_BLOCK 0):

  WORD 0-3    64-bit volume ID assigned at factory format time
  WORD 4      Flags (see bit descriptions below)
  WORD 5      Reserved
  WORD 6-7    LBN being replaced; valid only if phase 1 flag set
  WORD 8-9    Replacement RBN; valid only if phase 2 flag set
  WORD 10-11  Bad RBN; valid only if BR flag set
  Remainder   Reserved

  Word 4 bit descriptions:

  BIT 15  P1 - Phase 1 flag. When set, determination of whether the block is really bad is in progress.
  BIT 14  P2 - Phase 2 flag. Replacement of the block, if it is indeed bad, is in progress.
  BIT 13  BR - Bad RBN flag, indicating the replacement in progress was caused by a bad RBN.
  BIT 7   FE - Forced error flag, indicating the replacement process should set the forced error indicator in the replacement block.

  Replacement descriptor codes (identify the disposition of the RBN on a track):

  00 - Not allocated
  02 - Allocated primary
  03 - Allocated non-primary
  04 - Unusable RBN
  05 - Treat as 04
  10 - Null

  RBN header codes:

  Code of 06 = usable RBN, data area can be read
  Code of 11 = unusable RBN, data area is unusable

  NOTES:
  1. RBN descriptor codes are different from header codes.
  2. RBN header codes are different from LBN header codes.
  3. Each copy is placed on different surfaces and different cylinders for protection.

6.1 THE REPLACEMENT CONTROL TABLE

The replacement control table (RCT) records the status of each replacement block on the unit and the location of all revectored logical blocks. The RCT is a multi-copy structure. The subsystem provides the host with the number of copies of the RCT and an offset which enables the host to compute the location of the next copy of an RCT block.

The RCT is a two-part structure. The first part of the structure contains two blocks: a flags/control block and a temporary data storage block. The second part of the structure is an array of replacement block descriptors with an entry for each replacement block on the unit, and it is organized in ascending RBN order. There are as many sectors in the second half of the table as are required for replacement block descriptor storage.

There are n copies of the RCT in the RCT area, where n is a device characteristic.
Each copy of the RCT is located "rct" LBNs from the previous copy. Copy 1 of the RCT is the base copy. The remaining copies provide individual backup blocks for the corresponding blocks in the base copy of the RCT. Both n, the number of RCT copies, and "rct," the offset to the next RCT copy, are passed to the host as unit characteristics in the response to the MSCP GET UNIT STATUS command.

While the size of the host application area is specified to the host, the size of the RCT area is not specified. The host is guaranteed that the RCT area will be at least large enough to contain n copies of the RCT. If any blocks in the RCT area are not actually used by an RCT copy, they are reserved and are not to be used by the host.

The following restrictions apply to RCT space access:

1. The subsystem must prohibit spiraling from the host application portion of LBN space into the RCT space.

2. I/O to the RCT must be a single block operation. This requirement does not have to be enforced by the subsystems, but it is required by the replacement algorithms. Transfers other than one block in length may have undefined results.

3. Any portion within the RCT space that is not used for a replacement control table is controller specific and must not be accessed by the host. Host access to any part of the RCT space other than within a replacement table may have undefined results.

4. Host write access to the RCT is prohibited during controller-initiated bad block replacement (BBR).

5. Controller write access to the RCT is prohibited during host-initiated BBR.

6.2 RBN DESCRIPTOR FORMAT

Each entry in the second part of the replacement control table points to a replacement block on the unit. The table is in ascending RBN order. Thus, the first entry corresponds to RBN 0 on the unit, the second entry corresponds to RBN 1 on the unit, etc.

Entries that do not correspond to RBNs on the unit may be present to pad the RCT to a block boundary. Any entry which does not correspond to an RBN on the unit is called a null entry. There is always one null entry at the end of the RCT to demarcate the end of the table. All other entries past this last null entry are undefined.

The format of a replacement block descriptor in the replacement control table is:

Figure 6-3: Replacement Block Descriptor

  +--------------------------------+
  |           LBN (LOW)            |
  +--------+-----------------------+
  |  CODE  |      LBN (HIGH)       |
  +--------+-----------------------+
  (CXO-2367A)

LBN is the logical block number of a revectored logical block.

CODE is one of the following octal values:

00 - Unallocated (empty) replacement block.
02 - Allocated replacement block - primary replacement block.
03 - Allocated replacement block - non-primary replacement block.
04 - Unusable replacement block.
05 - Alternate unusable replacement block. Code 05 is reserved. Programs should treat this code as if it were code 04.
10 - Null entry - no corresponding replacement block.

For codes 00, 04, and 10 the LBN field is always zero.

6.3 PHYSICAL LAYOUT OF THE RCT

The n copies of the RCT are stored at the highest addresses of the LBN space. Each sector in the second part of the RCT contains 128 entries, regardless of the actual disk format (bytes 512 through 575 of 576-byte sectors are zero filled).

The size of the copies must be adjusted so that corresponding blocks of each copy are accessed using physically distinct components to the extent possible. This implies that:

- If the number of copies is less than or equal to the number of heads, then corresponding blocks of each copy must be accessed by different heads.

- If the number of copies is greater than the number of heads, then corresponding blocks of each copy must be distributed as evenly as possible across the heads.

- If a device uses a dedicated servo surface, then corresponding blocks of each copy must be located using different tracks of the servo surface.

The first sector in the RCT contains information about the state of any replacement operation that may be in progress. A copy of the volume serial number is contained in this sector to allow validation of the RCT by diagnostics.

The second sector in each copy of the RCT is used by the bad block replacement algorithm. This sector is used to hold a copy of the data from the sector being replaced.

The remaining sectors each contain 128 32-bit replacement block descriptors. The RCT structure is shown in Figure 6-4. Sector 0 of the RCT is illustrated in Figure 6-5.

Figure 6-4: RCT Structure

  SECTOR 0               Replacement control information
  SECTOR 1               Replacement LBN image
  SECTOR 2               128 replacement block descriptors
  SECTOR 3               128 replacement block descriptors
  ...
  SECTOR RCT.MINIMUM-1   128 replacement block descriptors
  (CXO-2368A)

Figure 6-5: RCT Sector 0

  WORD 00    Low order volume serial number
  WORD 01    Volume serial number
  WORD 02    Volume serial number
  WORD 03    High order volume serial number
  WORD 04    Status flags (labeled P1, P2, BR, FE, VP in the figure)
  WORD 05    Reserved
  WORD 06    Low order LBN of block being replaced
  WORD 07    High order LBN of block being replaced
  WORD 08    Low order RBN of replacement
  WORD 09    High order RBN of replacement
  WORD 10    Low order bad replacement block
  WORD 11    High order bad replacement block
  ...        Reserved
  WORD 255   Reserved
  (CXO-2369A)

Table 6-1: RCT Block 0 Defined

WORD 0-WORD 3: The 64-bit volume ID assigned during the factory formatting process. If the pack was formatted without the use of factory format information, a site volume ID must be input to the formatter for entry into this field. The low order 32 bits of this field are used as a volume ID in MSCP log packets.

WORD 4: This word contains the status flags used during the bad block replacement process.

  BIT 7: FE   The forced error flag, indicating that the replacement process should set the forced error indicator in the target replacement block. This flag is reset when the replacement operation finishes. The flag is initially reset.

  BIT 13: BR  The bad replacement block flag, indicating that the replacement in progress was caused by a bad replacement block. This flag is reset when the replacement operation finishes. The flag is initially reset.

  BIT 14: P2  The phase 2 flag, indicating that the replacement process is in phase 2 of the replacement algorithm. If this flag is set when the unit comes on-line, it indicates that a replacement was interrupted and must be completed. This flag is reset when replacement is completed. The flag is initially reset.

  BIT 15: P1  The phase 1 flag, indicating that the replacement process is in phase 1 of the replacement algorithm. If this flag is set when the unit comes on-line, it indicates that a replacement was interrupted and must be completed. This flag is reset when phase 1 is completed. The flag is initially reset.

  NOTE: If any other bit in word 4 becomes set, whether deliberately or by accidental corruption, the controller usually considers the media to be VOLUME DATA SAFETY WRITE PROTECTED. The host operating system, in turn, is prevented from performing write data operations to the media.

WORD 6-WORD 7: A copy of the LBN of the block being replaced, if a replacement operation is in progress. This field is invalid if the P1 flag is not set. This field is initialized to zero.

WORD 8-WORD 9: A copy of the RBN of the block with which the LBN is being replaced, if a replacement operation is in progress. This field is invalid if the P2 flag is not set. This field is initialized to zero.

WORD 10-WORD 11: The RBN of the bad replacement block being replaced. This field is invalid if the BR flag is not set. This field is initialized to zero.
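The word and bit assignments in Table 6-1, together with the descriptor codes of section 6.2, lend themselves to a small decoding sketch such as one might use when examining a dump of an RCT copy. The function names are hypothetical, and two details are assumptions not stated in the text: 16-bit words are taken to be stored low byte first, and the descriptor CODE is taken to occupy the top four bits of the high-order word.

```python
import struct

# Word 4 flag bits from Table 6-1.
P1, P2, BR, FE = 1 << 15, 1 << 14, 1 << 13, 1 << 7
KNOWN_FLAGS = P1 | P2 | BR | FE

# Octal replacement block descriptor codes from section 6.2.
DESCRIPTOR_CODES = {
    0o00: "unallocated", 0o02: "allocated, primary",
    0o03: "allocated, non-primary", 0o04: "unusable",
    0o05: "unusable (alternate, treat as 04)", 0o10: "null entry",
}


def decode_rct_sector0(sector):
    """Decode the first 12 words of RCT sector 0 from 512 bytes of data.

    ASSUMPTION: words are little-endian (low byte first).
    """
    words = struct.unpack("<256H", sector[:512])
    flags = words[4]

    def longword(i):
        return words[i] | (words[i + 1] << 16)

    return {
        "volume_id": sum(words[i] << (16 * i) for i in range(4)),
        "p1": bool(flags & P1), "p2": bool(flags & P2),
        "br": bool(flags & BR), "fe": bool(flags & FE),
        # Any other bit set usually means the volume is treated as
        # data safety write protected (see the NOTE in Table 6-1).
        "unexpected_flag_bits": flags & ~KNOWN_FLAGS & 0xFFFF,
        # Each field is reported only when its validity flag is set.
        "lbn_being_replaced": longword(6) if flags & P1 else None,
        "replacement_rbn": longword(8) if flags & P2 else None,
        "bad_rbn": longword(10) if flags & BR else None,
    }


def decode_descriptor(descriptor):
    """Split a 32-bit descriptor into (code meaning, LBN).

    ASSUMPTION: CODE is the top 4 bits; LBN is the low 28 bits.
    """
    code = (descriptor >> 28) & 0xF
    return DESCRIPTOR_CODES.get(code, "undefined code"), descriptor & 0x0FFFFFFF


# Example: a made-up sector 0 with a phase 1 replacement of LBN 65636 pending.
sector = bytearray(512)
struct.pack_into("<H", sector, 8, P1 | FE)            # word 4
struct.pack_into("<2H", sector, 12, 0x0064, 0x0001)   # words 6-7
info = decode_rct_sector0(bytes(sector))
print(info["p1"], info["fe"], info["lbn_being_replaced"])
print(decode_descriptor((0o02 << 28) | 100))
```

Because copy 1 is only the base copy, a real examination would decode the same sector from each of the n copies and compare the results.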
CHAPTER 7
FORMAT CONTROL TABLE (FCT)

[Figure: Format Control Table overview (CXO-2371A). The drawing shows sector 0 (block 0) of the FCT, the volume information block, followed by the sectors of 128 bad block descriptors for 512-byte mode, the sectors of 128 bad block descriptors for 576-byte mode, and the subsystem scratch storage. The annotations on the drawing are reproduced below.]

Volume information block words:

  0      - MEDIA MODE: 126736 = 512-BYTE MODE, 074161 = 576-BYTE MODE
  1      - FORMATTING INSTANCE NUMBER
  2-5    - VOLUME SERIAL NUMBER (WORD 2 IS LEAST SIGNIFICANT, WORD 5 IS MOST SIGNIFICANT)
  6-9    - DATE VOLUME FIRST FORMATTED (WORD 6 IS LEAST SIGNIFICANT, WORD 9 IS MOST SIGNIFICANT)
  10-13  - DATE OF MOST RECENT FORMATTING (WORD 10 IS LEAST SIGNIFICANT, WORD 13 IS MOST SIGNIFICANT)
  14-15  - NUMBER OF USED 512 TABLE ENTRIES (WORD 14 IS LEAST SIGNIFICANT, WORD 15 IS MOST SIGNIFICANT)
  16-17  - NUMBER OF USED 576 TABLE ENTRIES
  18-19  - SCRATCH AREA ADDRESS (WORD 18 IS LEAST SIGNIFICANT, WORD 19 IS MOST SIGNIFICANT)
  20     - SIZE OF THE SCRATCH AREA IN THIS FCT COPY
  21     - INCLUDES FK BIT (SEE NOTE)
  22     - VERSION NO. OF THE FORMAT
  23-UP  - 0s

Bad block descriptor: LOW ORDER PHYSICAL BLOCK NUMBER, CODE, HIGH ORDER PHYSICAL BLOCK NUMBER

CODE:
  X0 - UNUSED ENTRY
  X1 - BAD HEADER
  X2 - OTHER BAD FIELD
  X4 - BAD DATA (INCLUDING EDC AND ECC FIELDS)

  X = 1 IF A PBN IN HOST LBN AREA WAS PRIMARY REPLACEMENT BLOCK
  X = 0 IF A PBN IN HOST LBN AREA WAS SECONDARY REPLACEMENT BLOCK
  X = 1 IF A PBN MAPS INTO RBN, FCT, OR DBN AREAS
  (PBN = PHYSICAL BLOCK NUMBER)

NOTE: THE FK BIT SET INDICATES THAT THIS IS A FAKE FCT AREA, AND THAT SECTOR 0 (BLOCK 0) OF THIS FCT AREA IS THE ONLY BLOCK WITH VALID INFORMATION. ALL REMAINING SECTORS IN THIS FCT AREA ARE MEANINGLESS AND CONTAIN NO USABLE INFORMATION. THE FK BIT IS MSB (BIT 15) OF WORD 21.

7.1 FCT STRUCTURE

Each copy of the FCT is composed of one volume information block, one 512-byte format table, one 576-byte format table, and one subsystem temporary storage area (distributed among the alignment pads). The 576-byte table is normally only supplied by manufacturing with 576-byte formatted media. An FCT copy has the format shown in Figure 7-2. Details of FCT block 0 (volume information block) are shown in Figure 7-3.

Figure 7-2: FCT Structure (CXO-2372A)

  SECTOR 0         VOLUME INFORMATION BLOCK
  SECTOR 1         128 BAD BLOCK DESCRIPTORS, 512-BYTE MODE
  SECTOR 2         128 BAD BLOCK DESCRIPTORS, 512-BYTE MODE
    ...
  SECTOR m         128 BAD BLOCK DESCRIPTORS, 576-BYTE MODE
  SECTOR m + 1     128 BAD BLOCK DESCRIPTORS, 576-BYTE MODE
    ...
  SECTOR p         128 BAD BLOCK DESCRIPTORS, 576-BYTE MODE
  SECTOR p + 1     SUBSYSTEM SCRATCH STORAGE
    ...
  SECTOR FCT - 1   SUBSYSTEM SCRATCH STORAGE

The XBN area itself is always formatted to contain 512-byte sectors. Sector m is the first block of the table that stores the factory-detected bad blocks for 576-byte formatted disks. Sector p+1 is the first block of the subsystem scratch storage, reserved for use by controllers and formatting utilities, as needed. Sector FCT-1 marks the end of a copy of the FCT.
Sector 0 contains various volume identification and format information. The format is shown in Figure 7-3.

Figure 7-3: FCT Sector 0 (Volume Information Block) (CXO-2373A)

  WORD 00      MEDIA MODE
  WORD 01      FORMATTING INSTANCE NUMBER
  WORD 02      VOLUME SERIAL NUMBER (LEAST SIGNIFICANT WORD)
  WORD 03      VOLUME SERIAL NUMBER
  WORD 04      VOLUME SERIAL NUMBER
  WORD 05      VOLUME SERIAL NUMBER (MOST SIGNIFICANT WORD)
  WORD 06      DATE VOLUME WAS FIRST FORMATTED (LOW)
  WORD 07      DATE VOLUME WAS FIRST FORMATTED
  WORD 08      DATE VOLUME WAS FIRST FORMATTED
  WORD 09      DATE VOLUME WAS FIRST FORMATTED (HIGH)
  WORD 10      DATE OF MOST RECENT VOLUME FORMATTING (LOW)
  WORD 11      DATE OF MOST RECENT VOLUME FORMATTING
  WORD 12      DATE OF MOST RECENT VOLUME FORMATTING
  WORD 13      DATE OF MOST RECENT VOLUME FORMATTING (HIGH)
  WORD 14      NUMBER OF USED ENTRIES IN 512 TABLE (LOW)
  WORD 15      NUMBER OF USED ENTRIES IN 512 TABLE (HIGH)
  WORD 16      NUMBER OF USED ENTRIES IN 576 TABLE (LOW)
  WORD 17      NUMBER OF USED ENTRIES IN 576 TABLE (HIGH)
  WORD 18      OFFSET TO PAD AREA IN ALL COPIES
  WORD 19      SIZE OF AREA IN ALL BUT LAST COPY
  WORD 20      SIZE OF AREA IN LAST COPY
  WORD 21      FK
  WORD 22      FORMAT VERSION
    ...        ZEROS
  WORD 255     ZEROS

7.2 VOLUME INFORMATION BLOCK DETAILS

WORD 0: Media Mode is 126736 (octal) for a 512-byte format and 074161 (octal) for a 576-byte format. During formatting the media mode word is set to zero.

WORD 1: Formatting Instance Number is a counter that is incremented each time the HDA or volume is formatted. It is initialized to 1 at the factory.

WORD 2-5: Volume Serial Number is the HDA or volume identification.

WORD 6-9: Time and Date of the First Formatting is expressed in a quad-word field as the number of clunks since 00:00, Nov. 17, 1858 (in the local time zone), or zero if the current time and date are not available. A clunk has a value of 100 nanoseconds. This is the standard VAX/VMS time and date format.
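As a worked example of the clunk format used in words 6-9, the following Python sketch converts a four-word date field to a calendar date. It assumes the word order given in the figure annotations (the lowest-numbered word is least significant); the function names are hypothetical, and the result is a local-time value with no time zone attached, as the text describes.

```python
from datetime import datetime, timedelta

# Base date of the clunk count, as stated in the text.
BASE_DATE = datetime(1858, 11, 17)
CLUNKS_PER_MICROSECOND = 10  # one clunk is 100 nanoseconds


def words_to_clunks(words):
    """Combine four 16-bit words, least significant word first, into one count."""
    if len(words) != 4:
        raise ValueError("a date field is four words long")
    value = 0
    for position, word in enumerate(words):
        value |= (word & 0xFFFF) << (16 * position)
    return value


def clunks_to_datetime(clunks):
    """Return the local date and time, or None when the field is zero (not available)."""
    if clunks == 0:
        return None
    # Python datetimes resolve to microseconds, so the last clunk digit is dropped.
    return BASE_DATE + timedelta(microseconds=clunks // CLUNKS_PER_MICROSECOND)
```

For instance, a count of 864,000,000,000 clunks is exactly one day, so it converts to Nov. 18, 1858.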
WORD 10-13: Time and Date of the Most Recent Formatting is the date that the media was last formatted.

WORD 14-15: The Number of Used 512 Table Entries indicates how many of the entries in the 512-byte table are used.

WORD 16-17: The Number of Used 576 Table Entries indicates how many of the entries in the 576-byte table are used.

WORD 18: Scratch Area Offset is the offset, in words, counted from the beginning of the FCT to the scratch area in all copies of the FCT.

WORD 19: Size of Area is the size of the scratch area in all copies of the FCT, except the last copy of the FCT.

WORD 20: Scratch Area Size of Last Copy is the size of the scratch area in the last copy of the FCT.

WORD 21: Bit FK is set if this is a fake FCT (i.e., only the first block exists). If this bit is set, then only the media mode, format instance number, serial number, date of last format, and pad area pointers (words 18-20) are valid. The format instance number will be 0. The contents of all other words in block 0 and all blocks following block 0 are undefined and, therefore, considered invalid.

WORD 22: Format Version is the version number of the format.

Sectors 2 through m-1 contain the 512-byte mode bad block descriptors. Each descriptor describes the physical block (PBN) on the HDA or volume that is bad and the problem that has been detected. Additional information is contained in the code field for the use of the formatter in allocating primary and secondary replacements. The format of a bad block descriptor is shown in Figure 7-4.

Figure 7-4: Bad Block Descriptor (CXO-2374A)

  WORD w       LOW ORDER PHYSICAL BLOCK NUMBER
  WORD w + 1   CODE, HIGH ORDER PHYSICAL BLOCK NUMBER

Where:

PHYSICAL BLOCK NUMBER is the relative position of the sector from the beginning of the HDA or volume.

CODE is an indication of the problem (reason) that caused the sector to be retired. The legal values for code are:

  X0 - Unused entry.
  X1 - Bad header.
  X2 - Other bad field.
  X4 - Bad data (including EDC and ECC fields).

The formatter uses bit 15 (X) of the code field to indicate that the bad PBN, if a non-RCT LBN, was replaced by a primary RBN (X=1) or a non-primary RBN (X=0). X will also equal 1 for those PBNs that map into RBNs, XBNs, DBNs, or LBNs in the RCT. These bits are set only in those formatting modes that result in the creation of (or re-creation of an invalid) FCT.

Bad block descriptors are sorted in descending track order within each sub-table (512 and 576). The entries are further sorted in ascending PBN order within each track. A single unused entry is placed at the end of the sorted list in each table. The values in the remaining unused entries are undefined.

Sectors m through p contain the 576-byte mode bad block descriptors. The format of these descriptors is identical to that of the 512-byte descriptors.

CHAPTER 8
STANDARD DISK INTERFACE (SDI)

8.1 INTRODUCTION

This section describes the Standard Disk Interface (SDI) protocol that allows SDI controllers to communicate with SDI disks in the Digital Storage Architecture (DSA) disk subsystem. This section stresses the SDI characteristics that DSA disk drives implement. You will learn some SDI drive characteristics and how drive responsibilities differ from those of the SDI controllers.

8.2 OBJECTIVES

Upon completion of this discussion, you will be able to:

1. Identify and define the use of the four lines that comprise the SDI.
2. Identify and describe the use of the SDI bus encoding scheme.
3. Describe the format of the RTCS line and define the use of each signal.
4. Describe the format of the RTDS line and define the use of each signal.
5. Define the different drive states relative to the controller.
6. Describe the format used by the controller to send commands to the drive.
7. Describe the format used by the drive to send responses back to the controller.
8. Describe the events that occur between the controller and the drive during a seek operation.
9. Describe the events that occur between the controller and the drive during a read and a write operation.

8.3 SDI BUS

Disk drives and controllers within a DSA disk subsystem communicate with each other using a standard protocol. This protocol is transmitted over the Standard Disk Interconnect (SDI) bus. A separate SDI bus connects the controller to each drive. Refer to Figure 8-1. This radial configuration allows simultaneous transactions to occur from more than one drive to the same controller. The radial bus allows you to disconnect a drive that is not being used while the controller continues to service the host on other drives.

Figure 8-1: SDI Radial Bus (MLDS-1339A)
[Drawing: one controller with a separate SDI bus to each drive.]

In a dual-port configuration, the radial bus also makes it possible to disconnect a controller from a drive port. Refer to Figure 8-2. The drive can continue to service the other controller on its other port.

Figure 8-2: SDI Dual Port (CXO-2348A)
[Drawing: two controllers, each connected to the disk drives.]

8.3.1 SDI Lines

The SDI bus consists of four high-speed, unidirectional lines. Each line transmits serial information in only one direction. See Figure 8-3.

Figure 8-3: SDI Bus (CXO-1333B)
[Drawing: the four lines between controller and disk drive: REAL-TIME CONTROLLER STATE, COMMAND/WRITE-DATA, RESPONSE/READ-DATA, and REAL-TIME DRIVE STATE.]

The Real Time Controller State (RTCS) line repeatedly transmits controller state information to the drive.

The Real Time Drive State (RTDS) line repeatedly sends drive state information to the controller.

The Command/Write-Data line serves two purposes.
It transmits commands and parameters to the drive. It also transmits write data to the recording surfaces in the drive. This line is also called the WRT/CMD line.

The Response/Read-Data line also performs two functions. It sends drive response messages to the controller. It also transmits read data from the recording surfaces in the drive to the controller. This line is also frequently called the Read/Response line.

8.3.2 SDI Bus Encoding

Each of the four SDI lines transmits serial ones and zeros using 12 nanosecond pulses occurring within bit cell times. Refer to Figure 8-4. The duration of each pulse is fixed at 12 nanoseconds. The bit cell time, however, is a function of the drive transfer rate. Drive transfer rates vary, depending on the disk speed and recording density of the drive.

Figure 8-4: SDI Bus Encode (CXO-2349A)
[Waveform drawing: binary data bits, their bit cells, and the SDI encoded pulses around 0 volts DC. The pulse width is 12 ns; the bit cell varies from drive to drive.]

The receiver separates the data in the following manner: A positive pulse at the beginning of a bit cell indicates a logical one. A negative pulse at the beginning of a bit cell indicates a logical zero. When a bit in the next cell is the same as the previous one, then a pulse of the opposite polarity is added immediately after the pulse for the previous bit. In this manner, every pulse on an SDI line alternates polarity. This results in a net DC voltage of zero.

Circuitry in the controller and in the drive detects any missing or additional, unwanted pulses that may occur. Missing or unwanted pulses represent transmission errors. When detected, they are usually entered in the system error log. These errors are referred to as pulse errors. For example, if a drive detected two sequential pulses of the same polarity while receiving information on the WRT/CMD data line, the error is referred to as a write/command pulse error.

8.4 DRIVE STATES

A drive can be in one of four different states relative to a controller. When the drive is not operational, it is in a state called "drive off line." See Figure 8-5. At this time, no communication takes place between the drive and either controller. A drive is off line to a controller when its port switch to that controller is released.

Figure 8-5: Drive Off Line (MLDS-1336A)
[Drawing: disk drive with Port A and Port B both off line to Controller A and Controller B.]

Now look at Figure 8-6. When the drive becomes operational, it enters a drive available state. This means that it is visible to, and capable of communicating with, either controller, providing the port switches are enabled. The term operational implies several conditions. The port switch(es) must be pressed to enable the communication paths between a drive and a controller. The drive must also be able to spin up; that is, no major drive problems prevent the drive spindle from spinning or prevent the drive from properly communicating with the controller.

Figure 8-6: Drive Available (MLDS-1337A)
[Drawing: disk drive with Port A and Port B both available to Controller A and Controller B.]

When controller A wants to communicate with the drive, it must bring it to a state called "drive on line." This means that the drive, through Port A, becomes dedicated to the exclusive use of controller A. This is illustrated in Figure 8-7. During this time, the drive is visible but not available to controller B. Its state relative to controller B is called "drive unavailable."
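The four drive states can be summarized as a small model of one dual-ported drive. This Python sketch is a teaching aid only: the class and method names are hypothetical, and it captures just the relationships stated in this section (a port that is not enabled, or a drive that is not operational, is off line; an on-line drive is unavailable to the other controller).

```python
from enum import Enum


class DriveState(Enum):
    OFF_LINE = "drive off line"
    AVAILABLE = "drive available"
    ON_LINE = "drive on line"
    UNAVAILABLE = "drive unavailable"


class DualPortDrive:
    """Hypothetical model of a drive's state relative to controllers on ports A and B."""

    def __init__(self):
        self.operational = False
        self.port_enabled = {"A": False, "B": False}
        self.owner = None  # port of the controller that has the drive on line

    def state_for(self, port):
        if not self.operational or not self.port_enabled[port]:
            return DriveState.OFF_LINE
        if self.owner is None:
            return DriveState.AVAILABLE
        return DriveState.ON_LINE if self.owner == port else DriveState.UNAVAILABLE

    def bring_on_line(self, port):
        # Only an available drive can be dedicated to a controller.
        if self.state_for(port) is not DriveState.AVAILABLE:
            raise RuntimeError("drive is not available on port " + port)
        self.owner = port

    def release(self, port):
        # The owning controller gives the drive up; both ports see it as available again.
        if self.owner == port:
            self.owner = None
```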
Figure 8-7: Drive On Line (MLDS-1338A)
[Drawing: disk drive on line to Controller A through Port A and unavailable to Controller B.]

Since the drive can only communicate with one controller at a time, it must assume a "drive unavailable" state with one controller as the other controller brings it to a "drive on-line" state. When the controller which is communicating with the drive completes all of its activities and releases the drive on Port A, the drive will return to a "drive available" state relative to both ports.

8.5 RTCS FORMAT (Real Time Controller State)

The controller uses the Real-Time Controller State line to transmit a 16-bit pattern to the drive. See Figure 8-8. This pattern indicates the state of the controller and includes logical signals which are used to synchronize controller/drive operations. This pattern is repeatedly sent to the drive by the controller.

Figure 8-8: RTCS (MLDS-1334A)

  Preamble (eight zeros)
  Sync (1)
  Receiver Ready
  Write Gate
  Read Gate
  Init
  Unused
  Unused
  Parity

The first 8 zeros followed by a 1 bit constitute the sync character. Sensing a minimum of 7 zeros followed by a 1 bit also accomplishes synchronization. The next four bits are logical signals required to synchronize controller and drive operations. They are used as follows:

RTCS RECEIVER READY
When asserted, this signal indicates that the controller is ready to receive a response from the drive on the Read/Response Data line.

RTCS WRITE GATE
During a write operation, the drive uses this signal to generate an internal signal that turns on the write current. The leading edge of this signal causes the drive to begin writing information from the WRT/CMD DATA line to the recording surfaces. The trailing edge of WRITE GATE indicates to the drive that the current WRITE command is finished.

RTCS READ GATE
During a read data operation, the drive uses this signal to enable a circuit that reads information from the recording surfaces and sends it to the controller on the SDI Read/Response line. The controller asserts READ GATE such that the leading edge of this signal occurs after the header field but before the data field of the sector. It remains asserted until after the BCC character has been read. The trailing edge of READ GATE indicates to the drive that the current data transfer command is finished.

RTCS INIT
This signal initializes the drive. The leading edge of this signal instructs the drive microprocessor to unconditionally go to a known memory location and execute the initialization sequence. This sequence aborts all operations in progress. The drive saves its status at the time of the interrupt and executes sufficient internal diagnostics to verify its processor and communication paths to the controller. Upon completion of the initialization sequence, the drive notifies the controller by asserting an appropriate signal on the RTDS line.

RTCS PARITY
The parity used is even over the entire 16 bits, including the SYNC bit. The parity is appended by the controller and used by the drive to further detect errors encountered when receiving information on this line. If parity errors occur, the information is ignored by the drive and the previous state of the controller is used until a valid update is received.

8.6 RTDS FORMAT (Real Time Drive State)

The drive uses the Real-Time Drive State line to transmit a 16-bit pattern to the controller. Refer to Figure 8-9. This pattern indicates the state of the drive and includes logical signals which are used to synchronize controller/drive operations. This pattern is sent continuously by available drives to all controllers for which drive port switches are pressed (enabled).
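The RTCS word and its even-parity rule, as described in Section 8.5, can be sketched as follows. This is a minimal illustration under stated assumptions: the bits are held as a list in transmission order, the four signals are assumed to follow the sync bit in the order they are labeled in Figure 8-8, and the function names are hypothetical.

```python
# Assumed order of the four logical signals after the sync bit (from Figure 8-8).
RTCS_SIGNALS = ("receiver_ready", "write_gate", "read_gate", "init")


def build_rtcs_bits(**signals):
    """Return the 16 RTCS bits in transmission order for the asserted signals."""
    bits = [0] * 8 + [1]                                   # preamble and sync
    bits += [1 if signals.get(name) else 0 for name in RTCS_SIGNALS]
    bits += [0, 0]                                         # two unused bits
    bits.append(sum(bits) % 2)                             # parity: total count of ones is even
    return bits


def parse_rtcs_bits(bits):
    """Decode a received word; return None on a parity error."""
    if len(bits) != 16:
        raise ValueError("an RTCS word is 16 bits long")
    if bits[:8] != [0] * 8 or bits[8] != 1:
        raise ValueError("sync character not found")
    if sum(bits) % 2 != 0:
        return None  # the caller keeps the previous controller state, as the text describes
    return dict(zip(RTCS_SIGNALS, (bool(bit) for bit in bits[9:13])))
```

Returning None on bad parity mirrors the rule that the drive ignores the word and keeps using the previous controller state until a valid update arrives.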
Figure 8-9: RTDS (MLDS-1335A)

  Preamble (eight zeros)
  Sync (1)
  Receiver Ready
  Attention
  Read/Write Ready
  Sector Pulse
  Index Pulse
  Available
  Parity

The first 8 zeros followed by a 1 bit constitute the sync character. Sensing a minimum of 7 zeros followed by a 1 bit also accomplishes synchronization. The remaining bits are used as follows:

RTDS RECEIVER READY
This signal asserted indicates that the drive is ready to receive a command from the controller on the WRT/CMD line.

RTDS ATTENTION
This signal asserted notifies a controller that a potentially significant event has occurred and caused the status and/or state of the drive to change. The ATTENTION signal has no effect on any other activity on the SDI bus.

RTDS READ/WRITE READY
This signal indicates that the drive is capable of performing a data transfer to or from the disk surface. This signal is only asserted by drives in the on-line state when no condition prevents a transfer operation.

RTDS SECTOR PULSE
This signal marks the boundary between sectors. The leading edge of SECTOR PULSE may be used for rotational position sensing. The trailing edge of SECTOR PULSE marks the beginning of a sector.

RTDS INDEX PULSE
This signal is asserted once per revolution of the disk. The controller uses the leading edge of INDEX PULSE for rotational position sensing and the trailing edge to mark the beginning of the first sector after index.

RTDS AVAILABLE
This signal indicates to the controller that the drive is in the available state.

RTDS PARITY
The parity used is even over the entire 16 bits, including the SYNC bit. This parity bit is appended by the drive and used by the controller to further detect errors encountered when receiving information on this line. If a parity error occurs during formatting, the operation is aborted. When a parity error occurs at other times, the information is ignored by the controller and the previous state is used until a valid update is received from the drive. In addition, an error log message is generated and sent to the host.

8.7 COMMAND FORMATS ON THE WRT/CMD LINE

The controller uses the WRT/CMD line to send write data to the drive. In addition, it uses this line to send commands and command messages or parameters to the drive. Refer to Figure 8-10. Commands and messages are transmitted using a 32-bit SDI command frame. This command frame consists of a 16-bit sync frame followed by a 16-bit control frame.

Figure 8-10: SDI Command Frame (CXO-881B)

  BITS 31-24 (MSB)   FRAME CODE   8 BITS    (CONTROL FRAME)
  BITS 23-16         FRAME DATA   8 BITS    (CONTROL FRAME)
  BITS 15-0 (LSB)    SYNC         16 BITS   (SYNC FRAME)

The sync frame portion is always sent first and is a special pattern used to synchronize the drive for receiving a command frame. The control frame consists of an 8-bit frame code and 8 bits of frame data. There are two levels of commands transmitted on the WRT/CMD line.

8.8 LEVEL 1 COMMANDS

Refer to Figure 8-11. A level 1 command consists of one 32-bit SDI command frame. It begins with the sync frame which contains the sync character. The frame code field contains the opcode of the command. The frame data field contains the message data necessary to complete the level 1 command. There are six level 1 commands. Refer to Figure 8-12.

Figure 8-11: Level 1 Command Format (CXO-875B)

  BITS 31-24 (MSB)   FRAME CODE   COMMAND OPCODE
  BITS 23-16         FRAME DATA   MESSAGE DATA
  BITS 15-0 (LSB)    SYNC         SYNC

Figure 8-12: Level 1 Commands (CXO-876B)

  COMMAND                              FRAME CODE (OPCODE)   FRAME DATA (MESSAGE DATA)
  SELECT GROUP                         8E                    GROUP NO.
  SELECT TRACK AND READ                17                    TRACK NO.
  SELECT TRACK AND WRITE               A5                    TRACK NO.
  SELECT TRACK AND FORMAT ON INDEX     2B                    TRACK NO.
  FORMAT ON SECTOR OR INDEX            4D
  DIAGNOSTIC ECHO                      E8

SELECT GROUP
This command has an opcode of 8E in the frame code field. The group number to be selected is contained in the frame data field. This command causes the drive to clear R/W READY on the RTDS line, select the specified group, and set R/W READY when it is ready to perform another command or I/O operation.

SELECT TRACK AND READ
This command has an opcode of 17 in the frame code field and the track number in the frame data field. This command causes the drive to select the desired track and prepare for a read data operation. Read data operations are discussed in more detail later in the course.

SELECT TRACK AND WRITE
This command has an opcode of A5 in the frame code field and the track number in the frame data field. This command causes the drive to select the desired track and prepare for a write data operation. Write data operations are discussed in more detail later in the course.

SELECT TRACK AND FORMAT ON INDEX
This command causes the drive to select the desired track and prepare the necessary circuits to format the entire track.

FORMAT ON SECTOR OR INDEX
This command causes the drive to use the last selected track and prepare the necessary circuits to format one sector. Notice that this command does not require any further information in the frame data field.

DIAGNOSTIC ECHO
This command causes the drive to transmit diagnostic information back to the controller for testing purposes.

8.9 LEVEL 2 COMMANDS

8.9.1 Command Formats on the WRT/CMD Line

The basic characteristic of a level 1 command is that it only requires a single 32-bit SDI command frame to complete the entire command from the controller to the drive.
Many commands, however, require much more information than could fit into a single 32-bit frame. Level 2 commands, also transmitted on the WRT/CMD line, contain more than one 32-bit command frame. The actual number of frames sent for a level 2 command depends on the particular command. There are three types of level 2 command frames. They are:

  START Command Frame
  CONTINUE Command Frame
  END Command Frame

Like the level 1 command frame, each one begins with a sync frame.

Refer to Figure 8-13. The frame code field of a START command frame contains a code of 71. This code indicates to the drive the beginning of a level 2 command. The frame data field contains the opcode for the level 2 command. This indicates to the drive the type of level 2 command to be performed.

Figure 8-13: Level 2 START Command Frame (CXO-877B)

  BITS 31-24 (MSB)   FRAME CODE   71 (CODE FOR START FRAME)
  BITS 23-16         FRAME DATA   OPCODE FOR LEVEL 2 COMMAND
  BITS 15-0 (LSB)    SYNC         SYNC

Refer to Figure 8-14. The frame code field of a CONTINUE command frame contains a code of D4. This code indicates to the drive the continuation of a level 2 command. The frame data field contains further message data necessary to complete the particular level 2 command. Most, but not all, level 2 commands require at least one CONTINUE frame. This depends on the amount of information needed to complete the command.

Figure 8-14: Level 2 CONTINUE Command Frame (CXO-878B)

  BITS 31-24 (MSB)   FRAME CODE   D4 (CODE FOR CONTINUE FRAME)
  BITS 23-16         FRAME DATA   MESSAGE DATA FOR LEVEL 2 COMMAND
  BITS 15-0 (LSB)    SYNC         SYNC

Refer to Figure 8-15. The frame code field of an END command frame contains a code of B2. This code indicates to the drive the end of a level 2 command. The frame data field contains a checksum character for the level 2 command. The checksum is used for error detection and is computed against all of the information transmitted in the 8-bit frame data fields for all of the 32-bit command frames sent during a level 2 command.

Figure 8-15: Level 2 END Command Frame (CXO-879B)

  BITS 31-24 (MSB)   FRAME CODE   B2 (CODE FOR END FRAME)
  BITS 23-16         FRAME DATA   CHECKSUM
  BITS 15-0 (LSB)    SYNC         SYNC

Refer to Figure 8-16. In summary, all level 2 commands require a START frame, an END frame, and usually one or more CONTINUE frames. Some level 2 commands do not require a CONTINUE frame.

Figure 8-16: Level 2 Commands (CXO-880B)

  START        71   OPCODE         SYNC
  CONTINUE     D4   MESSAGE DATA   SYNC
    ...
  END          B2   CHECKSUM       SYNC

There are 16 level 2 commands. The number in parentheses indicates the number of CONTINUE frames needed to complete the command.

  (2) CHANGE MODE
  (2) CHANGE CONTROLLER FLAGS
  (2) DIAGNOSE
  (1) DISCONNECT
  (1) DRIVE CLEAR
  (1) ERROR RECOVERY
  (0) GET COMMON CHARACTERISTICS
  (1) GET SUBUNIT CHARACTERISTICS
  (0) GET STATUS
  (5) INITIATE SEEK
  (1) ON-LINE
  (0) RUN
  (5) READ MEMORY
  (0) RECALIBRATE
  (1) TOPOLOGY
  (x) WRITE MEMORY (the number of CONTINUE frames required will vary)

The following table briefly describes the level 2 commands and the basic functions that will be performed by the drive when these commands are executed.

  Level 2 Command               Opcode   Drive Function Performed
  CHANGE MODE                   81       Instructs the drive to alter its mode to the specified settings.
  CHANGE CONTROLLER FLAGS       82       Directs the drive to change the specified bit(s) in its status "Controller Byte" to the specified settings.
  DIAGNOSE                      03       Directs the drive to execute the program which is resident in the specified drive memory region.
  DISCONNECT                    84       Directs an on-line drive (provided the Terminate Topology bit is clear) to enter the available state relative to all active ports.
  DRIVE CLEAR                   05       Directs the drive to clear the specified status bits in the error byte of the drive's generic status and attempt to clear the associated error condition.
  ERROR RECOVERY                06       Directs the drive to invoke the specific hardware error recovery mechanism corresponding to the specified error recovery level.
  GET COMMON CHARACTERISTICS    87       Requests the drive to send the controller a description of its common drive hardware characteristics.
  GET SUBUNIT CHARACTERISTICS   88       Requests the drive to send the controller a description of characteristics, geometry, and topology of the specified subunit. For current DSA disk drives, there is only one subunit and it is equivalent to the HDA (or pack) that is installed.
  GET STATUS                    09       Requests the drive to send all of its current status bytes to the controller.
  INITIATE SEEK                 0A       Directs the drive to initiate a seek to the specified group on the specified cylinder.
  ONLINE                        8B       Directs the drive to enter the on-line state relative to the controller that issued the ONLINE command.
  RUN                           0C       Directs the drive to spin up if the RUN/STOP switch is pressed.
  READ MEMORY                   8D       Directs the drive to fetch and send to the controller the specified number of bytes starting at the specified offset into the specified memory region of the drive. This command has no association with SELECT TRACK AND READ commands or operations.
  RECALIBRATE                   8E       Directs the drive to perform a recalibration operation and to then seek to Cylinder 0.
  TOPOLOGY                      90       Instructs the drive to make itself available for limited dialogue to any and all controller(s) on enabled alternate ports.
  WRITE MEMORY                  0F       Directs the drive to load the supplied data into the indicated area of its memory. This command has no association with any SELECT TRACK AND WRITE or FORMAT commands or operations.

Some level 2 commands do not require CONTINUE frames.
However, they are not sent as level 1 single frame commands because level 1 commands do not require any response from the drive. Level 2 commands do require a response from the drive.

8.9.2 Response Formats on the Read/Response Line

Each level 2 controller command requires a response from the drive. Responses are sent to the controller over the Read/Response line. They use the same protocol as command messages. Refer to Figure 8-17. The sync frame is transmitted first, followed by the control frame. The control frame contains an 8-bit frame code and 8 bits of frame data.

Figure 8-17: SDI Response Frame Format (CXO-881B)

  BITS 31-24 (MSB)   FRAME CODE   8 BITS    (CONTROL FRAME)
  BITS 23-16         FRAME DATA   8 BITS    (CONTROL FRAME)
  BITS 15-0 (LSB)    SYNC         16 BITS   (SYNC FRAME)

The response protocol uses the same START, CONTINUE, and END frames as the command protocol. The frame codes for the START, CONTINUE, and END frames are also identical to those used in the command frame format. See Figure 8-18. In a level 2 start response frame, the frame data field contains a response status code that indicates whether the original level 2 command was executed successfully or unsuccessfully.

If the original level 2 command was successful, this field contains any of several success codes, depending on the particular level 2 command issued by the controller. The response also contains an end frame and usually some continue frames containing message data specific to the particular level 2 command that the controller previously issued.

Figure 8-18: Level 2 Response Start Frame (CXO-882B)

  BITS 31-24 (MSB)   FRAME CODE   71 (CODE FOR START FRAME)
  BITS 23-16         FRAME DATA   XX (CODE FOR RESPONSE STATUS)
  BITS 15-0 (LSB)    SYNC         SYNC

  NOTE: XX = 7D IF ORIGINAL COMMAND WAS UNSUCCESSFUL.

If the original level 2 command is unsuccessful, the frame data field always contains the code 7D, indicating to the controller that a problem prevented the proper execution of the original level 2 command. When the drive provides an unsuccessful response, it automatically provides 14 additional CONTINUE frames containing drive status bytes and an END frame to complete the response. The controller uses the drive status bytes to determine the nature of the problem. In addition, most of the drive status bytes are usually logged into the system error log. This information is quite useful when you are isolating drive problems or disk subsystem problems in the field. Tables of DSA drive status byte information are located in the DSA Error Log Reference Manual (EK-DSAEL-MN).

Like the level 2 command format, the END frame of a drive response contains the checksum character in the frame data field.

8.10 INITIATE SEEK COMMAND (Level 2 Command Example)

An INITIATE SEEK command directs a drive to initiate a positioning operation to a specified group within a specified cylinder. The drive response to this command indicates whether or not the operation was successfully started.

Refer to Figure 8-19 to see how the INITIATE SEEK command starts. The controller transmits a start frame. The drive first sees the sync frame and then the control frame information. The frame code field contains a code 71, indicating that this is a start frame. The frame data field contains the opcode 0A which identifies it as an INITIATE SEEK command.
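The 32-bit frame layout used by the START frame just described can be sketched as follows. This is an illustration only: the function names are hypothetical, and because this guide does not give the value of the 16-bit sync pattern, the example passes a placeholder of zero, which should not be read as the real pattern.

```python
# Frame codes given in the text.
START, CONTINUE, END = 0x71, 0xD4, 0xB2
FRAME_KINDS = {START: "START", CONTINUE: "CONTINUE", END: "END"}


def pack_frame(frame_code, frame_data, sync):
    """Place frame code in bits 31-24, frame data in bits 23-16, sync in bits 15-0."""
    if not 0 <= frame_code <= 0xFF or not 0 <= frame_data <= 0xFF:
        raise ValueError("frame code and frame data are 8-bit fields")
    if not 0 <= sync <= 0xFFFF:
        raise ValueError("sync is a 16-bit field")
    return (frame_code << 24) | (frame_data << 16) | sync


def unpack_frame(frame):
    """Return (frame_code, frame_data, sync) from a 32-bit frame value."""
    return (frame >> 24) & 0xFF, (frame >> 16) & 0xFF, frame & 0xFFFF


PLACEHOLDER_SYNC = 0x0000  # hypothetical value; the real sync pattern is not given here

# START frame of an INITIATE SEEK command: frame code 71, frame data 0A.
seek_start = pack_frame(START, 0x0A, PLACEHOLDER_SYNC)
```

Note that the packed value only shows field positions within the 32-bit frame; on the line itself the sync frame is sent first.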
Figure 8-19: INITIATE SEEK Command
START frame (frame code 71): opcode 0A
CONTINUE frames (frame code D4): cylinder address bytes, then the group number
END frame (frame code B2): checksum
Each control frame is accompanied by a sync frame.

The next frames sent are CONTINUE frames, indicated by the D4 in the frame code field. The message data bytes of the CONTINUE frames contain the cylinder address and the group number for the positioning operation.

The last command frame sent is the END frame. This is indicated by the B2 in the frame code field. The message data field contains the checksum character, which is the one's complement of the sum of the six bytes transmitted in the message data field during the START frame and all CONTINUE frames. This completes the transmission by the controller. It is now up to the drive to respond.

Figure 8-20 shows how a success response appears. The response contains a code 7E in the message data field of the START frame. This indicates success for an INITIATE SEEK command. The END frame contains the checksum. The success response for a level 2 INITIATE SEEK command informs the controller that the seek operation has started without error and is currently under way.

Figure 8-20: Successful Response for SEEK Command
START frame (frame code 71): 7E
END frame (frame code B2): checksum

An unsuccessful response tells the controller that the operation could not be initiated. Refer to Figure 8-21. It consists of the unsuccessful code (7D) in the START frame, 14 bytes of drive status in the CONTINUE frames, and the checksum in the END frame. The error information in the drive status bytes allows the controller to develop error log information, which it sends to the host.

Figure 8-21: Unsuccessful Response for SEEK Command
START frame (frame code 71): 7D
CONTINUE frames (frame code D4): status byte 2 through status byte 15
END frame (frame code B2): checksum

Now let's look at the timing associated with an INITIATE SEEK
command that is sent over the Write/Command line. Refer to Figure 8-22.

Figure 8-22: Initiate Seek Simplified
(Timing diagram. Signals shown: RTCS line with the controller RCVR RDY bit; WRT/CMD data line carrying the seek START, CONTINUE, and END frames; drive RCVR RDY; drive R/W Ready, which stays negated for the seek time; controller RCVR RDY; Read/Response line carrying the successful response START and END frames; RTDS line with the drive RCVR Ready and Read/Write Ready bits.)

Before the controller sends the SEEK command, the drive has both R/W READY and RECEIVER READY asserted on the RTDS line. This informs the controller that the drive is ready to receive a command and that no data transfer is currently under way.

When the drive receives the seek opcode in the START frame on the Write/Command line, both RECEIVER READY and R/W READY are cleared. R/W READY remains cleared until the seek is complete and indicates that data transfers to or from the recording surfaces must not occur on the WRT/CMD or Read/Response lines. The drive then sets RECEIVER READY to receive the next frame. This continues until the drive receives the END frame. Then RECEIVER READY remains off until the seek is complete.

On the RTCS line, RECEIVER READY is asserted by the controller after it transmits the END frame. At this time, the drive is permitted to send the response on the Read/Response line. The controller sets RECEIVER READY to receive each frame of the response. This particular response indicates that the seek operation has successfully started.

When the seek is complete, the drive asserts R/W READY and RECEIVER READY on the RTDS line.
This indicates to the controller that the entire seek operation has, in fact, been completed.

8.11 SDI READ OPERATION

A read operation is initiated from the controller using a level 1 SELECT TRACK AND READ command. Figure 8-23 illustrates the timing between the controller and the drive on the SDI bus during this command. Figure 8-24 illustrates the basic flow of events that take place between the drive and the controller for this operation.

Figure 8-23: SDI Select Track and Read Timing
(Timing diagram. Signals shown: Command Available H; R/W Ready H (RTDS); Drive RCVR Ready H (RTDS); Sector Pulse H (RTDS); drive internal Read gate H, pulsed for each header read; SDI read gate H; RD/Resp data carrying the headers and then the read data; WRT/CMD data carrying the select track and read command.)

Figure 8-24: Select Track and Read Flow
1. Controller: checks that drive RCVR READY is asserted (RTDS).
2. Controller: sends the SDI SELECT TRACK AND READ command (WRT/CMD DATA).
3. Drive: receives the sync character, decodes the control frame, negates drive RCVR READY, and selects the track (head).
4. Drive: sends the sector pulse (RTDS). The controller detects the trailing edge of the sector pulse.
5. Drive: enables its read circuits on the trailing edge of the sector pulse to send the sector header (READ/RESP DATA).
6. Controller: reads and compares sector headers, and asserts SDI read gate when a header match is found (RTCS).
7. Drive: detects SDI read gate asserted and enables its read circuits (READ/RESP DATA). The controller reads the data.
8. Controller: detects the end of data and negates SDI read gate (RTCS).
9. Drive: detects SDI read gate negated, disables its read circuits, and asserts drive RCVR READY.

Before the controller initiates a READ command, the drive has both R/W READY and RECEIVER READY asserted on the RTDS line. This informs the controller that the drive is ready to receive a command.

When the drive receives the SELECT TRACK AND READ command, the drive internally generates a COMMAND AVAILABLE signal and decodes the command. Since this is a read data operation, drive hardware generates an INTERNAL READ GATE that enables header information to be read from the recording surface and sent to the controller over the Read/Response Data line. Header information is sent for every sector on the currently selected track.

The controller is responsible for using the header information to determine when the desired sector is under the R/W head(s). When the controller determines that the correct sector is under the data head, it asserts the SDI READ GATE signal on the RTCS line after the desired header and before the data area of the desired sector. The leading edge of the SDI READ GATE causes the drive to read the data area of the sector and send the information to the controller, also over the SDI Read/Response Data line.

Different controllers use different techniques for determining the desired sector. In the UDA50, for example, once the SELECT TRACK AND READ command is sent, the controller reads each header until the target sector is found. The HSC50, however, uses the SECTOR PULSE signal from the RTDS line and keeps track of the sector count. The HSC50 then issues the SELECT TRACK AND READ command just prior to the target sector. In this way the target sector header is the first header read after issuing the SELECT TRACK AND READ command.
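The difference between the two sector-location techniques can be illustrated with a small model. This is a hedged sketch only: the function names, the idea of numbering sectors 0 through N-1 around the track, and the notion of a "current sector" counter are assumptions made for the example and are not taken from UDA50 or HSC50 documentation.

```python
def uda50_headers_read(headers_in_rotation_order, target):
    """UDA50-style search: read each header in turn until the target matches.

    Returns how many headers were read, counting the matching one.
    """
    for count, header in enumerate(headers_in_rotation_order, start=1):
        if header == target:
            return count
    raise LookupError("target header not seen in the supplied headers")


def hsc50_pulses_to_wait(current_sector, target, sectors_per_track):
    """HSC50-style timing: count sector pulses, then issue the command.

    `current_sector` is the sector now passing under the head according to
    the controller's own sector count. The command is issued while the
    sector just before the target is passing, so that the target header is
    the first header read.
    """
    return (target - 1 - current_sector) % sectors_per_track


# UDA50 model: the head is over sector 5 when header reading begins.
reads = uda50_headers_read([5, 6, 7, 0, 1, 2, 3, 4], target=7)

# HSC50 model: same starting point on an 8-sector track.
wait = hsc50_pulses_to_wait(current_sector=5, target=7, sectors_per_track=8)
```

In this model the UDA50 approach reads three headers before the match, while the HSC50 approach waits one sector pulse and then reads only the target header.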
When the desired data has been read, the controller lowers (de-asserts) the SDI READ GATE signal on the RTCS line. The trailing edge of the SDI READ GATE notifies the drive to terminate the entire read operation. The drive then re-asserts the RECEIVER READY signal on the RTDS line to the controller in preparation for receiving another command.

Notice that the R/W READY signal remained asserted on the RTDS line throughout this operation. This is an indication to the controller that the Read/Response line was available for transferring actual data from the recording surface to the controller during this operation. As you can see, the controller is responsible for controlling most of the read data operation.

8.12 SEEK followed by a SELECT TRACK AND READ

Figure 8-25 shows the timing on the SDI for a seek operation followed by a SELECT TRACK AND READ command.

Figure 8-25
(Timing diagram for a seek followed by SELECT TRACK AND READ. Signals shown: Command Available H; R/W Ready H (RTDS), negated for the seek time; Drive RCVR Ready H (RTDS); Sector Pulse H (RTDS); drive internal Read gate H, pulsed for each header read; SDI read gate H; RD/Resp data carrying the successful response, the headers, and then the read data; WRT/CMD data carrying the seek command END frame and then the select track and read command.)

8.13 SDI WRITE OPERATION

A write operation is initiated from the controller using a level 1 SELECT TRACK AND WRITE command.
This operation is very similar to a READ operation. Figure 8-26 illustrates the timing between the controller and the drive on the SDI bus during this command. Figure 8-27 illustrates the basic flow of events that take place between the drive and the controller for this operation.

Figure 8-26: SDI Select Track and Write Timing
(Timing diagram. Signals shown: Command Available H; R/W Ready H (RTDS); Drive RCVR Ready H (RTDS); Sector Pulse H (RTDS); drive internal Read gate H, pulsed for each header read; SDI read gate H; RD/Resp data carrying the headers; WRT/CMD data carrying the select track and write command and then the write data.)

Figure 8-27: Select Track and Write Flow
1. Controller: checks that drive RCVR READY is asserted (RTDS).
2. Controller: sends the SDI SELECT TRACK AND WRITE command (WRT/CMD DATA).
3. Drive: receives the sync character, decodes the control frame, negates drive RCVR READY, and selects the track (head).
4. Drive: sends the sector pulse (RTDS). The controller detects the trailing edge of the sector pulse.
5. Drive: enables its read circuits on the trailing edge of the sector pulse to send the sector header (READ/RESP DATA).
6. Controller: reads and compares sector headers, asserts SDI write gate when a header match is found (RTCS), and sends the write data (WRT/CMD DATA).
7. Drive: detects SDI write gate asserted, enables its write circuits, and writes the data.
8. Controller: negates SDI write gate (RTCS).
9. Drive: detects SDI write gate negated, disables its write circuits, and asserts drive RCVR READY.

Before the controller initiates a WRITE command, the drive again has both R/W READY and RECEIVER READY asserted on the RTDS line.
This informs the controller that the drive is ready to receive a command.

When the drive receives the SELECT TRACK AND WRITE command, the drive internally generates a COMMAND AVAILABLE signal and decodes the command. Even though this is a write data operation, drive hardware again generates an INTERNAL READ GATE that enables header information to be read from the recording surface and sent to the controller over the Read/Response Data line. Header information is sent for every sector on the currently selected track.

The controller is responsible for using the header information to determine when the desired sector is under the R/W head(s). When the controller determines that the correct sector is under the data head, it asserts the SDI WRITE GATE signal on the RTCS line after the desired header and before the data area of the desired sector, and then begins sending data to the drive on the WRT/CMD Data line.

Different controllers use different techniques for determining which sector is the desired sector. In the UDA50, for example, once the SELECT TRACK AND WRITE command is sent, the controller reads each header until the target sector is found. The HSC50, however, uses the SECTOR PULSE signal from the RTDS line and keeps track of the sector count. The HSC50 then issues the SELECT TRACK AND WRITE command just prior to the target sector. In this way, the target sector header is the first header read after issuing the SELECT TRACK AND WRITE command.

The leading edge of the SDI WRITE GATE causes the drive to do two things. First, it de-asserts its internal read gate and discontinues sending header information to the controller. Then the drive turns on its write current and writes the data onto the data recording area of the sector. The data that it writes is the data it receives on the WRT/CMD Data line from the controller.

When the desired information has been written, the controller lowers (de-asserts) the
SDI WRITE GATE signal on the RTCS line. The trailing edge of the SDI WRITE GATE signal notifies the drive to terminate the entire write operation. The drive then re-asserts the RECEIVER READY signal on the RTDS line to the controller in preparation for receiving another command.

Like the read operation, the R/W READY signal remains asserted on the RTDS line throughout this operation. As you can see, the controller is also responsible for controlling most of the write data operation.

8.14 SEEK followed by SELECT TRACK AND WRITE

Now that you have seen the SDI timing for a seek and a write operation, let's put the two together. Figure 8-28 shows the timing on the SDI for a seek operation followed by a SELECT TRACK AND WRITE command.

Figure 8-28
(Timing diagram for a seek followed by SELECT TRACK AND WRITE. Signals shown: Command Available H; R/W Ready H (RTDS), negated for the seek time; Drive RCVR Ready H (RTDS); Sector Pulse H (RTDS); drive internal Read gate H, pulsed for each header read; SDI write gate H (RTCS); RD/Resp data carrying the successful response and then the headers; WRT/CMD data carrying the seek command END frame, the select track and write command, and then the write data.)

At this time complete the following exercises.

8.15 EXERCISES

1. What are the four lines that make up the SDI bus?
A. RTDS, RTCS, READ GATE, WRITE GATE
B. RTCS, WRT/CMD, READ/RESPONSE, RTDS
C. READ/RESPONSE, WRT/CMD, READ GATE, WRITE GATE
D. R/W READY, READ/RESPONSE, RECEIVER READY, AVAILABLE

2. What are the states of a drive relative to a controller?
A. Available, Operational, Off-Line, On-Line
B. On-Line, Off-Line, Available, R/W Ready
C. On-Line, Off-Line, Unavailable, R/W Ready
D. Available, Unavailable, On-Line, Off-Line

3. How does the controller know when a drive has completed a SEEK operation?
A. When the drive asserts both R/W Ready and Receiver Ready after receiving a SEEK command.
B. When the controller receives a response to the SEEK command.
C. When the drive asserts Read Gate after receiving a SEEK command.
D. Both B and C.

4. What constitutes a level 2 command?
A. At least one START frame and two or more CONTINUE frames.
B. At least two START frames and one END frame.
C. At least one START frame and one END frame.
D. At least one START frame, one CONTINUE frame, and one END frame.

5. During a SELECT TRACK AND READ command, when is the header information sent to the controller?
A. When the controller asserts SDI READ GATE.
B. When the drive asserts SDI READ GATE.
C. When the controller asserts RECEIVER READY.
D. When the drive asserts its internal READ GATE.

6. At what time does the controller raise SDI READ GATE?
A. As soon as the SDI drive RECEIVER READY negates.
B. On the trailing edge of each sector pulse.
C. After a header match for the target block.
D. On command from the drive.

7. When is write data sent to the drive from the controller?
A. After a header match on the target block.
B. With the SELECT TRACK AND WRITE command.
C. On the trailing edge of each sector pulse.
D. Immediately before the SELECT TRACK AND WRITE command.

CHAPTER 9 LEVEL 2 SDI COMMANDS

9.1 INTRODUCTION

This section describes each of the level 2 SDI commands and applies to a variety of DSA disks.
In some instances, specific disk drives interpret bits differently for reasons unique to that drive. Refer to the technical description manuals for specific disk drives for clarification of these matters.

9.2 CHANGE MODE Command

The CHANGE MODE command directs the drive to alter its mode to the specified settings illustrated in Figure 9-1. Only those bits in Byte 2 that have corresponding bits set in Byte 3 are altered. The remaining bits are unchanged.

Figure 9-1: CHANGE MODE
BYTE 01: OPCODE = 81
BYTE 02 (MSB to LSB): W4, W3, W2, W1, DD, FO, DB, S7
BYTE 03: BIT MASK

The following modes may be changed:

• Write Protect (W1-Bit) - The controller can request that the drive write protect itself. If the drive is hardware write protected via the operator control panel and the controller attempts to write enable the drive, a specific drive-detected error occurs indicating "SDI write enable command while drive is hardware write-protected."

NOTE
Bits W2 and W3 are not used by current drives.

• Write Protect (W4/ED-Bit) - The W4 bit is not used by the RA60, RA80, RA81, or RA82. In the RA70 and the RA90, this bit has been redefined as ED (error log disable). The controller sets this bit during certain diagnostics to disable the special internal error logging features of the drive. This prevents the internal error log from overwriting its buffers with useless information while the controller is performing forced-fault and verify diagnostics on these drives.

• Drive Disable (DD-Bit) - The controller sets this bit if special diagnostics in the controller require the drive to be disabled. If the DD bit is set, the drive drops off line to host computers and spins down. The drive no longer responds to front panel activity until the DD bit is cleared or the drive is re-powered. The drive does not normally set the DD bit itself.
• Format Operations (FO-Bit) - The controller sets this bit prior to issuing any level 1 format commands to the drive. The drive cannot accept any level 1 format commands if this bit is not set. If the drive receives a level 1 format command and this bit is not set, the drive reports an error indicating "FORMAT attempted while format disabled."

• DBN Access (DB-Bit) - This bit must be set before the controller can access data in the DBN space (controller diagnostic blocks). If the drive receives a request to seek to the DBN space and this bit is not set, the drive reports an error indicating "Seek command contained an invalid cylinder address."

• 512/576 Byte Mode Select (S7-Bit) - The controller sets this bit when the drive is to operate in the 576-byte mode. When the bit is cleared, the drive operates in 512-byte mode. Some drives do not support the 576-byte mode of operation and report an error.

9.3 CHANGE CONTROLLER FLAGS Command

The CHANGE CONTROLLER FLAGS command (Figure 9-2) instructs the drive to change the appropriate bits in Byte 7 (controller byte) of the drive status. Only those bits in Byte 2 that have corresponding bits set in Byte 3 are altered. The remaining bits are unchanged. Byte 7 of the drive status is part of the GET STATUS response the drive sends back to the controller after a GET STATUS command. Under normal circumstances, the bits in the controller byte are used only by controllers, except as noted here.

Figure 9-2: CHANGE CONTROLLER FLAGS
BYTE 01: OPCODE = 82
BYTE 02 (MSB to LSB): S4, S3, S2, S1, C1, C2, C3, C4
BYTE 03: BIT MASK

If any C bits are set (C1, C2, C3, or C4), most drives spin down and ignore any attempt to spin up using the front panel. Drives clear all the controller bits if any of the following conditions exist:

• The drive powers up.
• There is no unit select plug inserted in the front panel.
• The drive has a fault condition.
• Both port switches are disabled.

Table 9-1 lists the C-bit values and their interpretation.

Table 9-1: Byte 2 C-Bits
C1 C2 C3 C4 | Indication
[bit pattern not recoverable] | Normal drive operation.
[bit pattern not recoverable] | Drive offline to hosts due to being under control of diagnostic.
[bit pattern not recoverable] | Drive offline to hosts due to a duplicate drive unit number detected.

Most drives ignore the suppress attention bits (S1, S2, S3, and S4) under normal operating conditions. These bits can be set only by the controller. The drives may, however, clear the suppress attention bits in the same manner as they clear the C bits.

9.4 DIAGNOSE Command

The DIAGNOSE command instructs the drive to execute the diagnostic program resident in the specified drive memory region. Figure 9-3 shows the format for the DIAGNOSE command. For most drives, the memory region specified in a DIAGNOSE command corresponds to the internal diagnostic test number. Refer to the specific drive service manuals for lists of the internal diagnostic test numbers.

Figure 9-3: DIAGNOSE Command
BYTE 01: OPCODE = 03
BYTE 02: MEMORY REGION ID LO
BYTE 03: MEMORY REGION ID HI

When the disk drive has executed the requested test, it sends a response to the controller specifying the region containing the results of the test. The controller then issues a READ MEMORY command to determine whether the diagnostic passed or failed and what drive error codes were generated during the internal diagnostic.

When a controller issues a DIAGNOSE command, the drive initially provides one of the following responses:

• UNSUCCESSFUL - The DIAGNOSE command was incorrectly formatted, unintelligible, or specified incorrect information.
• SUCCESSFUL - The DIAGNOSE command was accepted, and the test was executed.
The response also provides the drive memory region address containing the result of the diagnostic test.

If a DIAGNOSE command is received with an invalid test number, the drive reports an error indicating a specific error code or sets the PE (protocol error) bit in Byte 6 (error byte) of drive status. An invalid test number is one that exists within the drive but cannot be executed using the DIAGNOSE command. Invalid tests include:

• Front panel tests.
• SDI loopback tests.
• Format read-only cylinder test/utility.
• Tests requiring more than 128 seconds to complete.

If the drive receives a DIAGNOSE command with a nonexistent test number, the drive reports an error indicating a specific error code or sets the PE (protocol error) bit in Byte 6 (error byte) of drive status.

9.5 DISCONNECT Command

The DISCONNECT command (Figure 9-4) is used in a number of ways depending upon the assertion of the TT (terminate topology) and ST (stop/spin down) bits in Byte 2. Table 9-2 lists and describes each of the conditions.

Figure 9-4: DISCONNECT Command
BYTE 01: OPCODE = 84
BYTE 02: TT, RESERVED, ST

Table 9-2: DISCONNECT Command TT/ST Bits
TT Bit | ST Bit | Description
0 | 0 | Instructs an on-line drive to disconnect itself from the current controller and become available. The drive does not spin down after completing this command.
0 | 1 | Instructs an on-line drive to disconnect itself from the current controller, become available, and spin down.
1 | 0 | Informs an unavailable drive in the process of executing a TOPOLOGY command that this controller is finished processing and the drive can return to the on-line port. Refer to the TOPOLOGY command for more detail.
1 | 1 | This condition is invalid.

9.6 DRIVE CLEAR Command

The DRIVE CLEAR command instructs the drive to clear the bits specified in the ERROR BYTE of the drive status response.
It also instructs the drive to attempt to clear the error condition. The bits to be cleared are set to 1 in the bit mask field (Figure 9-5). The drive sends a COMPLETED response to the controller if the specified bits were cleared. If the error condition and error bits could not be cleared, the drive sends an UNSUCCESSFUL response to the controller. It also sends all of the status bytes to the controller to help the controller determine why the error could not be cleared. A separate section in your Student Reference Manual illustrates the drive status bytes in more detail.

Figure 9-5: DRIVE CLEAR Command
BYTE 01: OPCODE = 05
BYTE 02: BIT MASK

9.7 ERROR RECOVERY Command

The ERROR RECOVERY command (Figure 9-6) instructs the drive to activate its error recovery circuits. The circuit activated depends upon the error recovery level specified in the command. RA60, RA80, and RA81 drives do not support controller-assisted hardware error recovery circuits. For drives that do support this feature (RA70, RA82, RA90, etc.), hardware error recovery information may be found in the drive technical description manuals.

Figure 9-6: ERROR RECOVERY Command
BYTE 01: OPCODE = 06
BYTE 02: ERROR RECOVERY LEVEL

9.8 GET COMMON CHARACTERISTICS Command

The GET COMMON CHARACTERISTICS command instructs the drive to send the controller a description of its hardware characteristics common to all subunits of the drive. Figure 9-7 shows the hardware characteristics sent to the controller. The specific information that each drive type sends to the controller is listed in your Student Reference Manual.

Figure 9-7: GET COMMON CHARACTERISTICS Command and Response
SDI command:
BYTE 01: OPCODE = 87
Drive response:
BYTE 01: OPCODE = 78
BYTE 02: SDI VERSION, SHORT TIMEOUT
BYTE 03: TRANSFER RATE
BYTE 04: RETRIES, LONG TIMEOUT
BYTE 05: RESERVED, SS, FCT/RCT COPIES
BYTE 06: ERROR RECOVERY LEVELS
BYTE 07: ECC THRESHOLD
BYTE 08: MICROCODE REVISION
BYTE 09: HARDWARE REVISION
BYTES 10-15: DRIVE S/N (LO to HI)
BYTE 16: DRIVE TYPE
BYTE 17: REVOLUTIONS/SEC
BYTES 18-23: 0

9.9 GET SUBUNIT CHARACTERISTICS Command

The GET SUBUNIT CHARACTERISTICS command instructs the drive to send the controller a description of the hardware characteristics (Figure 9-8) of the subunit specified in the command. For most drives, there is only one subunit, and it is equivalent to the HDA installed. The specific information that each drive type sends to the controller is listed in your Student Reference Manual.

Figure 9-8: GET SUBUNIT CHARACTERISTICS Command and Response
SDI command:
BYTE 01: OPCODE = 88, followed by a second byte that includes a RESERVED field
Drive response:
BYTE 01: OPCODE = 77
BYTES 02-10: LBN space in cylinders (LO to HI), high-order cylinder number fields, groups/cylinder, tracks/group, high-order starting LBN and XBN fields, RBNs/track, and RMI [exact byte assignments not recoverable]
BYTE 11: RESERVED
BYTE 12: DATA PREAMBLE (WORDS)
BYTE 13: HEADER PREAMBLE (WORDS)
BYTES 14-17: MEDIA TYPE (LO to HI)
BYTES 18-19: FCT COPY SIZE (XBNs), LO and HI
512-byte format:
BYTE 20: LBNs/TRACK
BYTE 21: GROUP OFFSET
BYTES 22-24: LBNs IN HOST AREA
BYTE 25: 0 0 0 0, HOST LBNs HI
BYTES 26-27: RCT COPY SIZE (LBNs), LO and HI
576-byte format:
BYTE 28: LBNs/TRACK
BYTE 29: GROUP OFFSET
BYTES 30-32: LBNs IN HOST AREA
BYTE 33: 0 0 0 0, HOST LBNs HI
BYTES 34-35: RCT COPY SIZE (LBNs), LO and HI
BYTES 36-37: XBN SPACE IN CYL, LO and HI
BYTE 38: DIAG READ AREA (GROUPS)
BYTE 39: DBN SPACE IN CYL
The figure carries the note "NOT USED IN RA82", which appears to apply to the 576-byte format fields.

9.10 GET STATUS Command

The GET STATUS command instructs the drive to send the controller all of its current status bytes (Figure 9-9). More specific details about drive status and decoding for each of the drive types are discussed in your Student Reference Manual and will be covered later in this course.

NOTE
When a drive sends an UNSUCCESSFUL response for any level 2 command to the controller, GET STATUS bytes are always included in the response.

Figure 9-9: GET STATUS Command
SDI command:
BYTE 01: OPCODE = 09
Drive response:
BYTE 01: OPCODE = F6
BYTE 02: UNIT NO. (LO)
BYTE 03: SUBUNIT, HI UNIT NO.
BYTE 04: REQUEST BYTE (MSB to LSB): OA, RR, DR, SR, EL, PB, PS, RU
BYTE 05: MODE BYTE: two high-order bits [labels not legible], then W2, W1, DD, FO, DB, S7
BYTE 06: ERROR BYTE: DE, RE, PE, DF, WE [remaining bits unlabeled]
BYTE 07: CONTROLLER BYTE (MSB to LSB): S4, S3, S2, S1, C1, C2, C3, C4
BYTE 08: RETRY COUNT/FAILURE CODE
BYTES 09-15: DEVICE-DEPENDENT EXTENDED STATUS INFORMATION

9.11 INITIATE SEEK Command

The INITIATE SEEK command instructs the drive to seek to the group and cylinder specified in the command. Figure 9-10 shows the byte configuration for the INITIATE SEEK command.
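As an illustration of how the INITIATE SEEK message and its checksum fit together, the sketch below builds the frame sequence from a cylinder address and a group number. It assumes the byte order of Figure 9-10 (opcode, four cylinder address bytes from low to high, then the group number) and the checksum rule from section 8.10. The function names are hypothetical, and keeping only the low 8 bits of the sum before complementing is an assumption, since the text does not say how overflow of the sum is handled.

```python
# Frame codes and opcode as used in chapters 8 and 9 (hexadecimal).
START, CONTINUE, END = 0x71, 0xD4, 0xB2
INITIATE_SEEK = 0x0A


def ones_complement_checksum(message_bytes):
    """One's complement of the byte sum, truncated to 8 bits (assumed)."""
    return ~sum(message_bytes) & 0xFF


def build_initiate_seek(cylinder, group):
    """Return the (frame_code, frame_data) pairs for an INITIATE SEEK command.

    Sync frames are omitted; they would accompany each control frame.
    """
    if not 0 <= cylinder < 2 ** 32:
        raise ValueError("cylinder address must fit in four bytes")
    if not 0 <= group <= 0xFF:
        raise ValueError("group number must fit in one byte")

    # Six message bytes: opcode, cylinder address LO..HI, group number.
    message = [INITIATE_SEEK, *cylinder.to_bytes(4, "little"), group]

    frames = [(START, message[0])]
    frames += [(CONTINUE, byte) for byte in message[1:]]
    frames.append((END, ones_complement_checksum(message)))
    return frames


# Seek to cylinder 0x0123, group 2.
frames = build_initiate_seek(0x0123, 2)
```

For this example the six message bytes are 0A, 23, 01, 00, 00, and 02. Their sum is hexadecimal 30, so under the stated assumptions the END frame carries CF.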
Figure 9-10: INITIATE SEEK Command
BYTE 01: OPCODE = 0A
BYTE 02: CYL ADDR (LO)
BYTE 03: CYL ADDR
BYTE 04: CYL ADDR
BYTE 05: CYL ADDR (HI)
BYTE 06: GROUP NUMBER

For the INITIATE SEEK command, the drive sends one of the following responses:

1. UNSUCCESSFUL - The seek operation could not be initiated.
2. SUCCESSFUL - The seek operation was initiated without errors and is currently executing.

9.12 ONLINE Command

The ONLINE command instructs the drive to enter the on-line state relative to the controller that issued the ONLINE command. The ONLINE command includes a controller-timeout value expressed in seconds (Figure 9-11). DSA drives use this timeout value to monitor controller activity.

Figure 9-11: ONLINE Command
BYTE 01: OPCODE = 8B
BYTE 02: COMMAND TIMEOUT (SEC)

DSA drives start the command timer (if the drive is on line) for the time specified in the command byte under the following conditions:

• The drive is ready to send a response to the controller.
• The drive is ready to assert the ATTENTION bit.
• The drive is completing a data transfer operation.
• The drive is waiting for another command (while on line).

The timer is started when the first response frame is ready to be transmitted. Then the controller can time the assertion of RECEIVER READY. CONTROLLER RECEIVER READY must be asserted before the drive can send a response to the controller. If the drive cannot send a response to the controller, it generates a drive-detected error indicating "response timeout error."

Whenever the drive receives a level 1 command or an END frame of a level 2 command from the controller, it cancels and resets the command timer. If the command timer expires, the drive sets the ATTENTION bit and resets and restarts the command timer.
If the timer expires a second time (while the ATTENTION bit is set), the drive considers the controller to be off line (SDI cables disconnected, defective, etc.). The drive then disconnects itself from that controller and becomes AVAILABLE to any controller.

For troubleshooting purposes, the RA82 loads error code 1D into its internal drive error silo if the drive disconnects because of a controller timeout. In this case, error code 1D is not reported as an LED or front panel fault, nor is the error sent to the controller. (There is no controller to send it to.)

9.13 RUN Command

The RUN command instructs the drive to perform a spin-up operation, provided the RUN/STOP switch is enabled. If the controller sends this command with the RUN/STOP switch disabled, most drives generate a drive-detected error indicating this condition. Figure 9-12 shows the command format for the RUN command.

Figure 9-12: RUN Command

BYTE 01  0 0 0 0 1 1 0 0   OPCODE = 0C (bits shown MSB to LSB)

9.14 READ MEMORY Command

The READ MEMORY command instructs the drive to fetch a specified number of bytes and send them to the controller. These bytes start at the specified offset and are read from the specified memory region of the drive. Figure 9-13 shows the command format and the drive response for the READ MEMORY command.

Figure 9-13: READ MEMORY Command

SDI COMMAND (bits shown MSB to LSB)
BYTE 1  1 0 0 0 1 1 0 1   OPCODE = 8D
BYTE 2  MEMORY REGION ID LO
BYTE 3  MEMORY REGION ID HI
BYTE 4  OFFSET INTO REGION LO
BYTE 5  OFFSET INTO REGION HI
BYTE 6  BYTE COUNT

DRIVE RESPONSE (bits shown MSB to LSB)
BYTE 1      0 1 1 1 0 0 1 0   OPCODE = 72
BYTE 2      BYTE COUNT
BYTE 3      DATA BYTE 1
BYTE 4      DATA BYTE 2
BYTE n + 2  DATA BYTE n

The acceptable responses for the READ MEMORY command are as follows:

UNSUCCESSFUL: The reason is given in the response data, as shown in Figure 9-13.
SUCCESSFUL: Contents of the requested memory locations, as shown in Figure 9-13.

NOTE
Memory refers to the ROM/RAM areas of the disk drive and not to disk media storage areas.

The various drive types provide different information in READ MEMORY region responses, depending upon the extent of the drive design. Some of the information provided includes:

• Error silo information
• Diagnostic parameters
• Diagnostic status and results
• Extended diagnostic status
• Extended drive status
• ROM revision and checksum field information
• RAM, SDI buffers, and status words

9.15 RECALIBRATE Command

The RECALIBRATE command instructs the drive to perform a recalibrate operation and seek to cylinder 0. Figure 9-14 shows the format for the RECALIBRATE command. Some drives (RA81 and RA82, for example) also perform automatic internal servo adjustment routines during the RECALIBRATE command.

Figure 9-14: RECALIBRATE Command

BYTE 01  1 0 0 0 1 1 1 0   OPCODE = 8E (bits shown MSB to LSB)

Acceptable responses for the RECALIBRATE command are as follows:

• UNSUCCESSFUL
• SUCCESSFUL: Recalibrate operation completed without errors.

9.16 TOPOLOGY Command

The TOPOLOGY command (Figure 9-15) instructs an on-line drive to make itself temporarily AVAILABLE for dialogue to any controller on an alternately enabled port. While communicating with a drive in topology mode, an alternate controller (controller B) can issue only the following level 2 commands:

• GET STATUS
• GET COMMON CHARACTERISTICS
• GET SUBUNIT CHARACTERISTICS
• CHANGE CONTROLLER FLAGS
• DISCONNECT (with TT bit set to terminate topology communication)

Figure 9-15: TOPOLOGY Command

BYTE 01  1 0 0 1 0 0 0 0   OPCODE = 90 (bits shown MSB to LSB)

The acceptable responses for the TOPOLOGY command are as follows:

• UNSUCCESSFUL: As indicated by the drive status in the response data.
• SUCCESSFUL: Topology completed without errors.
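As a quick reference, the restriction on what controller B may send in topology mode can be written as a simple lookup. This Python sketch is an illustration, not controller code: the names are hypothetical, and the opcode values are taken from the GET STATUS figure and from the last-opcode (byte 9) listings elsewhere in this guide. The TT bit that the DISCONNECT command must carry is not modeled.

```python
# Level 2 commands an alternate controller may issue in topology mode.
# Opcode values come from this guide's figures; the names are illustrative.
TOPOLOGY_MODE_OPCODES = {
    0x09: "GET STATUS",
    0x87: "GET COMMON CHARACTERISTICS",
    0x88: "GET SUBUNIT CHARACTERISTICS",
    0x82: "CHANGE CONTROLLER FLAGS",
    0x84: "DISCONNECT",
}


def allowed_in_topology_mode(opcode: int) -> bool:
    """Return True if the opcode is one of the five permitted commands."""
    return opcode in TOPOLOGY_MODE_OPCODES
```

For example, allowed_in_topology_mode(0x0A) is False, because INITIATE SEEK is not in the permitted list.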
The drive sets the OA bit of status byte 4 to indicate to controller B that it is on line to an alternate controller (controller A) and that it is executing a TOPOLOGY command. This feature allows controller B to update its internal registers with status and drive characteristic information.

When the alternate controller completes its dialogue with the drive, it issues a DISCONNECT command with the TT (terminate topology) bit set. The drive then returns to its original on-line port and sends a COMPLETED response to that controller, indicating that the TOPOLOGY command is complete.

9.17 WRITE MEMORY Command

The WRITE MEMORY command instructs the drive to write the supplied data in the indicated memory regions. Figure 9-16 shows the command format for the WRITE MEMORY command.

Figure 9-16: WRITE MEMORY Command

BYTE 1      0 0 0 0 1 1 1 1   OPCODE = 0F (bits shown MSB to LSB)
BYTE 2      MEMORY REGION ID LO
BYTE 3      MEMORY REGION ID HI
BYTE 4      OFFSET INTO REGION LO
BYTE 5      OFFSET INTO REGION HI
BYTE 6      BYTE COUNT
BYTE 7      DATA BYTE 1
BYTE 8      DATA BYTE 2
BYTE n + 6  DATA BYTE n

Most drives support only one valid region for the WRITE MEMORY command. It is listed below:

Region   Description             No. Bytes
FFFC     Diagnostic Parameters   6

NOTE
Memory refers to the RAM areas of the disk drive and not to the disk media storage areas.

CHAPTER 10
DECODING DRIVE STATUS BYTES

10.1 RA60 DRIVE STATUS DECODE

The following pages describe decoding the RA60 status bytes. These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA60. The RA60 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive.
System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided to help you understand the meaning of these bytes and/or the meaning of bits within any of these bytes.

Figure 10-1: Summary of RA60 Drive Status Codes

BYTE 01  RESPONSE OPCODE
BYTE 02  UNIT SELECT (LOWER)
BYTE 03  UNIT + SUBUNIT MASK
BYTE 04  REQUEST BYTE          GENERIC DRIVE STATUS BYTE
BYTE 05  MODE BYTE             GENERIC DRIVE STATUS BYTE
BYTE 06  ERROR BYTE            GENERIC DRIVE STATUS BYTE
BYTE 07  CONTROLLER BYTE
BYTE 08  RETRY COUNT/FAILURE
BYTE 09  PREVIOUS CYL (LO)     EXTENDED DRIVE STATUS BYTE
BYTE 10  PREVIOUS CYL (HI)     EXTENDED DRIVE STATUS BYTE
BYTE 11  PREVIOUS HEAD         EXTENDED DRIVE STATUS BYTE
BYTE 12  CURRENT CYL (LO)      EXTENDED DRIVE STATUS BYTE
BYTE 13  CURRENT CYL (HI)      EXTENDED DRIVE STATUS BYTE
BYTE 14  CURRENT HEAD          EXTENDED DRIVE STATUS BYTE
BYTE 15  MASTER ERROR CODE     EXTENDED DRIVE STATUS BYTE

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).

Figure 10-2: RA60 Byte 1

Byte 1: x x x x | x x x x   RESPONSE opcode

This is the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user. It indicates the success or non-success of the previous command sent from the controller to the drive.

Figure 10-3: RA60 Bytes 2-3

Byte 3: 0 0 0 1 | x x x x     Byte 2: x x x x | x x x x

Byte 3 (low four bits) and byte 2: Drive unit select number
Byte 3 (high four bits), from the lowest of these bits upward:
  Subunit 0 mask (subunit 0 reporting this status)
  Subunit 1 mask (not used)
  Subunit 2 mask (not used)
  Subunit 3 mask (not used)

NOTE
The RA60 has no multiple subunits and will always indicate Subunit 0.
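When bytes 2 and 3 appear as raw hexadecimal in an error log, the unit number and subunit mask can be separated with a few bit operations. The following Python sketch assumes the layout shown in the GET STATUS response figure (byte 2 holds the low part of the unit number; byte 3 holds the subunit mask in its upper four bits and the high part of the unit number in its lower four bits). The function name is hypothetical.

```python
from typing import List, Tuple


def decode_unit_bytes(byte2: int, byte3: int) -> Tuple[int, List[int]]:
    """Split status bytes 2 and 3 into (unit number, reporting subunits)."""
    # The low nibble of byte 3 supplies the high bits of the unit number.
    unit_number = ((byte3 & 0x0F) << 8) | (byte2 & 0xFF)
    # The high nibble of byte 3 is the subunit mask; its lowest bit is subunit 0.
    mask = (byte3 >> 4) & 0x0F
    subunits = [n for n in range(4) if mask & (1 << n)]
    return unit_number, subunits
```

With byte 2 = 05 and byte 3 = 10 (hex), the sketch returns unit 5 with subunit 0 reporting, which matches the single-subunit case described in the NOTE.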
Figure 10-4: RA60 Byte 4 Request Byte

Byte 4: x x x x | x x x x   Request byte (bits listed from least significant to most significant)

(RU)  0 = Run/stop switch out
      1 = Run/stop switch in
(PS)  0 = Port switch out
      1 = Port switch in
(PB)  (Not implemented)
(EL)  0 = No loggable information in extended status area
      1 = Loggable information in extended status area
(SR)  0 = Spindle not ready (not up to speed)
      1 = Spindle ready
(DR)  0 = No diagnostic is being requested from the host
      1 = There is a request for a diagnostic to be loaded into the drive microprocessor memory
(RR)  0 = Drive requires no recalibrate command
      1 = Drive requests recalibrate command
(OA)  0 = Drive online or available to current controller
      1 = Drive unavailable (it is already online to another controller)

Figure 10-5: RA60 Byte 5 Mode Byte

Byte 5: 0 0 0 0 | x x x x   Mode byte (bits listed from least significant to most significant)

(S7)   0 = 512-byte sector format (16 bit)
       1 = 576-byte sector format (18 bit)
(DB)   0 = DBN area access disabled
       1 = DBN area access enabled
(FO)   0 = Formatting operations disabled
       1 = Formatting operations enabled
(DD)   0 = Drive enabled by controller error routine or diagnostic
       1 = Drive disabled by controller error routine or diagnostic (fault light = ON)
(W1)   0 = Write protect switch for subunit 0 is out
       1 = Write protect switch for subunit 0 is in
(W2)   Not implemented
(ED1)  Not implemented
(ED0)  Not implemented

Figure 10-6: RA60 Byte 6 Error Byte

Byte 6: x x x x | x 0 0 0   Error byte

(WE)  0 = No error
      1 = Write lock error (attempt to write while write protected)
(DF)  0 = No error
      1 = Drive failure during initialization
(PE)  0 = No error
      1 = Level 2 protocol error (improper command codes or parameters issued to drive)
(RE)  0 = No error
      1 = SDI receive error on SDI transmission line(s) from controller
(DE)  0 = No error
      1 = Drive error (drive fault light may be on; possibly clearable via DRIVE CLEAR command)

Figure 10-7: RA60 Bytes 7-8

BYTE 7  CONTROLLER BYTE
  0000 = NORMAL DRIVE OPERATION
  1000 = DRIVE OFFLINE DUE TO BEING UNDER CONTROL OF A DIAGNOSTIC
  1001 = DRIVE OFFLINE DUE TO ANOTHER DRIVE HAVING THE SAME UNIT SELECT IDENTIFIER
  (S1) 1 = SUBUNIT 0 ATTENTION AVAILABLE MESSAGES SUPPRESSED IN THE CONTROLLER
  (S2) 1 = NOT USED
  (S3) 1 = NOT USED
  (S4) 1 = NOT USED

BYTE 8  x x x x | x x x x   RETRY COUNT/FAILURE CODE
  CONTAINS DIFFERENT INFORMATION DEPENDING UPON CONDITION OF DF BIT IN BYTE 6
  IF DF = 0, BYTE 8 CONTAINS RETRY COUNT FOR PREVIOUS COMMAND ISSUED (THE NUMBER OF TIMES COMMAND WAS RE-EXECUTED BEFORE SUCCESSFUL)
  IF DF = 1, BYTE 8 CONTAINS SPECIFIC ERROR CODE RELATING TO INITIALIZATION FAILURE THAT CAUSED DF BIT TO SET (REFER TO RA60 ERROR CODE LIST IN SERVICE MANUAL)

Figure 10-8: RA60 Bytes 9-15

BYTE 9 (LO), BYTE 10 (HI)   PREVIOUS CYLINDER
BYTE 11                     PREVIOUS HEAD
BYTE 12 (LO), BYTE 13 (HI)  CURRENT CYLINDER
BYTE 14                     CURRENT HEAD
BYTE 15                     MASTER ERROR/BYTE CODE (SEE NOTE)

NOTE: REFER TO RA60 ERROR CODE LIST IN SERVICE MANUAL.

10.2 RA70 DRIVE STATUS DECODE

The following pages describe decoding the RA70 status bytes.
These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA70. The RA70 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive. System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided here to help you understand the meaning of these bytes and/or the meaning of bits within any of these bytes.

Figure 10-9: Summary of RA70 Drive Status Codes

BYTE 01  RESPONSE OPCODE
BYTE 02  UNIT SELECT (LOWER)
BYTE 03  UNIT + SUBUNIT MASK
BYTE 04  REQUEST BYTE          GENERIC DRIVE STATUS BYTE
BYTE 05  MODE BYTE             GENERIC DRIVE STATUS BYTE
BYTE 06  ERROR BYTE            GENERIC DRIVE STATUS BYTE
BYTE 07  CONTROLLER BYTE
BYTE 08  RETRY COUNT/FAILURE
BYTE 09  PREVIOUS CMD OPCODE   EXTENDED DRIVE STATUS BYTE
BYTE 10  DRIVE STATE FLAGS     EXTENDED DRIVE STATUS BYTE
BYTE 11  CYLINDER ADDR (LO)    EXTENDED DRIVE STATUS BYTE
BYTE 12  CYLINDER ADDR (HI)    EXTENDED DRIVE STATUS BYTE
BYTE 13  GROUP NO. (HEAD)      EXTENDED DRIVE STATUS BYTE
BYTE 14  DRIVE ERROR CODE      EXTENDED DRIVE STATUS BYTE
BYTE 15  MFG ERROR CODE        EXTENDED DRIVE STATUS BYTE

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).

Figure 10-10: RA70 Byte 1

Byte 1: x x x x | x x x x   RESPONSE opcode

This is the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user. It indicates the success or non-success of the previous command sent from the controller to the drive.

Figure 10-11: RA70 Bytes 2-3

Byte 3: 0 0 0 1 | x x x x     Byte 2: x x x x | x x x x

Byte 3 (low four bits) and byte 2: Drive unit select number
Byte 3 (high four bits), from the lowest of these bits upward:
  Subunit 0 mask (subunit 0 reporting this status)
  Subunit 1 mask (not used)
  Subunit 2 mask (not used)
  Subunit 3 mask (not used)

NOTE
The RA70 has no multiple subunits and will always indicate Subunit 0.

Figure 10-12: RA70 Byte 4 Request Byte

Byte 4: x x x x | x x x x   Request byte (bits listed from least significant to most significant)

(RU)  0 = Run/stop switch out
      1 = Run/stop switch in
(PS)  0 = Port switch out
      1 = Port switch in
(PB)  0 = Port A receivers enabled
      1 = Port B receivers enabled
(EL)  0 = No loggable information in extended status area
      1 = Loggable information in extended status area
(SR)  0 = Spindle not ready (not up to speed)
      1 = Spindle ready
(DR)  0 = No diagnostic is being requested from the host
      1 = There is a request for a diagnostic to be loaded into the drive microprocessor memory
(RR)  0 = Drive requires no recalibrate command
      1 = Drive requests recalibrate command
(OA)  0 = Drive online or available to current controller
      1 = Drive unavailable (it is already online to another controller)

Figure 10-13: RA70 Byte 5 Mode Byte

Byte 5: x x 0 x | x x x x   Mode byte (bits listed from least significant to most significant)

(S7)   0 = 512-byte sector format (16 bit)
       1 = 576-byte sector format (18 bit) (no current plan to implement 18-bit)
(DB)   0 = DBN area access disabled
       1 = DBN area access enabled
(FO)   0 = Formatting operations disabled
       1 = Formatting operations enabled
(DD)   0 = Drive enabled by controller error routine or diagnostic
       1 = Drive disabled by controller error routine or diagnostic (fault light = ON)
(W1)   0 = Write protect switch for subunit 0 is out
       1 = Write protect switch for subunit 0 is on
(W2)   Not implemented
(ED1)  Error log disable (set by 2-board controller diagnostics)
(ED0)  Error log disable (set by 2-board controller diagnostics)

Figure 10-14: RA70 Byte 6 Error Byte

Byte 6: x x x x | x 0 0 0   Error byte

(WE)  0 = No error
      1 = Write lock error (attempt to write while write protected)
(DF)  0 = No error
      1 = Drive failure during initialization
(PE)  0 = No error
      1 = Level 2 protocol error (improper command codes or parameters issued to drive)
(RE)  0 = No error
      1 = SDI receive error on SDI transmission line(s) from controller
(DE)  0 = No error
      1 = Drive error (drive fault light may be on; possibly clearable via DRIVE CLEAR command)

Figure 10-15: RA70 Bytes 7-8

Byte 7: 0 0 0 0 | x x x x   Controller byte
  0000 = Normal drive operation
  1000 = Drive offline (under control of a diagnostic)
  1001 = Drive offline (another drive has same unit select identifier)
  (S1) 1 = (not used)
  (S2) 1 = (not used)
  (S3) 1 = (not used)
  (S4) 1 = (not used)

Byte 8: x x x x | x x x x   Retry count/failure code

NOTE
Byte 8: Retry count during the last SEEK or RECALIBRATE command. This is the number of times the command was retried internally by the RA70 in an attempt to complete the SEEK or RECALIBRATE operation successfully.

Figure 10-16: RA70 Byte 9 Last Opcode

Byte 9: x x x x | x x x x   Last opcode (extended drive status byte)

Opcode of the last level 2 drive command decoded by the drive (received from the SDI controller):

81 = Change mode
82 = Change controller flags
03 = Diagnose
84 = Disconnect (drive)
05 = Drive clear
06 = Error recovery
87 = Get common characteristics
88 = Get subunit characteristics
0A = Initiate seek
8B = Online
0C = Run
8D = Read memory
8E = Recalibrate
90 = Topology
0F = Write memory
FF = Select group (level 1 command, processed by firmware seek head select subroutines)

Figure 10-17: RA70 Byte 10 Drive-Detected SDI Error

BYTE 10: x x x x | x x x x   DRIVE STATE BIT FLAGS
CONTAINS A NUMBER REPRESENTING STATE OF DRIVE AT TIME OF ERROR

AV  1 = AVAILABLE IS ASSERTED
OL  1 = DRIVE IN ONLINE STATE
TP  1 = DRIVE EXECUTING LEVEL II TOPOLOGY COMMAND
AT  1 = ATTENTION IS ASSERTED
TG  1 = SECTOR + INDEX TIMING ENABLED FOR TRANSMISSION VIA RTDS LINE
RW  1 = DRIVE IS INTERNAL R/W READY
SF  1 = SOFT FAULT DETECTED; POSSIBLY CLEARABLE VIA LEVEL II CLEAR COMMAND
HE  1 = HARD ERROR; DRIVE MUST BE POWER CYCLED TO ATTEMPT TO CLEAR THIS ERROR

Figure 10-18: RA70 Bytes 11-15

BYTE 12 (HI), BYTE 11 (LO)  CYLINDER REQUESTED DURING LAST SEEK COMMAND
BYTE 13                     GROUP NUMBER CURRENTLY SELECTED (WILL BE R/W HEAD NUMBER IN AN RA70)
BYTE 14                     DRIVE ERROR CODE (SEE NOTE)
BYTE 15                     MFG FAULT CODE: INDICATES TO MODULE REPAIR CENTERS (AS CLOSE AS POSSIBLE) AREA OF LOGIC SPECIFICALLY IN QUESTION

NOTE: REFER TO RA70 ERROR CODE LIST IN SERVICE MANUAL.

10.3 RA80 DRIVE STATUS DECODE

The following pages describe decoding the RA80 status bytes.
These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA80. The RA80 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive. System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided here to help you understand the meaning of these bytes and/or the meaning of bits within any of these bytes.

Figure 10-19: Summary of RA80 Drive Status Codes

BYTE 01  RESPONSE OPCODE
BYTE 02  UNIT SELECT (LOWER)
BYTE 03  UNIT + SUBUNIT MASK
BYTE 04  REQUEST BYTE           GENERIC DRIVE STATUS BYTE
BYTE 05  MODE BYTE              GENERIC DRIVE STATUS BYTE
BYTE 06  ERROR BYTE             GENERIC DRIVE STATUS BYTE
BYTE 07  CONTROLLER BYTE
BYTE 08  RETRY COUNT/FAILURE
BYTE 09  PREVIOUS CMD OPCODE    EXTENDED DRIVE STATUS BYTE
BYTE 10  SDI ERROR BITS         EXTENDED DRIVE STATUS BYTE
BYTE 11  CYLINDER ADDR (LO)     EXTENDED DRIVE STATUS BYTE
BYTE 12  CYLINDER ADDR (HI)     EXTENDED DRIVE STATUS BYTE
BYTE 13  CURRENT GROUP NUMBER   EXTENDED DRIVE STATUS BYTE
BYTE 14  LED ERROR CODE         EXTENDED DRIVE STATUS BYTE
BYTE 15  F.P. ERROR CODE        EXTENDED DRIVE STATUS BYTE

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).

Figure 10-20: RA80 Byte 1

Byte 1: x x x x | x x x x   RESPONSE opcode

This is the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user. It indicates the success or non-success of the previous command sent from the controller to the drive.

Figure 10-21: RA80 Bytes 2-3

Byte 3: 0 0 0 1 | x x x x     Byte 2: x x x x | x x x x

Byte 3 (low four bits) and byte 2: Drive unit select number
Byte 3 (high four bits), from the lowest of these bits upward:
  Subunit 0 mask (subunit 0 reporting this status)
  Subunit 1 mask (not used)
  Subunit 2 mask (not used)
  Subunit 3 mask (not used)

NOTE
The RA80 has no multiple subunits and will always indicate Subunit 0.

Figure 10-22: RA80 Byte 4 Request Byte

Byte 4: x x x x | x x x x   Request byte (bits listed from least significant to most significant)

(RU)  0 = Run/stop switch out
      1 = Run/stop switch in
(PS)  0 = Port switch out
      1 = Port switch in
(PB)  (Not implemented)
(EL)  0 = No loggable information in extended status area
      1 = Loggable information in extended status area
(SR)  0 = Spindle not ready (not up to speed)
      1 = Spindle ready
(DR)  0 = No diagnostic is being requested from the host
      1 = There is a request for a diagnostic to be loaded into the drive microprocessor memory
(RR)  0 = Drive requires no recalibrate command
      1 = Drive requests recalibrate command
(OA)  0 = Drive online or available to current controller
      1 = Drive unavailable (it is already online to another controller)

Figure 10-23: RA80 Byte 5 Mode Byte

Byte 5: 0 0 0 0 | x x x x   Mode byte (bits listed from least significant to most significant)

(S7)   0 = 512-byte sector format (16 bit)
       1 = 576-byte sector format (18 bit)
(DB)   0 = DBN area access disabled
       1 = DBN area access enabled
(FO)   0 = Formatting operations disabled
       1 = Formatting operations enabled
(DD)   0 = Drive enabled by controller error routine or diagnostic
       1 = Drive disabled by controller error routine or diagnostic (fault light = ON)
(W1)   0 = Write protect switch for subunit 0 is out
       1 = Write protect switch for subunit 0 is in
(W2)   Not implemented
(ED1)  Not implemented
(ED0)  Not implemented

Figure 10-24: RA80 Byte 6 Error Byte

Byte 6: x x x 0 | x 0 0 0   Error byte

(WE)  0 = No error
      1 = Write lock error (attempt to write while write protected)
      Not used
(DF)  0 = No error
      1 = Drive failure during init
(PE)  0 = No error
      1 = Level 2 protocol error (improper command codes or parameters issued to drive)
(RE)  0 = No error
      1 = SDI receive error on SDI transmission line(s) from controller
(DE)  0 = No error
      1 = Drive error (drive fault light may be on; possibly clearable via DRIVE CLEAR command)

Figure 10-25: RA80 Bytes 7-8

Byte 7: 0 0 0 0 | x x x x   Controller byte
  0000 = Normal drive operation
  1000 = Drive offline (under control of a diagnostic)
  1001 = Drive offline (another drive has same unit select identifier)
  (S1) 1 = (not used)
  (S2) 1 = (not used)
  (S3) 1 = (not used)
  (S4) 1 = (not used)

Byte 8: x x x x | x x x x   Retry count/failure code

NOTE
Byte 8: Retry count during the last SEEK or RECALIBRATE command (the number of times the command was retried before successful completion). The microcode allows a maximum of 3 retries before the seek or recalibration is aborted. A value of 0 indicates that 3 retries were completed, because the counter is decremented toward 0 on each retry.
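The down-counting behavior in the NOTE is easy to misread in an error log, so a short sketch may help. This Python function is a hypothetical illustration that assumes the counter starts at the microcode maximum of 3 and drops by one per retry; the guide states only that a value of 0 means 3 retries were completed, so the starting value is an assumption rather than a documented fact.

```python
def retries_used(byte8: int, max_retries: int = 3) -> int:
    """Convert the byte 8 down-counter into a number of retries performed.

    Assumption: the counter starts at max_retries and is decremented once
    per retry, so a reading of 0 means every allowed retry was used.
    """
    if not 0 <= byte8 <= max_retries:
        raise ValueError("byte 8 is outside the expected 0..max_retries range")
    return max_retries - byte8
```

Under this assumption a reading of 0 maps to 3 retries, and a reading of 3 maps to no retries at all.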
Figure 10-26: RA80 Byte 9 Last Opcode

Byte 9: x x x x | x x x x   Last opcode (extended drive status byte)

Opcode of the last level 2 drive command decoded by the drive (received from the SDI controller):

81 = Change mode
82 = Change controller flags
03 = Diagnose
84 = Disconnect (drive)
05 = Drive clear
06 = Error recovery
87 = Get common characteristics
88 = Get subunit characteristics
0A = Initiate seek
8B = Online
0C = Run
8D = Read memory
8E = Recalibrate
90 = Topology
0F = Write memory
FF = Select group (level 1 command, processed by firmware seek head select subroutines)

Figure 10-27: RA80 Byte 10 Drive-Detected SDI Error

Byte 10: x 0 x x | x 0 0 0   Drive-detected SDI error
Hard SDI transmission errors detected by drive receive logic circuitry

1 = R/W overrun/overwrite error. Sector pulse detected before finishing R/W transfer, or read/write gate up too long, or "glitch" detected as a sector pulse.
(PAR)  1 = Parity error detected on the RTCS line from controller to drive.
(CPE)  1 = Control pulse error. Two (2) or more pulses of same polarity detected on RTCS line from controller to drive.
(DPE)  1 = Data pulse error. Two (2) or more pulses of same polarity detected on CMD/WRT data line from controller to drive.

Figure 10-28: RA80 Bytes 11-15

Byte 12 (HI), Byte 11 (LO)  Cylinder requested during last seek command
Byte 13                     Group number selected during last seek command
Byte 14                     LED error code (see note 1)
Byte 15                     Front panel display fault code (see note 2)

NOTES:
1. Refer to LED error code list in service manual.
2. Refer to control panel fault code list in service manual.
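Error logs usually print bytes 11 and 12 as two separate hexadecimal values, so the cylinder number has to be reassembled by hand. The Python sketch below shows the arithmetic, assuming byte 11 is the low-order half and byte 12 the high-order half, as labeled in the RA80 status summary. The function name is hypothetical.

```python
def requested_cylinder(byte11: int, byte12: int) -> int:
    """Combine CYLINDER ADDR (LO) and (HI) into one cylinder number."""
    # Byte 12 is the high-order half, so it is shifted left by eight bits.
    return ((byte12 & 0xFF) << 8) | (byte11 & 0xFF)
```

For example, byte 11 = 34 and byte 12 = 02 (hex) combine to 0234 hex, which is cylinder 564 decimal.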
10.4 RA81 DRIVE STATUS DECODE

The following pages describe the decoding of the RA81 status bytes. These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA81. The RA81 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive. System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided to help you understand the meaning of these bytes and/or the meaning of bits within any of these bytes.

Figure 10-29: Summary of RA81 Drive Status Codes

BYTE 01  RESPONSE OPCODE
BYTE 02  UNIT SELECT (LOWER)
BYTE 03  UNIT + SUBUNIT MASK
BYTE 04  REQUEST BYTE           GENERIC DRIVE STATUS BYTE
BYTE 05  MODE BYTE              GENERIC DRIVE STATUS BYTE
BYTE 06  ERROR BYTE             GENERIC DRIVE STATUS BYTE
BYTE 07  CONTROLLER BYTE
BYTE 08  RETRY COUNT/FAILURE
BYTE 09  PREVIOUS CMD OPCODE    EXTENDED DRIVE STATUS BYTE
BYTE 10  SDI ERROR BITS         EXTENDED DRIVE STATUS BYTE
BYTE 11  CYLINDER ADDR (LO)     EXTENDED DRIVE STATUS BYTE
BYTE 12  CYLINDER ADDR (HI)     EXTENDED DRIVE STATUS BYTE
BYTE 13  CURRENT GROUP NUMBER   EXTENDED DRIVE STATUS BYTE
BYTE 14  LED ERROR CODE         EXTENDED DRIVE STATUS BYTE
BYTE 15  F.P. ERROR CODE        EXTENDED DRIVE STATUS BYTE

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).

Figure 10-30: RA81 Byte 1

Byte 1: x x x x | x x x x   RESPONSE opcode

This is the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user.
It indicates the success or non-success of the previous command sent from the controller to the drive.

Figure 10-31: RA81 Bytes 2-3

Byte 3: 0 0 0 1 | x x x x     Byte 2: x x x x | x x x x

Byte 3 (low four bits) and byte 2: Drive unit select number
Byte 3 (high four bits), from the lowest of these bits upward:
  Subunit 0 mask (subunit 0 reporting this status)
  Subunit 1 mask (not used)
  Subunit 2 mask (not used)
  Subunit 3 mask (not used)

NOTE
The RA81 has no multiple subunits and will always indicate Subunit 0.

Figure 10-32: RA81 Byte 4 Request Byte

Byte 4: x x x x | x x x x   Request byte (bits listed from least significant to most significant)

(RU)  0 = Run/stop switch out
      1 = Run/stop switch in
(PS)  0 = Port switch out
      1 = Port switch in
(PB)  0 = Port A receivers enabled
      1 = Port B receivers enabled
(EL)  0 = No loggable information in extended status area
      1 = Loggable information in extended status area
(SR)  0 = Spindle not ready (not up to speed)
      1 = Spindle ready
(DR)  0 = No diagnostic is being requested from the host
      1 = There is a request for a diagnostic to be loaded into the drive microprocessor memory
(RR)  0 = Drive requires no recalibrate command
      1 = Drive requests recalibrate command
(OA)  0 = Drive online or available to current controller
      1 = Drive unavailable (it is already online to another controller)

Figure 10-33: RA81 Byte 5 Mode Byte

Byte 5: 0 0 0 0 | x x x x   Mode byte (bits listed from least significant to most significant)

(S7)   0 = 512-byte sector format (16 bit)
       1 = 576-byte sector format (18 bit)
(DB)   0 = DBN area access disabled
       1 = DBN area access enabled
(FO)   0 = Formatting operations disabled
       1 = Formatting operations enabled
(DD)   0 = Drive enabled by controller error routine or diagnostic
       1 = Drive disabled by controller error routine or diagnostic (fault light = ON)
(W1)   0 = Write protect switch for subunit 0 is out
       1 = Write protect switch for subunit 0 is in
(W2)   Not implemented
(ED1)  Not implemented
(ED0)  Not implemented

Figure 10-34: RA81 Byte 6 Error Byte

Byte 6: x x x 0 | x 0 0 0   Error byte

(WE)  0 = No error
      1 = Write lock error (attempt to write while write protected)
      Not used
(DF)  0 = No error
      1 = Drive failure during init
(PE)  0 = No error
      1 = Level 2 protocol error (improper command codes or parameters issued to drive)
(RE)  0 = No error
      1 = SDI receive error on SDI transmission line(s) from controller
(DE)  0 = No error
      1 = Drive error (drive fault light may be on; possibly clearable via DRIVE CLEAR command)

Figure 10-35: RA81 Bytes 7-8 Controller Byte

Byte 7: 0 0 0 0 | x x x x   Controller byte
  0000 = Normal drive operation
  1000 = Drive offline (under control of a diagnostic)
  1001 = Drive offline (another drive has same unit select identifier)
  (S1) 1 = (not used)
  (S2) 1 = (not used)
  (S3) 1 = (not used)
  (S4) 1 = (not used)

Byte 8: x x x x | x x x x   Retry count/failure code

NOTE
Byte 8: Retry count during the last SEEK or RECALIBRATE command (the number of times the command was retried before successful completion). The microcode allows a maximum of 3 retries before the seek or recalibration is aborted. A value of 0 indicates that 3 retries were completed, because the counter is decremented toward 0 on each retry.

Figure 10-36: RA81 Byte 9 Last Opcode

Byte 9: x x x x | x x x x   Last opcode (extended drive status byte)

Opcode of the last level 2 drive command decoded by the drive (received from the SDI controller):

81 = Change mode
82 = Change controller flags
03 = Diagnose
84 = Disconnect (drive)
05 = Drive clear
06 = Error recovery
87 = Get common characteristics
88 = Get subunit characteristics
0A = Initiate seek
8B = Online
0C = Run
8D = Read memory
8E = Recalibrate
90 = Topology
0F = Write memory
FF = Select group (level 1 command, processed by firmware seek head select subroutines)

Figure 10-37: RA81 Byte 10 Drive-Detected SDI Error

Byte 10: x 0 x x | x 0 0 0   Drive-detected SDI error
Hard SDI transmission errors detected by drive receive logic circuitry

1 = R/W overrun/overwrite error. Sector pulse detected before finishing R/W transfer, or read/write gate up too long, or "glitch" detected as a sector pulse.
(PAR)  1 = Parity error detected on the RTCS line from controller to drive.
(CPE)  1 = Control pulse error. Two (2) or more pulses of same polarity detected on RTCS line from controller to drive.
(DPE)  1 = Data pulse error. Two (2) or more pulses of same polarity detected on CMD/WRT data line from controller to drive.

Figure 10-38: RA81 Bytes 11-15

Byte 12 (HI), Byte 11 (LO)  Cylinder requested during last seek command
Byte 13                     Group number currently selected (this is also the R/W head number for an RA81)
Byte 14                     LED error code (see note 1)
Byte 15                     Front panel display fault code (see note 2)

NOTES:
1. Refer to LED error code list in service manual.
2. Refer to control panel fault code list in service manual.

10.5 RA82 DRIVE STATUS DECODE

The following pages describe the decoding of the RA82 status bytes.
These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA82. The RA82 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive. System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided to help you understand the meaning of these bytes and of the bits within them.

Figure 10-39: RA82 Drive Status Decode

   BYTE 01   RESPONSE OPCODE
   BYTE 02   UNIT SELECT (LOWER)
   BYTE 03   UNIT + SUBUNIT MASK
   BYTE 04   REQUEST BYTE            GENERIC DRIVE STATUS BYTE
   BYTE 05   MODE BYTE               GENERIC DRIVE STATUS BYTE
   BYTE 06   ERROR BYTE              GENERIC DRIVE STATUS BYTE
   BYTE 07   CONTROLLER BYTE
   BYTE 08   RETRY COUNT/FAILURE
   BYTE 09   PREVIOUS CMD OPCODE     EXTENDED DRIVE STATUS BYTE
   BYTE 10   INTERNAL PORT REG       EXTENDED DRIVE STATUS BYTE
   BYTE 11   CYLINDER ADDR (LO)      EXTENDED DRIVE STATUS BYTE
   BYTE 12   CYLINDER ADDR (HI)      EXTENDED DRIVE STATUS BYTE
   BYTE 13   RECOVERY / GROUP NO.    EXTENDED DRIVE STATUS BYTE
   BYTE 14   LED ERROR CODE          EXTENDED DRIVE STATUS BYTE
   BYTE 15   F.P. FAULT CODE         EXTENDED DRIVE STATUS BYTE

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).

Figure 10-40: RA82 Byte 1

   Byte 1:  x x x x | x x x x     RESPONSE opcode

This will be the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user. It indicates the success or failure of the previous command sent from the controller to the drive.

Figure 10-41: RA82 Bytes 2-3

   Bytes 2-3 carry the following fields:
      Drive unit select number
      Subunit 0 mask (subunit 0 reporting this status)
      Subunit 1 mask (not used)
      Subunit 2 mask (not used)
      Subunit 3 mask (not used)

NOTE
The RA82 has no multiple subunits and will always indicate Subunit 0.

Figure 10-42: RA82 Byte 4 Request Byte

   Byte 4:  x x x x | x x x x

   (RU)  0 = Run/stop switch out
         1 = Run/stop switch in
   (PS)  0 = Port switch out
         1 = Port switch in
   (PB)  0 = Port A receivers enabled
         1 = Port B receivers enabled
   (EL)  0 = No loggable information in extended status area
         1 = Loggable information in extended status area

   [The remainder of Figure 10-42 and the figures for Byte 5 (Mode Byte) and Byte 6 (Error Byte) are illegible in this copy. Only the last Error Byte entry survives:]

   (DE)  0 = No error
         1 = Drive error (drive fault light may be on; possibly clearable via drive clear command)

Figure 10-45: RA82 Bytes 7-8 Controller Byte

   Byte 7:  0 0 0 0 | x x x x     Controller byte

      0000 = Normal drive operation
      1000 = Drive offline (under control of a diagnostic)
      1001 = Drive offline (another drive has same unit select identifier)

      (S1) through (S4): not used

   Byte 8:  x x x x | x x x x     Retry count/failure code

NOTE
Byte 8: Retry count during the last SEEK or RECALIBRATE command. This is the number of times the command was retried internally by the RA82 in an attempt to complete the SEEK or RECALIBRATE operation successfully.

Figure 10-46: RA82 Byte 9 Last Opcode

   Byte 9:  x x x x | x x x x     Last opcode (extended drive status byte)

   Opcode of the last level 2 drive command decoded by the drive (received from the SDI controller):

      81 = Change mode
      82 = Change controller flags
      03 = Diagnose
      84 = Disconnect (drive)
      05 = Drive clear
      06 = Error recovery
      87 = Get common characteristics
      88 = Get subunit characteristics
      0A = Initiate seek
      8B = Online
      0C = Run
      8D = Read memory
      8E = Recalibrate
      90 = Topology
      0F = Write memory
      FF = Select group (level 1 command, processed by firmware seek/head select subroutines)

Figure 10-47: RA82 Byte 10 Real-Time Drive Port Image

   Byte 10:  x x x x | x x x x    Real-time drive port image

   An internal RA82 byte that reflects the condition of the internal port selection hardware and the drive state bits currently active at the port.

      1 = Port B RTDS (output) enabled. Generally indicates that the Port B switch is enabled.
      1 = Port A RTDS (output) enabled. Generally indicates that the Port A switch is enabled.
      1 = Port B (RTCS + WRT/CMD input) enabled. Generally indicates that Port B is currently online to a controller.
      1 = Port A (RTCS + WRT/CMD input) enabled. Generally indicates that Port A is currently online to a controller.
      1 = Available asserted
      1 = Attention asserted
      1 = R/W ready asserted
      1 = Receiver ready asserted

Figure 10-48: RA82 Bytes 11-15

   BYTES 11-12   CYLINDER requested during last seek command.
   BYTE 13       GROUP # currently selected. This will be the R/W head number in an RA82.
                 ERROR RECOVERY LEVEL that the RA82 is currently executing.
   BYTE 14       LED ERROR CODE.
   BYTE 15       FRONT PANEL DISPLAY FAULT CODE; refer to the RA82 Control Panel Fault Code list in the service manual.

10.6 RA90 DRIVE STATUS DECODE

The following pages describe the decoding of the RA90 status bytes. These bytes are passed from the drive to the controller when a GET STATUS command is issued to the RA90. The RA90 also provides these status bytes to the controller if a command from the controller fails to execute properly within the drive. System error logs, controller error logs, and diagnostic utilities often display most of the drive status bytes. This decoding information is provided to help you understand the meaning of these bytes and of the bits within them.

Figure 10-49: Summary of RA90 Drive Status Codes

   Byte 01   Response opcode
   Byte 02   Unit number low byte
   Byte 03   Subunit mask
   Byte 04   Request byte               Generic drive status byte
   Byte 05   Mode byte                  Generic drive status byte
   Byte 06   Error byte                 Generic drive status byte
   Byte 07   Controller byte            Generic drive status byte
   Byte 08   Retry count
   Byte 09   Previous command opcode    Extended drive status
   Byte 10   HDA revision bits          Extended drive status
   Byte 11   Cylinder address (lo)      Extended drive status
   Byte 12   Cylinder address (hi)      Extended drive status
   Byte 13   Recovery LVL / Group No.   Extended drive status
   Byte 14   Error code                 Extended drive status
   Byte 15   MFG fault code             Extended drive status

Bytes 2 through 15 are generally available from system error logs, the HSC console display, and various diagnostics that test RA-series drives. The format in which this information is displayed depends upon the specific type of system (VMS, RSTS, RSX, etc.) or the specific type of controller (HSC50, HSC70, etc.).
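The byte summaries can be turned into a small lookup when reading a raw dump. The following is a minimal sketch, not a DIGITAL tool: it assumes the twelve values for bytes 4 through 15 have already been copied out of an error log in byte order, and the function name and dictionary keys are hypothetical. Bytes 10, 14 and 15 are left generically named because their meaning differs between drive types. The test values are taken from Sample 3 (an RA82) later in this chapter.

```python
from typing import Dict, Sequence


def label_status_bytes(status: Sequence[int]) -> Dict[str, int]:
    """Attach names to drive status bytes 4 through 15 (12 values).

    The names follow the summary figures; the helper itself is only
    an illustration and is not part of any DIGITAL utility.
    """
    if len(status) != 12:
        raise ValueError("expected 12 values (status bytes 4 through 15)")
    (request, mode, error, controller, retry,
     last_opcode, byte10, cyl_lo, cyl_hi, byte13, byte14, byte15) = status
    return {
        "request": request,          # byte 4, generic status
        "mode": mode,                # byte 5, generic status
        "error": error,              # byte 6, generic status
        "controller": controller,    # byte 7
        "retry": retry,              # byte 8
        "last_opcode": last_opcode,  # byte 9
        "byte10": byte10,            # drive-type dependent
        # Byte 11 is the low half and byte 12 the high half of the cylinder.
        "cylinder": (cyl_hi << 8) | cyl_lo,
        "byte13": byte13,            # recovery level / group number
        "byte14": byte14,            # drive-type dependent error code
        "byte15": byte15,            # drive-type dependent fault code
    }


# Values from Sample 3: Request 1B, Mode 00, Error 40, Controller 00,
# Retry 00, then extended status 0A 0B 65 00 0B 4F 0C.
sample3 = [0x1B, 0x00, 0x40, 0x00, 0x00,
           0x0A, 0x0B, 0x65, 0x00, 0x0B, 0x4F, 0x0C]
fields = label_status_bytes(sample3)
print(fields["cylinder"])
```

The combined cylinder value agrees with the "CURRENT CYLINDER, #101." line in the Sample 3 VMS report.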
Figure 10-50: RA90 Byte 1

   Byte 1:  x x x x | x x x x     RESPONSE opcode

This will be the RESPONSE opcode from the drive to the controller, but it is rarely displayed to a user. It indicates the success or failure of the previous command sent from the controller to the drive.

Figure 10-51: RA90 Bytes 2-3

   Bytes 2-3 carry the following fields:
      Drive unit select number
      Subunit 0 mask (subunit 0 reporting this status)
      Subunit 1 mask (not used)
      Subunit 2 mask (not used)
      Subunit 3 mask (not used)

NOTE
The RA90 has no multiple subunits and will always indicate Subunit 0.

Figure 10-52: RA90 Byte 4 Request Byte

   Byte 4:  x x x x | x x x x     (fields listed from bit 0, RU, up to bit 7, OA)

   (RU)  0 = Run/stop switch out
         1 = Run/stop switch in
   (PS)  0 = Port switch out
         1 = Port switch in
   (PB)  0 = Port A receivers enabled
         1 = Port B receivers enabled
   (EL)  0 = No loggable information in extended status area
         1 = Loggable information in extended status area
   (SR)  0 = Spindle not ready (not up to speed)
         1 = Spindle ready
   (DR)  0 = No diagnostic is being requested from the host
         1 = There is a request for a diagnostic to be loaded into the drive microprocessor memory
   (RR)  0 = Drive requires no recalibrate command
         1 = Drive requests recalibrate command
   (OA)  0 = Drive online or available to current controller
         1 = Drive unavailable (it is already online to another controller)

Figure 10-53: RA90 Byte 5 Mode Byte

   Byte 5:  x x 0 x | x x x x

   (S7)  0 = 512-byte sector format (16 bit)
         1 = 576-byte sector format (18 bit)
             (No current plan to implement 18-bit)
   (DB)  0 = DBN area access disabled
         1 = DBN area access enabled
   (FO)  0 = Formatting operations disabled
         1 = Formatting operations enabled
   (DD)  0 = Drive enabled by controller error routine or diagnostic
         1 = Drive disabled by controller error routine or diagnostic (fault light = ON)
   (W1)  0 = Write protect switch for subunit 0 is out
         1 = Write protect switch for subunit 0 is in
   (W2)  Not implemented
   (ED1) Error log disable (set by 2-board controller diagnostics)
   (ED0) Error log disable (set by 2-board controller diagnostics)

Figure 10-54: RA90 Byte 6 Error Byte

   Byte 6:  x x x x | x 0 0 0

   (WE)  0 = No error
         1 = Write lock error (attempt to write while write protected)
   (DF)  0 = No error
         1 = Drive failure during initialization
   (PE)  0 = No error
         1 = Level 2 protocol error (improper command codes or parameters issued to drive)
   (RE)  0 = No error
         1 = SDI receive error on SDI transmission line(s) from controller
   (DE)  0 = No error
         1 = Drive error (drive fault light may be on; possibly clearable via drive clear command)

Figure 10-55: RA90 Bytes 7-8 Controller Byte

   Byte 7:  0 0 0 0 | x x x x     Controller byte

      0000 = Normal drive operation
      1000 = Drive offline (under control of a diagnostic)
      1001 = Drive offline (another drive has same unit select identifier)

      (S1) through (S4): not used

   Byte 8:  x x x x | x x x x     Retry count/failure code

NOTE
Byte 8: Retry count during the last SEEK or RECALIBRATE command. This is the number of times the command was retried internally by the RA90 in an attempt to complete the SEEK or RECALIBRATE operation successfully.
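The Request Byte lends itself to a mechanical decode. The sketch below assumes that the fields in the Request Byte figure are listed from bit 0 (RU) up to bit 7 (OA). That ordering is an inference from the sample reports in this chapter, where Request 1B is reported as run/stop in, port switch in, loggable information, spindle ready, and Request 13 as the same set without loggable information. The function and table names are hypothetical.

```python
# Request byte fields, assumed to run from bit 0 (RU) to bit 7 (OA).
# Each entry is (mnemonic, meaning when the bit is 1).
REQUEST_BITS = [
    ("RU", "Run/stop switch in"),
    ("PS", "Port switch in"),
    ("PB", "Port B receivers enabled"),
    ("EL", "Loggable information in extended status area"),
    ("SR", "Spindle ready"),
    ("DR", "Diagnostic load requested"),
    ("RR", "Drive requests recalibrate command"),
    ("OA", "Drive unavailable (online to another controller)"),
]


def decode_request_byte(value: int) -> list:
    """Return the mnemonics of all bits that are set in the request byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("request byte must be in the range 0x00..0xFF")
    return [name for bit, (name, _meaning) in enumerate(REQUEST_BITS)
            if value & (1 << bit)]


# Request bytes seen in the sample error reports of this chapter.
for request in (0x1B, 0x13):
    print(f"{request:02X}: {decode_request_byte(request)}")
```

A clear PB bit means Port A receivers are enabled, which is why the VMS reports for both values also print "PORT A RECEIVERS ENABLED".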
Figure 10-56: RA90 Byte 9 Last Opcode

   Byte 9:  x x x x | x x x x     Last opcode (extended drive status byte)

   Opcode of the last level 2 drive command decoded by the drive (received from the SDI controller):

      81 = Change mode
      82 = Change controller flags
      03 = Diagnose
      84 = Disconnect (drive)
      05 = Drive clear
      06 = Error recovery
      87 = Get common characteristics
      88 = Get subunit characteristics
      0A = Initiate seek
      8B = Online
      0C = Run
      8D = Read memory
      8E = Recalibrate
      90 = Topology
      0F = Write memory
      FF = Select group (level 1 command, processed by firmware seek/head select subroutines)

Figure 10-57: RA90 Byte 10 HDA Revision Bits

   Byte 10   HDA revision bits (bits are used to identify HDA revision)
             HDA revision bit 01
             HDA revision bit 02

Figure 10-58: RA90 Bytes 11-15

   BYTES 11-12   CYLINDER REQUESTED DURING LAST SEEK COMMAND
   BYTE 13       GROUP NUMBER CURRENTLY SELECTED (WILL BE R/W HEAD NUMBER IN AN RA90)
                 ERROR RECOVERY LEVEL RA90 IS CURRENTLY EXECUTING
   BYTE 14       DRIVE ERROR CODE (SEE NOTE)
   BYTE 15       MFG FAULT CODE. INDICATES TO MODULE REPAIR CENTERS (AS CLOSELY AS POSSIBLE) THE AREA OF LOGIC SPECIFICALLY IN QUESTION

NOTE: REFER TO RA90 ERROR CODE LIST IN SERVICE MANUAL.

10.7 Status Error Decoding Sample 1

ERROR-E Drive detected error at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          444.
   Error Flags        40
   Event              00EB
   Request            1B
   Mode               00
   Error              80
   Controller         00
   Retry/fail         00
   Extended Status    0C
                      0B
                      00
                      00
                      00
                      C0
                      30
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 2.

**************************** ENTRY 3. ****************************
ERROR SEQUENCE 83.                    LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     01BC        SEQUENCE #444.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       40          OPERATION CONTINUING
MSLG$W_EVENT       00EB        DRIVE ERROR
                               DRIVE DETECTED ERROR
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 3.

MSLG$B_UNIT_SVR    01          UNIT SOFTWARE VERSION #1.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         1B          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               LOG INFORMATION IN EXTENDED AREA
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           80          DRIVE ERROR
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

CONTROLLER OR DEVICE DEPENDENT INFORMATION
   LED CODE        C0
   PANEL CODE      30
   LAST OPCODE     0C          RUN
   PORT IMAGE      0B          PORT B RTDS ENABLED
                               PORT A RTDS ENABLED
                               PORT A ENABLED

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 4.

   CUR CYLNDR      0000        CURRENT CYLINDER, #0.
   CUR GROUP       00          CURRENT GROUP, #0.
   REQUESTOR       07          REQUESTOR #7.
   DRIVE PORT      00          DRIVE PORT #0.

10.8 Status Error Decoding Sample 2

ERROR-E SI Command Timeout at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          489.
   Error Flags        41
   Event              002B
   Request            13
   Mode               00
   Error              00
   Controller         00
   Retry/fail         00
   Extended Status    0A
                      0B
                      8F
                      05
                      00
                      00
                      00
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

ERROR-E Drive Detected Error at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          490.
   Error Flags        40
   Event              00EB
   Request            1B
   Mode               00
   Error              80
   Controller         00
   Retry/fail         00
   Extended Status    0A
                      0B
                      01
                      00
                      01
                      4D
                      28
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 2.

**************************** ENTRY 5. ****************************
ERROR SEQUENCE 257.                   LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     01E9        SEQUENCE #489.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       41          SEQUENCE NUMBER RESET
                               OPERATION CONTINUING
MSLG$W_EVENT       002B        DRIVE ERROR
                               DRIVE COMMAND TIMEOUT
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 3.

MSLG$B_UNIT_SVR    01          UNIT SOFTWARE VERSION #1.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         13          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           00
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

DEVICE DEPENDENT INFORMATION
   LONGWORD 1.     058F0B0A    /..../
   LONGWORD 2.     07000000    /..../
   LONGWORD 3.     00000000    /..../
   LONGWORD 4.     00000000    /..../

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 4.

**************************** ENTRY 6. ****************************
ERROR SEQUENCE 258.                   LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     01EA        SEQUENCE #490.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       40          OPERATION CONTINUING
MSLG$W_EVENT       00EB        DRIVE ERROR
                               DRIVE DETECTED ERROR
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 5.

MSLG$B_UNIT_SVR    01          UNIT SOFTWARE VERSION #1.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         1B          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               LOG INFORMATION IN EXTENDED AREA
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           80          DRIVE ERROR
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

CONTROLLER OR DEVICE DEPENDENT INFORMATION
   LED CODE        4D
   PANEL CODE      28
   LAST OPCODE     0A          INITIATE SEEK
   PORT IMAGE      0B          PORT B RTDS ENABLED
                               PORT A RTDS ENABLED
                               PORT A ENABLED

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 6.

   CUR CYLNDR      0001        CURRENT CYLINDER, #1.
   CUR GROUP       01          CURRENT GROUP, #1.
   REQUESTOR       07          REQUESTOR #7.
   DRIVE PORT      00          DRIVE PORT #0.

10.9 Status Error Decoding Sample 3

ERROR-E Drive detected error at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          1.
   Error Flags        41
   Event              00EB
   Request            1B
   Mode               00
   Error              40
   Controller         00
   Retry/fail         00
   Extended Status    0A
                      0B
                      65
                      00
                      0B
                      4F
                      0C
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 2.

**************************** ENTRY 3. ****************************
ERROR SEQUENCE 690.                   LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     0001        SEQUENCE #1.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       41          SEQUENCE NUMBER RESET
                               OPERATION CONTINUING
MSLG$W_EVENT       00EB        DRIVE ERROR
                               DRIVE DETECTED ERROR
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 3.

MSLG$B_UNIT_SVR    E5          UNIT SOFTWARE VERSION #229.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         1B          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               LOG INFORMATION IN EXTENDED AREA
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           40          SDI RECEIVE ERROR
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

CONTROLLER OR DEVICE DEPENDENT INFORMATION
   LED CODE        4F
   PANEL CODE      0C
   LAST OPCODE     0A          INITIATE SEEK
   PORT IMAGE      0B          PORT B RTDS ENABLED
                               PORT A RTDS ENABLED
                               PORT A ENABLED

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 4.

   CUR CYLNDR      0065        CURRENT CYLINDER, #101.
   CUR GROUP       0B          CURRENT GROUP, #11.
   REQUESTOR       07          REQUESTOR #7.
   DRIVE PORT      00          DRIVE PORT #0.
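When the VMS report prints undecoded longwords instead of named fields (as in the Command Timeout entry of Sample 2), the individual status bytes can still be recovered. The sketch below assumes each longword holds four bytes with the first byte in the low-order position, which matches the sample: LONGWORD 1 058F0B0A corresponds to the HSC extended status bytes 0A 0B 8F 05. The port image decode assumes the entries of the RA82 Byte 10 figure are listed from bit 0 upward, which agrees with how port image 0B is reported. All function names are hypothetical.

```python
def longwords_to_bytes(longwords):
    """Unpack 32-bit longwords into bytes, low-order byte first."""
    out = []
    for lw in longwords:
        out.extend(lw.to_bytes(4, "little"))
    return out


# RA82 byte 10 entries, assumed to run from bit 0 to bit 7.
PORT_IMAGE_BITS = [
    "Port B RTDS enabled",
    "Port A RTDS enabled",
    "Port B input enabled",
    "Port A input enabled",
    "Available asserted",
    "Attention asserted",
    "R/W ready asserted",
    "Receiver ready asserted",
]


def decode_port_image(value):
    """Return the descriptions of all bits set in the port image byte."""
    return [text for bit, text in enumerate(PORT_IMAGE_BITS)
            if value & (1 << bit)]


# LONGWORD 1 and LONGWORD 2 from the Sample 2 Command Timeout entry.
raw = longwords_to_bytes([0x058F0B0A, 0x07000000])
extended_status, requestor = raw[:7], raw[7]
print(" ".join(f"{b:02X}" for b in extended_status))
print(decode_port_image(extended_status[1]))
```

In this sample the eighth byte is 07, the same value the HSC report prints as the requestor number.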
10.10 Status Error Decoding Sample 4

ERROR-E SI Receiver Ready Collision at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          430.
   Error Flags        41
   Event              01AB
   Request            13
   Mode               00
   Error              00
   Controller         00
   Retry/fail         00
   Extended Status    90
                      0B
                      C7
                      02
                      07
                      00
                      00
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

ERROR-E SI Receiver Ready Collision at 8-apr-1986 15:11:44.37
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          431.
   Error Flags        40
   Event              01AB
   Request            13
   Mode               00
   Error              00
   Controller         00
   Retry/fail         00
   Extended Status    90
                      0B
                      C7
                      02
                      07
                      00
                      00
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

DISK-E Seq. 8. at 8-apr-1986 15:12:01.77
   Unrecoverable error on disk unit 66.
   Drive appears inoperative. Intervention required.

ERROR-E SI Receiver Ready Collision at 8-apr-1986 15:12:01.77
   Command Ref #      00000000
   RA82 unit #        66.
   Err Seq #          432.
   Error Flags        00
   Event              01AB
   Request            13
   Mode               00
   Error              00
   Controller         00
   Retry/fail         00
   Extended Status    90
                      0B
                      C7
                      02
                      07
                      00
                      00
   Requestor #        7.
   Drive port #       0.
ERROR-I End of error.

ERROR-E Position or Unintelligible Header Error at 8-apr-1986 15:12:01.77
   Command Ref #      130E0006
   RA82 unit #        66.
   Err Seq #          433.
   Error Flags        00
   Event              006B
   Recovery level     7.
   Recovery count     0.
   LBN                608485.
   Orig err flags     014000
   Recovery flags     000002
   Lvl A retry cnt    3.
   Lvl B retry cnt    0.
   Buffer addr        141706
   Source Req.        7.
   Detecting Req.     7.
ERROR-I End of error.

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 2.

**************************** ENTRY 6. ****************************
ERROR SEQUENCE 26.                    LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     01AE        SEQUENCE #430.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       41          SEQUENCE NUMBER RESET
                               OPERATION CONTINUING
MSLG$W_EVENT       01AB        DRIVE ERROR
                               RECEIVER READY COLLISION
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 3.

MSLG$B_UNIT_SVR    01          UNIT SOFTWARE VERSION #1.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         13          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           00
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

DEVICE DEPENDENT INFORMATION
   LONGWORD 1.     02C70B90    /..../
   LONGWORD 2.     07000007    /..../
   LONGWORD 3.     00000000    /..../
   LONGWORD 4.     00000000    /..../

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 4.

**************************** ENTRY 7. ****************************
ERROR SEQUENCE 27.                    LOGGED ON SID 01380A4F
ERL$LOGMESSAGE ENTRY  8-APR-1986 15:11:44.37
                                      KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

MESSAGE TYPE       0001        DISK MSCP MESSAGE
MSLG$L_CMD_REF     00000000
MSLG$W_UNIT        0042        UNIT #66.
MSLG$W_SEQ_NUM     01AF        SEQUENCE #431.
MSLG$B_FORMAT      03          "SDI" ERROR
MSLG$B_FLAGS       40          OPERATION CONTINUING
MSLG$W_EVENT       01AB        DRIVE ERROR
                               RECEIVER READY COLLISION
MSLG$Q_CNT_ID      0000F807
                   01010000    UNIQUE IDENTIFIER, 00000000F807
                               MASS STORAGE CONTROLLER
                               HSC70
MSLG$B_CNT_SVR     02          CONTROLLER SOFTWARE VERSION #2.
MSLG$B_CNT_HVR     00          CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT    0050
MSLG$Q_UNIT_ID     00000108
                   020B0000    UNIQUE IDENTIFIER, 000000000108
                               DISK CLASS DEVICE
                               RA82

V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 5.

MSLG$B_UNIT_SVR    01          UNIT SOFTWARE VERSION #1.
MSLG$B_UNIT_HVR    0F          UNIT HARDWARE REVISION #15.
MSLG$L_VOL_SER     03C769A2    VOLUME SERIAL #63400354.
MSLG$L_HEADER      00000000    LBN #0.
                               GOOD LOGICAL SECTOR
MSLG$Z_SDI
   REQUEST         13          RUN/STOP SWITCH IN
                               PORT SWITCH IN
                               SPINDLE READY
                               PORT A RECEIVERS ENABLED
   MODE            00          512-BYTE SECTOR FORMAT
   ERROR           00
   CONTROLLER      00          NORMAL DRIVE OPERATION
   RETRY           00          0. RETRIES LEFT

DEVICE DEPENDENT INFORMATION
   LONGWORD 1.     02C70B90    /..../
   LONGWORD 2.     07000007    /..../
   LONGWORD 3.     00000000    /..../
   LONGWORD 4.     00000000    /..../

************************************************************************
1. INTERVENING RECORD(S) WILL BE PRINTED AT INPUT FILE " "

VMS V4.4 Error Log Entry Formatter Problem

   RETRY           00          0. RETRIES LEFT

   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   THIS IS WHERE THE EXTENDED DRIVE STATUS BYTES
   SHOULD HAVE BEEN DISPLAYED.
   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

CHAPTER 11
VAXSIMPLUS

11.1 VAXsimPLUS OVERVIEW

SYMPTOM DIRECTED DIAGNOSIS

o  ARTIFICIAL INTELLIGENCE (AI) TECHNOLOGY
o  SDD TOOLS KIT:
      SERVICE SPEAR
      Basic RFTS
      VAXsim-PLUS
o  DEC PROPRIETARY - NOT FOR SALE

VAXSIMPLUS

o  THE PLUS FEATURE
      Predictive Analysis
      Automatic Disk Substitution
o  BENEFITS TO THE CUSTOMER
      Enhanced Data Integrity
      Higher System Availability
      Perceived Higher Reliability
      Problem Notification
o  BENEFITS TO FIELD SERVICE
      Automatic FRU Analysis
      Automatic Symptom Directed Diagnosis
      Formatted Evidence Data

11.2 VAXsimPLUS PHONE NUMBERS

VAXsimPLUS Service Delivery

VAXsimPLUS notifies customers of impending problems and supplies a DIGITAL phone number.

   PL01   CSC/AT (Atlanta)    1-800-241-2546
   PL31   CSC/CX (Colorado)   1-800-525-6570

The CSC will log the call, the event code (VAXsim Theory Number), etc. The CSC will diagnose the problem and take appropriate actions.

11.3 VAXsimPLUS RESOURCES

   Part Number       Component                             Qty
   EY-7687-PO-001    VAXsimPLUS Training Course
   QLX07-RW          Entire SDD TOOLS KIT:
   AA-KN79A-TE       Getting Started with VAXsimPLUS       25
   AA-KN80A-TE       VAXsimPLUS User's Guide               25
   AB-KN81A-TE       SDD Tools Kit Installation Guide
   AA-KN82A-TE       VAXsimPLUS Field Service Guide        25
   AA-J917B-RE       VAX SPEAR Manual
   AA-J917B-R1       VAX SPEAR Manual Update #1
   AV-M381B-RE       VAX SPEAR Reference Card              5
   AV-P012A-TK       Guide to Measuring Up Time            25
   AV-KQ93A-TE       Field Service Tools Cover Letter      1
   AV-KQ94A-TE       FS SDD Tools T+C Amendment, Part 1    25
   AV-KV74A-TE       FS SDD Tools T+C Amendment, Part 2    25
   99-07862-01       Binder                                1
   AV-KY93A-TE       VAXsimPLUS Spine Insert               25
   AQ-KQ91A-RE       FS SDD Tool V1.0 B in TK50
   BB-KQ92A-RE       FS SDD Tool V1.0 B in 16MT9

   [Three further quantities of 5 appear in the source table without a legible row assignment.]

The distribution of the SDD TOOL KIT will be automatic to unit managers on the Control Diagnostics Distribution List.
Digital Internal Use Only 11-5 VAXsimPLUS VAXsimPLUS MESSAGE TYPES 11-6 o MEDIA Error Messages o 501 Error Messages o DRIVE-DETECTED (Non-Media) Error Messages Digital Internal Use Only VAXsimPLUS 11.4 VAXsimPLUS MESSAGE EXAMPLES VAXsimPLUS Message Examples The following pages contain examples of messages generated by VAXsimPLUS. These messages were generated as a result of disk subsystem errors detected by VAXsimPLUS. Digital Internal Use Only 11-7 VAXsimPlUS Examples User Example 1 VAXsimPLUS RA Disk Notification Message VAXsimPLUS has detected that the following device needs attention: $3$DUA161 (RA70 S/N:18CB) EVENT CODE: [xx.xx.xx.xx] * Autocopy was not started -- (The autocopy switch is turned off) There were 11 total media related events for this drive. Event Type Soft Hard Number 11 o Suggested recovery procedure (A): 1. Start appropriate backup or copy procedures for your site. 2. Notify Digital Field Service (include event code in service call info). Field service phone: 1-800-224-1900 11-8 Digital Internal Use Only VAXsimPlUS Examples User Example 2 VAXsimPLUS RA Disk Notification Message VAXsimPLUS has detected that the following device needs attention: $3$DUASO (RA81 S/N:2C9SE) EVENT CODE: [~~.AA.~~.:~J * Autocopy was not started -- (There are too many hard errors) There were 19 total media related events for this drive. Event Type Soft Hard Number 17 2 Suggested recovery procedure (B): 1. Notify Digital Field Service (include event code in service call info). Field service phone: 1-800-224-1900 2. Continued use of this drive may result in more hard errors occurring. Take this into account in determining if you wish to continue using this drive or"start a backup or copy operation. 
User Example 3

VAXsimPLUS RA Disk Notification Message

VAXsimPLUS has detected that the following device needs attention:

    $6$DUA62 (RA81 S/N:95E01)          EVENT CODE: [xx.xx.xx.xx]

  * Autocopy was not started -- (There are too many hard errors)

There were 22 total media related events for this drive.

    Event Type    Soft    Hard
    Number          13       9

Suggested recovery procedure (C):

  1. This message is to notify you that one or more hard errors have been
     detected and the errors did not fall into one of the failure modes.
     You may want to determine what file the hard error(s) occurred on and
     do the appropriate action based on that.
     Field service phone: 1-800-224-1900

User Example 4

VAXsimPLUS RA Disk Notification Message

VAXsimPLUS has detected that the following device needs attention:

    $3$DUA151 (RA70 S/N:17CA)          EVENT CODE: [xx.xx.xx.xx]

  * Autocopy was not started -- (The autocopy switch is turned off)

There were 47 total non-media events for this drive.

Suggested recovery procedure (D):

  1. Notify Digital Field Service (include event code in service call info).
     Field service phone: 1-800-224-1900

User Example 5

VAXsimPLUS RA Disk Notification Message

VAXsimPLUS has detected that the following device needs attention:

    $3$DUA151 (RA70 S/N:17CA)          EVENT CODE: [xx.xx.xx.xx]

  * Autocopy was started

There were 47 total non-media events for this drive.

Suggested recovery procedure (E):

  1. Notify Digital Field Service (include event code in service call info).
     Field service phone: 1-800-224-1900
  2. Do a VAXsim/DISMOUNT after autocopy is finished.
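When a customer reads a notification message over the phone or forwards it by mail, the fields that matter for the service call are the device name, drive type, serial number, autocopy state, event counts, and the lettered recovery procedure. The following Python sketch pulls those fields out of message text laid out like the user notification examples above. It is only an illustration: the function name, the dictionary keys, and the regular expressions are assumptions made for this example and are not part of VAXsimPLUS, and the sample text is modeled on User Example 1.

```python
import re

# Patterns modeled on the layout of the user notification examples.
# All names below are hypothetical; VAXsimPLUS itself defines no such API.
PATTERNS = (
    # e.g. "$3$DUA161 (RA70 S/N:18CB)"
    re.compile(r"(?P<device>\$\d+\$DU[A-Z]\d+)\s+\((?P<model>RA\d+)\s+S/N:(?P<serial>[0-9A-F]+)\)"),
    # e.g. "* Autocopy was not started -- (The autocopy switch is turned off)"
    re.compile(r"\*\s*Autocopy was (?P<autocopy>not started|started)"
               r"(?:\s*--\s*\((?P<reason>[^)]*)\))?"),
    # e.g. "There were 11 total media related events for this drive."
    re.compile(r"There were (?P<total>\d+) total (?P<kind>media related|non-media) events"),
    # e.g. "Number 11 0" (soft count, then hard count)
    re.compile(r"Number\s+(?P<soft>\d+)\s+(?P<hard>\d+)"),
    # e.g. "Suggested recovery procedure (A):"
    re.compile(r"Suggested recovery procedure \((?P<procedure>[A-E])\)"),
)


def parse_notification(text):
    """Return a dict of the fields found in one notification message."""
    fields = {}
    for pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            # Skip optional groups that did not take part in the match.
            fields.update({k: v for k, v in match.groupdict().items() if v is not None})
    # Convert the numeric fields so they can be compared and summed.
    for key in ("total", "soft", "hard"):
        if key in fields:
            fields[key] = int(fields[key])
    return fields


# Sample text modeled on User Example 1.
SAMPLE = """VAXsimPLUS RA Disk Notification Message
VAXsimPLUS has detected that the following device needs attention:
$3$DUA161 (RA70 S/N:18CB) EVENT CODE: [xx.xx.xx.xx]
* Autocopy was not started -- (The autocopy switch is turned off)
There were 11 total media related events for this drive.
Event Type Soft Hard
Number 11 0
Suggested recovery procedure (A):
"""

if __name__ == "__main__":
    for name, value in parse_notification(SAMPLE).items():
        print(f"{name}: {value}")
```

Messages for non-media events (as in User Examples 4 and 5) carry no Soft/Hard line, so the sketch simply omits those keys instead of guessing values.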
User Example 6

From:   FELIX::SYSTEM
To:     NODE::NORMAN
Subj:   BOSY::RA81 S/N:F4AC

Attn:   Field Service
Device: FATCAT$DUA4 (RA81 S/N:F4AC)
        MIDCAT$DUA4
Theory: [xx.xx.xx.xx]

Evidence (All results are in decimal except LED Code):
Total errors on drive: 17

    Head  Phys. Cyl.  Sector From Index  Soft Count  Hard Count
    8 255 0 3 1 8 255 0 1 8 257 0 1 0 8 310 1 0 8 338 1 8 0 461 4 1 8 0 461 1 8 462 0 4 1 8 462 0 1 0 8 621 1 8 885 0 1 8 1009 4 0 8 1083 1 0 0 9 1133 1

    Volume Serial Number   Error Type (Led Code is in hex)
    --------------------   -------------------------------
    20728                  Lost Read/Write Ready
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  Lost Read/Write Ready
    20728                  LED 39, Write and Off Track
    20728                  Lost Read/Write Ready
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track
    20728                  LED 39, Write and Off Track

User Example 7

From:   GRAMPS::SYSTEM      3-NOV-1987 21:48
To:     DETMAC::NICHOLS
Subj:   GRAMPS::GRAMPS$DUA2 analysis

Attn:   Field Service
Device: GRAMPS$DUA2 (RA81 S/N:A352)
Theory: [xx.xx.xx.xx]

Evidence (All results are in decimal except LED Code):
Total errors on drive: 10

    Head  Phys. Cyl.  Sector From Index  Soft Count  Hard Count  Volume Serial Number  Error Type (Led Code is in hex)
    ----  ----------  -----------------  ----------  ----------  --------------------  -------------------------------
       1         760                  3           0           1                116949  ECC Error
       2         451                 11           0           1                116949  ECC Error
       2         451                 22           0           1                116949  ECC Error
       2         629                 34           0           1                116949  ECC Error
       2         637                 31           0           1                116949  ECC Error
       2         662                  9           0           1                116949  ECC Error
       2         665                  5           0           1                116949  ECC Error
       2         749                 21           0           1                116949  ECC Error
       2         760                 29           0           1                116949  ECC Error
       2         761                 44           0           1                116949  ECC Error

Time: 1-NOV 23:45:07 TO 3-NOV 23:38:47   Span: 47:54:39

User Example 8

From:   DTHSTR::SYSTEM      17-AUG-1987 20:34
To:     NODE::NORMAN
Subj:   DTHSTR::$3$DUA9,LUKE$DUS9 analysis

Attn:   Field Service
Device: LUKE$DUA9 (RA82 S/N:22C3)
        LUKE$DUS9
Theory: [xx.xx.xx.xx]

NOTE: There were 1 hard errors recorded for this device.
      A hard error is defined as an error on a block in which BBR was
      invoked and the data was replaced with 'Force Error'.

Evidence (All results are in decimal except LED Code):
Total errors on drive: 1

    Head  Phys. Cyl.  Sector From Index  Soft Count  Hard Count  Volume Serial Number
    ----  ----------  -----------------  ----------  ----------  --------------------
      11         432                  3           0           1              72109769

Time of Error: 11-AUG 09:29:32

User Example 9

From:   DTHSTR::SYSTEM      17-NOV-1987 23:07
To:     NODE::JEFFRY
Subj:   DIMILO::$3$DUA50 analysis

Attn:   Field Service
Device: OBIWAN$DUA50 (RA81 S/N:2C95E)
Theory: [xx.xx.xx.xx]

NOTE: There were 1 hard errors recorded for this device.
      A hard error is defined as an error on a block in which BBR was
      invoked and the data was replaced with 'Force Error'.

Evidence (All results are in decimal except LED Code):
Total errors on drive: 5

    Head  Soft Count  Hard Count  Volume Serial Number  Error Type (Led Code is in hex)
    ----  ----------  ----------  --------------------  -------------------------------
       0           0           1                142538  ECC Error
       0           1           0                142538  Lost Read/Write Ready
       0           1           0                142538  LED 39, Write and Off Track
       0           1           0                142538  LED 39, Write and Off Track
       0           1           0                142538  ECC Error

    Phys. Cyl. / Sector From Index: 47 299 302 18 302 306 314 18

Time: 17-NOV 20:49:41 TO 17-NOV 22:59:28   Span: 2:10:47

User Example 10

From:   DTHSTR::SYSTEM      4-NOV-1987 04:45
To:     NODE::MSTONE
Subj:   GUITAR::$3$DUA191 analysis

Attn:   Field Service
Device: C3PO$DUA191 (RA82 S/N:57A2)
Theory: [xx.xx.xx.xx]

Evidence: 18 SDI Communication Errors

    Count   EventStatus Translation
    -----   -----------------------
       4.   ( 4B) Controller Detected Transmission Errors
      14.   (10B) Controller Detected Pulse or State Parity Errors

Since there were SDI communication errors which could have caused media
related transfer errors to occur, the following is a summary of those
media errors:

Evidence (All results are in decimal):
Total media errors on drive: 2

    Head  Volume Serial Number  Error Type (Led Code is in hex)
    ----  --------------------  -------------------------------
       3                101859  ECC Error
       4                101859  ECC Error

    Cyl. / Sector From Index: 3 6 23 0
    Soft Count / Hard Count:  0 1 1 0

Time: 4-NOV 01:45:11 TO 4-NOV 04:22:13   Span: 2:37:01

User Example 11

From:   PICKUP::SYSTEM      22-JUL-1987 14:00
To:     NODE::NORMAN
Subj:   PICKUP::$1$DUA15,$1$DUS52 analysis

Attn:   Field Service
Device: HSC015$DUA15 (RA80 S/N:163A)
        HSC015$DUS52
Theory: [xx.xx.xx.xx]

NOTE: There were 2 hard errors recorded for this device.
      A hard error is defined as an error on a block in which BBR was
      invoked and the data was replaced with 'Force Error'.

Evidence (All results are in decimal except LED Code):
Total errors on drive: 18

Sector Phys. From Soft Hard Head Cyl.
Index Count Count 0 0 0 0 3 4 5 6 6 9 9 10 10 10 11 11 11 11 252 510 510 510 3 250 252 530 530 540 540 517 517 517 2 2 71 71 30 20' 21 24 23 28 12 26 29 12 14 11 12 20 7 25 20 26 Time: 22-JUL 13:43:16 1 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 TO Volume Serial Number Error Type (Led Code is in hex) ---------- -------------------------------------17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 17758 Lost Read/Write Ready Eee Error Lost Read/Write Ready Positioning Error Lost Read/Write Ready Eee Error Lost Read/Write Ready Lost Read/Write Ready Positioning Error Lost Read/Write Ready Positioning Error ECe Error Lost Read/Write Ready Positioning Error Lost Read/Write Ready Eee Error Lost Read/Write Ready Lost Read/Write Ready 22-JUL 14:01:51 Span: 0:19:34 Dt1fh\ fVo..1 \'" 't ?I)J. c')N cl \(.!w S~,vt..)cJ l)t,UIl ~,1 SO'\citro I \4 1) It-!'{{. ~' 11-18 Digital Internal Use Only I ,i \ pi,~y\~k!o:\ 1:11' ,WV\L ~t'-t\ "11 f:. ',' 0, I I~ VAXsimPLUS Examples User Example 12 From: To: Subj: GRAMPS: : SYSTEM 5-NOV-1987 10:14 FREDH GRAMPS::GRAMPS$DUA2 analysis Attn: Device: Field Service GRAMPS$DUA2 (RA81 S/N:A352) Theory: [xx.xx.xx.xx] NOTE: There were 3 hard errors recorded for this device. A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error'. Evidence (All results are in decimal except LED Code): Total errors on drive: 728 Volume Serial Number Sector Phys. From Soft Hard Head Cyl. 
Index Count Count 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 6 6 6 6 189 275 415 435 444 449 479 483 513 514 516 529 624 624 636 711 363 415 415 419 419 419 420 420 425 425 429 1251 670 775 967 1127 31 23 1 17 46 3 24 51 50 12 34 17 14 50 40 50 47 9 40 o 35 39 4 41 37 47 22 43 51 22 35 1 Time: 4-NOV 16:55:52 (Truncated) o o o o o o o o o 1 1 1 1 1 1 1 1 1 1 o o 1 1 1 o o o o o o o 1 1 1 1 1 1 1 1 1 1 1 1 1 1 o o o o o o o o o o o 2 1 1 1 1 o o o TO 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 116949 Error Type (Led Code is in hex) ECC Error ECC E~ror ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error Invalid Header Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error 5-NOV 12:01:02 Span: 19: 05 :09 Digital Internal Use Only 11-19 VAXsimPLUS Examples User Example 13 From: To: Subj: DTHSTR::SYSTEM 12-0CT-1987 12:24 FREDH GUITAR::$3$DUA35 analysis Attn: Device: Field Service HAN$DUA35 (RA81 S/N:2AD2B) Theory: [xx.xx.xx.xx] NOTE: There were 2 hard errors recorded for this device. A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error' . Evidence (All results are in decimal except LED Code): Total errors on drive: 6 Sector Phys. From Soft Hard Head Cyl. 
Index Count 3 3 5 13 44 45 54 44 13 13 13 13 2 2 0 0 Time: 12-0CT 10:31:10 11-20 Digital Internal Use Only Volume Serial VAXsimPLUS Examples User Example 14 From: To: Subj: DTHSTR: : SYSTEM 13-0CT-1987 04:12 8672::SMITHJ OBOE::$3$DUA143 analysis Attn: Device: Field Service C3POSDUA143 (RA70 S/N:O) LUKESDUA143 Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 52 Sector Phys. From Soft Hard Head Cyl. Index Count Count o o o o o o o 4 4 4 6 6 6 6 6 6 7 8 8 8 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 2 4 12 37 45 386 417 369 387 409 205 205 365 368 373 405 371 405 406 409 365 370 372 373 374 381 391 405 410 411 416 183 183 372 384 385 393 397 403 405 2 10 Time: 10-OCT 10:17:21 o 3 3 1 1 1 1 o o o o o o o o o o o 1 1 2 1 2 1 1 1 o o o o o o o 1 1 1 2 1 1 1 o o o o o o o o o o o o 1 1 1 1 1 1 1 2 1 o o 1 1 o o 1 1 1 2 1 1 1 o o o o o TO Volume Serial Number 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 Error Type (Led Code is in hex) LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off Lost Read/Write Ready ~D 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off Lost Read/Write Ready LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write 
and Off LED 39, Write and Off LED 39, Write and Off LED 39, Write and Off 13-0CT 03:54:53 Span: Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track 65:38 :31 Digital Internal Use Only 11-21 VAXsimPLUS Examples User Example 15 From: To: SUbj: COOKIE::SYSTEM GENRAL: :USER1 COOKIE::RA82 S/N:59 27-0CT-1987 01:10 Attn: Field Service Device: CAROB$DUA120 (RA82 S/N:59) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code): Total errors on drive: 6 Sector Phys. From Soft Hard Head Cyl. Index Count Count 7 7 711 49 711 Time: 26-0CT 14:20:16 5 1 0 0 TO 11-22 Digital Internal Use Only Volume Serial Number ---------- Error Type (Led Code is in hex) 62600173 Positioning Error 62600173 LED 4D, Bad Embedded Servo During Write 26-0CT 16:55:44 Span: VAXsimPLUS Examples User Example 16 From: To: Subj: USMRM9: : SYSTEM 15-DEC-1987 09:36 GENRAL: : USER4 USMlU16::$1$DUA34 analysis Attn: Device: Field Service HSCOOl$DUA34 (RA81 S/N:122C3) Theory: [xx.xx.xx.xx] NOTE: There were 4 hard errors recorded for this device. A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error' . Evidence (All results are in decimal except LED Code) : Total errors on drive: 8 Sector Phys. From Soft Hard Head Cyl. 
Inde~: Count Count 2 2 2 2 2 8 8 11 2 1 67 67 : ~ s~"e,(f,l~~ 67 67 47 0 259 1 31 1224 1 1 1 0 44} ·0 Time: 15-DEC 01:39:30 1 1 1 1 0 0 0 TO Volume Serial Number Error Type (Led Code is in hex) ---------- -------------------------------------58142 58142 58142 58142 58142 58142 58142 58142 LED 25, Servo Check ECC Error ECC Error Positioning Error Positioning Error LED 25, Servo Check ECC Error LED 25, Servo Check I5-DEC 11:36:31 Span: 9:57:01 S{)VVCJ ~lw Htrl1 ~(1)' ~r ~ t'__S '0 Digital Internal Use Only 11-23 VAXsimPLUS Examples User Example 17 From: To: Subj: DTHSTR: : SYSTEM 28-JUL-1987 12:26 NODE: : NORMAN DTHSTR::$3$DUA77 analysis Attn: Device: Field Service LUKE$DUA77 (RA81 S/N:151E) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 14 Sector Phys. From Soft Hard Head Cyl. Index Count Count 9 9 9 9 9 9 9 33 2 33 3 33 5 33 8 33 9 33 18 33 Time: 2S-JUL 12:20:40 1 1 1 1 1 1 o o 8 o o o o o TO 11-24 Digital Internal Use Only Volume Serial Number 81052 81052 81052 8105,2 81052 81052 81052 Error Type (Led Code is in hex) Lost Read/Write Ready 0 Lost Read/Write Ready ?ICh~iJ\ (}.,.t II 1\". ,,_,,(\, ~ Lost Read/Write Ready ~ K(J~~~~~~ Lost Read/Write Ready SC-'v{)~¢~ R/W f:'\ ~ci1~ ?,:",h\. ,~L Lost Read/Write Ready L/Sc.P'" C, yO~\\l\C ' Lost Read/Write Ready S)~vVp \I ')LED 4D, Bad Embedded Servo During Write rh., r, IS ( 28-JUL 12:21:30 Span: 0:01:50 VAXsimPLUS Examples User Example 18 From: To: Subj: COOKIE: : SYSTEM System Management" 14-DEC-1987 15:53 GENRAL: : JACKN COOKIE::$3$DUAl19 analysis Attn: Device: Field Service SPICE$DUA119 (RA82 S/N:47) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code): Total errors on drive: 40 Sector Phys. From Soft Hard HeadCyl. 
Index Count Count 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 641 2 4 5 6 7 8 1~ 12 13 14 15 16 17 18 19 23 24 25 27 28 29 30 32 33 34 35 37 39 44 45 47 50 51 52 56 57 Time: 14-DEC 15:44:16 o o o o o o o o o o o o o o o o o o o o 1 1 1 1 1 1 1 1 1 1 1 1 J. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 o o o o o o o o 1 o 1 1 1 1 1 1 1 o o o o o o o TO Volume Serial Number 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 64301171 Error Type (Led Code is in hex) Invalid Header Error Positioning Error Positioning Error Invalid Header Error Positioning Error Positioning Error Positioning Error Positioning Error Invalid Header Error Positioning Error Posi~ioning Error Positioning Error Invalid Header Error Positioning Error Positioning Error Positioning Error Positioning Error Invalid Header Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Positioning Error Invalid Header Error Positioning Error Positioning Error 14-DEC 15:45:27 Span: HD~", ?W\'J,t ~ dl S0~ J ~wJJ S'.RyVd prd",~ 'P,.c.,' 0:01:11 Digital Internal Use Only 11-25 VAXsimPLUS Examples User Example 19 From: To: Subj: 6-AUG-1987 15:17 WIMPY: : SYSTEM NODE: : JACKSN WIMPY: :RA82 S/N:3A Attn: Field Service Device: HSC010$DUA44 (RA82 S/N:3A) HSC010$DUS1 Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 8 Sector 
Phys. From Soft Hard Head Cyl. Index Count Count 1 3 3 3 3 3 4 4 1421 270 272 273 275 276 216 217 46 26 26 26 26 26 20 20 Time: 31-JUL 01:34:44 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 TO 11-26 Digital Internal Use Only Volume Serial Number Error Type (Led Code is in hex) ---------- -------------------------------------44 44 44 44 44 44 44 44 Positioning Error ECC Error ECC Error Positioning Error ECC Error ECe Error Ece Error Ece Error 6-AUG 15:16:59 Span: 157: 42: 14 ':9-~" l~ <\'() \i\ue - :\ \1>''\ ~~I" i;t S\:" vJ~\\ ~ (--0 cf.N . VAXsimPLUS Examples User Example 20 From: To: Subj: DTHSTR: : SYSTEM 17-FEB-1988 02:47 NODE: : NORMAN TUBA::$3$DUA41 analysis Attn: Device: Field Service SAX$DUA41 (RA82 S/N:835A) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 12 Sector Phys. From Soft Hard Head Cyl. Index Count Count 2 2 9 11 36 36 Time: la-FEB 21:05:09 0 0 5 7 Volume Serial Number ---------- 105021 105021 TO Digital Internal Use Only 11-27 VAXsimPLUS Examples User Example 21 From: To: Subj: USMRM5: : SYSTEM 15-FEB-1988 06:43 GENRAL: : FIELD USMRM5::$1$DUA113 analysis Attn: Device: Field Service HSC011$DUA113 (RA81.S/N:8D6B) Theory: [xx.xx.xx.xx] Evidence: LED CODE (HEX) Count 4. 00 01 F1 20. 11. Translation of LED CODE Undefined LED code; no translation available (HARDCORE) l Spindle motor speed transducer timeout -t-i)Uc:..Q.. fU\<;;«-J 1V'I\\~\'r, Slave load timeout The drive detected errors may have been responsible for the following 1 SI events: Evidence : 1 SOI Communication Transfer Errors Count 1. EventStatus Translation lAB) Receiver Ready Collision Errors Time: 13-FEB 18:14:13 11-28 TO Digital Internal Use Only lS-FEB 08:40:31 Span: 38: 26: 18 \ S(1'';'~ ~ 1 P L-r iV'p'\ ~I~\ VAXsimPLUS Examples User Example 22 From: To: SUbj: ZEPHYR: : SYSTEM 21-NOV-1987 14:32 FIELDS ZEPHYR::$5$DUA1 analysis Attn: Device: Field Service ZEPHYR$DUA1 (RA81 S/N:O) Theory: [xx.xx.xx.xx] Evidence: LED CODE (HEX) Count 28. 2. 
Translation of LED CODE F1 Lobt(-i Slave load timeout Slave seek timeout F8 ij,~S::~5 cf1QI'A' "*' ~0'\1 \VI ~ The following drive-detected errors may have occurred as a resu~t of the above errors. These errors do not have any extencied status, hence no valid led code was available. Evicience: 2 Errors Count Request Byte 2. ------13 Time: 21-NOV 15:15:53 Mode Byte Error Byte ------- ------00 TO 80 Controller Byte ---------- 21-NOV 16:27:19 00 Span: 1:11:26 Digital Internal Use Only 11-29 VAXsimPLUS Examples User Example 23 From: To: Subj: TUBA: : SYSTEM 20-0CT-1987 15:48 NODE: : NORMAN TUBA::$3$DUA77 analysis Attn: Device: Field Service HAN$DUA77 (RA82 S/N:C4E) Theory: [xx.xx.xx.xx] Evidence: Count LED CODE (HEX) 213. OB 1. 1F Translation of LED CODE ~;;~~~-~;;;~-;-~~~-~;~wm r~ b OPCODE. COM lIh {(tV\ The command opcode within a level 2 command from a SDI controller was received with good parity but did not match any of the valid SDI level 2 command opcodes known by the drive. .. 1ft R/W SECTOR OVERRUN ERROR. COh. mu {'II f of! (i "", ( if{ Ct «:J .e t..(l,$1t The drive detected READ or WRITE gate asserted while simultaneously detecting the presence of a sector pulse or an index pulse. Since there were SDI communication errors which could have caused media related transfer errors to occur, the following is a summary of those media errors: Evidence (All results are in decimal) : Total media errors on drive: 195 Head 313 331 316 333 318 329 324 326 331 321 711 711 327 311 320 321 315 328 322 316 0 0 0 5 5 6 6 6 8 8 10 10 10 10 12 12 12 13 14 14 T~e: 11-30 Sector From Soft Hard Cyl. Index Count Count 0 13 20 30 52 0 10 14 17 30 3 4 7 17 9 9 12 1 2 9 1 1 1 1 1 1 1 1 1 1 3 2 1 1 1 1 1 1 1 1 16-0CT 14:04:42 (Truncat.ed) 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 TO Lett 5 Digital Internal Use Only Volume Serial Number Error Type (Led Code is in he:~) ---------- --------------------------------------504 ECC Error 504 ECC Error ') 1d tltll tQ)lJ! cJ1. 
504 504 504 504 504 504 504 504 504 504 504 504 504 504 504 504 504 504 ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC ECC 19-0CT 16:57:45 W\ ~ Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Span: H'ti b.' icll. (r\, ~a-t///« JftP,,~ dj f,/ .i,·V (t~f~t) 74:53:03 . /,)L VAXsimPLUS Examples User Example 24 From: To: Subj: DTHSTR: : SYSTEM 11-0CT-1987 23:15 NODE: : NORMAN DRUM: : $3$DUA143 analysis Attn: Device: Field Service C3PO$DUA143 (RA70 S/N:O) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code): Total errors on drive: 29 Sector Phys. From Soft Hard Head Cyl. Index Count Count 0 0 4 4 4 6 6 6 8 9 9 9 9 9 9 9 9 9 9 10 10 10 10 396 417 369 387 409 365 369 373 405 365 370 374 375 381 391 405 410 411 416 372 393 397 403 Time: 10-0CT 10:17:21 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ..., 2 2 1 1 1 1 1 ., .. 1 1 1 2 1 1 2 1 1 2 1 2 TO Volume Serial Number Error Type (Led Code is in hex) ---------- -------------------------------------60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 60032 LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED LED 11-0CT 22:00:26 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Write Span: and and and and and and and and and and and and and and and and and and and and and and and Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Off Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track Track t \,(,h-\Q)tM-~ fG~ 35:43:04 Digital Internal Use Only 11-31 te>Y~JJc VAXsimPLUS Examples User Example 25 From: To: Subj: DTHSTR: : 
SYSTEM 19-NOV-1987 09:34 JACKN GUlTAR::$3$DUA553 analysis Attn: Device: Field Service LANDO$DUA553 (RA90 S/N:225) Theory: [xx.xx.xx.xx] Evidence: Error Code (HEX) Count 20. 21 :~~~~~~:~~~-~=-~~~~~-=~~~ SDl Pulse Error CbW\V't"- ~ rOb \~l(Y' I"sQc)tv\ W\ The drive detected errors may have been responsible for the following 14 Sl events: Evidence : 14 SDl Communication Transfer Errors Count 1. 11. 2. 11-32 EventStatus Translation 2B) SDl Drive Command Timeout Errors CB) Lost Receiver Ready for Transfer Errors 16B} Drive Failed Initialization Errors Digital Internal Use Only <-~ 1"0?\~ "'"~ VAXsimPLUS Examples User Example 26 From: To: Subj: COOKIE::SYSTEM System Management" 14-DEC-1987 15:53 GENRAL : : JACKN LANDO$DUAI69 analysis Attn: Device: Field Service LANDO$DUAI69 (RA70 S/N:O) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 143 Sector Phys. From Soft Hard Head Cyl. Index Count Count 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 9 9 9 9 9 9 9 10 10 10 10 10 10 10 Time: 158 158 158 158 158 158 158 158 158 158 158 158 158 753 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 158 Volume Serial Number 9 1 2 o o 10 1 o 11 1 2 1 1 2 1 1 2 8 12 13 14 15 16 17 18 19 20 15 o 1 2 3 4 o o o o o o o o o o o o o o o o o o 1 1 29 1 1 2 1 1 5 2 6 1 2 1 1 2 1 o 1 2 4 5 6 33 12 13 14 15 16 17 18 5-NOV 14:46:53 (truncated) o o o o o o 1 1 2 1 1 o o ·0 o o 2 o o o 1 1 1 TO 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 172500556 Error Type (Led Code is in hex) Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning 
Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning Positioning 5-NOV 14:56:17 Span: Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error S€r \rO CI Yv 0 VV\~ h \-~N\~ \ ~~ c"-" ~ ~S \l K (HDr9) 0:09:24 Digital Internal Use Only 11-33 )VY.~tJ VAXsimPLUS Examples User Example 'Z1 From: To: Subj: COOKIE::SYSTEM 14-DEC-198715:53 GENRAL: : JACKN FATCAT$DUA5 analysis Attn: Device: Field Service FATCAT$DUA5 (RA81 S/N:F885) MIDCAT$DUA5 Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code): Total errors on drive: 68 Sector Phys. From Soft Hard Head Cyl. Index Count Count 12 12 12 12 12 12 12 12 12 " ., .l. .... 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 9 6 356 45 428 1160 0 1160 15 1160 28 1160 1174 8 1174 46 1221 31 1226 10 1236 3 1238 33 1241 8 318 24 342 16 396 15 442 46 445 14 445 50 958 34 980 17 1055 39 1070 51 1071 46 1119 15 47 1136 1174 16 1174 24 1174 36 1175 11 1176 18 1178 11 1223 23 1223 23 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Volume Serial Number Error Type (Led Code is in hex) ---------- -------------------------------------0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 46726 Lost Read/Write Ready ECC Error LED 39, Write and Off Track ECC Error Lost Read/Write Ready ECC Error LED 39, Write and Off Track ECe Error Positioning Error ECC Error Ece Error ECC Error ECC Error ECC Error ECC Error ECe Error 
Ece Error Positioning Error ECC Error Positioning Error ECe Error ECC Error ECC Error Positioning Error ECC Error ECC Error ECC Error Invalid Header Error Positioning Error Positioning Error ECe Error Positioning Error ECe Error Ece Error Invalid Header Error Time: 24-JUL 10:42:56 TO 31-JUL 08:49:55 «) truncated> 11-34 Digital Internal Use Only Span: 166:07:58 ( ~~) S(Am,Q, 9uv4O- Q JL C ..J Pi (".1\)0)-- VAXsimPLUS Examples User Example 28 From: To: SUbj: PICKUP::SYSTEM 22-JUL-1987 14:00 NODE: : NORMAN PICKUP$DUA71 analysis Attn: Device: Field Service PICKUP$DUA71 (RA80 S/N:O) Theory: [~:x.xx.xx NOTE: .xx] There were 1 hard errors recorded for this device. A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error'. Evidence (All results are in decimal except LED Code): Total errors on drive: 6 Sector Phys. From Soft Hard Head Cyl. Index Count Count 4 4 4 6 6 6 Time: 72 82 123 2 42 119 1 8 18 5 4 27 2-MAR 15:10:33 0 1 0 0 0 0 1 0 1 1 1 1 TO Volume Serial Number ---------- Error Type (Led Code is in hex) 1 Positioning Error ... ECC Error ~ Invalid Header Error 1 Positioning Error 1 ECe Error 1 Positioning Error 1 5-MAR. 15:25:12 Span: 72:15:38 Digital Internal Use Only 11-35 V~Xsim,PLUSExamples VserEx~mpIEt29 From: To: Subj: MUFFIN: : SYSTEM 7-MAY-1988 10:23 GENRAL: : HOLMES MUFFIN::$5$DUA1 analysis Attn: Device: Field Service GREASY$DUA1 (RA81 S/N:20D9) Theory: [xx.xx.xx.xx] ~ LfY\ b'f'vvf l0~p he Evidence: 18 SDl Communication Errors \~""""'D(k·t.:\' \~\ \ ~\<~ U, Count 18. Time: EventStatus Translation (lOB) Controller Detected Pulse or State Parity Errors 7-MAY 10:02:45 TO 7-MAY 10:19:16 Span: 0:16:31 VAX'simp~tt1S>'EX~hTl~jes User Exampre,'30 From: To: Subj: PICKUP::SYSTEM 25-FEB-198814:35 REEMAN PICKUP::$1$DUA252 analysis Attn: Device: Field Service HSC015$DUA252 (RA90 S/N:20B) Theory,: [xx.xx.xx.xx] Evidence: 19 SDl Communication Errors Count - 1. --- 18. 
EventStatus Translation ( 2B) SDl Drive Command Timeout Errors (lAB) Receiver Ready Collision Errors The following 1 drive detected errors may be related to the above SDl errors: Evidence: Count 1. Error Mfg. Code Code (HEX) (HEX) Translation of Error Code 1F 2D Sector Overrun Error Time: 24-FEB 15:37:50 TO 25-FEB 14:32:50 Span: 22:55:00 VAXsimPLUS Examples User Example 31 From: To: Subj: 30-MAR-1988 16:28 HICKUP : : SYSTEM GENRAL: : HIMES HICKUP::$5$DUA231 analysis Attn: Device: Field Service WHEEZY$DUA231 (RA81 S/N:2ED92) Theory: [xx.xx.xx.xx] Evidence: LED CODE (HEX) Count 1. Xt\ ~. Status error byte non-zero while atternpt~ng to execute a command Two or more pulses of the same polarity are detected on the controller real-time state line (control pulse error) Two or more pulses of the same polarity are detected on the controller write command data line (data pulse error), SOl controller response. time out ?c-vS"d..... cJ.A 'Lr ~Iotfl)(e. tl tltlf'c;vf 00 21 -' 452. ~ 8. 22 1. Translation of LED Code 41 The drive detected errors may have been responsible for the following 405 £I events: Evidence : 405 SDI Communication Transfer Errors Count EventStatus Translation -----3. 396. 6. 2B) SOl Drive Command Timeout Errors CB) Lost Receiver Ready for Transfer Errors 16B) Drive Failed Initialization Errors Since there were SOl communication errors which could have caused media related transfer errors to occur, the following is a summary of those media errors: Evidence (All results are in decimal, except LED code) : Total media errors on drive: 438 Head Sector From Soft Hard Cyl. 
Index Count Count 0 0 0 0 102 236 627 451 2 3 11 18 " " " " " 13 13 13 13 279 418 280 546 43 44 50 50 Time: 25-MAR 17:06:32 0 0 0 0 1 1 1 1 " " 1 1 1 1 0 0 0 0 TO 11-38 Digital Internal Use Only Volume Serial Number Error Type (Led Code is in hex) ---------- --------------------------------------152034 152034 152034 152034 Lost Read/Write Ready Positioning Error Lost Read/Write Ready Lost Read/Write Ready " " " 152034 152034 152034 152034 Lost Lost Lost Lost 30-MAR 15:27:47 " " " " Read/Write Read/Write Read/Write Read/Write " " Ready Ready Ready Ready Span: 118:21:14 ?e I-S (L, C,JTLR. d VAXsimPLUS Examples User Example 32 From: To: Subj: HlCKUP: : SYSTEM 30-MAR-1988 10:50 USERS HlCKUP::$5$DUA232 analysis Attn: Device: Field Service WHEEZY$DUA232 (RA81 S/N:2ED7C) Theory: [xx.xx.xx.xx] Evidence: /1 Count LED CODE (HEX) -(~~ 21 (56. Crp<'/ ?.eM /,wJl'flV Translation of LED Code Two the Two the 22 PloN6tr1 J'C~_ : or more pulses of the same polarity are detected on ~ controller real-time state line (control pulse error) or more pulses of ~he same polarity are detected on controller write command data line (data pulse error) The drive detected errors may have been responsible for the following 6 Sl events: Evidence : 6 SDI Communication Transfer Errors Count 2. 4. EventSta-cus Translation ( 2B) SDI Drive Command Timeout Errors "....., ( 16B) Drive Failed Initialization Errors/ Since there were SDl communication errors which could have caused media related transfer errors to occur, the following is a summary of those media errors: Evidence (All results are in decimal, except LED code): Total media errors on drive: 2 Head o o NOTE: Sector From Soft Hard Cyl. Index Count Count 1248 624 1 o 3 1 Volume Serial Number Error Type (Led Code is in hex) ~ ----~~~~:: ~~:~~:!~::~:~~~;~T~~-t~is/--1~:---iO~C-6" N'~#;;)('1 tl i('e;k1,fJlf1 There were 1 hard errors recorded for this device.' 
n A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error', or any Media Format Errors. Time: 30-MAR 09:31:31 TO 30-MAR 10:34:22 Span: /J/t IJ"\\ J 1:03:51 Digital Internal Use Only 11-39 el--Y I>\.~/ ictt IJQV! VAXsimPLUS Examples User Example 33 From: To: Subj: MUFFIN: : SYSTEM 17-MAY-1988 15:09 SERVICE MUFFIN::$5$DUA253 analysis Attn: Device: Field Service SLEAZY$DUA253 (RA90 S/N:1FF) Theory: [xx.xx.xx.xx] NOTE: There were 11 hard errors recorded for this device. A hard error is defined as an error on a block in which BBR was invoked and the data was replaced with 'Force Error'. Evidence (All results are in decimal except LED Code): Total errors on drive: 56 Sector Phys. From Soft Hard Head Cyl. Index Count Count o o o o o o o o o o o o o o o o o o o o o o o o o o o o o 1 1 1 40 40 40 528 528 528 528 528 528 528 528 528 528 914 914 914 914 914 914 914 914 914 1181 1181 1181 2416 2416 2416 2416 181 181 181 21 24 30 7 8 15 16 17 18 19 22 24 31 6 1 1 1 1 1 1 1 1 1 1 1 o o o o o o o o o o 1 o o o 1 o :. 
9 o 10 1 1 o o 1 1 1 1 11 12 17 20 21 23 14 19 20 39 51 52 53 35 50 51 Time: 16-MAY 08:32:30 1 o o o o o 1 1 1 1 1 1 1 o o o o o o o 1 1 1 1 o o o TO 11-40 Digital Internal Use Only Volume Serial Number Error Type (LED Code is in hex) 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 ECC Error ECC Error ECC Error Positioning Positioning ECC Error ECC Error ECC Error ECC Error ECC Error Positioning ECC Error ECC Error ECC Error ECC Error Positioning ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error 16-MAY 15:46:41 Span: Error Error oatoW.\\1"\) 7• H1)A Error Error 7:14:11 VAXsimPLUS Examples User Example 34 From: To: Subj: HICKUP: : SYSTEM 12-MAY-1988 15:53 GENRAL: : FHARDY HICKUP::$5$DUA253 ana1.ysis Attn: Device: Field Service GREASY$DUA254 (RA90 S/N:20A) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code): Total errors on drive: 16 Sector Phys. From Soft Hard Head Cyl. Index Count Count 3 3 4 4 4 4 4 4 6 8 9 9 9 11 11 11 27 49 2403 2420 2453 2453 2453 2562 2607 2507 2317 2357 2463 37 40 45 13 29 25 6 31 33 37 44 22 30 21 29 18 41 9 50 Time: 12-MAY 14:48:51 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 0 TO Volume Serial Number Error Type (LED Code is in hex) ---------- -------------------------------------521 521 521 521 521 521 521 521 521 521 521 521 521 521 5"" ... 521 ~ ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error ECC Error Positioning Error ECC Error ECC Error ECC Error Eee Error Positioning Error ECC Error ECC Error 12-MAY 15:48:32 Span: 17~1J~ C01-tt-UURJ\ prolo 1£)/1, ~ D\~f. . 
0: 60 : 40 Digital Internal Use Only 11-41 VAXsimPLUS Examples User Example 35 From: To: Subj: MUFFIN: : SYSTEM 29-MAR-1988 14:25 GENRAL: : FMILER MUFFIN::$5$DUA7 analysis Attn: Device: Field Service GRANPA$DUA7 (RA81 S/N:12D1) Theory: [xx.xx.xx.xx] Evidence: LED CODE (HEX) Count 45. 41 Translation of LED Code rpt'61)oL f ,,'\ SDl controller response time out The drive detected errors may have been responsible for the following 2 SI events: Evidence : 2 SDl Communication Transfer Errors Count 2. EventStat-us Digital Internal Use Only 29-MAR 14:16:43 Span: . , C~~\\:l\.tA ~\~~ c~4:T eQd J)t-~ 1--,J(",S:;)v(\\ C ~t· ' t A (lAB) Receiver Ready Collision Errors TO Jr \ m.)./C/L; p r~U',~$.d\lp~c)f~n1 \~,~,~e_kutJ Translation Time: 29-MAR 14:15:23 11~2 I'., 0:01:20 5" Dt It,t~_P d.-, Urrrt, f VAXsimPLUS Examples User Example 36 From: To: Subj: MUFFIN: : SYSTEM 21-MAR-1988 15:45 GENRAL: : HIMES MUFFIN::$5$DUA7 analysis Attn: Device: Field Service WHEEZY$DUA7 (RA81 S/N:ED9) Theory: [xx.xx.xx.xx] 'Initvl-( ~ \ -1-1 ece tr~(>j,L \ Evidence: LED CODE (HEX) Count 1. 4. 00 23 2. 34 41 46 FI 1. 4. 6. ® i'nl((;YC . Translation of LED Code Undefined LED code; no translation available (HARDCORE) Spindle motor interlock broken (belt tension lever is released) Read data separator/encoder error SDI controller response time out R/W safety interrupt occurred with no cause bits set"'::;' P11~1-" Slave load timeout The drive-detected errors may have been responsible for the following 5 SI events: Evidence : 5 SDI Communication Transfer Errors Count 3. 2. 
EventStatus Translation 2B) SDr Drive Command Timeout Errors I6B) Drive Failed Initialization Errors Time: 21-MAR 14:44:04 TO 2I-MAR 15:38:00 Span: 0:54:56 Digital Internal Use Only 11-43 VAXsimPLUS Examples User Example 37 From: To: Subj: MUFFIN: : SYSTEM 30-MAR-1988 16:48 GENRAL: : HIMES MUFFIN::$5$DUA35 analysis Attn: Device: Field Service WHEEZY$DUA35 (RA81 S/N:2ED93) Theory: [xx.xx.:-:x.xx] Evidence: 19 SDI Communication Errors Count 4. 6. 1. 8. EventStatus Translation 2B) SDr Drive Command Timeout Errors 4B) Controller Detected Transmission Errors lOB) Controller Detected Pulse or State Parity Errors 16B) Drive Failed Initialization Errors The following 2 drive detected errors may be related to the above SDI errors: Evidence: Count LED CODE (HEX) 1. IF 1. 21 Translation of LED Code A sector pulse is detected during the execution of a read or write of a sector Two or more pulses of the same polarity are detected on the controller real-time state line (control pulse error) Time: 26-MAR 09:21:57 11-44 TO Digital Internal Use Only 30-MAR 16:30:37 Span: 103:09:39 s VAXsimPLUS Examples User Example 38 From: To: Subj: PICKUP: : SYSTEM 25-FEB-1988 14:49 NEWTON PICKUP::$1$DUA252 analysis Attn: Device: Field Service HSC015$DUA252 (RA90 S/N:20B) Theory: [xx.xx.xx.xx] Evidence (All results are in decimal except LED Code) : Total errors on drive: 14 Sector Phys. From Soft Hard Head Cyl. Index Count Count 7 2370 0 14 Time: 25-FEB 14:46:38 TO Volume Serial Number Error Type (LED Code is in hex) ---------26 LED 4B, Index Error 25-FEB 14:47:41 Span: 0:01:03 - Vh X51 )'?~ SC11)' Sc.r-~fc,~ O'h ~?." ~ 5 f,; y~!.f' :to /}11011; :1./0 Digital Internal Use Only 11-45 VAXsimPLUS Examples User Example 39 12-MAY-1988 15:33 MUFFIN: : SYSTEM GENRAL: : JACKN MUFFIN::$5$D0A253 analysis From: To: SUbj: Attn: Device: Field Service SLEAZY$D0A253 (RA90 S/N:1FF) Theory: [xx.xx.xx.xx] Evidence: Error Code (HEX) Count Translation of Error Code 07 13 21 60 EB F3 1. 1. 407. 94. 25. 41. 
SOl Frame Sequence Failure Spindle Motor Control Fault SOl Transfer Error (Pulse Error) -Read/Write Head Select failure Unknown Error Code Servo Spinup Failed n . -M, ~\ prvh I QiV)\ The following drive detected errors may have occurred as a result of the above errors. These errors do not have any extended status, hence no valid led code was available. Evidence: 1 Errors Count Request Byte 1. Mode Byte ------- ------03 Error Byte ------- 00 00 Controller Byte ---------00 The drive detected errors may have been responsible for the following 307 SI events: Evidence : 307 SOl Communication Transfer Errors Count 8. 293. 5. 1. EventStatus 2B) CB) 16B) lAB) Translation SOl Orive Command Timeout Errors Lost Receiver Ready for Transfer Errors Drive Failed Initialization Errors Receiver Ready Collision Errors ........... CONTINUED 11-46 Digital Internal Use Only VAXsimPLUS Examples User Example 39 (continued) Since there were SOl communication errors which could have caused media related transfer errors to· occur, the following is a summary of those media errors: Evidence (All results are in decimal, except LED code) : Total media errors on drive: 341 Head 0 0 0 0 0 0 0 0 0 0 12 12 12 12 12 12 12 12 12 12 12 12 Time: Sector From Soft Hard Cyl. 
Index Count Count 1179 1388 408 1804 326 1892 2221 2226 2 545 2214 1418 1459 333 2107 1914 2256 1023 1967 1585 2109 821 0 0 1 2 6 9 9 9 12 17 1 1 1 1 1 1 1 1 1 1 " " " " 19 26 34 36 36 37 38 45 50 52 52 58 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0

Volume Serial Number / Error Type (LED code is in hex): 28, Lost Read/Write Ready (all 22 listed entries)

9-MAY 14:14:11 TO 12-MAY 15:19:51 Span: 73:06:40

CHAPTER 12
DSA DSDF/BBR

12.1 INTRODUCTION

This section introduces some of the features of the Digital Storage Architecture (DSA) that differ from other storage subsystems. The major topic is Bad Block Replacement (BBR). To help describe BBR, disk addressing, ECC detection and correction, and ECC thresholding are also covered. One section contains answers to frequently asked questions.

Throughout this document are references to blocks and sectors. The two terms are interchangeable. Also, the term controller refers to the UDA50, KDA50, KDB50, and HSC50/70.

12.2 OVERVIEW MATERIAL FOR UNDERSTANDING BBR

The following sections cover the terminology used in the BBR and REVECTOR topics described later in this document. These new DSA features are prerequisite to understanding BBR and REVECTOR.
12.2.1 LBN and RBN Association (Disk Organization for BBR)

The organization of logical blocks (LBNs) and replacement blocks (RBNs) on the RA-series drives is defined so that the user area of the disk remains constant regardless of the number of blocks that go bad. The constant number of blocks is maintained by supplying RBNs at the end of each track, which can be substituted for a bad block. A track on the disk contains many LBNs. The number of LBNs and RBNs depends on the device type. At this time, there is only one RBN per disk track. The BBR algorithm selects a replacement RBN by finding the first available, good RBN that is closest to the bad block.

A typical sector contains a header (replicated four times), a data area which can be 512 or 576 bytes in length, an EDC field, and an ECC field. The header, EDC field, and ECC field are described later.

A track on a disk is circular: it starts at index and continues until index is reached again. For the sake of illustration, Figure 12-1 depicts a track as a straight line (rather than circular, as it really is on the disk platter) to show the relationship between LBNs, RBNs, and the typical sector.

Figure 12-1: Disk Track and Sector Organization
[figure: tracks 0 through n laid out from the index pulse; typical sector shown as HEADER followed by DATA]

12.2.2 Disk Addressing

The DSA disk addressing and disk header concept is very different from the traditional schemes. First, the disk header does not contain the traditional cylinder, track, and sector information. The DSA header contains 32 bits. The upper 4 bits contain a header code, and the lower 28 bits contain a block number (see Figure 12-2). There are four copies of the 32-bit header for integrity. The code field of the header defines whether a block is a logical block (LBN), a replacement block (RBN), a diagnostic block (DBN), or an external block (XBN).
If the block is an LBN, the header code also indicates whether the block has been replaced.

From the addressing point of view, the disk header is used by the controller when accessing data stored on the disk. When searching for a block of data, the controller reads disk headers and compares them against the target block number. Both the header code and the block number must be valid in order to have a header match. The controller checks the header code to ensure that the correct area of the disk is accessed. For example, if the controller is trying to access a sector in LBN space and the disk is actually positioned in XBN space, the code portion of the header would mismatch and an error would be reported. The controller also checks the code field to see if the block has been revectored. If a block has been revectored, it is the controller's responsibility to determine where the data has been revectored to. The revector process is discussed further later.

Figure 12-2: Disk Header Format
[figure: bits 31-28 CODE, bits 27-0 BLOCK NUMBER]

12.2.3 How Header Codes are Used

Following is a list of the header codes and their purpose.

Code 00 - This code indicates the LBN data is usable and directs the controller to access the data following the header information.

Code 03 - This is the non-primary replacement code. This code indicates to the controller that the data following the header is invalid and directs the controller to retrieve the data from an RBN that is located on a different track than the track containing the LBN. The controller uses the RCT information to determine exactly which RBN was used. This operation is a non-primary revector.

Code 05 - This is the primary replacement code. This code indicates to the controller that the data following the header is invalid and directs the controller to read the data from the RBN at the end of the current track. For RA-based disks, each track contains one or more RBNs which can be used for primary replacement data.
This operation is a primary revector.

Code 06 - This is the RBN header code. This code indicates to the controller that it is accessing an RBN.

Code 11 - This is the unusable block code.

Code 12 - This is the XBN header code. This code indicates to the controller that it is accessing the FCT, which resides in the external block (XBN) area of the disk.

Code 14 - This is the DBN header code. This code indicates to the controller that it is accessing a diagnostic block (DBN).

12.2.4 Special Uses of the Header Code Field

BBR does not apply to the RCT and FCT since they are multi-copy structures. The controller takes the following action based on RCT header codes:

Code 00 - This code indicates the LBN data is usable and directs the controller to access the data following the header information.

Code 11 - This is the unusable block code. This code indicates to the controller that the data following the header is invalid and directs the controller to retrieve the data from the next copy of the RCT or FCT. If none of the copies of the RCT or FCT is readable, an uncorrectable error is reported.

12.2.5 EDC Protection

The DSA architecture provides an error detection code (EDC) mechanism to protect the controller data paths which are not protected by parity or ECC. The EDC is generated by the controller, at the bus interface, when a block of data is to be written to the disk. The EDC is then written to the disk along with the data and ECC. The EDC is checked, at the bus interface, when the controller transfers a block of data to the host. If an EDC error is detected by the controller, it first checks to see if the error is a forced error (FE). An FE is detected by inverting the EDC and finding a result of zero. If an EDC error still exists after checking for the FE, the controller retries the read operation and logs an error.
If the FE is detected, the controller reports the event to the host in the MSCP end packet and does not log the error to the error log. To set the FE indicator, the host issues an MSCP WRITE command with the FE modifier set. When the controller detects the FE modifier, it calculates the normal EDC, then inverts it to set the indication of the FE.

12.2.6 ECC Detection and Correction

The RA-series drives use a 170-bit ECC to correct up to 80 bits in error on a single disk sector. The controller generates the ECC and appends it to the data when writing a sector to the disk. The controller checks each sector of data read from the disk for the existence of an ECC error to determine if ECC correction is needed. If the controller determines that correction is needed, it enters the microcode algorithm to perform the correction.

The ECC correction algorithm determines how many bits are in error by keeping a count of the number of symbols and bits per symbol in error on a single disk sector. A symbol is defined as 10 contiguous bits of correction information that the controller applies to the data in error to correct it. The maximum number of symbols that can be corrected before an uncorrectable ECC error occurs is 8. Since each symbol contains information to correct a 10-bit burst, the maximum correction capability is 80 bits (8 symbols x 10 bits per symbol).

12.2.7 ECC Thresholding

The controller uses the ECC symbol count as a threshold to determine when a disk block is going bad and needs to be replaced. Each drive contains a threshold count parameter based on its media type, density, and head technology. The controller uses the parameter information to decide when to set the BBR flag and log ECC errors. The rules for thresholding are simple:

If the number of symbols used to correct the ECC error is below the threshold, then perform correction only.
If the number of symbols is greater than or equal to the threshold, then perform correction, set the BBR flag in the MSCP END packet, and log an ECC error.

See Figure 12-3 for drive threshold settings.

Figure 12-3: ECC Symbols and Drive Threshold for BBR and Error Logging
[figure, labels from top to bottom: FORCED ERROR / UNCORRECTABLE; 8 SYMBOLS; 7 SYMBOLS; 6 SYMBOLS; RA70/81/82/90 THRESHOLD; 5 SYMBOLS; 4 SYMBOLS; RASO THRESHOLD; RASO THRESHOLD; 3 SYMBOLS; 2 SYMBOLS; 1 SYMBOL]

12.3 BBR PROCESS OVERVIEW

Bad block replacement (BBR) is the mechanism that the Digital Storage Architecture (DSA) uses to replace disk blocks (sectors) which are or may become unusable. Normally, BBR is transparent to the user except for the error log events that are recorded by the BBR process. The only exception is when the block being replaced contains an uncorrectable ECC error. If an uncorrectable ECC error is detected, the bad block is written with a forced error (FE) and the user is notified. Once a block is written with FE, the data is lost to the user. The possibility of recovering the data from a block after an uncorrectable ECC error has occurred and FE has been applied is remote.

If RMS, or its equivalent, is the mechanism used to perform I/O operations when the uncorrectable ECC error is detected, the job is terminated with an error message. If the QIO system service, or its equivalent, is the mechanism used to perform I/O operations when an uncorrectable ECC error is detected, the user is notified with an MSCP END PACKET. In this case, the user has the option to continue or terminate the job.

It is important to note that the physical disk structure is permanently modified by BBR. BBR modifies two entities: the bad block's header, to reflect the replacement type used (primary or non-primary), and the replacement control table (RCT) descriptor, to reflect that the block used for replacement (RBN) is in use.
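The thresholding rule from Section 12.2.7, together with the 8-symbol correction limit from Section 12.2.6, decides when the BBR flag is raised. A minimal sketch in Python follows. The function and class names are hypothetical, the threshold is passed in rather than taken from Figure 12-3, and treating an uncorrectable sector as a BBR candidate with no correction applied is an assumption drawn from the overview above, not controller microcode.

```python
from dataclasses import dataclass

# From Section 12.2.6: at most 8 symbols (8 x 10 bits = 80 bits) can be corrected.
MAX_CORRECTABLE_SYMBOLS = 8


@dataclass(frozen=True)
class EccDecision:
    """Hypothetical summary of what the controller does for one sector read."""
    correct: bool         # apply ECC correction to the data
    set_bbr_flag: bool    # request bad block replacement
    log_ecc_error: bool   # write an ECC error log packet
    uncorrectable: bool   # correction capability exceeded


def classify_ecc_event(symbols_in_error: int, drive_threshold: int) -> EccDecision:
    """Apply the thresholding rule to a symbol count (illustrative only)."""
    if symbols_in_error < 0 or drive_threshold < 1:
        raise ValueError("symbol count must be >= 0 and threshold >= 1")
    if symbols_in_error == 0:
        # Clean read: nothing to correct, flag, or log.
        return EccDecision(False, False, False, False)
    if symbols_in_error > MAX_CORRECTABLE_SYMBOLS:
        # Assumption: an uncorrectable sector is also a replacement candidate.
        return EccDecision(False, True, True, True)
    at_or_over = symbols_in_error >= drive_threshold
    # Below threshold: correct only. At or above: correct, flag BBR, and log.
    return EccDecision(True, at_or_over, at_or_over, False)


if __name__ == "__main__":
    example_threshold = 6  # hypothetical value, not read from Figure 12-3
    for count in (0, 3, 6, 9):
        print(count, classify_ecc_event(count, example_threshold))
```

Passing the threshold as a parameter keeps the sketch independent of any particular drive type.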
BBR is implemented in host software for UDA50, KDA50, and KDB50 controllers because there is not enough code space available in the controller ROM. BBR is implemented in HSC50 and HSC70 controllers after V250.

Three actions take place when invoking, performing, and reporting the replacement of a bad block:

1. Notification that a block needs to be replaced - The controller sets the BBR flag to inform the replacement process that BBR is needed.
2. Attempt to replace the bad block - This is the actual execution of the BBR algorithm.
3. Report the results of the bad block replacement attempt to the error log.

12.3.1 Notification that a Block Needs to be Replaced

A request to perform BBR can only occur when the controller is attempting to transfer data to or from the disk. To understand when data transfers occur, consider the following:

- Data transfer requests are initiated by the host issuing a data transfer MSCP command, such as READ or WRITE, to the controller.
- The controller processes the command and performs the action requested.
- The controller checks for errors and sets the BBR flag, if needed.

Reasons for setting the BBR flag include:

- The controller is not able to successfully read a header when trying to locate a sector of data to be transferred to or from the host. Before setting the BBR flag, the controller determines if the sector had been previously replaced.
- The controller cannot find the data sync on a sector when attempting to transfer the sector of data to the host.
- While the controller is attempting to read a sector of data from the disk, it detects an ECC error greater than or equal to the drive's error reporting threshold.

12.3.1.1 Host BBR

If the controller detects errors at the end of the data transfer, it places the error status in the MSCP END PACKET and possibly generates an error log packet. If BBR is needed, the controller also sets the BBR flag in the MSCP END PACKET. The controller then passes the end packet to the host.
The host, upon receiving the END PACKET, determines if errors occurred. If errors occurred, the host takes the appropriate action to handle the error. If the action needed is BBR, the host transfers control to the host BBR software.

12.3.1.2 Controller BBR

If the controller detects errors at the end of the data transfer, it places the error status in the MSCP END PACKET and possibly generates an error log packet. If BBR is needed, the controller sets the internal BBR flag. The internal BBR flag causes the controller firmware to enter its BBR routine.

12.3.2 Executing Bad Block Replacement

The host and controller perform the same functions to replace a block, so the following paragraphs apply to both.

The BBR code retrieves and saves a copy of the bad block's data. This is necessary because the suspected bad block must be read and written to confirm that it is bad, which destroys the original data. If an uncorrectable ECC error is detected while retrieving the bad block, the BBR algorithm sets an internal flag to remember that forced error must be set on the block.

The BBR code determines if the LBN in question is really bad before attempting to replace it. This testing guards against replacing too many blocks because of transient errors. The current BBR implementation uses the customer data instead of test patterns to test the bad block. Test patterns alone would not find the pattern-sensitive spots in the disk media. If no errors are detected when the block is written and read with selected patterns and the customer data, the block is not replaced.

If the suspected bad LBN fails the tests, then the BBR code finds a substitute block called a replacement block (RBN) and moves the data from the bad LBN to the RBN.
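The decision flow just described (save the data, remember whether forced error is needed, test the block, replace it only if the test fails) can be sketched as follows. This is an illustration under stated assumptions, not the host or controller implementation: the three callables and the BbrOutcome type are hypothetical stand-ins for the real disk operations.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class BbrOutcome:
    """Hypothetical result record for one replacement attempt."""
    replaced: bool
    rbn: Optional[int]     # replacement block chosen, if any
    forced_error: bool     # saved data came from an uncorrectable read


def run_bbr(
    read_bad_block: Callable[[], Tuple[bytes, bool]],
    block_passes_test: Callable[[bytes], bool],
    find_replacement: Callable[[], int],
) -> BbrOutcome:
    """Sketch of the test-before-replace decision (illustrative only)."""
    # Step 1: save the data, because testing overwrites the block.
    saved_data, uncorrectable = read_bad_block()
    # Step 2: remember that forced error must be set if the read was uncorrectable.
    needs_forced_error = uncorrectable
    # Step 3: test the block with the customer data; a good block is left in place.
    if block_passes_test(saved_data):
        return BbrOutcome(False, None, needs_forced_error)
    # Step 4: the block failed, so pick a replacement block (RBN).
    return BbrOutcome(True, find_replacement(), needs_forced_error)


if __name__ == "__main__":
    # Simulated block that reads with an uncorrectable error and fails the test.
    outcome = run_bbr(
        read_bad_block=lambda: (b"\x00" * 512, True),
        block_passes_test=lambda data: False,
        find_replacement=lambda: 42,  # made-up RBN number
    )
    print(outcome)
```

Injecting the disk operations as callables keeps the sketch runnable without hardware and makes the ordering of the steps explicit.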
The actions used to move the bad block's data to the replacement block include:

- The BBR code marks the RCT descriptor which corresponds to the RBN used for the replacement "in use" so that it cannot be used again by subsequent replacements.
- The BBR code uses the MSCP REPLACE command to permanently mark the bad LBN header with a primary or non-primary replace code (Figure 12-4 and Figure 12-5, step A).
- The BBR code writes the data back to the bad block's LBN to force the controller to revector the data to the replacement block. This action is forced because the replacement code has been applied to the bad block's header (Figure 12-4 and Figure 12-5, step B).

Figure 12-4: Primary Replacement/Revector
[figure: STEP A, track A with bad block and replacement block; BBR marks the header of the bad block with a replace code. STEP B, track A with revector from the bad block to the replacement block. L = LBN, R = RBN]
NOTE: BBR moves the bad block's data to the replacement block by writing the data back to the bad block's LBN. This causes the controller to revector the data to the RBN. (CXO-2375A)

Figure 12-5: Non-Primary Replacement/Revector
[figure: STEP A, track A with bad block and replacement block; BBR marks the header of the bad block with a replace code. STEP B, track A and track B. L = LBN, R = RBN]
NOTE: BBR moves the bad block's data to a replacement block on a different track by writing the data back to the bad block's LBN. This causes the controller to use pointers in the RCT to revector the data to the replacement RBN. (CXO-2376A)

12.3.3 Restarting BBR

The BBR code must keep track of its progress while performing BBR, because BBR may need to be restarted if errors or a power failure occur. Blocks 0 and 1 of the RCT are used for this purpose. RCT block 0 contains the state of the replacement while the BBR code is executing. RCT block 1 contains the bad block's data.
This data is good data if the ECC correction capability was not exceeded, or "best guess" data if the ECC correction capability was exceeded. Best guess data is passed to the host even though it contains errors.

The rest of the RCT contains the RBN descriptors. The descriptors are used to indicate whether an RBN is available for use, in use, or bad. There is one descriptor per RBN.

12.3.3.1 Host BBR

Host BBR generates many error log packets while handling BBR. This is because host BBR generates many MSCP commands during execution. Any of the commands can generate an error log event. As many as six error log events can be logged against one bad block replacement attempt.

12.3.3.2 Controller BBR

Controller BBR generates only one error log packet for the entire BBR operation, since it generates no MSCP commands. Therefore, it has better control of logging the event.

12.4 TROUBLESHOOTING BBR

BBR is a common, expected event in the DSA architecture. There will always be blocks which need to be replaced. However, when BBR becomes excessive, it is the symptom of a hardware problem. The following is a list of the most common occurrences and how to handle them.

A disk experiencing BBRs in the range of one to two per month is reasonable. There is nothing wrong. BBRs in this range occur because the customer data pattern may affect a sector such that an ECC error above the threshold is generated.

A disk is experiencing BBRs to the same LBN, but the LBN is not being replaced. This problem existed before VMS Version 4.4. The algorithm was using test patterns to test the block in question rather than using the customer's data. The test pattern didn't stress the bad block to the point where the algorithm thought the block was bad, so the block wasn't replaced. If this is really annoying the customer, eliminate the use of the block by creating a new copy of the file.
You can also run the SCRUBBER utility (ZUDL or EVRLK) on the disk to attempt to replace the block. If the disk is attached to an HSC50/70, you can use DKUTIL to replace the block.

BBR is causing excessive error log events to be placed in the error log, and it is hard to understand which events are meaningful. This problem applies to host BBR only, so systems with a KDA50, UDA50, or KDB50 are affected. This problem came about when the enhanced version of BBR was released to the field. The enhanced algorithm caused more error log events to be generated during testing of the bad block to determine if it was really bad. The change to the algorithm specifies that testing is to be done with correction and recovery disabled. With correction and recovery disabled, the controller generates an uncorrectable ECC error if the bad block has any ECC symbol errors.

There are only two meaningful error log events when BBR is attempted: the error log packet which indicates that an ECC error above the threshold occurred, and the BBR event error log packet. The error log packets logged after the ECC-above-threshold packet and before the BBR error log packet have no meaning except to the BBR algorithm. To match the ECC-over-threshold error log packet with the BBR log packet, do the following:

- Match the command reference number field of the error log packets.
- Match the unit number field of the error log packets.
- Match the LBN field of the error log packets.

When you match the fields of the error log packets and find the BBR packet, you have the necessary information about the BBR to understand what happened.

12.5 REVECTORING

The revector process is a controller function performed on disk blocks which have been replaced. The revector process is totally transparent unless an error occurs during revector. A disk block is considered revectored when its header has been written with a replace code.
Once the block is replaced, the data portion of the sector is no longer accessible and the controller executes the revector procedure to determine where the data was revectored to.

12.6 QUESTIONS + ANSWERS

QUESTION: WHAT IS A FORCED ERROR AND HOW DOES IT OCCUR?

ANSWER: A forced error (FE) is a way of identifying a block of data that contained an uncorrectable ECC error. It is also a fast detection mechanism, since ECC correction takes about 17 ms of microcode execution time in the controller, and detection of an FE only takes the time to read one sector. An FE is manufactured by the BBR code when the sector being replaced contains an uncorrectable ECC error. The FE is applied to the data by an MSCP WRITE command with the FE modifier set. At this time, only the BBR code writes FEs.

QUESTION: CAN A FORCED ERROR BE UNDONE?

ANSWER: An FE can be cleared by issuing an MSCP WRITE command with the FE modifier CLEARED. Clearing the FE will not make the customer's data appear good again, since the FE was placed on the sector to indicate the data is invalid (uncorrectable ECC). Running BACKUP will remove the FE, but it will also produce an error message indicating that a corrupt block was detected. One possible way to fix a block containing an FE is to:

1. Read the LBN containing the FE. Today, DKUTIL is the only tool which allows you to read a block containing an FE. However, a program that reads blocks which contain FE using the QIO system service or its equivalent can be written and used.
2. Examine the contents to see if something obvious is wrong and, if so, correct and rewrite the LBN with FE CLEARED.
3. If the contents of the LBN cannot be fixed, get a backup copy of the file in question and restore the backup copy to the disk.
4. If a backup copy of the file cannot be obtained and the LBN cannot be fixed, the customer must re-run the job used to create the file.

QUESTION: WHAT IS THE RCT?
ANSWER: The RCT is a multi-copy structure. There are four or more identical copies of the RCT for data integrity. Each copy of the RCT contains three areas of interest, described below. The RCT is always physically located above the user LBN area of the disk. The RCT can be read by using the QIO - READRCT function.

RCT Block 0

The first area is RCT block 0, which contains state information while a replacement is in progress. If the BBR algorithm is aborted during replacement, the block 0 state is used to determine where to restart the algorithm. When a drive is brought on line, the host checks to see if a replacement was in progress. If it was, the information in block 0 is used to restart the algorithm. In the case of the RA81, RCT block 0 is at LBN 891,072. (See Figure 12-6.)

RCT Block 1

The second area of the RCT is used to save the data from the suspected bad block while the BBR algorithm is in progress. This data will be moved to the RBN after the bad block is marked replaced. In the case of the RA81, RCT block 1 is at LBN 891,073. (See Figure 12-6.)

The RCT Descriptor Blocks

The rest of the RCT is used to keep track of the disposition of the RBNs in the host LBN area of the disk. There is one descriptor per RBN. For example, the RA81 contains 17,472 descriptors in the RCT since there are 17,472 RBNs in host LBN space. There are no descriptors for the RCT itself since BBR does not protect the RCT area of the disk. The descriptors start at block 2 of the RCT and occupy as many blocks as necessary to account for the 17,472 descriptors. Each block contains 128 descriptors, so the descriptors occupy 137 blocks on an RA81 (17,472 divided by 128 is 136.5, rounded up). In the case of the RA81, the descriptor block area of the disk starts at LBN 891,074. (See Figure 12-6.)
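The RA81 numbers above lend themselves to a short worked calculation. The sketch below, in Python, locates the RCT block that holds the descriptor for a given RBN and counts the descriptor blocks. It assumes descriptors are stored in ascending RBN order starting at RCT block 2 of the first RCT copy; the function names are hypothetical, and only the constants come from the text.

```python
import math

# RA81 values quoted in the text above.
RA81_RCT_START_LBN = 891_072     # LBN of RCT block 0
RA81_RBN_COUNT = 17_472          # one descriptor per RBN
DESCRIPTORS_PER_BLOCK = 128
FIRST_DESCRIPTOR_BLOCK = 2       # blocks 0 and 1 hold state and saved data


def descriptor_block_count(rbn_count: int = RA81_RBN_COUNT) -> int:
    """Number of RCT blocks needed to hold all descriptors (rounded up)."""
    return math.ceil(rbn_count / DESCRIPTORS_PER_BLOCK)


def descriptor_location(rbn: int, rct_start_lbn: int = RA81_RCT_START_LBN):
    """Return (LBN of the descriptor block, index within that block).

    Assumes descriptors are laid out in ascending RBN order.
    """
    if not 0 <= rbn < RA81_RBN_COUNT:
        raise ValueError("RBN out of range for an RA81")
    rct_block = FIRST_DESCRIPTOR_BLOCK + rbn // DESCRIPTORS_PER_BLOCK
    return rct_start_lbn + rct_block, rbn % DESCRIPTORS_PER_BLOCK


if __name__ == "__main__":
    print(descriptor_block_count())       # 137
    print(descriptor_location(0))         # (891074, 0)
    print(descriptor_location(17_471))    # (891210, 63)
```

Rounding up gives 137 descriptor blocks, the last one half full, which agrees with the final descriptor LBN of 891,210 shown in Figure 12-6.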
Figure 12-6: RCT Layout for an RA81 (copy 3, copy 2, copy 1, copy 0)

  LBN 891,072   RCT BLOCK 0     (replacement status)
  LBN 891,073   RCT BLOCK 1     (customer data from bad block; temporary storage during BBR)
  LBN 891,074   RCT BLOCK 3     (descriptors for RBN 0-127)
  LBN 891,075   RCT BLOCK 4     (descriptors for RBN 128-255)
  ...
  LBN 891,210   RCT BLOCK 137   (descriptors for RBN 17,084-17,136)

QUESTION: WHAT IS THE FCT AND WHAT IS IT USED FOR?

ANSWER: The factory control table (FCT) contains a list of physical blocks which were found bad when the HDA was built and scanned for bad spots on the media. After formatting the HDA, the formatter uses the list of physical blocks contained in the FCT as the basis for blocks to replace. The formatter constructs the content of the RCT from the FCT contents. If the FCT contents are destroyed, the integrity of the HDA cannot be maintained after a format operation is performed. The FCT occupies four cylinders following the RCT and is a multi-copy structure for data integrity purposes. Only the controller can access the FCT. In the case of the HSC50 or HSC70, DKUTIL can be used to dump the contents of the FCT.

QUESTION: WHAT OPERATING SYSTEMS SUPPORT BBR AND ERROR LOGGING?

ANSWER: There are actually two versions of BBR. The original version did a poor job of replacing blocks, so enhancements were made and a new version was implemented. The table below lists which operating systems and versions contain the new version of BBR. Operating systems which supported BBR before the version listed in the table could have problems.

The BBR enhancements include:

• Better testing of the potential bad block. This change came about because the old algorithm wasn't replacing blocks often enough. The fix was to use the customer's data for the test pattern. Since the customer's data almost always catches the ECC error, it is the best test pattern.
• Better stress testing of the bad block. The old algorithm wrote and read the block once, so the block almost always tested good. The new algorithm reads four times, then writes and rereads four more times before declaring the block good. In addition, testing is done with correction and recovery disabled.
• If a data path problem provoked BBR, there was a potential of recursively replacing blocks until the RCT filled up. The new algorithm allows two attempts to find a good replacement block before stopping the replacement attempt.

QUESTION: CAN AN RBN BE REPLACED?

ANSWER: The BBR algorithm provides for RBN replacement. Replacement is accomplished by marking the corresponding descriptor in the RCT unusable, then finding another replacement block and revectoring the bad block's data to that block.

QUESTION: HOW CAN BBR BE DONE ON SYSTEMS WITHOUT THE BBR CAPABILITY?

ANSWER: The following tools can perform the BBR function even when the operating system doesn't support BBR:

• SCRUBBER (ZUDL or EVRLK) - This standalone tool searches the disk for LBNs to replace and replaces them.
• RABADS for ULTRIX - This is the same as SCRUBBER.
• System copy function - This function makes a new version of the file that contains the bad block. The new version occupies a different set of LBNs and, therefore, removes the bad block from use. This also removes LBNs from the system and causes the disk to shrink by the size of the file.

QUESTION: WHAT IS THE MEANING OF THE UDA50, KDA50-Q, KDB50 (xDA) CONTROLLER HANG?

ANSWER: When the (xDA) controller hangs, it is usually interpreted as a failure in the controller. Actually, when the CPU attached to the controller hangs, it forces the controller into a command timeout state. The command timeout state is entered when the (xDA) controller decrements its command timer to zero before receiving an MSCP command from the host CPU.
Normally, if a command is received, the controller resets the command timer. If a command is not received within the timeout interval (approximately 2 minutes), the controller displays a timeout error code. Therefore, the controller timeout is usually a symptom of a failure elsewhere in the system.

QUESTION: WHAT IS THE NAME AND FUNCTION OF THE CURRENT DIAGNOSTIC SET?

ANSWER:

  > Test 1 = UNIBUS Interrupt/Address Test
  > Test 2 = Executes drive resident diagnostics
  > Test 3 = Disk Function Test (Read/Write, etc.)
    - Released

  ZUDIAO -- Test 4
  > Test 4 = A disk exerciser
    - Released

  ZUDJAO -- Test 5
  > Test 5 = UDA50/KDA50 Subsystem Exerciser -- This is an MSCP product that works like an operating system and reports MSCP error log packets, just like an operating system. A very good subsystem exerciser.
    - Released
  ZUDJBO
    - Target release Q2 87

  ZUDKBO -- Formatter (DM Code Version 14)
  > APPROXIMATE RUN TIME IS 2 (TWO) HOURS.
    - Released
  ZUDKCO (DM Code Version 15)
    - Target release Q2 87

  ZUDLAO -- Bad Block Replacement Utility
  > "Scrubber" - MSCP product (Utility) for media maintenance.
Released ZUDMxx -- Disk Error Log Utility > Will display the 16 Error SILO in RA81, RA80 - Target release Q2 87 - 12-16 ZUDC OBSOLETE Tests 1-4 ZUDE OBSOLETE FORMATTER Digital Internal Use Only DSA DSDF/BBR ********************************************** * * * * VAX BASED DIAGNOSTIC SET for UDA50/KDB50 * * ********************************************** EVRLB -- V 5.1, Formatter (See description above for PDP-11 ZUDK) (DM Code Version 14) - Released with VAX Diagnostic Release 24 -- V 6.0, (DM Code Version 15) - VAX Diagnostic Release 26 - Jan 87 EVRLB prior to VS.l EVRLF V 7.0, Tests 1-3 (See description above for PDP-II ZUDH) - Released with VAX Diagnostic Release 24 V 8.0 - VAX Diagnostic Release 25 - OCT 86 EVRLG V 7.0, Test 4 (See description above for PDP-II ZUDI) - Released with VAX Diagnostic Release 24 V 8.0 - VAX Diagnostic Release 25 - OCT 86 EVRLJ Test 5 (See description above for PDP-11 ZUDJ) (NOT RELEASED - target release - JAN 87 - for Release 26) EVRLK V 2.0, Bad Block Replacement Utility (See PDP-II ZUDL) - Released with VAX Diagnostic Release 24 V 2.1 - VAX Diagnostic Release 25 - OCT 86 EVRLL Disk Error Log Utility (Will display the 16 Error SILO in RA81, RA80) (NOT RELEASED - target release - OCT 86 - for Release 25) EVRLA -- OBSOLETE Tests 1-4 diagnostic NOTE The VAX diagnostic supervisors (VDS) that support the host level diagnostics require action by the user. ***************************************************** * * * VAX BASED DIAGNOSTIC SET for KDA50 (MicroVAX II) * * * ***************************************************** MicroVAX II uses the KDA50 controller and its diagnostics are written for the "Micro Diagnostic Monitor" (MOM). 
NAKDAB -- Diagnostics for the uVAX-2 and R** drives -- All in one Tests 1-3 Test 5 (Subsystem Exerciser) - Released with MicroVAX Diagnostic Release 112 NAKDAC -- Target Release 02 87 NAKDAD -- Formatter upgrade to DM Code Version 15 - Target Release 03 87 Digital Internal Use Only 12-17 DSA DSDFIBBR (0 1....----..,...--- o o @ '----,....-_...... SET P2 IN RCT WORD4 '-------r---- ~------~-------- UPDATE RCT DESCRIPTOR @ o ~-~----' L..--_--..,..._ __ ~~___D_IS_A_B~L_E_E_C_C_____ o ~___ T_E_S...,Tr-L_B_N_ __ REPORT REPLACEMENT fg'\ ENABLE ECC V'---~r----- e '--___. .,. .______ - CLEAR WORD4 .......~ OFRCT REPORT RECURSION ERROR CXO-2377A PrIM~r~ 12-18 Digital Internal Use Only Stco-"Ic~~rl recrhCCt~ DSA DSDF/BBR Figure 12-8: Typical Mount Flow SYSTEM MOUNT (ON-LINE) DRIVE NO REPORT MEDIA SPIN DOWN DISK ">---..-.. FORMAT ERROR 1 - -....... TO USER READ RCT WORD4 YES >-_ _ _ _..... ATTEMP BBR COMPLETION STARTING AT PHASE 1 ENTRY ATTEMP BBR , -_ _ _ _....... COMPLETION AT PHASE 2 ENTRY CXO-2378A 71ct //)le{j~ /{:/ ~fcJ Tcrr-e .rYlcIe ·fL~ cICtl:ti 14; ~V /,Ph1 ~d "j;?vLp Digital Internal Use Only 12-19 DSA DSDFIBBR 12-20 Digital Internal Use Only ---+---+---+---+---+---+ i \ \ \ g\ 1 \ i \ tla \ I \ \ \1\ \ I I I N T E R 0 F FIe E M E M 0 RAN DUM I ---+---+---+---+---+---+ Lke Mayfield Milano Leth Brown :lve Varner L Snyder )n FROM: DATE: DEPT: LOC.: TEL.: ENET: Glenn Scadden 21-Feb-1986 CX/CSSE CX01-1/P14 303-594-2345/522-2345 NERMAL::SCADDEN Results of Feasibility Test -- 18 bit RA81 functionality on VAX ~OUND ~fter some reports of customers using 18 bit formatted RA81's and RA60's K processors, I volunteered to investigate this reported ability. :1 offices were also asking for a determination of the "supportability" bit converted media on a VAX. We have since "day 1" informed the field L8 bit media could not be used on 16 bit processors (via Tech Tip, Right etc) . ve offered KL customers several KL to VAX trade-in programs and tives. 
As a corporation we have been attempting to "migrate" these KL ners to VAX machines. Yet with all the financial incentives, when a KL ner that has RA disks looks at the financial impact of having to purchase ew HDA's/packs, it significantly detracts from our trade-in programs. ny cases these KL customers have made a significant expenditure for RA and done so only recently. Being able to a'dvertise 18 'bit RA drive rsion to a VAX, will certainly improve customer satisfaction on the part Dse customers who are converting. It may also enable a KL customer, who ~ drives, to purchase new VAX machines earlier than previously expected. customer may also feel incentive to purchase new 18 bit RA drives ng his investment will be completely compatible with a future VAX ase. his test only related to the test of an RA81 18 to 16 bit conversion se on a VAX running VMS. For purposes of this report, when VAX is stated, an assume a VAX running VMS, unless otherwise stated. My purpose was to conduct a test that would match that of a Field eer intending to make a conversion of an 18 bit RA81 for use on a VAX. a quick look at the different diagnostics and utilities that would be d to support that 18 bit media on a VAX. Both during the conversion and wards on the VAX system. OAL did not intend to conduct an exhaustive test of all the VMS operating stem functions, diagnostic functions and the large number of other ilities and programs that could be used on the 18 bit media, when on a x. .s initial feasibility test also only directed its efforts at the lversion of an 18 bit RAB1. RA60 was not addressed in this feasibility it but should be if further investigation is necessary. lid not intend to test a "converted" RA81 on any other VAX, or PDP-11 ?rating system. VMS was the only operating system used. )TIONS :estricted my investigation to a customer that would have a KL10/HSC i intending to move an RAB1 to a VAX processor running VMS. 
like formatting an 18 bit HDA to 16 bit, a 16 bit (Burst written) CANNOT be converted for use as an 18 bit HDA/Pack ...... . ~/Pack ~ssumed their would be NO difference in the results of this test, if ,er VAX processors were used. As long as all other factors and resources nained the same. e terms "lB bit" and "576 byte" are used interchangeably. e terms "16 bit" and "512 byte" are used interchangeably. RCES I used the following resources during this investigation: HSC50 RAB1 with 18 bit Burst written and formatted HDA. 11780 with UDA50 and HSC50. USION decided to provide my conclusion at this point in the report. The next on will outline the test details. If not interested in that detail, this usion should provide the "bottom line". THE CONVERSION OF AN 1B BIT BURST WRITTEN AND FORMATTED RA81 HDA TO A T FORMATTED BDA (STILL 18 BIT BURST WRITTEN) WAS ACCOMPLISHED WITHIN THE : OF THIS INVESTIGATION. USE OF THIS 16 BIT FORMATTED AND 1B BIT BURST .'EN HDA ON A VMS OPERATING SYSTEM WORKED FOR THE LIMITED NUMBER OF :TIES AND FUNCTIONS TESTED. I recommend that CX/CSSE develop the best approach to "retracting" the long standing policy preventing 18 bit media from use on 16 bit machines. a. This "retraction" process should be coupled with a project for accomplishing a more extensive test. This was only a feasibility test and did NOT address all permutations of the BSC and VAX diagnostics, when used on a "converted" 18 bit RAB1. Nor did it even attempt to try all VMS based utilities and functionality. b. A test of this conversion for an RA60-PE pack (18 bit pack) should be accomplished, before "retracting" our policy for RA60. c. Complete confidence in the "supportability" aspects may require an SVT type process, with availability of all necessary resources. These resources would be systems, media, drives and manpower (to include VMS expertise) etc. d. 
A decision is also required on the "scope" of any further testing into other VAX based (Like ULTRIX, SYSTEM V, VAX/ELN etc) and PDP-11 based operating systems. If other operating systems support is required, these resources and expertise would be needed. l. A "converted" 18 bit HDA will have 803,712 user LBN's available, when used under VMS. Normally, a 16 bit RA81 HOA has 891,072 user LBN's. This equates to only 9.8% less LBN space for a "converted" 18 bit HDA, as compared to a 16 bit HDA. Our current MLP for an RA81 16 bit HDA is in excess of $5,000. Also, as bad as this may sound, a customer who makes the conversion would most likely obtain the full 891,072 LBN's upon the failure of the 18 bit HOA and replacement of it with a 16 bit HOA. It would be foolish to replace an 18 bit RA81 HDA on a VAX with another 18 bit HDA upon the failure of the 18 bit HOA, or during an FCO replacement. One more point. The last logistics price list I saw, showed the 18 bit HDA as being more expensive than the 16 bit HOA. So a thinking field service branch manager would naturally make the customer happy and save some money also. :ome important information was learned along the way, which will be bed in the next section. ~ETAILS kept good notes of each step of this test. I will thus, only summarize :ignificant points. 'ations: RA81 contains a "jumper" on the front of the Preamp module of the HDA. s "jumper" identifies to the drive logic whether the HOA is burst tten in 16 or 18 bit. 18 Bit HOA's have this Jumper removed. The lversion process maintains this jumper in the out state -- As the burst te on the HDA never changes in the conversion process. The jumper also Inges the "characteristics" response from the drive on an SOl "Get Iracteristics" response . ! .nging the jumper from the out to the in state results in drive faults l other unpredictable error conditions. AT ALL TIMES THE JUMPER SHOULD IN THE OUT STATE. 
HSCSO (All KL customers that have RA's have an HSC50) is "configured" 576 or 512 byte operation. Obviously, it is configured in 576 for , on the KL. However, when used on a VAX (as if a KL customer was to move HSC and RA's to a VAX), the mode does not seem to matter. It works the VAX or the KL when "configured" in 576 byte mode. The mode is set the "SETSHO" utility command "Set Sector size 576" or "Set Sector size ," command. ! Media "mode" field in the FCT is changed automatically by the mode the rmatter is run in. i.e. When using either the HSC or VAX formatter to rmat an 18 bit HDA to 16 bit format, the FCT mode is automatically itched and appropriately displayed in the response to an MSCP "online" nmand from a host. !ase note that an 18 bit (576 byte) HDA once formatted to 16 bit, can be rmatted back to 18 bit without any trouble. FCT contains two "subtables". One allocated for 576 byte and one for 2 byte formatted physical block numbers that are bad. For a 576 byte rst written HOA, both tables contain the same physical block numbers. r an 512 byte burst written HDA, only the 512 byte subtable contains {sical block numbers. For the case of the 576 byte burst written HDA, is was the correct thing to do. The SDI/DSDF spec is not at all clear this situation and could be easily misunderstood. Also, since the 512 byte rst written HDA CANNOT be field "converted" for 18 bit use, not having { contents in the 576 byte subtable is of no consequence. e can "convert" a 576 byte burst written HDA with 18 bit format to bit format, without having to reformat. This "conversion" technique ~uires the use of the "write enabled HSC OKUTIL" program to modify the Mode eld of the FCT and to accomplish other conversion functions. However, rmatter "con~ersion" is by far the simplest and easiest method. 
nversion requires the ability to write all Host LBN's (Including the RCT Dcks), otherwise uncorrectable ECC errors result from the "logic" the controller "picking-off" the ECC field from within the data field an 18 bit block. nce the formatter diagnostic was never intended to be a "scanner" and lied upon to find bad blocks, the conversion may result in some marginal d blocks remaining on the media. A recommendation should be made, in any nversion instructions, to have the field engineer "keep an eye out" for d block reports in the VAX error log. The instructions should describe w to use the HSC DKUTIL to "revector" (replace) bad blocks. This assumes at since the 13 bit drive came from a KL, that an HSCSO will be available. is recommendation also, unfortunately, assumes that a possibility always ists that the drive may end up an a VAX running a UNIX based operating stem without dynamic BBR, regardless of any position statement relative the conversion. test for any impact on VMS Dynamic BBR was also conducted. A large number block replacements were made during this test, without any problems pacting the BBR process noted. ties and Diagnostics used ormatter erify LEXER KUTIL LDISK VDS) UDASO based Diagnostics EVRLB Formatter EVRLF Tests 1-3 EVRLG Test 4 EVRLK "Scrubber" IN VMS FUNCTIONS Concerns yet needing investigation VMS Shadowing Backup modes - VMS /Physical - HSC Backup Tutorial on Formatting RA Drives CHAPTER 13 TUTORIAL ON FORMATTING RA DRIVES Digital Internal Use Only 13-1 Tutorial on Formatting RA Drives 13.1 INTRODUCTION This section reinforces and expands the information previously presented in this course relative to formatting RA drives. It covers when and when not to use the formatter. The formatter is: . A beneficial tool when used properly for the right reasons. A destructive tool when used improperly or for the wrong reasons. 
IT IS IMPORTANT FOR YOU TO UNDERSTAND THE FOLLOWING OBJECTIVES FROM THE DSA TROUBLESHOOTING COURSE:

• The RCT purpose and functionality.
• The FCT purpose and functionality.
• The purpose and functionality of drive thresholds.
• The purpose and functionality of the SCRUBBER diagnostic.
• The purpose and functionality of the HSC VERIFY and DKUTIL utilities.
• The purpose and functionality of the CX/CSSE RAUTIL, VMS error log tool, and other programs used during the DSA Troubleshooting Course.
• The purpose and functionality of dynamic bad block replacement (BBR).
• The purpose and functionality of revectoring.
• The basic DSA block (sector) header concepts.

13.2 BASIC FORMATTER FUNCTIONALITY REVIEW

Several key concepts of formatter functionality are:

1. The formatter was designed and intended to format the media. The key functions of formatting are:

   • Write the headers (along with the associated data field, etc.) of the host LBN area, RCT area, and DBN area.

     NOTE
     DO NOT run the formatter in reconstruct mode. The XBN (FCT) area will be reformatted if you are running in reconstruct mode.

   • Replace the blocks identified as bad via the contents of the FCT (by default).
   • Replace any additional bad blocks (besides those identified in the FCT) that it finds.
   • Establish the contents and integrity of the RCT. After formatting, the RCT should contain replacement and revectoring information on all the blocks identified as bad. These are the bad blocks identified in the FCT plus any bad blocks that the formatter found during the formatting process.

2. The formatter was not intended to be a SCRUBBER, scanner, or diagnostic.

3. Only use the formatter on a drive in good working order. Since the formatter is only intended to format media, the drive itself must be in good working order. If not, the formatter may:

   • Degrade the integrity of the logical structures on the media (such as the RCT) or make the HDA unusable.
   • Replace many good blocks.
     If a data path problem exists in the drive and the problem causes ECC errors (among others), then many good blocks on the media could be replaced by the formatter.

4. The formatter replaces bad blocks that are below the drive threshold. The HSC formatter (not using special options) will replace all bad blocks that are two symbols in error and above. The other formatters (EVRLB, ZUDK) will replace all bad blocks that are one symbol in error and above.

5. The SCRUBBER, since it is an MSCP product, only sees blocks that are bad above the drive threshold (the threshold is 6 symbols in error for the RA70/81/82/90, 4 symbols for the RA60, and 2 symbols for the RA80). Thus, the SCRUBBER will not replace bad blocks that are below the drive threshold.

13.3 SCRUBBER, FORMATTER, HSC50/70 - WHAT REPLACES BLOCKS?

1. The HSC does controller-initiated BBR.

2. The RQDX controller (for RD disks) does controller-initiated BBR (even though the BBR algorithm used in the RQDX may not be the same as the one used by the HSC).

3. Operating systems do dynamic BBR if the operating system has the proper coding.

4. If an operating system has BBR capability and an HSC, the HSC does the BBR.

5. The SCRUBBER does BBR. The SCRUBBER was primarily intended for use when the operating system does not do dynamic BBR and there is no HSC controller.

   • UNIX (Berkeley)
   • RT-11, prior to Version 5.3
   • ELN, prior to Version 2.1
   • UNIX System V, prior to Version 2 Release 2
   • ULTRIX-11
   • DSM-11, prior to Version 3.1
   • IAS, prior to Version 3.2
   • MicroPower/PASCAL
   • ULTRIX-32, prior to Version 2

   However, the SCRUBBER is used quite successfully during the installation of a new drive or HDA.
   A quick pass of the SCRUBBER can clean up bad blocks before the equipment is given to the customer. Also use the SCRUBBER to clean up bad blocks that the operating system is having a hard time replacing. Make note of any bad blocks you wish to replace (from information in the error log, etc.), and invoke manual mode of the SCRUBBER.

6. The formatter does BBR. The HSC formatter will replace blocks that are identified as bad in the factory control table (FCT). It will also replace bad blocks that are two symbols in error and above (or one symbol in error, using special options). Since the formatter works at one symbol in error (or two in the HSC formatter without special options), it is unlikely to pass a bad block. However, it has happened.

7. HSC utilities do BBR. SCRUBBER-type functionality can be obtained by using the HSC utilities VERIFY and DKUTIL.

   1. VERIFY will give you a list of the bad blocks, from one symbol in error to uncorrectable and header errors.
   2. Once you have obtained this list, you can decide at what level you want to make replacements.
   3. If you decide to replace all bad blocks that VERIFY shows are 4 symbols in error and above, you could use the REVECTOR command in DKUTIL. This allows you to manually replace all bad blocks that VERIFY shows are 4 symbols and above in error.

   When possible, use the HSC VERIFY/DKUTIL utilities rather than the formatter.

13.4 WHEN TO USE THE FORMATTER

Use the formatter under the following circumstances:

1. Use the formatter if you have a drive that has had a data path problem or a worn spindle ground brush. Assuming you have an operating system that supports dynamic bad block replacement or an HSC controller that does controller-initiated BBR, a data path problem or a worn spindle ground brush will cause many ECC errors, header errors, and other data-related errors. BBR will most likely replace the blocks reporting the errors.
   However, because of the data path problem or worn spindle ground brush, most of the blocks reporting errors are not bad. The good blocks that have been unnecessarily replaced will begin to cause performance degradation due to revectoring.

   If you repair a drive that has a data path problem or worn spindle ground brush and the error log has accumulated many ECC/header errors, assume good blocks have been replaced. (VMS supports the BBR error log packet, and this indicates blocks being replaced for the data problem you have corrected. Many other operating systems do not support the BBR error log packet.) In severe cases, the good blocks that get replaced can be almost enough to fill the RCT (17,472 RBNs in the RA81).

   If you wish to see how many blocks are currently replaced, you can dump the contents of the RCT. This can be done in one of several ways:

   • HSC - Use the DISPLAY RCT command of DKUTIL.
   • HSC - Use the HSC VERIFY program.
   • UDA/KDA/KDB - Use the SCRUBBER program (VAX = EVRLK, PDP-11 = ZUDL).

     NOTE
     Remember there is no SCRUBBER for the MicroVAX II environment (currently in development). For MicroVAX II, the only way to see the RCT is with the CX/CSSE RAUTIL program (assuming the operating system is VMS).

   • RAUTIL - Use the SUM command of RAUTIL (assuming you have a copy available on site and VMS is the operating system).

   There is no official answer to how many replacements are too many. However, use the following as a guideline:

   • In an RA60, more than 600 replacements is suspect.
   • In an RA70, more than 500 replacements is suspect.
   • In an RA80, more than 400 replacements is suspect.
   • In an RA81, more than 1000 replacements is suspect.
   • In an RA82, more than 1000 replacements is suspect.
   • In an RA90, more than 1000 replacements is suspect.

   Suspect means that further analysis and determination is necessary. You must determine if scratches or other problems are causing the replacements. RAUTIL could be very helpful in this analysis.
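   The guideline above amounts to a simple table lookup. The following is a minimal sketch of it, not part of RAUTIL, DKUTIL, or any other DEC tool; the names SUSPECT_REPLACEMENTS and is_suspect are hypothetical, and the replacement count is assumed to come from one of the RCT dump methods listed above.

```python
# Guideline values from the text: replacement counts above these are "suspect".
# The dictionary and function names are hypothetical illustrations.
SUSPECT_REPLACEMENTS = {
    "RA60": 600,
    "RA70": 500,
    "RA80": 400,
    "RA81": 1000,
    "RA82": 1000,
    "RA90": 1000,
}


def is_suspect(drive_type, replaced_blocks):
    """True if the replacement count exceeds the guideline for the drive."""
    limit = SUSPECT_REPLACEMENTS[drive_type.upper()]
    return replaced_blocks > limit


if __name__ == "__main__":
    for drive, count in [("RA81", 1000), ("RA81", 1001), ("RA80", 401)]:
        print(drive, count, is_suspect(drive, count))
```

   A result of True only means that further analysis is needed; it does not by itself justify formatting.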
   To compare the replacements the factory found to the contents of the RCT, use the HSC DKUTIL program. The DKUTIL command DISPLAY FCT or the VERIFY program shows the contents of the FCT. For other controllers, there is currently no way to see the contents of the FCT. However, the current versions of the formatters will not begin formatting if a valid FCT does not exist.

   CONCLUSION
   If you determine that good blocks have been replaced due to a data path problem or worn spindle ground brush, use the formatter. HOWEVER, ONLY USE THE FORMATTER AFTER YOU HAVE REPAIRED THE DISK DRIVE. The formatter replaces the bad blocks identified in the FCT and any other bad blocks it finds. Therefore, after correcting a data path problem (or a worn-out spindle ground brush), the formatter returns the good blocks that BBR replaced into use and re-establishes the contents and integrity of the RCT to known values.

2. Use the formatter to correct a problem that relates to headers. Headers are never written unless you are formatting. Header errors (such as header not found, header compare error, or invalid header) also invoke BBR (assuming you have host or controller BBR available). Thus, the header errors may be corrected "dynamically." However, if you are troubleshooting intermittent header-related errors, then the formatter may help.

   Over time, the headers may diminish in amplitude. This can cause various problems, such as intermittent header-related errors. Also, under some circumstances, the automatic gain circuits may not be able to react properly, causing data-related errors (ECC type errors) when the cause is actually a low amplitude of the headers. The formatter is the only way to rewrite headers. Use the formatter to rewrite the headers and establish the amplitude intended for the drive "logic".

   CONCLUSION
   If the drive is working properly and the data path is not a source of errors, use the formatter.
   Make sure you understand the situation before deciding to use the formatter. If there are only a few blocks (1 to 10) reporting header errors, record the LBNs and manually replace those blocks using DKUTIL, RAUTIL, or the scrubber programs in manual mode. This will take considerably less time than formatting.

3. Although the formatter will replace bad blocks below the drive threshold, the formatter is not a "scanner" and may miss some bad blocks. In some cases, bad blocks can be at the marginal level of the drive threshold, cause intermittent errors, and never be replaced. This occurs most often in operating systems that do not have the most current BBR algorithm or that do not have BBR support (controller BBR or host BBR). In those cases, the fact that the formatters work below the drive threshold could be beneficial.

   CONCLUSION
   You may be able to eliminate many of the "marginal" bad blocks that are below the drive threshold but show up intermittently above the drive threshold (via error log entries).

4. You can also use the formatter for problems related to the media (HDA/pack) before replacing that HDA/pack. If a problem relates to the HDA/pack and you are considering replacing that HDA/pack, try the formatter first. Since the formatter rewrites everything (except the FCT), it may correct your problem and eliminate the need to replace the HDA/pack.

13.5 WHEN NOT TO USE THE FORMATTER

This section discusses when you should use a method other than the formatter to correct a problem.

1. Avoid using the formatter as a troubleshooting tool. The formatter is not a diagnostic.

2. Avoid using the formatter on a broken drive. Using the formatter on a broken drive may cause damage to the HDA.

3. Avoid using the formatter as a SCRUBBER or "scanner." For scrubbing or scanning, use the utilities EVRLK-VAX (with UDA/KDB), and HSC VERIFY and DKUTIL REVECTOR.
   Until an MDM scrubber is available for the MicroVAX II, formatting may be the only alternative. The formatter can be used as a scrubber to work below the drive threshold. However, if you have a bad block that keeps showing errors and BBR does not replace it, manually replace it with the HSC DKUTIL REVECTOR command or use the manual mode of the scrubber. The RAUTIL VMS utility will also allow you to perform manual block replacements for RA-series drives.

ILEXER sample output (unit D064, five successive reports):

  Unit  R  Serial        Position  Kbyte       Kbyte       Hard   Soft   Software
  No       Number                  Read        Written     Error  Error  Corrected
  D064     000000001F46  34874     0001371380  0000000000  00000  00000  00076
  D064     000000001F46  69698     0002740800  0000000000  00000  00000  00139
  D064     000000001F46  04420     0004110360  0000000000  00000  00000  00186
  D064     000000001F46  39109     0005479670  0000000000  00000  00000  00246
  D064     000000001F46  73822     0006848780  0000000000  00000  00000  00288

14.2 ILEXER SAMPLE 2

ILEXER>D>Unit No
ILEXER>D>D021 I LEXER>D > 000~4i:POO Kbyte Written 0009078532 Hard Error 00000 Soft Error 00094 Software Corrected 00191 Posi tion 70877 Kbyte Read 0008631700 Kbyte Written 0009238632 Hard Error 00000 Soft Error 00095 Software Corrected 00193 Serial Number 000000000802 P·~si tion 74041 Kbyte Read 0008792300 Kbyte Written 0009399332 Hard Error 00000 Soft Error 00097 Software Corrected 00199 R Serial Number 000000000802 Posi tion 77205 Kbyte Read 0008952600 Kbyte Written 0009559932 Hard Error 00000 Soft Error 00098 Software Corrected 00202 R Serial Number 000000000802 Posi tion 80373 Kbyte Read Kbyte Written 0009720632 Hard Error 00000 Scft Error 00099 Software Corrected 00205 R R R Serial Number 000000000802 Posi tion 6ii30 Serial Number 000000000802 Kbyte Read 00O~1l3t400 Digital Internal Use Only 14-3 Drive Error Tolerance 14.3 ACCEPTABLE DRIVE ERROR RATES RECOVERABLE RIW Errors Sometimes referred to as SOFrWARE CORRECTED ERRORS when using HSC ILEXER for example. Errors that are correctable by ECC without Retry/Error-recovery sequences. The maximum number of recoverable RIW errors is: For the RAGa, RA70, RASa, RA81, RA82, RAga M 73 1 error per: 107 bits read I SOFT RECOVERABLE RIW Errors {/ -" Sometimes referred to as soft errors when using HSC ILEXER for example. Errors that . are correctable with ECC and Ret~Error~recofvery sequences. The maximum number of soft recoverable RIW errors is: M Comment lines can be entered by prefixing them with an exclamation point (!). A null line is ignored. Entering a CTRL-Z terminates the program. Commands are executed immediately and usually take only the time necessary to print their results. Entering a CTRL-Y or crRL-C at any time will abort the program and release the drive. 15.3 COMMAND SYNTAX Commands, command options, and modifiers are recognized by initial substrings. For example, DUMP can entered as DUM, DU, or D. 
Where an initial substring can indicate one of several commands, the match depends on an order based on history and expected frequency of usage. Thus, D specifies DUMP, DI specifies DISPLAY, and DE specifies DEFAULT. Some command options take optional parameters. If these are omitted, default parameters are used.

15.4 MODIFIERS

Some commands allow modifiers. Modifiers may appear anywhere after the command. Each modifier is preceded by a slash. The following are equivalent:

DUMP/NOEDC RBN 0
DUMP /NOEDC RBN 0
DUMP RBN/NOEDC 0
DUMP RBN 0/NOEDC
DUMP RBN 0 /NOEDC

Modifiers are processed left to right and applied to the current default modifiers, if any. The default modifiers for DUMP can be changed via the DEFAULT command. The initial default modifiers for DUMP are /DATA, /EDC, and /IFERROR.

HSC50/70 DKUTIL User Guide

15.5 SAMPLE SESSION

The following is a sample session using DKUTIL. User input is in boldface.

^Y
HSC50> RUN DKUTIL
DKUTIL-Q Enter unit number (U) [D0]? D133
Serial Number:        0000000004
Mode:                 512
First Formatted:      17-Nov-1858 00:35:47.48
Date Formatted:       04-Apr-1984 00:05:09.20
Format Instance:      6
FCT:                  VALID

DKUTIL> DIS/F FCT
Factory Control Table for D133 (RA80)
Serial Number:        0000000004
Mode:                 512
First Formatted:      17-Nov-1858 00:35:47.48
Date Formatted:       04-Apr-1984 00:05:09.20
Format Instance:      6
FCT:                  VALID
Bad PBNs in FCT:      1 (512), 0 (576)
Scratch Area Offset:  63
Size (Not Last):      417
Size (Last):          289
Flags:                000000
Format Version:       0
PBNs in 512 Byte Subtable
(04) 244865 (LBN 237213),

DKUTIL> REV 1000
ERROR-W Bad Block Replacement (Success) at 04-Apr-1984 17:47:24.20
Command Ref #    00000000
RA80 Unit #      133.
Err Seq #        6.
Error Flags      80
Event            0014
Replace Flags    A400
LBN              1000.
Old RBN          32.
New RBN          33.
Cause Event      004A
ERROR-I End of error.
DKUTIL> DIS/F RCT
Revector Control Table for D133 (RA80)
Serial Number:        0000000004
Flags:                000000
LBN Being Replaced:   1000 (000000 001750)
Replacement RBN:      33 (060000 000041)
Bad RBN:              32 (060000 000040)
Cache ID:             0000000000
Cache Incarnation:    0
Incarnation Date:     17-Nov-1858 00:00:00.00
Bad RBN: 32,
139512 --> 4500,      1000 *-> 33,      25512 --> 822,
RCT Statistics:
1 Bad RBNs,
3 Bad LBNs,
2 Primary Revectors,
1 Tertiary Revectors,
0 Probationary RBNs.

DKUTIL> DUMP LBN 1000
****** Buffer for LBN 1000 (000000 001750), MSCP Status: 000000
Error Summary          = header compare
Original Error Bits    = 004000
Error Recovery Flags   = 000
Error Retry Counts     = 0,1,0
Header BN              = 1000 (000000 001750)
ECC Symbols Corrected  = 0,0
Error Recovery Command = 000
001750 030000 001750 030000 001750 030000 001750 030000
= 000000   EDC 000105
ECC 000000 000000 000000 000000 000000 000000 000000 000000
    000000 000003 000000 000000
Calculated EDC   Difference

DKUTIL> DIS CHAR LBN 1000
Characteristics for LBN 1000 (000000 001750)
Cylinder 1, Group 0, Track 4, Position 8
PBN 1032 (000000 002010)
Primary RBN 32 (060000 000040) in RCT Block 3 at Offset 128

DKUTIL> DIS CHAR DISK
Drive Characteristics for D133
Type:           RA80 (576 byte mode allowed)
Media:          FIXED
Cylinders:      275 LBN, 2 XBN, 2 DBN
Geometry:       14 tracks/group, 2 groups/cylinder, 28 tracks/cylinder
                31 LBNs/track, 1 RBNs/track, 32 sectors/track, 32 XBNs/track
                896 XBNs/cylinder, 868 LBNs/cylinder, 28 RBNs/cylinder
Group Offset:   16 (LBN), 16 (XBN)
LBNs:           237212 (host), 238700 (total)
RBNs:           7700
XBNs:           1792
DBNs:           1344 (read/write), 448 (read only)
PBNs:           249984
RCT:            465 (size), 63 (non-pad), 4 (copies)
FCT:            480 (size), 63 (non-pad), 4 (copies)
SDI Version:    3
Transfer Rate:  97
Timeouts:       3 (short), 7 (long)
Retry Limit:    5
Error Recover:  0 command levels
ECC Threshold:  2 symbols
Revision:       10 (microcode), 0 (hardware)
Drive ID:       0A7A00000000
Drive Type ID:  1
DBN RO Groups:  1
Preamble Size:  11 (data), 4 (header)

DKUTIL> DUMP RCT BLOCK 3
****** RCT Block 3, Copy 1 ******
****** Buffer for LBN 237214 (000003 117236), MSCP Status: 000000
Data = 000000 000000 000000 000000 000000 000000 000000 000000
+16    000000 000000 000000 000000 000000 000000 000000 000000
+32    000000 000000 000000 000000 000000 000000 000000 000000
+48    000000 000000 000000 000000 000000 000000 000000 000000
+64    000000 000000 000000 000000 000000 000000 000000 000000
+80    000000 000000 000000 000000 000000 000000 000000 000000
+96    000000 000000 000000 000000 000000 000000 000000 000000
+112   000000 000000 000000 000000 000000 000000 000000 000000
+128   000000 040000 001750 030000 000000 000000 000000 000000
+144   000000 000000 000000 000000 000000 000000 000000 000000
+160   000000 000000 000000 000000 000000 000000 000000 000000
+176   000000 000000 000000 000000 000000 000000 000000 000000
+192   000000 000000 000000 000000 000000 000000 000000 000000
+208   000000 000000 000000 000000 000000 000000 000000 000000
+224   000000 000000 000000 000000 000000 000000 000000 000000
+240   000000 000000 000000 000000 000000 000000 000000 000000
+256   000000 000000 000000 000000 000000 000000 000000 000000
+272   000000 000000 000000 000000 000000 000000 000000 000000
+288   000000 000000 000000 000000 000000 000000 000000 000000
+304   000000 000000 000000 000000 000000 000000 000000 000000
+320   000000 000000 000000 000000 000000 000000 000000 000000
+336   000000 000000 000000 000000 000000 000000 000000 000000
+352   000000 000000 000000 000000 000000 000000 000000 000000
+368   000000 000000 000000 000000 000000 000000 000000 000000
+384   000000 000000 000000 000000 000000 000000 000000 000000
+400   000000 000000 000000 000000 000000 000000 000000 000000
+416   000000 000000 000000 000000 000000 000000 000000 000000
+432   000000 000000 000000 000000 000000 000000 000000 000000
+448   000000 000000 000000 000000 000000 000000 000000 000000
+464   000000 000000 000000 000000 000000 000000 000000 000000
+480   000000 000000 000000 000000 000000 000000 000000 000000
+496   000000 000000 000000 000000 000000 000000 000000 000000
= 023277   EDC
Calculated EDC   Difference

15.6 DETAILED COMMAND DESCRIPTIONS

Following are descriptions of the DKUTIL commands. Command options are shown on separate lines in the syntax specification. Parameters are indicated in the syntax by braces ({ }) and lower case. Options which may be omitted are indicated by brackets ([ ]).

15.6.1 CHECK Command

Purpose: To fill the host LBN area with data unique to each LBN, or to check that the area contains that data.
Syntax:
CHECK [READ]
CHECK WRITE

Parameters: none

Modifiers:

/BBR
Normally, when a block is accessed, bad block replacement is inhibited. If this modifier is specified, bad block replacement is allowed. It will only occur, however, if the block being accessed is detected as bad by the error recovery code and is an LBN in the host area.

Usage: If the WRITE command option is given, all LBNs in the host area are written with a pattern that consists of the LBN number repeated enough times to fill the sector. Thus, each LBN has unique data. If the READ command option is selected (the default if no option is given), every LBN in the host area is read and checked for the expected unique data pattern. For both options, hard errors are reported with a status line showing the MSCP status returned along with the LBN number. For the READ option, an MSCP status of ST.CMP (000007) indicates that the data did not compare to what was expected. This status overrides other errors such as forced error. Therefore, for CHECK READ, a status other than ST.CMP indicates that the data in the LBN was correct.

Examples:
CHECK/BBR WRITE
CH READ

15.6.2 DEFAULT Command

Purpose: To change the default modifiers for the DUMP command.

Syntax:
DEFAULT

Parameters: none

Modifiers:

/IFERROR (NOIFERROR)
If this modifier is specified (default on), the error, header, and ECC fields in the buffer will be dumped if an error occurs when reading the block. When used in conjunction with the /RAW modifier, the error must occur on the reread of the block with the header code extracted from the first read.

/ERRORS (NOERRORS)
If this modifier is specified (default off), the error fields in the buffer will be dumped.

/EDC (NOEDC)
If this modifier is specified (default on), the EDC and calculated EDC fields in the buffer will be dumped.

/ECC (NOECC)
If this modifier is specified (default off), the ECC fields in the buffer will be dumped.

/DATA (NODATA)
If this modifier is specified (default on), the data in the buffer will be displayed unless the /NZ modifier has also been specified. See below.

/HEADERS (NOHEADERS)
If this modifier is specified (default off), the header fields in the buffer will be displayed.

/ALL (NONE)
This is the same as /ERRORS/EDC/ECC/DATA/HEADERS. It requests that all fields be displayed. Its opposite, /NONE, requests that no fields be displayed. In this case, only the MSCP status line will print.

/RAW (NORAW)
This modifier requests that data from the specified LBN be dumped instead of the data from the RBN, if the data had been previously replaced. The /IFERROR modifier, if in effect, applies only to the reread. This modifier only has an effect on dumping an LBN which is revectored.

/NZ (NONZ)
This modifier (default off) does not display data that is all zero. Instead, a single line indicating that the data is zero will be printed. It has no effect if the /DATA modifier is not specified (or is defaulted off).

/BBR (NOBBR)
Normally, when a block is accessed, bad block replacement is inhibited. If this modifier (default off) is specified, bad block replacement will occur. It will only occur, however, if the block being accessed is detected as bad by the error recovery code and is an LBN in the host area.

/ORIGINAL (NOORIGINAL)
When a block is accessed for dumping, the data is seen by the program twice if an error occurs. It is seen first just after the K detects the error and sends it to error recovery. It is then seen again after error recovery takes place and the data has been corrected or reread. Normally, the data is saved for displaying when it is last seen.
If this modifier (default off) is specified, the data saved for display will be the data first seen.

Usage: The modifiers specified are applied to the current default modifiers for the DUMP command. The result becomes the new default.

Examples:
DEFAULT/NONE
DEF/RAW/NODATA
DE/A/OR/NZ

15.6.3 DISPLAY Command

Purpose: To display the disk characteristics, the characteristics of a given block, the error history in the drive, the FCT, or the RCT.

Syntax:
DISPLAY ALL
DISPLAY CHARACTERISTICS DBN {block}
DISPLAY CHARACTERISTICS DISK
DISPLAY CHARACTERISTICS LBN {block}
DISPLAY CHARACTERISTICS PBN {block}
DISPLAY CHARACTERISTICS RBN {block}
DISPLAY CHARACTERISTICS XBN {block}
DISPLAY ERRORS
DISPLAY FCT
DISPLAY RCT

Parameters: block is a number specifying the DBN, LBN, PBN, RBN, or XBN whose characteristics are to be displayed. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O.

Modifiers:

/FULL
If this modifier is specified, all fields in xCT block 0 which are defined will be displayed. It applies only to the RCT and FCT command options. Normally, for the RCT option the bad block replacement and write back caching fields in RCT block 0 are only displayed if the appropriate flags in the flags field are set, indicating they are currently in use (BBR or caching in progress). This modifier forces all fields to be displayed regardless of the flags settings. For the FCT option, the number of bad PBNs field is normally displayed only if the FCT is VALID. Also, the scratch area parameters, format version, and format flags are normally not displayed. This modifier forces all fields in FCT block 0 to be displayed.

/NOITEMS
If this modifier is specified, the individual items in the FCT or RCT are not displayed. It applies only to the FCT and RCT command options. If given, only the block 0 information is displayed.
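
The numbers that DISPLAY CHARACTERISTICS LBN reports can be cross-checked by hand against the drive geometry. The following is a minimal Python sketch, not DKUTIL's own algorithm, that reproduces the values shown for LBN 1000 in the sample session (Cylinder 1, Group 0, Track 4, Position 8, PBN 1032, primary RBN 32 in RCT Block 3 at Offset 128). The names Geometry, LbnLocation, locate_lbn, and rct_slot are hypothetical. It assumes that each track holds its LBNs followed by its RBN, that group offset (skew) can be ignored, that RCT descriptors are 4 bytes each with 128 per 512-byte block, and that the first descriptor block is block 3 in DKUTIL's 1-based numbering. These assumptions match the RA80 sample output but are not stated in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    """Subset of the DISPLAY CHARACTERISTICS DISK fields used below."""
    lbns_per_track: int
    rbns_per_track: int
    tracks_per_group: int
    groups_per_cylinder: int


@dataclass(frozen=True)
class LbnLocation:
    cylinder: int
    group: int
    track: int            # track number within the group
    index_in_track: int   # ordinal of the LBN on its track (skew ignored)
    pbn: int
    primary_rbn: int


# Geometry values copied from the RA80 sample output in Section 15.5.
RA80 = Geometry(lbns_per_track=31, rbns_per_track=1,
                tracks_per_group=14, groups_per_cylinder=2)

# Assumptions (not stated in this guide, chosen to match the sample output).
DESCRIPTOR_BYTES = 4
DESCRIPTORS_PER_BLOCK = 128
FIRST_DESCRIPTOR_BLOCK = 3   # 1-based, as DUMP RCT BLOCK numbers blocks


def locate_lbn(lbn: int, geo: Geometry) -> LbnLocation:
    """Map an LBN to cylinder/group/track, PBN, and primary RBN."""
    absolute_track, index_in_track = divmod(lbn, geo.lbns_per_track)
    tracks_per_cylinder = geo.tracks_per_group * geo.groups_per_cylinder
    cylinder, track_in_cylinder = divmod(absolute_track, tracks_per_cylinder)
    group, track = divmod(track_in_cylinder, geo.tracks_per_group)
    # Every earlier track contributes its RBNs ahead of this LBN.
    rbns_before = absolute_track * geo.rbns_per_track
    return LbnLocation(cylinder, group, track, index_in_track,
                       pbn=lbn + rbns_before, primary_rbn=rbns_before)


def rct_slot(rbn: int) -> tuple[int, int]:
    """Return (RCT block number, byte offset) of the descriptor for an RBN."""
    block_index, slot = divmod(rbn, DESCRIPTORS_PER_BLOCK)
    return FIRST_DESCRIPTOR_BLOCK + block_index, slot * DESCRIPTOR_BYTES
```

Under the same assumptions, the two primary revectors listed in the sample RCT (139512 --> 4500 and 25512 --> 822) also fall out of locate_lbn, which is a quick way to confirm that a revector shown with "-->" really is the primary RBN for its track.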

Usage:

DISPLAY ALL
The disk characteristics, FCT, RCT, and error history are displayed. Because the error history in the drive is dumped by this option, do not use it for RA60 drives; the SDI command to read the error history is illegal and causes the drive to go inoperative.

DISPLAY CHARACTERISTICS DISK
The drive type, media, cylinders, geometry, group offsets, number of LBNs, number of RBNs, number of XBNs, number of DBNs, number of PBNs, RCT parameters, FCT parameters, SDI version, transfer rate, SDI time outs, SDI retry limit, error recovery command levels, ECC threshold, revision levels, drive ID, drive type ID, DBN Read/Only groups, and preamble sizes are displayed.

DISPLAY CHARACTERISTICS xBN {block}
The characteristics of the given block are displayed. For DBNs and XBNs, these are the block number in decimal and octal, cylinder, group, track, position, and PBN in decimal and octal. For RBNs, the RCT block number and offset are also displayed. For LBNs, the primary RBN number and its RCT block number and offset are also displayed. For PBNs, what is displayed depends on the type of the block: DBN, LBN, RBN, or XBN.

DISPLAY ERRORS
The error history in the drive is read from region 2, offset 0, and dumped in hexadecimal. Do not use this option for RA60 drives; it will cause them to go inoperative.

DISPLAY FCT
The information in FCT block 0 is displayed. Certain fields will not be displayed unless the /FULL modifier is specified. The list of bad PBNs is displayed unless the /NOITEMS modifier is given. For each item in the list, the header bits, PBN number, type (DBN, LBN, RBN, or XBN), and xBN number are displayed.

DISPLAY RCT
The information in RCT block 0 is displayed. Certain fields will not be displayed unless the /FULL modifier is specified. The lists of revectors, bad RBNs, and probationary RBNs are displayed unless the /NOITEMS modifier is given.
For bad and probationary RBNs, just the RBN number is displayed in decimal. For revectors, the LBN number and the RBN number to which it is revectored are displayed in decimal. A primary revector is distinguished by the character sequence "-->". A secondary (tertiary) revector is distinguished by the character sequence "*->".

Examples:
DISPLAY/FULL ALL
DI/F A
DI C D
DIS CHAR LBN 1000
DI/NOI RCT

15.6.4 DUMP Command

Purpose: To dump the given block or table of blocks.

Syntax:
DUMP [BUFFER]
DUMP DBN [{block}]
DUMP FCT [BLOCK {number}] [COPY {copy}]
DUMP LBN [{block}]
DUMP RBN [{block}]
DUMP RCT [BLOCK {number}] [COPY {copy}]
DUMP XBN [{block}]

Parameters:

block is a number specifying the DBN, LBN, RBN, or XBN to be dumped. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O.

number is the relative block number in the FCT or RCT to be dumped. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O. The value must be in the range 1 through non-pad FCT or RCT size. That is, the first block is number 1 (not 0) and the block must be in the non-pad area.

copy specifies which copy of the given block in the FCT or RCT is to be dumped. The first copy is number 1. The value must not exceed the number of copies.

Modifiers:

/IFERROR (NOIFERROR)
If this modifier is specified (default on), the error, header, and ECC fields in the buffer will be dumped if an error occurs when reading the block. When used in conjunction with the /RAW modifier, the error must occur on the reread of the block with the header code extracted from the first read.

/ERRORS (NOERRORS)
If this modifier is specified (default off), the error fields in the buffer will be dumped.

/EDC (NOEDC)
If this modifier is specified (default on), the EDC and calculated EDC fields in the buffer will be dumped.

/ECC (NOECC)
If this modifier is specified (default off), the ECC fields in the buffer will be dumped.

/DATA (NODATA)
If this modifier is specified (default on), the data in the buffer will be displayed unless the /NZ modifier has also been specified. See below.

/HEADERS (NOHEADERS)
If this modifier is specified (default off), the header fields in the buffer will be displayed.

/ALL (NONE)
This is the same as /ERRORS/EDC/ECC/DATA/HEADERS. It requests that all fields be displayed. Its opposite, /NONE, requests that no fields be displayed. In this case, only the MSCP status line will print.

/RAW (NORAW)
This modifier requests that data from the specified LBN be dumped instead of the data from the RBN, if the data had been previously replaced. The /IFERROR modifier, if in effect, applies only to the reread. This modifier only has an effect on dumping an LBN which is revectored.

/NZ (NONZ)
This modifier (default off) does not display data if it is all zero. Instead, a single line indicating that the data is zero will be printed. It has no effect if the /DATA modifier is not specified (or is defaulted off).

/BBR (NOBBR)
Normally, when a block is accessed, bad block replacement is inhibited. If this modifier (default off) is specified, bad block replacement will be allowed to occur. It will only occur, however, if the block being accessed is detected as bad by the error recovery code and is an LBN in the host area.

/ORIGINAL (NOORIGINAL)
When a block is accessed for dumping, the data is seen by the program twice if an error occurs. It is seen first just after the K detects the error and sends it to error recovery. It is then seen again after error recovery takes place and the data has been corrected or reread. Normally, the data is saved for displaying when it is last seen. If this modifier, which defaults off, is specified, the data saved for display will be the data first seen.
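
Section 15.4 states that modifiers are processed left to right and applied to the current defaults. The Python sketch below models that rule for the DUMP modifiers listed above, starting from the initial defaults /DATA, /EDC, and /IFERROR. It is an illustration only, not DKUTIL's implementation. The names apply_modifiers, FIELD_MODIFIERS, and SWITCHES are hypothetical, modifier names must be spelled out in full (abbreviations are not modeled), and the sketch assumes that /NONE clears only the five display fields that /ALL sets; the guide does not say whether /NONE also affects /IFERROR or the other switches.

```python
# Initial DUMP defaults as given in Section 15.4.
DUMP_INITIAL_DEFAULTS = frozenset({"DATA", "EDC", "IFERROR"})

# /ALL is documented as equivalent to /ERRORS/EDC/ECC/DATA/HEADERS.
FIELD_MODIFIERS = ("ERRORS", "EDC", "ECC", "DATA", "HEADERS")

# Every on/off modifier documented for DUMP and DEFAULT.
SWITCHES = frozenset(FIELD_MODIFIERS) | {"IFERROR", "RAW", "NZ", "BBR", "ORIGINAL"}


def apply_modifiers(current, modifiers):
    """Apply modifiers left to right to the current settings."""
    result = set(current)
    for modifier in modifiers:
        name = modifier.lstrip("/").upper()
        if name == "ALL":
            result.update(FIELD_MODIFIERS)
        elif name == "NONE":
            # Assumption: only the display fields are cleared.
            result.difference_update(FIELD_MODIFIERS)
        elif name in SWITCHES:
            result.add(name)
        elif name.startswith("NO") and name[2:] in SWITCHES:
            result.discard(name[2:])
        else:
            raise ValueError(f'"{modifier}" is an invalid modifier.')
    return frozenset(result)
```

For example, applying the modifiers of DEF/RAW/NODATA to the initial defaults leaves EDC, IFERROR, and RAW in effect, and because processing is left to right, /NONE/HEADERS ends with HEADERS on while /HEADERS/NONE ends with it off.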

Usage:

DUMP [BUFFER]
The current buffer is dumped subject to the given modifiers. If there is no current buffer, an error message will be printed.

DUMP xBN [{block}]
The specified DBN, LBN, RBN, or XBN is read in and dumped subject to the given modifiers. If the block number is not specified, it defaults to zero (0).

DUMP xCT [BLOCK {number}] [COPY {copy}]
If a BLOCK number is given, that block in the FCT or RCT is read in and dumped. If none is specified, every block in the non-pad area of the FCT or RCT is read in and dumped. If COPY is not specified, it defaults to copy 1.

Examples:
DUMP RCT BLOCK 3 COPY 4
DU/NZ RCT C 2
DU LBN 1000
D F B 2
D X O1 /DATA

15.6.5 EXIT Command

Purpose: To terminate execution of the program.

Syntax:
EXIT

Parameters: none

Modifiers: none

Usage: The current drive is released, all resources are returned, and the program exits.

Examples:
EXIT
E

15.6.6 GET Command

Purpose: To change the current drive.

Syntax:
GET [{drive}]

Parameters: drive is a valid drive unit specification of the form "Dnnn". If this parameter is left out, it defaults to "D000" (unit 0).

Modifiers:

/NOACQUIRE
Normally, when a new drive is selected, it is first acquired for diagnostic use by the program before being brought online. In fact, the online operation will fail if it has not been acquired. However, if the drive was previously acquired by the program and not released by the /NORELEASE modifier in a previous GET command, the drive does not need to be acquired again. Under these circumstances, the GET will fail unless /NOACQUIRE is used.

/NOIMF
By default, a new drive is brought online with the IMF (MD.IMF) MSCP modifier, which inhibits reading FCT block 0 to determine the mode, and reading and writing RCT block 0 to verify that the RCT is sane. If this modifier is specified, these actions will take place.

/NORELEASE
This command normally makes the current drive available and releases it from diagnostic use before selecting a new drive. If this modifier is specified, the current drive will be left online. It should be used with great caution because the drive left online will be in limbo until the HSC50 reboots or the drive is re-selected with a GET/NOACQUIRE and then released.

/SHADOW
If this modifier is specified, the drive will be brought online with the MSCP SHADOW (MD.SHD) modifier. The shadow unit (virtual unit) will be 0 and the unit will be made a part of a shadow set. This modifier must be used in conjunction with the /NOIMF modifier.

/WP
If this modifier is specified, the drive will be brought online with the MSCP SET WRITE PROTECT modifier (MD.SWP) and WRITE PROTECT unit flag (UF.WPS). The drive will be software or volume write protected.

/NOWP
If this modifier is specified, the drive will be brought online with the MSCP SET WRITE PROTECT modifier. The drive will not be software write protected.

Usage: The current drive is released unless the /NORELEASE modifier is specified. The new drive is acquired unless the /NOACQUIRE modifier is specified (the unit was previously acquired). It is then brought online with the requested modifiers and unit flags. If the drive is nonexistent, in use, or inoperative, the user is prompted for another unit. The modifiers cannot be changed for this other unit. If the mode word in FCT block 0 is invalid or all copies of FCT block 0 are bad, the user is prompted for the sector size to use.

Examples:
GET D133
G/WP D64
G

15.6.7 MODIFY Command

Purpose: To modify a location or set of consecutive locations in the current buffer.

Syntax:
MODIFY {offset} [{value} ...]

Parameters:

offset is a number specifying the initial offset in the current buffer where modification is to start. The default radix is decimal.
It can be changed to octal by prefixing the number with the letter O. The value is forced even.

value is a number used to modify the next consecutive word in the current buffer. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O.

Modifiers: none

Usage: The specified word in the current buffer is changed to the given value. The following consecutive words are changed to the subsequent values, if any. Modification stops when the offset exceeds 574. The modified buffer can be written to an arbitrary block with the WRITE command, which will recompute the checksum on the buffer.

Examples:
MODIFY 130 O40000
M 0 0 0 0

15.6.8 POP Command

Purpose: To restore the data in the current buffer from the save buffer.

Syntax:
POP

Parameters: none

Modifiers: none

Usage: The data in the save buffer is restored to the current buffer. The data in the current buffer is lost.

Examples:
POP
P

15.6.9 PUSH Command

Purpose: To save the data in the current buffer in the save buffer.

Syntax:
PUSH

Parameters: none

Modifiers: none

Usage: The data in the current buffer is saved in the save buffer. The data in the save buffer is lost.

Examples:
PUSH
PU

15.6.10 REVECTOR Command - (Manual LBN Replacement)

Purpose: To force bad block replacement to occur for a given LBN.

Syntax:
REVECTOR {block}

Parameters: block is a number specifying the LBN to be replaced. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O.

Modifiers: none

Usage: The specified LBN is sent to the bad block replacement module to be revectored. If it is not a valid LBN or is not in the RCT, the revector will fail and an error message will be printed. Otherwise, the result of the replace attempt will be shown in the error log produced, if the appropriate message level (INFO) is enabled. The data in the replacement RBN is read from the specified LBN.
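
The radix rule for {block} (decimal by default, octal when prefixed with the letter O) means that REVECTOR 1000 and REVECTOR O1750 name the same LBN, which matches the "1000 (000000 001750)" pairs printed throughout the sample session. The Python sketch below illustrates the rule. The function name parse_block_number is hypothetical, accepting a lowercase "o" prefix is an assumption, and the error texts simply echo the DKUTIL-E messages listed in Section 15.8.

```python
def parse_block_number(text: str) -> int:
    """Parse a number as decimal, or as octal when prefixed with the letter O."""
    token = text.strip()
    if token[:1].upper() == "O":
        digits = token[1:]
        # Octal digits only; an empty digit string is also rejected.
        if not digits or any(ch not in "01234567" for ch in digits):
            raise ValueError("Invalid octal number.")
        return int(digits, 8)
    if not token or any(ch not in "0123456789" for ch in token):
        raise ValueError("Invalid decimal number.")
    return int(token, 10)
```

A number that begins with the digit zero, such as 0100, is still decimal under this rule; only the letter O selects octal, which is easy to misread on a terminal or in a printed listing.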

Examples:
REVECTOR 1000
R 100

15.6.11 SET Command

Purpose: To change various program parameters.

Syntax:
SET [SIZE {size}]

Parameters: size specifies the new sector size to be used for the current drive. It must be either 512 or 576.

Modifiers: none

Usage:

SET SIZE {size}
The sector size is changed to the given value and the disk parameters are recomputed. This new sector size is used when doing I/O to the LBN area and is also reflected in the parameters printed by the DISPLAY CHARACTERISTICS DISK command.

Examples:
SET SIZE 576
S S 512

15.6.11.1 SET CSSE_WRITE_ON

Syntax:
SET CSSE_WRITE_ON

Usage: This command is only applicable to the DKUTIL utility supplied with HSC software Version 390 or higher. When executed, this command enables proper operation of the following "disk-write" commands:

CHECK WRITE
MODIFY
WRITE

This feature is intended for field service use only. The special disk-write commands are enabled only until the user terminates execution of DKUTIL. Upon restarting DKUTIL, these commands are again disabled until the next execution of the SET CSSE_WRITE_ON command. This command eliminates the need for a special DKUTIL ODT patch to enable the disk-write features.

NOTE: The REVECTOR command is not affected. The REVECTOR command is always enabled. This SET command may not be abbreviated.

Example:
SET CSSE_WRITE_ON

15.6.12 WRITE Command

Purpose: To write a given block or all copies of a given RCT or FCT block.

Syntax:
WRITE [BUFFER]
WRITE DBN [{block}]
WRITE FCT [BLOCK {number}] [COPY {copy}]
WRITE LBN [{block}]
WRITE RBN [{block}]
WRITE RCT [BLOCK {number}] [COPY {copy}]
WRITE XBN [{block}]

Parameters:

block is a number specifying the DBN, LBN, RBN, or XBN to be written. The default radix is decimal. It can be changed to octal by prefixing the number with the letter O.

number is the relative block number in the FCT or RCT to be written. The default radix is decimal.
It can be changed to octal by prefixing the number with the letter O. The value must be in the range 1 through non-pad FCT or RCT size. That is, the first block is number 1 (not 0) and the block must be in the non-pad area.

copy specifies which copy of the given block in the FCT or RCT is to be written. The first copy is number 1. The value must not exceed the number of copies.

Modifiers:

/BADEDC
If this modifier is specified, the EDC written is forced to be bad (illegal). The actual EDC used is the correct EDC plus 1. Using this modifier will cause a number of error log messages to be generated. The block(s) written will cause controller errors (ST.CNT) when read.

/FE
If this modifier is specified, the EDC written will cause the block(s) to have forced errors. The actual EDC used is the complement of the correct EDC. The block(s) written will cause forced errors (ST.DAT) when read.

/BBR
Normally, when a block is accessed, bad block replacement is inhibited. If this modifier is specified, bad block replacement will be allowed to occur. It will only occur, however, if the block being accessed is detected as bad by the error recovery code and is an LBN in the host area.

Usage: A new EDC is computed on the data in the current buffer and modified according to the given modifiers. The current buffer is written with this new EDC to the specified block or blocks. If the FCT or RCT option is used, the first block (actual block 0) will be written if the BLOCK parameter was not given. If the COPY parameter is not given, all copies of the block will be written.

Examples:
WRITE RCT BLOCK 3
WR LBN 1000
WR/BBR L 100
W/FE
W/BAD

15.7 COMMAND SUMMARY

CHECK      Fill or check for unique data in host LBN area.
           CHECK [READ]
           CHECK WRITE

DEFAULT    Change default modifiers for DUMP command.

DISPLAY    Display characteristics, error history, RCT, or FCT.
           DISPLAY ALL
           DISPLAY CHARACTERISTICS DBN {block}
           DISPLAY CHARACTERISTICS DISK
           DISPLAY CHARACTERISTICS LBN {block}
           DISPLAY CHARACTERISTICS PBN {block}
           DISPLAY CHARACTERISTICS RBN {block}
           DISPLAY CHARACTERISTICS XBN {block}
           DISPLAY ERRORS
           DISPLAY FCT
           DISPLAY RCT

DUMP       Dump given block or table of blocks.
           DUMP [BUFFER]
           DUMP DBN [{block}]
           DUMP FCT [BLOCK {number}] [COPY {copy}]
           DUMP LBN [{block}]
           DUMP RBN [{block}]
           DUMP RCT [BLOCK {number}] [COPY {copy}]
           DUMP XBN [{block}]

EXIT       Terminate execution of the program.

GET        Change the current drive.
           GET [{drive}]

MODIFY     Modify location(s) in the current buffer.
           MODIFY {offset} [{value} ...]

POP        Restore save buffer to current buffer.

PUSH       Save current buffer in save buffer.

REVECTOR   Force bad block replacement for the given LBN.
           REVECTOR {block}

SET        Change various program parameters.
           SET [SIZE {size}]
           SET CSSE_WRITE_ON (enables disk-write commands)

WRITE      Write a given block or all copies of a FCT or RCT block.
           WRITE [BUFFER]
           WRITE DBN [{block}]
           WRITE FCT [BLOCK {number}] [COPY {copy}]
           WRITE LBN [{block}]
           WRITE RBN [{block}]
           WRITE RCT [BLOCK {number}] [COPY {copy}]
           WRITE XBN [{block}]

15.8 ERRORS and INFORMATION MESSAGES

Following is a list of error and information messages which may be printed by DKUTIL. Variable output is as follows:

n          a decimal number
par        BLOCK or COPY
parm       the part of the command in error (modifier, etc.)
status     MSCP status (an octal number)
text       the actual text in error
xBN        DBN, LBN, etc.
xCT        FCT or RCT

15.8.1 DKUTIL-S CTRL/Y or CTRL/C Abort!

This termination message is printed if the user aborts DKUTIL by typing CTRL-C or CTRL-Y.

15.8.2 DKUTIL-F Insufficient resources to RUN!

This message is printed if DKUTIL cannot acquire the necessary resources or if the disk functional code is not loaded. The program terminates after this message is printed.

15.8.3 DKUTIL-F Drive went OFFLINE!

This message is printed if the unit selected goes off line while DKUTIL is running. The program terminates after this message is printed.

15.8.4 DKUTIL-F I/O request was rejected!

This message is printed if the diagnostic interface (DDUSUB) rejects a request to start an I/O operation. It indicates a bug in DKUTIL and should be reported to field service. The program terminates after this message is printed.

15.8.5 DKUTIL-E Illegal response to start-up question.

This message is printed if an invalid response is entered for a start-up question or a prompt for the GET command. The user is prompted again with the same question.

15.8.6 DKUTIL-E Nonexistent unit number.

This error message is printed if the unit number entered does not correspond to any known unit. The user is prompted again for a unit number.

15.8.7 DKUTIL-E Unit is not available.

This message is printed if the unit requested is unavailable. It may be in use by a host or another diagnostic. It may be inoperative. The user is prompted again for another unit.

15.8.8 DKUTIL-E Cannot ONLINE unit.

This message is printed if the requested unit is available but the ONLINE command failed. The unit is released, and the user is prompted again for another unit.

15.8.9 DKUTIL-E Invalid decimal number.

This message is printed if the user entered an invalid decimal number in a command line.

15.8.10 DKUTIL-E Invalid octal number.

This message is printed if the user entered an invalid octal number in a command line.

15.8.11 DKUTIL-E Missing parameter.

This message is printed if a command line is entered and a required parameter is missing.

15.8.12 DKUTIL-E There is no buffer to dump.

This message is printed if the DUMP BUFFER command is entered and there is no current buffer. This can only happen if a drive has just been selected.

15.8.13 DKUTIL-E Missing modifier (only a slash (/) was specified).
This message is printed if a command line is entered with a slash (/) not followed by a modifier.

15.8.14 DKUTIL-E SDI command was unsuccessful.

This message is printed when an SDI command is rejected by the drive. A DISPLAY ERRORS command for an RA60 drive will generate this message.

15.8.15 DKUTIL-E n is an invalid par number; maximum is n.

This message is printed if an out-of-range number is entered for a BLOCK or COPY value for either the DUMP or the WRITE command.

15.8.16 DKUTIL-E "text" is an invalid parm.

This generic error message is printed when an invalid command, command option, modifier, block type, or SET option is specified in a command line.

15.8.17 DKUTIL-E Invalid block number for xBN space.

This message is printed if the block number specified for a DISPLAY CHARACTERISTICS xBN command is out of range for the given space.

15.8.18 DKUTIL-E Copy n of xCT Block n (xBN n) is bad.

This message is printed for FCT or RCT blocks which cannot be read correctly with error recovery. It will occur when the FCT or RCT is being read just after a drive has been selected or for the DISPLAY FCT or DISPLAY RCT command.

15.8.19 DKUTIL-E All copies of xCT Block n are bad.

This message is printed for FCT or RCT blocks where all copies are bad. It will occur when the FCT or RCT is being read just after a drive has been selected or for the DISPLAY FCT or DISPLAY RCT command.

15.8.20 DKUTIL-E Could not write xBN n, MSCP Status: status

This message is printed if a write (for the WRITE command) fails.

15.8.21 DKUTIL-E Invalid sector size; only 512 and 576 are legal.

This message is printed if the sector size entered for the SET SIZE command is not 512 or 576.

15.8.22 DKUTIL-E Revector for LBN n failed, MSCP Status: status

This message is printed if a revector (for the REVECTOR command) fails.
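When DKUTIL console output is captured to a file, the message forms listed in this section can be sorted mechanically. The Python sketch below is not part of DKUTIL or the HSC software. It assumes that every message starts with DKUTIL-S, DKUTIL-F, or DKUTIL-E as shown in the headings, that the -S and -F messages are the ones after which the program ends (as the descriptions state), and that the MSCP status is printed in octal. The function names are hypothetical.

```python
import re

# Severity letters as they appear in the section 15.8 headings.
_MESSAGE = re.compile(r"^DKUTIL-([SFE])\s+(.*)$")
# "status" is documented as an octal number.
_STATUS = re.compile(r"MSCP Status:\s*([0-7]+)")


def classify(line):
    """Return (severity, text) for a DKUTIL message line, or None."""
    match = _MESSAGE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def ends_program(line):
    """True for the -S (abort) and -F messages, after which DKUTIL exits."""
    parsed = classify(line)
    return parsed is not None and parsed[0] in ("S", "F")


def mscp_status(line):
    """Return the MSCP status as an integer, or None if the line has none."""
    match = _STATUS.search(line)
    return int(match.group(1), 8) if match else None
```

For example, `mscp_status("DKUTIL-E Could not write LBN 100, MSCP Status: 000010")` yields 8, while a -E line such as "DKUTIL-E Missing parameter." is reported by `ends_program` as not ending the program, which matches the reprompt behavior described for the -E messages.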
15.8.23 DKUTIL-E CHECK READ for LBN n failed, MSCP Status: status

This message is printed for any LBN read by CHECK READ that returns a nonzero MSCP status.

15.8.24 DKUTIL-E CHECK WRITE for LBN n failed, MSCP Status: status

This message is printed for any LBN written by CHECK WRITE that returns a nonzero MSCP status.

15.9 DKUTIL Lab Samples

DKUTIL Lab Sample - 1

HSC50> RUN DD1:DKUTIL

DKUTIL functionality has changed in this release. It no longer
prompts the user for the drive number. Instead, use the GET Dxxx
command.

DKUTIL> GET D230

Serial Number:     0000160992
Mode:              512
First Formatted:   17-Mar-1988 18:00:58.00
Date Formatted:    18-Mar-1988 00:00:00.00
Format Instance:   1
FCT:               VALID

DKUTIL>

DKUTIL Lab Sample - 2

DKUTIL> DISP CHARACTERISTICS DISK

Drive Characteristics for D0230

Type:            RA81
Media:           FIXED
Cylinders:       1252 LBN, 4 XBN, 2 DBN
Geometry:        1 track/group, 14 groups/cylinder, 14 tracks/cylinder
                 51 LBNs/track, 1 RBN/track, 52 sectors/track, 52 XBNs/track
                 728 XBNs/cylinder, 714 LBNs/cylinder, 14 RBNs/cylinder
Group Offset:    14 (LBN), 14 (XBN)
LBNs:            891072 (host), 893928 (total)
RBNs:            17528
XBNs:            2912
DBNs:            728 (read/write), 728 (read only)
PBNs:            915824
RCT:             765 (size), 139 (non-pad), 4 (copies)
FCT:             780 (size), 139 (non-pad), 4 (copies)
SDI Version:     3
Transfer Rate:   174
Timeouts:        3 (short), 7 (long)
Retry Limit:     5
Error Recover:   0 command levels
ECC Threshold:   6 symbols
Revision:        8 (microcode), 8 (hardware)
Drive ID:        010C00030000
Drive Type ID:   5
DBN RO Groups:   14
Preamble Size:   19 (data), 12 (header)

DKUTIL>

DKUTIL Lab Sample - 3

DKUTIL> DISPLAY ERRORS
*---------------------------------------------------------------------*
|              Disk Drive Internal Error Log Display                  |
*---------------------------------------------------------------------*
This command will display the internal error log of disk drives that
support internal error logging. For the RA80, RA81 and RA82 only 16
bytes of error log data will be displayed. For the RA60, no error log
is implemented. For later drives, the internal error log data will be
displayed.
*---------------------------------------------------------------------*

*---------------------------------------------------------------------*
Region 2 Data (byte 0 on the right is the oldest Drive Error)
*---------------------------------------------------------------------*
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

DKUTIL>

DKUTIL Lab Sample - 4

DKUTIL> DISPLAY RCT

Serial Number:        0000160992
Flags:                000000
LBN Being Replaced:   0 (000000 000000)
Replacement RBN:      0 (060000 000000)

    11403 -->   223,     13822 -->   271,     14536 -->   285,
    15293 -->   299,     22714 -->   445,     51330 -->  1006,
    Bad RBN:   1475,     83559 -->  1638,     93562 -->  1834,
   100244 -->  1965,    114133 -->  2237,    114847 -->  2251,
   115561 -->  2265,    116275 -->  2279,    116989 -->  2293,
   117394 -->  2301,    117703 -->  2307,    120021 -->  2353,
   121987 -->  2391,    150682 -->  2954,    164702 -->  3229,
   171620 -->  3365,    181380 -->  3556,    192415 -->  3772,
   193520 -->  3794,    210660 -->  4130,    212809 -->  4172,
   225652 -->  4424,    231370 -->  4536,    242077 -->  4746,
   253500 -->  4970,    254214 -->  4984,    262070 -->  5138,
   266355 -->  5222,    272069 -->  5334,    272623 -->  5345,
   274359 -->  5379,    275905 -->  5409,    282789 -->  5544,
   305893 -->  5997,    308464 -->  6048,    317579 -->  6227,
   340423 *->  6674,    365116 -->  7159,    365829 -->  7173,
   374947 -->  7351,    379896 -->  7448,    389162 -->  7630,
   389586 -->  7638,    390300 -->  7652,    391014 -->  7666,
   391728 -->  7680,    420792 -->  8250,    433639 -->  8502,
   442706 -->  8680,    446995 -->  8764,    493817 -->  9682,
   497662 -->  9758,    507686 -->  9954,    516964 --> 10136,
   567338 --> 11124,    567925 --> 11135,    577411 --> 11321,
   578125 --> 11335,    614076 --> 12040,    640975 --> 12568,
   659759 --> 12936,    671472 --> 13166,    671958 --> 13175,
   672672 --> 13189,    680737 --> 13347,    681451 --> 13361,
   682165 --> 13375,    682879 --> 13389,    684307 *-> 13417,
   685021 --> 13431,    686449 --> 13459,    687163 --> 13473,
   687877 --> 13487,    688591 --> 13501,    689305 --> 13515,
   699525 --> 13716,    721806 --> 14153,    729217 --> 14298,
   744479 --> 14597,    746413 --> 14635,    768332 --> 15065,
   773565 --> 15167,    790535 --> 15500,    800493 --> 15695,
   814216 --> 15965,    814930 --> 15979,    815644 --> 15993,
   859503 --> 16853,    869680 --> 17052,    885017 --> 17353,
   889896 --> 17448,

RCT Statistics:     1 Bad RBN
                   96 Bad LBNs
                   94 Primary Revectors
                    2 Non-Primary Revectors
                    0 Probationary RBNs

DKUTIL>

DKUTIL Lab Sample - 5

DKUTIL> DISP RCT/FULL

Revector Control Table for D0230 (RA81)

Serial Number:        0000160992
Flags:                000000
LBN Being Replaced:   0 (000000 000000)
Replacement RBN:      0 (060000 000000)
Bad RBN:              0 (060000 000000)
Cache ID:             0000000000
Cache Incarnation:    0
Incarnation Date:     17-Nov-1858 00:00:00.00

    11403 -->   223,     13822 -->   271,     14536 -->   285,
    15293 -->   299,     22714 -->   445,     51330 -->  1006,
    Bad RBN:   1475,     83559 -->  1638,     93562 -->  1834,
   100244 -->  1965,    114133 -->  2237,    114847 -->  2251,
   115561 -->  2265,    116275 -->  2279,    116989 -->  2293,
   117394 -->  2301,    117703 -->  2307,    120021 -->  2353,
   121987 -->  2391,    150682 -->  2954,    164702 -->  3229,
   171620 -->  3365,    181380 -->  3556,    192415 -->  3772,
   193520 -->  3794,    210660 -->  4130,    212809 -->  4172,
   225652 -->  4424,    231370 -->  4536,    242077 -->  4746,
   253500 -->  4970,    254214 -->  4984,    262070 -->  5138,
   266355 -->  5222,    272069 -->  5334,    272623 -->  5345,
   274359 -->  5379,    275905 -->  5409,    282789 -->  5544,
   305893 -->  5997,    308464 -->  6048,    317579 -->  6227,
   340423 *->  6674,    365116 -->  7159,    365829 -->  7173,
   374947 -->  7351,    379896 -->  7448,    389162 -->  7630,
   389586 -->  7638,    390300 -->  7652,    391014 -->  7666,
   391728 -->  7680,    420792 -->  8250,    433639 -->  8502,
   442706 -->  8680,    446995 -->  8764,    493817 -->  9682,
   497662 -->  9758,    507686 -->  9954,    516964 --> 10136,
   567338 --> 11124,    567925 --> 11135,    577411 --> 11321,
   578125 --> 11335,    614076 --> 12040,    640975 --> 12568,
   659759 --> 12936,    671472 --> 13166,    671958 --> 13175,
   672672 --> 13189,    680737 --> 13347,    681451 --> 13361,
   682165 --> 13375,    682879 --> 13389,    684307 *-> 13417,
   685021 --> 13431,    686449 --> 13459,    687163 --> 13473,
   687877 --> 13487,    688591 --> 13501,    689305 --> 13515,
   699525 --> 13716,    721806 --> 14153,    729217 --> 14298,
   744479 --> 14597,    746413 --> 14635,    768332 --> 15065,
   773565 --> 15167,    790535 --> 15500,    800493 --> 15695,
   814216 --> 15965,    814930 --> 15979,    815644 --> 15993,
   859503 --> 16853,    869680 --> 17052,    885017 --> 17353,
   889896 --> 17448,    891071 --> 17471,

RCT Statistics:     1 Bad RBN
                   96 Bad LBNs
                   94 Primary Revectors
                    2 Non-Primary Revectors
                    0 Probationary RBNs

DKUTIL>

DKUTIL Lab Sample - 6

DKUTIL> DISPLAY FCT

Factory Control Table for D0230 (RA81)

Serial Number:     0000160992
Mode:              512
First Formatted:   17-Mar-1988 18:00:58.00
Date Formatted:    18-Mar-1988 00:00:00.00
Format Instance:   1
FCT:               VALID

Bad PBNs in FCT:   98 (512), 0 (576)

PBNs in 512 Byte Subtable

(04) 910999 (LBN 893514),   (14) 907296 (LBN 889896),   (14) 902364 (LBN 885017),
(14) 886732 (LBN 869680),   (14) 876406 (LBN 859503),   (14) 831655 (LBN 815644),
(14) 830927 (LBN 814930),   (14) 830199 (LBN 814216),   (14) 816150 (LBN 800493),
(14) 806011 (LBN 790535),   (14) 788698 (LBN 773565),   (14) 783411 (LBN 768332),
(14) 761066 (LBN 746413),   (14) 759046 (LBN 744479),   (14) 743519 (LBN 729217),
(14) 735985 (LBN 721806),   (14) 713277 (LBN 699525),   (14) 702786 (LBN 689305),
(14) 702058 (LBN 688591),   (14) 701330 (LBN 687877),   (14) 700602 (LBN 687163),
(14) 699874 (LBN 686449),   (14) 698418 (LBN 685021),   (14) 697690 (LBN 684307),
(14) 696234 (LBN 682879),   (14) 695506 (LBN 682165),   (14) 694778 (LBN 681451),
(14) 694050 (LBN 680737),   (14) 685875 (LBN 672672),   (14) 685147 (LBN 671958),
(14) 684670 (LBN 671472),   (14) 672695 (LBN 659759),   (14) 653579 (LBN 640975),
(14) 626116 (LBN 614076),   (11) 589430 (LBN 578125),   (11) 588702 (LBN 577411),
(14) 579026 (LBN 567925),   (14) 578470 (LBN 567338),   (14) 527100 (LBN 516964),
(14) 517640 (LBN 507686),   (14) 507420 (LBN 497662),   (14) 503507 (LBN 493817),
(14) 455759 (LBN 446995),   (14) 451386 (LBN 442706),   (14) 442145 (LBN 433639),
(14) 429046 (LBN 420792),   (14) 399364 (LBN 391728),   (14) 398636 (LBN 391014),
(14) 397908 (LBN 390300),   (14) 397180 (LBN 389586),   (14) 396792 (LBN 389162),
(11) 387344 (LBN 379896),   (14) 382260 (LBN 374947),   (14) 373020 (LBN 365829),
(14) 372293 (LBN 365116),   (14) 347081 (LBN 340423),   (14) 323804 (LBN 317579),
(11) 314512 (LBN 308464),   (11) 311856 (LBN 305893),   (14) 288333 (LBN 282789),
(11) 281280 (LBN 275905),   (14) 279728 (LBN 274359),   (14) 277966 (LBN 272623),
(14) 277403 (LBN 272069),   (14) 271577 (LBN 266355),   (14) 267208 (LBN 262070),
(14) 259198 (LBN 254214),   (14) 258470 (LBN 253500),   (14) 246823 (LBN 242077),
(14) 235906 (LBN 231370),   (14) 230076 (LBN 225652),   (14) 216981 (LBN 212809),
(14) 214790 (LBN 210660),   (14) 197314 (LBN 193520),   (14) 196167 (LBN 192415),
(14) 184936 (LBN 181380),   (14) 175003 (LBN 171620),   (14) 167953 (LBN 164702),
(11) 153636 (LBN 150682),   (14) 124376 (LBN 121987),   (14) 122388 (LBN 120021),
(14) 120008 (LBN 117703),   (14) 119661 (LBN 117394),   (14) 119280 (LBN 116989),
(14) 118552 (LBN 116275),   (14) 117824 (LBN 115561),   (14) 117096 (LBN 114847),
(14) 116368 (LBN 114133),   (14) 102227 (LBN 100244),   (14)  95396 (LBN 93562),
(14)  85197 (LBN 83559),    (04)  76717 (RBN 1475),     (14)  52348 (LBN 51330),
(14)  23157 (LBN 22714),    (14)  15558 (LBN 15293),    (14)  14839 (LBN 14536),
(14)  14111 (LBN 13822),    (14)  11600 (LBN 11403),

DKUTIL>

DKUTIL Lab Sample - 7

DKUTIL> DISPLAY CHARA LBN 100
Characteristics for LBN 100 (000000 000144)
Cylinder 0, Group 1, Track 0, Position 11
PBN 63 (000000 000077)
Primary RBN 1 (060000 000001) in RCT Block 3 at Offset 4

DKUTIL>

DKUTIL Lab Sample - 8

DKUTIL> DISPLAY CHARA DBN 2

Characteristics for DBN 2 (140000 000002)
Cylinder 1256, Group 0, Track 0, Position 2
PBN 914370 (000015 171702)

DKUTIL>

DKUTIL Lab Sample - 9

DKUTIL> DISPLAY CHARA RBN 24

Characteristics for RBN 24 (060000 000030)
Cylinder 1, Group 10, Track 0, Position 35
PBN 1283 (000000 002403)
Located in RCT Block 3 at Offset 96

DKUTIL>

DKUTIL Lab Sample - 10

DKUTIL> DISPLAY CHARA XBN 400

Characteristics for XBN 400 (120000 000620)
Cylinder 1252, Group 7, Track 0, Position 30
PBN 911850 (000015 164752)

DKUTIL>

DKUTIL Lab Sample - 11

DKUTIL> DUMP LBN 100/ALL

****** Buffer for LBN 100 (000000 000144), MSCP Status: 000000

Error Summary
Original Error Bits    = 000000
Error Recovery Flags   = 000
Error Retry Counts     = 0,0,0

Header = 000144 000000 000144 000000 000144 000000 000144 000000
G;cleata = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320, +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 063146 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 EDC 063043 ECC 000000 000000 000000 000000 000000 000000 000000 000000 010400 000000 000000 000000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 Calculated EDC Difference DKUTIL> 15-32 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 Digital Internal Use OnlyunL Lab Sample - 12 DKUTIL> DUMP/ALL DBN 123 ****** Buffer for DBN 123 (140000 000173) Error Summary = = MSCP Status: 000000 = Original Error Bits Error Recovery Flags Error Retry Counts Header I = 000000 = 000 = 0,0.,0 ('" Li S ~i f~\e BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 000173 140000 000173 140000 000173 140000 000173 140000 +416 +432 +448 +464 +480 +496 177777 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 EDC 030206 ECC 000000 000000 000000 000000 000000 000000 000000 000000 007400 000002 000000 000000 Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 '~alculated EDC Differenceigital Internal Use Only 15-33 DKUTIL Lab Sample - 13 DKUTIL> DUMP/ALL RBN 2000 ****** Buffer for RBN 2000 (060000 003720), MSCP Status: 000010 S+d.ft,s Error Summary = EDC = Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 
+336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 S c- f v/ p. { of £ [) '-- ~ rrc" r" = 000020 = 000 = 0,0,0 177777 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 15555'5 066666 133333 155555 = = 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 Caloulated EDC Difference 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 147571 ECC 000000 000000 000000 000000 000000 000000 000000 000000 006400 000001 000000 000000 Digital Internal Use Only 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666· 133333 155555 066666 133333 = 177777 EDC DKUTIL> 15-34 I BN = 2000 (060000 003720) ECC Symbols Correoted 0,0 Error Recovery Command 000 t-uHfBLJ; Rotv' 003720 060000 003720 060000 003720 
060000 003720 060000 Original Error Bits Error Recovery Flags Error RetrY'Counts Header rcn-c €d ~v-ro r ( 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 133333 155555 066666 '1> ~c 5111) \ \I ) :::- ':-oK«"cl.. PYtOr 1U t YfI tl-t ~ DKUTIL Lab Sample - 14 DKUTIL> 1)OMP /ALL RBN 223 ****** Buffer for RBN 223 (060000 000337), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = 000000 000 0,0,0 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command·alculated EDC Difference 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 000000 133331,\ 133331 13,3331 133331 133331 133331 133331 133331 133331 133331 133331 ~ 133331 133331 133331 J) 133331 133331 . 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331J 133331 133331 \;:~HplftJ s",,.c'li !. ()(~ DKOTIL> Digital Internal Use Only 1 >-35 DKUTIL Lab Sample - 15 DKUTIL> DUMP/ALL XBN 0 ****.** Buffer for XBN k.., a (120000 000000), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = = 000000 = 000 = 0,0,0 r-USAThlt. 
l(~~ 000000 120000 000000 1200~0 r-"MO&R tAJovt{ ::- 5 L~ b'1 te f1- tooooootualculated EDC Difference DKOTIL> 15-36 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 Digital Internal Use Only 000000 ~S"E ~ F" t.. rC T 6/f tlal,oP b;-f-/~ Fe fV\\)ev(,c(bir t~ , r !~~ DKUTIL Lab Sample - 16 \ /, rj ~ 'P\(l:/(,C DKUTIL> DUMP /AI,;L FCT BLOCK 1 COpy 1 4&1 .. ,,\ 1 i",.c.';~Ji; ,\,')"[7\ ****** FCT Block 1, Copy 1 ****** ****** Buffer for XBN 0 (120000 000000), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header = Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 000000 000 0,0,0 BN = 0 (000000 000000) ECC Symbols Corrected Error Recovery Command = 0,0 = 000 000000 120000 000000 120000 000000 120000 000000 120000 126736 177626 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000001 000220 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 072340 100000 000213 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000002 032532 001201 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 177710 000661 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 
000000 000000 000000 000000 Calculated EDC Differenceigital Internal Use Only 15-37 DKUTIL Lab Sample - 17 DKUTIL> DUMP/ALL XBN 1 ****** Buffer for XBN 1 (120000 000001), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header = 000001 I-oo....Jf'r Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 163227 057566 072026 116352 161075 130272 117652 073463 174413 152722 137034 137441 011124 152464 170334 045300 022331 142047 043406 125633 157024 147430 107523 146174 033437 000000 000000 000000 000000 000000 000000 000000 000000 000 0,0,0 120000 000001 120000 000001 120000 000001 120000 HI~alculated EDC Difference Digital Internal Use Only 103714 125367 1 72063 035361 131622 122532 113442 041667 175636 163010 161472 012454 164420 045711 063115 035633 170646 047625 151150 162730 150760 143220 025655 034767 000000 000000 000000 000000 000000 000000 000000 000000 = 000000 EDC DKUTIL> 15-38 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 140015 140014 140013 140013 140012 140012 140012 140012 110010 140007 140006 140006 110005 140005 140004 140004 140003 140003 140002 140001 140001 140001 040001 140000 000000 000000 000000 000000 000000 000000 000000 000000 DKUTIL Lab Sample - 18 DKUTIL> DUMP/ALL FCT BLOCK 2 COpy 1 ****** FCT Block 2, Copy 1 ****** ****** Buffer for XBN 1 ( 120000 000001), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = 000001 = 163227 057566 072026 116352 161075 130272 117652 073463 174413 152722 137034 137441 011124 152464 170334 045300 022331 142047 043406 125633 157024 147430 107523 146174 033437 000000 000000 000000 000000 000000 000000 000000 000000 000 0,0,0 BN = 0 
(000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 120000 000001 120000 000001 120000 000001 120000 040015 140015 140014 140013 140012 140012 140012 140012 140011 140010 140007 140006 140006 140005 140004 110004 140004 140003 140003 140002 140001 140001 140001 140000 140000 000000 000000 000000 000000 000000 000000 000000 154040 130247 046173 112406 134502 126742 116322 072133 106704 151646 127323 105766 007574 130434 146220 042260 011710 114602 001302 110021 152310 146100 072244 055165 026520 000000 000000 000000 000000 000000 000000 000000 140015 140014 140014 140013 140012 140012 140012 140012 140011 140010 140007 140006 140006 140005 110004 140004 140004 140003 140003 140002 140001 140001 140001 140000 140000 000000 000000 000000 000000 000000 000000 000000 142334 126717 004332 054137 133152 124062 114772 071176 177166 005374 172117 014004 006770 127105 141060 036716 172176 101274 177107 054044 151555 144550 046315 036306 000000 000000 000000 000000 000000 000000 000000 000000 Calculated EDC Difference 140015 140014 140014 140013 140012 140012 140012 140012 110010 140010 140006 140006 140006 140005 110004 140004 140003 140003 140002 110002 140001 140001 140001 140000 000000 000000 000000 000000 000000 000000 000000 000000 = EDC 142337 ECC 000000 000000 000000 000000 000000 000000 000000 000000 006000 000001 000000 000000 103714 125367 172063 035361 131622 122532 113442 041667 175636 163010 161472 012454 164420 045711 063115 035633 170646 047625 151150 162730 150760 143220 025655 034767 000000 000000 000000 000000 000000 000000 000000 000000 140015 140014 140013 140013 140012 140012 140012 140012 110010 140007 140006 140006 110005 140005 140004 140004 140003 140003 140002 140001 140001 140001 040001 140000 000000 000000 000000 000000 000000 000000 000000 000000 000000 DKOTIL> Digital Internal Use Only 15-39 DKUTIL Lab Sample - 19 DKUTIL> DUMP/ALL RCT BLOCK 1 COpy 1 ~fkJl Blovl ****** RCT Block 1, Copy 1 ****** ****** Buffer for LBN 891072 
(000015 114300), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = = 000000 = 000 =a 0 0 V;SA'8L£ Lew = 0,0 = 000 114300 000015 114300 000015 114300 000015 114300 000015 - AI(.f!Je 5,.111) C "0/(.. /1 SeY"i~ 072340 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 #" , - - C()\o\ fv-t/( tAoec 000000 000000 000000 000000 000000 000000 000000 000000 174400 000002 000000 000000 vee 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 Calculated EDC Difference = 000000 DKUTIL> 15-40 BN = a (000000 000000) ECC Symbols Corrected Error Recovery Command Digital Internal Use Only 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 DKUTIL Lab Sample - 20 DKUTIL> DUMP/ALL LBN 891072 ****** Buffer for LBN 891072 (000015 114300), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = 000000 000 0,0,0 = BN 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Commandalculated EDC Difference 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 
000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 DKUTIL> Digital Internal Use Only 15-41 DKUTIL Lab Sample - 21 DKUTIL> DUMP/ALL LBN 100 Buffer for LBN 100 (000000 000144), MSCP Status: 000000 ****** Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header = 000144 = 063146 000000 000 0,0,0 000000 000144 000000 000144 000000 000144 000000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 
177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 EDC 063043 ECC 000000 000000 000000 000000 000000 000000 000000 000000 010400 000000 000000 000000 Calculated EDC Difference DKUTIL> 15-42 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 Digital Internal Use Only = 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 000000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 JCr":! t,i'-I f;rt,{ G(.(d ptt; ltd DKUTIL Lab Sample - 22 ~ DKUTIL> MODIFY 32 1111 2222 3333 4444 5555 6666 DKUTIL> DUMP/ALL BOFFER ****** Buffer for LBN 100 (000000 000144), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts 000000 000 0,0,0 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 Header = 000144 000000 000144 000000 000144 000000 000144 000000 Data = +16 -----> +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 063146 177400 002127 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 177776 177000 004256 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177774 176000 006405 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 
176000 177774 176000 177774 176000 177774 176000 177774 176000 177770 174000 010534 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177760 170000 012663 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 Calculated EDC Difference 177740 160000 015012 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 = EDC 063043 ECC 000000 000000 000000 000000 000000 000000 000000 000000 010400 000000 000000 000000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 <------ 000000 DKUTIL> Digital Internal Use Only 15-43 DKUTIL Lab Sample - 23 r .rct o~M 'f{\vJ 0~ ~ w~cL ~ . 
I Wo rG (Oc{pf) Me> t¢ DKUTIL> MODIFY 32 01111 02222 03333 04444 05555 06666 07777 DKUTIL> DUMP/ALL BUFFER ****** Buffer for LBN 100 (000000 000144), MSCP Status: 000000 Error Surmnary Original Error Bits Error Recovery Flags Error Retry Counts Header = 000144 Data = +16 -----> +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 15-44 063146 177400 001111 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 000 0,0,0 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 000000 000144 000000·alculated EDC Difference 177740 160000 006666 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 = 000000 EDC 063043 ECC 000000 000000 000000 000000 000000 000000 000000 000000 010400 000000 000000 000000 Digital Internal Use Only 177700 140000 007777 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 <------ DKUTIL Lab Sample - 24 DKUTIL> WRITE LBN 891071 DKUTIL> DUMP LBN 891071 ****** Buffer for LEN 891071 (000015 114277) Data = +16 +32 -----> +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 063146 177400 001111 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 
177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 = 167125 EDC 177776 177000 002222 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177774 176000 003333 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177770 174000 004444 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 I MSCP Status: 000000 177760 170000 005555 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 Calculated EDC Differenceigital Internal Use Only 15-45 DKUTIL Lab Sample - 25 DKUTIL> DKUTIL> REVEC~OR 891071 DKUTIL> DKUTIL> DISPLAY/FOLL RC~ DKUTIL> ERROR-W Bad Block Replacement (Success) at Command Ref 00000000 RA81 unit 230. Err Seq 1. Format Type 09 * ** ~~:~~ Flags Replace Flags LBN Old RBN ·New RBN Cause Event ERROR-I End of error. ~~14 DKUTIL> 15-46 S- UCI?Q~<;+\J ( 8000 891071. O. 17471. 
004A Digital Internal Use Only t 't I~, 9-May-1988 13:50:09.93 DKUTIL Lab Sample - 26 DKUTIL> DISPLAY/FULL ReT Revector Control Table for 00230 (RA81) Serial Number: Flags: 0000160992 000000 LBN Being Replaced: Replacement RBN: Bad RBN: 891071 (000015 114277) 17471 (060000 042077) 0 (060000 000000) Cache ID: Cache Incarnation: Incarnation Date: 0000000000 0 17-Nov-1858 00:00:00.00 11403 --> 15293 --> Bad RBN: 100244 --> 115561 --> 117394 --> 121987 --> 171620 --> 193520 --> 225652 --> 253500 --> 266355 --> 274359 --> 305893 --> 340423 *-> 374947 --> 389586 --> 391728 --> 442706 --> 497662 --> 567338 --> 578125 --> 659759 --> 672672 --> 682165 --> 685021 --> 687877 --> 699525 --> 744479 --> 773565 --> 814216 --> 859503 --> 889896 --> 223, 299, 1475, 1965, 2265, 2301, 2391, 3365, 3794, 4424, 4970, 5222, 5379, 5997, 6674, 7351, 7638, 7680, 8680, 9758, 11124, 11335, 12936, 13189, 13375, 13431, 13487, 13716, 14597, 15167, 15965, 16853, 17448, 13822 22714 83559 114133 116275 117703 150682 181380 210660 231370 254214 272069 275905 308464 365116 379896 390300 420792 446995 507686 567925 614076 671472 680737 682879 686449 688591 721806 746413 790535 814930 869680 891071 --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> .--> --> --> --> --> 271, 445, 1638, 2237, 2279, 2307, 2954, 3556, 4130, 4536, 4984, 5334, 5409, 6048, 7159, 7448, 7652, 8250, 8764, 9954, 11135, 12040, 13166, 13347, 13389, 13459, 13501, 14153, 14635, 15500, 15979, 17052, 17471, 14536 51330 93562 114847 116989 120021 164702 192415 212809 242077 262070 272623 282789 317579 365829 389162 391014 433639 493817 516964 577411 640975 671958 681451 684307 687163 689305 729217 768332 800493 815644 885017 --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> *-> --> --> --> --> --> --> --> 285, 1006, 1834, 2251, 2293, 2353, 3229, 3772, 4172, 4746, 5138, 5345, 5544, 6227, 7173, 7630, 7666, 8502, 9682, 10136, 11321, 
12568, 13175, 13361, 13417, 13473, 13515, 14298, 15065, 15695, 15993, 17353, <--- new replacement added for 891071 ReT Statistics: 1 97 95 2 0 Bad RBN Bad LBNs Primary Revectors Non-Primary Revectors Probationary RBNs DKUTIL> Digital Internal Use Only 15-47 DKUTIL Lab Sample - 'Z1 DKUTIL> DUMP/ALL LBN 891071 Buffer for LBN 891071 (000015 114277), MSCP Status: 000000 ****** Error Summary = header compare Original Error Bits Error Recovery Flags Error Retry Counts Header = = 004000 = 200 0,0,0 trA L'B. IV If if £I/I&r ,e: fiG PhllN,CtKj ~/¢ 15-48 Co"'"po!es ~a~ CocRL Digital Internal Use' Only 177700 140000 007777 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 = 000000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 l-h '( ~ Eta (S J)ATA ' ~ P.- DKUTIL Lab Sample - 28 DKUTIL> DUMP/ALL RBN 17471 ****** Buffer for RBN 17471 (060000 042077), MSCP Status: 000000 Error Summary 9riginal Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = 000000 000 0,0,0 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 042077 060000 042077 060000 042077 060000 042077 060000 063146 177400 001111 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 177776 177000 002222 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 
177776 177000 177776 177000 177776 177000 177774 176000 003333 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177770 174000 004444 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177760 170000 005555 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177740 160000 006666 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 = 000000 EDC 167125 ECC 000000 000000 000000 000000 000000 000000 000000 000000 052000 000001 000000 000000 Calculated EDC Difference 177700 140000 007777 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 DKUTIL> Digital Internal Use Only 15-49 DKUTIL Lab Sample - 29 DKUTIL> DUMP/ALL/RAW LBN 891071 ****** Buffer for LBN 891071 (000015 114277), MSCP Status: 000000 Error Summary Original Error Bits Error Recovery Flags Error Retry Counts Header Data +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = = 000000 000 0,0,0 114277 050015 114277 050015 114277 050015 114277 050015 000000 
000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 = EDC 000105 ECC 000000 000000 000000 000000 000000 000000 000000 000000 146400 000003 000000 000000 Calculated EDC Difference DKUTIL> 15-50 BN = 0 (000000 000000) ECC Symbols Corrected = 0,0 Error Recovery Command = 000 Digital Internal Use Only 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 
000000 000000 000000 000000 000000 000000 Or-I~ l~lcJ-f13'o + kv>/}1 :11 J. 7 DKUTl~ DKUTIL> DUMP RCT BLOCK 2 Te tl/J) /-;' v('p -< 0' '}- Jsed dlUh,'{B Lab Sample - 30 e1\ Ca,cli18 J.dt 1M f2C 't ****** RCT Block 2, Copy 1 ****** ****** Buffer for LBN 891073 (000015 114301), MSCP Status: 000000 Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 063146 177400 001111 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 000000 177400 EDC = 167125 177776 177000 002222 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177776 177000 177774 176000 003333 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177774 176000 177770 174000 004444 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177770 174000 177760 170000 005555 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177760 170000 177740 160000 006666 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 1777.40 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177740 160000 177700 140000 007777 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 140000 177700 
140000 177700 140000 177700 140000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 177600 100000 Calculated EDC Difference = 000000 DKUTIL> Digital Internal Use Only 15-51 DKUTIL Lab Sample - 31 DKUTIL> DUMP RCT BLOCK 3 ****** RCT Block 3, Copy 1 ****** ****** Buffer for LBN 891074 (000015 114302), MSCP Status: 000000 Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 = 000105 EDC 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 Calculated EDC Difference DKUTIL> 15-52 Digital Internal Use Onlyab Sample - 32 DKUTIL> DUMP ReT BLOCK 5 ****** RCT Block 5, Copy 1 ****** ****** Buffer for LBN 891076 (000015 114304), MSCP Status: 000000 Data +16 +32 +48 +64 
+80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 = 000000 EDC = 050431 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 034310 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 020000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 Calculated EDC Difference 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 = 000000 000000 000000 032776 000000 000000 000000 000000 000000 000000 035675 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 IJescv!pf-c i- Code:. o~-000000 ~(/o(. 
cfec.P R.J3 IJ (Pr'.'''I'~) 020000 <--000000 000000 000000 000000 000000 000000 020000 000000 000000 oooboo 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 DKUTIL> Digital Internal Use Only 15-53 DKUTIL Lab Sample - 33 DKUTIL> DUMP RCT BLOCK 9999 DKUTIL-E 9999 is an invalid BLOCK number; maximum is 139. DKUTIL> 15-54 Digital Internal Use Only t".::.e -: Use fr; Lor'1 Se.e MAt SftlMPlt tcr DKUTIL Lab Sample - 34 DKUTIL> DUMP RCT BLOCR 139 ****** RCT Block 139, Copy 1 ****** ****** Buffer for LBN 891210 (000015 114512), MSCP Status: 000000 Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 -----> :+~40 +256 +272 +288 +304 +320 +336 +352 +368 +384 +400 +416 +432 +448 +464 +480 +496 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 112050 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 EDC = 071732 DKUTIL> 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 020015 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 100000 toooooalculated EDC Difference = 000000 Eo V\J c f 1~:do{€.- (of .p.lr.;s ccp~ ') Digital Internal Use Only 15-55 DKUTIL Lab Sample - 35 DKUTIL> MODIFY 0 0177777 DKUTIL-E "MODIFY" is an invalid command. DKUTIL> 15-56 Digital Internal Use Only DKUTIL Lab Sample - 36 DKUTIL> DISPLAY ReT Revector Control Table for D0232 (RA81) 0000160978 000000 Serial Number: Flags: 19784 47784 81181 94334 99436 114316 130442 153918 156330 184367 203282 218759 291924 349045 438151 458862 488026 509499 --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> --> 387, 20143 --> 394, 947, 936, 48344 --> 1591, 93337 --> 1830, 95762 --> 1877, 1849, 1949, 111798 --> 2192, 2241, 115744 --> 2269, 2557, 153071 --> 3001, 3047, 30181. 
155437 --> 3065, 176124 --> 3453, 3615, 192131 --> 3767, 3985, 203947 --> 3998, 4289, 227509 --> 4460, 5724, 293960 --> 5763, 6844, 358890 --> 7037, 8591, 438994 --> 8607, 8997, 468130 --> 9179, 9569, 494505 --> 9696, 9990, 512513 --> 10049, " " " 80~040 810470 811182 813326 838216 853831 --> --> *-> --> --> --> 15863, 15891, 15906, 15947, 16435, 16741, ReT Statistics: 27548 --> 540, 1179, Bad RBN: 93620 --> 1835, 97853 --> 1918, 113602 --> 2227, 125505 --> 2460, 153204 --> 3004, 156158 --> 3061, 183653 --> 3601, 195256 --> 3828, 213933 --> 4194, 254669 --> 4993, 348920 --> 6841, 364645 --> 7149, 457022 --> 8961, 486718 --> 9543, 498756 --> 9779, 5174.16 --> 10145, " 809756 810468 811898 813322 848022 886659 --> *-> --> *-> --> --> 15877, 15892, 15919, 15948, 16627, 17385, 809754 811184 812612 814036 848736 1 157 153 4 0 Bad RBN Bad LBNs Primary Revectors Non-Primary Revectors Probationary RBNs *-> --> --> --> --> 15878, 15905, 15933, 15961, 16641, DKUTIL> Digital Internal Use Only 15-57 DKUTIL Lab Sample - 37 DKUTIL> DISPLAY ReT Revector Control Table for D0117 (RA81) Serial Number: Flags: 3067705207 133331 LBN Being Replaced: Bad RBN: 3067721433 (133331 133331) 3067721433 (173331 133331) Cache ID: Cache Incarnation: Incarnation Date: 3067721433 3067721433 5824 06:31:06.85 ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* ******* *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> "'-> *-> *-> *-> *-> *-> *-> 0, ******* 3, ******* 6, ******* 9,******* 12,******* 15,******* 18, ******* 21, ******* 24, ******* 27,******* 30,******* 33,******* 36,******* 39, ******* 42, ******* 45,******* 48,******* 51,******* *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> DKUTIL-I CTRL/Y or CTRL/C Abort HSC70> 15-58 Digital Internal Use Only 1, ******* 4, ******* 7,******* 10,******* 13,******* 16,******* 19,******* 22, ******* 25,******* 28,******* 31, *Yr***** 34, ******* 
36,******* 40, ******* 43,******* 46,******* 49, ******* 52,** .... C *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> *-> 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, DKUTIL Lab Sample - 38 DKUTIL> DUMP RCT BLOCK 1 1) e-S u!Iof' '6f;() b? K ~ <; ur ****** RCT Block 1, Copy 1 ****** ****** Buffer for LBN 891072 (000015 114300), MSCP Status: 000000 Data = +16 +32 +48 +64 +80 +96 +112 +128 +144 +160 +176 +192 +208 +224 +240 +256 +272 +288 +304 +320 +336 +352 ~368 +384 +400 +416 -1-432 +448 +464 +480 +496 EDC = 073567 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 140753 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 133331 Calculated EDC Differenceigital Internal Use Only 1 >-59 DKUTIL Lab Sample - 38 15-60 Digital Internal Use Only DSA TROUBLESHOOTING COURSE Lab Exercise #3 RAUTIL Digital Internal Use Only 1 RAUTIL Lab Lab Exercise 3 Refer to the RAUTIL section of the Student Guide during this exercise. 1. Log into your student account 2. 
$ SET DEF [STUDENTx.RAUTIL]
3. $ SET PROCESS/PRIV=ALL
4. Specify the disk assigned to you by the instructor.
5. $ SHOW DEV D
   If your disk is not mounted, mount it using:
   $ MOUNT/FOR
   Device:
   Also make a note of some other mounted disks (disk names) for use later during the exercise.
6. $ RUN RAUTIL
   Select the disk assigned to you for this exercise. When prompted, select an output log file so that you may obtain a hardcopy of the results after the exercise.
7. Execute the HELP command.
8. Execute the ANALYZE command.
9. Execute the SUMMARY command.
   Write down the head numbers that have replacements logged.
10. Execute the HEAD command for each of the heads on the disk that have replacements logged in the SUMMARY command. Use CONTROL-C to get out of the head selection prompt mode and back to the normal RAUTIL mode.
11. Execute the DD command.
12. Execute the TL command.
13. Execute the TR command.
14. Execute the DUMP command.
    Read the instructions in the RAUTIL section of the Student Manual for proper command formats.
    Dump some random LBNs in the host area.
    Dump blocks 0, 1, and 2 from the RCT.
15. Using the DUMP command, locate an unused LBN that contains either all zeros or the DEC Standard Format Pattern. On an RA81, look in the vicinity of LBN 891050. Try the WRITE command using different write patterns of your choice, and use the DUMP command to verify the results.
16. Using the available LBN found during the previous step, execute the MODIFY command. Pick your own patterns and bytes to modify. Use the DUMP command to verify your modifications.
17. Execute the BBR command.

    NOTE
    If you are using a disk that is connected to an HSC controller, you will get a message that indicates you should use DKUTIL. This is normal, as the HSC controls ALL BBR activity and prevents RAUTIL from performing this function. Have your instructor clarify this point if necessary.

18. Use the NEXT command and select another disk drive. Select one of the drives that is already mounted. (Refer to your notes from step 5.)
19. Use the ANALYZE, SUMMARY, and HEAD commands to review these drives.
20. If time permits, re-select the original drive assigned to you by the instructor, and try the SCRUB command.
21. Read the TROUBLESHOOTING section of the RAUTIL user guide in your Student Manual.

CHAPTER 16
RAUTIL USER GUIDE

16.1 OVERVIEW

RAUTIL is an executable image that runs under VMS and can examine RA-based drives that have previously been mounted using the VMS MOUNT command. RAUTIL allows the user to examine disks that are already mounted and in interactive use by the system without interfering with another user or that user's data. If the user intends only to perform read operations to examine the disk structures and contents, the drive does not have to be dismounted. If the drive is mounted FOREIGN, the user is allowed to write, modify, or replace an LBN.

Some of the capabilities of RAUTIL are:

   Display the number of replacements that exist on a drive.
   Show the relative position of the bad block on a given track to identify where scratches and other defects may exist.
   Determine if any replaced blocks contain a forced error.
   Display the contents of an LBN within the host area or the RCT area.
   Scrub the disk to cause automatic replacements to occur.
   Manually force replacement of a block (xDA controllers only).
   Modify the contents of an LBN in the host area or RCT area.

16.1.1 Restrictions

Some features of RAUTIL cannot be executed for drives connected to an HSC-type controller. This is due to the way the program is constructed and the limitations that the HSC presents to RAUTIL. These restrictions are noted throughout this document. All functionality of RAUTIL is available to xDA-type controllers.
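The mount and controller rules stated in the overview can be summarized as a small lookup. The following is only a sketch in Python: the function name and the operation labels are hypothetical and are not part of RAUTIL, and it covers only the restrictions stated in this part of the guide (other HSC limitations may apply to individual commands).

```python
# Hypothetical summary of the rules stated in Sections 16.1 and 16.1.1.
# None of these names exist in RAUTIL; they are illustrative only.

READ_OPS = {"read"}
WRITE_OPS = {"write", "modify"}


def allowed_operations(mounted_foreign: bool, controller: str) -> set:
    """Return the operation classes the overview says are available.

    controller is the controller family: "xDA" or "HSC".
    """
    if controller not in ("xDA", "HSC"):
        raise ValueError(f"unknown controller family: {controller}")

    ops = set(READ_OPS)            # examining a mounted disk is always allowed
    if mounted_foreign:
        ops |= WRITE_OPS           # writing requires a FOREIGN mount
        if controller == "xDA":
            ops.add("replace")     # manual replacement is xDA-only
    return ops
```

For example, a drive mounted FOREIGN behind an HSC would allow "read", "write", and "modify" under this sketch, but not "replace".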
16.2 GETTING STARTED

16.2.1 Compile RAUTIL.MAR

RAUTIL makes use of several system references that differ from system to system. For this reason, RAUTIL must be compiled on the target VMS system or on any one of the system nodes in a VAX/VMS cluster. This may be accomplished using the following steps:

1. Obtain a copy of the source input file RAUTIL.MAR.
2. At the VMS prompt, enter the following commands:

   $ MACRO RAUTIL+SYS$LIBRARY:LIB/LIB
   $ LINK RAUTIL+SYS$SYSTEM:SYS.STB+SCSDEF

To run RAUTIL, you must have PHY_IO (physical I/O) privilege and CMKRNL (change mode to kernel) privilege.

16.2.2 Invoke RAUTIL.EXE

1. Select a target drive to interrogate or test. To determine which drives are available, type SHOW DEVICE D at your terminal. If the target drive is not mounted, mount it foreign. If it is mounted and you wish to perform write functions to the drive, you will have to dismount it and remount it foreign. The program performs physical QIO functions, so be sure you have enabled PHY_IO privilege. To enable the privilege, type SET PROC/PRIV=PHY_IO at your terminal. If you intend to write to the disk, you must also set CMKRNL privilege.

2. Start the program by typing RUN RAUTIL at your terminal. A sample of the dialog follows.

At the first prompt, "what device ?", enter the disk you wish to test. This must be a device name that appeared when you issued the SHOW DEVICE D command. For shadowed disks, be sure to select the device name of a member of the shadow set and not the shadow set name.

Alternatively, you may enter one of the following generic device names rather than a specific device:

   RA60, RA70, RA80, RA81, RA82, RA90

This allows the program to be used for information-only purposes to perform translations for the selected device type.
The following commands may then be used:

   DD        Display device parameters
   EXIT      Exit the program
   HELP      Display command summary
   NEXT      Select another "real" device or device type
   TL        Translate LBN
   TR        Translate RBN

Commands that require selection of a specific device will be non-operational when you select a generic device.

The next prompt will ask if you wish to create a log file. If you respond with "Y", a log file containing a copy of the commands and responses displayed on your terminal during the RAUTIL session will be generated.

The next prompt will ask if you want to verify the RCT consistency. If the controller is an xDA type and you wish to check RCT consistency, type "Y". This will verify the multiple copies of the RCT. If you are connected to an HSC controller, the consistency cannot be checked, since the HSC will only allow the host to read RCT copy 1.

The next prompt will be the main RAUTIL prompt. From this point, any of the functions listed in Section 16.3 or Section 16.4 can be executed, provided you have the proper privileges.

Sample dialog of RAUTIL startup

   $ RUN RAUTIL
   RA drive analysis utility version 9.3, type "HELP" for help
   what device ? $1$DOA40
   create a log file ? Y
   creating log file $1$DOA40.dat
   device is an RA81, serial# 137579, attached to a HSC70, on node (SLEAZY), error count is 0
   do you want to verify rct copy consistency ? (y/n) N
   RAUTIL>

16.3 COMMAND SUMMARY

The following list summarizes the commands that may be used with RAUTIL. A more detailed explanation of these commands and some examples are found in Section 16.4.

   ANALYZE   Analyze and list all replacements and verify RCT/replaced LBNs.
   BBR       Manually replace a bad block (xDA controllers only).
   DD        Display device parameters.
   DUMP      Display the contents of a block in the user LBN area or the RCT.
   EXIT      Exit the program or exit to the main RAUTIL prompt.
   HEAD      Display replacements for a single head or all heads listed individually.
   HELP      Display a summary list of RAUTIL commands.
   MODIFY    Modify the contents of an LBN in the user LBN area or in the RCT.
   NEXT      Release the current drive and select a new drive to examine.
   SCRUB     Scrub the disk.
   SUMMARY   Summarize replacements only.
   TL        Translate LBN.
   TR        Translate RBN.
   WRITE     Write an LBN with a pattern.

16.4 COMMAND DETAILS AND EXAMPLES

This section explains each individual RAUTIL command in more detail, including examples of command entries and responses. In these examples, text enclosed in parentheses () is informational only. Unless otherwise noted, the following commands should work for all disk controllers in the DSA environment (HSC, UDA, KDA, KDB, etc.).

16.4.1 ANALYZE

The RCT is searched for replacements, and all replacements are verified. All replacements are also checked for the occurrence of a forced error. If a forced error is detected, it is reported on the user terminal. Only LBNs that have been replaced are actually read by this function.

Blocks that contain forced errors are generally the result of BBR (bad block replacement) where the original data was uncorrectable. In some instances, blocks could be flagged with a forced error and not be replaced. These blocks will not be seen or displayed by the ANALYZE command. To locate non-replaced blocks with forced errors, EDC errors, or other problems, use the SCRUB command.

All replacements are listed on the terminal in order of ascending LBNs, then rolled into a summary on the terminal. The summary shows the replacements allocated by physical head and categorized by primary replaced LBNs, non-primary replaced LBNs, and bad RBNs.

Refer to the following example for an illustration of using the ANALYZE command. Following the example is a legend explaining most of the terminology used in the display.

Example 16-1: ANALYZE Command

RAUTIL> ANALYZE

 DB        LBN        RBN    CYL  HEAD  POS    DESC      TYP
----------------------------------------------------------------
(  2.)      56. =>     2.     0.    1.  19.  3000 0038  NON-PRI
(  2.)     343. =>     6.     0.    6.  17.  2000 0157  PRI
(  3.)    8514. =>   166.    11.   12.   8.  2000 2142  PRI
(  6.)   26341. =>   516.    36.   12.  37.  2000 66E5  PRI
(  6.)   27260. =>   534.    38.    2.   2.  2000 6A7C  PRI
(  7.)   36333. =>   712.    50.   12.  33.  2000 8DED  PRI
(  7.)   38838. =>   761.    54.    5.  45.  2000 97B6  PRI
( 12.)   68754. =>  1348.    96.    4.  10.  2001 0C92  PRI
( 13.)   72438. =>  1420.   101.    6.  50.  2001 1AF6  PRI
( 13.)  ******* =>  1533.   109.    7.  45.  4000 0000  UNUSABLE
( 14.)   82039. =>  1608.   114.   12.  43.  2001 4077  PRI
( 19.)  114914. =>  2253.   160.   13.  37.  2001 C0E2  PRI
( 20.)  ******* =>  2333.   166.    9.  21.  4000 0000  UNUSABLE
   "          "          "
   "          "          "
   "          "          "
(137.)  882628. => 17306.  1236.    2.  50.  200D 77C4  PRI
(137.)  882695. => 17307.  1236.    3.  28.  200D 7807  PRI
(137.)  884817. => 17349.  1239.    3.   8.  200D 8051  PRI
(137.)  887751. => 17406.  1243.    4.  49.  200D 8BC7  PRI
(138.)  888465. => 17420.  1244.    4.  49.  200D 8E91  PRI
(138.)  888520. => 17421.  1244.    5.  15.  200D 8EC8  PRI

replacement by head for the RA81, unit# $5$DOA230, serial # 137579

HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
------------------------------------------------------------------------
PRI        0   2  46  25  17   6   8  13   2   6   5   5  16  24    175
NON-PRI    0   1   1   0   0   0   0   1   0   0   0   0   0   0      3
BAD-RBN    0   1   1   0   0   0   0   1   0   1   0   0   1   0      5
TOTAL      0   4  48  25  17   6   8  15   2   7   5   5  17  24    183

RAUTIL>

Table 16-1: Legend for ANALYZE Command

   RBN     Replacement block. This is the block that currently contains the data for the corresponding logical block listed in the LBN column.
   DB      Descriptor block number. This is the relative block number within the RCT that contains the descriptor for this entry.
   LBN     Logical block. This is a block from the usable host area that has been replaced as described in the RCT.
   CYL     Cylinder. This is the physical cylinder containing the corresponding LBN.
   HEAD    Disk R/W head. This is the physical R/W head that would be used to read the corresponding LBN. It also describes the media surface or a portion of the media surface containing the LBN.
    POS   (Position from index.) This is the physical sector from index that corresponds to the LBN.
    DESC  (Descriptor contents.) This is the hexadecimal contents of the specific RCT descriptor that corresponds to the LBN/RBN entry listed in the output.
    TYP   (Type of descriptor.) This is a translation of the descriptor code field within the RCT descriptor.
              PRI      = Primary replaced LBN
              NON-PRI  = Non-primary replaced LBN
              UNUSABLE = Bad RBN

NOTE
Asterisks (*******) in the LBN column indicate that an RBN descriptor was found in the RCT table referencing that particular RBN as bad and unusable.

16.4.2 Manual Bad Block Replacement (BBR)

When executed on a drive connected to an xDA-type controller, this command allows the user to manually force the replacement of a specified LBN in the host/user LBN area. This command cannot be executed on HSC controllers. (Use the utility DKUTIL for this purpose when your disk is connected to an HSC controller.) A sample of the results is shown below.

    RAUTIL> BBR
    what lbn ? 891020
    flags.... lbn...... new rbn.. old rbn..
    P 2 891020 17470 0
    is replacement information correct ? Y
        (The information in this area is for information only. You will normally
        answer "Y" (yes) unless you wish to abort the replacement operation.)
    replacing LBN 891020
    RAUTIL>

16.4.3 DD - Display Drive

This command displays the device parameters for the currently selected drive. A sample of the results is shown below.

    RAUTIL> DD
    drive is an RA81
    lbns per trk ......... 51
    trks per group ....... 1
    groups per cyl ....... 14
    group offset ......... 14
    rbns per track ....... 1
    number of heads ...... 14
    number of host lbns .. 891072
    rct copy size ........ 765
    number of rct copies . 4
    RAUTIL>

16.4.4 DUMP

This command allows you to display the contents of data in both the host/user LBN area and blocks in the RCT.
It is particularly useful for reviewing suspected bad data blocks or structures in the RCT. Refer to the following example for an illustration of using the DUMP command.

To perform this operation, you must first enter the command DUMP at the RAUTIL> prompt. This will put the program into the dump mode of operation.

When selecting LBNs in the host area, use the following format:

    dmp> L xxxx

where xxxx is the LBN number.

When selecting blocks in the RCT area, use the following format:

    dmp> R x,y

where x is the relative block number within a particular copy of the RCT, and y is the copy number.

The dump display is organized into 32-bit longwords. Each longword is subdivided into two 16-bit words (hexadecimal). There are four longwords per line, numbered from right to left starting with zero. When using the L xxxx format, the program will also attempt to translate the contents of the block into ASCII and display it in the far right columns.

Use CONTROL-C to exit from the DUMP mode of operation. This will cause a return to the RAUTIL> prompt.

When using this command with HSC controllers, only the first copy of any RCT block will be displayed, regardless of which copy you specify in the "R x,y" format.

16.4.5 EXIT

This command causes an exit from RAUTIL when entered from the RAUTIL> prompt. When using functions within RAUTIL (such as the DUMP or the HEAD commands), use CONTROL-C to exit and return to the RAUTIL> prompt.

16.4.6 HEAD

This command allows you to examine the failures on a head-by-head basis to see where scratches exist on the media. At the RAUTIL> prompt, enter the command HEAD. When prompted "What head ?", you must respond with the particular head number (decimal). Entering ALL will cause a replacement listing on a head-by-head basis for all heads individually.

Use CONTROL-C to exit from the HEAD mode of operation. The RAUTIL> prompt will return. An example follows.
Following the example is a legend explaining most of the terminology used in the display.

    RAUTIL> HEAD
    What head ? 12

      DB       LBN       RBN   PCYL  GRP HEAD  POS  TYP
    -------------------------------------------------------
    (  3.)    8514. =>   166.    11.  12.  12.   8.  PRI
    (  6.)   26341. =>   516.    36.  12.  12.  37.  PRI
    (  7.)   36333. =>   712.    50.  12.  12.  33.  PRI
    ( 14.)   82039. =>  1608.   114.  12.  12.  43.  PRI
    ( 37.)  229123. =>  4492.   320.  12.  12.  43.  PRI
    ( 48.)  ******* =>  5892.   420.  12.  12.  11.  UNUSABLE
    ( 56.)  352638. =>  6914.   493.  12.  12.  36.  PRI
    ( 63.)  402638. =>  7894.   563.  12.  12.   4.  PRI
    ( 69.)  440459. =>  8636.   616.  12.  12.  35.  PRI
    ( 79.)  503313. =>  9868.   704.  12.  12.   5.  PRI
    ( 83.)  531854. => 10428.   744.  12.  12.  38.  PRI
    (100.)  643937. => 12626.   901.  12.  12.  23.  PRI
    (101.)  649639. => 12738.   909.  12.  12.  13.  PRI
    (102.)  656793. => 12878.   919.  12.  12.  27.  PRI
    (107.)  688219. => 13494.   963.  12.  12.  37.  PRI
    (120.)  774614. => 15188.  1084.  12.  12.  38.  PRI
    (136.)  877433. => 17204.  1228.  12.  12.  41.  PRI

    replacement by head for the RA81, unit# $5$DUA230, serial # 137579

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
    -----------------------------------------------------------------------
    PRI        0   0   0   0   0   0   0   0   0   0   0   0  16   0     16
    NON-PRI    0   0   0   0   0   0   0   0   0   0   0   0   0   0      0
    BAD-RBN    0   0   0   0   0   0   0   0   0   0   0   0   1   0      1
    TOTAL      0   0   0   0   0   0   0   0   0   0   0   0  17   0     17

    What head ?
    RAUTIL>

Table 16-2: Legend for HEAD Command

    DB    Descriptor block number. This is the relative block number within the RCT that contains the descriptor for this entry.
    LBN   Logical block. This is a block from the usable host area that has been replaced as described in the RCT.
    RBN   Replacement block. This is the block that currently contains the data for the corresponding logical block listed in the LBN column.
    PCYL  Physical cylinder. This is the physical cylinder containing the corresponding LBN.
    GRP   This is the logical group number within the cylinder that contains the LBN.
    HEAD  Disk R/W head. This is the physical R/W head that would be used to read the corresponding LBN. It also describes the media surface or a portion of the media surface containing the LBN.
    POS   (Position from index.) This is the physical sector from index that corresponds to the LBN.
    DESC  (Descriptor contents.) This is the hexadecimal contents of the specific RCT descriptor that corresponds to the LBN/RBN entry listed in the output.
    TYP   (Type of descriptor.) This is a translation of the descriptor code field within the RCT descriptor.
              PRI      = Primary replaced LBN
              NON-PRI  = Non-primary replaced LBN
              UNUSABLE = Bad RBN

NOTE
Asterisks (*******) in the LBN column indicate that an RBN descriptor was found in the RCT table referencing that particular RBN as bad and unusable.

16.4.7 HELP

The HELP command displays a summary of all commands on your terminal. The summary displayed is similar to that listed in Section 16.3 of this document.

16.4.8 MODIFY

The MODIFY command allows you to modify an LBN on a longword boundary. The drive must be mounted FOREIGN and you must have CMKRNL (change mode kernel) privileges to execute this command. An example follows.

In this example the user entered the MODIFY command at the RAUTIL> prompt. When prompted for the LBN to modify, the user entered "891050." The program then dumps the current contents of the specified LBN. Next, you are prompted for the number of a longword to modify. In this example, the user selected longword number 4. The program then requests the pattern that is to be entered into the specified longword. This value must be 1 to 8 characters long, specified in hexadecimal format. Values less than 8 characters are zero-filled for the entire longword.

The program continues to request longword numbers and patterns. When you are satisfied that all modifications have been entered, enter the WRITE command at the longword number prompt to terminate the changes.
At this point, the program will display the dump buffer showing all the modifications performed. You will then be prompted to write the record. A "Y" will write the LBN (including modifications) back to the disk. Any other response will abort the MODIFY operation.

This command may not be used to modify blocks in the RCT on an HSC controller. Use the HSC utility DKUTIL for this purpose.

    RAUTIL> MODIFY
    what lbn ? 891050
    0000 0000 0000 0000 0000 0000 0000 0000     0  ................
    0000 0000 0000 0000 0000 0000 0000 0000     4  ................
    0000 0000 0000 0000 0000 0000 0000 0000     8  ................
    0000 0000 0000 0000 0000 0000 0000 0000    12  ................
    0000 0000 0000 0000 0000 0000 0000 0000    16  ................
                      "                          "          "
    0000 0000 0000 0000 0000 0000 0000 0000   120  ................
    0000 0000 0000 0000 0000 0000 0000 0000   124  ................
    longword to modify, or "WR" to write the block    4
    what pattern (hex) ? 12345678
    longword to modify, or "WR" to write the block    15
    what pattern (hex) ? FFFFF
    longword to modify, or "WR" to write the block    WRITE
    the modified record contents is:
    0000 0000 0000 0000 0000 0000 0000 0000     0  ................
    0000 0000 0000 0000 0000 0000 1234 5678     4  xV4.............
    0000 0000 0000 0000 0000 0000 0000 0000     8  ................
    000F FFFF 0000 0000 0000 0000 0000 0000    12  ................
    0000 0000 0000 0000 0000 0000 0000 0000    16  ................
                      "                          "          "
    0000 0000 0000 0000 0000 0000 0000 0000   120  ................
    0000 0000 0000 0000 0000 0000 0000 0000   124  ................
    write the record (y/n) : Y
    RAUTIL>

16.4.9 NEXT

The NEXT command deselects the current drive and allows you to request the next drive or select another generic device type. (See Section 16.2.2.)

NOTE
The NEXT command only releases the channel associated with the previous drive, and no VMS dismounts are attempted. If a log file was generated for the previous drive, it will be closed.
After executing the NEXT function, you will be prompted for another log file (optional) for the new device selected. An example follows:

    RAUTIL> NEXT
    what device ? NERMAL$DUA33
    create a log file ? N
    device is an RA81, serial # 137427, attached to a UDA50, error count is 0
    do you want to verify rct copy consistency ? (y/n) Y
    RAUTIL>

16.4.10 SCRUB

The SCRUB command causes the entire user LBN area of the disk to be scanned in either full-track or one-sector mode. You may select the starting LBN and the mode.

The term SCRUB means scanning the entire disk, reading all host/user LBNs, and replacing any blocks that fail the BBR (bad block replacement) algorithm. With this command, all host/user LBNs are scanned and read sequentially. Any blocks read by a DSA controller that cause the BBR flag to set will ultimately cause the BBR software to be invoked. Blocks that fail the BBR test software will, in fact, be replaced.

On VAX/VMS systems using HSC controllers, the HSC performs the actual test and replacement process. On VAX/VMS systems using xDA-type controllers, VMS performs the test and replacement process.

The SCRUB function of RAUTIL merely provides the means to force the entire host/user area of a DSA disk to be scanned and read using the existing BBR functionality within the system. If you wish to force a block replacement, use the BBR command described in Section 16.4.2.

There is another benefit in using the SCRUB command. Since this command forces a read of all host/user LBNs, any blocks that contain forced errors, EDC errors, or exhibit other problems will also be displayed. This is somewhat different from the ANALYZE command, which only reads replaced LBNs to find forced errors.

Since RAUTIL may be interrogating (or scrubbing) a disk that is being interactively accessed by other users, the time to complete the scrub function varies depending upon system load, disk activity, or controller usage.
This operation typically requires from 10 minutes to over an hour to complete. If single sector mode is selected, it takes considerably longer.

Example 16-2 shows the results of a SCRUB operation.

Example 16-2: SCRUB Operation

    RAUTIL> SCRUB
    Starting LBN ? 0
    Single sector scrub (y/n) ? N
    starting scrub at 23-SEP-1987 13:26:36.79
    starting error count is 0
    reading lbn      51. at Lcyl     0.
    reading lbn    5763. at Lcyl     8.
    reading lbn   11475. at Lcyl    16.
    reading lbn   17187. at Lcyl    24.
    reading lbn   22899. at Lcyl    32.
    reading lbn   28611. at Lcyl    40.
       "            "                "
       "            "                "
    reading lbn  856851. at Lcyl  1200.
    reading lbn  862563. at Lcyl  1208.
    reading lbn  868275. at Lcyl  1216.
    reading lbn  873987. at Lcyl  1224.
    reading lbn  879699. at Lcyl  1232.
    reading lbn  885411. at Lcyl  1240.
    scrub completed at 23-SEP-1987 13:46:40.29
    ending error count is 0
    RAUTIL>

16.4.11 SUMMARY

The SUMMARY command is similar to the ANALYZE command except that only a summary report of the replacements is displayed on the user terminal. An example of the SUMMARY command follows.

Example 16-3: SUMMARY Command

    RAUTIL> SUMMARY

    replacement by head for the RA81, unit# $5$DUA230, serial # 137579

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
    -----------------------------------------------------------------------
    PRI        0   2  46  25  17   6   8  13   2   6   5   5  16  24    175
    NON-PRI    0   1   1   0   0   0   0   1   0   0   0   0   0   0      3
    BAD-RBN    0   1   1   0   0   0   0   1   0   1   0   0   1   0      5
    TOTAL      0   4  48  25  17   6   8  15   2   7   5   5  17  24    183

    RAUTIL>

16.4.12 TL - TRANSLATE LBN

This command translates an LBN into its cylinder, group, track, and sector. An example of this command follows.

Example 16-4: TL Command

    RAUTIL> TL
    what lbn ? 8943
    pri rbn ..................... 175
    lbns protected by pri rbn ... 8925-8976
    cyl ......................... 12
    grp ......................... 7
    trk ......................... 0
    head ........................ 7
    position from index ......... 12
    phys block number ........... 9112
    rct blk ..................... 3
    offset into rct blk ......... 47
    lbn addr of rct block ....... 891075
    rbn range ................... 128-255
    what lbn ?
    RAUTIL>

16.4.13 TR - TRANSLATE RBN

This command translates an RBN into its cylinder, group, track, and sector. An example of this command follows.

Example 16-5: TR Command

    RAUTIL> TR
    what rbn ? 983
    rbn ......... 983
    cyl ......... 70
    grp ......... 3
    trk ......... 0
    head ........ 3
    pos ......... 41
    phy blk num . 51157
    what rbn ?
    RAUTIL>

16.4.14 WRITE

The WRITE command provides the capability to write a longword pattern to an LBN and fill the entire block. The longword pattern will be repeated 128 times in the block. The drive must be mounted FOREIGN and the user must have CMKRNL (change mode kernel) privileges to execute this command.

This command may not be used to write blocks into the RCT on an HSC controller. Use the HSC utility DKUTIL for this purpose.

In the example below, the user is prompted for the LBN to write and the pattern to use. Only one LBN may be specified at a time. The pattern entered must be a hexadecimal value from 1 to 8 characters long. If fewer than 8 characters are specified, the longword is zero-filled. The example also shows the use of the DUMP command to verify the contents of the specified LBN that was written.

Example 16-6: WRITE Command

    RAUTIL> WRITE
    what lbn ? 891050
    what pattern (hex) ? 1111FFFF
    RAUTIL> DUMP
    dmp> L 891050
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF     0  ................
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF     4  ................
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF     8  ................
                      "                          "          "
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF   116  ................
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF   120  ................
    1111 FFFF 1111 FFFF 1111 FFFF 1111 FFFF   124  ................
    dmp>
    RAUTIL>

16.5 TROUBLESHOOTING and USING RAUTIL

The following section describes some uses for RAUTIL and what information can be derived from its operation. This section does not include all the capabilities of RAUTIL but provides some basic samples and interpretations.

16.5.1 Radial Scratches

Refer to the following example, in which the user has used the HEAD command to analyze the LBN replacements associated with head 4 on an RA80. Notice that there are 18 line entries that all have the same POS value of 21. This represents 18 logical blocks that have been replaced. The common factor is that these blocks are all positioned on the same physical sector (sector 21) from index. Also note that most of the values displayed in the PCYL (physical cylinder) column indicate that these LBNs are on adjacent cylinders.

Since this display represents replaced blocks associated with the same head, we know that they are all on the same surface (or a portion of the same surface for the RA80) and, therefore, on adjacent tracks. Given these facts, we can conclude that this display probably represents some radial scratches on the media. A radial scratch (or defect in the media) is one that generally has the property of being aligned perpendicular to the rotation of the media. In this example, there are three radial scratches.

1. Scratch #1 is positioned at physical sector 21 and crosses 6 adjacent tracks (noted by cylinders 58 through 63).
2. Scratch #2 is positioned at physical sector 21 and crosses 12 adjacent tracks (noted by cylinders 107 through 118).
3. Scratch #3 is positioned at physical sector 30 and crosses 3 adjacent tracks (noted by cylinders 174 through 176).
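The grouping described above (same sector position, adjacent cylinders, same head) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(cylinder, position)` tuples are typed in by hand from the head 4 example rather than parsed from a RAUTIL log, the `find_radial_runs` helper and its `min_tracks` threshold are hypothetical, and the length estimate assumes one cylinder step equals one track pitch at the 478 tracks/inch figure this section quotes for the RA80.

```python
from itertools import groupby

# (physical cylinder, sector position from index) for replaced LBNs on one head.
# Values follow the head 4 example in this section; the tuple layout is an
# assumption made for this sketch, not a RAUTIL output format.
replacements = (
    [(cyl, 21) for cyl in range(58, 64)]
    + [(cyl, 21) for cyl in range(107, 119)]
    + [(cyl, 30) for cyl in range(174, 177)]
)

TRACKS_PER_INCH = 478  # RA80 figure quoted in this section


def find_radial_runs(entries, min_tracks=3):
    """Return (pos, first_cyl, last_cyl) for runs of adjacent cylinders
    whose replaced blocks share the same sector position."""
    runs = []
    ordered = sorted(set(entries), key=lambda e: (e[1], e[0]))
    for pos, group in groupby(ordered, key=lambda e: e[1]):
        cyls = [cyl for cyl, _ in group]
        start = prev = cyls[0]
        # The trailing None flushes the last run for this position.
        for cyl in cyls[1:] + [None]:
            if cyl is not None and cyl == prev + 1:
                prev = cyl
                continue
            if prev - start + 1 >= min_tracks:
                runs.append((pos, start, prev))
            if cyl is not None:
                start = prev = cyl
    return sorted(runs, key=lambda r: r[1])


runs = find_radial_runs(replacements)
for pos, first, last in runs:
    tracks = last - first + 1
    print(f"sector {pos}: cylinders {first}-{last}, "
          f"{tracks} tracks, about {tracks / TRACKS_PER_INCH:.3f} in")
```

Run against two snapshots of the same head taken at different times, the same grouping would show whether a run has gained tracks, which is the growth symptom this section asks you to watch for.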
It is also possible that scratches 1 and 2 are the same scratch and that the tracks between cylinders 63 and 107 are not so severely affected by the defect as to cause block replacement. It is also possible that the defect was created by a "skipping" action, in the same way a rock can be made to skip across a pond.

Radial defects are generally not caused while the media is spinning but rather when it is stationary. There are probably a hundred reasons why scratches occur. Some of them include excessive mishandling of an HDA, misuse of the HDA lock lever, drive failures causing head movement while the media is not spinning, manufacturing defects, and so on.

Note that since there are 478 tracks/inch in an RA80 HDA, the largest of these scratches is only about 0.025 inches long. Also remember that you are observing the analysis of LBNs whose data has been relocated to RBNs elsewhere on the disk. This is a result of either the manufacturing scanner/format process or error recovery and BBR techniques employed by the Digital Storage Architecture (DSA). You are observing a theory and probability of why the LBNs have been replaced.

If the defects noted here are severe enough, the scratches may increase in size over time. Monitor this by periodically using RAUTIL and its log file feature to take snapshots of this disk and compare the results. If further errors are encountered and/or the number of replacements associated with these scratches increases, you may have to replace the media. If no further replacements or errors are associated with these LBNs, there is no cause for alarm. The data is safely stored elsewhere on the disk. Most disks contain scratches. With RAUTIL, you now have more visibility into the disk.

    RAUTIL> HEAD
    What head ? 4

      DB       LBN       RBN   PCYL  GRP HEAD  POS  TYP
    -------------------------------------------------------
    (  8.)   25317. =>   816.    58.   0.   4.  21.  PRI
    (  8.)   25735. =>   830.    59.   1.   4.  21.  PRI
    (  8.)   26185. =>   844.    60.   0.   4.  21.  PRI
    (  8.)   26603. =>   858.    61.   1.   4.  21.  PRI
    (  8.)   27053. =>   872.    62.   0.   4.  21.  PRI
    (  8.)   27471. =>   886.    63.   1.   4.  21.  PRI
    ( 13.)   46567. =>  1502.   107.   1.   4.  21.  PRI
    ( 13.)   47017. =>  1516.   108.   0.   4.  21.  PRI
    ( 13.)   47435. =>  1530.   109.   1.   4.  21.  PRI
    ( 14.)   47885. =>  1544.   110.   0.   4.  21.  PRI
    ( 14.)   48303. =>  1558.   111.   1.   4.  21.  PRI
    ( 14.)   48753. =>  1572.   112.   0.   4.  21.  PRI
    ( 14.)   49171. =>  1586.   113.   1.   4.  21.  PRI
    ( 14.)   49621. =>  1600.   114.   0.   4.  21.  PRI
    ( 14.)   50039. =>  1614.   115.   1.   4.  21.  PRI
    ( 14.)   50489. =>  1628.   116.   0.   4.  21.  PRI
    ( 14.)   50907. =>  1642.   117.   1.   4.  21.  PRI
    ( 14.)   51357. =>  1656.   118.   0.   4.  21.  PRI
    ( 21.)   75670. =>  2440.   174.   0.   4.  30.  PRI
    ( 21.)   76088. =>  2454.   175.   1.   4.  30.  PRI
    ( 21.)   76538. =>  2468.   176.   0.   4.  30.  PRI

    replacement by head for the RA80, unit# $1$DUA50, serial # *12lS24513

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
    -----------------------------------------------------------------------
    PRI        0   0   0   0  21   0   0   0   0   0   0   0   0   0     21
    NON-PRI    0   0   0   0   0   0   0   0   0   0   0   0   0   0      0
    BAD-RBN    0   0   0   0   0   0   0   0   0   0   0   0   0   0      0
    TOTAL      0   0   0   0  21   0   0   0   0   0   0   0   0   0     21

    RAUTIL>

16.5.2 Forced Errors

Refer to the following example. The user has performed an ANALYZE command for an RA81 HDA. The result shows some forced errors at LBNs 36620, 36623, 36624, 36627, and 36632. Remember, the ANALYZE command only scans LBNs whose data has already been stored into replacement blocks called RBNs. These forced errors represent data that is correctly stored into replacement blocks and was at one time unrecoverable.

Several points should be taken into consideration here.

These may or may not be the only forced errors on the disk. The ANALYZE function only scans LBNs whose data has been replaced by RBNs and that have replacement descriptor entries in the RCT. In order to determine if other forced errors exist, you would have to use the SCRUB command, which causes RAUTIL to read every user LBN on the disk.
When dealing with forced errors, the only guaranteed method of recovery is to replace the file(s) containing the forced errors with known good copies, or again perform the operation that creates these files.

RAUTIL does not deal with file structures. It deals with absolute blocks referenced as LBNs and makes no attempt at associating LBNs with file structures. In the VAX/VMS environment, it is probable that several of these forced errors are not currently part of an existing file. Some or all may, in fact, be forced errors in blocks that belong to the pool of currently unused, available blocks. When these blocks become allocated and rewritten (with new data), the forced error will no longer appear or be associated with these particular blocks. Blocks with forced errors that are unused by VMS will generally have no effect on the operating system.

    RAUTIL> ANALYZE

      DB       LBN       RBN    CYL HEAD  POS    DESC     TYP
    ----------------------------------------------------------------
    (  2.)      15. =>     0.     0.   0.  15.  2000 000F  PRI
    (  4.)   14440. =>   283.    20.   3.  49.  2000 3868  PRI
    (  7.)   36630. =>   714.    51.   4.  16.  3000 8F16  NON-PRI
    lbn 36624, PC= 000051D5 2144 %SYSTEM-F-FORCEDERROR, forced error
    flagged in last sector read
    (  7.)   36624. =>   715.    51.   4.  10.  3000 8F10  NON-PRI
    lbn 36620, PC= 000051D5 2144 %SYSTEM-F-FORCEDERROR, forced error
    flagged in last sector read
    (  7.)   36620. =>   716.    51.   4.   6.  3000 8F0C  NON-PRI
    (  7.)   36616. =>   717.    51.   3.  39.  2000 8F08  PRI
    (  7.)   36625. =>   718.    51.   4.  11.  2000 8F11  PRI
    lbn 36623, PC= 000051D5 2144 %SYSTEM-F-FORCEDERROR, forced error
    flagged in last sector read
    (  7.)   36623. =>   719.    51.   4.   9.  3000 8F0F  NON-PRI
    lbn 36632, PC= 000051D5 2144 %SYSTEM-F-FORCEDERROR, forced error
    flagged in last sector read
    (  7.)   36632. =>   720.    51.   4.  18.  3000 8F18  NON-PRI
    lbn 36627, PC= 000051D5 2144 %SYSTEM-F-FORCEDERROR, forced error
    flagged in last sector read
    (  7.)   36627. =>   721.    51.   4.  13.  3000 8F13  NON-PRI
    (  7.)   36634. =>   722.    51.   4.  20.  3000 8F1A  NON-PRI
    (  7.)   36619. =>   723.    51.   4.   5.  3000 8F0B  NON-PRI
        "        "          "      "    "    "       "        "
    (137.)  884889. => 17350.  1239.   4.  43.  200D 8099  PRI
    (138.)  891023. => 17471.  1247.  13.  28.  200D 988F  PRI

    replacement by head for the RA81, unit# $5$DUA1, serial # 1271811603

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
    -----------------------------------------------------------------------
    PRI       10   4   8  23   4   9   2   0   5  12   3   1   1   3     85
    NON-PRI    0   0   0   0  12   1   0   0   0   0   0   0   0   0     13
    BAD-RBN    0   2   0   0   0   0   0   0   0   0   0   0   0   0      2
    TOTAL     10   6   8  23  16  10   2   0   5  12   3   1   1   3    100

    RAUTIL>

If, however, some of these forced errors are part of one or more files, how do we know which files need to be replaced? One method is to use the ANALYZE/DISK command, which is a VMS DCL command and not a RAUTIL command. To do this, make sure the disk is mounted (not foreign) and perform the following command at the VMS prompt.

    $ ANALYZE/DISK/READ_CHECK Device-name:

This will cause VMS to read and check the structures of all existing files on the selected disk and report any discrepancies on the user terminal. This will include reporting any files that contain forced errors. Now it is merely a matter of replacing these files with known good copies. Note that the VMS ANALYZE/DISK command will also report other structure errors that may exist. A discussion of these is beyond the scope of this document.

You can also use a variation of this command to create a soft copy of the results of the read_check operation. Use the following:

    $ SPAWN/OUTPUT=File_spec ANALYZE/DISK/READ_CHECK Device-name:

Here you specify the name of a file to be created to contain the results of the ANALYZE/DISK command. This file can later be displayed on the terminal or printed in hardcopy format.

Another observation should be noted about the RAUTIL ANALYZE results. Notice that all of the forced errors are associated with the same head (head 4).
Also note that there are several other entries associated with head 4, and many of them indicate non-primary replacement. You may be inclined to perform the HEAD command on this disk and select head 4 for further analysis. Section 16.5.3 discusses a further analysis of this sample.

16.5.3 Circular Defects

This section is a continuation of a sample analysis started in the previous section. Refer to the following example. The user has elected to analyze the replacements associated with head 4 on an RA81.

Notice that there are 13 entries that indicate LBNs replaced from the same physical cylinder (PCYL 51). Since the analysis here is limited to show replacements for head 4, the conclusion is that all 13 LBNs replaced were from the same physical track. Since only one replacement can be primary on any given track, the remaining replacements had to be revectored as non-primary.

Since most of the LBNs replaced occurred from the same track, the geometry of the symptom is considered circular. We know from the previous section that several of the replacements resulted in forced errors. We can conclude that a failure probably occurred while the heads were positioned over cylinder 51, and something (contact erasure, contaminant, head/disk interference, defective spot in the media, vibration, etc.) caused the degradation and subsequent replacement of several blocks under head 4.

This may or may not be serious. If more LBNs continue to be replaced from this track or adjacent tracks, the problem may be growing, as indicated by additional forced errors or errors in the VMS error log. Use of the SCRUB command may provide this additional knowledge. In this case, the media or a R/W head (for removable media) may need to be replaced.

On the other hand, if the defect does not appear to be growing by evidence of continued errors and/or replacements, there may be no further cause for alarm.
In this case, it will have merely been an exercise for the user to understand the symptoms that may have resulted in unexpected forced errors or errors in the error log. The "logging" feature of RAUTIL could be used to record these results and be used for comparisons to another analysis at a later time. This would allow the user to monitor this particular disk and determine if problems are growing or just instantaneous in time.

The DSA architecture was designed to provide adequate recovery during instantaneous failures. Deal with growing problems before they become catastrophic.

    RAUTIL> HEAD
    What head ? 4

      DB       LBN       RBN   PCYL  GRP HEAD  POS  TYP
    -------------------------------------------------------
    (  7.)   36622. =>   712.    51.   4.   4.   8.  NON-PRI
    (  7.)   36629. =>   713.    51.   4.   4.  15.  NON-PRI
    (  7.)   36630. =>   714.    51.   4.   4.  16.  NON-PRI
    (  7.)   36624. =>   715.    51.   4.   4.  10.  NON-PRI
    (  7.)   36620. =>   716.    51.   4.   4.   6.  NON-PRI
    (  7.)   36625. =>   718.    51.   4.   4.  11.  PRI
    (  7.)   36623. =>   719.    51.   4.   4.   9.  NON-PRI
    (  7.)   36632. =>   720.    51.   4.   4.  18.  NON-PRI
    (  7.)   36627. =>   721.    51.   4.   4.  13.  NON-PRI
    (  7.)   36634. =>   722.    51.   4.   4.  20.  NON-PRI
    (  7.)   36619. =>   723.    51.   4.   4.   5.  NON-PRI
    (  7.)   36655. =>   724.    51.   4.   4.  41.  NON-PRI
    (  7.)   36645. =>   725.    51.   4.   4.  31.  NON-PRI
    ( 79.)  504332. =>  9888.   706.   4.   4.  48.  PRI
    (137.)  884175. => 17336.  1238.   4.   4.  43.  PRI
    (137.)  884889. => 17350.  1239.   4.   4.  43.  PRI

    replacement by head for the RA81, unit# $5$DUA1, serial # 1271811603

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  TOTAL
    -----------------------------------------------------------------------
    PRI        0   0   0   0   4   0   0   0   0   0   0   0   0   0      4
    NON-PRI    0   0   0   0  12   0   0   0   0   0   0   0   0   0     12
    BAD-RBN    0   0   0   0   0   0   0   0   0   0   0   0   0   0      0
    TOTAL      0   0   0   0  16   0   0   0   0   0   0   0   0   0     16

    What head ?
    RAUTIL>

16.5.4 Summary Analysis

Refer to the following example. The user elected to perform a SUMMARY command for an RA82 HDA. A couple of observations may be noted.
The replacements associated with the even numbered heads appear to be higher than those of the odd numbered heads. This is probably normal since the even numbered heads are associated with the innermost cylinders on an RA82, where the bit density is much higher.

The replacements associated with head 10 are more than twice those of the other heads. You would probably be inclined to perform the HEAD command and select head number 10 for further analysis. If you compared this analysis to an RAUTIL log copy you obtained earlier for this disk and head 10 showed a relatively large number of replacements over a short period of time (within hours or a day), you may have to consider replacing the HDA.

If the comparison showed considerable replacements associated with all heads in a short time period, you should consider the possibility of a disk R/W electronics data path problem (R/W module, hybrid module, SDI problem, cable problem, etc.).

Since the SUMMARY command allows you to compare replacements associated with all the heads, knowledge of the head select logic for the specified disk may prove very valuable. A single head select line from a microprocessor controlled circuit may affect a certain combination of read/write heads. If this combination of heads accounted for a large number of replacements, you could isolate the problem to a specific field replaceable unit (FRU) using the SUMMARY command in RAUTIL.

    RAUTIL> SUMMARY

    replacement by head for the RA82, unit# $1$DUA0, serial # 0

    HEAD       0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  TOTAL
    ---------------------------------------------------------------------------
    PRI       29  14  30   6  14   4  12  15  38  33  89  10  18  15  33    360
    SEC      100 0 0 0 000 1 0 002 8
    BAD-RBN  00000 0 0 000 102 4
    TOTAL     30  14  30   6  14   4  12  16  38  34  93  10  19  15  37    372

    RAUTIL>

16.5.5 EDC Errors

The following example shows how an EDC error would appear. Here, there is an EDC error detected while verifying LBN 56 during the ANALYZE command.
Remember, since the ANALYZE command verifies replaced LBNs, the EDC error actually occurred while reading RBN 2, which contains the data (and EDC character) for LBN 56. Most error detection code (EDC) errors are the result of controller data path problems. If, however, the EDC errors are accompanied by ECC errors, the problem is likely in the disk R/W data path, media, or SDI R/W data path.

RAUTIL> ANALYZE
DB 2. ) ( RBN LBN => ******* CYL HEAD P0S o. l. l. 13. DESS 4000 0000 TYP UNUSABLE
lbn 56, PC= 00005105 0054
%SYSTEM-F-CTRLERR, fatal controller error
do you want to continue ? (y/n) Y
2.) 2. ) 3.) 13. ) 14. ) 20. ) (137. ) (138. ) (138. ) 56. 343. 8514. 72438. 82039. => => => => => => ******* 2. 6. 166. 1420. 1608. 2333. o. l. 6. 12. 6. 12. 9. O. 1l. 10l. 114. 166. ...... 8877~1. => 17406. 1243. 888465. -> 17420. 1244. 888520. => 1742l. 1244. 5. 19. 17. 8. 50. 43. 2l. 3000 2000 2000 2001 2001 4000 49. 49. 15. 200D SBC? 2000 SE91 200D 8Ee8 NON-PRI PRI PRI PRI PRI UNUSABLE 0038 0157 2142 1AF6 4077 0000 PRI PRI PRI
replacement by head for the RA81, unit# $5$DUA230, serial # 137579
HEAD 0 1 2 8 3 9 10 11 12 13 TOTAL -----------------------------------------------------PRI 0 2 46 25 17 8 13 5 2 NON-PRI 0 1 1 0 0 0 0 1 0 0 0 BAD-RBN 0 1 1 0 0 0 0 1 0 1 0 TOTAL 0 4 48 25 17 8 15 2 16 0 24 175 0 3 0 5 17 24 183
RAUTIL>

DSA TROUBLESHOOTING COURSE
Lab Exercise #4
DSAERR/DSA301 Error Log Tool

Review the DSAERR/DSA301 user guide section of the Student Guide before proceeding with these exercises.

1. Log into your student account.
2. $ SET DEF [STUDENTx.ERRORLOG_TOOL]
3.
$ RUN DSA301

   NOTE
   If your system is using VMS version 5.0 or higher, run DSA303 instead of DSA301 for the duration of this lab exercise.

4. Enter the following parameters for the program:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? MASTER_1.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]?
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]?
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ?

   MASTER_1.OUT will be the master list of all error log entries contained in the binary error log file, MASTER.DAT. Print MASTER_1.OUT for reference during this lab exercise. This physical report type contains entries in the order in which they originated from the binary file. Review this report to get familiar with the various types of entries it contains.

5. Run DSA301 using the following parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? MASTER_2.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]?
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]?
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? G

   These selections are similar to those in step 4 except for the report type "G" (geographic). Type or print the results and notice how the entries have been sorted in numeric order according to device name, block number, cylinder number, track/head number, etc.

6. Run DSA301 and select the following parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_1.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]? DUA2
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? *8
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ?

   DUA2 is specified here to limit the selections to entries with device name DUA2. We have entered event code *8 to further limit the selection process to only those entries with MSCP status/event codes that end in the number 8 (hexadecimal). These codes select entries that are mostly R/W data or transfer related, such as ECC errors, header sync errors, etc. This is one way to customize the report to fit your specific needs. In this case, review the data errors associated with DUA2. Type or print the output file for further review. Review the DSA301 user guide section of your Student Guide for further details on the many ways to select device names, device types, and event codes to customize the reporting results.

7. Run DSA301 and use the same parameters as in the previous step, except this time specify report type "S" for a summary report.

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_2.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]? DUA2
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? *8
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? S

   The summary report provides you with a map of how the data-related entries in the error log file are distributed with respect to the physical translations (head and cylinder) that are automatically provided by the program. Type or print the output file and review the results. Review the DSA301 user guide section of your Student Guide for further information on how to use a summary report.

8. Run DSA301 and use the following selection parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_3.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]? ,RA82
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? EB
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ?

   In this example, we have selected a device type (RA82) instead of a device name. By placing a comma before "RA82", we have instructed the program to accept any device name, as long as the device type is an RA82. We also specified event code EB to select all entries that contain drive detected errors (status/event code EB). Look at the physical report (SAMPLE_3.OUT) and notice that the entries now contain values for LED codes. The LED codes represent the drive LED error codes (drive detected errors).

9. Run DSA301 and select the following parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_4.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]?
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? 34
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? P

   In this example, the parameters selected caused the program to select all entries containing an MSCP status/event code of 34. This event code indicates host LBNs that were flagged for BBR (bad block replacement) during a read operation but did not fail the BBR test and were not replaced. Using this approach, you can easily spot LBNs that were considered marginal by noting the number of times they appear in the error log.

   CAUTION !
   The transfer of an LBN during a single MSCP command could result in multiple entries in the error log. Use the verbose report to determine if all the entries for an identical block are logged for the same command reference number. If the same LBN appears with event code 34 and it is logged against several different commands (different command reference numbers), that block may be a candidate for manual replacement.
   Using the same parameters but selecting the summary report would provide a geographic map of the LBNs that were flagged for BBR and NOT REPLACED. This technique may provide some useful information for troubleshooting.

10. Run DSA301 and select the following parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_5.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]?
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? 14
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? P

   This demonstrates a slight variation on the previous example. Here we have generated a physical report (SAMPLE_5.OUT) to show all the blocks that HAVE BEEN REPLACED by BBR during the period of time specified. This physical report and a summary report may also prove useful in troubleshooting DSA disks and controllers.

11. Run DSA301 and select the following parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_6.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]?
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]? 14,34,48,6B
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]?
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ?
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? G

   The resulting report contains all error log entries with event codes of 48 and 6B (header-related errors) as well as 14 and 34 (BBR testing-related events). Notice that LBN 35896 had an event code of 48 (indicating a corrupt header), followed by an event code of 14 (indicating the block was subsequently replaced). DUA124, another RA81, shows several entries with status/event code 6B, positioner error (header not found).
   There are several LBNs logged with the code 6B, but none of these blocks are duplicated with a code of 34 or 14 to indicate any BBR activity. Also note that most of these LBNs are on the same cylinder and associated with track/heads 1 and 3. There may be a positioning problem with cylinder 627, especially if errors continue to occur on other track/heads within this cylinder. There may also be integrity problems with the media in the area of these blocks, causing header sync timeout (status/event = 6B on HSC). If these are the only blocks flagged, then manual block replacement would be a simple solution.

   This example shows how to customize the program selection process and use the results to look for trends in the DSA disk subsystem. When using the program to look for trends related to specific block numbers, use the geographic report (report type "G") to assure that all of the entries for a given block number are grouped together for easier reference.

12. Run DSA301 using the following selection parameters:

   Input file [SYS$ERRORLOG:ERRLOG.SYS] ? MASTER.DAT
   Output file (file-name.ext) [TERMINAL] ? SAMPLE_7.OUT
   Device(s),Type(s) (Daan,Rann) 'DJ%1,RA60' [ALL]? DUA2,RA81
   HEX Event Code(s) (nnnn) '1AB,E8,%%6B,*4' [ALL]?
   Starting date (dd-mmm-yyyy hh:mm:ss.cc) [FIRST]? 22-AUG-1986 14:06:23.00
   Ending date (dd-mmm-yyyy hh:mm:ss) [LAST] ? 22-AUG-1986 14:06:25.37
   Report Type (Physical, Geographic, Summary, Verbose) [P] ? P

   This example shows how to use the START DATE and END DATE prompts. The parameters were selected to illustrate an important concept. All the entries on this report have the same command reference number (you could confirm this with the verbose report). Carefully review the results. Consult your instructor if you are unclear.

13. Run DSA301 and select some parameters of your own choosing. Try the wild card features for device names and event codes. Experiment with the different report types and device types.
   Refer to the DSA301 user guide section of your Student Guide for further information about selection features and wild card options.

14. Locate the current binary system error log resident on your system at SYS$ERRORLOG. Copy the error log to your student account. The file name may be either ERRLOG.SYS or ERRLOG.OLD. You may need some assistance from the system manager, depending upon the privileges that have been assigned to your account. Run DSA301, use this file as your input, and try some of the selections from the previous steps in this lab exercise.

DSA TROUBLESHOOTING COURSE
Lab Exercise #6
Forced Errors/EDC Errors

In this exercise, you will create some bad blocks containing forced errors and EDC errors on your scratch disk. Then you will use some tools and techniques to isolate these blocks and the files that contain them. Your scratch disk contains a VMS directory similar to the student account on the system. The scratch disk also contains files similar to the ones in your system account. You will select blocks in files on the scratch disk. Be sure to follow the steps carefully. At no time should you be changing blocks in your student account on the system.

1. Log into your system account.
2. $ SET DEF DISK:[STUDENTM.MISC]

   NOTE
   DISK: is the name of the scratch disk selected by the instructor. Use the VMS command SHOW DEVICE D to determine the exact name to use in place of DISK:, including any allocation class information. Be sure to use the specific name of your scratch disk in place of the term DISK: throughout this exercise.

3. $ COPY BLOCK.COM OLD1.COM
4. $ COPY BLOCK.COM OLD2.COM
5. $ DUMP/HEADER/BLOCK=(START:0,END:0) OLD1.COM
   Make a note of the LBNs that are allocated to OLD1.COM. The information displayed by this command is extensive. Ask your instructor to help you determine which LBN numbers are assigned to the file OLD1.COM.
6.
$ DUMP/HEADER/BLOCK=(START:0,END:0) OLD2.COM
   Make a note of the LBNs that are allocated to OLD2.COM.
7. $ DISM/NOUNL DISK:
8. Go to the HSC for your disk, RUN DKUTIL, and install the write patch.
9. Display the RCT and save the hardcopy.
10. Refer to the LBNs allocated to these files in your notes from steps 5 and 6. Select three arbitrary LBNs from each of the files (OLD1.COM and OLD2.COM) and enter them into the blank spaces in the work table below. As illustrated, two of the three blocks in each file should not be in the RCT list, and one of the three blocks in each file should be in the RCT list.

   OLD1.COM                                   OLD2.COM
   LBN A = ______  is NOT in the RCT list     LBN D = ______  is NOT in the RCT list
   LBN B = ______  is NOT in the RCT list     LBN E = ______  is NOT in the RCT list
   LBN C = ______  IS in the RCT list         LBN F = ______  IS in the RCT list

11. Using DKUTIL, issue the following commands. Ignore any errors displayed as a result of these commands.

   NOTE
   Be sure to substitute the actual LBN numbers from your work table in place of the letters A, B, C, D, E, and F in these commands.

   DKUTIL> DUMP LBN A
   DKUTIL> WRITE/FE LBN A
   DKUTIL> DUMP LBN C
   DKUTIL> WRITE/FE LBN C
   DKUTIL> DUMP LBN D
   DKUTIL> WRITE/FE LBN D
   DKUTIL> DUMP LBN B
   DKUTIL> WRITE/BADEDC LBN B
   DKUTIL> DUMP LBN E
   DKUTIL> WRITE/BADEDC LBN E
   DKUTIL> DUMP LBN F
   DKUTIL> WRITE/BADEDC LBN F

12. Exit from DKUTIL and return to your account on the VMS terminal.
13. Remount your scratch disk (DISK:) and set your default directory:
    $ SET DEF DISK:[STUDENTM.MISC]
14. Try to execute OLD1.COM using the command:
    $ @OLD1
    Enter arbitrary selections until you encounter errors from VMS. Observe how these errors are displayed for a "typical" user.
15. Try the command:
    $ COPY OLD1.COM TEMP.COM
    Note any errors that VMS may produce.
16. Try the command:
    $ TYPE OLD1.COM
    Note any errors that VMS may produce.
17.
Enter the following command:
    $ ANALYZE/DISK/READ_CHECK DISK:
    Note any errors associated with OLD1.COM and OLD2.COM.
18. $ SET DEF DISK:[STUDENTM.RAUTIL]
19. RUN RAUTIL and select your scratch disk (DISK:).
20. Perform the RAUTIL ANALYZE command. Note that only LBNs C and F show up with errors. This is because the RAUTIL ANALYZE command only causes replaced LBNs to be scanned and verified.
21. Perform the RAUTIL SCRUB command. Now notice that all of the LBNs you modified show up with errors. The SCRUB command causes RAUTIL to read every physical LBN in the host area of the disk. Abort the scrub operation using Control-C after all of your selected LBNs have been displayed.
22. Enter the following VMS command:
    $ DELETE OLD1.COM
23. Re-execute the VMS command:
    $ ANALYZE/DISK/READ DISK:
    Note that only OLD2.COM shows errors for your arbitrary LBNs.
24. Re-execute the RAUTIL command SCRUB. Notice that all of your arbitrary LBNs are still flagged as errors. This is because the VMS DELETE command only deletes the file pointer/header but does not actually rewrite or erase the allocated LBNs. These LBNs still exist with their errors and will not normally affect VMS. They will be rewritten (and corrected) the next time VMS allocates them to store some other file.
25. Execute the following VMS command:
    $ DELETE/ERASE OLD2.COM
26. Perform the VMS command:
    $ ANALYZE/DISK/READ DISK:
    Note that all of your arbitrary LBNs are excluded.
27. Re-execute the RAUTIL SCRUB command. Note that the arbitrary LBNs from OLD2.COM are no longer flagged. That is because the VMS DELETE/ERASE command actually rewrites all LBNs associated with the OLD2.COM file as well as de-allocating them. The arbitrary LBNs associated with OLD1.COM are still flagged, as explained in step 24.
28. $ DISMOUNT/NOUNLOAD DISK:
29.
Use DKUTIL on the HSC to dump each of the arbitrary LBNs associated with OLD1.COM and rewrite them to the disk using the standard DKUTIL WRITE command with no special modifiers. This should correct the errors associated with them. Use DKUTIL on the HSC, select your scratch disk, and enter the following commands:

   NOTE
   Be sure to substitute the actual LBN numbers from your work table in place of the letters A, B, and C in these commands.

   DKUTIL> DUMP LBN A
   DKUTIL> WRITE LBN A
   DKUTIL> DUMP LBN B
   DKUTIL> WRITE LBN B
   DKUTIL> DUMP LBN C
   DKUTIL> WRITE LBN C

30. Exit from DKUTIL and return to your VMS account.
31. Re-mount the DISK:
32. Use the RAUTIL SCRUB command to verify that all of the arbitrary LBNs are no longer flagged with errors.

In this exercise, you used DUMP and WRITE commands to correct the modified LBNs. Writing an LBN using a known good controller will clear the forced error indicator and write good EDC. In our case, we knew that the contents of the LBNs were good (since we never changed them). Normally, when you encounter blocks with forced errors or EDC errors, you will not know whether the data is corrupt. Therefore, assume the LBNs are corrupt and replace the files with KNOWN GOOD COPIES OF THE FILES.

Summary

A. Replace files with forced errors using known good backup copies or copies recreated from a known good source.
B. ANAL/DISK/READ_CHECK (VMS command) identifies existing files that contain forced errors and/or EDC errors. Replace these files with copies from known good backup(s).
C. ANAL/DISK/READ_CHECK (VMS command) does not report errors for unused or unallocated LBNs. These will still exist, but they will be rewritten and corrected the next time VMS allocates them for storage.
D. The RAUTIL ANALYZE command only reports errors for LBNs that have been replaced (according to the RCT table).
E.
The RAUTIL SCRUB command reports errors encountered for any host LBN whether it has been replaced or not. LBNs that are not allocated or not currently used by operating system software but are written with bad EDC or forced error can be corrected by rewriting them with DKUTIL if they become a nuisance. This is not normally necessary, as most system configurations ignore errors associated with unused LBNs.

CHAPTER 17
DSAERR V3.01 USER DOCUMENT
VMS Error Log Tool

17.1 OVERVIEW

DSAERR is an executable image which runs under VMS. DSAERR is capable of extracting selected DSA disk information from a VMS binary error log file and displaying it in a variety of formats. These formats are more condensed than the conventional styles presented by ERRFMT as used with the ANALYZE/ERROR command within VMS. The selected information provides only the elements necessary to understand the root nature of most DSA disk errors. DSAERR is an extremely powerful service tool for analyzing DSA disk-related errors and is sometimes referred to as an error log tool. Each error log entry presented to VMS often results in one to two pages of VMS error log reporting. DSAERR reduces this entry to a single line.

Some of the advantages of DSAERR include:

- Reducing error log entries from one or two pages to a single line.
- Performing automatic translation of logical block numbers (LBNs) to physical cylinder, track, sector, and head information associated with R/W transfer and bad block replacement error log entries.
- Providing manual translation of block numbers.
- Allowing the user to sort through the error log for specific status/event codes, disk drive types, device names, etc.
- Providing the ability to sort or summarize error log entries by geographic characteristics (cylinder, track, sector).
- Providing a soft copy output which may be printed as hardcopy at the user's discretion.
- Providing the use of wild card characters (* and %) to make the selection and sorting process more versatile.

DSAERR currently supports the RA60, RA70, RA80, RA81, RA82, and RA90 disks and associated DSA controllers (UDA, KDA, KDB, HSC).

17.1.1 Restrictions

Two forms of the DSAERR program are available: DSAERR.EXE, which is linked and ready for execution using the VMS RUN command, and DSAERR.OBJ, which is not linked to the libraries in VMS. For most systems, the executable (DSAERR.EXE) version is all that is required by the user. Occasionally, the particular version and configuration of the VMS system may require the program to be linked on the actual target system. To assure compatibility, it is recommended that DSAERR.OBJ be obtained and linked on the system for which it is intended. The VMS commands to perform this are:

   $ LINK DSAERR.OBJ
   $ RUN DSAERR

   NOTE
   DSAERR.OBJ (Version 3.03) or DSA303.OBJ is required to execute with VMS Version 5.0 or higher.

17.2 SELECTION PARAMETERS

When run, DSAERR prompts the user for a variety of selection characteristics. A summary of all the prompts is shown below. The following sections describe each prompt and provide examples.

   Input file:
   Output file:
   Device(s):
   Event(s):
   After:
   Before:
   Report:

Enter a question mark (?) at any prompt to obtain a brief summary of the information that may be supplied to that prompt. Enter HELP at any prompt to obtain more detailed information about what may be entered.

17.2.1 Input File

Input file:

Enter a carriage return to select the default VMS binary ERRLOG.SYS file. This will be extracted from the default SYS$ERRORLOG:ERRLOG.SYS. Users sometimes rename this file ERRLOG.OLD.
Some users may prefer to use the ANALYZE/ERROR/BINARY command to extract a portion of the full ERRLOG.SYS file and produce a limited binary output file which can be used as the input file to DSAERR. Any file name specification may be used as long as it is a binary, formatted VMS error log file. Only one file specification may be entered. Following is an example.

   Input file: SYS$SYSMAINTENANCE:DISK_ERRORS.BINARY

If you enter IT in response to the input file prompt, the program will enter manual translation mode. Details for program operation in the manual translation mode are discussed in Section 17.3.

17.2.2 Output File

Output file:

The user may respond in one of two ways to the output file prompt.

1. Enter a carriage return to cause the program to display all results on the user terminal (SYS$OUTPUT is the default).
2. Enter a file name specification to cause the program to generate a soft copy of all error log output generated by the program.

   Output file: TEST.OUT

17.2.3 Device(s) and Type(s)

Device(s):

The device name and device type may be specified in a number of ways. A carriage return will default to all error log entries containing supported DSA disks and device names found in the specified input file. The general format for responding to this prompt is:

   Daan,Rann   (device name and/or device type separated by a comma)

"Daan" specifies the device name (DUA123, DUB2, DJA6, etc.). Wild card characters (* and %) may be used in the device name specification. Multiple device entries may be used if they are separated by a comma.

Examples:

   DUA*        Selects all DUA devices ending with any number.
   DJ,DUA*1    Selects all DJ devices and DUA devices ending with a 1.
   DJA10%      Selects all DJA devices with three-digit numbers beginning with 10.

The device name may also include the controller name.
This option allows you to capture error log events associated with a single controller channel. For example:

   HSC007$DUA5
   BRAVAX$DJA*

"Rann" specifies the device type(s) for selection. These must be specified exactly as one or more of the following: RA60, RA70, RA80, RA81, RA82, RA90. Wild cards may not be used when selecting a device type. However, one or more device types may be specified, and they must be separated by commas. For example:

   DUA*,RA80,RA82
   DJ*,RA60
   D*,RA70,RA80,RA81

17.2.4 Event Codes

Event(s):

This prompt allows you to sort and tailor the selection of error log events displayed according to a specific MSCP status/event code or a list of codes. Wild card characters (* or %) may be specified in each MSCP code entered. A maximum of 100 MSCP codes may be specified, separated by commas. Entries are specified in hexadecimal, and leading zeros may be omitted. A default carriage return causes all entries with any MSCP event to be selected. For example:

   1AB,*8   Limits entries selected to those containing MSCP event code 1AB or any event code ending in 8.
   %%6B     Selects event codes four digits long with the last two digits of 6B.
   *6B      Selects event codes of any length with the last two digits of 6B.

17.2.5 After

Starting date:

A default carriage return causes the program to start selection with the first entry in the binary input file. Entering a date/time will cause the program to select entries with a date/time equal to or greater than the value entered. The format for the response is:

   dd-mmm-yyyy hh:mm:ss:cc

A date or time may be selected independently. If you specify both the date and time, the intervening space is required. You can omit any of the trailing fields in the date or time parameter.

17.2.6 Before

Ending date:

A default carriage return causes the program to make selections up to and including the last entry in the binary input file.
Entering a date/time will cause the program to select entries with a date/time less than or equal to the value entered. The format for the response is:

   dd-mmm-yyyy hh:mm:ss:cc

The date or time may be selected independently. If you specify both the date and time, the intervening space is required. You can omit any of the trailing fields in the date or time parameter.

17.2.7 Report

Report type:

This prompt allows you to specify the format and style of the output report generated. The report type selected will also dictate the amount of information provided in conjunction with the parameters previously provided. There are five different report formats that may be selected.

   Physical
   Geographic
   Summary
   Verbose
   Time

You may enter the first character or the entire word to select the desired report. A carriage return defaults to P (physical report). Details and examples of each are described in the following sections.

17.2.7.1 Physical Report (P)

This is the default report type. An example follows. The error log entries selected and reported are displayed in the same order in which they appeared in the binary input file.
The information provided includes:

   Device name (including controller path if applicable)
   Drive type
   Drive LED code (if applicable)
   MSCP status/event code
   Block number (LBN or RBN)
   Translation of block number into:
      Cyl  Physical cylinder
      Hd   Head
      S    Physical sector from index
   Volume serial number
   Date/Time of entry

PHYSICAL report example:

                          -<* DSAERR V3.01 *>-

              Drv   Drv  MSCP   Block
Device Name   Type  Led  Event  Number    Cyl  Hd   S  Vol-sn  yy-mm-dd hh:mm:ss:cc
HSC007$DUA66  RA82       01AB         0     0   0   0  634003  86/04/08 13:37:56.68
HSC007$DUA66  RA82       01AB         0     0   0   0  634003  86/04/08 13:37:56.79
HSC007$DUA66  RA82       01AB         0     0   0   0  634003  86/04/08 13:37:56.91
HSC007$DUA66  RA82       006B    608485   711  10  34  634003  86/04/08 13:37:56.91
DUA3          RA80  07   00EB         0     0   0   0       0  86/04/08 14:00:24.89
DUA3          RA80  07   00EB         0     0   0   0       0  86/04/08 14:00:24.89
DUA3          RA80       002B         0     0   0   0       0  86/04/08 14:00:34.11
HSC007$DUA66  RA82  4F   00EB         0     0   0   0  634003  86/04/09 16:48:17.97
HSC007$DUA66  RA82  4F   00EB         0     0   0   0  634003  86/04/09 16:48:23.41
HSC007$DUA66  RA82       006B     87036   101  11  34  634003  86/04/09 16:48:28.89
HSC007$DUA66  RA82       0045   1217578  1424   1  15  634003  86/04/08 14:48:44.25
HSC007$DUA66  RA82  C0   00EB         0     0   0   0  634003  86/04/08 16:17:54.27
HSC007$DUA66  RA82       002B         0     0   0   0  634003  86/04/09 09:32:30.25
HSC007$DUA66  RA82  4D   00EB         0     0   0   0  634003  86/04/09 09:32:51.77
HSC007$DUA66  RA82       002B         0     0   0   0  634003  86/04/09 09:33:13.05
HSC007$DUA66  RA82  4D   00EB         0     0   0   0  634003  86/04/09 09:33:13.17
HSC007$DUA66  RA82       006B    608483   711  10  32  634003  86/04/09 10:14:19.34
HSC007$DUA66  RA82       006B    608483   711  10  32  634003  86/04/09 10:14:19.50
HSC007$DUA66  RA82       0045   1216666  1423   0   1  634003  86/04/09 10:14:36.89
HSC007$DUA66  RA82       0045   1218490  1425   2  29  634003  86/04/09 10:14:37.17
HSC007$DUA66  RA82       0045   1216665  1423   0   0  634003  86/04/09 10:14:37.43
HSC007$DUA66  RA82       0045   1216667  1423   0   2  634003  86/04/09 10:15:12.53
HSC007$DUA66  RA82       0094       342     0   6  26  634003  86/04/09 10:15:12.65
HSC007$DUA66  RA82       002B         0     0   0   0  634003  86/04/09 15:28:00.33
HSC007$DUA66  RA82  26   00EB         0     0   0   0  634003  86/04/09 15:28:00.47
HSC007$DUA66  RA82  26   00EB         0     0   0   0  634003  86/04/09 15:28:00.50
DUA2          RA81       0048     35896    50   3  33   21198  86/08/22 13:24:03.38
DUA2          RA81       0048     35896    50   3  33   21198  86/08/22 13:24:04.18
DUA2          RA81       0048     35896    50   3  33   21198  86/08/22 13:24:04.92
DUA2          RA81       0014     35896    50   3  33   21198  86/08/22 13:24:05.94
DUA2          RA81       00E8     28047    39   3  38   21198  86/08/22 13:24:06.90
DUA2          RA81       00E8     28047    39   3  38   21198  86/08/22 13:24:07.33
DUA2          RA81       00E8     28047    39   3  38   21198  86/08/22 13:24:07.79

17.2.7.2 Geographic Report (G)

This report provides the same information as the physical (P) report but sorts the entries according to geographic cylinder, head, and sector. The sorting priorities are:

1. Device name
2. Cylinder
3. Head
4. Sector

An example follows. The information provided includes:

   Device name (including controller path if applicable)
   Drive type
   Drive LED code (if applicable)
   MSCP status/event code
   Block number (LBN or RBN)
   Translation of block number into:
      Cyl  Physical cylinder
      Hd   Head
      S    Physical sector from index
   Volume serial number
   Date/Time of entry

GEOGRAPHIC report sample:

                          Geography V3.0

Drv   Drv  MSCP   Block
Type  Led  Event  Number    Cyl  Hd   S  Vol-sn  yy-mm-dd hh:mm:ss:cc
RA81       010B     25237    35   4  47      10  86/10/03 11:41:37.20
RA81       010B    187530   262   9  25      10  86/09/29 16:30:49.15
RA81       010B    189322   265   2  38      10  86/09/25 19:28:31.42
RA81       010B    240521   336  12  17      10  86/09/25 19:29:45.03
RA80       0168    186137   428  12  13      25  86/10/06 13:50:12.04
RA80       0034    186137   428  12  13      25  86/10/06 13:50:12.74
RA80       0128    186137   428  12  13      25  86/10/06 13:50:12.08
RA80       0148    186579   429  12  21      25  86/10/06 13:50:16.57
RA80       0034    186579   429  12  21      25  86/10/06 13:50:17.21
RA80       0128    186579   429  12  21      25  86/10/06 13:50:16.53
RA80       0128    195246   449  12   8      25  86/10/06 14:47:24.40
RA80       0034    195246   449  12   8      25  86/10/06 14:47:25.13
RA80       0128    214791   494  12  23      25  86/10/06 13:49:04.47
RA80       00E8    214791   494  12  23      25  86/10/06 13:49:04.91
RA80       0014    214791   494  12  23      25  86/10/06 13:49:05.53
RA80       0128      6928   494  12  31      25  86/10/06 13:49:05.36
RA80       0014    216954   499  12  16      25  86/10/06 13:49:24.70
RA80       00E8    216954   499  12  16      25  86/10/06 13:49:24.08
RA80       0128    216954   499  12  16      25  86/10/06 13:49:23.63
RA80       0148    216955   499  12  17      25  86/10/06 13:49:23.65
RA80       00E8    218704   503  12  30      25  86/10/06 13:49:25.24
RA80       0128    218704   503  12  30      25  86/10/06 13:49:24.82
RA80       0128    218704   503  12  30      25  86/10/06 13:49:24.77
RA82  4D   00EB         0     0   0   0   63400  86/04/09 09:33:13.17
RA82  F1   00EB         0     0   0   0   63400  86/04/08 17:19:01.01
RA82  26   00EB         0     0   0   0   63400  86/04/09 15:28:00.45
RA82  F1   00EB         0     0   0   0   63400  86/04/08 17:19:57.67
RA82       006B        96     0   1  53   63400  86/04/08 14:48:27.02
RA82       0094        96     0   1  53   63400  86/04/08 14:48:44.35
RA82       006B       342     0   6  26   63400  86/04/09 10:14:36.72
RA82       0094       342     0   6  26   63400  86/04/09 10:15:12.65
RA82       0043     16586    19   5  10   63400  86/04/09 15:28:00.53
RA82       006B     87036   101  11  34   63400  86/04/09 16:48:28.89
RA82       006B    608483   711  10  32   63400  86/04/09 10:14:19.50
RA82       006B    608483   711  10  32   63400  86/04/09 10:14:19.34
RA82       006B    608485   711  10  34   63400  86/04/08 13:37:56.91
RA82       0045   1216665  1423   0   0   63400  86/04/09 10:14:54.61
RA82       0045   1216665  1423   0   0   63400  86/04/09 10:14:37.43
RA82       0045   1216666  1423   0   1   63400  86/04/09 10:14:36.89
RA82       0045   1216667  1423   0   2   63400  86/04/09 10:14:36.68
RA82       0045   1216667  1423   0   2   63400  86/04/09 10:15:12.53

17.2.7.3 Summary Report (S)

This report provides a summary map of all selected entries in the error log. The map organizes the entries according to the disk R/W heads associated with the translation of each of the block numbers, if applicable for a R/W transfer error log entry. This gives a geographic view of the distribution of R/W-related entries across the disk media. Entries associated with non-transfer errors will have zeros for block numbers.
Due to the nature of the program, these entries are tallied at the coordinate for cylinder 0 and head 0. Do not let them lead you to believe there is a head 0 or cylinder 0 problem. One solution is to select only the MSCP status/event codes associated with R/W data transfer errors, such as:

    8    Forced error
    48   Invalid (corrupted) header
    68   Data sync timeout
    88   Correctable error in ECC field
    E8   Uncorrectable ECC error
    128  Two-symbol ECC error
    148  Three-symbol ECC error
    168  Four-symbol ECC error
    188  Five-symbol ECC error
    1A8  Six-symbol ECC error
    1C8  Seven-symbol ECC error
    1E8  Eight-symbol ECC error

A simpler method is to use the wildcard feature when responding to the status/event code prompt of the program, and select any disk entry whose status/event code ends in 8:

    HEX Event Code(s) (nnnn) '1AB, E8, %%6B, *4' [ALL]? *8

Three (3) examples follow.

The summary for disk DUA3 shows all of the logged entries associated with head number 12 (after automatic translation). Either the head or the media area associated with head 12 may be defective.

The summary for DUA2 shows errors distributed across many different R/W heads. If the errors occurred in a relatively short amount of time, a read/write data path problem may exist in the disk electronics, the SDI path, or the SDI electronics within the controller. A number of other possibilities also exist. Examining the specific event codes in a PHYSICAL report may provide more information to help isolate the problem.

The summary for disk HSC007$DUA66 is an example where all the errors are logged at the coordinate for head 0, cylinder 0. These are likely not R/W related but SDI related, with no LBNs associated with the errors. You can confirm this by obtaining a PHYSICAL report for this drive.
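The selection and tallying described above can be sketched in a few lines of Python. This is only an illustration, not DSAERR's actual code: the names `event_matches`, `head_map`, and `ENTRIES` are hypothetical, and the wildcard semantics ('*' for any run of characters, '%' for exactly one character, patterns without '*' right-justified against the four-digit code) are inferred from the selection examples in this chapter. The sample triples are taken from rows of the sample reports.

```python
from collections import Counter
from fnmatch import fnmatchcase


def event_matches(pattern: str, event_code: str) -> bool:
    """Return True if a hex status/event code matches a selection pattern.

    Assumed semantics (inferred from the examples, not from DSAERR itself):
    '*' matches any run of characters, '%' matches exactly one character,
    and a pattern without '*' is right-justified, so 'E8' means '00E8'.
    """
    code = event_code.strip().upper().zfill(4)
    pat = pattern.strip().upper()
    if "*" not in pat:
        pat = pat.zfill(4)
    # fnmatch uses '?' where the DSAERR prompt uses '%'.
    return fnmatchcase(code, pat.replace("%", "?"))


def head_map(entries, pattern="*"):
    """Tally the selected entries by (cylinder, head), like the summary map."""
    return Counter(
        (cyl, head)
        for event, cyl, head in entries
        if event_matches(pattern, event)
    )


# (event code, cylinder, head) triples copied from the sample reports.
ENTRIES = [
    ("00EB", 0, 0),     # SDI error: no block number, so it lands on 0/0
    ("00EB", 0, 0),
    ("006B", 711, 10),  # logged against LBN 608485
    ("00E8", 39, 3),    # uncorrectable ECC error
    ("00E8", 39, 3),
    ("0048", 50, 3),    # invalid (corrupted) header
    ("0128", 494, 12),  # two-symbol ECC error
]

if __name__ == "__main__":
    print(head_map(ENTRIES))        # all entries, including the 0/0 pile-up
    print(head_map(ENTRIES, "*8"))  # R/W transfer events only
```

Selecting with `*8` removes the entries that carry no block number, so whatever remains at a given cylinder and head reflects real transfer errors.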
Digital Internal Use Only 17-9 DSAERR V3.01 User Document VMS Error Log Tool SUMMARY report sample: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Volume Ser#: 25205 Device Name: DUA3 1 0 3 2 5 4 7 6 10 9 8 11 12 13 14 eYL# 428 429 449 494 499 503 532 541 3 3 2 4 4 5 4 6 - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Volume Ser#: 21198 Device Name: DUA2 1 0 CYL# 0 25 28 30 31 37 38 39 40 41 42 43 44 45 47 48 49 50 51 52 54 3 2 3 4 5 7 6 8 10 9 11 12 13 5 - 4 - 4 4 4 4 4 9 8 5 -- 5 - 4 - 4 5 5 8 4 4 - 4 4 4 8 4 - - 3 7 3 4 4 4 - - 9 - - 2 5 - - 14 4 - 5 - - 4 - 5 5 - 4 - - - 5 - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=~= -=-=-=-=-=-=-=-=-=-=-=-- Volume Ser#: 63400354 Device Name: HSe007$DUA66 o 1 2 3 eYL# o - 17-10 24 - Digital Internal Use Only 4 5 6 7 8 9 10 11 12 13 14 DSAERR V3.01 User Document VMS Error Log Tool 17.2.7.4 Verbose Report (V) The verbose report displays all the MSCP infonnation for each of the selected error log events. Unlike the standard VMS error log report that may display one to two pages for each entry, DSAERR condenses the infonnation into about a half a page for each entry. This fonnat is intended for experienced users who are more familiar with DSAMSCP and need these details. Examples for a disk transfer error log entry, an SDI error log entry, and a bad block replacement entry follow. VERBOSE report sample: -=-=-=-=-=-=-=-=-=-=~=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= DISK TRANSFER ERROR logged at B.ENTRY CLASS B. ENTRY TYPE W.ERR_SEQ W.SEQ_NOM B.DD CLASS B.DD TYPE B.DD NOM B.DD NAME W• MESSAGE TYPE L.CMD REF W.UNIT W.SEQ_NOM B.FORMAT B.FLAGS W.EVENT Q.CNT ID B.CNT-SVR B.CNT HVR W.MULTI UNIT 100. O. 30. 433. 1. 30. RAB2 66. HSC007SDUA 0001 130EOO06 66. 01B1 02 00 006B 010100000000F807 02 00 0050 8-APR-19B6 13:37:56.91 on SID 01380A4F B.RECOVERY LEVEL B.RECOVERY COUNT L.DRV SER B.UNIT SVR B . 
UNI T.~HVR B.UNIT TYPE B.UNIT CLASS L.VOL SER L.BLOCK NUM W.ORIG ERR FLAGS W.RECOVERY FLAGS B.LVL A RETRY CNT B.LVL-B RETRY- CNT -W.BUFFER ADDR B . SOURCE _REQ B.DETECT_REQ 7. O. 264. 01 OF 11. 2. 63400354. 6084B5. LBN 014000 000002 3. O. 141706 5 5 Digital Internal Use Only 17-11 DSAERR V3.01 User Document VMS Error Log Tool -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= SDI MESSAGE logged at 25-SEP-1987 14:31:28.60 on SID 01380A4F B.ENTRY CLASS B.ENTRY TYPE W.ERR_SEQ W.. SEQ NUM B.DD CLASS B.DD TYPE B.DD NUM B.DD NAME W.MESSAGE TYPE L.CMD REF W.UNIT W.SEQ_NUM B.FORMAT B.FLAGS W.EVENT Q.CNT ID B.CNT-SVR B.CNT HVR W.MULTI UNIT 100. O. 13. 12. 1. 21. RA81 116. HSC015$DUA 0001 000000 116. OOOC 03 40 OOEB 010100000000F807 02 00 0033 B.RECOVERY LEVEL B.RECOVERY COUNT L.DRV SER B.UNIT SVR B.UNIT HVR B.UNIT TYPE B.UNIT CLASS L.VOL SER L.BLOCK NUM L.SDI INFO B.SDI RETRY CNT B.PRV- CMD B.SDI STATUS W. CURRENT CYL B.CURRENT-GROUP B.DRlVE LED CODE B.DRV FAULT-CODE B.SDI_S_REQB.SDI_D_REQ - o O. 173816. 08 08 5. 2. 140582. O. LBN 0080001B O. 8E 00 627. 3. Fl 1A 3. 3. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= REPLACEMENT MESSAGE logged at 22-AUG-1986 14:30:32.90 on SID 03003AOO B.ENTRY CLASS B.ENTRY TYPE W.ERF._SEQ W.SEQ_NUM B.DD CLASS B.DD TYPE B.DD NUM B.DD NAME W.MESSAGE TYPE L.CMD REF W.UNIT W.SEQ_NUM B.FORMAT B.FLAGS W.EVENT Q.CNT_ID B.CNT SVR B.CNT HVR W.MULTI UNIT 17-12 100. O. 3098. 65535. 1. 21. RA81 2. DUA 0001 E2510006 2. FFFF 09 AO 0014 010600815ACA44 04 00 0002 Digital Internal Use Only B.RECOVERY LEVEL B.RECOVERY-COUNT L.DRV SER B.UNIT SVR B.UNIT HVR B.UNIT TYPE B.UNIT CLASS L.VOL SER L.BLOCK NUM B.BBR FLAGS L.BAD LBN L.OLD RBN W.BBR CAUSE L.NEW RBN 0 192. 76478. 07 06 5. 2. 21198. 30837. COOO 30837. O. 00E8 604. 
LBN LBN DSAERR V3.01 User Document VMS Error Log Tool 17.2.7.5 Time Report (T) The time report is similar to the physical report, but the entries are sorted by the date/time in which they occurred rather than the order in which they appear in the binary input file. The information provided includes: Device name (including controller path if applicable) Drive type Drive LED code (if applicable) MSCP status/event code Block number (LBN or RBN) Translation of block number into: Cyl Physical cylinder Hd Head S Physical sector from index Volume serial number Datemme of entry Digital Internal Use Only 17-13 DSAERR V3.01 User Document VMS Error Log Tool TIME report sample: -<* DSAERR V3.01 *>Device Name ------------ HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 DUA3 DUA3 DUA3 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA~6 HSC007$DUA66 HSC007SDUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 17-14 Drv Drv Type Led MSCP Block Event Number RA82 RA82 RA82 RA82 RA80 RA80 RA80 RA82 RA82 RA82 olAB RA8~ RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA62 RA82 RA62 RA82 RA62 RA62 RA82 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 07 07 CO 4D 4D 4D 26 26 4F 4F ------- 01AB 01AB 006B OOEB OOEB 002B 006B 0045 0094 OOEB 002B OOEB 002B OOEB 002B OOEB 006B 006B 0045 006B 0045 0045 0045 0045 0045 0045 0094 002B OOEB OOEB OOEB OOEB 006B 0048 0048 0048 0014 00E8 00E8 00E8 0014 00E8 00E8 00E8 0014 Digital Internal Use Only Cyl Hd 0 0 0 0 0 0 608485 711 0 0 0 0 0 0 0 96 1217578 1424 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 608483 711 608483 711 1216667 1423 0 342 1216666 1423 1218490 1425 1216665 1423 1216665 1423 1218489 1425 1216667 1423 0 342 0 0 0 0 0 0 0 
0 0 0 67036 101 35896 50 35896 50 35896' 50 35896 50 28047 39 28047 39 28047 39 39 28047 31848 44 31848 44 31848 44 31848 44 0 0 0 10 0 0 0 1 1 1 0 0 0 0 0 0 0 10 10 0 6 0 2 0 0 2 0 6 0 0 0 0 0 11 3 3 3 3 3 3 3 3 8 8 8 8 S 0 0 0 34 0 0 0 53 15 53 0 0 0 0 0 0 0 32 32 2 26 1 29 0 0 28 2 26 0 0 0 0 0 34 33 33 33 33 38 38 38 38 32 32 32 32 Vol-sn yy-mm-dd hh:mm:ss:cc ------ -------------------634003 86/04/08 634003 86/04/08 634003 86/04/08 634003 86/04/08 0 86/04/08 0 86/04/08 0 86/04/08 634003 86/04/08 634003 86/04/08 634003 86/04/08 634003 86/04/08 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003·- - 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 634003 86/04/09 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 21198 86/08/22 13:37:56.68 13:37:56.79 13:37:56.91 13:37:56.91 14:00:24.89 14:00:24.89 14:00:34.11 14:48:27.02 14:48:44.25 14:48:44.35 16:17:54.27 09:32:30.25 09:32:30.37 09:32:51.65 09:32:51.77 09:33:13.05 09:33:13.17. 10:14:19.34 10:14:19.50 10:14:36.68 10:14:36.72 10:14:36.89 10:14:37.17 10:14:37.43 10:14:54.61 10:15:11.95 10:15:12.53 10:15:12.65 15:28:00.33 15:28:00.47 15:28:00.50 16:48:17.97 16:48:23.41 16:48:28.89 13:24:03.38 13:24:04.18 13:24:04.92 13:24:05.94 13:24:06.90 13:24:07.33 13:24:07.79 13:24:08.48 13:24:37.66 13:24:37.94 13:24:39.14 13:24:39.82 DSAERR V3.01 User Document VMS Error Log Tool 17.2.8 Using the Selection Process The following pages provide some examples of the Selection parameters previously described and the resulting displays. 
Input file: Output file: Device(s) : Event(s) : After: Before: Report: errorlog.eng DU*,RA80,RA81 *8,14 P -<* DSAERR V3.01 *>Device Name ------------ DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 DUA2 BRIVAX$DUA3 BRIVAX$DUA3 BR IVAX$ DUA3 BRIVAX$DUA3 BR IVAX$ DUA3 BRIVAX$DUA3 Drv Drv Type Led MSCP Block Event Number Cyl Hd S Vol-sn 33 33 33 33 38 38 38 38 32 32 32 32 32 14 14 31 14 23 23 21198 21198 21198 21198 21198 21198 21198 21198 21198 21198 21198 21198 21198 25205 25205 25205 25205 25205 25205 ------RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA81 RA80 RA80 RA80 RA80 RA80 RA80 0048 0048 0048 0014 00E8 00E8 00E8 0014 00E8 00E8 00E8 0014 01E8 0128 00E8 0128 0014 0128 00E8 35896 35896 35896 35896 28047 28047 28047 28047 31848 31848 31848 31848 22440 231274 231274 7460 231274 214791 214791 50 50 50 50 39 39 39 39 44 44 44 44 31 532 532 532 532 494 494 3 3 3 3 3 3 3 3 8 8 8 8 6 12 12 12 12 12 12 yy-mm-dd hh:mm:ss:cc ------ -------------------86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/08/22 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 13:24:03.38 13:24:04.18 13:24:04.92 13:24:05.94 13:24:06.90 13:24:07.33 13:24:07.79 13:24:08.48 13:24:37.66 13:24:37.94 13:24:39.14 13:24:39.82 13 :24 :41.22 13: 48 :31. 04 13:48:31.47 13: 48 :31. 
91 13:48:32.08 13:49:04.47 13:49:04.91 Digital Internal Use Only 17-15 DSAERR V3.0' User Document VMS Error Log Tool Input file: Output file: Device(s) : Event(s): After: Before: Report: binary.dat my.out DOA* 4% 8-APR-1986 14:00:00 11-APR-1986 -<* DSAERR V3.01 *>Device Name Drv Drv Type Led MSCP Block Event Number ------- -----------HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 MtJFFIN$DUAO MUFFIN$DUAO MUFFIN$DUAO 17-16 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA70 RA70 RA70 0045 0045 0045 0045 0045 0045 0045 0045 0043 004B 004B 004B Digital Internal Use Only 1217578 1216667 1216666 1218490 1216665 1216665 1218489 1216667 16586 Cy1 Hd 1424 1423 1423 1425 1423 1423 1425 1423 19 0 0 0 0 0 0 S Vol-sn -----1 0 0 2 0 0 2 0 5 0 0 0 15 2 1 29 0 0 28 .2 10 0 0 0 634003 634003 634003 634003 634003 634003 634003 634003 634003 0 0 0 yy-mm-dd hh:mm:ss:cc -------------------86/04/08 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/10 86/04/10 86/04/10 14:48:44.25 10:14:36.68 10:14:36.89 10:14:37.17 10:14:37.43 10:14:54.61 10:15:11.95 10:15:12.53 15:28:00.53 08:10:03.65 08:10:03.66 08:10:03.66 DSAERR V3.01 User Document VMS Error log Tool Input file: Output fil.e: Device(s} : Event (s) : After: Before: Report: DUA*,RA80,RA81,RA82 G -<* DSAERR V3.01 *>- Device Name Drv Drv Type Led MSCP Block Event Number ----------BRIVAX$DUAO BRIVAX$DUAO BRIVAX$DUAO BR IVAX$ DUAO BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BR IVAX$ DUA3 BR IVAX$ DUA3 BRIVAXSDUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 BRIVAX$DUA3 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 HSC007$DUA66 Cyl Hd S Vol-sn RA81 RA81 RA81 RA81 RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA80 
RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA80 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 4D F1 26 F1 010B 010B 010B 010B 0168 0034 0128 0148 0034 0128 0128 0034 0128 00E8 0014 0128 0014 00E8 0128 0148 00E8 0128 0128 OOEB OOEB OOEB OOEB 006B 0094 006B 0094 0043 006B 006B 006B 006B 0045 0045 0045 25237 35 187530 262 189322 265 240521 336 186137 428 186137 428 186137 428 186579 429 186579 429 186579 429 195246 449 195246 449 214791 494 214791 494 214791 494 6928 494 216954 499 216954 499 216954 499 216955 499 218704 503 218704 503 218704 503 0 0 0 0 0 0 0 0 96 0 96 0 342 0 0 342 16586 19 87036 101 608483 711 608483 711 608485 711 1216665 1423 1216665 1423 1216666 1423 yy-rnm-dd hh:rnm:ss:cc -------------------- ------4 9 2 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 0 0 0 0 1 1 6 6 5 11 10 10 10 0 0 0 47 25 38 17 13 13 13 21 21 21 8 8 23 23 23 31 16 16 16 17 30 30 30 0 0 0 0 53 53 26 26 10 34 32 32 34 0 0 1 10 10 10 10 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 63400 63400 63400 63400 63400 63400 63400 63400 63400 6340063400 63400 63400 63400 63400 63400 86/10/03 86/09/29 86/09/25 86/09/25 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/10/06 86/04/09 86/04/08 86/04/09 86/04/08 86/04/08 86/04/08 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/08 86/04/09 86/04/09 86/04/09 11:41:37.20 16:30:49.15 19:28:31.42 19:29:45.03 13:50:12.04 13:50:12.74 13:50:12.08 13:50:16.57 13:50:17.21 13:50:16.53 14:47:24.40 14:47:25.13 13:49:04.47 13:49:04.91 13:49:05.53 13:49:05.36 13:49:24.70 13:49:24.08 13:49:23.63 13:49:23.65 13:49:25.24 13 : 4 9 : 24 . 
82 13:49:24.77 09:33:13.17 17:19:01.01 15:28:00.45 17:19:57.67 14:48:27.02 14:48:44.35 10:14:36.72 10:15:12.65 15:28:00.53 16:48:28.89 10:14:19.50 10:14:19.34 13:37:56.91 10:14:54.61 10:14:37.43 10:14:36.89 Digital Internal Use Only 17-17 DSAERR V3.01 User Document VMS Error Log Tool Input file: Output file: Device(s) : Event(s): After: Before: Report: DUA9,RA81 *8 S -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= 23448 Volume Ser4f: Device Name: DUA9 o 1 2 3 CYL4f 25 28 30 31 37 38 139 140 141 142 143 144 145 236 248 249 254 17-18 Digital Internal Use Only 4 5 6 7 8 9 10 11 12 13 14 4 - DSAERR V3.01 User Document VMS Error Log Tool Input file: Output file: Device(s): Event (s) : After: Before: Report: binary.dat LED CODES.OUT DUA* EB -<* DSAERR V3.01 *>Device Name Drv Drv Type Led DUA3 DUA3 HSC007$DUA66 HSC007$OUA66 HSC007$OUA66 HSC007$OUA66 HSC007$OUA66 HSC007$OUA66 HSCOO7$OUA66 HSC007$OUA66 HSC007$DUA66 HSC007$OUA66 GRANPA$DUA124 GRANPA$DUA124 GRANPA$DUA124 GRANPA$DUA124 GRANPA$OUA124 GRANPA$DUA124 RA80 RA80 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA82 RA81 RA81 RA81 RA81 RA81 RA81 ------------ 07 07 4F 4F CO 40 40 40 26 26 26 26 F1 F1 4B 4B F1 4B MSCP Block Event Number ------- OOEB 002B OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB OOEB 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Cyl Hd 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 S Vol-sn yy-mm-dd hh:mm:ss:cc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 634003 634003 634003 634003 634003 634003 634003 634003 634003 634003 5 5 5 5 5 5 86/04/08 86/04/08 86/04/09 86/04/09 86/04/08 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 86/04/09 87/09/28 87/09/28 87/09/28 87/09/28 87/09/28 87/09/28 ------ -------------------- 14:00:24.89 14:00:34.11 16:48:17.97 16:48:23.41 16:17:54.27 09:32:30.37 09:32:51.77 09:33:13.17 15:28:00.45 15:28:00.46 15:28:00.47 15:28:00.50 16:23:19.63 16:23:32.06 16:23:53.26 16:23:43.06 16:24:18.95 16:24:30.60 
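The Cyl, Hd, and S columns in the reports above are derived from the block number. The following Python sketch shows one way to reproduce that translation for an RA81, using the figures from the RA81 drive parameter table in section 17.3 (14 heads, 52 physical blocks per track, 51 LBNs per track, 14 sectors of group offset). The arithmetic is inferred from those figures and checked against RA81 rows in the sample reports; it is not taken from DSAERR itself, it covers only the host LBN area, and other drive types have different geometries. The names `Position` and `translate_ra81_lbn` are hypothetical.

```python
from typing import NamedTuple

# RA81 figures as listed in the drive parameter table of section 17.3.
HEADS_PER_CYL = 14     # physical heads; on the RA81 each head is one group
PHYS_SECTORS = 52      # physical blocks per track
LBNS_PER_TRACK = 51    # the remaining block on each track is an RBN
GROUP_OFFSET = 14      # sectors of offset applied per group
HOST_LBNS = 891072     # LBNs in the host area


class Position(NamedTuple):
    cyl: int
    head: int
    sector: int  # physical sector counted from index
    pbn: int


def translate_ra81_lbn(lbn: int) -> Position:
    """Translate a host-area RA81 LBN to cylinder, head, sector, and PBN."""
    if not 0 <= lbn < HOST_LBNS:
        raise ValueError("LBN is outside the RA81 host area")
    # Each track holds 51 LBNs; the quotient is the track number on the disk.
    track, logical_sector = divmod(lbn, LBNS_PER_TRACK)
    cyl, head = divmod(track, HEADS_PER_CYL)
    # Each successive group is skewed by GROUP_OFFSET sectors from index.
    sector = (logical_sector + head * GROUP_OFFSET) % PHYS_SECTORS
    # PBNs count all 52 blocks per track, so one RBN is skipped per track.
    pbn = track * PHYS_SECTORS + logical_sector
    return Position(cyl, head, sector, pbn)


if __name__ == "__main__":
    for lbn in (10, 11, 111, 120, 999, 35896):
        print(lbn, translate_ra81_lbn(lbn))
```

For example, LBN 35896 on DUA2 (an RA81) translates to cylinder 50, head 3, sector 33, which matches the corresponding rows in the reports.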
17.3 MANUAL TRANSLATION of DSA BLOCK NUMBERS

When prompted for the input file, you may enter /T to put the program into a mode where block numbers can be translated manually. The prompts displayed for manual translation are summarized below.

Drive Type:
Entering a question mark (?) causes the program to display the drive types it supports. Otherwise, enter the type of drive associated with the blocks you wish to translate. Drives supported at the time this course was developed include RA60, RA70, RA80, RA81, RA82, and RA90. Enter Control_Z if you wish to exit the program.

Display drive parameter table (Y/N) [N] ?
A carriage return causes the program to exit to the previous prompt. The following sample shows the result when the user selects "Y" and RA81 was previously entered for the drive type:

    RA81 TOPOLOGY information                      V3.0

      1258  Physical CYLINDERS.
        14  Physical HEADS/Cylinder.
        52  Physical BLOCKS/Track.
      1258  Logical CYLINDERS.
        14  Logical GROUPS/Cylinder.
         1  Logical TRACKS/Group.
        51  Logical BLOCKS/Track (LBNs).
         1  Replacement BLOCKS/Track (RBNs).
        14  Sectors of GROUP OFFSET.
         0  Starting cylinder in Host LBN Area.
      1252  Starting cylinder in FCT Area.
      1256  Starting cylinder in DBN Area.
    891072  LBNs in Host Area.
    893928  Total LBNs (Host + RCT).
      2912  Extended Blocks in FCT area (XBNs).
      2912  Diagnostic Blocks in DBN area (DBNs).
    915824  Total Physical Blocks (PBNs).

BLOCK Type:
A carriage return causes the program to return to the "Drive Type" prompt. Otherwise, select the type of block (LBN, RBN, PBN, XBN, or DBN) that you wish to translate. Enter a question mark (?) to obtain help.

Block(s):
A carriage return causes the program to return to the previous prompt. Otherwise, enter the block number(s) you wish to have translated.
You may enter any of the following:

    1  A single block number
    2  Several block numbers separated by a comma     10,11
    3  A range of numbers separated by a colon        111:120
    4  A mix of 2 and 3 above                         10,11,111:120,999

Selecting drive type RA81, block type LBN, and block numbers 10,11,111:120,999 would result in the following display.

                          -<* DSAERR V3.01 *>-

            LOG  LOG  TRK IN  LOG  PHY  PHY  SEC FROM
    LBN#    CYL  GRP  GRP     SEC  CYL  HD   IDX       Usage   PBN #
    -----------------------------------------------------------------
      10      0    0    0      10    0    0     10     Host       10
      11      0    0    0      11    0    0     11     Host       11
     111      0    2    0       9    0    2     37     Host      113
     112      0    2    0      10    0    2     38     Host      114
     113      0    2    0      11    0    2     39     Host      115
     114      0    2    0      12    0    2     40     Host      116
     115      0    2    0      13    0    2     41     Host      117
     116      0    2    0      14    0    2     42     Host      118
     117      0    2    0      15    0    2     43     Host      119
     118      0    2    0      16    0    2     44     Host      120
     119      0    2    0      17    0    2     45     Host      121
     120      0    2    0      18    0    2     46     Host      122
     999      1    5    0      30    1    5     48     Host     1018

CHAPTER 18
FAKDSK (ON HSC)

18.1 FAKDSK (on HSC V300/V350)

ON THE HSC CONSOLE:

HSC50> RUN DD1:FAKDSK

The cassette light will indicate that the program is loading. No other response will occur on the HSC terminal. Two new devices will be available from VMS using device numbers 256 and 257. Use Control Y or Control C to abort and exit from FAKDSK operation.

ON THE VMS TERMINAL:

$ EXCHANGE <CR>
    Program to exchange file information between VMS and RT-11 (in the HSC).

$ EXCHANGE> MOUNT $1$DUA256:
    This mounts cassette drive 0 in the HSC50 or floppy 0 in the HSC70.
$ EXCHANGE> MOUNT $1$DUA257:
    This mounts cassette drive 1 in the HSC50 or floppy 1 in the HSC70.

$ EXCHANGE> DIR $1$DUA257:
    To get a directory of files in drive 1.

$ EXCHANGE> COPY from: $1$DUA257:*.* to: *.*/LOG/TRANS=BLOCK
    To copy all files from drive 1 to your account. The /LOG option lets you monitor the activity. The /TRANS=BLOCK option ensures an image copy by block, for compatibility when copying back to another HSC cassette or floppy.

$ EXCHANGE> COPY from: *.* to: $1$DUA257:*.*/LOG/TRANS=BLOCK
    To copy all files from your account to drive 1. The /LOG option lets you monitor the activity. The /TRANS=BLOCK option ensures an image copy by block, for compatibility between the HSC (RT-11) and VMS environments.

$ EXCHANGE> INIT $1$DUA257:/VOLUME_FORMAT=RT11
    Initialize a new cassette (in the HSC, drive 1) for use. You must then mount it to transfer files.

NOTE: You can specify /VERIFY when transferring files to the HSC media to get read-back-and-compare checking, but this adds considerable time to the COPY operation. When many files are transferred to the HSC (write to HSC tape/floppy), the tape has to be rewound after each transfer to update the directory, so this particular operation takes longer.

CAUTION: Aborting FAKDSK while any transfer operation (e.g., COPY) is in progress will crash the HSC.

18.2 FAKDSK (on HSC V370 and up)

HSC V370 has modified the disk path to support VMS access to the load device. As a result, FAKDSK is no longer needed, and FAKDSK support in the disk path no longer exists. Access to the load device is now initiated through SETSHO commands. To enable creation of a fake unit from the load device, use the following SETSHO commands:

HSC70> R SETSHO
SETSHO> ENABLE REBOOT
SETSHO-S The HSC will reboot on exit.
SETSHO> SET SERVER DISK/LOAD_ACCESS
SETSHO> EXIT
SETSHO-S Rebooting HSC, type Y to continue, CTRL/Y to abort: Y
INIPIO-I Booting ...
Two fake units will be created and assigned unique unit numbers in the range 4096-32767. The ENABLE REBOOT command is needed to generate the structures required for the virtual units. These units are retained across subsequent reboots. The assigned unit numbers can then be found by issuing a SHOW DISK command. A SHOW SERVER command indicates whether or not this server option is enabled.

Conversely, the following command disables virtual unit creation upon reboot:

SETSHO> SET SERVER DISK/NOLOAD_ACCESS

On each reboot, the virtual units default to "no host access." To enable access to a load device, use the SETSHO command:

SETSHO> SET Dn HOST ACCESS

Once a virtual unit is set to host access, it can be accessed in the same manner as when FAKDSK was running, for example with EXCHANGE. Always issue the following command when you are done using the fake unit:

SETSHO> SET Dn NOHOST ACCESS

18.3 SUMMARY (HSC Version 370 and up)

Use the following sequence of events to enable a load device for host access:

1. See if the virtual units have already been created:

   HSC70> SHOW DISK

   If they have, skip to step 4.

2. Create virtual units:

   HSC70> R SETSHO
   SETSHO> ENABLE REBOOT
   SETSHO-S The HSC will reboot on exit.
   SETSHO> SET SERVER DISK/LOAD_ACCESS
   SETSHO> EXIT
   SETSHO-S Rebooting HSC, type Y to continue, CTRL/Y to abort: Y
   INIPIO-I Booting ...

3. Obtain the unit number(s), (Dn):

   HSC70> SHOW DISK

4. Enable host access:

   HSC70> SET Dn HOST ACCESS

5. Perform the desired activity on the load device.

Use the following to disable host access:

   HSC70> SET Dn NOHOST ACCESS

DSA TROUBLESHOOTING COURSE
Lab Exercise #5
SET HOST HSC and DKRFCT LAB

1. Log into your student account.

2. $ SET PROCESS/PRIV=DIAGNOSE
   (You need this privilege to perform SET HOST/HSC functions.)

3. $ SET HOST/HSC/LOG Node-name
   (Use the HSC node name assigned to you.)

4. Run DKUTIL and select your target disk using the GET command.

5. Using DKUTIL, set the FK bit = 1 in the mode word of the FCT volume control block. (This is bit 15 of word 21 in the first block of the FCT.)

6. Using the DKUTIL GET command, select the target disk again. Note that "FCT:" should now indicate NULL, since you set the FK bit.

7. DISPLAY the FCT and the RCT.

8. Locate an LBN that has not been replaced.

9. Use DKUTIL and manually replace the LBN.

10. Run DKRFCT and select the target disk. Note the LBN(s) that were added to the FCT.

11. Run DKUTIL again and select your target disk. Note that "FCT:" should now indicate VALID. This is because the FK bit is now cleared.

12. Dump the FCT volume control block. Note that the FK bit = 0. This is because DKRFCT clears the FK bit when it adds entries to the FCT.

13. Display the FCT and verify that your manually replaced LBN is now permanently part of the FCT. Note any other LBNs that were added during step 10.

14. Exit from the HSC console using CONTROL \ (backslash).

15. Print a hardcopy of the file HSCPAD.LOG and review it. This is a log of your activities to/from the HSC.

DSA TROUBLESHOOTING FLOW CHART

Order Number: EK-DSATF-TM-PRE

This guide is the property of DIGITAL EQUIPMENT CORPORATION and is considered for DIGITAL INTERNAL USE ONLY.

Digital Equipment Corporation makes no representation that use of its products with those of other manufacturers will not infringe existing or future patent rights. The descriptions contained herein do not imply the granting of a license to make, use, or sell equipment or software as described in this manual.

Digital Equipment Corporation assumes no responsibility or liability for the proper performance of other manufacturers' products used with its products.
Digital Equipment Corporation believes that information in this publication is accurate as of its publication date. Such information is subject to change without notice. Digital Equipment Corporation is not responsible for any inadvertent errors.

Class A Computing Devices:

NOTICE: This equipment generates, uses, and may emit radio frequency energy. It has been tested and found to comply with the limits for a Class A computing device pursuant to Subpart J of Part 15 of FCC rules for operation in a commercial environment. This equipment, when operated in a residential area, may cause interference to radio/TV communications. In such event the user (owner), at his/her own expense, may be required to take corrective measures.

Revision/Update Information: PRELIMINARY, April 1989

This is the first document release from CX/CSSE and supersedes all previous versions. All revisions and known error corrections up to March/89 have been incorporated into this manual.

CX/CSSE PRELIMINARY April, 1989

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license.

No responsibility is assumed for the use or reliability of software on equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.

Copyright © April, 1989 by Digital Equipment Corporation. All Rights Reserved. Printed in U.S.A.

The postpaid READER'S COMMENTS form on the last page of this document requests the user's critical evaluation to assist in preparing future documentation.
The following are trademarks of Digital Equipment Corporation:
DEC, DEC/CMS, DEC/MMS, DECnet, DECsystem-10, DECSYSTEM-20, DECUS, DECwriter, DIBOL, EduSystem, IAS, MASSBUS, PDP, PDT, RSTS, RSX, UNIBUS, VAX, VAXcluster, VMS, VT

The following are also trademarks of Digital Equipment Corporation:
CI, DDCMP, DOIF, DEBET, DSA, DECconnect, DECdirect, DECdisk, DECmail, DECmat, DECmate, DECnet/E, DECnet-RT, DECnet-ULTRIX, DECserver, DECservice, DECtape, DELNI, DELUA, DEMPR, DEQNA, DESTA, DEUNA, OMS, DRB32, DSRVB-AA, HSC, IVIS, KA10, KD11, KDA50-Q, KDB50-A, KDB50-B, KI, KL10, KS10, LA50, LN01, LN03, MicroPDP-11, MicroVAX, MicroVMS, MSCP, PDP-11, Q-bus, RA60, RA70, RA80, RA81, RA82, RA90, RC25, RQDX3, RMS-11, RSX-11, RSX-11M, RSX-11S, RX33, SA482, SA550, SA600, SA650, TA78-81, TMS-11, TK50, TOPS-10, TOPS-20, TU78-81, UDA50, UETP, ULTRIX, ULTRIX-11, ULTRIX-32, VAXELN, VAX/VMS, VAXsimPLUS, VMS

This document was prepared using VAX DOCUMENT, Version 1.0

TABLE OF CONTENTS

Preface

CHAPTER 1 INTRODUCTION
1.1 Structure of This Document

CHAPTER 2 DATA COLLECTION
2.1 Overview
2.2 Host Error Log
  2.2.1 Customer Input
  2.2.2 Previous Call History
  2.2.3 Visual Symptoms
  2.2.4 VAXsimPLUS
  2.2.5 Theory Number
  2.2.6 HSC Console
  2.2.7 Host Error Log
  2.2.8 Device Internal Error Log

CHAPTER 3 DATA ANALYSIS AND REPAIR ACTIONS
3.1 Overview
  3.1.1 Excessive Receiver-Ready Collisions (1AB)
  3.1.2 Hard Controller Errors
  3.1.3 EDC Errors
  3.1.4 Few 6B Errors
  3.1.5 Many 6B Errors
  3.1.6 Forced Errors
  3.1.7 Bad Block Replacement Messages
  3.1.8 Compare Errors
  3.1.9 Induced Errors with Other Errors
  3.1.10 Induced Errors Without Other Errors
    3.1.10.1 Controller Detected: Drive Command Timeout (Status/Event Code: 2B)
    3.1.10.2 Controller Detected: Loss of Read/Write Ready (Status/Event Code: 8B)
    3.1.10.3 Controller Detected: Drive Clock Dropout (Status/Event Code: AB)
    3.1.10.4 Controller Detected: Lost Receiver Ready (Status/Event Code: CB)
    3.1.10.5 Controller Detected: Drive Failed Initialization (Status/Event Code: 16B)
    3.1.10.6 Controller Detected: Drive Ignored Initialization (Status/Event Code: 18B)
  3.1.11 Suggestions for Troubleshooting Induced Errors
  3.1.12 VMS Mount Verification Messages
    3.1.12.1 Exceptions
    3.1.12.2 VMS Problems of "Why a Drive Mount Verifies"
    3.1.12.3 Actions
  3.1.13 Performance Issues when No Errors Are Logged
  3.1.14 Invalid Media Format
  3.1.15 Unknown Spin Downs with Other Errors
  3.1.16 Unknown Spin Downs Without Other Errors
  3.1.17 Lost
3.2 Drive-Detected Errors
  3.2.1 Communication Error LED Codes
  3.2.2 Troubleshooting Multiple LED Codes
  3.2.3 Identify FRU to Replace
3.3 Communication Errors
  3.3.1 Communication Errors Entry
  3.3.2 Disk Problem with LED Codes
  3.3.3 Disk Problem Without LED Codes
  3.3.4 Controller Problem
  3.3.5 SDI Problem
3.4 Data Errors
  3.4.1 Convert LBNs
  3.4.2 LBN Replaced
  3.4.3 Normal
  3.4.4 Transient/Manually Replace Repeating LBNs
  3.4.5 Enough Information
  3.4.6 Still Only on One or Two Heads
  3.4.7 Confirmed Drive Problem
  3.4.8 Confirmed Controller Problem
  3.4.9 Confirmed SDI Problem
  3.4.10 Lost
3.5 Device Isolation
3.6 Read/Write Data Path Verification

CHAPTER 4 LOGICAL RECOVERY
4.1 Overview
  4.1.1 Commonly Known Recovery Techniques
  4.1.2 Media Format and Replacement Errors
  4.1.3 Excessive BBR (Status/Event Code 14)

CHAPTER 5 VERIFICATION
5.1 Device Verification
5.2 Original Error Symptoms
5.3 Verify the Problem Is Resolved to the Customer's Satisfaction
5.4 Verify That No New Problems Have Been Induced
5.5 Verify That No Residual Problems Remain

CHAPTER 6 DOCUMENT
6.1 Overview

APPENDIX A DRIVE-DETECTED ERRORS

APPENDIX B RESOURCES AND UTILITIES
  B.0.1 Standalone Diagnostics
  B.0.2 What Can the Available Resources Do?
    B.0.2.1 EVRLK/ZUDL (VAX/PDP-11 Utility)
    B.0.2.2 HSC Verify (HSC Utility)
    B.0.2.3 DKUTIL (HSC Utility)
    B.0.2.4 RAUTIL (VAX/MicroVAX VMS Utility)
    B.0.2.5 VAXSIM$LBN.COM (LBN.COM or BLOCK.COM) (VAX/MicroVAX VMS Utility)
    B.0.2.6 DSAERR (DSA301.EXE, DSA303.EXE) (VAX/MicroVAX VMS Utility)

APPENDIX C CONVERSION FORMULAS FOR RA60
C.1 LBN to Physical and Logical Parameters
  C.1.1 Quick Algorithm for RA60 Head

APPENDIX D CONVERSION FORMULAS FOR RA70/80/81/82/90
D.1 LBN to Physical and Logical Parameters

APPENDIX E TABLE OF CODES FOR CONVERSION FORMULAS

FIGURES
1 Flow Map - Introduction
2 Flow Map - Data Collection
3 Flow - Data Collection (Start)
4 Flow - Data Collection (2A)
5 Flow Map - Data Analysis and Repair Actions
6 Flow - Data Analysis and Repair (3)
7 Flow - Data Analysis and Repair (3A)
8 Flow - Data Analysis and Repair (3B)
9 Flow - Data Analysis and Repair (3C)
10 Flow - Data Analysis and Repair (3D)
11 Flow - Data Analysis and Repair (3E)
12 Flow - Data Analysis and Repair (3F)
13 Flow - Drive-Detected Errors (3.1)
14 Flow - Communication Errors (3.2)
15 Flow - Communication Errors (3.2A)
16 Flow - Data Errors (3.3)
17 Flow - Data Errors (3.3A & 3.3B)
18 Flow - Data Errors (3.3C)
19 Flow Map - Logical Recovery
20 Flow Map - Verification of Problem Resolution
21 Flow Map - Document

TABLES
1 VAX and PDP-11 Standalone Diagnostics
2 MicroVAX Standalone Diagnostics
3 Conversion Formulas for RAxx Drives

Preface
DSA Troubleshooting Overview

Manual Objectives

WHAT IT "IS"
The DSA Troubleshooting Flowchart provides the following objectives:
- The DSA Troubleshooting Flowchart presents a logical approach to DSA troubleshooting based on the experience of thousands of DSA service calls.
- Written by support engineers with many years of DSA troubleshooting experience, this manual helps to logically analyze the extensive error information and data that usually surround errors in the DSA environment. The main emphasis is on the use of ERROR LOG information.
- Furthermore, this manual supplements existing service documentation and training.
- This document will be most beneficial to the Field Engineer who has some DSA training and/or experience. The experienced DSA troubleshooter can benefit from exposure to a troubleshooting approach that has been used successfully by other engineers.
- DSA-trained engineers will find this document easier to use and understand than engineers without DSA training.

WHAT IT IS "NOT"
The following items are not within the scope of the DSA Troubleshooting Flowchart:
- This document is NOT a do-it-all "cookbook."
- This document is NOT meant to be self-sufficient.
- This document is NOT intended to replace existing documentation or training.
- This document is NOT intended to replace individual troubleshooting techniques or styles developed through experience.
- This document does not address DSA tape subsystems.

TROUBLESHOOTING OVERVIEW
Keep in mind the following points when troubleshooting:
- The same features that make DSA a "high availability/fault tolerant" architecture also often make a "traditional" approach to troubleshooting inappropriate (for example, trying to force the problem to recur by using diagnostics).
- Accurate diagnosis often must be (and usually can be) made based on careful examination and analysis of information that already exists, without resorting to diagnostic simulation and the resulting disruption to customer operations (symptom-directed vs. test-directed diagnosis).
- This document is a guide for troubleshooting DSA disk subsystems, reflecting RAxx series disk drives and associated controllers (HSC, xDA, and so on). The troubleshooting flows and/or techniques may, however, apply to other DSA products (such as RDxx).

Use the following steps for more effective troubleshooting:

A. Gather as much information as you can about the problem before using the flowchart, including:
- Error logs
- Messages
- Customer input
- Device error indications

B. Stay organized, since there may be an overwhelming amount of information and data to consider, especially surrounding intermittent problems:
- Make notes, keep records
- Get hard copies, if possible

C. Take time to analyze the data.
Use 20 minutes to analyze the data and understand the problem. That can save hours of downtime caused by a "shotgun" approach. In most cases, the customer's system is probably not DOWN at the moment, so take the time to understand the problem before taking down the machine or system for a service action. Many service actions, such as FRU replacements, can and should be "scheduled" with the customer to minimize disruption to operations. (Remember, DSA is a "FAULT-TOLERANT" architecture.)

D. When in doubt, use the following services:
- Remote support
- Local district/area support
- Other field engineers

THERE IS LOTS OF HELP AVAILABLE - USE IT WHEN YOU NEED IT.

Figure 1: Flow Map - Introduction (CXO-2387A)

CHAPTER 1 INTRODUCTION

1.1 Structure of This Document
This flow process incorporates a collection of troubleshooting techniques from a variety of resources and represents the most up-to-date approach. This includes the basic 6-step troubleshooting strategy incorporated by all of the recently announced disk products. This document is divided into six chapters that map the DSA troubleshooting flow process. Refer to the diagram on the opposite page.
Flow diagrams appear on left-facing pages and all pertinent notes appear on right-facing pages. On initial review this document may appear overwhelming; this is expected. As you gain experience in using the methods outlined here, this document will become a resource for most typical DSA problems, as well as a guide for occasional "difficult" DSA problems. Review the entire process to understand the approach and become familiar with the location of most of the material in this document. Later, with experience, you will find that you can skip much of the material you already understand or material that does not apply to the problem you are troubleshooting.

Figure 2: Flow Map - Data Collection (CXO-2388A)

CHAPTER 2 DATA COLLECTION

2.1 Overview
The Mass Storage Control Protocol (MSCP) error logging and reporting system is perhaps the single most important maintainability feature of the DSA architecture. All higher level error analysis tools (VAXsimPLUS, SPEAR, and so on) use the error log information as their source data. If error logging is available in any form (such as HSC console, system error log), it is your primary tool for obtaining error information.

NOTE
Most errors in the DSA environment are soft (recoverable), but they will be reported to the error log in most cases.

Section 2.2 focuses on the use of the error log as the source of error information. The amount of data in error log packets can be overwhelming and possibly confusing at times. However, only a few key fields of the packet are necessary to make a diagnosis in most cases. The key starting point in error log packet interpretation is the STATUS/EVENT code.

2.2 Host Error Log
The use of host error logs is a data collection process.
Begin this data collection process by accessing the host error logs, obtaining the STATUS/EVENT codes, and noting the Logical Block Numbers (LBNs) for any read/write (R/W) disk transfer errors. While performing this data collection step, if the following errors are detected by the controller, note the LBN(s) being reported:
- Data errors
- ECC errors
- Uncorrectable ECC errors
- Header not found errors
- Invalid header errors
- Header compare errors
- Format errors
- Data sync timeout errors

The primary elements of an error log entry for troubleshooting are:
- MSCP STATUS/EVENT code
- Master drive error code or LED code
- LBN
- Date/time of error
- Cylinder
- R/W head
- Sector

The cylinder, R/W head, and sector represent physical disk elements that are obtained by translating LBNs from an error log entry. Those physical characteristics are usually NOT indicated in error logs. Several techniques and utilities are available to provide the necessary translation. Some of those utilities include BLOCK.COM, VAXSIM$LBN.COM, DSAERR, RAUTIL, and disk internal error logs. Appendix C provides manual conversion algorithms in the event that a conversion utility is not readily available.

The data collection step is critical to effectively troubleshoot DSA subsystems.

1. Gather as much information as you can about the problem before using the flowchart, including:
- Error logs
- Messages
- Customer input

2. Stay organized, since there may be an overwhelming amount of information and data to consider, especially surrounding intermittent problems:
- Make notes, keep records
- Get hard copies, if possible

This chapter outlines most of the available resources for collecting data while troubleshooting a problem on site. It is important to get the full scope of the problem. With all of the available information collected, you will usually be able to obtain the essential information needed by this flow technique.
Namely:
- MSCP STATUS/EVENT code
- Master drive error code (if applicable)
- LBNs (if applicable)
- Date/time of errors

This chapter outlines the data collection resources for a variety of DSA system configurations. For a specific site configuration, not all of these resources will apply. Use those that are applicable. The resources noted here are listed in "preferred priority" order to maximize system availability to the customer. A "symptom-directed" data collection process prescribes the "best resources" first and other resources later.

Figure 3: Flow - Data Collection (Start) (CXO-2389A)
Flowchart labels, in order: Customer input (see Section 2.2.1); Data collection (see Section 2.2.2); Data collection (see Section 2.2.3); Replace FRU as per device service manual; Proceed to Chapter 5, Verification.

Service personnel should use one or more of the data collection mechanisms before continuing the analysis of the error types discussed in Chapter 3.

2.2.1 Customer Input
Discuss the problem the operator experienced and how often the problem appeared during system operation. Operators/users may provide valuable information concerning system activity at the time of the errors (such as applications that were running, affected users, impact on other applications, and so on).

NOTE
Obtain any error messages from the user terminal.

2.2.2 Previous Call History
The site guide may have valuable information that should be considered before you proceed with the service call. Look over what problems the site has been experiencing recently. There may be indications of a repeat call or an intermittent problem. It is possible that you have previously analyzed an intermittent problem and selected one of the multiple recommended Field Replaceable Units (FRUs) that could contribute to the current symptom (as previously documented).
If this problem is a repeat of the same symptom, select the next suggested FRU for replacement.

2.2.3 Visual Symptoms
Hard/repeat faults displayed on the drive Operator Control Panel (OCP) may directly correlate to an FRU. Hard/repeat faults may lead to directly analyzing the drive internal error log, if available. Hard/repeat faults include fault lights, error Light Emitting Diodes (LEDs), noise, smoke, mechanical failures, power problems, drive spin up/down problems, and internal diagnostic failures. Refer to the appropriate device service manual for details on the error code and, if applicable, for the FRU.

Figure 4: Flow - Data Collection (2A) (CXO-2390A)
Flowchart labels, in order: Interpret theory number (see Section 2.2.5); Replace FRU as per VAXsimPLUS callouts; Data collection (see Section 2.2.6); Proceed to Chapter 5, Verification; Data collection (see Section 2.2.7); Data collection (see Section 2.2.8).

2.2.4 VAXsimPLUS
If such tools as VAXsimPLUS or SPEAR are available, a quick look at the summary report containing evidence information may lead to direct identification of the failing/faulty device.

2.2.5 Theory Number
By analyzing the errors recorded by the VMS system error logger, VAXsimPLUS can determine which FRU(s) may need replacement, based on the type and frequency of the errors. VAXsimPLUS identifies the failure with a theory number, which can be cross-referenced to a particular FRU. Call the CSC to cross-reference a theory number to the FRU. Use the following numbers for assistance:

Installation and Usage
- PL01 = CSC/AT (Atlanta) 1-800-241-2546
- PL31 = CSC/CX (Colorado) 1-800-525-6570

Europe and GIA areas
- The telephone numbers are area dependent. Check with your support center for the correct telephone number.

2.2.6 HSC Console
If the subsystem is HSC based, check the HSC console log. The HSC console log may indicate which drive has a problem.
If the user can say when a disk subsystem error occurred during a user operation, correlating that time with the hardcopy HSC console trail makes it easier to find the failure. This assumes that the HSC time is the same as that of the user-settable system clock.

NOTE
Refer to Appendix A for specific "drive-detected error" examples.

2.2.7 Host Error Log
Study available host error logs. Error summaries provided by some error log utilities enable you to identify the suspect drives. It is useful to correlate user-provided information around the time of the errors to the time stamps in the error log. For a complete description of the host error log messages, refer to the DSA Error Log Reference Manual (EK-DSAEL-MN-002).

2.2.8 Device Internal Error Log
Device internal error logs and/or error silos contain useful information occasionally not found in sources previously discussed. They may also contain misleading information. For example:
- Error information from a previous or unrelated problem
- Status information not related to the immediate problem

Internal error logs accumulate data over a period of time. Therefore, select the information that appears appropriate for the current problem. The device internal error logs can be dumped by using DKUTIL, NAKDAx, or EVRLL/ZUDMxx.

NOTE
Using the NAKDAx or EVRLL/ZUDM standalone diagnostics will remove system availability from the users.

After you have collected all available data, proceed to Chapter 3, Data Analysis and Repair Actions, page 13.

Figure 5: Flow Map - Data Analysis and Repair Actions (CXO-2391A)

CHAPTER 3 DATA ANALYSIS AND REPAIR ACTIONS

3.1 Overview
This chapter integrates two primary areas of troubleshooting into one.
1. Analyzing the data you collected in Chapter 2 (Data Collection) to derive a specific symptom.
2. Suggesting what remedial action could be performed to resolve the problem.

The symptom analysis is based on two key areas. The first area (most preferred) presumes you have collected status/event codes.
1. Scan through the flow until you find a status/event code decision diamond that matches one or more of the codes you isolated using data collection.
2. Then follow the flows and notes for the specific status/event code to further isolate the problem and select appropriate repair actions as described in this section. Logical repair (non-physical) is covered in Chapter 4 of this document.

The most commonly encountered status/event codes are provided. If you encounter status/event codes not discussed in the document, consult your nearest support engineer or remote support organization.

The second area of analysis provides further information for troubleshooting symptoms that often occur with no corresponding status/event codes logged in the host/controller error logs. These decision diamonds "follow" the strings of status/event code flows. Only the most frequently encountered symptoms are listed.

In short, if data collection provides one or more status/event codes, then use the status/event flow for troubleshooting. If symptoms occur without status/event codes in the error log, then skip the status/event code decision flows and proceed directly to Section 3.1.12 on page 27.

The status/event code flow is a "prioritized" flow. If you have gathered "multiple" status/event codes from the data collection process, select the status/event code decision diamonds that match your codes and appear as near the beginning of the flow as possible. This will allow you to troubleshoot and usually resolve the "root" problems first, which are often responsible for the occurrence of "other" symptoms and status/event codes reported.
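The "prioritized" selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FLOW_ORDER list holds the hex codes as they appear to read in Figures 6 through 8 (later flow pages are not covered), and the function names are hypothetical, not part of any DSA utility.

```python
# Hypothetical sketch: order observed status/event codes by their position
# in the main analysis flow, so the earliest diamond is worked first.
# FLOW_ORDER is an assumption transcribed from Figures 6-8 only.
FLOW_ORDER = [
    # Figure 6
    "EB", "4B", "10B", "14B", "1AB",
    "2A", "6A", "8A", "10A", "12A", "14A", "16A",
    # Figure 7
    "25", "4A",
    "45", "48", "65", "68", "88", "E5", "E8",
    "128", "148", "168", "188", "1A8", "1C8", "1E8",
    "6B",
    # Figure 8
    "05", "08", "14", "34",
]

# Map each numeric code value to its position in the flow.
_RANK = {int(code, 16): index for index, code in enumerate(FLOW_ORDER)}


def to_octal(hex_code):
    """Return the octal form of a hex status/event code, e.g. 'EB' -> '353'."""
    return f"{int(hex_code, 16):03o}"


def prioritize(observed):
    """Return the distinct observed hex codes in flow order.

    Codes not present in FLOW_ORDER are placed last, in input order.
    """
    seen = set()
    values = []
    for code in observed:
        value = int(code, 16)
        if value not in seen:
            seen.add(value)
            values.append(value)
    # list.sort is stable, so unknown codes keep their input order.
    values.sort(key=lambda v: _RANK.get(v, len(FLOW_ORDER)))
    return [f"{v:02X}" for v in values]
```

For example, `prioritize(["6B", "14", "EB", "4a"])` places EB first, which matches the guidance to work the diamond nearest the beginning of the flow before the others.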
Finally, a few status/event code decisions involve more complex flows for subsequent analysis. These areas include drive-detected errors, communication errors, data errors, and so on. Flows for these topics are further described in other areas of this document. These flows are kept separate so as not to clutter or cause confusion with the main status/event code analysis flow.

After this chapter, you can continue with the logical recovery step (Chapter 4) or proceed to the verification step (Chapter 5). This will depend on the decision flows you followed during the data analysis and repair actions steps.

Figure 6: Flow - Data Analysis and Repair (3) (CXO-2392A)
Begin the data analysis and repair process on status/event codes. For your convenience, these event codes are documented in both hex and octal.
- Hex EB (octal 353): refer to Section 3.1, Drive-Detected Errors.
- Hex 4B, 10B, 14B (octal 113, 413, 513): refer to Section 3.2, Communication Errors.
- Hex 1AB (octal 653): see Section 3.1.1.
- Hex 2A, 6A, 8A, 10A, 12A, 14A, 16A (octal 052, 152, 212, 412, 452, 512, 552): resolve controller problem (see Note 10), then proceed to Chapter 5, Verification.

3.1.1 Excessive Receiver-Ready Collisions (1AB)
These are recoverable events that do not result in data loss unless other SDI bus errors occur at the same time. This event usually occurs when the drive attempts to raise RECEIVER READY (RTDS status bit), indicating the drive is ready to receive a command from the controller, while the controller had previously raised RECEIVER READY (RTCS status bit), indicating the controller was ready to receive a drive response. This is not an error, but an event within the subsystem. All DSA drives and controllers will occasionally log this event. There is no performance impact associated with occasional occurrence of this event, since the normal recovery time from this event is very fast.
Data integrity is assured and no data corruption is associated with the occurrence of this event, provided there are no other SDI bus errors at the same time. Testing of HSC software Version 370 has indicated a noticeable reduction in the normal occurrence of receiver-ready collision events. Ensuring that all drives and controllers are at the latest hardware and software revision levels will help reduce receiver-ready collisions.

A receiver-ready collision rate of one or two events a week per drive to one or two events a day per drive is usually acceptable for most sites. This presumes that:
- Physical SDI interconnects are not being broken (plugging and unplugging SDI cables, worn connectors, and so on).
- Controller initialization is not occurring.
- Controller failover operations are not occurring.

If you encounter excessive receiver-ready collisions, proceed to Section 3.3, Communication Errors, page 39.

NOTE
There are many possible causes for this event. Receiver-ready collisions occur primarily when both A and B ports are enabled at the drive.

3.1.2 Hard Controller Errors
1. Refer to the DSA Error Log Reference Manual (EK-DSAEL-MN-002) for a complete description of the status/event code.
2. Refer to the specific controller service manual to identify the appropriate list of FRUs for replacement, based on the specific description of the status/event code symptoms.
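The acceptable-rate guideline for receiver-ready collisions in Section 3.1.1 can be turned into a quick check once the 1AB events have been pulled from an error log. The sketch below is a rough aid, not a diagnostic tool: the (drive, code) input format, the function name, and the default threshold of 2 events per day per drive (the upper end of the range quoted above) are all assumptions chosen for illustration.

```python
from collections import Counter


def excessive_collision_drives(events, window_days, max_per_day=2.0):
    """Return {drive: events_per_day} for drives above the guideline.

    events      -- iterable of (drive_name, hex_status_event_code) pairs
                   already extracted from an error log (hypothetical format)
    window_days -- length of the observation period in days
    max_per_day -- assumed ceiling, taken from "one or two events a day"
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    # Count only receiver-ready collision events (status/event code 1AB).
    counts = Counter(
        drive for drive, code in events if int(code, 16) == 0x1AB
    )
    rates = {drive: n / window_days for drive, n in counts.items()}
    return {drive: rate for drive, rate in rates.items() if rate > max_per_day}
```

A drive flagged here is only a candidate for the communication error flow. The guideline presumes that no cables were disturbed and that no controller initialization or failover took place during the observation window, so rule those out before acting on the result.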
Figure 7: Flow - Data Analysis and Repair (3A) (CXO-2393A)
- Hex 25, 4A (octal 045, 112): EDC errors (see Section 3.1.3), then proceed to Chapter 4, Logical Recovery.
- Hex 45, 48, 65, 68, 88, E5, E8, 128, 148, 168, 188, 1A8, 1C8, 1E8 (octal 105, 110, 145, 150, 210, 345, 350, 450, 510, 550, 610, 650, 710, 750): refer to Section 3.4, Data Errors.
- Hex 6B (octal 153): refer to Section 3.4, Data Errors; if only 6B errors are logged, see Section 3.1.5 and proceed to Chapter 5, Verification.

3.1.3 EDC Errors
Error Detection Code (EDC) is a data-protection mechanism to ensure data integrity of the controller internal data path. In contrast, the ECC mechanism ensures data integrity from the controller through the drive, to the media, and back again. ECC provides error detection and correction for both the data field and the EDC field within a block of data read from the disk.

It is important to note the differences in how controllers implement the EDC mechanism. For the KDA/KDB/UDA family of controllers, EDC is generated on a sector of data at the CPU bus interface as the data is initially read from host memory. It is verified on a sector basis as the data is written to host memory from the internal controller memory. Therefore, with the xDA/xDB controllers, EDC is generated and checked at this CPU bus interface within the controller by the microcode engine of the controller.

For HSC controllers, the EDC is generated on a sector of data at the K.pli port processor module as the data streams in from host memory over the CI bus. The EDC then becomes an integral part of the user data as the data is transferred to the HSC data memory. As this data is read out of the HSC data memory by the K.sdi modules and transmitted to the drive, the EDC for the user data is regenerated in the K.sdi and compared to the EDC characters appended to the data by the K.pli module.
The EDC characters must match or the write transfer to the disk will be aborted. The HSC re-requests the data from host memory and requeues the write transfer to the disk when data is again available in the HSC data memory. If the EDC verifies correctly at the K.sdi on a write to disk, the EDC and ECC codes are appended to the data stream and written to the disk, with the ECC mechanism ensuring data integrity of the customer data and the EDC code.

Note also that EDC errors can be caused by the host if an incomplete data transfer results. As an example, the controller may prematurely terminate a write transfer due to a host initialization, CPU crash, or reboot during the transfer.

For a read from the disk, the data as it is read by the K.sdi (over the SDI read/response line) is checked for good ECC, then the data plus EDC characters are stored in HSC data memory. As the data is sent to host memory, the K.pli verifies that good EDC exists for the customer data block but does not transfer EDC characters to host memory. If the EDC is bad, the K.pli informs the HSC functional code to re-request the same data from the disk.

If EDC errors are detected without ECC errors, the problem is in the controller. This is because the ECC is protecting the data to and from the disk and checking the integrity of the data at the SDI port module logic.

NOTE
A properly functioning controller always reports bad EDC written to disks. If bad EDC is written to a disk (improperly functioning controller), each time the block containing bad EDC is read, EDC errors are logged against the drive. Only after the data is restored or rewritten to the disk with good EDC by a good controller will the errors be resolved.

First resolve the problem in the controller that caused the EDC errors. Then proceed to Chapter 4, Logical Recovery, page 53, to resolve the EDC errors that were written to the disk.
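The HSC write path described in this section can be pictured as a generate-then-verify handoff inside the controller. The Python sketch below is illustrative only: it uses CRC-32 from the standard library as a stand-in check code (this guide does not specify the actual EDC algorithm), and the function names and the corrupt_in_controller flag are hypothetical.

```python
import zlib


def generate_edc(sector: bytes) -> int:
    # Stand-in check code; the real EDC algorithm is not given in this guide.
    return zlib.crc32(sector)


def hsc_write_path(sector: bytes, corrupt_in_controller: bool = False) -> str:
    """Model one sector moving from the host-side port to the drive-side port."""
    if not sector:
        raise ValueError("sector must not be empty")
    # Host-side port (K.pli role): EDC is generated as data arrives.
    edc = generate_edc(sector)
    # Data plus EDC now sit in controller data memory.
    buffered = bytearray(sector)
    if corrupt_in_controller:
        buffered[0] ^= 0x01  # simulate a fault in the internal data path
    # Drive-side port (K.sdi role): EDC is regenerated and compared.
    if generate_edc(bytes(buffered)) != edc:
        return "abort and requeue"
    return "write to disk"
```

A mismatch here never involves the disk or the SDI bus, which is why EDC errors without accompanying ECC errors point at the controller.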
3.1.4 Few 6B Errors

If either of the following symptoms exists, treat these errors as data errors and proceed to Section 3.4, Data Errors, page 43:

1. Only a few (for example, fewer than 10 per day) error log entries contain status/event codes of 6B.
2. Any number of 6Bs are logged with other data transfer-related errors (ECC, header, and so on).

3.1.5 Many 6B Errors

If many different blocks log only 6B errors (no other status/event codes), try reformatting the media. If other data errors are included with the 6B errors, proceed to Section 3.4, Data Errors, page 43.

NOTE
If you have previously reformatted the media for this problem and the 6Bs persist, replace the media (HDA or pack).

Figure 8: Flow - Data Analysis and Repair (3B)
[Flowchart CXO-2394A. Status/event codes hex 05, 08 (octal 05, 010): forced errors, see Section 3.1.6. If other errors are occurring, return to data collection to resolve other errors first. After all hardware errors are resolved, be sure to perform the steps in Chapter 4, Logical Recovery, associated with forced errors. Status/event codes hex 14, 34 (octal 024, 064): BBR messages, see Section 3.1.7; refer to Section 3.4, Data Errors.]

3.1.6 Forced Errors

The FORCED ERROR flag is an indicator used to inform the host that corrupted data is "correctly written" into a sector.

A forced error in a system file or executable file (.EXE) will cause problems not easily isolated (for example, system crash, system hang, and so on). Any file with a forced error that is read at a user terminal will result in a VMS message. This may or may not result in additional entries into the error log.
VMS produces the following error message when a forced error is in a block of a file that the user attempts to TYPE:

$ TYPE FILE.DAT
%TYPE-W-READERR, error reading $5$DUA230:[LOCAL.TEST]FILE.DAT;2
-RMS-F-RER, file read error
-SYSTEM-F-FORCEDERROR, forced error flagged in last sector read

When an uncorrectable ECC error is encountered in a block, several attempts are made to read and/or correct the data. If those attempts fail, the block causing the uncorrectable ECC error is assumed to be bad and becomes a candidate for replacement. During the replacement process (BBR), the bad block is read again (including retries) in an attempt to extract the data for relocation to a replacement block. If the data is STILL uncorrectable, the BBR process writes "best guess" data into the replacement sector. The result is invalid data being correctly written to a good block. To inform the user that the data was at one time uncorrectable, the forced error flag is attached to the block.

It is the responsibility of the user to take the necessary steps to correct or replace the data and "clear" the forced error indicator. The only assured way to correct a forced error is to replace the file containing the affected LBN from a known good backup copy.

Forced errors are the result of another error or problem. Use previous history or further analysis of the data available to determine if the hardware problem that created the forced errors has been resolved.

A status/event code indicating a forced error is associated with status/event code 14 or 34 (BBR status messages).

- Status/event code (hex) 05 indicates a forced error in the RCT area.
- Status/event code (hex) 08 indicates a forced error in the HOST area.

NOTE
Correct all known hardware problems first, then proceed to Chapter 4, Logical Recovery, page 53.

3.1.7 Bad Block Replacement Messages

These are NOT errors, but information entries reflecting the status of the Bad Block Replacement (BBR) process.
A block read causes the data error to invoke BBR (set the BBR flag). The block is tested. The block either passes the test (possible transient error), which is status/event code 34, or the block fails the test, which is status/event code 14. Entries with the same LBNs can be associated with the BBR messages. Identify all entries with the same LBN to get the total picture. Note that the error log entries may be out of sequence.

Figure 9: Flow - Data Analysis and Repair (3C)
[Flowchart CXO-2395A. Status/event codes shown, hex (octal): 07 (07), 45 (105), 105 (405), 125 (445), 400 (2000), 54 (124), 74 (164), 94 (224), B4 (264), D4 (324), C5 (305). Branches: isolate and repair any R/W data path problems and proceed to Section 3.6, Read/Write Data Path Verification; compare error actions, see Section 3.1.8. Hex 2B, 8B, AB, CB, 16B, 18B (octal 053, 213, 253, 313, 553, 613): induced errors with other errors, see Section 3.1.9; induced errors without other errors, see Section 3.1.10. Then proceed to Chapter 5, Verification.]

3.1.8 Compare Errors

A compare operation takes the data buffer in host memory as the result of the last I/O operation (read or write) and compares it to the data that is on the device. This compare is done in the controller.

NOTE
An error results if hardware or software modifies host memory before the compare operation is complete.

Typically, any errors detected by the drive will result in other status/event codes or drive errors. Troubleshoot those errors first. The elements involved in this operation are controller memory, host memory, and the path between the controller and host memory.

3.1.9 Induced Errors with Other Errors

These status/event codes are often a result of:

- Drive-detected errors (status/event code = EB)
- Drive failures (hard faults)
- Controller failures
- SDI hardware problems

If other errors exist, troubleshoot those first.
They will often isolate the root of the problem. For example, an error with a status/event code of EB (drive-detected error) is normally logged by the host AFTER the system has logged any number of induced errors.

3.1.10 Induced Errors Without Other Errors

Quite often one error in the DSA architecture may cause other errors to get reported to the error log. These "induced" errors are normally not the basis for primary troubleshooting. In the event that induced errors are the only available symptoms in the error log, use the following information for troubleshooting.

3.1.10.1 Controller Detected: Drive Command Timeout (Status/Event Code: 2B)

The controller timed out while waiting for the drive to complete an operation. If no other errors are associated with those errors, the problem may be due to a seek timeout. The drive internal error log or error silo may provide additional information. Refer to Section 3.1.11, Suggestions for Troubleshooting Induced Errors, page 25.

3.1.10.2 Controller Detected: Loss of Read/Write Ready (Status/Event Code: 8B)

This error indicates R/W Ready (RTDS status bit) was negated when:

1. The controller attempted to initiate a transfer, or
2. R/W Ready was found negated at the completion of a transfer and R/W Ready had been previously asserted (indicating completion of a preceding seek).

This usually results from a drive-related error. The drive internal error log or silo may provide additional information. Refer to Section 3.1.11, Suggestions for Troubleshooting Induced Errors, page 25.

3.1.10.3 Controller Detected: Drive Clock Dropout (Status/Event Code: AB)

Either data (Read/Response line) or the state clock (RTDS) was missing when it should have been present. This is usually detected through a timeout. A fatal drive condition can cause the drive to drop the drive clocks.
The drive should reassert clocks after performing a drive INIT and establishing clocks to the controller to reestablish communications and state information between the drive and controller. The sequence of getting status and error information then occurs. Analysis of error log message packets usually indicates that the above sequence has occurred. If such message packets are not being processed or received, the condition might not be detected by the drive.

Possible causes for this problem include:

- Drive SDI logic and microprocessor logic
- Controller port module
- SDI bus (cables, connectors, and so on)

3.1.10.4 Controller Detected: Lost Receiver Ready (Status/Event Code: CB)

Receiver Ready (RTDS status bit) was negated when the controller attempted to initiate a transfer, or Receiver Ready was not asserted at the completion of a transfer. This includes all cases of the controller timeout expiring for a transfer operation (level-1 real-time command). As a consequence of this condition, the controller performs an SDI INIT and then attempts to request a GET STATUS. The extended status error log entry returned in the GET STATUS command may indicate what the problem is. If no information is being reported by the drive as a part of the error log sequence, approach the problem as a drive transmission-related error and proceed to Section 3.3, Communication Errors, page 39.

3.1.10.5 Controller Detected: Drive Failed Initialization (Status/Event Code: 16B)

The drive clock failed to resume following a controller-attempted drive initialization. This implies the drive encountered a fatal initialization error. It also can indicate the drive was attempting its own initialization or that the drive was looping in an initialization state or routine.
3.1.10.6 Controller Detected: Drive Ignored Initialization (Status/Event Code: 18B)

The drive clock continued running even though the controller attempted to perform a drive initialization. This implies the drive did not recognize the INIT command from the controller. It may also indicate the drive was performing an initialization caused by a drive-detected condition and, in the course of initialization, ignored the controller's attempt to initialize the drive.

3.1.11 Suggestions for Troubleshooting Induced Errors

If any of the previous (2B, 8B, AB, CB, 16B, 18B) errors are the ONLY errors, consider one or more of the following troubleshooting tips:

- Try to isolate the problem to a specific controller, drive, or SDI connection. Refer to Section 3.5, Device Isolation, page 48.
- Review the site history of previous service calls for any information that may allow further isolation of this problem (or other problems that might relate to this error).
- Verify all SDI cable connections.
- Investigate your system configuration. Could it be contributing to the problem?
- If these problems are common to one drive:
  1. Obtain any drive internal error log information and resolve any errors by using the appropriate drive service manual.
  2. Verify drive operation by using the drive internal diagnostics.
  3. Troubleshoot the drive SDI logic or microprocessor logic first.
- Contact the appropriate support resources for additional information or help.

Figure 10: Flow - Data Analysis and Repair (3D)
[Flowchart CXO-2396A. Mount verification messages: see Section 3.1.12.]

3.1.12 VMS Mount Verification Messages

Mount verification or mount verification timeout messages may or may not be the result of a hardware problem. The following are some of the reasons for mount verification messages:

1. The mount verification feature of Files-11 disk handling generally leaves users unaware that a mounted disk has gone offline and returned online.
2. A mounted disk has become unreachable and then been restored.

Mount verification is the default parameter for EXE$MOUNTVER.

3.1.12.1 Exceptions

Disks mounted /FOREIGN and disks mounted /NOMOUNTVERIFICATION do not undergo mount verification, except during cluster state transitions. Dual-ported drives through HSCs should never be mounted using the /NOMOUNTVERIFICATION modifier, because it may prevent VMS from failing the drive over to the secondary HSC.

EXE$MOUNTVER sends status messages to OPCOM. Because there are cases when mount verification messages are needed at the operator console and OPCOM might not be able to provide them, mount verification also sends special messages with the prefix %SYSTEM-I-MOUNTVER to the operator console, OPA0.

3.1.12.2 VMS Problems of "Why a Drive Mount Verifies"

VMS calls EXE$MOUNTVER if a drive loses contact with the system (for example, the controller sends a command to the drive but does not get a successful response back within the controller-specific timeout period). It is a process to verify that the disk with which VMS reestablished contact is the same disk to which VMS was originally connected. Sending the drive to mount verify state involves:

1. Host initiating an MSCP ONLINE command to the drive modifier followed by a GET UNIT STATUS (GUS).
2. Reading the home block and comparing the volume information (serial number, name, and so on) of the drive before VMS lost contact with it and after VMS reestablishes contact with the drive during mount verification.

The sequence is repeated until success or timeout. During the sequence, the drive port light is on and the ready light blinks slowly as the controller accesses the LBN block and the RCT for the media ID, effectively doing full-stroke seeks.
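The repeat-until-success-or-timeout sequence described above can be sketched as a simple retry loop. This Python model is not VMS code: the function name, the callable that stands in for the ONLINE, GET UNIT STATUS, and home block read steps, and the retry interval are all assumptions made for illustration. The timeout argument plays the role of the mount verification timeout.

```python
import time


def mount_verify(read_volume_identity, expected_identity,
                 timeout_s=600.0, retry_interval_s=1.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Retry until the volume that answers matches the one originally mounted.

    read_volume_identity -- hypothetical callable standing in for ONLINE,
                            GET UNIT STATUS, and the home block read; it
                            returns None while the drive is unreachable.
    expected_identity    -- volume information recorded before contact was lost.
    Returns True on a successful match, False when the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        identity = read_volume_identity()
        if identity is not None and identity == expected_identity:
            return True            # same disk is back: exit mount verification
        if clock() >= deadline:
            return False           # timed out: pending and future I/O fails
        sleep(retry_interval_s)
```

Note that a drive answering with different volume information is treated the same as an unreachable drive, so the loop keeps retrying until the timeout expires.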
The MVTIMEOUT system parameter defines the time (in seconds) that is allowed for a pending mount verification to complete before it is aborted. This dynamic parameter should be set to a reasonable value for the typical operations at the site.

NOTE
Do not use values less than the recommended default of 600 seconds (10 minutes).

After a mount verification times out, the pending and future I/O requests to the volume fail. You may try to execute the DISMOUNT/ABORT command, which allows a subsequent mount to be successful if the MV timer has previously expired. In some extreme cases, drive failures may require the reboot of the controller or the system.

Entry to and exit from mount verify are time stamped. VAXcluster time stamps may vary across the various cluster nodes due to differences in the Time of Year (TOY) clocks and the initial clock times. Slight variations in time stamps do not indicate multiple drive or controller failures causing mount verification, but rather one drive or controller failure causing every node to enter mount verification at its own locally specified time.

The following reasons may explain why a drive enters mount verification:

MANUAL intervention, such as:

- Changing the state of the switches on the OCP of the disk
- Loose cables
- Accidental release of the port buttons
- Changing volumes
- Accidental spin down of the drive
- Changing the unit number
- Loss of power, tripping the circuit breaker

Those reasons result in mount verification messages and are to be expected. They may also result in mount verification timeout messages if the timeout period is exceeded.

SYSTEM level causes, such as:

- Planned failovers
- Mount verification timeout is incorrectly set (default is 600 seconds; less than 120 seconds is not appropriate).
- Cluster state transition messages for drives mounted to the cluster.
However, mount verification timeout messages might indicate a problem with the host, controller, or drive.

CONTROLLER level causes, such as:

- Controller failures, including an HSC crash
- Unplanned failovers
- A LAST-fail packet from an xDA/DB controller occurred shortly after the mount verification, meaning the controller faulted/initialized as well.

Those reasons result in mount verification messages associated with multiple drives. They could also result in mount verification timeout messages.

3.1.12.3 Actions

Locate the source of the controller symptoms. Refer to Data Collection Techniques in Chapter 2 to obtain the status/event codes, fault codes, and so on. Analyze and troubleshoot those errors according to Chapter 3. Refer to Section 3.5, Device Isolation, page 48, as appropriate.

By noting the time duration of the mount verification and other circumstances surrounding the mount verify status, you can determine some valuable troubleshooting information. Ask yourself the following questions:

- How long did the mount verify take?
  - Less than MVTIMEOUT and the drive eventually succeeded.
  - A few seconds, implying a glitch or a recoverable fault. Did it appear on another controller after the mount verification? If so, it could be a port-related problem.
  - Thirty seconds to a minute to remount probably means the drive was spun down and had to be spun back up. Was this due to a drive fault? Did the drive run its spinup diagnostics error free?
  - Infinite time probably means that along with the drive disappearing, it also changed its media_id, is a different drive, continually failed its spinup diagnostics, or contained a hard fault.
- Were the mount verification messages associated with specific drives? VMS does not log errors during the mount verify process, although it may log some before or after, depending on how the drive failed.
- Were any errors logged to the host or HSC console log before or after the mount verify?
- Do any drives that are nonexistent appear, characterizing a unit select problem?
- Were all drives that failed on the same controller, K.sdi module, or controller port?
- Does the error always appear on the same physical drive?
- Did the drive see a fault during this period? (Examine the drive internal error log for error information.)

Mount verification messages are often associated with a single drive. Also, mount verification timeout may occur.

DRIVE level causes:

- Intermittent OCP hardware
- SDI cables and connections
- Drive faults
- Drive communication (SDI) problems

Actions: Locate the cause of the drive problem. Use the data collection techniques to obtain any status/event codes, LED codes, drive internally logged information, and so on. Analyze and troubleshoot any errors found, referring to Section 3.5, Device Isolation, page 48, as appropriate.

Figure 11: Flow - Data Analysis and Repair (3E)
[Flowchart CXO-2397A. Performance issues: see Section 3.1.13.]

3.1.13 Performance Issues when No Errors Are Logged

Customer complaints of disk performance can take a fair amount of analysis. Often the performance complaints are quite subjective. Following is a list of questions that may help analyze performance complaints.

1. QUESTION: Do the performance issues relate to all or most of the disks?
ANSWER: If most or all the disks are affected, ensure that the system parameters meet the suggested guidelines. Cluster size of disks, working set size parameters, paging parameters, and ACP/XQP-related parameters can all affect performance.
2. QUESTION: Do performance problems occur during image activation (when a large application program is initially started)?
ANSWER: Many layered products require some time to fully activate. This is not a disk problem.
3. QUESTION: Is the performance problem noticed by users of the same image, layered product, or file on the (same) disk?
ANSWER: If the disk is attached to a local controller (UDA/KDA/KDB) but is a VAX node member in a cluster where there is at least one HSC in the cluster, then request that the file/image/layered software product be moved to a disk on the HSC. Local serving of disks creates bus, VAX, and I/O overhead, which impacts performance.
4. QUESTION: Is the performance problem noticed by users of a file/image/layered product that resides on the same disk as the swap and page files?
ANSWER: If so, request the system manager to monitor paging and swapping activity. High page/swap rates decrease VMS response and create an I/O bottleneck for the page/swap disk. Request the file/image/layered product be moved to another disk.

In addition to setting system parameters, this area of the architecture (hardware related) can contribute to loss of performance. These include nonprimary replacements in a critical file or directory structure. Examples include:

- Nonprimary replacement in VMS disk [000000]INDEXF.SYS.
- Nonprimary replacement in a directory file that is frequently used.

NOTE
VMS uses virtual block file structures, not logical blocks. Virtual Block Numbers (VBNs) do not correlate to LBNs. To correlate an LBN to the affected file, you should understand the operating system file structure, such as VMS ODS-2, or contact support personnel. It is a very complicated procedure to identify affected files within ODS-2.

The two examples are files that may affect the perceived performance of a disk. However, the location of a block of data within a file and how the operating system is set up has an equal effect on nonprimary replacement, which in turn impacts system or disk drive performance.
A nonprimary replaced block in the INDEXF.SYS of a disk could be very significant if it is in the front of the file. However, if it is the last block within the file, it might not have as large an impact on system performance.

A nonprimary replacement in a block within SYS.EXE that is loaded once by VMS into memory (at startup) and stays resident in memory has no effect on performance. However, if the block is within a portion of SYS.EXE that is frequently brought in by VMS, it could impact performance. A solution is for the system manager to increase the VMS working set size.

A block within the swap or paging file that is nonprimary replaced generally does not have much impact. If the system is doing enough paging and swapping to notice the occurrence of nonprimary replacements, the real problem may be with the user or system working set size. Have the system manager adjust system parameters around paging and swapping and see if performance improves.

Figure 12: Flow - Data Analysis and Repair (3F)
[Flowchart CXO-2398A. Invalid media format: see Section 3.1.14. Unknown spindown with other errors: see Section 3.1.15. Unknown spindown without other errors: see Section 3.1.16. Lost: see Section 3.1.17.]

3.1.14 Invalid Media Format

This message indicates the detection of a corrupted Format Control Table (FCT) on the media while trying to complete a mount operation. The media will most likely need to be reformatted to resolve this condition. First, find the cause of the corruption. Also, consider taking some actions to ensure that a hardware problem does not exist. For example:

- Look for other possible causes, using the data collection techniques. Troubleshoot any hardware errors found first.
- Verify the hardware data path, using EVRLF, EVRLG, ZUDH, ZUDI, NAKDAx, ILDISK, or drive internal R/W diagnostics.
- The problem may have been due to an incomplete format operation. In this case, a reformat will be the only resolution required.
- Contact the appropriate support resources if additional assistance is needed.

3.1.15 Unknown Spin Downs with Other Errors

If other errors are present, troubleshoot those first. Go back to the beginning of this flowchart if needed.

3.1.16 Unknown Spin Downs Without Other Errors

Verify that no other errors or fault indications exist. The following are possible causes for unknown spin downs:

- Duplicate unit numbers
- Various host software programs or operating system commands (such as in VMS DISMOUNT without the /NOUNLOAD modifier)
- Invalid Media Format, see Section 3.1.14
- Power

NOTE
Most RAxx drives will NOT spin back up after a power failure until the host software initiates the drive or the user toggles the run switch.

3.1.17 Lost

If the problem appears obscure, if too much time has been spent trying to isolate this problem, or if you have insufficient information from the data collection process, use the support resources available. Digital Field Service should operate with/by the guidelines of MAP within the respective areas.

Figure 13: Flow - Drive-Detected Errors (3.1)
[Flowchart CXO-2399A. For the RA60, the primary error byte is byte 15 or the high byte of word 27; VMS calls this the PANEL code. For other RAxx drives, the primary error byte is byte 14 or the low byte of word 27; VMS calls this the LED code. Refer to Appendix A for examples. Communication error LED codes: see Section 3.2.1 and refer to Section 3.3, Communication Errors. Multiple LED codes: see Section 3.2.2. Otherwise, identify the FRU (see Section 3.2.3), then proceed to Chapter 5, Verification.]

3.2 Drive-Detected Errors

3.2.1 Communication Error LED Codes

Communication errors most often cause other errors in the subsystem. Resolve communication errors first. Then continue troubleshooting any errors that are still occurring.

STATUS/EVENT codes of EB with LED codes of:

In the RA60: 9C, A2, A3, A4, A5, A6
In the RA80: 07, 08, 09, 0A, 0B, 0C, 1F, 20, 21, 22
In the RA81: 07, 08, 09, 0A, 0B, 0C, 1F, 20, 21, 22, 41
In the RA82: 07, 08, 09, 0A, 0B, 0C, 1F, 41, 4F
In the RA70: 07, 08, 09, 0A, 0B, 0C, 0E, 17, 18, 1F, 41, 43, 44, 4F, ED
In the RA90: 07, 08, 09, 0A, 0B, 0C, 0D, 0E, 10, 16, 17, 18, 19, 1A, 1F, 20, 21, 2A, 2B, 2C, 42

3.2.2 Troubleshooting Multiple LED Codes

When troubleshooting a device produces multiple LED codes, consider the following rules:

- Select error codes that occur during internal drive diagnostics. These should be given top priority.
- Select codes with the least FRU callout.
- Select codes occurring most often.
- Select codes with FRU(s) in common to most all the codes provided.

If in doubt, refer to the device service manual for details.

3.2.3 Identify FRU to Replace

As determined by the error codes received, identify the FRU for replacement. Refer to the device service manual for the correct procedures to replace and verify the identified FRU. After FRU replacement, proceed to Chapter 5, Verification, page 57.
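The per-drive communication-error LED code lists in Section 3.2.1 lend themselves to a lookup table. The Python sketch below transcribes those lists as read from this guide (verify them against the drive service manual before relying on them); the table name and function name are hypothetical.

```python
# Communication-error LED codes per drive type, transcribed from Section 3.2.1.
COMM_ERROR_LED_CODES = {
    "RA60": {0x9C, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6},
    "RA80": {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x1F, 0x20, 0x21, 0x22},
    "RA81": {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x1F, 0x20, 0x21, 0x22, 0x41},
    "RA82": {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x1F, 0x41, 0x4F},
    "RA70": {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0E, 0x17, 0x18, 0x1F,
             0x41, 0x43, 0x44, 0x4F, 0xED},
    "RA90": {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x10, 0x16,
             0x17, 0x18, 0x19, 0x1A, 0x1F, 0x20, 0x21, 0x2A, 0x2B, 0x2C, 0x42},
}


def is_communication_led_code(drive_type, led_code):
    """Return True if an EB entry's LED code is a communication-error code."""
    codes = COMM_ERROR_LED_CODES.get(drive_type.upper())
    if codes is None:
        raise ValueError("no LED code list for drive type %r" % drive_type)
    return led_code in codes
```

A True result routes the entry to Section 3.3, Communication Errors; a False result leaves it in the drive-detected error flow of Sections 3.2.2 and 3.2.3.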
Figure 14: Flow - Communication Errors (3.2)
[Flowchart CXO-2400A. Communication error entry: see Section 3.3.1. Use the device isolation techniques in Section 3.5, Device Isolation, to determine if the problem is in one of three areas: disk drive, controller, or SDI bus. Return to this point in the flow when completed.]

3.3 Communication Errors

3.3.1 Communication Errors Entry

You are here because of one of the following conditions:

- Status/event codes of 4B, 10B, 14B
- Excessive status/event codes of 1AB
- Status/event codes of EB with LED codes of:
  In the RA60: 9C, A2, A3, A4, A5, A6
  In the RA80: 07, 08, 09, 0A, 0B, 0C, 1F, 20, 21, 22
  In the RA81: 07, 08, 09, 0A, 0B, 0C, 1F, 20, 21, 22, 41
  In the RA82: 07, 08, 09, 0A, 0B, 0C, 1F, 41, 4F
  In the RA70: 07, 08, 09, 0A, 0B, 0C, 0E, 17, 18, 1F, 41, 43, 44, 4F, ED
  In the RA90: 07, 08, 09, 0A, 0B, 0C, 0D, 0E, 10, 16, 17, 18, 19, 1A, 1F, 20, 21, 2A, 2B, 2C, 42

Figure 15: Flow - Communication Errors (3.2A)
[Flowchart CXO-2401A. Fix LED code: see Section 3.3.2. Drive problem without LED codes: see Section 3.3.3. Controller problem: see Section 3.3.4. SDI problem: see Section 3.3.5. Then proceed to Chapter 4, Logical Recovery.]

3.3.2 Disk Problem with LED Codes

You are here because drive-detected communication errors with drive LED codes indicate a drive problem. Troubleshoot the LED code and replace FRUs as per the drive service manual.

3.3.3 Disk Problem Without LED Codes

You are here because a controller STATUS/EVENT code indicates a drive problem. The problem could be in the following areas:

- Drive module with the SDI interface logic (for instance, RA81 personality).
- Internal SDI cabling and connections.
If STATUS/EVENT codes of 14B, 4B, or excessive 1AB occur, also include the module containing the microcode or microprocessor that controls the SDI functions as part of the FRU selection (for example, in the RA81, this would be the microprocessor). The drive service manual may provide additional information on identifying the FRUs associated with SDI communication logic.

3.3.4 Controller Problem

The problem has been isolated to the controller. The problem could be in one of the following areas:

- Controller module containing the SDI interface logic.
- Internal controller SDI cabling and connections.

The controller service manual may provide additional information on identifying the proper FRU for replacement.

3.3.5 SDI Problem

The problem has been isolated to the external SDI. The problem could be in one of the following areas:

- External SDI cables
- SDI bulkhead/transition connections
- Poor grounding
- Environment

Figure 16: Flow - Data Errors (3.3)
[Flowchart CXO-2402A. Convert LBNs: see Section 3.4.1. Normal: see Section 3.4.3. Manually replace: see Section 3.4.4. Transient: see Section 3.4.4. Then proceed to Chapter 6, Document.]

3.4 Data Errors

3.4.1 Convert LBNs

The conversion of LBNs to the physical drive characteristics allows further isolation of FRUs related to R/W transfer problems. This also allows the identification of media versus nonmedia problems. There are many online tools to assist with conversions. See Appendix B for a list of resource utilities that provide LBN conversions and Appendix C for the conversion formulas.

3.4.2 LBN Replaced

An additional event (BBR message) is logged with a block being tested for replacement:

- Status/event code = 14 (Block replaced)
- Status/event code = 34 (Block not replaced)

NOTE
These are NOT errors but information entries reflecting the status of the BBR process.

A block read causes a data error to invoke BBR (set the BBR flag).
The block is tested. Either the block passes the test (possible transient error), status/event code = 34, or the block tests bad, status/event code = 14. Entries with the same LBNs can be associated with the BBR messages. Identify all entries with the same LBN to get the total picture.

You can use HSC VERIFY, DKUTIL, RAUTIL, or similar utilities to verify if the LBNs have been replaced.

3.4.3 Normal

Occasional LBN replacements are expected. This indicates a bad spot on the media. For future reference, document the LBNs that have been replaced.

3.4.4 Transient/Manually Replace Repeating LBNs

Occasional transient blocks are acceptable for these products. Normally, they will get replaced (status/event = 14). LBNs that consistently recur within the host error log and are not replaced may be manually replaced with the utilities DKUTIL (HSC), EVRLK, ZUDLxx, or RAUTIL. This is a useful procedure for those blocks that are consistently reporting ECC/data errors.

This symptom can occur when the host BBR software does not utilize the user data as the pattern to test the suspect block. The block is initially flagged for replacement. The host executes a test of the block, finds nothing wrong with the block, does not revector, and restores the original data back into the block. The user then accesses the data again and may get another ECC error with sufficient severity to again invoke the BBR activity. An ECO made it a requirement to utilize user data as one of the test patterns when using the MSCP specification.

Blocks that are reporting consistent "header not found" problems or "positioner unintelligible" header problems will likely still report such a problem if the block is revectored because of the way a block is searched for in this architecture.
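The bookkeeping described in Sections 3.4.2 through 3.4.4 (collect every entry for the same LBN, then look for LBNs that keep recurring without ever being replaced) can be sketched as follows. This is only an illustration, not a DIGITAL tool: the input format (a list of LBN and status/event code pairs already extracted from the error log), the function names, the `DATA_ERR` placeholder code, and the recurrence threshold of 3 are all assumptions made for the example.

```python
from collections import defaultdict

# BBR status/event codes as printed in Section 3.4.2
BLOCK_REPLACED = "14"
BLOCK_NOT_REPLACED = "34"


def summarize_by_lbn(entries):
    """Group (lbn, event_code) pairs by LBN and count the BBR outcomes.

    `entries` is a hypothetical, already-parsed view of the error log.
    """
    summary = defaultdict(lambda: {"entries": 0, "replaced": 0, "not_replaced": 0})
    for lbn, code in entries:
        record = summary[lbn]
        record["entries"] += 1
        if code == BLOCK_REPLACED:
            record["replaced"] += 1
        elif code == BLOCK_NOT_REPLACED:
            record["not_replaced"] += 1
    return dict(summary)


def manual_replace_candidates(entries, min_recurrences=3):
    """Return LBNs that recur at least `min_recurrences` times and were never replaced.

    The threshold is an assumption; the guide only says "consistently recur".
    """
    summary = summarize_by_lbn(entries)
    return sorted(
        lbn
        for lbn, record in summary.items()
        if record["entries"] >= min_recurrences and record["replaced"] == 0
    )


if __name__ == "__main__":
    # "DATA_ERR" stands in for whatever data-error code the log reports.
    log = [
        (1000, "DATA_ERR"), (1000, "34"), (1000, "DATA_ERR"), (1000, "34"),
        (2000, "DATA_ERR"), (2000, "14"),
        (3000, "DATA_ERR"),
    ]
    print(manual_replace_candidates(log))
```

In this sample, LBN 1000 recurs four times and is never replaced, so it is the only candidate for manual replacement; LBN 2000 was replaced normally and LBN 3000 appears only once.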
Figure 17: Flow - Data Errors (3.3A & 3.3B) [CXO-2403A]
• ENOUGH INFORMATION (SEE SECTION 3.4.5)
• STILL ONLY ONE OR TWO HEADS (SEE SECTION 3.4.6)
• PROCEED TO CHAPTER 5, VERIFICATION

3.4.5 Enough Information

Since you are only here due to LBNs in the error log, you need to ensure visibility of all the LBNs with problems. Also, determine if the problems are isolated to one or two heads. The simplest technique is to perform an operation that reads all of the LBNs on the disk. The intent is to establish whether all the information on the problem is known. If the problem is the result of a momentary failure (burst), the architecture is designed to handle these failures and no further actions should be needed.

Many system and subsystem utilities, commands, and diagnostics allow the Field Engineer to perform read-only functions, read/write I/O, scan, verify, and so on, that could provide additional data. Some of those tools are:
• ILDISK/ILEXER as appropriate
• EVRLK, ZUDL, BBR utilities (scrubbers) in verify mode
• NAKDAx (scrubber)
• RAUTIL
• EVRAE
• VMS command "$ ANALYZE/DISK/READ_CHECK DEVICE:"

3.4.6 Still Only on One or Two Heads

• If the problem stabilizes (no further errors at this time), document and monitor.
• If the symptoms from the previous history indicate a problem with the same one or two heads, replace the HDA.
• If continued testing or additional information results in a growing problem with the same one or two heads, replace the HDA.

NOTE
In the RA60, the HDA would equate to the heads or the media.

Figure 18: Flow - Data Errors (3.3C)
• VERIFY DATA PATH TO DETERMINE IF A PROBLEM EXISTS IN THE READ/WRITE DATA PATH. REFER TO READ/WRITE DATA PATH VERIFICATION, SECTION 3.6. RETURN TO THIS POINT IN THE FLOW WHEN COMPLETE.
• LOST (SEE SECTION 3.4.10)
• CONFIRMED DRIVE PROBLEM (SEE SECTION 3.4.7)
• CONFIRMED CONTROLLER PROBLEM (SEE SECTION 3.4.8)
• CONFIRMED SDI PROBLEM (SEE SECTION 3.4.9)
• PROCEED TO CHAPTER 4, LOGICAL RECOVERY
[CXO-2404A]

3.4.7 Confirmed Drive Problem

The problem has been confirmed to be a R/W data path problem in the drive. Most drive data path problems can be associated with the drive SDI logic, serial R/W data path, and head select logic. For example, in the RA81, the following items may cause problems:
• HDA ground brush assembly
• Read/write module
• Microprocessor module
• Personality module
• HDA
• Faulty spindle ground

Other less frequent items that could cause problems include the power supply, motor/brake assembly, worn belts, cabinet connections, servo module, and improper system/drive grounding. Knowledge of the drive is needed to determine the proper FRUs for replacement. The drive service manual may provide some assistance.

3.4.8 Confirmed Controller Problem

The problem has been confirmed to be a R/W data path problem in the controller. Most controller data path problems can be associated with the serial read/write data path, ECC generator, SDI logic, internal cables and connections. For example:
• In the UDA50 - M7486
• In the HSC50 - K.sdi

Other less frequent items that could cause problems include power supplies, cable connections, and improper grounding. Knowledge of the controller is needed to determine the proper FRUs for replacement. The controller service manual may provide some assistance.

3.4.9 Confirmed SDI Problem

The problem has been confirmed to be a R/W data path problem in the external SDI bus. Most external SDI bus problems can be associated with bulkhead/transition connections and the external SDI cables. Other less frequent items that could be problems are power, improper grounding, and poor environment.
The hardware that causes communication errors is the same hardware that will cause data path errors. If any LED codes or status/event codes indicate communication errors or drive-detected errors and are encountered during the verification or troubleshooting process, these errors should be resolved first. Refer to Section 3.2, Drive-Detected Errors, page 37, or Section 3.3, Communication Errors, page 39. Any of the above actions may result in a STATUS/EVENT code with more definition and should be used to isolate problems per the main flow of this document.

3.4.10 Lost

If the problem appears obscure, too much time has been spent trying to isolate this problem, or you have insufficient information from the data collection process, use the support resources available. DIGITAL Field Service should operate with/by the guidelines of MAP within the respective areas.

SECTION 3.5

3.5 Device Isolation

Use the techniques in this section to isolate a problem to one of the following areas:
• A common drive
• A common controller port
• A common controller
• The SDI bus, to include: SDI cables, bulkhead, internal cabling to the bulkhead, and grounding

Use these techniques:
1. Perform a FAILOVER
2. Move the cable to an alternate drive
3. Move the cable to an alternate port at the controller
4. Move the cable to an alternate controller

Determine if the symptoms move to the alternate device or remain with the same (original) device. Document your actions for clear traceability and to assure the isolation process has been completed (i.e., prevents confusion and getting lost).

EXIT with the device isolated and return to the flow chart.

SECTION 3.6

3.6 Read/Write Data Path Verification

You need to obtain enough evidence or data to prove or disprove that a data path problem exists by verifying the following symptoms. For example:
• Sufficient LBNs are translated
to clearly prove that the errors are randomly distributed across several cylinders, sectors, and heads, thus proving a R/W data path problem exists in a drive.
• Sufficient data or errors have been logged to prove that a data path problem exists in a controller that is causing similar errors on multiple ports and/or multiple drives.

The read/write DATA PATH of a DSA subsystem consists of the following elements:

In the CONTROLLER:
• The logic in the controller containing the serial read/write data path, ECC generator, and the SDI logic
• Internal SDI cables and connections

In the SDI bus:
• Bulkhead/transition connections
• External SDI cables

In the DRIVE:
• The internal SDI cables and connections
• The logic in the drive containing the SDI logic, serial read/write data path (R/W encode/decode, head select logic, and so on)

The following symptoms are common to many read/write data path problems.

Read/write data path problems in CONTROLLERS often exhibit the following symptoms:
• Errors for any drive connected to a port.
• Errors for any drive connected to all ports.
• Transmission errors.
• Variety of drive LED codes.
• Variety of status/event codes.
• ECC errors, header errors, and so on, where the physical translation of logical blocks results in:
  Random cylinders.
  Random sectors.
  Random R/W heads, usually errors on more than three heads.
• Excessive status/event codes of 34. Usually a large number of entries with a code of 34 strongly indicates an intermittent or transient data path problem rather than a problem with the media. Status/event code 34 indicates LBNs were flagged for BBR testing but not replaced because the LBNs were marginal (transient errors).

Read/write data path problems with a DRIVE often exhibit the following symptoms:
• Transmission errors.
• Variety of drive-detected LED codes.
• Variety of controller status/event codes.
• ECC errors, header errors, and so on, where the physical translation of logical blocks results in:
  Random cylinders.
  Random sectors.
  Random R/W heads, usually errors on more than three heads.
• Excessive status/event codes of 34. Usually a large number of entries with a code of 34 strongly indicates an intermittent or transient data path problem rather than a problem with the media. Status/event code 34 indicates LBNs were flagged for BBR testing but not replaced because the LBNs were marginal (transient errors).

Refer to Section 3.5, Device Isolation Techniques, page 48, to assist in troubleshooting, if needed.

Obtaining the necessary information will often require you to employ one or more of the following techniques:
• Collect all available information from the error log.
  Include data from a previous service call.
  Include error log data that results from any of these techniques.
• Collect the HSC console information.
• Collect error information that results from any of the utilities or diagnostics used during this process.
• Use HSC VERIFY as appropriate.
• Use system and subsystem utilities, commands, and diagnostics that allow read-only functions, R/W I/O, scan, verify, and so on:
  ILDISK/ILEXER
  EVRLK, ZUDL, BBR utilities (scrubbers) in verify mode
  NAKDAx (scrubber)
  RAUTIL
• Use drive internal diagnostics as appropriate.
• Use SDI loopback testing as appropriate.
• Use a technique or process to reproduce the problem.

Once you have determined that a data path problem has been isolated to a controller, disk, or SDI path problem, return to the flow chart.

CHAPTER 4 LOGICAL RECOVERY

Figure 19: Flow Map - Logical Recovery [CXO-2405A]
• YOU ARE HERE

4.1 Overview

Read/write data path problems, transmission errors, and other hardware subsystem problems often affect everything that is read or written to the media, including files and structures.
This corruption can make the hardware "appear" as if it is still broken. Logical recovery is the repair of those corrupted files or structures. As such, no separate flows exist to further describe logical recovery. Logical recovery is a single step in the main flow, with notes describing the suggested actions to pursue. Not all problems result in corruption of structures or require logical recovery. The flow continuation pointers from Chapter 3, Data Analysis and Repair Actions, will skip this section when it is appropriate to do so.

NOTE
At times in the logical recovery process, reformatting is the recommended solution. It is essential that a known good copy of the data (file or volume) is available before the reformatting is done.

Tools that can be used to perform logical recovery are:
• HSC VERIFY
• EVRLK
• ZUDL
• DKUTIL
• RAUTIL
• Formatters (HSC format, EVRLB, ZUDK, and so on)

See Appendix B for a description of what these tools can do.

4.1.1 Commonly Known Recovery Techniques

For files containing forced errors or EDC errors:
• If only a few files are affected, replace the files with known good copies of the files. Also, delete the damaged files with DELETE/ERASE.
• If several files are affected, consider restoring the volume from a known good backup copy of the volume.

The following example shows VMS messages that result from reading a file containing bad EDC as still written on the disk, even after the controller is repaired.

    $ TYPE FILE.DAT
    %TYPE-W-READERR, error reading $5$DUA230:[LOCAL.TEST]FILE.DAT;2
    -RMS-F-RER, file read error
    -SYSTEM-F-CTRLERR, fatal controller error

NOTE
Use a command under the operating system that will read all the host-allocated blocks on the disk to identify all the files with damage.
For example, under VMS use the following command:

    $ ANALYZE/DISK/READ device_name:

A file containing one forced error read many times will result in many forced errors being reported.

    $ TYPE FILE.DAT
    %TYPE-W-READERR, error reading $5$DUA230:[LOCAL.TEST]FILE.DAT;2
    -RMS-F-RER, file read error
    -SYSTEM-F-FORCEDERROR, forced error flagged in last sector read

4.1.2 Media Format and Replacement Errors

Errors that continue to produce any of the following status/event codes or the equivalent error messages should be resolved by reformatting the media and then restoring the volume from a known good backup copy of the volume:

    45, 54, 74, 94, B4, D4, C5, 105, 125, 400

4.1.3 Excessive BBR (Status/Event Code = 14)

Read/write data path problems may cause the replacement of a high number of good blocks. This may lead to logical fragmentation of the disk. If this happens, the number of blocks recorded in the RCT as revectored differs substantially from the FCT information. As an example, the RCT may show a doubling of replaced blocks occurring over a short period of time.

If you have repaired a data path problem that has caused an excessive number of blocks to be replaced, consider reformatting the media and restoring the volume from a known good backup copy of the volume. Use EVRLB, NAKDAx, or ZUDKxx to reformat the disk and recover those good blocks. Back up the user data before executing the format. This can be done at the customer's convenience.

When all appropriate logical recovery actions have been performed, proceed to Chapter 5, Verification, page 57.

CHAPTER 5 VERIFICATION

Figure 20: Flow Map - Verification of Problem Resolution [CXO-2405A]
• YOU ARE HERE

5.1 Device Verification

The following list can be used to determine the device-level verification:
• Verify the new FRU as recommended in the device service manual.
• If the replacement of an FRU results in the same errors in the device, restore the original FRU. This returns a probable good part that has had extensive testing (runtime). This will reduce the probability of inducing a new problem over the long term.
• Run the device internal diagnostics, if appropriate.
• Verify that no new symptoms exist (no DOA).

5.2 Original Error Symptoms

Verify that the original error symptoms have been resolved:
• Event codes
• LED codes
• Fault indicators
• An operation that would reproduce the original symptoms

5.3 Verify the Problem Is Resolved to the Customer's Satisfaction

• Use a specific operation to verify the problem is resolved.
• Verify the customer symptoms are resolved. Use input received from the customer in Chapter 2, Data Collection, page 3, as the exit criteria.

5.4 Verify That No New Problems Have Been Induced

New problems may be introduced by defective spares. Try an alternate spare if this seems to be the case. In some instances involving "multiple" problems, additional symptoms may appear only after more critical problems are resolved. Refer to Data Collection (Chapter 2, page 3) with the "new symptoms." If the same symptom persists, return the original FRU and try the next FRU per the appropriate service manual recommendation. Use the Data Collection step to assure you understand all the original symptoms.

5.5 Verify That No Residual Problems Remain

In the case of data type problems, verify that there are no residual "media" type problems after the original problem has been resolved. This may indicate the need for reviewing Chapter 4, Logical Recovery, page 53. Many system and subsystem utilities, commands, and diagnostics exist for reading the blocks on the media. Refer to Appendix B, Utilities and Resources, for a list of some of these utilities.
If lost, or if the problem appears obscure, or if too much time has been spent trying to isolate this problem, utilize the support resources available. Digital Field Service should operate with/by MAP guidelines within the respective areas.

Proceed to Chapter 6, Document, page 61.

CHAPTER 6 DOCUMENT

Figure 21: Flow Map - Document
• YOU ARE HERE

6.1 Overview

Documenting your actions is an important step in effective troubleshooting and problem resolution, yet it is often overlooked or skipped. Don't treat this step lightly. Take the time to document your steps. Even if you do not forget what has happened, you may not be the next person to troubleshoot this system. How would you feel if you worked on a repeating problem with no information available from your predecessor? How does the customer feel when his/her system is still faulty because redundant actions failed to resolve a problem? Those situations occur due to lack of sufficient documentation to maintain "control" of the problems.

Document your actions during the service call in both of the following ways:
• LARS via CHAMP
• The log in the site guide

This information can be used in the following ways:
• To provide a history of the system for the next service call
• To verify the existence of new symptoms
• To isolate intermittent problems
• To determine the next step or FRU to change when troubleshooting an intermittent problem
• To determine if there are any repeat calls for this problem

If the service call resulted in changing the subsystem configuration (including device addresses), then note those changes in the Site Management Guide for future reference.

APPENDIX A DRIVE-DETECTED ERRORS

The following example shows how the HSC displays a drive-detected error.
This example illustrates the location of the MSCP status/event code and the locations containing the master error code for the various types of disk drives.

    ERROR-E Drive detected error at 8-apr-1986 15:11:44.37
        Command Ref #     00000000
        RA82 unit #       66.
        Error Flags       40
        Event             00EB   --> MSCP STATUS/EVENT code <--
        Request           1B
        Mode              00
        Error             80
        Controller        00
        Retry/fail        00
        Extended Status   0C 0B 00 00 00
                          0C   --> Master drive error code for RA70/80/81/82/90 <--
                          30   --> Master drive error code for RA60 <--
        Requester #       7.
        Drive port #      0.
    ERROR-I End of error.

The following examples illustrate how the same error would appear in a VMS formatted error log.

    V A X / V M S   SYSTEM ERROR REPORT   COMPILED 8-APR-1986 16:41   PAGE 7.

    **************************** ENTRY 3. ****************************
    ERROR SEQUENCE 83.                     LOGGED ON SID 01380A4F
    ERL$LOGMESSAGE ENTRY 8-APR-1986 15:11:44.37
                                           KA780 REV# 7. SERIAL# 2639. MFG PLANT 0.

    I/O SUB-SYSTEM, UNIT _HSC007$DUA66:

    MESSAGE TYPE       0001                DISK MSCP MESSAGE
    MSLG$L_CMD_REF     00000000
    MSLG$W_UNIT        0042                UNIT #66.
    MSLG$W_SEQ_NUM     01BC                SEQUENCE #444.
    MSLG$B_FORMAT      03                  "SDI" ERROR
    MSLG$B_FLAGS       40                  OPERATION CONTINUING
    MSLG$W_EVENT       00EB                DRIVE ERROR
                                           DRIVE DETECTED ERROR
                                           --> MSCP STATUS/EVENT code <--
    MSLG$Q_CNT_ID      0000F807 01010000   UNIQUE IDENTIFIER, 00000000F807
                                           MASS STORAGE CONTROLLER
                                           HSC70
    MSLG$B_CNT_SVR     02                  CONTROLLER SOFTWARE VERSION #2.
    MSLG$B_CNT_HVR     00                  CONTROLLER HARDWARE REVISION #0.
    MSLG$W_MULT_UNT    0050
    MSLG$Q_UNIT_ID     00000108 020B0000   UNIQUE IDENTIFIER, 000000000108
                                           DISK CLASS DEVICE
                                           RA82

    (PAGE 8.)
    MSLG$B_UNIT_SVR    01                  UNIT SOFTWARE VERSION #1.
    MSLG$B_UNIT_HVR    0F                  UNIT HARDWARE REVISION #15.
    MSLG$L_VOL_SER     03C769A2            VOLUME SERIAL #63400354.
    MSLG$L_HEADER      00000000            LBN #0.
                                           GOOD LOGICAL SECTOR
    MSLG$Z_SDI
      REQUEST          1B                  RUN/STOP SWITCH IN
                                           PORT SWITCH IN
                                           LOG INFORMATION IN EXTENDED AREA
                                           SPINDLE READY
                                           PORT A RECEIVERS ENABLED
      MODE             00                  512-BYTE SECTOR FORMAT
      ERROR            80                  DRIVE ERROR
      CONTROLLER       00                  NORMAL DRIVE OPERATION
      RETRY            00                  0. RETRIES LEFT

    CONTROLLER OR DEVICE DEPENDENT INFORMATION
      LED CODE         C0   --> Master drive error code for RA70/80/81/82/90 <--
      PANEL CODE       30   --> Master drive error code for RA60 <--
      LAST OPCODE      0C                  RUN
      PORT IMAGE       0B                  PORT B RTDS ENABLED
                                           PORT A RTDS ENABLED
                                           PORT A ENABLED

    (PAGE 9.)
      CUR CYLNDR       0000                CURRENT CYLINDER, #0.
      CUR GROUP        00                  CURRENT GROUP, #0.
      REQUESTOR        07                  REQUESTOR #7.
      DRIVE PORT       00                  DRIVE PORT #0.

APPENDIX B RESOURCES AND UTILITIES

There are significant concerns about running standalone diagnostics in troubleshooting DSA subsystem problems. The troubleshooting strategy takes into account user availability of the system, as well as the subsystem and the failing device. These strategies provide accurate diagnosis, fault correction, and verification to minimize the impact on the user.

NOTE
For transient disk subsystem errors, the running of available host-loaded diagnostics via an xDA controller seldom isolates the errors without the necessary long run times. This is a serious availability impact. Heavy emphasis must be placed on utilizing the available error logs (hardware/software) that exist onsite.

A list of these standalone diagnostics is provided for your reference.
B.0.1 Standalone Diagnostics

Table 1: VAX and PDP-11 Standalone Diagnostics

    VAX      PDP-11   Comments
    EVRLB    CZUDK    Formatter
    EVRLF    CZUDH    Tests 1, 2, 3
                        1 = UNIBUS Interrupt (Address tests)
                        2 = Execute drive resident tests
                        3 = Disk function test (R/W to DBNs)
    EVRLG    CZUDI    Test 4, disk exerciser
    EVRLJ    CZUDJ    Test 5, UDA/KDA subsystem
    EVRLK    CZUDL    BBR utility (scrubber)
    EVRLL    CZUDM    Disk resident error log dump utility
    EVRAE             MSCP subsystem exerciser

Table 2: MicroVAX Standalone Diagnostics

    MDM      HSC               Comments
    NAKDAx                     MicroVAX, Tests 1-4, Format, BBR utility, etc.
             ILDISK (inline)   HSC50/70 equivalent to tests 1-3
             ILEXER (inline)   HSC50/70 equivalent to test 4

B.0.2 What Can the Available Resources Do?

The following briefly describes the special utilities that are referenced in this document. The DSA troubleshooting course provides training in most of these areas.

B.0.2.1 EVRLK/ZUDL (VAX/PDP-11 Utility)

• Provides a list of all replacements (RCT)
• Scans host areas of the media.
• If used in verify mode, identifies blocks with Forced Error and/or bad EDC
• Provides auto replacement (scrubber)
• Provides a feature to manually replace LBNs (UDA/KDA subsystems)

B.0.2.2 HSC VERIFY (HSC Utility)

• Scans all blocks in host area
• Identifies corruption in structures (i.e., RCT, FCT)
• Identifies logical blocks written with forced errors and/or bad EDC
• Provides a list of all replacements (RCT)
• Provides a list of factory-replaced blocks (FCT)
• Verifies the consistency of the RCT

B.0.2.3 DKUTIL (HSC Utility)

• Provides a list of all replacements (RCT)
• Provides a list of factory-replaced blocks (FCT)
• Provides a feature to manually replace LBNs (HSC-based subsystems)

B.0.2.4 RAUTIL (VAX/MicroVAX VMS Utility)

• Provides media analysis on a "per head" basis
• Provides media analysis based on RCT replacements
• Provides a list of all replacements (RCT)
• Verifies all replacements
• Provides auto replacement as needed (scrubber)
• Verifies the consistency of the RCT (KDA/UDA-based subsystems)
• Identifies logical blocks written with forced errors and/or bad EDC
• Scans all blocks in the host area
• Provides a feature to manually replace LBNs (KDA/UDA-based subsystems)
• System/host program that runs online under VMS

B.0.2.5 VAXSIM$LBN.COM (LBN.COM or BLOCK.COM) (VAX/MicroVAX VMS Utility)

• Provides logical translation of LBNs, RBNs, DBNs, and XBNs into physical cylinder, track, head, and sector

B.0.2.6 DSAERR (DSA301.EXE, DSA303.EXE) (VAX/MicroVAX VMS Utility)

• Simplifies VMS error log entries and provides the key information required by this document
• Provides customized error log sorting

NOTE
DSA303.EXE or higher is required for VMS V5.0 or higher.
APPENDIX C CONVERSION FORMULAS FOR RA60

C.1 LBN to Physical and Logical Parameters

In each formula, the whole-number part of the quotient is the named result, and "Rem" is the fractional remainder carried into the next step.

    LC     Logical Cylinder              LBN / BPLC = LC.LC Rem
    GP     Group                         (.LC Rem * BPLC) / BPG = GP.GP Rem
    TK     Track (Logical)               (.GP Rem * BPG) / BPPT = TK.TK Rem
    S      Sector (Logical)              .TK Rem * BPPT = S (Round result to nearest whole number)
    CYL60                                LBN / (4 * BPPC) = CYL60.Rem (discard Rem)
           Physical Cylinder             (4 * CYL60) + GROUP
           Physical Head                 (LBN - (CYL60 * 4 * BPPC)) / BPLC = HEAD.Rem
    SFI    Physical Sector from Index    ((GP * GP_Offset) + S) / PSPT = X.SFI Rem (discard X)
                                         .SFI Rem * PSPT = SFI (Round result to nearest whole number)

NOTE
Refer to Appendix E for the specific codes to use with these formulas.

C.1.1 Quick Algorithm for RA60 Head

If you know the LBN (Logical Block Number), first determine the Logical Cylinder:

    LBN / BPLC = Logical Cylinder.Fraction (discard fraction)
    Logical Cylinder / 6 (heads) = xxx.yyy
    6 * (.yyy) = PHYSICAL HEAD

    BPLC (Blocks Per Logical Cylinder) = 168 (16-bit packs)
                                         152 (18-bit packs)

APPENDIX D CONVERSION FORMULAS FOR RA70/80/81/82/90

D.1 LBN to Physical and Logical Parameters

    PC     Physical Cylinder             LBN / BPPC = PC.PC Rem
    PH     Physical Head                 (.PC Rem * BPPC) / BPPT = PH.PH Rem
    GP     Group (Logical)               (.PC Rem * BPPC) / BPG = GP.GP Rem
    TK     Track (Logical)               (.GP Rem * BPG) / BPPT = TK.TK Rem
    S      Sector                        .TK Rem * BPPT = S (Round to nearest whole number)
    SFI    Physical Sector from Index    ((GP * GP_Offset) + S) / PSPT = X.SFI Rem (discard X)
                                         .SFI Rem * PSPT = SFI (Round result to nearest whole number)

NOTE
Refer to Appendix E for the specific codes to use with these formulas.
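The Appendix D formulas and the RA60 quick head algorithm can be sketched in code as shown below. This is a minimal illustration, not one of the DIGITAL utilities: it assumes that the "quotient.remainder" steps in the formulas are equivalent to integer division and modulo, and the `Geometry` type, the function names, and the `RA81_16BIT` constant are invented for the example. The RA81 16-bit values (BPPC 714, BPPT 51, BPG 51, GP_Offset 14, PSPT 52) are as read from the Appendix E table and should be confirmed there before use.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    """Per-drive constants from Appendix E (names follow the formulas)."""
    bppc: int       # blocks (LBNs) per physical cylinder
    bppt: int       # blocks (LBNs) per physical track
    bpg: int        # blocks (LBNs) per group
    gp_offset: int  # group offset
    pspt: int       # physical sectors per track


# Assumed RA81 16-bit values, as read from the Appendix E table.
RA81_16BIT = Geometry(bppc=714, bppt=51, bpg=51, gp_offset=14, pspt=52)


def lbn_to_physical(lbn: int, g: Geometry) -> dict:
    """Apply the Appendix D steps using integer division and remainders."""
    cylinder, in_cylinder = divmod(lbn, g.bppc)   # LBN / BPPC = PC.PC Rem
    head = in_cylinder // g.bppt                  # (.PC Rem * BPPC) / BPPT
    group, in_group = divmod(in_cylinder, g.bpg)  # (.PC Rem * BPPC) / BPG
    track, sector = divmod(in_group, g.bppt)      # (.GP Rem * BPG) / BPPT, then S
    sfi = (group * g.gp_offset + sector) % g.pspt  # remainder of (GP*GP_Offset + S) / PSPT
    return {
        "cylinder": cylinder,
        "head": head,
        "group": group,
        "track": track,
        "sector": sector,
        "sector_from_index": sfi,
    }


def ra60_head(lbn: int, bplc: int = 168, heads: int = 6) -> int:
    """Quick algorithm from C.1.1: logical cylinder, then remainder over 6 heads.

    bplc defaults to the 16-bit pack value; use 152 for 18-bit packs.
    """
    logical_cylinder = lbn // bplc
    return logical_cylinder % heads


if __name__ == "__main__":
    print(lbn_to_physical(1000, RA81_16BIT))
    print(ra60_head(1000))
```

For a list of LBNs taken from an error log, running each one through `lbn_to_physical` and tallying the `head` values gives a quick view of whether the errors cluster on one or two heads or are spread across many.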
APPENDIX E TABLE OF CODES FOR CONVERSION FORMULAS

Table 3: Conversion Formulas for RAxx Drives

    Table of Values                    Disk    16-bit   18-bit
    Blocks (LBNs) per                  RA60    252      228
    physical cylinder (BPPC)           RA70    363
                                       RA80    434      406
                                       RA81    714      644
                                       RA82    855
                                       RA90    897
    Blocks (LBNs) per                  RA60    42       38
    physical track (BPPT)              RA70    33
                                       RA80    31       28
                                       RA81    51       46
                                       RA82    57
                                       RA90    69
    Blocks (LBNs) per                  RA60    42       38
    group (BPG)                        RA70    33
                                       RA80    434      392
                                       RA81    51       46
                                       RA82    57
                                       RA90    69
    Group offset (GP_Offset)           RA60    16       15
                                       RA70    08
                                       RA80    16       16
                                       RA81    14       12
                                       RA82    11
                                       RA90    14
    Physical sectors                   RA60    43       39
    per track (PSPT)                   RA70    34
                                       RA80    32       29
                                       RA81    52       47
                                       RA82    58
                                       RA90    70
    Blocks (LBNs) per                  RA60    168      152
    logical cylinder (BPLC)

PDP-11 SCRUB Information Handout

1 PDP-11 Media Package Contents

    (CYCLE: 132)   PACKAGE CONTENT REPORT   REPORT DATE: 09-Jun-87

    Package: BB-FF66G-YC   CZUDX   G   CZUDXG0   F. S.   1600 BPI MT

    Component   Revision   Identifier   Title
    HMXM        F0         CHMXMF0      XXDP V2 EXTENDED MON
    HMSM        D0         CHMSMD0      XXDP V2 RESIDENT MON
    HSAX        C0         CHSAXC0      XXDP V2 DIAG SUPR EXT
    HSAA        G2         CHSAAG2      XXDP V2 SUPR SML
    HUDI        D0         CHUDID0      XXDP V2 DIRECTORY UT
    HDDB        C1         CHDDBC1      XXDP V2 DB DRVR/BOOT
    HDDD        D0         CHDDDD0      XXDP V2 DD DRVR/BOOT
    HDDL        D0         CHDDLD0      XXDP V2 DL DRVR/BOOT
    HDDM        C0         CHDDMC0      XXDP V2 DM DRVR/BOOT
    HDDR        C0         CHDDRC0      XXDP V2 DR DRVR/BOOT
    HDDU        E0         CHDDUE0      XXDP V2 DU DRVR/BOOT
    HDDY        D0         CHDDYD0      XXDP V2 DY DRVR/BOOT
    HDLP        B0         CHDLPB0      XXDP V2 LP DRVR
    HDMM        C0         CHDMMC0      XXDP V2 MM DRVR/BOOT
    HDMS        C0         CHDMSC0      XXDP V2 MS DRVR/BOOT
    HDMU        E0         CHDMUE0      XXDP V2 MU DRIVER
    HUDA        B0         CHUDAB0      XXDP V2 DATE UTILITY
    HZDU        C0         CHZDUC0      XXDP V2 DU SIZER
    HUP2        G1         CHUP2G1      XXDP V2 UPDATE UTIL
    HUTE        F1         CHUTEF1      XXDP V2 XTECO UTIL
    HUXC        I0         CHUXCI0      XXDP V2 DECX11 CNF/LN
    HUSU        F0         CHUSUF0      XXDP V2 SETUP UTIL
    HUPA        E1         CHUPAE1      XXDP V2 PATCH UTIL
    HQHL        G0         CHQHLG0      XXDP V2 HELP FILE
    ZUDL        A0         CZUDLA0      BAD BLK REPLACE UTIL
    ZUDM        A0         CZUDMA0      ERROR LOG UTILITY

2 PDP-11 TURBO SCRUBBER Patches for ZUDLA

CAUTION
Use this patch and the program at your own risk. This turbo patch should only be applied to ZUDLA0. It will not work for any other version.

Purpose: Disables the drive ECC error threshold before requesting block replacement. Thus BBR will take place for any ECC error encountered, regardless of the number of symbols in error.

Requirements:
1. Must be revision "A" of ZUDLA0.
2. In order to prevent customer problems, only run this on scratch media. (Back up the customer data before and restore after using this Turbo Scrubber.)

    Location   Was      Modify to
    23620      000000   001100
    41154      000000   001100     ---- Ditto ----
    26304      010000   000000     Disable Forced Error Command Modifier
    26312      000400   000000     Disable Write with Forced Error (Put in "Suppress ECC" Command Modifier)

TO SET "DEBUG" BIT

PURPOSE: The "debug" bit causes the program to ask for an address range to scrub. It will also cause the program to continue running if a hard error (not hard ECC error) occurs.
    Location   Was      Modify to
    2650       000000   000001     Set any 1 into this location (non-zero)

    PROGRAM ASKS             GIVE IT
    Enter first LBN (A) ?    The "Starting LBN" for scanning
    Enter last LBN (A) ?     The "Ending LBN" for scanning

    (RA60 Max LBN = 400175)
    (RA70 Max LBN = 540740)
    (RA80 Max LBN = 237211)
    (RA81 Max LBN = 891071)
    (RA82 Max LBN = 1216664)
    (RA90 Max LBN = 2376152)

3 PDP-11 TURBO SCRUBBER Patches for ZUDLB

CAUTION
Use this patch and the program at your own risk. This turbo patch should only be applied to ZUDLB0. It will not work for any other version.

QUICK CHECK -- To ensure you have version "B", verify the following:

    Loc      Contents
    23340    12737
    23342    0
    23344    2346
    40734    12737
    40736    0
    40740    2346

PATCH

    Loc      From   To
    23342    0      1400
    40736    0      1400

You can start ZUDLB0, then before answering the final question, halt and install the patch in memory, OR you can use the XXDP patch utility and install the patch permanently into a new file (ZUDLB0.TUR, for example). Running the turbo will cause a lot of replacements that the non-turbo version does not make.

CAUTION
Running the turbo scrubber on a drive that is not in good working order can damage the integrity of the logical structures on the media. Use this turbo scrubber only on drives that are in good working order but have bad blocks that need to be replaced.

If, while running the turbo scrubber, you find that it is making far too many replacements (too many replacements for the turbo scrubber would be anything in excess of 150 replacements), then the drive or HDA may not be in good working order.
Fix or reformat the media, then try the turbo scrubber again. If the drive is in good working order, use the turbo scrubber after the media (pack or HDA) has been replaced.

Preface

Intended Audience

This manual is intended for VMS system managers, operators, and system programmers.

Document Structure

This document consists of the following six sections:

• Description - Provides a full description of the Monitor Utility (MONITOR).
• Usage Summary - Outlines the following MONITOR information:
    - Invoking the utility
    - Exiting from the utility
    - Directing output
    - Restrictions or privileges required
• Qualifiers - Describes MONITOR qualifiers, including format, parameters, and examples.
• Commands - Describes MONITOR commands, including format, parameters, and examples.
• Examples - Provides additional MONITOR examples.
• Appendix A - Provides supplemental MONITOR information.

Associated Documents

For additional information on the topics covered in this document, refer to the VMS DCL Dictionary and the Guide to VMS Performance Management.

Conventions

Convention: [key name in a box]
Meaning: In examples, a key name (usually abbreviated) shown within a box indicates that you press a key on the keyboard; in text, a key name is not enclosed in a box. In this example, the key is the RETURN key. (Note that the RETURN key is not usually shown in syntax statements or in all examples; however, assume that you must press the RETURN key after entering a command or responding to a prompt.)

Convention: CTRL/C
Meaning: A key combination, shown in uppercase with a slash separating two key names, indicates that you hold down the first key while you press the second key. For example, the key combination CTRL/C indicates that you hold down the key labeled CTRL while you press the key labeled C. In examples, a key combination is enclosed in a box.

Convention:
    $ SHOW TIME
    05-JUN-1988 11:55:22
Meaning: In examples, system output (what the system displays) is shown in black. User input (what you enter) is shown in red.

Convention:
    $ TYPE MYFILE.DAT
    .
    .
    .
Meaning: In examples, a vertical series of periods, or ellipsis, means either that not all the data that the system would display in response to a command is shown or that not all the data a user would enter is shown.

Convention: input-file, ...
Meaning: In examples, a horizontal ellipsis indicates that additional parameters, values, or other information can be entered, that preceding items can be repeated one or more times, or that optional arguments in a statement have been omitted.

Convention: [logical-name]
Meaning: Brackets indicate that the enclosed item is optional. (Brackets are not, however, optional in the syntax of a directory name in a file specification or in the syntax of a substring specification in an assignment statement.)

Convention: quotation marks, apostrophes
Meaning: The term quotation marks is used to refer to double quotation marks ("). The term apostrophe (') is used to refer to a single quotation mark.

New and Changed Features

Version 5.0 of the Monitor Utility includes the following new functions:

• A new MONITOR MSCP_SERVER command that produces statistics useful in tuning an MSCP server.
• A new MONITOR RMS command that produces a variety of statistics related to VMS Record Management Services.
• The MONITOR MODES command now produces statistics that pertain to symmetric multiprocessor systems.
• The MONITOR IO command now produces data for split transfers.

MONITOR Description

The Monitor Utility (MONITOR) is a system management tool that enables you to obtain information on operating system performance. Using MONITOR, you can monitor classes of systemwide performance data (such as system I/O statistics, page management statistics, and time spent in each of the processor modes) at specifiable intervals, and produce several types of output.

To monitor a particular class of information, you specify the class name corresponding to the information class on the MONITOR command line.
For example, to monitor page management statistics, specify the PAGE class name in the MONITOR command.

MONITOR collects system performance data by class and produces the following three forms of optional output:

• A disk recording file in binary format
• Statistical terminal displays
• A disk file containing statistical summary information in ASCII format

The utility initiates a single MONITOR request for the classes of performance data specified each time you enter a command in the following form:

MONITOR [/qualifier[,...]] classname[,...] [/qualifier[,...]]

Regardless of the order in which you specify class-name parameters, MONITOR always executes requests in the following sequence:

PROCESSES
STATES
MODES
PAGE
IO
FCP
POOL
LOCK
DECNET
FILE_SYSTEM_CACHE
DISK
DLOCK
SCS
SYSTEM
CLUSTER
RMS
MSCP_SERVER

Depending on the command qualifiers specified, MONITOR collects system performance data from the running system or plays back data recorded previously in a recording file. When you play back data, you can display it, summarize it, and even rerecord it to reduce the amount of data in the recording file. The Examples section illustrates these operations in greater detail.

For additional information about interpreting the information the Monitor Utility provides, see the Guide to VMS Performance Management.

1 Class Types

Each MONITOR class consists of data items that, taken together, provide a statistical measure of a particular system performance category. The data items defined for individual classes are listed in the description of the MONITOR command in the Commands section.

There are two MONITOR class types, differentiated by the scope of the data items collected:

• System Classes, in which the data items provide statistics on resource utilization for the entire system (CLUSTER, DECNET, DLOCK, FCP, FILE_SYSTEM_CACHE, IO, LOCK, MSCP_SERVER, PAGE, POOL, STATES, SYSTEM).
• Component Classes, in which the data items provide statistics on the contribution of individual components to the overall system or cluster. These classes are DISK, MODES, PROCESSES, RMS (Record Management Services), and SCS (System Communication Services).

As an example of the distinction between MONITOR class types, the IO class includes a data item to measure all direct I/O operations for the entire system and is therefore a system class. The DISK class measures direct I/O operations for individual disks, and is therefore a component class.

2 Class-Name Qualifiers

The class-name qualifiers control the type of display and summary output format generated for each class name specified. They have no effect on the recording of binary data. Each of these qualifiers applies only to the immediately preceding class name. Class-name qualifiers must not appear as part of the command verb. Table MON-1 summarizes class-name qualifiers and defaults.

Table MON-1 MONITOR Class-Name Qualifiers

Class Name          Qualifiers                                                         Defaults
ALL_CLASSES         /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           See Commands section.
CLUSTER             /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /CURRENT
DECNET              /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
DISK                /ALL /AVERAGE /CURRENT /ITEM /MAXIMUM /MINIMUM /[NO]PERCENT        /ALL /ITEM=OPERATION_RATE /NOPERCENT
DLOCK               /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
FCP                 /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
FILE_SYSTEM_CACHE   /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
IO                  /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
LOCK                /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
MODES               /ALL /AVERAGE /[NO]CPU /CURRENT /MAXIMUM /MINIMUM /[NO]PERCENT     /NOCPU /CURRENT /NOPERCENT
MSCP_SERVER         /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
PAGE                /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
POOL                /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /ALL
PROCESSES           /TOPBIO /TOPCPU /TOPDIO /TOPFAULT                                  None
RMS                 /ALL /AVERAGE /CURRENT /FILE /ITEM /MAXIMUM /MINIMUM               /ITEM=OPERATIONS /ALL
SCS                 /ALL /AVERAGE /CURRENT /ITEM /MAXIMUM /MINIMUM /[NO]PERCENT        /ALL /ITEM=KB_MAP /NOPERCENT
STATES              /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM /[NO]PERCENT              /CURRENT /NOPERCENT
SYSTEM              /ALL /AVERAGE /CURRENT /MAXIMUM /MINIMUM                           /CURRENT

Following are the three categories of class-name qualifiers:

• Statistics qualifiers (/ALL, /AVERAGE, /CURRENT, /MAXIMUM, and /MINIMUM) specify which statistics appear in display and summary output. These are conflicting qualifiers; specify no more than one of these qualifiers with each class name in a MONITOR request. Statistics qualifiers cannot be used with the PROCESSES class name or for multifile summaries.
• The data transformation qualifier (/[NO]PERCENT) controls whether data for the selected class name is expressed as percentages of a whole. This qualifier can be used only with the STATES, DISK, MODES, and SCS class names, and is not allowed for multifile summaries.
• Class-specific qualifiers (/CPU, /ITEM, /FILE, /TOPBIO, /TOPCPU, /TOPDIO, and /TOPFAULT) control the output of a specific class.
    - /CPU is used with the MODES class name to produce information for specific CPUs in a multiprocessor configuration.
    - /ITEM is used with the component statistics class names DISK, RMS, and SCS to specify one or more data items for inclusion in display or summary output.
    - /FILE is used with the RMS class name to specify the RMS file to which a MONITOR RMS command applies.
    - /TOP is used with the PROCESSES class name to produce bar graphs showing the top processes instead of the standard summary and display output. Top processes are the heaviest consumers of the resource being monitored. Up to eight processes can be shown in each display. Note that the /TOP qualifiers are mutually exclusive. Specify no more than one in a single request.

3 Output Types

MONITOR can produce any combination of three forms of output for any single MONITOR request. The forms are display output, recording file output, and summary output. Output forms are specified with the /DISPLAY, /RECORD, and /SUMMARY qualifiers, as follows:

• /DISPLAY produces output in the form of ASCII screen images. Screen images are written at a frequency governed by the /VIEWING_TIME qualifier.
• /RECORD produces a binary recording file containing data collected for requested classes; one record for each class is written per interval.
• /SUMMARY produces an ASCII file containing summary statistics for all requested classes over the duration of the MONITOR request.

If you specify /INPUT with any of these qualifiers, MONITOR collects performance data from one or more previously created recording files; otherwise, data is collected from counters and data structures on the running system.

The MONITOR request begins and ends at times specified by the /BEGINNING and /ENDING qualifiers respectively.

3.1 Display Output

Display output consists of a series of terminal screen images. One screen image for each requested class for each requested viewing interval is produced. You can use any terminal supported by VMS with dimensions of at least 80 columns by 24 rows. (You might have to enter the DCL command SET TERMINAL to set the proper dimensions.) Display output can also be routed to a file for subsequent printing.

The amount of time between screen displays is determined by the /VIEWING_TIME value.
Effective viewing time varies, however, depending on whether you are running MONITOR on your local system or on a remote node. (Remote in this context refers to use of the SET HOST command to access another node.) For remote access, the time required to display the screen is included in the viewing time, while for local access this time is not included. Therefore, use a larger viewing time than the 3-second default when running MONITOR on a remote system. The value appropriate for remote access depends on your terminal baud rate. For a 9600-baud terminal line, 6 seconds is a reasonable viewing time. For lower-speed lines, increase the viewing time appropriately.

By pressing CTRL/W, you can temporarily override the /VIEWING_TIME value and generate a new display immediately following the current one. This feature is useful when the MONITOR display area has been overwritten by an operator message. You can also use CTRL/W in conjunction with a large /VIEWING_TIME value to generate display events on demand.

3.2 Display Data

All displayable data items are rates or levels except in the PROCESSES class. Rates are shown in number of occurrences per second. A level is a value that indicates the size of the monitored data item. MONITOR can display any of four different statistics for each data item, as follows:

• Current rate or level
• Average rate or level
• Minimum rate or level
• Maximum rate or level

Average, minimum, and maximum statistics are measured from the beginning of the MONITOR request. The current statistic is the most recently collected value for the rate or level. Any or all of the statistics can be requested. For the DISK, MODES, SCS, and STATES classes, all statistics can be expressed as percentages.

3.3 Screen Formats

There are two basic screen formats used for displaying MONITOR class data: the single-statistic screen and the multiple-statistic screen.
The formats vary slightly depending on whether the class being displayed is a system or component class. The following three characteristics occur in both screen formats:

• The date and time appearing in the heading of each screen refer to the time the displayed data was originally collected.
• The name of the node on which the data was originally collected also appears in the heading (except when playing back files that do not contain node name information or when displaying CLUSTER class data). The node name is obtained from the SCSNODE system parameter or, if SCSNODE is null, from the SYS$NODE logical name established by DECnet.
• The bottom line of the display is used for status information about the current MONITOR request. If data collection is from a file of previously recorded monitor data, the word PLAYBACK appears at the left margin of the line. If the currently running system is being monitored, the word does not appear. If a summary file has been requested, the word SUMMARIZING appears in the middle of the line. If not, it does not appear. If creation of a recording file has been requested, the word RECORDING appears at the right margin of the line. If not, it does not appear.

The PROCESSES, SYSTEM, and CLUSTER classes have unique screen formats.

3.3.1 Single-Statistic Screen for System Classes

This bar-graph style screen is used whenever one statistic (current, average, minimum, or maximum) is requested. Example MON-1 shows the maximum statistic for the STATES class. For other classes and statistics, the screen format remains the same with different heading and data item descriptions. If the display of percentages is requested, the percent symbol (%) appears in the title and next to the numbers along the top of the graph. All values in this screen format are rounded up or down to whole numbers of up to seven digits (except percentages, which are rounded to whole numbers of up to three digits).
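The bar-graph layout described for the single-statistic screen can be sketched in a few lines of Python. This is only an illustration of the idea, not MONITOR's implementation: the function name, the label width, the 0 to 40 scale, the one-asterisk-per-unit rule, and the sample values are all assumptions made for the sketch.

```python
def render_bars(items, scale_max=40, label_width=28):
    """Return one text line per (label, value) pair, bar-graph style.

    Assumptions (not taken from MONITOR itself): one asterisk per
    whole unit, bars clipped at scale_max, zero values left blank.
    """
    lines = []
    for label, value in items:
        whole = int(round(value))                  # this screen shows whole numbers only
        bar = "*" * min(max(whole, 0), scale_max)  # clip the bar to the graph width
        number = str(whole) if whole else ""       # blank instead of a zero
        lines.append(f"{label:<{label_width}}{number:>4} |{bar}")
    return lines


# Hypothetical values, chosen only to show the layout.
sample = [("Hibernate", 28), ("Compute", 4), ("Suspended", 0)]
for line in render_bars(sample):
    print(line)
```

Keeping the numeric value next to the bar means the reading survives even when a value exceeds the graph width and the bar is clipped.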
Example MON-1 Single-Statistic Screen

                    VAX/VMS Monitor Utility
 +-----+                PROCESS STATES
 | MAX |                on node SAMPLE
 +-----+             31-DEC-1988 16:09:53

[Bar graph, scale 0 to 40, one asterisk per unit, with one row for each of the following process states: Collided Page Wait, Mutex & Misc Resource Wait, Common Event Flag Wait, Page Fault Wait, Local Event Flag Wait, Local Evt Flg (Outswapped), Hibernate, Hibernate (Outswapped), Suspended, Suspended (Outswapped), Free Page Wait, Compute, Compute (Outswapped), Current Process.]

3.3.2 Multiple-Statistic Screen for System Classes

This tabular-style screen is used whenever all four statistics are requested with the /ALL class-name qualifier. Example MON-2 shows a multiple-statistic screen. The precision of the data items is seven whole-number digits and two decimal places. For each class, the screen format remains the same with different heading and data item descriptions. If you request the display of percentages, as in Example MON-3, the percent sign (%) appears in the title and the headings, and the figures consist of three whole-number digits and one decimal place.

Example MON-2 Sample Multiple-Statistic Screen

                    VAX/VMS Monitor Utility
                  PAGE MANAGEMENT STATISTICS
                        on node SAMPLE
                     31-DEC-1988 16:13:38

                                  CUR        AVE        MIN        MAX

 Page Fault Rate                58.00      38.33      18.66      58.00
 Page Read Rate                 18.00      16.33      14.66      18.00
 Page Read I/O Rate              3.33       3.16       3.00       3.33
 Page Write Rate                45.00      22.50       0.00      45.00
 Page Write I/O Rate             1.66       0.83       0.00       1.66

 Free List Fault Rate           26.33      15.66       5.00      26.33
 Modified List Fault Rate        4.66       3.83       3.00       4.66
 Demand Zero Fault Rate         12.00       7.66       3.33      12.00
 Global Valid Fault Rate        11.33       7.83       4.33      11.33
 Wrt In Progress Fault Rate      0.00       0.00       0.00       0.00
 System Fault Rate              24.33      12.83       1.33      24.33

 Free List Size               3356.00    3321.50    3287.00    3356.00
 Modified List Size              1.00      70.00       1.00     139.00

Example MON-3 Sample Multiple-Statistic Screen (Data Expressed as Percentages)

                    VAX/VMS Monitor Utility
                 TIME IN PROCESSOR MODES (%)
                        on node SAMPLE
                     31-DEC-1988 16:13:38

                                 CUR%       AVE%       MIN%       MAX%

 Interrupt Stack                 20.3       21.9       20.3       23.6
 MP Synchronization               0.0        0.0        0.0        0.0
 Kernel Mode                     23.0       23.8       23.0       24.6
 Executive Mode                   3.0        3.5        3.0       24.6
 Supervisor Mode                  0.0        0.0        0.0        0.6
 User Mode                       51.3       46.9       42.6       51.3
 Compatibility Mode               2.3        3.6        0.0        3.9
 Idle Time                        0.0        0.0        0.0       94.9

3.3.3 Component Classes Screen

For all component classes except RMS and MODES, only one data item for each component is displayed on each screen. The item is identified in the upper left of the screen. Components for which statistics are reported appear in the left column of the screens. If more than one item keyword is specified with the /ITEM qualifier or if /ITEM=ALL is specified, a new screen appears for each item selected.
For example, the following command would produce output of the format shown in Example MON-4:

MONITOR DISK/ITEM=(OPERATION_RATE,QUEUE_LENGTH)

Example MON-4 Sample Component Statistics Screens

                    VAX/VMS Monitor Utility
                      DISK I/O STATISTICS
                        on node SAMPLE
                     31-DEC-1988 20:08:42

 I/O Operation Rate               CUR        AVE        MIN        MAX

 DRA2:  SAMPLEPAGE               0.00       0.03       0.00       0.33
 DRB1:  ACCREG                   0.00       0.00       0.00       0.00
 DRC3:  VMS_X20R                 1.99       0.19       0.00       1.99
 DRC4:  SAMPLESECD01             0.00       0.00       0.00       0.00
 DBA3:  UMASTER                  0.00       0.00       0.00       0.00
 DBA5:  MIDNITE                  0.00       0.00       0.00       0.00
 DRA7:  RES26APR                 0.00       0.00       0.00       0.00
 DUA4:  RES06AUG                 0.00       0.00       0.00       0.00
 DUA5:  VMSDOCLIB                0.00       0.00       0.00       0.00
 DUA7:  OLD_QVSS$                0.00       0.00       0.00       0.00

                    VAX/VMS Monitor Utility
                      DISK I/O STATISTICS
                        on node SAMPLE
                     31-DEC-1988 20:08:45

 I/O Request Queue Length         CUR        AVE        MIN        MAX

 DRA2:  SAMPLEPAGE               0.00       0.00       0.00       0.00
 DRB1:  ACCREG                   2.00       1.43       0.00       4.00
 DRC3:  VMS_X20R                 0.00       0.00       0.00       0.00
 DRC4:  SAMPLESECD01             0.00       0.00       0.00       0.00
 DBA3:  UMASTER                  0.00       0.00       0.00       0.00
 DBA5:  MIDNITE                  0.00       0.00       0.00       0.00
 DRA7:  RES26APR                 0.00       0.00       0.00       0.00
 DUA4:  RES06AUG                 0.00       0.00       0.00       0.00
 DUA5:  VMSDOCLIB                0.00       0.00       0.00       0.00
 DUA7:  OLD_QVSS$                0.00       0.00       0.00       0.00

3.4 Recording File Output

A recording file is a VAX RMS sequential disk file that is created when a MONITOR request includes the /RECORD qualifier. A record of binary performance data is written to this file once per interval for each requested class; the record contains a predefined set of data for each of the requested performance classes. The file is created when a MONITOR request is initiated and closed when the request terminates. The resulting file can be used as a source file by later requests to format and display the data on a terminal, to create a summary file, or to record a new recording file with different characteristics.

All data pertaining to the class is recorded, even if you are concurrently displaying only a single statistic or a single item of a component statistics class.
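Because one record is written per requested class per interval, the growth of a recording file can be gauged with simple arithmetic. The Python sketch below is a hedged illustration only: the function name is hypothetical, the interval count is approximate, file overhead is ignored, and the record size is a caller-supplied placeholder because the real per-class record sizes are given in Appendix A rather than here.

```python
import math


def estimate_recording(classes, duration_s, interval_s, bytes_per_record=None):
    """Estimate records (and optionally bytes) written by one request.

    bytes_per_record is a hypothetical mapping of class name to record
    size in bytes; no real MONITOR record sizes are assumed here.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    intervals = math.ceil(duration_s / interval_s)  # collection events in the request
    records = intervals * len(classes)              # one record per class per interval
    total_bytes = None
    if bytes_per_record is not None:
        total_bytes = intervals * sum(bytes_per_record[c] for c in classes)
    return records, total_bytes


# One hour of PAGE and IO data at a 10-second interval.
records, _ = estimate_recording(["PAGE", "IO"], duration_s=3600, interval_s=10)
print(records)  # 720
```

Halving the interval doubles the record count, so the interval is usually the first value to adjust when disk quota is tight.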
3.4.1 Disk Space for Recording Files

When recording is active (or display output is being routed to a disk file), you can use large quantities of disk space in a short period of time. In particular, if disk quota is exceeded during execution of a MONITOR request, open files are closed, and the request is terminated prematurely. To avoid this situation, use the information provided in Appendix A to estimate the amount of disk space required.

When SYSTEM class data is recorded, the MODES, STATES, and PROCESSES classes are also recorded, even if not specifically requested. When CLUSTER class data is recorded, the MODES and DISK classes are also recorded. To estimate disk space requirements for CLUSTER recording files, multiply the totals for these classes by the number of nodes being monitored.

After estimating disk space requirements, check the amount of disk quota available, and set appropriate values for /INTERVAL and /ENDING.

Refer to Appendix A for exact recording file record sizes.

3.4.2 Recording File Version Compatibility

Before Version 5.0 MONITOR can read recording files generated by previous MONITOR versions, you must convert the files to the current format. Use the CONVERT command described in the Commands section of this document. If you specify a list of recording files to produce a multifile summary, all recording files must have the same format.

3.5 Summary Output

Summary output is an ASCII disk file consisting of one display screen image for each requested class. The screen format for each class is based on the statistic requested. The only difference in format between a display screen and a summary screen image is that the word SUMMARY appears in the heading along with a beginning and ending time for the period covered by the summary.
The data contained in the summaries is identical to that shown on the final display screen (if display output was also requested) for all except the PROCESSES/TOP, SYSTEM, and CLUSTER summaries. Since the summary file reflects the accumulation of data throughout the MONITOR request, the average, minimum, and maximum statistics are of particular interest. For the TOP summaries of the PROCESSES and SYSTEM classes, the data represents the top users for the entire duration of the MONITOR request, subject to the following restriction. To be eligible for inclusion in the list of top users, a process must be present and swapped in at the beginning and end of the MONITOR request.

3.6 Multifile Summary Reports

Multifile summary reports provide a convenient method of combining data from a number of recording files to compare average performance statistics (excluding the PROCESSES and CLUSTER classes) for discrete time segments. Use the /BEGINNING and /ENDING command qualifiers to delimit the desired time segment (see the Qualifiers section).

To request a multifile summary, use the /SUMMARY command qualifier, and specify a list of recording files with the /INPUT qualifier. Note that since only AVERAGE statistics are collected, you should not specify class-name qualifiers. Note also that multifile summaries are static; that is, they do not provide continuously updated displays.

Caution: Version 5.0 MONITOR file structure must be common to all recording files in the list.

3.6.1 Interpreting Multifile Summary Reports

Multifile summary reports differ from regular (single-file) reports in both content (only AVERAGE statistics are collected) and format. MONITOR formats multifile report data as follows:

• By file. This is the default format. For each class requested, the report displays one column of AVERAGE statistics per input file, along with beginning and ending times for each file.
For files that contain data for multiple nodes, there is one column per node per file.
• By node. To request this format, specify the /BY_NODE command qualifier (with the /SUMMARY and /INPUT qualifiers) when you create the summary file. The report combines data for a given node from all files into a single column that shows the average statistic for each data item. The contribution of the data from each file is weighted by the amount of time over which the data was collected (for rate items) or by the number of collections (for level items).

For both formats, MONITOR provides Row Sum, Row Average, Row Maximum, and Row Minimum statistics. These represent simple arithmetic operations performed on all averages in each row of the report.

Note: Because multifile summary reports frequently contain large amounts of data and use a 132-character format, you may want to print them. A single page can display only five columns of data. Depending on the number of recording files, nodes, and classes specified, a report may extend over many pages. In that event, Row values appear on the final page.

The following examples illustrate differences between single-file and multifile reports. In the first example, two fragments of single-file reports are generated on the same node from two different recording files, which cover, respectively, a two-hour and a ten-hour period:

                    VAX/VMS Monitor Utility
                  PAGE MANAGEMENT STATISTICS
                         on node BLUE            From: 31-DEC-1988 09:00:00
                           SUMMARY               To:   31-DEC-1988 11:00:00

                                  CUR        AVE        MIN        MAX

 Page Fault Rate                           90.00

                    VAX/VMS Monitor Utility
                  PAGE MANAGEMENT STATISTICS
                         on node BLUE            From: 31-DEC-1988 11:00:00
                           SUMMARY               To:   31-DEC-1988 21:00:00

                                  CUR        AVE        MIN        MAX

 Page Fault Rate                            6.00

In the next example, the corresponding "by node" multifile report fragment for the entire period shows that the average Page Fault Rate is weighted toward the figure that represents the larger elapsed time:

                    VAX/VMS Monitor Utility
                  PAGE MANAGEMENT STATISTICS
                      Multifile SUMMARY
 +-----+
 | AVE |
 +-----+
 Node: BLUE (2)
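The weighting described for rate items in a by-node report can be worked through with the two Page Fault Rate averages from the single-file fragments (90.00 over two hours, 6.00 over ten hours). The Python sketch below assumes a plain time-weighted mean; the function name is hypothetical, and MONITOR's exact arithmetic is not spelled out in this section.

```python
def weighted_average(samples):
    """Time-weighted mean of per-file averages.

    samples: list of (average_rate, elapsed_hours) pairs, one per file.
    """
    total_time = sum(hours for _, hours in samples)
    if total_time == 0:
        raise ValueError("no elapsed time to weight by")
    return sum(rate * hours for rate, hours in samples) / total_time


# 90.00 over two hours and 6.00 over ten hours.
combined = weighted_average([(90.00, 2), (6.00, 10)])
print(f"{combined:.2f}")  # 20.00
```

An unweighted mean of the two figures would be 48.00; under this assumption the ten-hour file pulls the combined figure down to 20.00, much closer to the 6.00 recorded over the longer period.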