Dell Openmanage Server Administrator Version 6 0 3 Quick Reference Guide 6.0.3 Message
2014-11-13
Page Count: 224
Dell™ OpenManage™ Server Administrator Messages Reference Guide

www.dell.com | support.dell.com

Notes and Cautions

NOTE: A NOTE indicates important information that helps you make better use of your computer.

CAUTION: A CAUTION indicates potential damage to hardware or loss of data if instructions are not followed.

____________________

Information in this document is subject to change without notice.
© 2009 Dell Inc. All rights reserved.

Reproduction of these materials in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.

Trademarks used in this text: Dell, the DELL logo and Dell OpenManage are trademarks of Dell Inc.; VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions; Microsoft, Windows and Windows Server are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries; Red Hat and Red Hat Enterprise Linux are registered trademarks of Red Hat, Inc.; SUSE is a registered trademark of Novell, Inc. in the United States and other countries.

Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.

February 2009

Contents

1 Introduction
    What’s New in this Release
    Messages Not Described in This Guide
    Understanding Event Messages
    Sample Event Message Text
    Viewing Alerts and Event Messages
    Logging Messages to a Unicode File
    Viewing Events in Windows 2000 Advanced Server and Windows Server 2003
    Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server
    Viewing Events in VMware ESXi
    Viewing the Event Information
    Understanding the Event Description

2 Event Message Reference
    Miscellaneous Messages
    Temperature Sensor Messages
    Cooling Device Messages
    Voltage Sensor Messages
    Current Sensor Messages
    Chassis Intrusion Messages
    Redundancy Unit Messages
    Power Supply Messages
    Memory Device Messages
    Fan Enclosure Messages
    AC Power Cord Messages
    Hardware Log Sensor Messages
    Processor Sensor Messages
    Pluggable Device Messages
    Battery Sensor Messages
    Chassis Management Controller Messages

3 System Event Log Messages for IPMI Systems
    Temperature Sensor Events
    Voltage Sensor Events
    Fan Sensor Events
    Processor Status Events
    Power Supply Events
    Memory ECC Events
    BMC Watchdog Events
    Memory Events
    Hardware Log Sensor Events
    Drive Events
    Intrusion Events
    BIOS Generated System Events
    R2 Generated System Events
    Cable Interconnect Events
    Battery Events
    Power And Performance Events
    Entity Presence Events

4 Storage Management Message Reference
    Alert Message Format with Substitution Variables
    Alert Monitoring and Logging
    Alert Message Change History
    Alert Descriptions and Corrective Actions

Index

Introduction

Dell™ OpenManage™ Server Administrator produces event messages stored primarily in the operating system or Server Administrator event logs. This document describes the event messages created by Server Administrator version 6.0.3 and displayed in the Server Administrator Alert log.

Server Administrator creates events in response to sensor status changes and other monitored parameters. The Server Administrator event monitor uses these status change events to add descriptive messages to the operating system event log or the Server Administrator Alert log.

Each event message that Server Administrator adds to the Alert log consists of a unique identifier, called the event ID, for a specific event source category, and a descriptive message. The event message includes the severity, the cause of the event, and other relevant information, such as the event location and the monitored item’s previous state.

Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry includes the event ID’s corresponding description, severity level, and cause. Message text in angle brackets describes the event-specific information provided by Server Administrator.
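As a rough illustration of the "event ID plus descriptive message" structure described above, the following Python sketch splits one log line into its parts. It assumes the line layout shown in this guide's Linux /var/log/messages example (timestamp, host name, "Server Administrator:", category, "EventID:", description). The names LINE_RE, AlertEntry, and parse_alert_line are hypothetical and are not part of Server Administrator; other log layouts may need a different pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern, inferred only from the sample syslog lines in this guide.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+Server Administrator:\s+"
    r"(?P<category>.+?)\s+EventID:\s+(?P<event_id>\d+)\s+"
    r"(?P<description>.*)$"
)


@dataclass
class AlertEntry:
    """One Server Administrator alert as it appears in the system log."""
    timestamp: str
    host: str
    category: str
    event_id: int
    description: str


def parse_alert_line(line: str) -> Optional[AlertEntry]:
    """Return the parsed alert, or None if the line is not in the assumed format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return AlertEntry(
        timestamp=match["timestamp"],
        host=match["host"],
        category=match["category"],
        event_id=int(match["event_id"]),
        description=match["description"],
    )


# Sample line taken from the guide's Linux message log example.
sample = (
    "Feb 6 14:20:51 server01 Server Administrator: "
    "Instrumentation Service EventID: 1000 Server Administrator starting"
)
entry = parse_alert_line(sample)
print(entry)
```

The parsed event_id can then be looked up in the event ID tables in this guide. Lines that do not come from Server Administrator simply yield None, so the function can be applied to every line of a log file.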
What’s New in this Release

The following changes have been made for this release:
• Support for the VMware® ESXi version 3.5 Update 4 hypervisor.
• Support for the Server Administrator Web Server.
• Only Serial Attached SCSI (SAS) controllers are supported for this release.
• No SNMP trap support for this release.
• No support for LRA numbers for this release.
• Added two new alerts, 1013 and 2382, in the “Miscellaneous Messages” and “Alert Descriptions and Corrective Actions” sections, respectively.
• Added the POST Code Errors table in the “BIOS Generated System Events” section.
• Support for Solid State Drives (SSD). Added new SSD alert 2370 in the “Storage Management Message Reference” section.

Messages Not Described in This Guide

This guide describes only event messages created by Server Administrator and displayed in the Server Administrator Alert log. For information on other messages produced by your system, consult one of the following sources:
• Your system’s Installation and Troubleshooting Guide
• Operating system documentation
• Application program documentation

Understanding Event Messages

This section describes the various types of event messages generated by Server Administrator. When an event occurs on your system, Server Administrator sends information about one of the following event types to the systems management console:

Table 1-1. Understanding Event Messages

Alert Severity: OK / Normal / Informational
Component Status: An event that describes the successful operation of a unit. The alert is provided for informational purposes and does not indicate an error condition. For example, the alert may indicate the normal start or stop of an operation, such as power supply or a sensor reading returning to normal.

Alert Severity: Warning / Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future problem. For example, a Warning/Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold.

Alert Severity: Critical / Failure / Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure, such as the failure of an array disk.

Server Administrator generates events based on status changes in the following sensors:
• Temperature Sensor — Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors a variety of locations in the chassis and in any attached systems.
• Fan Sensor — Monitors fans in various locations in the chassis and in any attached systems.
• Voltage Sensor — Monitors voltages across critical components in various chassis locations and in any attached systems.
• Current Sensor — Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in any attached systems.
• Chassis Intrusion Sensor — Monitors intrusion into the chassis and any attached systems.
• Redundancy Unit Sensor — Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and any attached systems. For example, redundancy allows a second or nth fan to keep the chassis components at a safe temperature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy is lost when there is one fewer critical redundancy device than required.
• Power Supply Sensor — Monitors power supplies in the chassis and in any attached systems.
• Memory Prefailure Sensor — Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections.
• Fan Enclosure Sensor — Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is absent from the chassis. This sensor monitors the chassis and any attached systems.
• AC Power Cord Sensor — Monitors the presence of AC power for an AC power cord.
• Hardware Log Sensor — Monitors the size of a hardware log.
• Processor Sensor — Monitors the processor status in the system.
• Pluggable Device Sensor — Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.
• Battery Sensor — Monitors the status of one or more batteries in the system.

Sample Event Message Text

The following example shows the format of the event messages logged by Server Administrator.

    EventID: 1000
    Source: Server Administrator
    Category: Instrumentation Service
    Type: Information
    Date and Time: Mon Oct 21 10:38:00 2002
    Computer:
    Description: Server Administrator starting
    Data: Bytes in Hex

Viewing Alerts and Event Messages

NOTE: The Red Hat® Enterprise Linux®, SUSE® Linux Enterprise Server, and Microsoft® Windows® content mentioned in the following section does not apply to the VMware® ESXi version 3.5 Update 4 release.

An event log is used to record information about important events. Server Administrator generates alerts that are added to the operating system event log and to the Server Administrator Alert log. To view these alerts in Server Administrator:
1 Select the System object in the tree view.
2 Select the Logs tab.
3 Select the Alert subtab.

You can also view the event log using your operating system’s event viewer. Each operating system’s event viewer accesses the applicable operating system event log. The location of the event log file depends on the operating system you are using.
• In the Microsoft® Windows® 2000 Advanced Server and Windows Server® 2003 operating systems, messages are logged to the system event log and optionally to a Unicode text file, dcsys32.log (viewable using Notepad), that is located in the install_path\omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.
• In the Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and VMware ESXi operating systems, messages are logged to the system log file. The default name of the system log file is /var/log/messages. You can view the messages file using a text editor such as vi or emacs.

Logging Messages to a Unicode File

Logging messages to a Unicode text file is optional. By default, the feature is disabled. To enable this feature, modify the Event Manager section of the dcemdy32.ini file as follows:
• In Windows, locate the file at \dataeng\ini and set UnitextLog.enabled=True. The default install_path is C:\Program Files\Dell\SysMgt. Restart the DSM SA Event Manager service.
• In Red Hat Enterprise Linux and SUSE Linux Enterprise Server, locate the file at /dataeng/ini and set UnitextLog.enabled=True. The default install_path is /opt/dell/srvadmin. Issue the "/etc/init.d/dataeng restart" command to restart the Server Administrator event manager service.

The following subsections explain how to open the Windows 2000 Advanced Server, Windows Server 2003, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and VMware ESXi event viewers.

Viewing Events in Windows 2000 Advanced Server and Windows Server 2003

1 Click the Start button, point to Settings, and click Control Panel.
2 Double-click Administrative Tools, and then double-click Event Viewer.
3 In the Event Viewer window, click the Tree tab and then click System Log. The System Log window displays a list of recently logged events.
4 To view the details of an event, double-click one of the event items.

NOTE: You can find the event log file, dcsys32.log, in the \omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.

Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server

1 Log in as root.
2 Use a text editor such as vi or emacs to view the file named /var/log/messages.

The following example shows the Red Hat Enterprise Linux and SUSE Linux Enterprise Server message log, /var/log/messages. The text in boldface type indicates the message text.

    ...
    Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1000 Server Administrator starting
    Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1001 Server Administrator startup complete
    Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service EventID: 1254 Chassis intrusion detected Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: OK (Normal) Chassis intrusion state: Open
    Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service EventID: 1252 Chassis intrusion returned to normal Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: Critical (Failed) Chassis intrusion state: Closed

Viewing Events in VMware ESXi

1 Log in to the VMware ESXi system with VMware Infrastructure (VI) Client.
2 Click Administration on the navigation bar.
3 Select System Logs.
4 Select the Server Log [/var/log/messages] entry on the drop-down list.

Viewing the Event Information

The event log for each operating system contains some or all of the following information:
• Date — The date the event occurred.
• Time — The local time the event occurred.
• Type — A classification of the event severity: Information, Warning, or Error.
• User — The name of the user on whose behalf the event occurred.
• Computer — The name of the system where the event occurred.
• Source — The software that logged the event.
• Category — The classification of the event by the event source.
• Event ID — The number identifying the particular event type.
• Description — A description of the event. The format and contents of the event description vary, depending on the event type.

Understanding the Event Description

Table 1-2 lists in alphabetical order each line item that may appear in the event description.

Table 1-2. Event Description Reference

Each entry gives the description line item, followed by its explanation and an example.

Action performed was:
    Specifies the action that was performed, for example:
    Action performed was: Power cycle

Action requested was:
    Specifies the action that was requested, for example:
    Action requested was: Reboot, shutdown OS first

Additional Details:
    Specifies additional details available for the hot [...]
    Memory device: DIMM1_A Serial number: FFFF30B1

[...]
    Specifies information pertaining to the event, for example:
    Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off

Chassis intrusion state:
    Specifies the chassis intrusion state (open or closed), for example:
    Chassis intrusion state: Open

Chassis location:
    [...] message, for example:
    Chassis location: Main System Chassis

Configuration error type:
    Specifies the type of configuration error that occurred, for example:
    Configuration error type: Revision mismatch

Current sensor value (in Amps):
    Specifies the current sensor value in amps, for example:
    Current sensor value (in Amps): 7.853

Date and time of action:
    Specifies the date and time the action was performed, for example:
    Date and time of action: Sat Jun 12 16:20:33 2004

Device location:
    Specifies the location of the device in the specified chassis, for example:
    Device location: Memory Card A

Discrete current state:
    Specifies the state of the current sensor, for example:
    Discrete current state: Good

Discrete temperature state:
    Specifies the state of the temperature sensor, for example:
    Discrete temperature state: Good

Discrete voltage state:
    Specifies the state of the voltage sensor, for example:
    Discrete voltage state: Good

Fan sensor value:
    Specifies the fan speed in revolutions per minute (RPM) or On/Off, for example:
    Fan sensor value (in RPM): 2600
    Fan sensor value: Off

Log type:
    Specifies the type of hardware log, for example:
    Log type: ESM

Memory device bank location:
    Specifies the name of the memory bank in the [...]
    Memory device bank location: Bank_1

Memory device location:
    Specifies the location of the memory module in the [...]
    Memory device location: DIMM_A

Number of devices required for full redundancy:
    Specifies the number of power supply or cooling devices required to achieve full redundancy, for example:
    Number of devices required for full redundancy: 4

Peak value (in Watts):
    Specifies the peak value in Watts, for example:
    Peak value (in Watts): 125

Possible memory module event cause:
    Specifies a list of possible causes for the memory module event, for example:
    Possible memory module event cause: Single bit warning error rate exceeded Single bit error logging disabled

Power Supply type:
    Specifies the type of power supply, for example:
    Power Supply type: VRM

Previous redundancy state was:
    Specifies the status of the previous redundancy message, for example:
    Previous redundancy state was: Lost

Previous state was:
    Specifies the previous state of the sensor, for example:
    Previous state was: OK (Normal)

Processor sensor status:
    Specifies the status of the processor sensor, for example:
    Processor sensor status: Configuration error

Redundancy unit:
    Specifies the location of the redundant power [...]
    Redundancy unit: Fan Enclosure

Sensor location:
    Specifies the location of the sensor in the specified chassis, for example:
    Sensor location: CPU1

Temperature sensor value:
    Specifies the temperature in degrees Celsius, for example:
    Temperature sensor value (in degrees Celsius): 30

Voltage sensor value (in Volts):
    Specifies the voltage sensor value in volts, for example:
    Voltage sensor value (in Volts): 1.693

Event Message Reference

The following tables list in numerical order each event ID and its corresponding description, along with its severity and cause.

NOTE: For corrective actions, see the appropriate documentation.

Miscellaneous Messages

Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.

Table 2-1. Miscellaneous Messages

Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator.

Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.

Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.

Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.

Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).

Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user decides to cancel the flash BIOS update, or an error occurs during the flash.
Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.

Event ID: 1005
Description: SMBIOS data is absent
Severity: Error
Cause: The system does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.

Event ID: 1006
Description: Automatic System Recovery (ASR) action was performed
    Action performed was:
    Date and time of action:
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the time of action are provided.

Event ID: 1007
Description: User initiated host system control action
    Action requested was:
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively, the user had indicated protective measures to be initiated in the event of a thermal shutdown.

Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.

Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.

Event ID: 1011
Description: RCI table is corrupt
Severity: Error
Cause: This message is generated when the BIOS Remote Configuration Interface (RCI) table is corrupted or cannot be read by the systems management software.

Event ID: 1012
Description: IPMI Status
    Interface:
Severity: Information
Cause: This message is generated to indicate the Intelligent Platform Management Interface (IPMI) status of the system. Additional information, when available, includes Baseboard Management Controller (BMC) not present, BMC not responding, System Event Log (SEL) not present, and SEL Data Record (SDR) not present.

Event ID: 1013
Description: System Peak Power detected new peak value
    Peak value (in Watts):
Severity: Information
Cause: The system peak power sensor detected a new peak value in power consumption. The new peak value in Watts is provided.

Temperature Sensor Messages

Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature sensor messages use additional variables: sensor location, chassis location, previous state, and temperature sensor value or state.

Table 2-2. Temperature Sensor Messages

Event ID: 1050
Description: Temperature sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1051
Description: Temperature sensor value unknown
    Sensor location:
    Chassis location:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Information
Cause: A temperature sensor on the backplane board, [...] drive carrier in the [...] could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value are provided.
Event ID: 1052
Description: Temperature sensor returned to a normal value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Information
Cause: A temperature sensor on the backplane board, [...] drive carrier in the [...] returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1053
Description: Temperature sensor detected a warning value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, CPU, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1054
Description: Temperature sensor detected a failure value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1055
Description: Temperature sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Temperature sensor value (in degrees Celsius):
    If sensor type is discrete:
        Discrete temperature state:
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Cooling Device Messages

Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages provide status and warning information for fans in a particular chassis.

Table 2-3. Cooling Device Messages

Event ID: 1100
Description: Fan sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Error
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1101
Description: Fan sensor value unknown
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Error
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value are provided.

Event ID: 1102
Description: Fan sensor returned to a normal value
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1103
Description: Fan sensor detected a warning value
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1104
Description: Fan sensor detected a failure value
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1105
Description: Fan sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    Fan sensor value:
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided.

Voltage Sensor Messages

Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor messages provide status and warning information for voltage sensors in a particular chassis.

Table 2-4. Voltage Sensor Messages

Event ID: 1150
Description: Voltage sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Error
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1151
Description: Voltage sensor value unknown
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Warning
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.
Event ID: 1152
Description: Voltage sensor returned to a normal value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1153
Description: Voltage sensor detected a warning value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1154
Description: Voltage sensor detected a failure value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1155
Description: Voltage sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
        Voltage sensor value (in Volts):
    If sensor type is discrete:
        Discrete voltage state:
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Current Sensor Messages

Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical components.
Current sensor messages provide status and warning information for current sensors in a particular chassis.

Table 2-5. Current Sensor Messages

Event ID: 1200
Description: Current sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Error
Cause: A current sensor in the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1201
Description: Current sensor value unknown
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Error
Cause: A current sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value are provided.

Event ID: 1202
Description: Current sensor returned to a normal value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Information
Cause: A current sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1203
Description: Current sensor detected a warning value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Warning
Cause: A current sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1204
Description: Current sensor detected a failure value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Error
Cause: A current sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1205
Description: Current sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    If sensor type is not discrete:
    Current sensor value (in Amps):
    OR
    Current sensor value (in Watts):
    If sensor type is discrete:
    Discrete current state:
Severity: Error
Cause: A current sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.

Chassis Intrusion Messages

Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.

Table 2-6. Chassis Intrusion Messages

Event ID: 1250
Description: Chassis intrusion sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1251
Description: Chassis intrusion sensor value unknown
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description: Chassis intrusion returned to normal
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1253
Description: Chassis intrusion in progress
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently being opened and the system is operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1254
Description: Chassis intrusion detected
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1255
Description: Chassis intrusion sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Redundancy Unit Messages

Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a computer system that a chassis may have “extra” fans or power supplies installed.
Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold.

Table 2-7 lists the redundancy unit messages. The number of devices required for full redundancy is provided as part of the message, when applicable, for the redundancy unit and the platform. For details on redundancy computation, see the respective platform documentation.

Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description: Redundancy sensor has failed
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Warning
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1301
Description: Redundancy sensor value unknown
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Warning
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1302
Description: Redundancy not applicable
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1303
Description: Redundancy is offline
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1304
Description: Redundancy regained
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1305
Description: Redundancy degraded
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1306
Description: Redundancy lost
    Redundancy unit:
    Chassis location:
    Previous redundancy state was:
Severity: Error
Cause: A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Power Supply Messages

Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in Table 2-8 provide status and warning information for power supplies present in a particular chassis.

Table 2-8. Power Supply Messages

Event ID: 1350
Description: Power supply sensor has failed
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Error
Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1351
Description: Power supply sensor value unknown
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Warning
Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1352
Description: Power supply returned to normal
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1353
Description: Power supply detected a warning
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1354
Description: Power supply detected a failure
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1355
Description: Power supply sensor detected a non-recoverable value
    Sensor location:
    Chassis location:
    Previous state was:
    Power Supply type:
    If in configuration error state:
    Configuration error type:
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Memory Device Messages

Memory device messages listed in Table 2-9 provide status and warning information for memory modules present in a particular system. Memory devices determine health status by monitoring the ECC memory correction rate and the type of memory events that have occurred.

NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has exceeded the ECC correction rate. Although the system continues to function, you should perform system maintenance as described in Table 2-9.

NOTE: In Table 2-9, <status> can be either critical or non-critical.

Table 2-9. Memory Device Messages

Event ID: 1403
Description: Memory device status is <status>
    Memory device location:
    Possible memory module event cause:
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.

Event ID: 1404
Description: Memory device status is <status>
    Memory device location:
    Possible memory module event cause:
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system’s next scheduled maintenance. For a multibit ECC error, clear the memory error. The memory device status and location are provided.

Fan Enclosure Messages

Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is missing from a chassis.

Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description: Fan enclosure sensor has failed
    Sensor location:
    Chassis location:
Severity: Critical/Failure/Error
Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis location are provided.

Event ID: 1451
Description: Fan enclosure sensor value unknown
    Sensor location:
    Chassis location:
Severity: Warning
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor location and chassis location are provided.
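
The event IDs and severities listed in this section lend themselves to a small lookup when event text is processed by a script. The following is a minimal sketch, not part of Server Administrator: it assumes the message text arrives with one "Label: value" field per line, and the names SEVERITY, parse_fields, and describe, as well as the sample field values, are hypothetical and chosen only for illustration.

```python
from typing import Dict

# Severity per event ID, transcribed from the tables in this section.
# Event 1450 is printed as "Critical/Failure/Error"; it is stored as "Error" here.
SEVERITY: Dict[int, str] = {
    1104: "Error", 1105: "Error",
    1150: "Error", 1151: "Warning", 1152: "Information",
    1153: "Warning", 1154: "Error", 1155: "Error",
    1200: "Error", 1201: "Error", 1202: "Information",
    1203: "Warning", 1204: "Error", 1205: "Error",
    1250: "Error", 1251: "Error", 1252: "Information",
    1253: "Warning", 1254: "Warning", 1255: "Error",
    1300: "Warning", 1301: "Warning", 1302: "Information",
    1303: "Information", 1304: "Information",
    1305: "Warning", 1306: "Error",
    1350: "Error", 1351: "Warning", 1352: "Information",
    1353: "Warning", 1354: "Error", 1355: "Error",
    1403: "Warning", 1404: "Error",
    1450: "Error", 1451: "Warning",
}


def parse_fields(message: str) -> Dict[str, str]:
    """Collect 'Label: value' lines of an event message into a dict.

    Lines without a colon (the message title) and lines with an empty
    value (such as 'If sensor type is discrete:') are skipped.
    """
    fields: Dict[str, str] = {}
    for line in message.splitlines():
        label, sep, value = line.partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    return fields


def describe(event_id: int, message: str) -> str:
    """Build a one-line summary from an event ID and its message text."""
    severity = SEVERITY.get(event_id, "Unknown")
    lines = message.splitlines()
    title = lines[0] if lines else ""
    location = parse_fields(message).get("Sensor location", "n/a")
    return f"{event_id} [{severity}] {title} (sensor: {location})"


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample = (
        "Voltage sensor detected a failure value\n"
        "Sensor location: PS 1 Voltage\n"
        "Chassis location: Main System Chassis\n"
        "Previous state was: OK (Normal)\n"
        "If sensor type is not discrete:\n"
        "Voltage sensor value (in Volts): 1.693"
    )
    print(describe(1154, sample))
```

Keeping the severities in a plain dictionary makes it easy to compare the script against the tables when a later release changes an entry.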