EMC Greenplum Data Computing Appliance (DCA) Maintenance Guide, Appliance Versions 2.0.0.0 / 2.0.1.0 / 2.0.2.0 / 2.0.3.0
Page Count: 214
EMC CONFIDENTIAL

EMC® Greenplum Data Computing Appliance
Appliance Version 2.0.0.0/2.0.1.0/2.0.2.0/2.0.3.0
Maintenance Guide
REV 11

Copyright © 2014 EMC Corporation. All rights reserved. Published in the USA.

Published April, 2014

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

For the most up-to-date regulatory document for your product line, go to the technical documentation and advisories section on the EMC online support website.

CONTENTS

Chapter 1  Important Information Before You Begin
   New firmware updates in support of DCA software version 2.0.3.0
   Identify the version of the installed DCA software
   Avoid electrostatic discharge damage (ESD)
      Handling field replaceable units (FRUs)

Chapter 2  Replace a Master Server
   Required tools
   Task summary
   Service tag location
   Replace the Primary Master server
   Replace the Standby Master server
   Identifying a single-NIC master versus a dual-NIC master in a DCAv2
   Replace a Master server in a DCA without a Greenplum database

Chapter 3  Replace a Segment, DIA, or Hadoop server
   Required tools
   Task summary
   Service tag locations
   Reseat cables before replacing a server
   Replace a server in an initialized GPDB module
   Replace a DIA server or a server in an uninitialized GPDB module
   Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)
   Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)
      Remove the failed PHD server and install the replacement PHD server
      Replace hdm1 (namenode, DCA version 2.0.1.0)
      Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0)
      Replace hdm3 (resourcemanager, DCA version 2.0.1.0)
      Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0)
      Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0)

Chapter 4  Replace a Disk Drive
   Hot spare drives and the Copyback operation
   Replace a disk drive in a Master, DIA, or Hadoop Compute server
   Replace a drive in a Segment Server
   Replace a drive in an Hadoop server
      Replace a drive in a Hadoop Master server
      Replace a drive in a Hadoop Worker server

Chapter 5  Replace a Power Supply in a Server
   Power supply LEDs
   Replace a power supply in a server

Chapter 6  Replace a Fan Assembly or Power Supply in an Arista Switch
   Replace a Fan Assembly in an Arista Switch
      Fan Assembly Replacement Order Information
      Tools
      Identify the Failed Fan Assembly
      Remove the Failed Fan Assembly and Install the Replacement Part
      Parts Return
   Replace a Power Supply in an Arista Switch
      Power Supply Assembly Replacement Order Information
      Tools
      Identify the Failed Power Supply
      Remove the Failed Power Supply and Install the Replacement Part
      Parts Return

Chapter 7  Replace a Switch in the DCA
   Requirements
   Switch hostnames and IP addresses
   Replace an Arista 7050S Interconnect or Aggregation Switch
   Replace an Arista 7048T Administration Switch

Chapter 8  Replace an Interconnect Switch Cable

Appendix A  System Information and Configuration
   New firmware updates for DCA software version 2.0.3.0
   Identify the version of the installed DCA software
   DCA configuration rules
   Racking order
   Racking guidelines
   Mixed System rack components
   Hadoop-only System Rack components (minimum config.)
   HD-Compute System Rack components (minimum config.)
   Aggregation rack components
   Expansion rack components
   Power supply reference
   BMC Controller interface functionality
   BMC Controller LED indicators and meanings
   Network and cabling configurations
   Interconnect cabling reference
   Administration switch reference
   Aggregation switch reference
   Network hostname and IP configuration
   Multiple-rack cabling reference
   Configuration files
   Location of old core files
   Default passwords

Appendix B  Connect a workstation to the DCA
   Laptop prerequisites
   Configure your laptop to connect to the DCA
      Configure a Windows 7 laptop
      Configure a Windows XP laptop
   Connect to the Master Server using an SSH client
   Copy a file to the Master Server using an SCP client
   Connect to an Interconnect or Administration switch using PuTTY

Appendix C  Power Off the DCA

Appendix D  Linux and vi Command Reference
   Common Linux command reference
   vi Quick Reference

Appendix E  Replace a Server in the Greenplum DCA Rack
   Mounting kit parts

Appendix F  Install a Switch in a Rack
   Switch mounting kit parts
   Replace the switch in the rack
   Replace an optical SFP module

Appendix G  Switch Configuration: Backup and Recovery
   Create Two Files for Switch Recovery
   Recover the Switch Configurations

Appendix H  DCA Part Numbers

CHAPTER 1
Important Information Before You Begin

For detailed descriptions of DCA components and configurations, see Appendix A, “System Information and Configuration,” on page 137.

This chapter includes the following major sections:
• New firmware updates in support of DCA software version 2.0.3.0
• Identify the version of the installed DCA software
• Avoid electrostatic discharge damage (ESD)

New firmware updates in support of DCA software version 2.0.3.0

Customers can apply optional firmware updates prior to upgrading to DCA software version 2.0.3.0 as follows:

Arista 7050S-52 and Arista 7048T switches
• New firmware version EOS-4.9.8.swi
• Field personnel can access the EOS-4.9.8.swi.zip firmware upgrade package from: ftp://ftp.aristanetworks.com/emc/certifiedeos/EOS-4.9.8.swi

For step-by-step instructions, field personnel can obtain the following document from http://support.emc.com: EMC Greenplum DCA Firmware Upgrade Instructions for the Interconnect Switch (Arista 7050S-52) and Administration Switch (Arista 7048T)

Intel Servers (Kylin with eight drives, Dragon 12 with twelve drives, and Dragon 24 with 24 drives)
• New BIOS upgrade revision level SE5C600.86B.02.01.0002
• Field personnel can access both the BIOS upgrade package and the EMC Greenplum DCA Intel BIOS Upgrade Instructions for Intel Servers from http://support.emc.com.

Identify the version of the installed DCA software

The replacement procedures in this guide pertain only to DCA clusters running software version 2.0.x.x. DCA documentation is tied to a specific version of the DCA software. Before beginning any replacement procedure on a DCA, make sure that the version of the software running on the cluster matches the version of the guide that you are using.

1. Log in to the Primary Master server as the user root (see “Connect a workstation to the DCA” on page 176).

2. View the contents of the /opt/dca/etc/dca-build-info.txt file. Verify that the ISO_VERSION value is 2.0.x.x:
      # cat /opt/dca/etc/dca-build-info.txt
      ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
      ISO_VERSION="2.0.2.0"
      ISO_BUILD_VERSION="4"
      ISO_INSTALL_TYPE="iso"

If the version of the software is not 2.0.x.x, go to the EMC Online Support site, http://support.emc.com.
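The version check in step 2 can also be scripted. The following is a minimal sketch, assuming the build-info file uses the KEY="value" layout shown above; the function names dca_iso_version and is_guide_version are illustrative and are not part of the DCA software.

```shell
#!/bin/sh
# Sketch only: dca_iso_version and is_guide_version are hypothetical helpers,
# not commands shipped with the DCA.

# Print the ISO_VERSION value from a build-info file.
# Defaults to the path used in step 2 when no argument is given.
dca_iso_version() {
    file="${1:-/opt/dca/etc/dca-build-info.txt}"
    # Expected line format: ISO_VERSION="2.0.2.0"
    sed -n 's/^ISO_VERSION="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$file"
}

# Return 0 when the version string belongs to the 2.0.x.x family
# covered by this guide, and 1 otherwise.
is_guide_version() {
    case "$1" in
        2.0.*.*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Example use on the Primary Master server (as root):
#   version=$(dca_iso_version)
#   if is_guide_version "$version"; then
#       echo "Version $version is covered by this guide"
#   else
#       echo "Version $version: obtain the matching guide from EMC Online Support"
#   fi
```

The sketch only reads the file and compares strings, so it does not change anything on the appliance.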
From the Support by Product pages, search for Greenplum Data Computing Appliance and obtain the documentation that matches the software version running on the Primary Master server.

Avoid electrostatic discharge damage (ESD)

When you replace and install field replaceable units (FRUs), you can inadvertently damage sensitive electronic circuits in the equipment simply by touching them. Electrostatic charge that has accumulated on your body can discharge through the circuits. If the air in the work area is dry, running a humidifier in the work area can help decrease the risk of electrostatic discharge (ESD) damage.

Read and understand the following guidelines:
• Provide enough room to work on the equipment. Clear the work site of any unnecessary materials, especially materials that naturally build up electrostatic charge such as foam packaging, foam cups, cellophane wrappers, and similar items.
• Do not remove replacement or upgrade FRUs from their antistatic packaging until you are ready to install them.
• Set up your EMC-issued ESD kit and all other materials that you need before servicing a Greenplum system. Once you begin service, avoid moving away from the work site; otherwise, your body can build up an electrostatic charge.
• Use the ESD kit when handling system components.
• Wear an ESD wristband. Attach the clip of the ESD wristband to any bare (unpainted) metal in the bay, and then place the wristband around your wrist with the metal button against your skin.

Handling field replaceable units (FRUs)

This section describes the precautions that you must take and the general procedures that you must follow when removing and storing any field replaceable unit (FRU). The only FRUs in the server are the disk drive assemblies and power supply modules.
Depending on the product in which the server is used, the disk drive assemblies may be hot-swappable; that is, you can replace a disk drive assembly while the server is running. To determine if disk drive assemblies are hot-swappable, refer to your product documentation. Regardless of the product in which the server is used, the power supply modules are hot-swappable; that is, you can replace a power supply module while the server is running.

You should not remove a faulty FRU until you have a replacement available.

When you replace FRUs, you can inadvertently damage the sensitive electronic circuits in the equipment by simply touching them. Electrostatic charge that has accumulated on your body discharges through the circuits. If the air in the work area is very dry, running a humidifier in the work area will help decrease the risk of ESD damage. Follow the procedures below to prevent damage to the equipment.
• Provide enough room to work on the equipment. Clear the work site of any unnecessary materials or materials that naturally build up electrostatic charge, such as foam packaging, foam cups, cellophane wrappers, and similar items.
• Do not remove replacement or upgrade FRUs from their antistatic packaging until you are ready to install them.
• Before you service a server, gather together the ESD kit and all other materials you will need. Once servicing begins, avoid moving away from the work site; otherwise, you may build up an electrostatic charge.
• Use the ESD kit when handling any FRU.
• Use an ESD wristband. To use the ESD wristband (strap), attach the clip of the wristband to any bare (unpainted) metal on the server; then put the wristband around your wrist with the metal button against your skin.
• If an emergency arises and the ESD kit is not available, follow the procedures in “Emergency procedures (without an ESD kit)” on page 8.

Emergency procedures (without an ESD kit)

In an emergency when an ESD kit is not available, use the procedures below to reduce the possibility of an electrostatic discharge by ensuring that your body and the subassembly are at the same electrostatic potential. These procedures are not a substitute for the use of an ESD kit. Follow them only in the event of an emergency.
• Before touching any FRU, touch a bare (unpainted) metal surface of the cabinet or server.
• Before removing any FRU from its antistatic bag, place one hand firmly on a bare metal surface of the server, and at the same time, pick up the FRU while it is still sealed in the antistatic bag. Once you have done this, do not move around the room or touch other furnishings, personnel, or surfaces until you have installed the FRU.
• When you remove a FRU from the antistatic bag, avoid touching any electronic components and circuits on it.
• If you must move around the room or touch other surfaces before installing a FRU, first place the FRU back in the antistatic bag. When you are ready again to install the FRU, repeat these procedures.

CHAPTER 2
Replace a Master Server

This chapter describes how to replace a Primary or Standby Master server in a GPDB-only DCA, a mixed DCA, or a Hadoop-only DCA. (Applies only to version 2.0.1.0 and later) Additional steps are required if you are servicing a Hadoop-only DCA. Look for the following notice in the left margin:

**If you are servicing a Hadoop-only DCA**

Topics include:
• Required tools
• Task summary
• Service tag location
• Replace the Primary Master server
• Replace the Standby Master server
• Replace a Master server in a DCA without a Greenplum database

Required tools

You need the following tools to remove and replace a server:
• #2 Phillips screwdriver
• Wrist grounding strap

Task summary

Table 1 Summary of Master server replacement tasks

Tasks | Primary Master, Standby Master, Primary or Standby Master in a DCA with no initialized GP database
Check BIOS version when replacing a Master server in the cluster. When installing a replacement server, identify the BIOS version on the new server (as well as the versions already running in the DCA). Then upgrade so that all servers reflect the same firmware levels. Go to http://support.emc.com to obtain the pertinent BIOS upgrade instructions. The upgrade instructions provide information on how to access and install the upgrade package. | x x x
Check with the customer if any custom configurations have been applied. | x x x
Disable health monitoring. | x x x
Check Master server sync state. | x x
Check Greenplum database for errors. | x x
If necessary, initiate an orchestrated failover from the Primary server to the Standby server. | x x
   **If you are replacing a Primary Master in a Hadoop-only DCA**: include the argument --deletevip
Verify the success of the failover. | x
Power off the failed server. | x x x
Label then remove cables from the failed server. | x x x
Install the replacement server. | x x x
Transfer drives from the failed server to the replacement server. | x x x
Connect cables to the replacement server (but do not power it on yet). | x x x
Configure the BMC IP address. | x x x
Power on the replacement server. | x x x
Import foreign disk configurations. | x x x
Check the health of the replacement server. | x x x
Exchange SSH keys. | x x
Initialize the replacement server as the acting Standby Master server (temporarily). | x
Initiate a failover from the replacement server (the acting Standby Master server). | x
   **If you are replacing a Primary Master in a Hadoop-only DCA**: include the argument --deletevip
Revert the smdw server to its former standby role. | x
   **If you are replacing a Primary or Standby Master in a Hadoop-only DCA**
Recover and rebalance the GPDB segment instances on the Primary Master server. | x x x
Synchronize the system clock. | x x
Ask the customer if there are any custom configurations that must be reapplied to the DCA (for example, NFS mounts or gateways). | x x
Re-enable health monitoring. | x x

Service tag location

When replacing any hardware component, it is important that you properly debrief the part. The serial number of a Master server is located on the blue service label affixed to the front left corner of the server.

Figure 1 Service tag location on the Master server (Dragon24 shown)

Replace the Primary Master server

Perform this procedure if the Primary Master server has failed or is failing.

IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1.
You may want to consult “Task summary” on page 11 for an overview of the Master server replacement procedures.

2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray. The red service cable is connected to port 48 on the Administration switch in the SYSRACK. For instructions on how to configure the IP address on your laptop, see “Connect a workstation to the DCA” on page 176.

3. If the Primary Master server is still accessible through SSH, perform step a through step e below. If the Primary Master server is not accessible through SSH, skip to step 4.
   a. Log in to the Primary Master server as the user root (see “Connect a workstation to the DCA” on page 176).
   b. Activate the server identification LED.
      # dca_blinker -h mdw -a ON
   c. Switch to the user gpadmin and determine whether the Primary and Standby Master servers are synced:
      # su - gpadmin
      $ gpstate -f
      If the output returns the status synchronized, the master servers are in sync. If synchronized is not returned in the output or the database is not running, do not replace the Primary Master server. Contact EMC Support.
   d. Switch to the user root and make note of any custom NFS mounts the customer may have created:
      $ su
      # cat /etc/fstab
   e. Make note of any custom network gateways the customer may have created:
      # cat /etc/sysconfig/network

4. Before you replace the failed Primary Master server, perform the sub-steps below to determine whether an automatic failover occurred when the Primary Master failed. If an automatic failover did not occur, you must initiate an orchestrated (manual) failover before you replace the failed server (see step 5 below). To determine whether an automated failover occurred, do the following:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c.
Note the text of option 19:
      – If the text is Disable Master Server Auto Failover (currently enabled), an automatic failover occurred when the Primary Master failed. Skip to step 6 to determine if the failover was successful.
      – If the text is Enable Master Server Auto Failover (currently disabled), you must initiate an orchestrated (manual) failover as described in step 5.
   d. Enter X to exit the DCA Setup utility.

5. If an automatic failover did occur, proceed to step 6. If an automatic failover did not occur, initiate an orchestrated (manual) failover as follows:
   a. From the Standby Master server, issue the dca_failover command:
      # dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0

      **If you are servicing a Hadoop-only DCA**
      In a Hadoop-only DCA, make sure to include the option --deletevip:
      # dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip
   b. Replace the example --vip, --gateway, and --netmask values shown above with the IP, Gateway, and Netmask of the virtual IP address provided by the customer. If the customer has not specified a virtual IP address, do not include the --vip, --gateway, and --netmask parameters. Wait for the prompt to appear indicating that the failover has completed before you continue.
   c. When the failover has completed, proceed to step 6 to determine if the failover operation was successful.

6. To determine whether the Master server failover operation was successful, switch to the user gpadmin and issue the following command. Verify that the text in bold is returned in the output:
      # su - gpadmin
      $ gpstate -f
      [INFO]:-Starting gpstate with args: -f
      [INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.1.1.3 build 4'
      [INFO]:-Obtaining Segment details from master...
      [INFO]:-Standby master instance not configured

7. Check the Greenplum Database for errors.
If any errors are returned in the output, you must resolve them before you continue with this procedure:
      $ gpstate -e
      [INFO]:----------------------------------------------------
      [INFO]:-Segment Mirroring Status Report
      [INFO]:----------------------------------------------------
      [INFO]:-All segments are running normally

8. To prevent false dial home messages from being sent to EMC Support during service, log in to the Standby Master server as the user root and stop the healthmon daemon to disable health monitoring:
      $ su
      # dca_healthmon_ctl -d

9. Shut down the Primary Master server:
   • If the failed Primary Master server is accessible through SSH, log in to it as the user root and issue the shutdown command.
     IMPORTANT
     Check the prompt to make sure that you are on the Primary Master (mdw) before you issue the shutdown command!
      $ ssh root@mdw
      # shutdown -h now
   • If the failed server is not accessible through SSH, power it off by pressing the power button on the front of the server for 5 seconds (see Figure 2 below).

   Figure 2 Power button location on Master server

10. Label all the cables connected to the failed server so that you’ll know where to connect them on the replacement server.

11. Remove all power, Ethernet, and twin-axial cables from the back of the server.
    Note: If the system has Dual NICs installed, note the connections for customer and interconnect networks prior to disconnecting.

12. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).

13. Transfer disk drives one at a time from the failed server to the replacement server.
    IMPORTANT
    Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.

14. Connect Ethernet and twin-ax cables to the replacement server.
Refer to the labels on the cables for proper connectivity.
    IMPORTANT
    Do not connect power to the replacement server yet.

15. From the Standby Master server, start the dhcpd service as the user root:
      # service dhcpd start

16. Connect the power cables to the replacement server.

17. Next, use these steps to identify the IP address assigned to the server.
   a. Issue the following command to obtain the lease information provided in the dhcpd.leases file:
      # tail /var/lib/dhcpd/dhcpd.leases
   b. The dhcpd.leases file displays output similar to the following:
      Example
      lease 172.28.6.170 {
        starts 4 2012/10/18 20:09:08;
        ends 5 2013/10/18 20:09:08;
        cltt 4 2012/10/18 20:09:08;
        binding state active;
        next binding state free;
        hardware ethernet 00:00:00:00:00:04;
        uid "\001\000\036g,\242\014";
   c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases file above: 00:00:00:00:00:04
   d. Locate the MAC address on the replacement server’s service tag (highlighted in the photograph below): MAC1 00:00:00:00:00:00

      Figure 3 Locating the MAC address on the service tag (Primary Master server shown, Dragon24)
   e. Compare the last two digits in the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If this is the case, the IP address in the dhcpd.leases file is the correct one to associate with the server. For example, the scenario described above verifies that 172.28.6.170 is correct in this specific instance.
   f. Issue the following ipmiutil command.
Insert the IP address (172.28.6.170) identified in the previous steps using the example above as a guide:
      Note: Disregard the long, detailed output after this command is executed.
      # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
   g. Ping the new address to verify that the change was applied:
      # ping 172.28.0.#
      Where # is the number of the server you are replacing.

18. Turn off the dhcpd service:
      # service dhcpd stop

19. From the Primary Master server as user root, issue the following command to open a BMC console session on the replacement Master server mdw:
      # ipmiutil sol -a -e -N mdw-sp -U root -P sephiroth
    You will need to press the F key within 15 seconds after seeing this WARNING message:
      Foreign configuration(s) found on adapter
      Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.

20. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.

21. When the following message displays, disregard it and press the space bar:
      All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
      Press any key to continue, or 'C' to load the configuration utility.

22. If the message below appears, you will need to power off the server, verify that all LED lights are off, and repeat steps 20 through 21.
      CLIENT MAC ADDR: 00 1E 67 4D C5 1D GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
      DHCP....\

23. Monitor the boot process onscreen and verify that the system boots from hard disk. If the system does boot from hard disk, proceed to step 24. If the system does not boot from hard disk, perform the following sub-steps to force it to boot from hard disk:
   a.
       Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
       Note: When you exit the BMC console you are returned to your connection on the smdw as the user root.
    b. Issue the following command from smdw to force the appliance to boot from hard drive:
        # ipmiutil reset -h -N mdw-sp -U root -P sephiroth
    c. Once the operating system is loaded, issue the following commands to change the boot order on mdw:
        # ssh mdw
        # syscfg /bbo "emcbios" HDD NW
    d. Reboot mdw:
        # reboot
    e. Following the reboot, issue the following commands to connect to mdw and verify the boot order:
        # ssh mdw
        # syscfg /bbosys
    f. Exit mdw:
        # exit
    g. Proceed to step 24. (You can skip step 24 because you already exited the BMC console in sub-step a above.)
24. From the Standby Master server (smdw), check the health of the replacement server:
      # dcacheck -h mdw
    Verify that no errors display.
25. Exchange SSH keys on the replacement server using the DCA Setup utility:
    a. Start the DCA Setup utility as the user root:
        # dca_setup
    b. Select option 2 to Modify DCA Settings.
    c. Select option 6 to Generate SSH Keys.
    d. Enter X to exit the DCA Setup utility.
    e. Check the firmware level of the RAID controllers with the CmdTool2 utility:
        # ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       If the above command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f through n below to update the firmware.
    f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
        https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
    g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
        unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
    h.
       As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
       Note: You may be required to provide a login to the destination server.
    i. For each server in need of an update, log in to the server as root.
    j. Use SCP to copy the MR56p.rom file from the master to the server you are updating.
    k. Install the new firmware using the following command:
       Note: This will take longer on 24-disk servers.
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
    l. Reboot the server:
        # reboot
    m. When the server reboots, check the new firmware version:
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       The following should be returned, indicating your firmware has successfully been updated on this server:
        FW Package Build: 23.12.0-0013
    n. Repeat these lettered steps to check/update the remaining servers in the cluster.
26. Switch to the user gpadmin and issue the following commands to initialize the replacement server as the acting Standby Master server:
      # su - gpadmin
      $ ssh mdw rm -r /data/master/*
      $ gpinitstandby -s mdw
27. At the message Do you want to continue with standby master initialization? enter Y to continue. Wait for the message Successfully created standby master on mdw.
28. Log in to the replacement server (now the new acting Standby Master server) as the user root:
      $ ssh root@mdw
29. Issue the following command to initiate the orchestrated (manual) failover:
      # dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0
    **If you are servicing a Hadoop-only DCA**
    In a Hadoop-only DCA, make sure to include the option --deletevip:
      # dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip
30. At the message Do you want to continue? enter y.
    IMPORTANT: Initiating a failover stops the Greenplum Database and renders it temporarily unavailable.
    When the failover operation finishes you are returned to the prompt [root@mdw]#.
31. Switch to the user gpadmin:
      # su - gpadmin
32. Connect to the Standby Master server and empty the /data/master directory:
      $ ssh smdw rm -r /data/master/*
33. Issue the following command to revert the smdw server to its standby role:
      $ gpinitstandby -s smdw
34. At the message Do you want to continue with standby master initialization? enter Y.

**If you are servicing a Hadoop-only DCA**
Perform the next two steps only if you are replacing a Master server in a Hadoop-only DCA.
35. Issue the following command to recover the GPDB segment instances running on the Primary and Standby Master servers:
      $ gprecoverseg
    Enter Y when prompted. For example:
      Continue with segment recovery procedure Yy|Nn (default=N): > Y
36. Issue the following command to rebalance the GPDB segment instances running on the Primary and Standby Master servers:
      $ gprecoverseg -r
    Enter Y when prompted. For example:
      Continue with segment rebalance procedure Yy|Nn (default=N): > Y

The procedure continues here for all DCA types:
37. Exit the user gpadmin:
      $ exit
38. Start the DCA Setup utility:
      # dca_setup
39. Synchronize the system clock:
    a. Select option 2 for Modify DCA Settings.
    b. Select option 5 for Modify NTP/Clock Configuration Options.
    c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
    d. Enter X to exit the DCA Setup utility.
40.
    IMPORTANT: The same DCA system Serial Number (located on a label affixed to the top, rear of the rack) must be included in the following files for Dial Home to work after replacing a Master application server (mdw and smdw in the case of GPDB, and hdm and standby hdm in the case of Hadoop):
    • /opt/connectemc/ConnectEMC.ini
    • /opt/greenplum/serialnumber
    First, check the DCA system Serial Number in the connectemc initialization file, /opt/connectemc/ConnectEMC.ini, as follows:
    a. Open the connectemc initialization file:
        /opt/connectemc/ConnectEMC.ini
    b. Locate the DCA system Serial Number per the following keyword in the file:
        SERIAL_NUMBER=
    c. Check that this matches the DCA system Serial Number on the label affixed to the top, rear of the rack. Go to the next step (step d) if the Serial Number is missing.
    d. If missing, enter the Serial Number in the /opt/connectemc/ConnectEMC.ini file, for example:
        SERIAL_NUMBER=APMXXXXXXXXX
41. Next, check that the DCA system Serial Number in the /opt/greenplum/serialnumber file matches the DCA system Serial Number in the /opt/connectemc/ConnectEMC.ini file, per step 40 above. For example:
      SERIAL_NUMBER=APM00140732731
    Note: After verifying that the DCA system Serial Numbers are identical, remember to save the /opt/greenplum/serialnumber file if you made any changes.
42. Re-enable health monitoring:
      # dca_healthmon_ctl -e
43. You must stop and start the connectemc service (also referred to as Dial Home) to complete restarting the healthmon daemon. Enter the command:
      # service connectemc stop
    You will see the message:
      Shutting down ConnectEMC
44. When you see the # prompt again, enter:
      # service connectemc start
    You will see the message:
      Starting ConnectEMC
    The # prompt returns, indicating that you have re-enabled health monitoring.
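
The serial number comparison in steps 40 and 41 above can also be done from the command line. The following is a minimal sketch, not part of the DCA tooling: the function names get_serial and compare_serials are hypothetical, and it assumes that each file holds a line of the form SERIAL_NUMBER=<value>, as the examples in this guide show. It only reads the files; any correction is still made by hand as described above.

```shell
#!/bin/bash
# Hypothetical helper: print the value of the first SERIAL_NUMBER= line in a file.
# Assumes the "SERIAL_NUMBER=<value>" layout shown in the guide's examples.
get_serial() {
    sed -n 's/^[[:space:]]*SERIAL_NUMBER[[:space:]]*=[[:space:]]*//p' "$1" \
        | head -n 1 | tr -d '[:space:]'
}

# Hypothetical helper: compare the serial numbers found in two files.
# Prints MATCH, MISMATCH, or MISSING and returns 0, 1, or 2 respectively.
compare_serials() {
    local first second
    first=$(get_serial "$1")
    second=$(get_serial "$2")
    if [ -z "$first" ] || [ -z "$second" ]; then
        echo "MISSING"
        return 2
    fi
    if [ "$first" = "$second" ]; then
        echo "MATCH $first"
        return 0
    fi
    echo "MISMATCH $first $second"
    return 1
}
```

After sourcing the functions, a typical call would be: compare_serials /opt/connectemc/ConnectEMC.ini /opt/greenplum/serialnumber. The printed value must still be checked against the label on the rack, which the script cannot see.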
Replace the Standby Master server

Perform this procedure if the Standby Master server has failed or is failing and the Primary Master server is in good health.

IMPORTANT: This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 11 for an overview of the Master server replacement procedures.
2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray. For details on how to configure the IP address of your laptop, see “Connect a workstation to the DCA” on page 176.
3. To prevent false dial home messages from being sent to EMC Support during service, stop the healthmon daemon to disable health monitoring:
      # dca_healthmon_ctl -d
4. If the Standby Master server is still accessible through SSH, perform step a through step e below. If the failed Standby Master server is not accessible through SSH, skip to step 5.
    a. Log in to the Primary Master server as the user root (see “Connect a workstation to the DCA” on page 176).
    b. Activate the server identification LED:
        # dca_blinker -h smdw -a ON
    c. Switch to the user gpadmin and determine whether the Primary and Standby Master servers are synchronized:
        # su - gpadmin
        $ gpstate -f
       If the output returns the status synchronized, the Master servers are in sync. If synchronized is not returned in the output, do not replace the Standby Master server. Contact EMC Support.
    d. Switch to the user root and make note of any custom NFS mounts the customer may have created:
        $ su
        # cat /etc/fstab
    e. Make note of any custom network gateways the customer may have created:
        # cat /etc/sysconfig/network
5.
    From the Primary Master server, switch to the user gpadmin and remove the Standby Master server from the configuration:
      # su - gpadmin
      $ gpinitstandby -r
6. When prompted, enter Y to continue.
7. Shut down the Standby Master server:
    • If the failed Standby Master server is accessible through SSH, log in to it as the user root and issue the shutdown command.
      IMPORTANT: Check the prompt to make sure that you are on the Standby Master (smdw) before you issue the shutdown command!
        $ ssh root@smdw
        # shutdown -h now
    • If the failed server is not accessible through SSH, power it off by pressing the power button on the front of the server.
8. Label all the cables connected to the failed server so that you’ll know where to connect them on the replacement server.
9. Remove all power, Ethernet, and twin-axial cables from the back of the server.
    Note: If the system has Dual NICs installed, note the connections for customer and interconnect networks prior to disconnecting.
10. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
11. Transfer disk drives one at a time from the failed server to the replacement server.
    IMPORTANT: Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
12. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
    IMPORTANT: Do not connect power to the replacement server yet.
13. From the Primary Master server, start the dhcpd service as the user root:
      # service dhcpd start
14. Connect the power cables to the replacement server.
15. Next, use these steps to identify the IP address assigned to the server.
    a.
       Issue the following command to obtain the lease information provided in the dhcpd.leases file:
        # tail /var/lib/dhcpd/dhcpd.leases
    b. The dhcpd.leases file displays output similar to the following example:
        lease 172.28.6.170 {
          starts 4 2012/10/18 20:09:08;
          ends 5 2013/10/18 20:09:08;
          cltt 4 2012/10/18 20:09:08;
          binding state active;
          next binding state free;
          hardware ethernet 00:00:00:00:00:04;
          uid "\001\000\036g,\242\014";
    c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases file above:
        00:00:00:00:00:04
    d. Locate the MAC address on the replacement server’s service tag (highlighted in the photograph below):
        MAC1 00:00:00:00:00:00
       Figure 4 Locating the MAC address on the service tag (Standby Master server shown, Dragon24)
    e. Compare the last two digits in the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If so, the IP address in the dhcpd.leases file is the correct one to associate with the server. In the example above, 04 is four greater than 00, which confirms that 172.28.6.170 is correct in this specific instance.
    f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170) identified in the previous steps, using the example above as a guide:
       Note: Disregard the long, detailed output after this command is executed.
        # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
    g. Ping the new address to verify that the change was applied:
        # ping 172.28.0.#
       Where # is the number of the server you are replacing.
16. Turn off the dhcpd service:
      # service dhcpd stop
17. Power on the replacement server by pressing the button on the front panel.
18. Issue the following command to open a BMC console session on the replacement Master server:
      # ipmiutil sol -a -e -N smdw-sp -U root -P sephiroth
    You will need to press the F key within 15 seconds after seeing this WARNING message:
      Foreign configuration(s) found on adapter
      Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
19. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
20. When the following message displays, disregard it and press the space bar:
      All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
      Press any key to continue, or 'C' to load the configuration utility.
21. If the message below appears, it indicates that the server did not accept the “F” key request per the above WARNING. This means that you will need to power off the server, verify that all LED lights are off, and go back to step 19 (press the F key when prompted again):
      CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D DHCP....\ GUID: 2A9B43A4 A50A 11E1 AAA0
22. Monitor the boot process onscreen and verify that the system boots from hard disk. If it does not, do the following to force it to boot from hard disk:
    a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
       Note: When you exit the BMC console you are returned to your connection on the mdw as the user root.
    b. Issue the following command from mdw to force the replacement server to boot from hard drive:
        # ipmiutil reset -h -N smdw-sp -U root -P sephiroth
    c.
       Once the operating system is loaded, issue the following commands to change the boot order on smdw:
        # ssh smdw
        # syscfg /bbo "emcbios" HDD NW
    d. Reboot the system:
        # reboot
    e. Following the reboot, issue the following commands to connect to smdw and verify the boot order:
        # ssh smdw
        # syscfg /bbosys
    f. Exit smdw:
        # exit
23. Check the health of the replacement server:
      # dcacheck -h smdw
    Verify that no errors display.
24. Exchange SSH keys on the replacement server using the DCA Setup utility:
    a. Start the DCA Setup utility as the user root:
        # dca_setup
    b. Select option 2 to Modify DCA Settings.
    c. Select option 6 to Generate SSH Keys.
    d. Enter X to exit the DCA Setup utility.
    e. Check the firmware level of the RAID controllers with the CmdTool2 utility:
        # ssh smdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       If the above command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f through n below to update the firmware.
    f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
        https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
    g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
        unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
    h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
       Note: You may be required to provide a login to the destination server.
    i. For each server in need of an update, log in to the server as root.
    j. Use SCP to copy the MR56p.rom file from the master to the server you are updating.
    k. Install the new firmware using the following command:
       Note: This will take longer on 24-disk servers.
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
    l. Reboot the server.
        # reboot
    m. When the server reboots, check the new firmware version:
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       The following should be returned, indicating your firmware has successfully been updated on this server:
        FW Package Build: 23.12.0-0013
    n. Repeat these lettered steps to check/update the remaining servers in the cluster.
25. Switch to the user gpadmin and issue the following command from the Primary Master server (mdw) to initialize the replacement server as the Standby Master server:
      # su - gpadmin
      $ gpinitstandby -s smdw

**If you are servicing a Hadoop-only DCA**
Perform the next two steps only if you are replacing a Master server in a Hadoop-only DCA.
26. Issue the following command to recover the GPDB segment instances running on the Primary and Standby Master servers:
      $ gprecoverseg
    Enter Y when prompted. For example:
      Continue with segment recovery procedure Yy|Nn (default=N): > Y
27. Issue the following command to rebalance the GPDB segment instances running on the Primary and Standby Master servers:
      $ gprecoverseg -r
    Enter Y when prompted. For example:
      Continue with segment rebalance procedure Yy|Nn (default=N): > Y

The procedure continues here for all DCA types:
28. Exit from the user gpadmin to the user root:
      $ exit
29. Start the dca_setup utility:
      # dca_setup
30. Synchronize the system clock:
    a. Select option 2 for Modify DCA Settings.
    b. Select option 5 for Modify NTP/Clock Configuration Options.
    c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
    d. Enter X to exit the DCA Setup utility.
31.
    IMPORTANT: The same DCA system Serial Number (located on a label affixed to the top, rear of the rack) must be included in the following files for Dial Home to work after replacing a Master application server (mdw and smdw in the case of GPDB, and hdm and standby hdm in the case of Hadoop):
    • /opt/connectemc/ConnectEMC.ini
    • /opt/greenplum/serialnumber
    First, check the DCA system Serial Number in the connectemc initialization file, /opt/connectemc/ConnectEMC.ini, as follows:
    a. Open the connectemc initialization file:
        /opt/connectemc/ConnectEMC.ini
    b. Locate the DCA system Serial Number per the following keyword in the file:
        SERIAL_NUMBER=
    c. Check that this matches the DCA system Serial Number on the label affixed to the top, rear of the rack. Go to the next step (step d) if the Serial Number is missing.
    d. If missing, enter the Serial Number in the /opt/connectemc/ConnectEMC.ini file, for example:
        SERIAL_NUMBER=APMXXXXXXXXX
32. Next, check that the DCA system Serial Number in the /opt/greenplum/serialnumber file matches the DCA system Serial Number in the /opt/connectemc/ConnectEMC.ini file, per step 31 above. For example:
      SERIAL_NUMBER=APM00140732731
    Note: After verifying that the DCA system Serial Numbers are identical, remember to save the /opt/greenplum/serialnumber file if you made any changes.
33. Re-enable health monitoring:
      # dca_healthmon_ctl -e
34. You must stop and start the connectemc service (also referred to as Dial Home) to complete restarting the healthmon daemon. Enter the command:
      # service connectemc stop
    You will see the message:
      Shutting down ConnectEMC
35. When you see the # prompt again, enter:
      # service connectemc start
    You will see the message:
      Starting ConnectEMC
    The # prompt returns, indicating that you have re-enabled health monitoring.
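
The MAC address comparison in step 15e of the procedure above is easy to get wrong by eye, because the digits are hexadecimal. The following is a minimal sketch of that check, not part of the DCA tooling: the function names are hypothetical, and it assumes colon-separated MAC addresses and a lease file laid out like the dhcpd.leases example in this guide. As a design choice it compares the whole address rather than only the last two digits, so a carry out of the last byte is handled and a differing prefix is rejected.

```shell
#!/bin/bash
# Hypothetical helper: convert a colon-separated MAC address to an integer.
# A 48-bit MAC fits comfortably in bash's 64-bit arithmetic.
mac_to_int() {
    local hex=${1//:/}
    echo $(( 16#$hex ))
}

# Hypothetical helper: succeed only if the lease MAC is exactly four greater
# than the service tag MAC (the relationship described in step 15e).
lease_mac_matches_tag() {
    local lease_mac=$1 tag_mac=$2
    [ $(( $(mac_to_int "$lease_mac") - $(mac_to_int "$tag_mac") )) -eq 4 ]
}

# Hypothetical helper: print the MAC from the last "hardware ethernet" entry
# in a lease file, assuming the layout shown in the guide's example.
last_lease_mac() {
    awk '/hardware ethernet/ { gsub(";", "", $3); mac = $3 } END { print mac }' "$1"
}
```

After sourcing the functions, a typical call would be: lease_mac_matches_tag "$(last_lease_mac /var/lib/dhcpd/dhcpd.leases)" 00:00:00:00:00:00 && echo "lease matches service tag", with the second argument replaced by the MAC1 value read from the service tag.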
Identifying a single-NIC master versus a dual-NIC master in a DCAv2

The Master server is a dual-NIC master if both eth6 and eth7 are present on the mdw or smdw. If the server is powered down, the only way to identify a dual-NIC master is to visually inspect it, by counting the number of SFP ports.

Replace a Master server in a DCA without a Greenplum database

Perform this procedure to replace a failed Master server in a DCA in which the Greenplum database is either not installed or is uninitialized.

IMPORTANT: This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 11 for an overview of the Master server replacement procedures.
2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray. The red service cable is connected to port 48 on the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial home messages from being sent to EMC Support during service, stop the healthmon daemon to disable health monitoring:
      # dca_healthmon_ctl -d
4. If the failed server is still accessible by SSH, perform the following steps. If the failed Master server is not accessible through SSH, skip to step 5.
    a. Log in to the functioning Master server as the user root (see “Connect a workstation to the DCA” on page 176).
    b. Activate the server identification LED. Enter either mdw or smdw for the hostname, whichever applies:
        # dca_blinker -h smdw -a ON
    c. Make note of any custom NFS mounts the customer may have created:
        # cat /etc/fstab
    d.
       Make note of any custom network gateways the customer may have created:
        # cat /etc/sysconfig/network
5. If possible, while connected to the failed Master server, issue the following command to shut down the server.
    IMPORTANT: Check the prompt to make sure that you are on the correct Master server (mdw or smdw) before you issue the shutdown command!
      # shutdown -h now
6. If the failed server is inaccessible through SSH, power it off by pressing the power button on the front of the server.
7. Label all the cables connected to the failed server so that you’ll know where to connect them on the replacement server.
8. Remove all power, Ethernet, and twin-axial cables from the back of the server.
    Note: If the system has Dual NICs installed, note the connections for customer and interconnect networks prior to disconnecting.
9. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
10. Transfer disk drives one at a time from the failed server to the replacement server.
    IMPORTANT: Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
11. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
    IMPORTANT: Do not connect power to the replacement server yet.
12. From the functional Master server, start the dhcpd service:
      # service dhcpd start
13. Connect the power cables to the replacement server.
14. Next, use these steps to identify the IP address assigned to the server.
    a.
       Issue the following command to obtain the lease information provided in the dhcpd.leases file:
        # tail /var/lib/dhcpd/dhcpd.leases
    b. The dhcpd.leases file displays output similar to the following example:
        lease 172.28.6.170 {
          starts 4 2012/10/18 20:09:08;
          ends 5 2013/10/18 20:09:08;
          cltt 4 2012/10/18 20:09:08;
          binding state active;
          next binding state free;
          hardware ethernet 00:00:00:00:00:04;
          uid "\001\000\036g,\242\014";
    c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases file above:
        00:00:00:00:00:04
    d. Locate the MAC address on the replacement server’s service tag (highlighted in the photograph below):
        MAC1 00:00:00:00:00:00
       Figure 5 Locating the MAC address on the service tag (Master server shown, Dragon24)
    e. Compare the last two digits in the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If so, the IP address in the dhcpd.leases file is the correct one to associate with the server. In the example above, 04 is four greater than 00, which confirms that 172.28.6.170 is correct in this specific instance.
    f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170) identified in the previous steps, using the example above as a guide:
       Note: Disregard the long, detailed output after this command is executed.
        # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
    g. Ping the new address to verify that the change was applied:
        # ping 172.28.0.#
       Where # is the number of the server you are replacing.
15. Turn off the dhcpd service:
      # service dhcpd stop
16. Power on the replacement server by pressing the button on the front panel.
17. Issue the following command to open a console session on the replacement server. Enter either mdw or smdw for the hostname, whichever applies. For example, for smdw:
      # ipmiutil sol -a -e -N smdw-sp -U root -P sephiroth
    You will need to press the F key within 15 seconds after seeing this WARNING message:
      Foreign configuration(s) found on adapter
      Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
18. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
19. When the following message displays, disregard it and press the space bar:
      All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
      Press any key to continue, or 'C' to load the configuration utility.
20. If the message below appears, it indicates that the server did not accept the “F” key request per the above WARNING. This means that you will need to power off the server, verify that all LED lights are off, and go back to step 18 (press the F key when prompted again):
      CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D DHCP....\ GUID: 2A9B43A4 A50A 11E1 AAA0
21. Monitor the boot process onscreen and verify that the system boots from hard disk. If it does not, do the following to force it to boot from hard disk:
    a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
       Note: When you exit the BMC console you are returned to your connection on the functioning Master server as the user root.
    b. Issue the following command from either the Primary or the Standby Master server, whichever applies.
       – If you replaced a Primary Master, issue the command from smdw.
       – If you replaced a Standby Master, issue the command from mdw.
       For example, for mdw:
        # ipmiutil reset -h -N mdw-sp -U root -P sephiroth
    c. Once the operating system is loaded, issue the following commands to change the boot order on the replacement server. For example, if you replaced mdw:
        # ssh mdw
        # syscfg /bbo "emcbios" HDD NW
    d. Reboot the system:
        # reboot
    e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order. For example, if you replaced mdw:
        # ssh mdw
        # syscfg /bbosys
    f. Check the firmware level of the RAID controllers with the CmdTool2 utility:
        # ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       If the above command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, your firmware needs updating. Follow steps g through o below to update the firmware.
    g. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
        https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
    h. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
        unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
    i. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
       Note: You may be required to provide a login to the destination server.
    j. For each server in need of an update, log in to the server as root.
    k. Use SCP to copy the MR56p.rom file from the master to the server you are updating.
    l. Install the new firmware using the following command:
       Note: This will take longer on 24-disk servers.
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
    m. Reboot the server:
        # reboot
    n.
       When the server reboots, check the new firmware version:
        # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       The following should be returned, indicating your firmware has successfully been updated on this server:
        FW Package Build: 23.12.0-0013
    o. Repeat these lettered steps to check/update the remaining servers in the cluster.
22. Issue the following command to check the health of the replacement server. For example, if you replaced mdw:
      # dcacheck -h mdw
    Verify that no errors display.
23. Re-enable health monitoring by restarting the healthmon daemon:
      # dca_healthmon_ctl -e
24. You must stop and start the connectemc service to complete restarting the healthmon daemon:
      # service connectemc stop
      # service connectemc start

CHAPTER 3 Replace a Segment, DIA, or Hadoop server

This chapter describes how to replace a server used in GPDB, DIA, and GP HD modules. It includes the following major sections:
    • Required tools
    • Task summary
    • Service tag locations
    • Reseat cables before replacing a server
    • Replace a server in an initialized GPDB module
    • Replace a DIA server or a server in an uninitialized GPDB module
    • Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)
    • Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)

Required tools

You need the following tools to remove and replace a server:
• #2 Phillips screwdriver
• Wrist grounding strap

Task summary

Table 2 Segment (GPDB), DIA, and Hadoop server replacement task summary

Column headings: Tasks; Segment server in an initialized GPDB module; DIA server or Segment server in an uninitialized GPDB module; Hadoop Master or Worker server

Check BIOS version when replacing a server. When installing a replacement server, identify the BIOS version on the new server (as well as the versions already running in the DCA). Then upgrade so that all servers reflect the same firmware levels. Go to http://support.emc.com to obtain the pertinent BIOS upgrade instructions. The upgrade instructions provide information on how to access and install the upgrade package. x x x
Check and reseat cables. x x x
Connect to the DCA. x x x
Disable health monitoring. x x x
Check number of segments that are showing Change Tracking. x
Activate light bar to locate the failed server. x x x
Ask the customer about 3rd party software. x
Note MAC address of the adapter eth1. x
Note NFS mounts and custom gateways. x x
Power off the failed server. x x x
Install the replacement server. x x x
Transfer drives from the failed server to the replacement server. x x x
Connect cables to the replacement server. x x x
Configure the BMC IP address. x x x
Power on the replacement server. x x x
Import foreign disk configurations. x x x
Monitor the boot process and verify that the replacement server boots from hard disk. x x x
Check the health of the replacement server. x x x
Exchange SSH keys. x x x
Launch gprecoverseg utility. x
Issue gpstate -m to verify data status of all segments is Synchronized. x
Issue gprecoverseg to restore the server to its optimal configuration. x
Issue gpstate -e to check for errors. x
Synchronize the system clock. x x x
Verify with the customer that NFS mounts or gateways (if any) are functioning. x x
Configure the external IP address (eth1).
Re-enable health monitoring. x x
Tell customer that they can reinstall 3rd party software. x x x (DIA server only)

Service tag locations

When replacing any hardware component, make sure to properly de-brief the part. Locate the serial number on the blue label affixed to the rear of the rotating power console on the front of each segment server.

Figure 6 Service tag location on 24-drive Segment server

Reseat cables before replacing a server

Before you replace a failed server, determine whether the problem is caused by a faulty cable connection. Remove and then reconnect cables as described below.

1. Connect your service laptop to the red service cable located on the laptop tray in Rack 1. The red service cable is connected to port 48 on the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on page 176).
2. Open a console connection to the Primary Master server as the user root using IP address 172.28.4.250 and password changeme.
3. To identify the failed server, activate its server identification light by issuing the following command. Replace sdw1 with the hostname of the server you want to identify:
   # dca_blinker -h sdw1 -a ON
   Note: If the server is completely non-operational, the light might not work.
4. Shut down the failed server:
   • If you can access the server through SSH: Enter the following command.
     Replace sdw1 with the hostname of the segment server you are working on:
     IMPORTANT Check the prompt to make sure that you are on the correct server before you issue the shutdown command!
     # ssh sdw1
     # shutdown -h now
   • If you cannot access the server through SSH: Make sure that the server is powered off. Press the power button on the front of the server if necessary.
5. Once the server is powered off, unplug and then firmly reconnect the administration network cable, two interconnect cables, and two power supply AC cables. Figure 7 shows the relevant cable connection sites.

Figure 7 Re-seat cables on the back of the server. Labeled connections: two power connections, Administration switch (CM and BMC; 1Gb), lower Interconnect switch (10Gb), and upper Interconnect switch (10Gb). CM = Cluster Management; BMC = Baseboard Management Controller service port.

6. Power on the server by pressing the power button on the front of the server. Wait for the server to boot (approximately 5 minutes).
7. From the Primary Master server, issue the ping command to each interface on the server. Replace sdw1 with the hostname of the server you are evaluating:
   # ping sdw1-cm
   # ping sdw1
   • If there is no response from the interfaces, replace the server by performing the appropriate procedure listed below:
     – “Replace a server in an initialized GPDB module” on page 44
     – “Replace a DIA server or a server in an uninitialized GPDB module” on page 51
     – “Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)” on page 58
   • If all interfaces on the server respond, you do not need to replace the server. If the server you are evaluating is a DIA server, you are done. If the server you are evaluating is part of a GPDB or HD module, issue the following commands to recover the segment instances:
     # su - gpadmin
     $ gprecoverseg

Replace a server in an initialized GPDB module

Perform this procedure to replace a failed or failing Segment server that is part of an initialized GPDB module. This procedure describes how to replace the server hardware and recover segment instances. To replace a server that is part of a DIA module or part of an uninitialized GPDB module (or a module in which GPDB is not installed), see “Replace a DIA server or a server in an uninitialized GPDB module” on page 51.

IMPORTANT This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 40 for an overview of the segment server replacement procedures.
2. Make sure that you have checked the cable connections (see “Reseat cables before replacing a server” on page 42).
3. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray in Rack 1. The red service cable is connected to port 48 on the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on page 176).
4. To prevent false dial home messages from being sent to EMC Support during service, disable health monitoring by stopping the healthmon daemon:
   # dca_healthmon_ctl -d
5. Log in to the Primary Master server as the user gpadmin.
6. Issue the gpstate -m command. Verify that no more than eight segment instances display a status of Change Tracking.
   In this example, sdw1 has failed:
   $ gpstate -m
   Mirror   Datadir                Port    Status              Data Status
   sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Change Tracking
   sdw3-2   /data2/mirror/gpseg1   50003   Acting as Primary   Change Tracking
   sdw4-2   /data2/mirror/gpseg2   50003   Acting as Primary   Change Tracking
   sdw2-1   /data1/mirror/gpseg3   50000   Acting as Primary   Change Tracking
   sdw3-1   /data1/mirror/gpseg4   50000   Acting as Primary   Change Tracking
   sdw4-1   /data1/mirror/gpseg5   50000   Acting as Primary   Change Tracking
   sdw3-1   /data1/mirror/gpseg6   50000   Acting as Primary   Change Tracking
   sdw4-1   /data1/mirror/gpseg7   50000   Acting as Primary   Change Tracking
7. To locate the server, as the user root, use the DCA Setup utility to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
8. If you can still access the server via SSH, perform sub-steps (a) through (d) below. If you cannot access the server via SSH, proceed to step 9.
   a. Log in to the failed segment server as the user root. Replace sdw1 with the hostname of the failed server:
      # ssh root@sdw1
   b. Make note of any custom NFS mounts the customer may have created:
      # cat /etc/fstab
   c. Make note of any custom network gateways the customer may have created:
      # cat /etc/sysconfig/network
   d. Shut down the failed server:
      # shutdown -h now
9. If the failed server is inaccessible through SSH, power it off by pressing the power button on the front of the server.
10. Label all the cables connected to the failed server so that you know where to connect them on the replacement server.
11. Remove all power, Ethernet, and twin-axial cables from the back of the server.
   Note: If the system has Dual NICs installed, note the connections for customer and interconnect networks prior to disconnecting.
12. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
13. Transfer disk drives one at a time from the failed server to the replacement server.
   IMPORTANT Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
14. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
   IMPORTANT Do not connect power to the replacement server yet.
15. From the Primary Master server, start the dhcpd service:
   # service dhcpd start
16. Connect the power cables to the replacement server.
17. Identify the IP address assigned to the server:
   a. Issue the following command to obtain the lease information provided in the dhcpd.leases file:
      # tail /var/lib/dhcpd/dhcpd.leases
   b. The contents of the dhcpd.leases file display, similar to the following example:
      lease 172.28.6.170 {
        starts 4 2012/10/18 20:09:08;
        ends 5 2013/10/18 20:09:08;
        cltt 4 2012/10/18 20:09:08;
        binding state active;
        next binding state free;
        hardware ethernet 00:00:00:00:00:04;
        uid "\001\000\036g,\242\014";
      }
   c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases file above:
      00:00:00:00:00:04
   d. Locate the MAC address on the replacement server’s service tag (highlighted in the photograph below):
      MAC1 00:00:00:00:00:00
      Figure 8 Locating the MAC address on the service tag (Dragon24)
   e. Compare the last two digits of the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If this is the case, the IP address in the dhcpd.leases file is the correct one to associate with the server. For example, the scenario described above verifies that 172.28.6.170 is correct in this specific instance.
   f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170) that you identified in the previous steps, using the example above as a guide:
      # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
      Note: Disregard the long, detailed output after this command is executed.
   g. Ping the new address to verify that the change was applied:
      # ping 172.28.0.#
      Where # is the number of the server you are replacing.
18. Turn off the dhcpd service:
   # service dhcpd stop
19. From the Primary Master server as the user root, issue the following command to open a BMC console session on the replacement server. Replace sdw1 with the hostname of the replacement server:
   # ipmiutil sol -a -e -N sdw1-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds after seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or 'C' load the configuration utility, or 'F' to import foreign configuration(s) and continue.
20. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted. Note that after pressing the space bar in the next step you will again be prompted to press the F key within 15 seconds.
21. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
22. Press the F key when prompted.
23. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
24. If the message below appears, power off the server, verify that all LED lights are off, and repeat steps 21 through 23.
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D  GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
25. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period (.) key.
      Note: When you exit the BMC console you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from the hard drive.
      Replace sdw1 in the command below with the hostname of the server you replaced:
      # ipmiutil reset -h -N sdw1-sp -U root -P sephiroth
   c. Once the operating system is loaded, issue the following commands to change the boot order on the replacement server. For example, on sdw1:
      # ssh sdw1
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the replacement server:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order. Replace sdw1 with the hostname of the server you replaced:
      # ssh sdw1
      # syscfg /bbosys
26. Check the health of the replacement server. Replace sdw1 with the hostname of the replacement segment server:
   # dcacheck -h sdw1
   Verify that no errors display.
27. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e. Check the firmware level of the RAID controllers on the replacement server. Replace sdw1 with the hostname of the replacement server:
      # ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either of the following, the firmware needs updating:
      FW Package Build: 23.7.0-0033
      FW Package Build: 23.9.0-0026
      Follow the steps below to update the firmware.
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root.
      You can use WinSCP or a similar utility.
      Note: You may be required to provide a server login name and password.
   i. For each server in need of an update, log in to the server as root.
   j. Use SCP to copy the MR56p.rom file from the Master server to the server you are updating.
   k. Install the new firmware using the following command:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
      Note: This will take longer on 24-disk servers.
   l. Reboot the server:
      # reboot
   m. When the server reboots, check the new firmware version. Replace sdw1 with the hostname of the replacement server:
      # ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following should be returned, indicating that the firmware has been updated successfully on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
28. Switch to the user gpadmin and launch the gprecoverseg utility to recover the segment instances:
   $ gprecoverseg -a
29. When the gprecoverseg utility is finished, issue the gpstate -m command and verify that the data status is reported as Resynchronizing in the output.
30. Wait a few minutes, and then issue the gpstate -m command again to verify that the data status of all segments is reported as Synchronized in the output.
31. Return the Greenplum system to its optimal configuration:
   $ gprecoverseg -ra
   IMPORTANT Issuing gprecoverseg -ra cancels running queries but does not interrupt database connections.
32. Issue the gpstate -e command and verify that no errors are reported.
33. Synchronize the system clock using the DCA Setup utility:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   Enter X to exit the DCA Setup utility.
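The data-status checks in steps 6, 29, and 30 can be scripted rather than read by eye. The following Python sketch counts how many mirror rows in gpstate -m output report each data status. It is a minimal illustration and not part of the DCA tooling: the function names are hypothetical, the row layout is assumed to match the example output shown in step 6, and the Passive/Synchronized row in the sample is an assumed illustration of a healthy mirror.

```python
import re
from collections import Counter

# Data-status values named in this procedure (steps 6, 29, and 30).
STATUSES = ("Change Tracking", "Resynchronizing", "Synchronized")

# Assumption: each mirror row has a host, a data directory ending in
# gpseg<N>, a port, and then the Status and Data Status text, as in the
# example output in step 6. Any prefix before the host is tolerated.
ROW = re.compile(r"\S+\s+\S*gpseg\d+\s+\d+\s+(.*)$")


def count_data_status(gpstate_output):
    """Count mirror rows per data status in `gpstate -m` output text."""
    counts = Counter()
    for line in gpstate_output.splitlines():
        match = ROW.search(line)
        if match is None:
            continue  # header, banner, or blank line
        tail = match.group(1).rstrip()
        for status in STATUSES:
            if tail.endswith(status):
                counts[status] += 1
                break
    return counts


def all_synchronized(counts):
    """True only if at least one row was seen and every row is Synchronized."""
    total = sum(counts.values())
    return total > 0 and counts["Synchronized"] == total


# Hypothetical sample text; the last row is an assumed healthy mirror.
SAMPLE = """\
Mirror  Datadir  Port  Status  Data Status
sdw2-2  /data2/mirror/gpseg0  50003  Acting as Primary  Change Tracking
sdw3-2  /data2/mirror/gpseg1  50003  Acting as Primary  Change Tracking
sdw2-1  /data1/mirror/gpseg3  50000  Passive  Synchronized
"""

if __name__ == "__main__":
    sample_counts = count_data_status(SAMPLE)
    print(dict(sample_counts))
    print("All synchronized:", all_synchronized(sample_counts))
```

For step 6, compare count_data_status(output)["Change Tracking"] against the limit of eight. For step 30, all_synchronized returns True only when every recognized row reports Synchronized, so unrecognized output is treated as not yet verified rather than as a pass.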
34. Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e

Replace a DIA server or a server in an uninitialized GPDB module

Perform this procedure only to replace a failed server that is part of a DIA module or part of an uninitialized GPDB module (or a module in which GPDB is not installed).

IMPORTANT This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. Make sure that you have checked the cable connections as described in “Reseat cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray in Rack 1. The red service cable is connected to port 48 on the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial home messages from being sent to EMC Support during service, disable health monitoring by stopping the healthmon daemon:
   # dca_healthmon_ctl -d
4. To locate the server, as the user root, use the DCA Setup utility to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
5. Log in to the Primary Master server as the user gpadmin.
6. If you can still access the server via SSH, perform sub-steps (a) through (f) below. If you cannot access the server via SSH, proceed to step 7.
   a. Log in to the failed server as the user root. Replace etl1 with the hostname of the failed server:
      $ ssh root@etl1
   b. If the server is part of a DIA module, ask the customer if any third-party software is installed. Discuss with the customer whether any files need to be saved before you power off the server.
   c. Switch to the user root and make note of the MAC address of the adapter eth1:
      # ifconfig eth1
   d. Make note of any custom NFS mounts the customer may have created:
      # cat /etc/fstab
   e. Make note of any custom network gateways the customer may have created:
      # cat /etc/sysconfig/network
   f. Shut down the failed server:
      # shutdown -h now
7. Label all the cables connected to the failed server so that you know where to connect them on the replacement server.
8. Remove all power, Ethernet, and twin-axial cables from the back of the server.
   Note: If the system has Dual NICs installed, note the connections for customer and interconnect networks prior to disconnecting.
9. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
10. Transfer disk drives one at a time from the failed server to the replacement server.
   IMPORTANT Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
11. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
   IMPORTANT Do not connect power to the replacement server yet.
12. From the Primary Master server, start the dhcpd service:
   # service dhcpd start
13. Connect the power cables to the replacement server.
14. Identify the IP address assigned to the server:
   a. Issue the following command to obtain the lease information provided in the dhcpd.leases file:
      # tail /var/lib/dhcpd/dhcpd.leases
   b. The contents of the dhcpd.leases file display, similar to the following example:
      lease 172.28.6.170 {
        starts 4 2012/10/18 20:09:08;
        ends 5 2013/10/18 20:09:08;
        cltt 4 2012/10/18 20:09:08;
        binding state active;
        next binding state free;
        hardware ethernet 00:00:00:00:00:04;
        uid "\001\000\036g,\242\014";
      }
   c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases file above:
      00:00:00:00:00:04
   d. Locate the MAC address on the replacement server’s service tag (highlighted in the photograph below):
      MAC1 00:00:00:00:00:00
      Figure 9 Locating the MAC address on the service tag (Dragon24)
   e. Compare the last two digits of the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If this is the case, the IP address in the dhcpd.leases file is the correct one to associate with the server. For example, the scenario described above verifies that 172.28.6.170 is correct in this specific instance.
   f. Issue the following ipmiutil command.
      Insert the IP address (172.28.6.170) identified in the previous steps, using the example above as a guide:
      # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
      Note: Disregard the long, detailed output after this command is executed.
   g. Ping the new address to verify that the change was applied:
      # ping 172.28.0.#
      Where # is the number of the server you are replacing.
15. Turn off the dhcpd service:
   # service dhcpd stop
16. Power on the replacement server by pressing the button on the front panel.
17. From the Primary Master server, issue the following command to open a BMC console session on the replacement server. Replace etl1 with the hostname of the replacement server:
   # ipmiutil sol -a -e -N etl1-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds after seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or 'C' load the configuration utility, or 'F' to import foreign configuration(s) and continue.
18. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted. Note that after pressing the space bar in the next step you may be prompted again to press the F key within 15 seconds (if the DIA server is a Dragon24).
19. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
20. Press the F key when prompted.
21. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
22. If the message below appears, the server did not accept the F key request made in response to the WARNING message above. Power off the server, verify that all LED lights are off, and go back to step 18 (press the F key when prompted again):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D  GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
   DHCP....\
23. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period (.) key.
      Note: When you exit the BMC console you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from the hard drive. Replace etl1 with the hostname of the server you replaced:
      # ipmiutil reset -h -N etl1-sp -U root -P sephiroth
   c. Once the operating system is loaded, issue the following commands to change the boot order on the replacement server. For example, on etl1:
      # ssh etl1
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order. Replace etl1 with the hostname of the server you replaced:
      # ssh etl1
      # syscfg /bbosys
24. Issue the following command to check the health of the replacement server. Replace etl1 with the hostname of the replacement DIA or Segment server:
   # dcacheck -h etl1
   Verify that no errors display.
25. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
26. Using gpssh, check the firmware level of the RAID controllers with the CmdTool2 utility:
   # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
   If the command returns either of the following, the firmware needs updating:
   FW Package Build: 23.7.0-0033
   FW Package Build: 23.9.0-0026
   Follow the steps below to update the firmware.
   e. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   f. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   g. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a server login name and password.
   h. For each server in need of an update, log in to the server as root.
   i. Use SCP to copy the MR56p.rom file from the Master server to the server you are updating.
   j. Install the new firmware using the following command:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
      Note: This will take longer on 24-disk servers.
   k. Reboot the server:
      # reboot
   l. When the server reboots, check the new firmware version.
      Replace sdw1 in the command below with the hostname of the replacement server:
      # ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following should be returned, indicating that the firmware has been updated successfully on this server:
      FW Package Build: 23.12.0-0013
   m. Repeat these lettered steps to check and update the remaining servers in the cluster.
27. Synchronize the system clock using the DCA Setup utility:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   Enter X to exit the DCA Setup utility.
28. (Applies only if you replaced a DIA server) SSH into the replacement DIA server and configure the external interface IP address. Edit the following file:
   # vi /etc/sysconfig/network-scripts/ifcfg-eth1
   Change the HWADDR setting to the MAC address of eth1 that you determined in step 6-c above. Change the values shown in bold below. Do not change the other parameters:
   DEVICE=eth1
   BOOTPROTO=static
   IPADDR=10.6.193.46
   NETMASK=255.255.252.0
   ONBOOT=YES
   MTU=1500
   HWADDR=MACADDRESS
29. Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
30. If any third-party software was installed, inform the customer that it is now safe to reinstall and validate the software.

Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)

Perform this procedure to replace a failed server that is part of a Hadoop module in DCA version 2.0.0.0. To replace a Hadoop server in DCA version 2.0.1.0 or later, see the procedure “Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)” on page 64.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. Make sure that you have checked the cable connections as described in “Reseat cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray in Rack 1. The red service cable is connected to port 48 on the first Administration switch, a-sw-1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial-home messages from being sent to EMC Support during service, disable health monitoring by stopping the healthmon daemon:
   # dca_healthmon_ctl -d
4. To locate the server, use the DCA Setup utility as the user root to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
5. If you can still access the server via SSH, perform sub-steps (a) through (c) below. If you cannot access the server via SSH, proceed to step 6.
   a. Log in to the failed server as the user root. Replace the hostname shown in bold with the hostname of the failed server.
      For example, for the first Hadoop Master:
      # ssh root@hdm1
   b. Make note of any custom network gateways the customer may have created:
      # cat /etc/sysconfig/network
   c. Shut down the failed server:
      # shutdown -h now
6. Label all the cables connected to the failed server so that you know where to connect them on the replacement server.
7. Remove all power, Ethernet, and twin-axial cables from the back of the server.
   Note: If the system has Dual NICs installed, note the connections for the customer and interconnect networks before disconnecting.
8. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
9. Transfer disk drives one at a time from the failed server to the replacement server.
   IMPORTANT
   Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
10. Connect the Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
    IMPORTANT
    Do not connect power to the replacement server yet.
11. From the Primary Master server, start the dhcpd service:
    # service dhcpd start
12. Connect the power cables to the replacement server.
13. Next, use these steps to identify the IP address assigned to the server.
    a. Issue the following command to obtain the lease information provided in the dhcpd.leases file:
       # tail /var/lib/dhcpd/dhcpd.leases
    b.
       The dhcpd.leases file displays output similar to the following example:
       lease 172.28.6.170 {
         starts 4 2012/10/18 20:09:08;
         ends 5 2013/10/18 20:09:08;
         cltt 4 2012/10/18 20:09:08;
         binding state active;
         next binding state free;
         hardware ethernet 00:00:00:00:00:04;
         uid "\001\000\036g,\242\014";
       }
    c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases output above: 00:00:00:00:00:04
    d. Locate the MAC address on the replacement server’s service tag (highlighted in Figure 10): MAC1 00:00:00:00:00:00
       Figure 10 Locating the MAC address on the service tag (server shown, Dragon12)
    e. Compare the last two digits of the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If they are, the IP address in the dhcpd.leases file is the correct one to associate with the server. In the example above, this check confirms that 172.28.6.170 is the correct address.
    f. Issue the following ipmiutil command. Insert the IP address identified in the previous steps (172.28.6.170 in the example above):
       Note: Disregard the long, detailed output that displays after this command is executed.
       # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
    g. Ping the new address to verify that the change was applied:
       # ping 172.28.0.#
       Where # is the number of the server you are replacing.
14. Turn off the dhcpd service:
    # service dhcpd stop
15. From the Primary Master server, issue the following command to open a console session on the replacement server.
    Replace the hostname shown in bold with the hostname of the replacement server:
    # ipmiutil sol -a -e -N hdm1-sp -U root -P sephiroth
    You will need to press the F key within 15 seconds of seeing this WARNING message:
    Foreign configuration(s) found on adapter
    Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
16. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
17. When the following message displays, disregard it and press the space bar:
    All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
    Press any key to continue, or 'C' to load the configuration utility.
19. If the message below appears, the server did not accept the “F” key request from the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 16 (press the F key again when prompted):
    CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D
    DHCP....\
    GUID: 2A9B43A4 A50A 11E1 AAA0
20. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
    a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
       Note: When you exit the BMC console, you are returned to your connection on the Primary Master server as the user root.
    b.
       Issue the following command from the Primary Master server to force the replacement server to boot from hard drive:
       # ipmiutil reset -h -N hdm1-sp -U root -P sephiroth
       Change the hostname shown in bold above to the hostname of the server you replaced.
    c. Once the operating system is loaded, issue the following commands to change the boot order on the server. For example, on hdm1:
       # ssh hdm1
       # syscfg /bbo "emcbios" HDD NW
    d. Reboot the system:
       # reboot
    e. Following the reboot, issue the following commands to connect to the replaced server and verify the boot order:
       # ssh hdm1
       # syscfg /bbosys
       Change the hostname shown in bold above to the hostname of the server you replaced.
21. Issue the following command to check the health of the replacement server. Replace the text in bold with the hostname of the replacement Hadoop server:
    # dcacheck -h hdm1
    Verify that no errors display.
22. Exchange SSH keys on the replacement server using the DCA Setup utility:
    a. Start the DCA Setup utility as the user root:
       # dca_setup
    b. Select option 2 to Modify DCA Settings.
    c. Select option 6 to Generate SSH Keys.
    d. Enter X to exit the DCA Setup utility.
23. Using gpssh, check the firmware level of the RAID controllers with the CmdTool2 utility:
    # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
    If the command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, the firmware needs updating. Follow steps e through m below to update it.
    e. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
       https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
    f. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
       unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
    g.
       As the root user, copy the MR56p.rom file to the Master server (mdw) and place it in /root. You can use WinSCP or a similar utility.
       Note: You may be required to provide a server login name and password.
    h. For each server that needs an update, log in to the server as root.
    i. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
    j. Install the new firmware using the following command:
       Note: This takes longer on 24-disk servers.
       # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
    k. Reboot the server:
       # reboot
    l. When the server reboots, check the new firmware version. Replace the text in bold with the hostname of the replacement server:
       # ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
       The following output indicates that the firmware was updated successfully on this server:
       FW Package Build: 23.12.0-0013
    m. Repeat these lettered steps to check and update the remaining servers in the cluster.
24. Synchronize the system clock:
    a. Select option 2 for Modify DCA Settings.
    b. Select option 5 for Modify NTP/Clock Configuration Options.
    c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
    d. Enter X to exit the DCA Setup utility.
25. Re-enable health monitoring by restarting the healthmon daemon:
    # dca_healthmon_ctl -e

Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)

Perform this procedure to replace a failed server that is part of a Pivotal Hadoop (PHD) module. To replace a Greenplum Hadoop server in DCA version 2.0.0.0, see the procedure “Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)” on page 58. Choose the procedure for the type of PHD server you are replacing (see Table 3 below).
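Step 13 of the Hadoop replacement procedures confirms a DHCP lease by checking that the last two digits of the leased MAC address are four greater than those of the MAC1 address on the service tag. The digits are hexadecimal, so the subtraction is easy to get wrong by eye. The sketch below does that arithmetic; the function name mac_last_octet_diff is hypothetical, and the sketch assumes the two addresses differ only in the final octet (it does not handle a carry into the preceding octet).

```shell
#!/bin/sh
# Hypothetical helper (not part of the DCA tooling): print the difference
# between the last octets of two colon-separated MAC addresses.
mac_last_octet_diff() {
    lease_octet=${1##*:}   # last octet of the MAC from dhcpd.leases
    tag_octet=${2##*:}     # last octet of the MAC1 value on the service tag
    # Both octets are hexadecimal, so prefix them with 0x for shell arithmetic.
    echo $(( 0x$lease_octet - 0x$tag_octet ))
}
```

With the example values from this guide, mac_last_octet_diff 00:00:00:00:00:04 00:00:00:00:00:00 prints 4, which matches the offset the procedure expects. Any other result suggests that the lease belongs to a different server.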
Table 3 Server replacement procedures by PHD module type

Hostname  Server Module / Role                             Use this procedure
hdm1      Master Module - Namenode                         Replace hdm1 (namenode, DCA version 2.0.1.0)
hdm2      Master Module - Secondary Namenode, Zookeeper    Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0)
hdm3      Master Module - resourcemanager                  Replace hdm3 (resourcemanager, DCA version 2.0.1.0)
hdm4      Master Module - Zookeeper, Hive, Hive-Metastore  Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0)
hdw#      Worker Module - Nodemanager, Datanode            Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0)

Remove the failed PHD server and install the replacement PHD server

IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement server. Take great care when transferring drives. Transfer only one drive at a time. Insert drives in the same slots that they occupied in the failed server.

1. Make sure that you have checked the cable connections as described in “Reseat cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable located on the laptop tray in Rack 1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial-home messages from being sent to EMC Support during service, disable health monitoring by stopping the healthmon daemon:
   # dca_healthmon_ctl -d
4. To locate the server, use the DCA Setup utility as the user root to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
5. If you can still access the server via SSH, perform sub-steps (a) through (c) below. If you cannot access the server via SSH, proceed to step 6.
   a. Log in to the failed server as the user root. Replace the hostname shown in bold with the hostname of the failed server:
      # ssh root@hdm1
   b. Make note of any custom network gateways the customer may have created:
      # cat /etc/sysconfig/network
   c. Shut down the failed server:
      # shutdown -h now
6. Label all the cables connected to the failed server so that you know where to connect them on the replacement server.
7. Remove all power, Ethernet, and twin-axial cables from the back of the server.
   Note: If the system has Dual NICs installed, note the connections for the customer and interconnect networks before disconnecting.
8. Remove the failed server and install the replacement server (see Appendix E, “Replace a Server in the Greenplum DCA Rack,” on page 192).
9. Transfer disk drives one at a time from the failed server to the replacement server.
   IMPORTANT
   Use caution when transferring drives. Transfer only one drive at a time. Insert the drives in the same slots that they occupied in the failed server.
10. Connect the Ethernet and twin-ax cables to the replacement server. Refer to the labels on the cables for proper connectivity.
    IMPORTANT
    Do not connect power to the replacement server yet.
11. From the Primary Master server, start the dhcpd service:
    # service dhcpd start
12. Connect the power cables to the replacement server.
13.
    Next, use these steps to identify the IP address assigned to the server.
    a. Issue the following command to obtain the lease information provided in the dhcpd.leases file:
       # tail /var/lib/dhcpd/dhcpd.leases
    b. The dhcpd.leases file displays output similar to the following example:
       lease 172.28.6.170 {
         starts 4 2012/10/18 20:09:08;
         ends 5 2013/10/18 20:09:08;
         cltt 4 2012/10/18 20:09:08;
         binding state active;
         next binding state free;
         hardware ethernet 00:00:00:00:00:04;
         uid "\001\000\036g,\242\014";
       }
    c. Locate the MAC address labelled hardware ethernet in the example dhcpd.leases output above: 00:00:00:00:00:04
    d. Locate the MAC address on the replacement server’s service tag (highlighted in Figure 11): MAC1 00:00:00:00:00:00
       Figure 11 Locating the MAC address on the service tag (server shown, Dragon12)
    e. Compare the last two digits of the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag. If they are, the IP address in the dhcpd.leases file is the correct one to associate with the server. In the example above, this check confirms that 172.28.6.170 is the correct address.
    f. Issue the following ipmiutil command. Insert the IP address identified in the previous steps (172.28.6.170 in the example above):
       Note: Disregard the long, detailed output that displays after this command is executed.
       # ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth
    g.
       Ping the new address to verify that the change was applied:
       # ping 172.28.0.#
       Where # is the number of the server you are replacing.
14. Turn off the dhcpd service:
    # service dhcpd stop
15. Power on the replacement server by pressing the button on the front panel.

Replace hdm1 (namenode, DCA version 2.0.1.0)

1. From the Primary Master server, issue the following command to open a console session on the replacement server. Replace the hostname shown in bold with the hostname of the replacement server:
   # ipmiutil sol -a -e -N hdm1-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds of seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
2. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
3. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
4. If the message below appears, the server did not accept the “F” key request from the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 2 (press the F key again when prompted):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D
   DHCP....\
   GUID: 2A9B43A4 A50A 11E1 AAA0
5. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a.
      Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
      Note: When you exit the BMC console, you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from hard drive:
      # ipmiutil reset -h -N hdm1-sp -U root -P sephiroth
   c. Once the operating system is loaded, issue the following commands to change the boot order on the server. For example, on hdm1:
      # ssh hdm1
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order:
      # ssh hdm1
      # syscfg /bbosys
      Change the hostname shown in bold above to the hostname of the server you replaced.
6. Issue the following command to check the health of the replacement server:
   # dcacheck -h hdm1
   Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e. Check the firmware version of the RAID controllers. Replace the text in bold with the hostname of the replacement server:
      # ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, the firmware needs updating. Follow steps f through n below to update it.
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g.
      Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place it in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a login to the destination server.
   i. For each server that needs an update, log in to the server as root.
   j. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
   k. Install the new firmware using the following command:
      Note: This takes longer on 24-disk servers.
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
   l. Reboot the server:
      # reboot
   m. When the server reboots, check the new firmware version:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following output indicates that the firmware was updated successfully on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
8. Synchronize the system clock:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
10. Connect to hdm1:
    # ssh hdm1
11. Issue the following command to view the status of the PHD cluster:
    # dca_hadoop --status all
    Verify that the status of the namenode on hdm1 is stopped:
    module namenode(service hadoop-namenode) is stopped on host hdm1
12. Start the namenode service:
    # dca_hadoop --start namenode
    When the namenode service starts, the PHD cluster is in safemode.
13.
    Switch to the user hdfs and issue the following command:
    # su - hdfs
    $ hadoop fsck /
    The following message indicates that the filesystem has an error:
    The filesystem under path '/' is CORRUPT
14. Exit safemode and return the filesystem to a normal state:
    $ hadoop dfsadmin -safemode leave
15. Verify that the filesystem is healthy:
    $ hadoop fsck /
    The following message indicates that the filesystem is healthy:
    The filesystem under path '/' is HEALTHY

Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0)

1. From the Primary Master server, issue the following command to open a console session on the replacement server. Replace the hostname shown in bold with the hostname of the replacement server:
   # ipmiutil sol -a -e -N hdm2-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds of seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
2. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
3. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
4. If the message below appears, the server did not accept the “F” key request from the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 2 (press the F key again when prompted):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D
   DHCP....\
   GUID: 2A9B43A4 A50A 11E1 AAA0
5.
   Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
      Note: When you exit the BMC console, you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from hard drive:
      # ipmiutil reset -h -N hdm2-sp -U root -P sephiroth
      Change the hostname shown in bold above to the hostname of the server you replaced.
   c. Once the operating system is loaded, issue the following commands to change the boot order on the server. For example, on hdm2:
      # ssh hdm2
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order:
      # ssh hdm2
      # syscfg /bbosys
6. Issue the following command to check the health of the replacement server:
   # dcacheck -h hdm2
   Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e. Check the firmware version of the RAID controllers.
      Replace the text in bold with the hostname of the replacement server:
      # ssh hdm2 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, the firmware needs updating. Follow steps f through n below to update it.
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place it in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a login to the destination server.
   i. For each server that needs an update, log in to the server as root.
   j. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
   k. Install the new firmware using the following command:
      Note: This takes longer on 24-disk servers.
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
   l. Reboot the server:
      # reboot
   m. When the server reboots, check the new firmware version:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following output indicates that the firmware was updated successfully on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
8. Synchronize the system clock:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   d. Enter X to exit the DCA Setup utility.
9.
   Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
10. Connect to hdm1:
    # ssh hdm1
11. Issue the following command to view the status of the Hadoop cluster:
    # dca_hadoop --status all
    The status of the secondary-namenode and zookeeper modules on hdm2 should be:
    module secondary-namenode(service hadoop-secondarynamenode) has error on host hdm2
    module zookeeper(service zookeeper-server) has error on host hdm2
12. Start the secondary-namenode and zookeeper services:
    # dca_hadoop --start secondary-namenode
    # dca_hadoop --start zookeeper
13. Switch to the user hdfs and issue the following command:
    # su - hdfs
    $ hadoop fsck /
14. Verify that the filesystem is healthy:
    $ hadoop fsck /
    The following message indicates that the filesystem is healthy:
    The filesystem under path '/' is HEALTHY

Replace hdm3 (resourcemanager, DCA version 2.0.1.0)

1. From the Primary Master server, issue the following command to open a console session on the replacement server. Replace the hostname shown in bold with the hostname of the replacement server:
   # ipmiutil sol -a -e -N hdm3-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds of seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or ‘C’ load the configuration utility, or ‘F’ to import foreign configuration(s) and continue.
2. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
3. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
4.
   If the message below appears, the server did not accept the “F” key request from the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 2 (press the F key again when prompted):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D
   DHCP....\
   GUID: 2A9B43A4 A50A 11E1 AAA0
5. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
      Note: When you exit the BMC console, you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from hard drive:
      # ipmiutil reset -h -N hdm3-sp -U root -P sephiroth
      Change the hostname shown in bold above to the hostname of the server you replaced.
   c. Once the operating system is loaded, issue the following commands to change the boot order on the server. For example, on hdm3:
      # ssh hdm3
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order:
      # ssh hdm3
      # syscfg /bbosys
6. Issue the following command to check the health of the replacement server:
   # dcacheck -h hdm3
   Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e. Check the firmware version of the RAID controllers.
Replace the hostname in the example with the hostname of the replacement server:
      # ssh hdm3 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either of the following, your firmware needs updating. Follow steps f through n below to update the firmware.
      FW Package Build: 23.7.0-0033
      FW Package Build: 23.9.0-0026
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a login to the destination server.
   i. For each server in need of an update, log in to the server as root.
   j. Use scp to copy the MR56p.rom file from the master to the server you are updating.
   k. Install the new firmware using the following command:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
      Note: This will take longer on 24-disk servers.
   l. Reboot the server:
      # reboot
   m. When the server reboots, check the new firmware version:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following should be returned, indicating your firmware has successfully been updated on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
8. Synchronize the system clock:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   d. Enter X to exit the DCA Setup utility.
9.
Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
10. Connect to hdm1:
   # ssh hdm1
11. Issue the following command to view the status of the Hadoop cluster:
   # dca_hadoop --status all
   The status of the resourcemanager and zookeeper modules on hdm3 should be:
   module resourcemanager(service hadoop-resourcemanager) has error on host hdm3
   module zookeeper(service zookeeper-server) has error on host hdm3
12. Start the resourcemanager and zookeeper services:
   # dca_hadoop --start resourcemanager
   # dca_hadoop --start zookeeper
13. Verify that all services are shown as started:
   # dca_hadoop --status all
14. Switch to the user hdfs:
   # su - hdfs
15. Verify that the filesystem is healthy:
   $ hadoop fsck /
   The following message indicates that the filesystem is healthy:
   The filesystem under path '/' is HEALTHY

Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0)

1. From the Primary Master server, issue the following command to open a console session on the replacement server. Replace the hostname in the example with the hostname of the replacement server:
   # ipmiutil sol -a -e -N hdm4-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds after seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or 'C' load the configuration utility, or 'F' to import foreign configuration(s) and continue.
2. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
3. When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
4. If the message below appears, the server did not accept the F key request described in the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 2 (press the F key when prompted again):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D DHCP....\ GUID: 2A9B43A4 A50A 11E1 AAA0
5. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
      Note: When you exit the BMC console you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement server to boot from hard drive. Change the hostname in the example to the hostname of the server you replaced:
      # ipmiutil reset -h -N hdm4-sp -U root -P sephiroth
   c. Once the operating system is loaded, issue the following commands to change the boot order on the server. For example, on hdm4:
      # ssh hdm4
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order:
      # ssh hdm4
      # syscfg /bbosys
6. Issue the following command to check the health of the replacement server:
   # dcacheck -h hdm4
   Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e.
Check the current firmware version. Replace the hostname in the example with the hostname of the replacement server:
      # ssh hdm4 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either of the following, your firmware needs updating. Follow steps f through n below to update the firmware.
      FW Package Build: 23.7.0-0033
      FW Package Build: 23.9.0-0026
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a login to the destination server.
   i. For each server in need of an update, log in to the server as root.
   j. Use scp to copy the MR56p.rom file from the master to the server you are updating.
   k. Install the new firmware using the following command:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
      Note: This will take longer on 24-disk servers.
   l. Reboot the server:
      # reboot
   m. When the server reboots, check the new firmware version:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following should be returned, indicating your firmware has successfully been updated on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
8. Synchronize the system clock:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c.
Select option 3 for Synchronize clocks across the cluster to the NTP server.
   d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
10. Connect to hdm1:
   # ssh hdm1
11. Issue the following command to view the status of the Hadoop cluster:
   # dca_hadoop --status all
   The status of the zookeeper, hive, and hive-metastore modules on hdm4 should be:
   module hive-server(service hive-server) is stopped on host hdm4
   module hive-server(service hive-metastore) is stopped on host hdm4
   module zookeeper(service zookeeper-server) is stopped on host hdm4
12. Start the stopped services:
   # dca_hadoop --start zookeeper
   # dca_hadoop --start hive-server
13. Verify that all services are shown as started:
   # dca_hadoop --status all
14. Switch to the user hdfs:
   # su - hdfs
15. Verify that the filesystem is healthy:
   $ hadoop fsck /
   The following message indicates that the filesystem is healthy:
   The filesystem under path '/' is HEALTHY

Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0)

1. From the Primary Master server, issue the following command to open a console session on the replacement server. Replace the hostname in the example with the hostname of the replacement Hadoop Worker server:
   # ipmiutil sol -a -e -N hdw1-sp -U root -P sephiroth
   You will need to press the F key within 15 seconds after seeing this WARNING message:
   Foreign configuration(s) found on adapter
   Press any key to continue or 'C' load the configuration utility, or 'F' to import foreign configuration(s) and continue.
2. Power on the replacement server by pressing the power button on the front panel, and press the F key when prompted.
3.
When the following message displays, disregard it and press the space bar:
   All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present.
   Press any key to continue, or 'C' to load the configuration utility.
4. If the message below appears, the server did not accept the F key request described in the WARNING above. Power off the server, verify that all LEDs are off, and go back to step 2 (press the F key when prompted again):
   CLIENT MAC ADDR: 00 1E 67 4D C5 1D 001E674DC51D DHCP....\ GUID: 2A9B43A4 A50A 11E1 AAA0
5. Monitor the boot process onscreen and verify that the replacement server boots from hard disk. If it does not, do the following to force it to boot from hard disk:
   a. Exit the BMC console utility by pressing the tilde key (~) and then the period key (.).
      Note: When you exit the BMC console you are returned to your connection on the Primary Master server as the user root.
   b. Issue the following command from the Primary Master server to force the replacement Hadoop Worker server to boot from hard drive. Change the hostname in the example to the hostname of the server you replaced:
      # ipmiutil reset -h -N hdw1-sp -U root -P sephiroth
   c. Once the operating system is loaded, issue the following commands to change the boot order on the server. Change the hostname in the example to the hostname of the server you replaced:
      # ssh hdw1
      # syscfg /bbo "emcbios" HDD NW
   d. Reboot the system:
      # reboot
   e. Following the reboot, issue the following commands to connect to the replacement server and verify the boot order. Change the hostname in the example to the hostname of the server you replaced:
      # ssh hdw1
      # syscfg /bbosys
6.
Issue the following command to check the health of the replacement server. Change the hostname in the example to the hostname of the server you replaced:
   # dcacheck -h hdw1
   Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
   a. Start the DCA Setup utility as the user root:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 6 to Generate SSH Keys.
   d. Enter X to exit the DCA Setup utility.
   e. Check the current firmware version. Replace the hostname in the example with the hostname of the replacement server:
      # ssh hdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      If the command returns either of the following, your firmware needs updating. Follow steps f through n below to update the firmware.
      FW Package Build: 23.7.0-0033
      FW Package Build: 23.9.0-0026
   f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop:
      https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
   g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
      unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip
   h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.
      Note: You may be required to provide a login to the destination server.
   i. For each server in need of an update, log in to the server as root.
   j. Use scp to copy the MR56p.rom file from the master to the server you are updating.
   k. Install the new firmware using the following command:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall
      Note: This will take longer on 24-disk servers.
   l. Reboot the server:
      # reboot
   m.
When the server reboots, check the new firmware version:
      # /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"
      The following should be returned, indicating your firmware has successfully been updated on this server:
      FW Package Build: 23.12.0-0013
   n. Repeat these lettered steps to check and update the remaining servers in the cluster.
8. Synchronize the system clock:
   a. Select option 2 for Modify DCA Settings.
   b. Select option 5 for Modify NTP/Clock Configuration Options.
   c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
   d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
   # dca_healthmon_ctl -e
10. Connect to hdm1:
   # ssh hdm1
11. Issue the following command to view the status of the Hadoop cluster:
   # dca_hadoop --status all
   The status of the datanode and nodemanager modules on the hdw servers should be:
   module hive-server(service hadoop-datanode) is stopped on host hdw#
   module hive-server(service hadoop-nodemanager) is stopped on host hdw#
12. Start the stopped datanode and nodemanager services:
   # dca_hadoop --start datanode
   # dca_hadoop --start nodemanager
13. Verify that all services are shown as started:
   # dca_hadoop --status all
14. Switch to the user hdfs:
   # su - hdfs
15. Verify that the filesystem is healthy:
   $ hadoop fsck /
   The following message indicates that the filesystem is healthy:
   The filesystem under path '/' is HEALTHY

CHAPTER 4
Replace a Disk Drive

This chapter describes how to replace a failed drive in a Master, Segment, DIA, or Hadoop server. It includes the following major sections:
Hot spare drives and the Copyback operation
Replace a disk drive in a Master, DIA, or Hadoop Compute server
Replace a drive in a Segment Server
Replace a drive in a Hadoop server

Hot spare drives and the Copyback operation

(Does not apply to drives in a Hadoop Worker server)

When a drive fails, the RAID controller begins the rebuild process and writes data to the hot spare disk drive in the server. A slowly blinking amber LED on the hot spare drive indicates that the drive has been invoked as the rebuild drive. You must allow the rebuild process to complete on the hot spare before you remove the failed drive and install a replacement drive.

When the rebuild process is finished and you replace the failed drive with a replacement drive, data is copied automatically from the hot spare drive to the replacement drive in a process called the Copyback operation. When the Copyback operation is complete, the hot spare drive ends its role as the rebuild drive and resumes its original role as the hot spare drive. Returning the hot spare to its original role ensures that the hot spare drive always occupies the same slot in the server. Hot spare locations are shown in Table 4 below.

Table 4 Hot spare drive locations per server type

Server type                                            Hot spare drive location(s)
Master, DIA, or Hadoop Compute server, 8 disk slots    Slot 5 (see Figure 12 on page 88)
   (slots 6 and 7 are empty)
GPDB server, 24 disk slots                             Slot 11 and Slot 23 (see Figure 15 on page 92)
Hadoop Master server, 12 disk slots                    Slot 11 (see Figure 18 on page 98)

Note: Hadoop Worker servers do not have a hot spare drive.

The Copyback operation runs in the background. During the operation the virtual drive is still available online to the host.
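The sequence described in this section (rebuild onto the hot spare, then Copyback to the replacement drive) can be followed from the firmware states that the procedures in this chapter read with CmdTool2 -pdlist. The following is a minimal sketch and is not part of the DCA tooling: raid_phase is a hypothetical helper name, and the function assumes that the controller output contains lines of the form "Firmware state: ..." as in the example output shown in this chapter.

```shell
#!/bin/sh
# raid_phase (hypothetical helper, not shipped with the DCA):
# reads drive-list text on standard input and prints one word:
#   rebuild  - at least one drive reports "Firmware state: Rebuild"
#   copyback - no rebuild, but at least one drive reports Copyback
#   idle     - neither state appears in the input
raid_phase() {
    awk -F': *' '
        /Firmware state:/ {
            # $2 holds the text after "Firmware state:"
            if ($2 ~ /^Rebuild/)  rebuild = 1
            if ($2 ~ /^Copyback/) copyback = 1
        }
        END {
            if (rebuild)       print "rebuild"
            else if (copyback) print "copyback"
            else               print "idle"
        }
    '
}
```

On a server, the function would be fed from the command used in this chapter, for example: CmdTool2 -pdlist -aall | raid_phase. A result of rebuild means the failed drive should not be removed yet. A result of idle only means that neither state was found; it does not replace the per-slot check of Online and Hotspare states described in the procedures.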

Replace a disk drive in a Master, DIA, or Hadoop Compute server

All drives are installed at the front of the server and connect to the system board through the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that fit in the hard-drive slots.

In addition to describing how to physically remove and insert the disk drive, this procedure also describes how to do the following:
• Determine if the RAID group is still rebuilding and how to monitor the rebuild process.
• Verify that the Copyback operation is in progress and how to monitor it.
• Manually initiate the Copyback operation if necessary.

1. Connect your service laptop to the DCA and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup utility as the user root to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
3. Locate the failed drive.
   Note: In this procedure, Disk 0 is the failed drive.
   LED indicators on each drive carrier indicate the current status of the drive within it. A failed drive is indicated by an amber LED or no LED. (A drive in the rebuild process also displays an amber LED.)
   If the dial-home information includes a drive number, locate the drive with the help of the following illustration:

   Figure 12 Master and DIA server drive slot numbering (drive slots Disk 0 through Disk 7; one slot is marked as the hot spare drive; Disk 6 and Disk 7 are empty)

Verify the state of the RAID group rebuild process.

4. Before you remove the faulted drive, read the topic “Hot spare drives and the Copyback operation” on page 86. Then issue the following command to determine whether the RAID group is still being rebuilt:
   # CmdTool2 -PDList -aALL | egrep "Adapter|Enclosure|Slot Number|Firmware state"
   Example output is shown below. Focus on the Firmware state values.
   Adapter #0
   Enclosure Device ID: 252
   Slot Number: 1
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 2
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 3
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 4
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 5
   Enclosure position: 0
   Firmware state: Rebuild
   In the example output above, note that the rebuild process is still in progress.
   • If Rebuild appears anywhere in the output, the rebuild process is in progress. Do not remove the faulted drive yet. Monitor the rebuild process as described in step 5.
   • If all drives in the output are shown as Online, Spun Up, the rebuild process is complete. Proceed to removing the failed drive as described in step 6.

Monitor the rebuild process.

5. To monitor the rebuild process if it is in progress, issue the following command. Change the enclosure, slot, and adapter values below to the actual values from your output. For example:
   # CmdTool2 -pdrbld -progdsply -PhysDrv[252:5] -a0
   The values in the above example refer to the following parameters:
   • 252 refers to the Enclosure Device ID.
   • 5 refers to the Slot Number of the hot spare drive invoked as the rebuild drive.
   • 0 refers to the Adapter Number.
6. If the rebuild is complete, remove the failed drive from the server:
   a. Press the button on the front of the drive carrier to release the drive handle.
   b. Wait 10 seconds to allow the platter in the drive to stop spinning.
   c. Pull the drive carrier out of the server.

   Figure 13 Removing a drive from a Master Server

   IMPORTANT: Make sure that adjacent drive carriers are fully installed and locked in place before you remove or replace a drive carrier. Replacing a drive carrier and attempting to lock its handle when the adjacent drive is only partially installed can damage the drives.
7. To replace a drive in the server:
   a. Press the button on the front of the drive carrier and open the handle.
   b. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
   c. Close the handle to lock the drive carrier in place. The LED on the drive turns green and blinks while it automatically starts the Copyback operation.

   Figure 14 Replacing a drive in a Master Server

8. The Copyback operation should begin automatically when you insert the replacement drive. (For details about Copyback, see “Hot spare drives and the Copyback operation” on page 86.) Wait for the Copyback operation to complete. The hot spare always occupies slot 5 on a Master server (see Figure 12 on page 88).

Verify that the Copyback operation is in progress.

9. Issue the following command to verify that the Copyback operation is in progress:
   # CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
   Example output is shown below. Focus on the Firmware state values.
   Adapter #0
   Enclosure Device ID: 252
   Slot Number: 0
   Enclosure position: 0
   Firmware state: Copyback
   Enclosure Device ID: 252
   Slot Number: 1
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 2
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 3
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 4
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 252
   Slot Number: 5
   Enclosure position: 0
   Firmware state: Online, Spun Up
   In the example output above, note that the firmware state of the drive in Slot Number 0 is shown as Copyback. This indicates that the Copyback operation is in progress and that data is being restored to the new drive in Slot 0.
   If no firmware states are shown as Copyback (for example, if the firmware states are shown as Online, Spun Up or Hotspare, Spun Up), the Copyback operation is complete.

Monitor the Copyback operation.

10. To monitor the progress of the Copyback operation, issue the following command. Change the enclosure, slot, and adapter values below to the actual values from your output.
   # CmdTool2 -pdcpybk -progdsply -PhysDrv[252:0] -a0
   The values in the above example refer to the following parameters:
   • 252 refers to the Enclosure Device ID.
   • 0 (in 252:0) refers to the Slot Number of the replacement drive that is receiving the Copyback data.
   • 0 (in -a0) refers to the Adapter Number.
   The following is an example output of the Copyback progress.
   Copyback Progress of Physical Drive...
   Enclosure:Slot   Percent Complete                         Time Elps
   252 :00          ##############*********29 %***********   00:10:38
   Press <ESC> key to quit...
   If the firmware state is reported as Unconfigured(Good) then the Copyback operation did not occur automatically.
In this unlikely event, you must initiate the Copyback operation manually by issuing the following command:
   # CmdTool2 -pdcpybk -start -PhysDrv[252:5,252:0] -a0
   View the Copyback progress as described in step 10. If no firmware states are shown as Copyback, the Copyback operation is complete.
11. Issue the following command:
   # CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
12. In the output verify that the firmware state of the drives is reported as follows:
   • Drives 0 - 4: Online, Spun Up
   • Drive 5: Hotspare, Spun Up

Replace a drive in a Segment Server

All drives are installed at the front of the server and connect to the system board through the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that fit in the hard-drive slots.

In addition to describing how to physically remove and insert the disk drive, this procedure also describes how to do the following:
• Determine if the RAID group is still rebuilding and how to monitor the rebuild process.
• Verify that the Copyback operation is in progress and how to monitor it.
• Manually initiate the Copyback operation if necessary.

1. Connect your service laptop to the DCA and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup utility as the user root to activate the green lightbar on the DCA door and the blue server identification LED:
   a. Launch the DCA Setup utility:
      # dca_setup
   b. Select option 2 to Modify DCA Settings.
   c. Select option 18 for Light Bar Controls.
   d. Select option 3 for Blink the light bar.
   e. Enter the hostname of the server and press ENTER.
   f. Enter X to exit the DCA Setup utility.
   The green lightbar on the DCA door and the blue server identification LED begin to blink.
   Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
   g. Locate the failed drive.
      Note: In this procedure, Disk 0 is the failed drive.
      LED indicators on each drive carrier indicate the current status of the drive within it. A failed drive is indicated by a solid (unblinking) amber LED.
      If the dial-home information includes a drive number, locate the drive with the help of the following illustration:

      Figure 15 Segment server drive slot numbering (slots 0 through 23; two slots are marked as hot spare drives)

Verify the state of the RAID group rebuild process.

3. Before you remove the faulted drive, read the topic “Hot spare drives and the Copyback operation” on page 86.
4. Issue the following command to determine whether the RAID group is still being rebuilt:
   # CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
   Example output is shown below. Focus on the Firmware state values. Note from the output that 24-disk GPDB servers have two Adapters (#0 and #1) that control 12 slots each. Make sure that you investigate the correct area of the output for the drive that you are replacing.
   Adapter #0
   Enclosure Device ID: 28
   Slot Number: 1
   Enclosure position: 0
   Firmware state: Online, Spun Up
   . . .
   Enclosure Device ID: 28
   Slot Number: 10
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 28
   Slot Number: 11
   Enclosure position: 0
   Firmware state: Rebuild
   Adapter #1
   Enclosure Device ID: 13
   Slot Number: 0
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Enclosure Device ID: 13
   . . .
   Enclosure Device ID: 13
   Slot Number: 11
   Enclosure position: 0
   Firmware state: Hotspare, Spun Up
   In the example output above, note that the rebuild is still in progress.
   • If Rebuild appears anywhere in the output, the rebuild is in progress. Do not remove the faulted drive yet. Monitor the rebuild process as described in step 5.
   • If all drives in the output are shown as Online, Spun Up, the rebuild is complete. Proceed to removing the failed drive as described in step 6.

Monitor the rebuild process.

5. To monitor the rebuild process, issue the following command. Change the enclosure, slot, and adapter values below to the actual values from your output. For example:
   # CmdTool2 -pdrbld -progdsply -PhysDrv[28:11] -a0
   IMPORTANT: Remember that GPDB servers have two Adapters (#0 and #1) that control 12 slots each. Make sure to specify the correct adapter number, slot number, and enclosure device ID when issuing the above command.
   The values in the above example refer to the following parameters:
   • 28 refers to the Enclosure Device ID.
   • 11 refers to the Slot Number of the hot spare drive invoked as the rebuild drive.
   • 0 refers to the Adapter Number.
6. If the rebuild is complete, remove the failed carrier from the server:
   a. Press the button on the front of the drive carrier to release the drive handle.
   b. Wait 10 seconds to allow the platter in the drive to stop spinning.
   c. Pull the drive carrier out of the server.

   Figure 16 Removing a drive from a Segment server

7. Make sure that the capacity of the replacement drive matches the capacity of the failed drive. The drive capacity is printed on the label on each drive.
   IMPORTANT: Do not mix drives of different capacities within a server.
8. To replace a drive carrier in the server:
   a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
   b.
Close the handle to lock the drive carrier in place. The LED on the drive turns green and blinks while it automatically starts the Copyback operation.

   Figure 17 Replacing a drive in a Segment server

9. The Copyback operation should begin automatically when you insert the replacement drive. (For details about Copyback, see “Hot spare drives and the Copyback operation” on page 86.) Wait for the Copyback operation to complete. The hot spare drives always occupy slot 11 and slot 23 in a 24-disk segment server (see Figure 15 on page 92).

Verify that the Copyback operation is in progress.

10. Issue the following command to verify that the Copyback operation is in progress:
   # CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
   Example output is shown below. Focus on the Firmware state values.
   Adapter #0
   Enclosure Device ID: 28
   Slot Number: 0
   Enclosure position: 0
   Firmware state: Copyback
   . . .
   Enclosure Device ID: 28
   Slot Number: 11
   Enclosure position: 0
   Firmware state: Online, Spun Up
   Adapter #1
   Enclosure Device ID: 13
   Slot Number: 0
   Enclosure position: 0
   Firmware state: Online, Spun Up
   . . .
   Enclosure Device ID: 13
   Slot Number: 11
   Enclosure position: 0
   Firmware state: Hotspare, Spun Up
   In the example output above, note that the firmware state of the drive in Adapter #0, Slot Number 0 is shown as Copyback. This indicates that the Copyback operation is in progress and that data is being restored to the new drive in Slot 0.
   If no firmware states are shown as Copyback (for example, the firmware states are Online, Spun Up or Hotspare, Spun Up), the Copyback operation is complete.

Monitor the Copyback operation.

11. To view the Copyback progress, issue the following command. Change the enclosure, slot, and adapter values below to the actual values from your output.
# CmdTool2 -pdcpybk -progdsply -PhysDrv[28:0] -a0
The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 0 refers to the Slot Number of the drive shown in the Copyback state (the replacement drive in this example).
• 0 refers to the Adapter Number.
The following is an example output of the Copyback progress.
Copyback Progress of Physical Drive...
Enclosure:Slot Percent Complete Time Elps
28 :00 ##############*********29 %*********** 00:10:38
Press key to quit...

If necessary, manually initiate the Copyback operation.
If the firmware state is reported as Unconfigured(Good), then the Copyback operation did not occur automatically. In this unlikely event, you must initiate Copyback manually by issuing the following command:
# CmdTool2 -pdcpybk -start -PhysDrv[28:11,28:0] -a0
View the Copyback progress as described in step 11. If no firmware states are shown as Copyback, the Copyback operation is complete.
12. Issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
13. In the output, verify that the firmware state of the drives is reported as follows:
• Drives 0 - 10 and 12 - 22: Online, Spun Up
• Drives 11 and 23: Hotspare, Spun Up

Replace a drive in an Hadoop server
This section describes how to replace a drive in a Hadoop server in DCA version 2.0.1.0 or later. For details on replacing a drive in a Hadoop server in DCA version 2.0.0.0, see the EMC Data Computing Appliance Maintenance Guide for 2.0.0.0, Rev A02.
Hadoop-related procedures were not available at the time of this document’s publication. The document will be updated in the short term and re-released. Until that time, please contact platform-eng-support@gopivotal.com for Hadoop-related service questions.
The procedure you use to replace a drive in a Hadoop server depends on the type of Hadoop server and, in the case of a Hadoop Worker, on whether the faulted drive is a System drive or a Data drive.
Hadoop Master servers: Drives are part of a RAID 5 configuration which can be rebuilt automatically by the server’s RAID controller. For instructions, see “Replace a drive in a Hadoop Master server” on page 97.
Hadoop Worker servers: The server has two different RAID configurations:
• System disks 0 – 1 are configured as RAID 1. If one of the System disks fails, the RAID is rebuilt automatically by the server’s RAID controller after the replacement drive is inserted. For instructions, see “Replace a System Disk (0 through 1)” on page 101.
• Data disks 2 – 11 are each configured as a single-drive RAID 0. You must recover data through the Hadoop filesystem after you replace the drive. For instructions, see “Replace a failed Data Disk (2 through 11)” on page 104.

Replace a drive in a Hadoop Master server
All drives are installed at the front of the server and connect to the system board through the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this procedure also describes how to do the following:
• Determine if the RAID group is still being rebuilt and how to monitor the rebuild process.
• Verify that the Copyback operation is in progress and how to monitor the Copyback operation.
• Manually initiate the Copyback operation if necessary.
1. Connect your service laptop to the DCA and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).
2.
To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup
b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
g. Locate the failed drive.
Note: In this procedure, Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive within it. A faulted drive is indicated by a solid (unblinking) amber LED.
If the dial-home information includes a drive number, locate the drive with the help of the following illustration:
Disk 2    Disk 5    Disk 8    Disk 11 (hot spare drive, Hadoop Masters only)
Disk 1    Disk 4    Disk 7    Disk 10
Disk 0    Disk 3    Disk 6    Disk 9
Figure 18 Hadoop Master server drive slot numbering

Verify the state of the RAID group rebuild process.
3. Before you remove the faulted drive, read the topic “Hot spare drives and the Copyback operation” on page 86.
4. Issue the following command to determine whether the RAID group is still being rebuilt:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 28
Slot Number: 1
Enclosure position: 0
Firmware state: Online, Spun Up
. . .
Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Rebuild

In the example output above, note that the rebuild is still in progress.
• If Rebuild appears anywhere in the output, the rebuild is in progress. Do not remove the faulted drive yet. Monitor the rebuild process as described in step 5.
• If all drives in the output are shown as Online, Spun Up, the rebuild is complete. Proceed to removing the failed drive as described in step 6.

Monitor the rebuild process.
5. To monitor the rebuild process, issue the following command. Change the values shown in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[28:11] -a0
The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 11 refers to the Slot Number of the hotspare drive invoked as the rebuild drive.
• 0 refers to the Adapter Number.
6. If the rebuild is complete, remove the failed drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.
Figure 19 Removing a drive from a Hadoop Master server
7. Make sure that the capacity of the replacement drive matches the capacity of the failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT: Do not mix drives of different capacities within a server.
8. To replace a drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green and blinks while it automatically starts the Copyback operation.
Figure 20 Replacing a drive in a Hadoop Master server
9.
The Copyback operation should begin automatically when you insert the replacement drive. (For details about Copyback, see “Hot spare drives and the Copyback operation” on page 86.) Wait for the Copyback operation to complete. The hot spare always occupies Slot 11 on a Hadoop Master server (see Figure 18 on page 98).

Verify that the Copyback operation is in progress.
10. Issue the following command to verify that the Copyback operation is in progress:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 28
Slot Number: 0
Enclosure position: 0
Firmware state: Copyback
. . .
Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Online, Spun Up

In the example output above, note that the firmware state of the drive in Slot Number 0 is shown as Copyback. This indicates that the Copyback operation is in progress and that data is being restored to the new drive in Slot 0. If no firmware states are shown as Copyback (for example, the firmware states are Online, Spun Up or Hotspare, Spun Up), the Copyback operation is complete.

Monitor the Copyback operation.
11. To view the Copyback progress, issue the following command. Change the values shown in bold below to the actual values from your output.
# CmdTool2 -pdcpybk -progdsply -PhysDrv[28:0] -a0
The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 0 refers to the Slot Number of the drive shown in the Copyback state (the replacement drive in this example).
• 0 refers to the Adapter Number.
The following is an example output of the Copyback progress.
Copyback Progress of Physical Drive...
Enclosure:Slot Percent Complete Time Elps
28 :00 ##############*********29 %*********** 00:10:38
Press key to quit...
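The state check in step 10 can also be sketched as a small filter. This is a minimal illustration, not part of the DCA tooling: drive_activity is a hypothetical helper name, and the sketch assumes that the filtered output arrives with one field per line in the order shown in the example (Adapter, Enclosure Device ID, Slot Number, Firmware state). It uses only sh and awk and does not call CmdTool2; the sample input below repeats the values from the example output.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a DCA utility).
# Reads the filtered -pdlist output on stdin and lists every drive whose
# firmware state is Rebuild or Copyback.
drive_activity() {
    awk '
        /^Adapter/          { adapter = $2 }   # e.g. "#0"
        /^Enclosure Device/ { enc = $NF }      # Enclosure Device ID
        /^Slot Number/      { slot = $NF }
        /^Firmware state/ {
            state = $0
            sub(/^Firmware state: */, "", state)
            if (state ~ /^(Rebuild|Copyback)/) {
                printf "adapter %s drive [%s:%s] %s\n", adapter, enc, slot, state
                busy = 1
            }
        }
        END { if (!busy) print "no Rebuild or Copyback in progress" }
    '
}

# Sample input copied from the example output in step 10.
drive_activity <<'EOF'
Adapter #0
Enclosure Device ID: 28
Slot Number: 0
Enclosure position: 0
Firmware state: Copyback
Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Online, Spun Up
EOF
```

On a server, the function could be placed at the end of the pipeline from step 10 instead of reading the sample text. The enclosure and slot values it reports are the ones to use in the -PhysDrv argument in step 11.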
If necessary, manually initiate the Copyback operation.
If the firmware state is reported as Unconfigured(Good), then the Copyback operation did not occur automatically. In this unlikely event, you must initiate Copyback manually by issuing the following command:
# CmdTool2 -pdcpybk -start -PhysDrv[28:11,28:0] -a0
View the Copyback progress as described in step 11. If no firmware states are shown as Copyback, the Copyback operation is complete.
12. Issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
13. In the output, verify that the firmware state of the drives is reported as follows:
• Drives 0 - 10: Online, Spun Up
• Drive 11: Hotspare, Spun Up

Replace a drive in a Hadoop Worker server
There are two types of RAID configurations in a Hadoop Worker server:
• System disks 0 – 1 are configured as RAID 1. If one of the System disks fails, the RAID is rebuilt automatically by the server’s RAID controller after the replacement drive is inserted. For instructions, see “Replace a System Disk (0 through 1)” on page 101.
• Data disks 2 – 11 are each configured as individual RAID 0 disks. If a drive fails, data must be recovered through the Hadoop filesystem after you replace the drive. For instructions, see “Replace a failed Data Disk (2 through 11)” on page 104.
Note: Unlike other server types, Hadoop Workers do not have a hot spare drive.
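The split between System disks and Data disks described above can be expressed as a simple lookup on the slot number. The following is a minimal sketch only; worker_disk_procedure is a hypothetical helper name introduced for illustration and is not a DCA command. It takes a Hadoop Worker slot number and prints which of the two RAID configurations applies.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a DCA utility).
# Maps a Hadoop Worker slot number to the RAID configuration described above.
worker_disk_procedure() {
    slot=$1
    case "$slot" in
        0|1)
            # System disks: RAID 1, rebuilt by the RAID controller.
            echo "slot $slot: System disk (RAID 1), rebuilt by the RAID controller after replacement" ;;
        2|3|4|5|6|7|8|9|10|11)
            # Data disks: one RAID 0 virtual disk per drive.
            echo "slot $slot: Data disk (single-drive RAID 0), recover data through the Hadoop filesystem" ;;
        *)
            echo "slot $slot: not a valid Hadoop Worker slot (0-11)" >&2
            return 1 ;;
    esac
}

worker_disk_procedure 0
worker_disk_procedure 7
```

A System disk result points to “Replace a System Disk (0 through 1)”; a Data disk result points to “Replace a failed Data Disk (2 through 11)”.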
[Illustration: System disks are physical disks 0:0:0 and 0:0:1. Data disks are physical disks 0:0:2 through 0:0:11.]
Figure 21 Hadoop Worker server drive types and locations

Replace a System Disk (0 through 1)
All drives are installed at the front of the server and connect to the system board through the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this procedure also describes how to determine if the RAID group is still rebuilding and how to monitor the rebuild process.
1. Connect your service laptop to the DCA and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup
b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
g. Locate the failed drive.
Note: In this procedure, System Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive within it.
A failed drive is indicated by a solid (unblinking) amber LED.
If the dial-home information includes a drive number, locate the drive with the help of the following illustration:
[Illustration: System disks are physical disks 0:0:0 and 0:0:1. Data disks are physical disks 0:0:2 through 0:0:11. The illustration marks the failed disk in this example.]
Figure 22 Hadoop Worker server drive slot numbering
3. Remove the failed drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.
Figure 23 Removing a drive from a Hadoop Master server
4. Make sure that the capacity of the replacement drive matches the capacity of the failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT: Do not mix drives of different capacities within a server.
5. To replace a drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green and blinks while it automatically starts the Copyback operation.
Figure 24 Replacing a drive in a Hadoop Master server
After the replacement drive is inserted, the RAID controller automatically begins the rebuild process and writes data to the replacement drive, as indicated by a slowly blinking amber LED on the drive. Do not interrupt the rebuild process.

Verify the state of the RAID group rebuild process.
6.
To determine whether the RAID group is still being rebuilt, issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 21
Slot Number: 0
Enclosure position: 0
Firmware state: Rebuild

In the example output above, note that the firmware state indicates that the rebuild is still in progress.
• If the firmware state of the replacement drive is shown as Rebuild, the rebuild is in progress.
• If the firmware state of the replacement drive is shown as Online, Spun Up, the rebuild is complete.

Monitor the rebuild process.
7. To monitor the rebuild process, issue the following command. Change the values shown in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[21:0] -a0
The values in the above example refer to the following parameters:
• 21 refers to the Enclosure Device ID.
• 0 refers to the slot from which you removed the faulted drive and inserted a replacement drive.
• -a0 refers to Adapter Number 0.
8. Issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"
9. In the output, verify that the firmware state of the drives is reported as: Online, Spun Up

If necessary, manually initiate the rebuild process.
If the firmware state is reported as Unconfigured(Good), then the rebuild did not occur automatically. In this unlikely event, you must initiate the rebuild manually by issuing the following command:
# CmdTool2 -pdrbld -start -PhysDrv[21:0] -a0
Monitor the rebuild as described in step 7. If the firmware state of the replacement drive is shown as Online, Spun Up, the rebuild is complete.

Replace a failed Data Disk (2 through 11)
Data disks 2 through 11 are each configured as individual RAID 0 disks.
Because data from a RAID 0 cannot be recovered automatically by the server’s RAID controller, you must recover it through the Hadoop software as described in this procedure.
1. Log in to the Primary Master Server as the user gpadmin (see “Connect to the Master Server using an SSH client” on page 178).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup
b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
3. Locate the failed data disk (see Figure 22). Make note of the drive position in the format 0:0:X, where X is the slot number of the faulted data drive.
4. Remove the faulted drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.
5. Make sure that the capacity of the replacement drive matches the capacity of the failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT: Do not mix drives of different capacities within a server.
6. Install the replacement drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place.
The LED on the drive turns green and blinks while it automatically starts the Copyback operation.
7. Log in to the Primary Master Server as the user root (see “Connect to the Master Server using an SSH client” on page 178).
8. Connect as the user root to the server with the new disk. For example, if you replaced a disk in hdw1:
$ ssh root@hdw1

Create a new virtual disk
9. Create a new virtual disk on the replacement drive:
a. For the disk that you replaced, determine the Enclosure Device ID of the server and the Adapter Number:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number"
The following output is returned:
Adapter #0
Enclosure Device ID: 13
Slot Number: 0
Enclosure position: 0
. . .
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0
b. Using information from the output above, issue the command from Table 5 that corresponds to the physical disk that you installed. For example, to create a virtual disk on physical disk 0:0:11, issue:
# CmdTool2 -CfgLdAdd r0 '[13:11]' -a0
The values in the above command example refer to the following:
• r0 refers to the RAID level (which is always 0 for a Hadoop Worker data disk).
• 13 refers to the Enclosure Device ID.
• 11 refers to the Slot Number of the physical disk that you replaced.
• -a0 refers to the Adapter Number.
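The way the command in step 9b is assembled from the three values can be sketched as follows. This is an illustration only: build_cfgldadd_cmd is a hypothetical helper name, not a DCA utility, and it only prints the command text in the form used in this procedure; it does not run CmdTool2. The enclosure, slot, and adapter arguments are the values read from the output in step 9a.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a DCA utility).
# Prints the virtual disk creation command for one Hadoop Worker data disk.
build_cfgldadd_cmd() {
    enclosure_id=$1   # Enclosure Device ID from the -pdlist output, e.g. 13
    slot=$2           # Slot Number of the replaced data disk (2 through 11)
    adapter=$3        # Adapter Number, e.g. 0

    # Data disks occupy slots 2 through 11. Slots 0 and 1 hold the RAID 1
    # System disks, which are covered by a different procedure.
    case "$slot" in
        2|3|4|5|6|7|8|9|10|11) ;;
        *) echo "slot $slot is not a data disk slot (2-11)" >&2; return 1 ;;
    esac

    # r0: each data disk is its own single-drive RAID 0 virtual disk.
    printf "CmdTool2 -CfgLdAdd r0 '[%s:%s]' -a%s\n" "$enclosure_id" "$slot" "$adapter"
}

# Example from step 9b: physical disk 0:0:11, Enclosure Device ID 13, Adapter 0.
build_cfgldadd_cmd 13 11 0
```

Printing the command rather than executing it leaves room to compare it with the matching row of Table 5 before issuing it as root.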
Table 5 Virtual disk creation commands per physical disk slot
Physical Disk    Command
0:0:0, 0:0:1     # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 102400 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 65536 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 102400 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -a0
0:0:2            # CmdTool2 -CfgLdAdd r0 '[13:2]' -a0
0:0:3            # CmdTool2 -CfgLdAdd r0 '[13:3]' -a0
0:0:4            # CmdTool2 -CfgLdAdd r0 '[13:4]' -a0
0:0:5            # CmdTool2 -CfgLdAdd r0 '[13:5]' -a0
0:0:6            # CmdTool2 -CfgLdAdd r0 '[13:6]' -a0
0:0:7            # CmdTool2 -CfgLdAdd r0 '[13:7]' -a0
0:0:8            # CmdTool2 -CfgLdAdd r0 '[13:8]' -a0
0:0:9            # CmdTool2 -CfgLdAdd r0 '[13:9]' -a0
0:0:10           # CmdTool2 -CfgLdAdd r0 '[13:10]' -a0
0:0:11           # CmdTool2 -CfgLdAdd r0 '[13:11]' -a0
Note: 0:0:11 is used as the example replacement drive throughout this procedure.

Table 6 below matches each physical disk in the server with its corresponding virtual disk. Note that the virtual disk for physical disk 0:0:11 is 13.

Table 6 Disk attributes in a Hadoop Worker server
Physical Disk    Virtual Disk    Mount (Device name)    Label
0:0:0, 0:0:1     0               /sda1                  /boot
                 1               /sda2                  /
                 2               /sdbswap               swap
                 3               /sdc1                  crash
0:0:2            4               /sde                   /data1
0:0:3            5               /sdf                   /data2
0:0:4            6               /sdg                   /data3
0:0:5            7               /sdh                   /data4
0:0:6            8               /sdi                   /data5
0:0:7            9               /sdj                   /data6
0:0:8            10              /sdk                   /data7
0:0:9            11              /sdl                   /data8
0:0:10           12              /sdm                   /data9
0:0:11           13              /sdn                   /data10

Determine the automatically-assigned device name.
10. Confirm the device name that was assigned to the new virtual disk. When issuing the following command, replace the values shown in bold below with the values specific to your situation:
# CmdTool2 -LDInfo -L13 -a0
The values in the above command example refer to the following:
• 13 refers to the virtual disk created on physical disk 0:0:11.
• 0 refers to the Adapter Number (i.e., the RAID controller).
Example output is shown below.
Virtual Drive: 13 (Target Id: 13)
Name :sdn
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 2.727 TB
Parity Size : 0
State : Optimal
Strip Size : 256 KB
Number Of Drives : 1
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
PI type: No PI
Is VD Cached: No

In the example output above, note the value next to Name. This is the device name that you will use in step 11 to format a filesystem on the new virtual disk.
If the name does not appear in your output as it did in the above example, refer to Table 6 on page 106 and look up the device name value that corresponds to the physical disk that you replaced. Then issue the following command to set the device name. For example, to set the virtual disk device name of physical disk 0:0:11 to sdn (as specified in Table 6 on page 106):
# CmdTool2 -ldsetprop -name sdn -L13 -a0
11. Format and label the filesystem on the new virtual disk. Replace the text in bold with values from the Label column and the Mount (Device name) column in Table 6 that correspond to the physical disk that you replaced.
# mkfs -t xfs -L /data10 -f /dev/sdn
# mount /data10

If necessary, clear the state of the disk
If for any reason you eject and then reseat an existing drive, you may have put the drive into a foreign state. If so, you must clear the state of the drive.
a.
To check the state of the drive in this case, switch to the user root and then issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state|Foreign State"
Example output for a drive with a foreign state would look like this:
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0
Firmware state: foreign
b. Examine the output carefully. If the output shows the drive state to be foreign, clear the state by issuing the following command, making sure to replace the Enclosure Device ID and Slot Number shown in bold below with the Enclosure Device ID and Slot Number given in your output.
# CmdTool2 -PDMakeGood -PhysDrv[13:11] -Force -aALL
c. Then, issue the following command. Replace the values in bold with the Enclosure Device ID and Slot Number given in your output:
# CmdTool2 -PDClear -Start -PhysDrv[13:11] -a0
d. Wait 5 minutes, then issue the following command, making sure to replace the values in bold with the Enclosure Device ID and Slot Number given in your output:
# CmdTool2 -PDClear -Stop -PhysDrv[13:11] -a0

Introduce the replacement drive to the Hadoop cluster and restart Hadoop
12. Make the following directories on the replacement drive:
# mkdir /data10/hadoop
# mkdir /data10/hadoop/data
# mkdir /data10/hadoop/local
13. Set ownership of the new directories that you made in the previous step:
# chown hdfs:hadoop /data10/hadoop/data
# chown mapred:hadoop /data10/hadoop/local
14. Set permissions on the Hadoop directories to 755:
# chmod 755 /data10/hadoop
# chmod 755 /data10/hadoop/local
15. Restart Hadoop on the recovered Hadoop Worker node:
# service hadoop-datanode restart
# service hadoop-tasktracker restart
16. Connect to the host hdm1:
# ssh hdm1
17.
Switch to the user hdfs and verify that the Hadoop filesystem is healthy:
# su - hdfs
$ hadoop fsck /
The following message in the output indicates a healthy filesystem:
The filesystem under path '/' is HEALTHY

CHAPTER 5
Replace a Power Supply in a Server
This chapter describes how to replace power supplies in DCA UAP servers.

Power supply LEDs
Each server in the UAP DCA has two power supplies that share the power load and provide redundancy. When the power load is below a certain threshold, only one power supply in each server is active and its LED is solid green to reflect the active state. The other power supply is in standby mode and its LED flashes green to reflect the standby state.

Table 7 Power supply LED behavior
LED behavior             Definition
Solid green              Active mode
Blinking green, slow     Standby mode
Blinking green, rapid    Power supply firmware updating
Off                      No AC power to both power supplies in the server
Amber                    No AC power to this power supply, other power supply has AC power

Replace a power supply in a server
The servers in the DCA rack are powered by dual redundant hot-swappable power supplies located in the rear of the appliance. To replace a power supply in a server, perform the following procedure.
1. Log in to the Primary Master Server as the user root (see “Connect to the Master Server using an SSH client” on page 178).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup
b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can safely ignore the error message. To identify the failed server, locate the blue server identification LED.
3. Locate the failed power supply in the server. The LED on a failed power supply is either amber or, if the power supply has completely failed, off.
4. Verify that the LED on the functioning (redundant) power supply is solid green.
5. Check the connection of the AC power cable. Make sure that both ends of the cable are securely connected.
6. If the power supply still appears to be failed, try a known-good power cable.
7. If the power supply still appears to be failed, disengage the retaining clip that secures the AC power cable, and then disconnect the AC power cable from the power supply.
8. While pressing the green release latch leftward, pull on the handle to slide the power supply out of the chassis.
Figure 25 Remove a server power supply
9. Make sure that the replacement power supply is the correct part number:
EMC P/N 105-000-244
The part number is located on the packaging, not on the power supply itself. Both power supplies in a server must provide the same maximum output power.
10. Install the replacement power supply in the chassis. Slide the replacement power supply into the chassis until the power supply is fully seated and the release latch clicks into place.
Figure 26 Insert a server power supply
11. Connect the AC power cable to the power supply and secure it with the retaining clip.
12. Verify that the power supply LED indicator is solid or blinking green.
13. Turn off the green lightbar on the DCA door and the blue server identification LED:
a.
If you are not already in it, start the DCA Setup utility as the user root:
# dca_setup
b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 2 for Turn off the light bar.
e. Enter the hostname of the server and press ENTER. For example, sdw1.
The lightbar on the DCA door and the blue server identification LED stop blinking.

CHAPTER 6
Replace a Fan Assembly or Power Supply in an Arista Switch
Refer to the appropriate section to replace a fan assembly or a power supply in an Arista switch.

Replace a Fan Assembly in an Arista Switch
The Admin, Interconnect, and Aggregate switches (Arista 7048/7050 series) in a DCA rack are cooled by four modular fan assemblies, each of which can be replaced individually upon failure, as shown in Figure 27. The fans provide rear-to-front airflow.
Leave any failed fan assembly installed until it can be replaced immediately. Do not remove a fan assembly from the chassis until you are ready to replace it.
Follow ESD precautions, including the use of a wrist grounding strap, when you replace components.
The cooling system requires pressurized air in order to function properly. Do not leave any fan assembly slot unoccupied for longer than two minutes when the appliance is operating.

Fan Assembly Replacement Order Information
Use the following information to order the Arista Fan Assembly for the 7048T and 7050S Switch:
Description: Fan Assembly for the Arista Switch (rear-to-front air flow)
Model Number: FAN-7000-R
EMC Part Number: 105-000-313

Tools
The procedure only requires a wrist grounding strap (there are no screws to remove).

Identify the Failed Fan Assembly
1.
Once on site, locate the DCA rack and confirm the physical location of the switch within the cabinet identified by the top level assembly (TLA) serial number reported in the service request. Arista 7050S switches function as server interconnect or rack aggregation switches and are mounted at the midpoint of the 40RU rack. The Admin (7048T) switch is mounted at the top of the rack.
2. Confirm that the fan assembly specified for replacement is the failed fan assembly, and that the LEDs on the other fans are steady green (Figure 27 and Figure 28). Access to the fans is from the front of the rack.
Figure 27 Four fan assemblies in the Arista switch
Figure 28 Fan Status LED location
3. Analyze the failure using Table 8 for possible actions.

Table 8 Fan assembly LED indicators
LED Behavior           Possible State                                            Action
No light               Fan assembly is not receiving power.                      Verify that the fan assembly is seated correctly, and there is no air movement.
Steady Green           Fan assembly is operating normally.                       No action
Steady Red or Amber    Fan has failed or power supply was removed from switch.   Failed and must be replaced.

Remove the Failed Fan Assembly and Install the Replacement Part
The cooling system relies on pressurized air. Do not leave any of the fan assembly slots empty longer than two minutes when the switch is in operation.
1. Remove the fan assembly. While pressing the black release latch leftward, grip the blue pull-ring and slide the fan assembly out of the chassis.
2. Slide the new fan assembly into the chassis until the unit is fully seated and the release latch snaps back into its original position. Do not force the insertion, as damage can occur. If the assembly resists, make sure that it is oriented correctly so that it slides in smoothly.
3. Verify that the Fan status LED is lit (steady green) to indicate normal operation.
Parts Return
1. Locate the Parts Return Label package. Fill out the shipping label. Apply the shipping label to the box for return to EMC.
2. Read the enclosed Shipping Instructions sheet.
3. Apply any other labels appropriate to the returned part to the box, including the Failure Analysis (FA) tag, which is currently required for all DCA replacement parts.
4. Securely tape the box and ship the failed part back to EMC.
5. Send questions regarding this return shipment to: CS_Logistics_IC@emc.com

This completes the fan assembly replacement.

Replace a Power Supply in an Arista Switch

The Admin, Interconnect, and Aggregate switches (Arista 7048/7050 series) in a DCA rack are powered by dual redundant modular power supply assemblies, each of which can be replaced individually upon failure, as shown in Figure 29.

Leave a failed power supply installed until you are ready to replace it immediately.

Follow ESD precautions, including the use of a wrist grounding strap, when you replace components.

The cooling system requires pressurized air to function properly. Do not leave any power supply slot unoccupied for longer than two minutes while the appliance is operating.

Power Supply Assembly Replacement Order Information
Use the following information to order the Arista Power Supply Assembly for the 7048T and 7050S switches:
Description: Power Supply Assembly for the Arista Switch, 460W AC (rear-to-front airflow)
Model Number: PWR-460AC-R
EMC Part Number: 105-000-314

Tools
The procedure requires only a wrist grounding strap (there are no screws to remove).

Identify the Failed Power Supply
1.
Once on site, locate the DCA rack whose cabinet top level assembly (TLA) serial number matches the one reported in the service request, and confirm the physical location of the switch within that cabinet. The Arista 7050S switches function as server interconnect or rack aggregation switches and are mounted at the midpoint of the 40RU rack height. The Admin (7048T) switch is mounted at the top of the rack.
2. Confirm that the power supply specified for replacement is not working, and that the LED on the other power supply is steady green (Figure 30). Access to the power supplies is from the front of the rack.

Figure 29 Dual power supply assemblies in the Arista switch
Figure 30 Power Supply Status LED location

3. Analyze the failure using Table 9 for possible actions.

Table 9 Power Supply LED indicators
LED Behavior          Possible State                                                                 Action
No light              Power supply is not connected to rack AC power source, or not inserted fully.  Remove and reinsert firmly.
Steady Green          Power supply is operating normally.                                            No action
Steady Red or Amber   Power supply overheated or has failed.                                         Failed and must be replaced.

Remove the Failed Power Supply and Install the Replacement Part

The cooling system relies on pressurized air. Do not leave any power supply slot empty for longer than two minutes while the switch is in operation.

1. Unplug the AC power cable from the power supply you intend to remove (see Figure 30).
2. Locate the black release latch (lower right). While pressing the release latch to the left, grip the blue pull-ring and slide the power supply out of the chassis.
3. Slide the new power supply into the chassis until it is fully seated and the release latch snaps into place. Do not force the insertion, because damage can occur. If the unit resists, make sure that it is oriented correctly so that it slides in smoothly.
4.
Reconnect the AC power cable to the power supply.
Note: When you apply power to a new power supply, allow time for the system to recognize the power supply and determine its status. The power supply status indicator turns green to signify that it is functioning properly.
5. Verify that the power supply status LED is lit to indicate normal operation.

Parts Return
1. Locate the Parts Return Label package. Fill out the shipping label. Apply the shipping label to the box for return to EMC.
2. Read the enclosed Shipping Instructions sheet.
3. Apply any other labels appropriate to the returned part to the box, including the Failure Analysis (FA) tag, which is currently required for all DCA replacement parts.
4. Securely tape the box and ship the failed part back to EMC.
5. Send questions regarding this return shipment to: CS_Logistics_IC@emc.com

This completes the power supply replacement.

CHAPTER 7
Replace a Switch in the DCA

This chapter describes how to replace Arista 7048T and 7050S-52 switches in a DCA. It also describes how you can use the DCA Setup Utility to upload switch configuration files from the Master server to switches.

Switch types include:
• Interconnect and Aggregation (10GB; SWCH-AR1U-7050S-52)
• Administration (1GB; SWCH-AR1U-7048T)

Note: Beginning in DCA 2.0.0.0, you cannot use the DCA Setup Utility to back up switch configuration files to the Master server.

Major topics include:
• Requirements
• Switch hostnames and IP addresses
• Replace an Arista 7050S Interconnect or Aggregation Switch
• Replace an Arista 7048T Administration Switch
Requirements
• Wrist grounding strap
• 9-pin serial cable (RJ-45 to 9-pin D-sub connector)
  IMPORTANT: If your laptop does not have a serial port, you must use a USB-to-serial adapter cable.
• Materials to label 20 cables
• Phillips #2 screwdriver
• 1/4-inch flathead screwdriver

Switch hostnames and IP addresses

You must configure the replacement switch with the correct hostname and IP address for the type of rack it inhabits and its position within the rack, as detailed in Table 10 below. For a table containing IP addresses for all configurations, see Appendix A, “Network Configuration Information.”

Table 10 Switch hostnames and IP addresses
Rack                  Hostname                    IP Address

Arista 7050S Interconnect switches (10GB)
SYSRACK (Rack 1)      i-sw-2 (Upper switch)       172.28.0.180
                      i-sw-1 (Lower switch)       172.28.0.170
AGGREG (Rack 2)       i-sw-4 (Upper switch)       172.28.0.181
                      i-sw-3 (Lower switch)       172.28.0.171
EXPAND (Rack 3)       i-sw-6 (Upper switch)       172.28.0.182
                      i-sw-5 (Lower switch)       172.28.0.172
EXPAND (Rack 4)       i-sw-8 (Upper switch)       172.28.0.183
                      i-sw-7 (Lower switch)       172.28.0.173
EXPAND (Rack 5)       i-sw-10 (Upper switch)      172.28.0.184
                      i-sw-9 (Lower switch)       172.28.0.174
EXPAND (Rack 6)       i-sw-12 (Upper switch)      172.28.0.185
                      i-sw-11 (Lower switch)      172.28.0.175
EXPAND (Rack 7)       i-sw-14 (Upper switch)      172.28.0.186
                      i-sw-13 (Lower switch)      172.28.0.176
EXPAND (Rack 8)       i-sw-16 (Upper switch)      172.28.0.187
                      i-sw-15 (Lower switch)      172.28.0.177
EXPAND (Rack 9)       i-sw-18 (Upper switch)      172.28.0.188
                      i-sw-17 (Lower switch)      172.28.0.178
EXPAND (Rack 10)      i-sw-20 (Upper switch)      172.28.0.189
                      i-sw-19 (Lower switch)      172.28.0.179
EXPAND (Rack 11)      i-sw-22 (Upper switch)      172.28.1.180
                      i-sw-21 (Lower switch)      172.28.1.170

Arista 7050S Aggregation switches (10GB)
AGGREG (Rack 2 only)  aggr-sw-2 (Upper switch)    172.28.0.249
                      aggr-sw-1 (Lower switch)    172.28.0.248

Arista 7048T Administration switches (1GB)
SYSRACK (Rack 1)      a-sw-1                      172.28.0.190
AGGREG (Rack 2)       a-sw-2                      172.28.0.191
EXPAND (Rack 3)       a-sw-3                      172.28.0.192
EXPAND (Rack 4)       a-sw-4                      172.28.0.193
EXPAND (Rack 5)       a-sw-5                      172.28.0.194
EXPAND (Rack 6)       a-sw-6                      172.28.0.195
EXPAND (Rack 7)       a-sw-7                      172.28.0.196
EXPAND (Rack 8)       a-sw-8                      172.28.0.197
EXPAND (Rack 9)       a-sw-9                      172.28.0.198
EXPAND (Rack 10)      a-sw-10                     172.28.0.199
EXPAND (Rack 11)      a-sw-11                     172.28.1.190

Replace an Arista 7050S Interconnect or Aggregation Switch

When installing a replacement switch, identify the firmware version on the new switch (as well as the versions already running in the DCA). Then upgrade so that all switches reflect the same firmware levels. Go to http://support.emc.com to obtain the pertinent firmware upgrade instructions. The upgrade instructions provide information on how to access and install the firmware upgrade package.

Summary of main tasks:
• Remove the failed switch and install the replacement switch
• Establish a serial connection and log in to the replacement switch
• Configure the switch management port and password
• Check the current firmware version
• Update the firmware if necessary
• Upload the switch configuration through DCA Setup
• Check the health of the GPDB

1. Identify the type and location of the switch that you are going to replace.

Figure 31 Location of Interconnect and Aggregation switches (labels: Interconnect switches in the System Rack; Aggregation switches in the Aggregation Rack)

2. Connect your service laptop to the red service cable located on the laptop tray in Rack 1.
The red service cable is connected to port 48 on the Administration switch in Rack 1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial home messages from being sent to EMC Support during service, stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

Remove the failed switch and install the replacement switch
4. Label all cables connected to the switch. On the label, include the server and server port from which each cable originates and the switch and switch port to which each cable connects. For connectivity details, see the following:
• Interconnect Switch cabling: see “” on page 152.
• Aggregation Switch cabling: see “Aggregation switch reference” on page 163.
5. Power off the switch by removing both AC power cables from the power supplies on the back of the switch.
6. Make sure that the interconnect cables are labeled as described in step 4 above, and then remove the interconnect cables from the Interconnect switch.
7. Remove the failed switch and install the replacement switch (see “Install a Switch in a Rack” on page 200).
8. Connect all data cables to the correct ports on the switch. Refer to the labels for the correct connectivity information. For more information, see “Network and cabling configurations” on page 152.
9. Power on the switch by connecting the AC power cables to the power supplies on the back of the replacement switch. The switch powers up as soon as AC power is applied. Verify that the power supply LEDs are solid green after a few seconds.
IMPORTANT: Each switch power supply should be connected to a separate AC power zone on the rack. See “Power supply reference” on page 146.
Figure 32 Connecting switch power cords to AC power (labels: To Power Zone A PDU; To Power Zone B PDU; Front of rack; Rear of rack)

Note: Because the replacement switch was configured at the factory, it is not yet accessible through SSH, so you must configure the replacement switch through a serial connection as described in “Connect to an Interconnect or Administration switch using PuTTY” on page 181.

Establish a serial connection and log in to the replacement switch
10. Connect your service laptop to the serial console port on the replacement switch using a native RJ-45 serial cable. If you do not have a native RJ-45 serial cable, use a DB-9-to-RJ45 or USB-to-RJ45 serial adapter.

Figure 33 Arista 7050S serial port location (label: Serial console port)

11. Using a terminal emulator such as Hyperterminal, log in to the switch as user admin with no password, using the following settings:
• Connection type: serial
• Data rate: 9600
• Data bits: 8
• Parity: none
• Stop bits: 1
• Hardware flow control: none
12. At the localhost prompt, issue the following commands to disable the Arista zerotouch feature:
# enable
# zerotouch cancel
Wait as the switch reboots.
13. When the switch is finished booting, log in again as user admin with no password.

Configure the switch management port, password, hostname, and IP address
14. At the localhost prompt, issue the following commands to configure the management port:
Note: Change the hostname and IP address shown below as appropriate for the type and location of the switch you are configuring. For details see “Switch hostnames and IP addresses” on page 120.
# enable
# conf t
(config)# hostname i-sw-1
(config)# interface management 1
(config-if-Ma1)# ip address 172.28.0.170/21
(config-if-Ma1)# exit
(config)# user admin secret 0 changeme
(config)# write mem
(config)# exit
15.
Connect all data cables to the correct ports and the Ethernet cable to the management port of the switch.
16. Determine whether you need to update the Extensible Operating System (EOS) firmware on the switch:
# show boot-config
In the output, check the version in the Software image line:
Software image: flash:/EOS-4.9.3.2.swi
• If 4.9.3.2 is returned, you do not need to update the switch firmware. Proceed to step 18 to complete the switch configuration.
• If 4.9.3.2 is not returned, you must update the switch firmware. Proceed to step 17.

Update the firmware if necessary
17. If you determined in the previous step that you need to update the switch firmware:
a. Before proceeding, back up the current switch configuration. Read Appendix G for instructions on backing up the switch configurations.
b. Download the Arista firmware from http://support.emc.com and place it in /opt/dca/etc/arista_fw/. You may need to create the directory if it does not exist.
c. Next, download the current switch configuration. Customizing the switches is a common practice, and this procedure reloads the switches with default configurations. Backing up the switch configurations allows for easy restoration after installing the new firmware.
d. Issue the following command to copy the EOS firmware file from the Primary Master Server to the switch:
# copy scp://root@172.28.4.250/opt/dca/etc/arista_fw/EOS-4.9.3.2.swi flash:/EOS-4.9.3.2.swi
e. When prompted, enter password changeme.
root@172.28.4.250's password:
# conf t
(config)# boot system flash:/EOS-4.9.3.2.swi
(config)# exit
f. Check the EOS firmware version that you installed.
# show boot-config
The following output is returned:
Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)
g. Save the EOS configuration and reload. The switch reboots.
# write mem
# reload
h. Recover the switch configuration using the instructions in Appendix G.
18. Disconnect the serial cable.
19. Connect your service laptop to the red service cable located on the laptop tray in Rack 1 and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).

Check the health of the GPDB
20. Log in to the Primary Master as gpadmin and issue the following command to verify that the database is healthy:
$ gpstate -m
Verify that all segments are reported as Synchronized:
Mirror   Datadir                Port    Status              Data Status
sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Synchronized
21. Re-enable health monitoring:
# dca_healthmon_ctl -e

Replace an Arista 7048T Administration Switch

When installing a replacement switch, identify the firmware version on the new switch (as well as the versions already running in the DCA). Then upgrade so that all switches reflect the same firmware levels. Go to http://support.emc.com to obtain the pertinent firmware upgrade instructions. The upgrade instructions provide information on how to access and install the firmware upgrade package.

Summary of main tasks:
• Remove the failed switch and install the replacement switch
• Establish a serial connection and log in to the replacement switch
• Configure the switch management port
• Configure the switch password
• Check the current firmware version
• Update the firmware if necessary
• Upload the switch configuration through DCA Setup
• Check the health of the GPDB

1. Identify the Administration switch in the rack.

Figure 34 Location of the Administration switch (label: Administration switch in the System Rack)

2.
To prevent false dial home messages from being sent to EMC Support during service, stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

Remove the failed switch and install the replacement switch
3. Label all cables connected to the switch. On the label, include the server and server port from which each cable originates and the switch and switch port to which each cable connects. For connectivity details, see “Administration switch reference” on page 159.
4. Power off the switch by removing both AC power cables from the power supplies on the back of the switch.
5. Make sure that the cables are labeled as described in step 3 above, and then remove the cables from the Administration switch.
6. Remove the failed switch and install the replacement switch (see “Install a Switch in a Rack” on page 200).
7. Connect all data cables to the correct ports on the switch. Refer to the labels for the correct connectivity information. For more information, see “Network and cabling configurations” on page 152.
8. Power on the switch by connecting the AC power cables to the power supplies on the back of the replacement switch. The switch powers up as soon as AC power is applied. Verify that the power supply LEDs are solid green after a few seconds.
IMPORTANT: Each switch power supply should be connected to a separate AC power zone on the rack. See “Power supply reference” on page 146.

[Figure: connecting switch power cords to AC power (labels: To Power Zone A PDU; To Power Zone B PDU; Front of rack; Rear of rack)]

IMPORTANT: Because the replacement switch was configured at the factory, it is not yet accessible through SSH, so you must configure it through a serial connection as described in “Connect to an Interconnect or Administration switch using PuTTY” on page 181.

Establish a serial connection and log in to the replacement switch
9.
Connect your service laptop to the serial console port on the replacement switch using a native RJ-45 serial cable. If you do not have a native RJ-45 serial cable, use a DB-9-to-RJ45 or USB-to-RJ45 serial adapter.

Figure 35 Arista 7048T serial port location (label: Serial console port)

10. Using a terminal emulator such as Hyperterminal, log in to the switch as user admin with no password, using the following settings:
• Connection type: serial
• Data rate: 9600
• Data bits: 8
• Parity: none
• Stop bits: 1
• Hardware flow control: none
11. At the localhost prompt, issue the following commands to disable the Arista zerotouch feature:
# enable
# zerotouch cancel
Wait as the switch reboots.
12. When the switch is finished booting, log in again as user admin with no password.

Configure the switch VLAN port service, hostname, and IP address
13. At the localhost prompt, issue the following commands to set the VLAN port service. Change the hostname and IP address shown below as appropriate for the switch you are configuring. For details see “Switch hostnames and IP addresses” on page 120.
IMPORTANT: This step differs slightly according to the specific Administration switch that you are replacing.

Administration switch in Rack 1 (a-sw-1)
# enable
# conf t
(config)# hostname a-sw-1
(config)# interface vlan 3
The following output displays:
! Access VLAN does not exist. Creating vlan 3
Continue:
(config-if-Vl3)# ip address 172.28.0.190/21
(config-if-Vl3)# interface ethernet 1-48
(config-if-Et1-48)# switchport access vlan 3

Administration switch in Rack 2 (a-sw-2)
# enable
# conf t
(config)# hostname a-sw-2
(config)# interface vlan 3
The following output displays:
! Access VLAN does not exist. Creating vlan 3
Continue:
(config-if-Vl3)# ip address 172.28.0.191/21
(config-if-Vl3)# interface ethernet 1-48
(config-if-Et1-48)# switchport access vlan 3
(config-if-Et1-48)# interface port-Channel 1000
(config-if-Po1000)# switchport mode trunk
(config-if-Po1000)# switchport trunk group mlagpeerlink
(config-if-Po1000)# interface ethernet 45-46
(config-if-Et45-46)# channel-group 1000 mode active

Administration switch in Racks 3 to 12 (a-sw-3 to a-sw-12)
Change the hostname and IP address shown below as appropriate for the specific Administration switch you are configuring. For details see Appendix A, “Network Configuration Information.”
# enable
# conf t
(config)# hostname a-sw-3
(config)# interface vlan 3
The following output displays:
! Access VLAN does not exist. Creating vlan 3
Continue:
(config-if-Vl3)# ip address 172.28.0.192/21
(config-if-Vl3)# interface ethernet 1-48
(config-if-Et1-48)# switchport access vlan 3
(config-if-Et1-48)# interface port-Channel 900
(config-if-Po900)# switchport access vlan 3
(config-if-Po900)# interface ethernet 45-46
(config-if-Et45-46)# channel-group 900 mode active

14. Verify that the ports were added to vlan 3:
(config-if-Et1-48)# show vlan
The following output displays:
VLAN   Name       Status   Ports
-----  -----      -----    -----
1      default    active
3      VLAN0003   active   Cpu, Et1, Et2, Et3, Et4, Et5, Et6, Et7, Et8, Et17, Et18, Et19, Et20
Note: Only active ports display in the above output. You may see different output.

Configure the switch password
15. Configure the switch password:
(config-if-Et1-48)# exit
(config)# user admin secret 0 changeme
(config)# write mem
(config)# exit
16. Connect all data cables to the correct ports and the Ethernet cable to the management port of the switch.
17.
Determine whether you need to update the Extensible Operating System (EOS) firmware on the switch:
# show boot-config
In the output, check the version in the Software image line:
Software image: flash:/EOS-4.9.3.2.swi
• If 4.9.3.2 is returned, you do not need to update the switch firmware. Proceed to step 19 to complete the switch configuration.
• If 4.9.3.2 is not returned, you must update the switch firmware. Proceed to step 18.

Update the firmware if necessary
18. If you determined in the previous step that you need to update the switch firmware:
a. Before proceeding, back up the current switch configuration. Read Appendix G for instructions on backing up the switch configurations.
b. Download the Arista firmware from http://support.emc.com and place it in /opt/dca/etc/arista_fw/. You may need to create the directory if it does not exist.
c. Next, download the current switch configuration. Customizing the switches is a common practice, and this procedure reloads the switches with default configurations. Backing up the switch configurations allows for easy restoration after installing the new firmware.
d. Issue the following command to copy the EOS firmware file from the Primary Master Server to the switch:
# copy scp://root@172.28.4.250/opt/dca/etc/arista_fw/EOS-4.9.3.2.swi flash:/EOS-4.9.3.2.swi
e. When prompted, enter password changeme.
root@172.28.4.250's password:
# conf t
(config)# boot system flash:/EOS-4.9.3.2.swi
(config)# exit
f. Check the EOS firmware version that you installed.
# show boot-config
The following output is returned:
Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)
g. Save the EOS configuration and reload. The switch reboots.
# write mem
# reload
h. Recover the switch configuration using the instructions in Appendix G.
19. Disconnect the serial cable.
20.
Connect your service laptop to the red service cable located on the laptop tray in Rack 1 and log in to the Primary Master as the user root (see “Connect a workstation to the DCA” on page 176).

Check the health of the GPDB
21. Log in to the Primary Master as gpadmin and issue the following command to verify that the database is healthy:
$ gpstate -m
Verify that all segments are reported as Synchronized:
Mirror   Datadir                Port    Status              Data Status
sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Synchronized
22. Re-enable health monitoring:
# dca_healthmon_ctl -e

CHAPTER 8
Replace an Interconnect Switch Cable

This chapter describes how to replace a twin-ax cable used to connect servers and interconnect switches in the DCA.

Note: Some failed cables may be part of a cable bundle. Plan for multiple systems losing connectivity. It is recommended to disable the database and healthmon until cables are replaced.

Locate and replace the failed cable
1. Log in to the Primary Master server as the user root. Refer to “Connect to the Master Server using an SSH client” on page 178 for details.
2. Activate the server identification LED on the server with the failed cable. For example, on sdw8:
# dca_blinker -h sdw8 -a ON
3. From the rear of the system, locate the Converged Network Adapter (CNA) card in the server's expansion slot.
Figure 36 CNA card location in DCA servers (panels: Master or DIA server and Hadoop Compute server; Segment server and Hadoop master and worker servers; port labels: To lower Interconnect switch, To upper Interconnect switch)
Figure 37 Master server with extra 10Gb NICs

4. Observe the Link and Act LEDs adjacent to each port on the card. A single, steadily flashing LED indicates that the attached cable has failed. If both LEDs are flashing, further diagnosis is required. DO NOT replace the cables in this case. Instead, contact EMC technical support services for assistance.

Figure 38 Interconnect switch CNA port LEDs

5. Before you connect the replacement cable, shut down the database. This is required because of the new cable bundling introduced in release 2.0.2.0. Shutting down the database also prevents false dial home messages from being sent to EMC Support during service. To shut down the database:
a. Disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d
b. Switch to the user gpadmin:
# su - gpadmin
c. When prompted for the password, enter changeme. If the default password changeme was changed, enter the current password.
d. Stop the Greenplum Database:
$ gpstop -af
e. Switch to the user root:
$ su -
6. Disconnect both ends of the cable and remove the cable from the cable bundle. For Interconnect cable diagrams, refer to “” on page 152.
7. Connect one end of the new cable to the CNA port on the server and the other end to the correct port on the appropriate Interconnect switch.
8. Verify that the Link LED on the CNA card is solid green.
9. Secure the cable back into the cable bundle.
10.
Verify that the eth4 interface on the affected server is UP and RUNNING:
# ifconfig eth4
The following output should be returned:
eth4      Link encap:Ethernet  HWaddr 8C:7C:FF:20:93:32
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:921818 errors:0 dropped:0 overruns:0 frame:0
          TX packets:908966 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:129140405 (123.1 MiB)  TX bytes:88721575 (84.6 MiB)
          Memory:ec440000-ec47ffff
If eth4 is not UP, bring it up by issuing:
# ifup eth4
11. After all of the connections are fixed, you can start the database.

APPENDIX A
System Information and Configuration

This appendix includes the following sections:
• Greenplum DCA configurations
• Power supply reference
• Network and cabling configuration
• Network hostname and IP configuration
• Multiple-rack cabling reference
• Configuration files
• Default passwords

New firmware updates for DCA software version 2.0.3.0

Customers can apply optional firmware updates prior to upgrading to DCA software version 2.0.3.0 as follows:

Arista 7050S-52 and Arista 7048T switches
• New firmware version EOS-4.9.8.swi
• Field personnel can access the EOS-4.9.8.swi.zip firmware upgrade package from:
ftp://ftp.aristanetworks.com/emc/certifiedeos/EOS-4.9.8.swi
Field personnel can obtain the following document, available on http://support.emc.com, for step-by-step instructions: EMC Greenplum DCA Firmware Upgrade Instructions for the Interconnect Switch (Arista 7050S-52) and Administration Switch (Arista 7048T)

Intel Servers (Kylin with eight drives, Dragon 12 with twelve drives, and Dragon 24 with 24 drives)
• New BIOS upgrade revision level SE5C600.86B.02.01.0002
• Field personnel can access both the BIOS upgrade package and the EMC Greenplum DCA Intel BIOS Upgrade Instructions for Intel Servers from http://support.emc.com.
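The switch replacement procedures decide whether a firmware update is needed by reading the Software image line of the show boot-config output. As a rough illustration of that comparison, the sketch below extracts the EOS version from captured output and compares it with a required version. The function names and the SAMPLE_OUTPUT constant are hypothetical; the sample text is the output shown in the procedures, and nothing here connects to a switch.

```python
import re

# Output of "show boot-config" as shown in the switch replacement procedures.
SAMPLE_OUTPUT = """Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)
"""


def eos_version_from_boot_config(output):
    """Return the EOS version named on the 'Software image' line, or None."""
    match = re.search(r"Software image:\s*flash:/EOS-([0-9.]+)\.swi", output)
    return match.group(1) if match else None


def needs_firmware_update(output, required_version):
    """True when the boot image version differs from the required version.

    A missing or unparseable 'Software image' line also counts as needing
    attention, because the installed version cannot be confirmed.
    """
    return eos_version_from_boot_config(output) != required_version


if __name__ == "__main__":
    print(eos_version_from_boot_config(SAMPLE_OUTPUT))
    print(needs_firmware_update(SAMPLE_OUTPUT, "4.9.3.2"))
```

The required version is passed in rather than hard-coded because this guide names more than one image (4.9.3.2 in the replacement procedures, 4.9.8 as the optional update).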
Identify the version of the installed DCA software

DCA documentation is tied to a specific version of the DCA software. To identify the version of the software running on a particular DCA, perform this procedure:
1. Log in to the Primary Master server as the user root.
2. View the contents of the /opt/dca/etc/dca-build-info.txt file. For example:
# cat /opt/dca/etc/dca-build-info.txt
In the output, see the ISO_VERSION information.
## =============================================
ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"
## =============================================

DCA configuration rules

Manufacturing ships three basic types of racks for the UAP DCA:
• System - DCA2-SYSRACK
• Aggregation - DCA2-AGGREG
• Expansion - DCA2-EXPAND

Supported DCA modules
Module type                             Server/drive types and quantities
Greenplum Database (GPDB)               Four Dragon 24 servers:
                                        • Compute: x24 300GB drives per server
                                        • Standard: x24 900GB drives per server
                                        • Compute High Memory: x24 300GB drives per server, 256GB of memory
                                        • Two Kylin servers: x6 300GB drives per server
Data Integration Accelerator (DIA)      One of these items:
                                        • Two Kylin servers: x6 300GB drives per server
                                        • Two Dragon 12 servers: x12 3TB drives per server
                                        • Two Dragon 24 servers: 256GB of memory
                                        • Two DIA High Memory servers: x24 300GB drives per server (256GB of memory)
Hadoop (HD) (master or worker)          Four Dragon 12 servers: x12 3TB drives per server
Hadoop Compute option (referred to      Two Kylin servers: x6 300GB drives per server
as HD-C module and used for Hadoop
with Isilon)

Racking order

All master nodes and switches are racked first. All other nodes are racked in the following order.
Table 11 Approved DCA Racking Sequence

SKU | Rack priority (when present)
Dragon 24, 900GB disks, 64GB RAM (100-585-031-07) | First
Dragon 24, 300GB disks, 64GB RAM (100-585-035-06) | Second
Dragon 12, 3TB disks, 64GB RAM; HDM, HDW, or DIA (100-585-030-06) | Third
Dragon 24, 900GB disks, 256GB RAM; SDW or DIA (100-585-055-01) | Fourth
Kylin, 64GB RAM; DIA, HDC, or HDM (100-585-029-05) | Fifth

Racking guidelines

• GPDB Compute, Standard, or High Memory modules must not occupy the same DCA.
• The minimum Hadoop configuration must include two Hadoop modules, one serving as the Hadoop Master module (hdm) and a second serving as the Hadoop Worker (data) module (hdw).
• For Hadoop Compute with Isilon, the minimum requirement is 8 Kylins (four 2-server Hadoop Compute modules).
• The 2nd rack (if present) is always an Aggregation rack.
• Racks 3 through 11 (if present) are Expansion racks.
• Any rack containing even one 100-585-055-01 is limited to thirty rack units for servers. Switches remain in the standard locations.
• Racks with High Memory servers should not exceed 30U.

Figure 39 11-rack configuration (System, Aggregation, and Expansion racks)

Figure 40 Aggregation switch locations in a multi-rack DCA

Mixed System rack components

Figure 41 Greenplum DCA2-SYSRACK

Table 12 Greenplum DCA2-SYSRACK - System rack components

DCA Component | Quantity
Hadoop Servers (Dragon 12, 2U) | 16 (8 minimum, 4 hdw + 4 hdm) or 12 High Memory Systems
Master Servers (Kylin, 1U) | 2 (1 Primary + 1 Standby)
GPDB (Segment) Servers (Dragon 24, 2U) | 16 or 12 High Memory Systems
Interconnect Switches (Arista 7050S-52) | 2
Administration Switches (Arista 7048T-A) | 1

Hadoop-only System Rack components (minimum config.)

Note: Supported in DCA version 2.0.1.0 and later.
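A version note such as the one above can be checked against the ISO_VERSION value in /opt/dca/etc/dca-build-info.txt (see "Identify the version of the installed DCA software"). The following is a minimal sketch, not an EMC-supplied tool: the function names dca_iso_version and version_ge are hypothetical, and the sketch assumes the file keeps the KEY="value" layout shown in that section.

```shell
#!/bin/bash
# Hypothetical helpers; they are not part of the DCA software.

# dca_iso_version FILE: print the ISO_VERSION value from a build-info file.
dca_iso_version() {
    sed -n 's/^ISO_VERSION="\(.*\)"$/\1/p' "$1"
}

# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
    local IFS=.
    local -a a=($1) b=($2)
    local i x y
    for i in 0 1 2 3; do
        x=${a[i]:-0}
        y=${b[i]:-0}
        if (( 10#$x > 10#$y )); then return 0; fi
        if (( 10#$x < 10#$y )); then return 1; fi
    done
    return 0
}

# Example: is this DCA at 2.0.1.0 or later (required for Hadoop-only racks)?
info=/opt/dca/etc/dca-build-info.txt
if [ -r "$info" ]; then
    ver=$(dca_iso_version "$info")
    if version_ge "$ver" 2.0.1.0; then
        echo "DCA $ver: 2.0.1.0 requirement met"
    else
        echo "DCA $ver: older than 2.0.1.0"
    fi
fi
```

The comparison is done field by field in plain bash so that the sketch does not depend on any particular version of sort or other utilities being installed on the Master server.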
Figure 42 Hadoop-only System rack

Table 13 Hadoop-only System Rack components

DCA Component | Quantity
Hadoop Master Servers (hdm) | 4 minimum
Hadoop Worker Servers (hdw) | 4 minimum
Master Servers (Kylin, 1U) | 2
Interconnect Switches (Arista 7050S-52) | 2
Administration Switch (Arista 7048T-A) | 1

HD-Compute System Rack components (minimum config.)

Note: Supported in DCA version 2.0.2.0 and later.

Figure 43 HD-Compute System rack

Table 14 HD-Compute System rack components

DCA Component | Quantity
Hadoop Compute Servers (hdc) | 8 minimum, 22 maximum

Aggregation rack components

Figure 44 Greenplum DCA2-AGGREG

Table 15 Greenplum DCA2-AGGREG - Aggregation rack components

DCA Component | Quantity
Segment Servers | 16 maximum (or 12 maximum with High Memory Modules)
Master Servers (Kylin, 1U) | 0
Interconnect Switches (Arista 7050S-52) | 4 (2 for the Interconnect network; 2 for the Aggregation network)
Administration Switch (Arista 7048T-A) | 1

Expansion rack components

Figure 45 Greenplum DCA2-EXPAND

Table 16 Greenplum DCA2-EXPAND - Expansion rack components

Component | Quantity
Segment Servers | 16 maximum (or 12 maximum with High Memory Modules)
Master Servers (Kylin, 1U) | 0
Interconnect Switches (Arista 7050S-52) | 2
Administration Switch (Arista 7048T-A) | 1

Power supply reference

Figure 46 shows four external customer-supplied power input circuits connected to DCA Power Distribution Units (PDUs). The figure shows a full System rack.
Figure 46 Greenplum DCA power cable configuration, full System rack (customer-supplied power to the upper and lower Zone A and Zone B inputs)

Figure 47 Greenplum DCA power cable configuration, 1/2 System rack (customer-supplied power to the upper and lower Zone A and Zone B inputs)

Figure 48 Greenplum DCA power cable configuration, 1/4 System rack (customer-supplied power to the lower Zone A and Zone B inputs; customer-supplied power is not needed to upper Zone A or upper Zone B in a 1/4 rack)

Figure 49 Dense rack configuration

Figure 50 High memory system rack configuration

BMC Controller interface functionality

The baseboard management controller (BMC) is a built-in interface included in most DCA servers. The BMC provides out-of-band system management facilities. The controller integrates its own processor, memory, battery, network connection, and access to the system bus.
Key features (available through a supported web browser) include:
• Power management
• Virtual media access
• Remote console capabilities

The BMC gives system administrators the ability to manage a machine as if they were sitting at the local console.

BMC Controller LED indicators and meanings

Table 17 lists BMC LED states and components to check.

Table 17 BMC LED indicator status and possible required action

Color | State | Criticality | Description
Green | Solid On | Normal | No action required by Field Support. BMC is operating in a healthy state.
Green | Blink (1 per second) | Degraded | Redundancy is lost or a non-critical warning/error. Check for these possible issues: redundancy loss such as power supply or fan; correctable ECC memory error; non-critical threshold crossed (temperature, voltage, input power).
Amber | Solid On | Non-critical | Non-fatal alarm. Check for critical thresholds surpassed on: temperature; voltage; input power; hard drives; fans (minimum number of fans not present).
Amber | Blink (1 per second) | Critical | Critical error. Check for: power fault; insufficient memory present; CPU thermal trip.

Network and cabling configurations

This section describes the network cabling configurations for the Interconnect and Administration switches.

Interconnect cabling reference

Each rack in the DCA contains two Interconnect switches, which provide the Greenplum Interconnect network.
Topics in this section include:
• “Lower Interconnect switch cabling reference”
• “Upper Interconnect switch cabling reference”
• “Dense rack switch cabling reference”
• “Dense rack Interconnect 2 configuration (dual NIC)”

Figure 51 Interconnect switch port map
• Ports 1-8: to servers sdw1-sdw8
• Ports 9-16: to servers sdw9-sdw16
• Port 17: to Primary Master Server (mdw)
• Port 18: to Standby Master Server (smdw)
• Ports 41-44: to customer network (single rack DCA only)
• Ports 45-46: to lower Aggregation switch (aggr-sw-1)
• Ports 47-48: to upper Aggregation switch (aggr-sw-2)
• mLAG peer connections: to the other Interconnect switch in the rack
• Serial console
• To Administration switch

Lower Interconnect switch cabling reference

The lower Interconnect switch connects servers to the first Interconnect. Lower Interconnect switches are always odd-numbered hostnames (for example, i-sw-1, i-sw-3, i-sw-5, etc.).

Figure 52 Lower Interconnect switch cabling reference

Upper Interconnect switch cabling reference

The upper Interconnect switch connects servers to the second Interconnect. Upper Interconnect switches are always even-numbered hostnames (for example, i-sw-2, i-sw-4, i-sw-6, etc.).
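The odd/even naming rule can be written as simple arithmetic. In the hostname tables of this appendix, rack N holds i-sw-(2N-1) as its lower Interconnect switch and i-sw-(2N) as its upper Interconnect switch. A minimal sketch follows; the function name interconnect_switches is hypothetical and is not a DCA utility.

```shell
# Hypothetical helper: print the lower and upper Interconnect switch
# hostnames for a given rack number (1 through 11).
interconnect_switches() {
    local rack=$1
    echo "i-sw-$(( 2 * rack - 1 )) i-sw-$(( 2 * rack ))"
}

# Rack 3 (the first Expansion rack): lower switch first, then upper switch.
interconnect_switches 3
```

This can help when deciding which switch a cable or a hostname in /etc/hosts belongs to before opening the rack.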
Figure 53 Upper Interconnect switch cabling reference

Dense rack switch cabling reference

Figure 54 Dense rack Interconnect 1 configuration (dual NIC)

Dense rack Interconnect 2 configuration (dual NIC)

Figure 55 Dense rack Interconnect 2 configuration (dual NIC)

Table 18 Interconnect switch cable routing, 3-rack DCA

Rack | SYS-RACK | SYS-RACK | AGGR-RACK | AGGR-RACK | EXPAND-RACK | EXPAND-RACK
IC switch port | i-sw-1 | i-sw-2 | i-sw-3 | i-sw-4 | i-sw-5 | i-sw-6
Connects to | Server CNA port 0 | Server CNA port 1 | Server CNA port 0 | Server CNA port 1 | Server CNA port 0 | Server CNA port 1
1 to 16 | server 1 to server 16 on all six switches (port N connects to server N)
17 | mdw | mdw | server 17 | server 17 | server 17 | server 17
18 | smdw | smdw | server 18 | server 18 | server 18 | server 18
19 | server 17 | server 17 | server 19 | server 19 | server 19 | server 19
20 | server 18 | server 18 | server 20 | server 20 | server 20 | server 20
21 | server 19 | server 19 | (none listed) | (none listed) | (none listed) | (none listed)
22 | server 20 | server 20 | (none listed) | (none listed) | (none listed) | (none listed)
23 to 40 | (no connections listed)
41 to 44 | Customer network (in single-rack DCA)
45 | aggr-sw-1 port 1 | aggr-sw-1 port 3 | aggr-sw-1 port 5 | aggr-sw-1 port 7 | aggr-sw-1 port 9 | aggr-sw-1 port 11
46 | aggr-sw-1 port 2 | aggr-sw-1 port 4 | aggr-sw-1 port 6 | aggr-sw-1 port 8 | aggr-sw-1 port 10 | aggr-sw-1 port 12
47 | aggr-sw-2 port 1 | aggr-sw-2 port 3 | aggr-sw-2 port 5 | aggr-sw-2 port 7 | aggr-sw-2 port 9 | aggr-sw-2 port 11
48 | aggr-sw-2 port 2 | aggr-sw-2 port 4 | aggr-sw-2 port 6 | aggr-sw-2 port 8 | aggr-sw-2 port 10 | aggr-sw-2 port 12
49 to 52 | mLAG peer link: i-sw-1 to i-sw-2 (SYS-RACK) | mLAG peer link: i-sw-3 to i-sw-4 (AGGR-RACK) | mLAG peer link: i-sw-5 to i-sw-6 (EXPAND-RACK)

Administration switch reference

The DCA contains one Administration switch per rack. The Administration switch routes management traffic, connects all of the servers and switches in a DCA, and provides service connectivity through a red service cable.
Topics in this section include:
• “Rack 1 Administration switch cabling reference”
• “Dense rack Administration switch port mapping reference”
• “Administration switch cabling routing reference”

Figure 56 Administration switch port map, single rack DCA
• Ports 1-8: to servers 1 through 8
• Ports 9-16: to servers 9 through 16
• Port 17: to Primary Master server (mdw)
• Port 18: to Standby Master server (smdw)
• Port 43: to lower Interconnect switch
• Port 44: to upper Interconnect switch
• Ports 45 and 46: to other Administration switches in a multi-rack DCA (a-sw-1 to a-sw-2)
• Port 48: red service cable for cluster management (a-sw-1 only)
• Serial console

Rack 1 Administration switch cabling reference

Figure 57 Rack 1 Administration switch cabling reference
• Port 47: Customer Admin network access (optional)
• Port 48: Cluster Management (red service cable)

Dense rack Administration switch port mapping reference

Figure 58 Dense rack Administration switch port mapping to servers 9 - 16

Administration switch cabling routing reference

Table 19 Administration switch cable routing
Admin Switch Port a-sw-1 in a-sw-2 in a-sw-3 in SYS-RACK AGGR-RACK EXPAND-RACK Admin Switch Port a-sw-1 in SYS-RACK a-sw-2 in AGGR-RACK 1 server 1 server 1 server 1 25 a-sw-3, port 45 a-sw-3, port 46 n/a 2 server 2 server 2 server 2 26 a-sw-4, port 45 a-sw-4, port 46 n/a 3 server 3 server 3 server 3 27 a-sw-5, port 45 a-sw-5, port 46 n/a 4 server 4 server 4 server 4 28 a-sw-6, port 45 a-sw-6, port 46 n/a 5 server 5 server 5 server 5 29 a-sw-7, port 45 a-sw-7,
port 46 n/a 6 server 6 server 6 server 6 30 a-sw-8, port 45 a-sw-8, port 46 n/a 7 server 7 server 7 server 7 31 a-sw-9, port 45 a-sw-9, port 46 n/a 8 server 8 server 8 server 8 32 a-sw-10, port 45 a-sw-10, port 46 n/a 9 server 9 server 9 server 9 33 a-sw-11, port 45 a-sw-11, port 46 n/a 10 server 10 server 10 server 10 34 a-sw-12, port 45 a-sw-12, port 46 n/a 11 server 11 server 11 server 11 35 n/a n/a n/a 12 server 12 server 12 server 12 36 n/a n/a n/a 13 server 13 server 13 server 13 37 n/a n/a n/a 14 server 14 server 14 server 14 38 n/a n/a n/a 15 server 15 server 15 server 15 39 n/a n/a n/a 16 server 16 server 16 server 16 40 n/a n/a n/a 17 mdw server 17 server 17 41 n/a n/a n/a 18 smdw server 18 server 18 42 n/a n/a n/a 19 server 17 server 19 server 19 43 Lower Interconnect switch <...> port 20 server 18 server 20 server 20 44 Upper Interconnect switch <...> port 21 server 19 — — 45 a-sw-2 peer a-sw-1 peer a-sw-1, port 25 22 server 20 — n/a 46 a-sw-2 peer a-sw-1 peer a-sw-2, port 25 23 — — n/a 47 Customer Admin network access (optional) 24 — — n/a 48 Cluster management (red service cable) Customer Admin network access (optional) n/a a-sw-3 in EXPAND-RACK n/a n/a Note: A dash (-) indicates cable connections that vary depending on the specific type(s) and quantity of servers and racks in the DCA. EMC Greenplum DCA Maintenance Guide Network and cabling configurations 162 EMC CONFIDENTIAL System Information and Configuration Aggregation switch reference Servers in a multiple-rack configuration communicate through the two Aggregation switches located in Rack 2. The following diagram and table show the proper connectivity. 
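As a cross-check for the Aggregation cabling in this appendix, note that the port pairs in Table 18 and Table 20 follow a regular pattern: on Interconnect switch i-sw-N, ports 45 and 46 connect to ports 2N-1 and 2N of aggr-sw-1, and ports 47 and 48 connect to the same port numbers on aggr-sw-2. The sketch below prints the four links for one switch. The function name aggr_uplinks is hypothetical, and the tables remain the authoritative reference.

```shell
# Hypothetical helper: list the Aggregation switch port for each uplink
# port (45 through 48) of Interconnect switch i-sw-N.
aggr_uplinks() {
    local n=$1
    local odd=$(( 2 * n - 1 )) even=$(( 2 * n ))
    echo "i-sw-$n port 45 <-> aggr-sw-1 port $odd"
    echo "i-sw-$n port 46 <-> aggr-sw-1 port $even"
    echo "i-sw-$n port 47 <-> aggr-sw-2 port $odd"
    echo "i-sw-$n port 48 <-> aggr-sw-2 port $even"
}

# Uplinks for the lower Interconnect switch in Rack 3.
aggr_uplinks 5
```

For i-sw-5 this gives aggr-sw-1 ports 9 and 10 and aggr-sw-2 ports 9 and 10, which matches the i-sw-5 column of Table 18.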
Figure 59 Aggregation switch port map

Interconnect switch-to-Aggregation switch port mapping

Table 20 Interconnect switch-to-Aggregation switch port mapping

The lower Aggregation switch (aggr-sw-1) and the upper Aggregation switch (aggr-sw-2) are both located in Rack 2 (the Aggregation rack). Each cell gives the Aggregation switch port that the listed Interconnect switch port connects to.

Rack | Interconnect switch | Port 45 to aggr-sw-1 port | Port 46 to aggr-sw-1 port | Port 47 to aggr-sw-2 port | Port 48 to aggr-sw-2 port
1 | Lower (i-sw-1) | 1 | 2 | 1 | 2
1 | Upper (i-sw-2) | 3 | 4 | 3 | 4
2 | Lower (i-sw-3) | 5 | 6 | 5 | 6
2 | Upper (i-sw-4) | 7 | 8 | 7 | 8
3 | Lower (i-sw-5) | 9 | 10 | 9 | 10
3 | Upper (i-sw-6) | 11 | 12 | 11 | 12
4 | Lower (i-sw-7) | 13 | 14 | 13 | 14
4 | Upper (i-sw-8) | 15 | 16 | 15 | 16
5 | Lower (i-sw-9) | 17 | 18 | 17 | 18
5 | Upper (i-sw-10) | 19 | 20 | 19 | 20
6 | Lower (i-sw-11) | 21 | 22 | 21 | 22
6 | Upper (i-sw-12) | 23 | 24 | 23 | 24
7 | Lower (i-sw-13) | 25 | 26 | 25 | 26
7 | Upper (i-sw-14) | 27 | 28 | 27 | 28
8 | Lower (i-sw-15) | 29 | 30 | 29 | 30
8 | Upper (i-sw-16) | 31 | 32 | 31 | 32
9 | Lower (i-sw-17) | 33 | 34 | 33 | 34
9 | Upper (i-sw-18) | 35 | 36 | 35 | 36
10 | Lower (i-sw-19) | 37 | 38 | 37 | 38
10 | Upper (i-sw-20) | 39 | 40 | 39 | 40
11 | Lower (i-sw-21) | 41 | 42 | 41 | 42
11 | Upper (i-sw-22) | 43 | 44 | 43 | 44

Network hostname and IP configuration

Table 21 DCA network configuration

Switches:

Rack | Component | hostname | IP address
Rack 1 | Administration Switch | a-sw-1 | 172.28.0.190
Rack 2 | Administration Switch | a-sw-2 | 172.28.0.191
Rack 3 | Administration Switch | a-sw-3 | 172.28.0.192
Rack 4 | Administration Switch | a-sw-4 | 172.28.0.193
Rack 5 | Administration Switch | a-sw-5 | 172.28.0.194
Rack 6 | Administration Switch | a-sw-6 | 172.28.0.195
Rack 7 | Administration Switch | a-sw-7 | 172.28.0.196
Rack 8 | Administration Switch | a-sw-8 | 172.28.0.197
Rack 9 | Administration Switch | a-sw-9 | 172.28.0.198
Rack 10 | Administration Switch | a-sw-10 | 172.28.0.199
Rack 11 | Administration Switch | a-sw-11 | 172.28.1.190
Rack 1 | Interconnect Switch, lower | i-sw-1 | 172.28.0.170
Rack 1 | Interconnect Switch, upper | i-sw-2 | 172.28.0.180
Rack 2 | Interconnect Switch, lower | i-sw-3 | 172.28.0.171
Rack 2 | Interconnect Switch, upper | i-sw-4 | 172.28.0.181
Rack 3 | Interconnect Switch, lower | i-sw-5 | 172.28.0.172
Rack 3 | Interconnect Switch, upper | i-sw-6 | 172.28.0.182
Rack 4 | Interconnect Switch, lower | i-sw-7 | 172.28.0.173
Rack 4 | Interconnect Switch, upper | i-sw-8 | 172.28.0.183
Rack 5 | Interconnect Switch, lower | i-sw-9 | 172.28.0.174
Rack 5 | Interconnect Switch, upper | i-sw-10 | 172.28.0.184
Rack 6 | Interconnect Switch, lower | i-sw-11 | 172.28.0.175
Rack 6 | Interconnect Switch, upper | i-sw-12 | 172.28.0.185
Rack 7 | Interconnect Switch, lower | i-sw-13 | 172.28.0.176
Rack 7 | Interconnect Switch, upper | i-sw-14 | 172.28.0.186
Rack 8 | Interconnect Switch, lower | i-sw-15 | 172.28.0.177
Rack 8 | Interconnect Switch, upper | i-sw-16 | 172.28.0.187
Rack 9 | Interconnect Switch, lower | i-sw-17 | 172.28.0.178
Rack 9 | Interconnect Switch, upper | i-sw-18 | 172.28.0.188
Rack 10 | Interconnect Switch, lower | i-sw-19 | 172.28.0.179
Rack 10 | Interconnect Switch, upper | i-sw-20 | 172.28.0.189
Rack 11 | Interconnect Switch, lower | i-sw-21 | 172.28.1.170
Rack 11 | Interconnect Switch, upper | i-sw-22 | 172.28.1.180
Rack 2 | Aggregation Switch, lower | aggr-sw-1 | 172.28.0.248
Rack 2 | Aggregation Switch, upper | aggr-sw-2 | 172.28.0.249

Servers:

Component | hostname | BMC IP (host-sp) | NIC 1 IP (host-cm) | Interconnect
Reserved for DHCP | n/a | n/a | 172.28.6.170 through 172.28.6.179 | n/a
Primary Master Server, lower server (Rack 1) | mdw | 172.28.0.250 | 172.28.4.250 | 172.28.8.250
Standby Master Server, upper server (Rack 1) | smdw | 172.28.0.251 | 172.28.4.251 | 172.28.8.251
GPDB Segment Server 1-160 | sdw# | 172.28.0.# | 172.28.4.# | 172.28.8.#
GPDB Segment Server 161-176 | sdw# | 172.28.1.1 through 172.28.1.16 | 172.28.5.1 through 172.28.5.16 | 172.28.9.1 through 172.28.9.16
DIA Server 1-16 | etl# | 172.28.0.20# | 172.28.4.20# | 172.28.8.20#
DIA Server 17-32 | etl# | 172.28.1.201 through 172.28.1.216 | 172.28.5.201 through 172.28.5.216 | 172.28.9.201 through 172.28.9.216
DIA Server 33-48 | etl# | 172.28.2.231 through 172.28.2.246 | 172.28.6.231 through 172.28.6.246 | 172.28.10.231 through 172.28.10.246
DIA Server 49-64 | etl# | 172.28.3.231 through 172.28.3.246 | 172.28.7.231 through 172.28.7.246 | 172.28.11.231 through 172.28.11.246
Hadoop Master Node 1 | hdm1 | 172.28.1.250 | 172.28.5.250 | 172.28.9.250
Hadoop Master Node 2 | hdm2 | 172.28.1.251 | 172.28.5.251 | 172.28.9.251
Hadoop Master Node 3 | hdm3 | 172.28.1.252 | 172.28.5.252 | 172.28.9.252
Hadoop Master Node 4 | hdm4 | 172.28.1.253 | 172.28.5.253 | 172.28.9.253
Hadoop Master Node 5 | hdm5 | 172.28.2.250 | 172.28.6.250 | 172.28.10.250
Hadoop Master Node 6 | hdm6 | 172.28.2.251 | 172.28.6.251 | 172.28.10.251
Hadoop Master Node 7 | hdm7 | 172.28.3.250 | 172.28.7.250 | 172.28.11.250
Hadoop Master Node 8 | hdm8 | 172.28.3.251 | 172.28.7.251 | 172.28.11.251
Hadoop Worker Node 1-160 | hdw1-160 | 172.28.2.# | 172.28.6.# | 172.28.10.#
Hadoop Worker Node 161-320 (see note 1) | hdw161-320 | 172.28.3.# (example: hdw162-sp = 172.28.3.2) | 172.28.7.# (example: hdw162-cm = 172.28.7.2) | 172.28.11.# (example: hdw162-1 = 172.28.11.2)
Hadoop Compute Node 1-60 | hdc1-60 | 172.28.2.170 through 172.28.2.229 | 172.28.6.170 through 172.28.6.229 | 172.28.10.170 through 172.28.10.229
Hadoop Compute Node 61-120 | hdc61-120 | 172.28.3.170 through 172.28.3.229 | 172.28.7.170 through 172.28.7.229 | 172.28.11.170 through 172.28.11.229
IP addresses reserved for Isilon | see note 2

For Hadoop Worker nodes 161-320, # = node number minus 160.

Notes:
1. Hadoop Worker nodes are numbered 1-320. To accommodate the required number of hosts, the third IP address octet is incremented by 1 and the fourth octet restarts at 1 when the node number reaches 161. For example, the host hdw160-sp uses a third octet of 2 and a fourth octet of 160; the host hdw161-sp uses a third octet of 3 and a fourth octet of 1. To see a complete list of IP addresses and hostnames, view the /etc/hosts file.
2. 172.28.8.217 through 172.28.8.246 and 172.28.9.217 through 172.28.9.246

Multiple-rack cabling reference

Table 22 Cabling kit contents and part numbers

Table 23 Cable kits for a 7-to-11-rack DCA

Connect from | To | Use cable kit
Rack 2 - AGGREG | Rack 1 - SYSRACK | DCA2-CBL10
Rack 2 - AGGREG | Rack 2 - AGGREG | DCA2-CBL10
Rack 2 - AGGREG | Rack 3 - 1st EXPAND | DCA2-CBL10
Rack 2 - AGGREG | Rack 4 - 2nd EXPAND | DCA2-CBL10
Rack 2 - AGGREG | Rack 5 - 3rd EXPAND | DCA2-CBL10
Rack 2 - AGGREG | Rack 6 - 4th EXPAND | DCA2-CBL10
Rack 2 - AGGREG | Rack 7 - 5th EXPAND | DCA2-CBL30
Rack 2 - AGGREG | Rack 8 - 6th EXPAND | DCA2-CBL30
Rack 2 - AGGREG | Rack 9 - 7th EXPAND | DCA2-CBL30
Rack 2 - AGGREG | Rack 10 - 8th EXPAND | DCA2-CBL30
Rack 2 - AGGREG | Rack 11 - 9th EXPAND | DCA2-CBL30

Configuration files

Configuration files are text files that contain the hostnames of servers that occupy quarter, half, or full rack configurations. The file used depends on the desired function. Refer to the table below for a description of each configuration and host file.
The hostfiles are located at /home/gpadmin/gpconfigs:

Table 24 Hostfiles created by the DCA Setup utility

File | Description
gpexpand_map | Expansion MAP file created during the dca_setup option Expand the DCA. Its purpose is to reallocate primary and mirror instances on the new hardware during GPDB expansion.
gpinitsystem_map | MAP file used during installation of GPDB blocks to assign primary and mirror segments to each server.
hostfile | Contains one hostname per server for ALL servers in the system. Includes GPDB, DIA, and HD (if present).
hostfile_segments | Contains the hostnames of the segment servers of all GPDB blocks.
hostfile_gpdb | Contains the hostnames of the GPDB servers.
hostfile_dia | Contains the hostnames of the DIA servers.
hostfile_hadoop | Contains the hostnames of the Hadoop servers.
hostfile_hdm | Contains the hostnames of all Hadoop Master servers.
hostfile_hdw | Contains the hostnames of all Hadoop Worker servers.
hostfile_hdc | Contains the hostnames of all Hadoop Compute servers.

Location of old core files
(Applies to DCA version 2.0.1.0 and later)

Old core files are moved automatically to a separate directory to prevent them from being sent to Support following a healthmon restart. For example, for sdw1, old core files are moved to /var/crash/user.

[root@sdw1 user-processed]# ls -l /var/crash/user

Default passwords

The following table lists default passwords for all the components in a DCA.
Table 25 Default user names and passwords

Component | User | Password
Master Servers | root | changeme
Master Servers | gpadmin | changeme
BMC | root user | For a new unconfigured server: password. For an existing configured server: sephiroth
Interconnect, Administration, and Aggregation switches | admin | changeme

APPENDIX B
Connect a workstation to the DCA

This section describes how to connect a workstation to the DCA in preparation for performing various maintenance tasks. Administration is always performed from the Primary Master server (hostname mdw). A Windows laptop with the PuTTY application installed is required.

Laptop prerequisites

The laptop you use to connect to the Greenplum DCA must have the following capabilities in order to perform Greenplum DCA administration:
• RJ-45 Ethernet port
• Administrator access on the laptop
• An ssh client such as PuTTY or Cygwin with the OpenSSH package enabled
• An scp client such as WinSCP, PuTTY PSCP, or Cygwin with the OpenSSH package enabled

Configure your laptop to connect to the DCA

Perform the appropriate procedure to configure your laptop to connect to the DCA Administration Network.

Configure a Windows 7 laptop

1. Locate the red service cable on the laptop tray. The cable is connected to port 48 of the first Administration switch (a-sw-1). Connect the service cable to your laptop.
2. Click Start > Control Panel > Network and Internet > Network and Sharing Center.
3. On the left pane, click Change adapter settings.
4. Right-click the connection that you want to change, and then click Properties. If you are prompted for an administrator password or confirmation, type the password or provide confirmation.
5. From the Networking tab, select Internet Protocol Version 4 (TCP/IPv4), and then click Properties.
6. Click Properties.
7. Select Use the following IP address, and then type the following IP address and subnet mask:
• IP address: 172.28.3.253
• Subnet mask: 255.255.248.0
Note: Leave the Default gateway field blank. Do not configure a gateway.
8. Click OK.
9. Click Close.

Configure a Windows XP laptop

1. Locate the red service cable on the laptop tray. The cable is connected to port 48 of the first Administration switch (a-sw-1). Connect the service cable to your laptop.
2. On your Windows laptop, open Control Panel.
3. Double-click Network Connections.
4. Right-click Local Area Connection, and then select Properties.
5. Select Internet Protocol (TCP/IP), and then click Properties.
6. Enter the IP address and subnet mask:
• IP address: 172.28.3.253
• Subnet mask: 255.255.248.0
Note: Leave the Default gateway field blank. Do not configure a gateway.
7. Click OK.

Connect to the Master Server using an SSH client

The method you use to establish an ssh connection to the Master Server depends on your chosen ssh client (PuTTY, Cygwin, etc.). Regardless of the ssh client, connect using the following values:
• hostname: 172.28.4.250
• username: root
• root password: changeme (or whatever the customer’s root password is)

PuTTY example

1. Open PuTTY and enter 172.28.4.250 in the Host Name (or IP address) field. Select SSH as the Connection type.
2. Click Open.
3. If this is the first time you have connected to this server, a security alert is displayed. Click Yes to continue.
4. At the SSH Login window, enter your username and password. For example:

login as: root
root@172.28.4.250 password: changeme

Cygwin example

To use Cygwin, you must have enabled the OpenSSH package when you installed Cygwin. Open a Cygwin terminal window and type the following at the prompt:

$ ssh root@172.28.4.250

When prompted, type the root password (the default is changeme).

Copy a file to the Master Server using an SCP client

The method you use to copy a file from your local laptop to the Greenplum Master Server depends on your chosen scp client (WinSCP, Cygwin, etc.). Regardless of the scp client, connect using the following values:
• hostname: 172.28.4.250
• username: gpadmin
• gpadmin password: changeme (or whatever the customer’s gpadmin password is)
• Destination on the master: /home/gpadmin

WinSCP example

1. Log in to the master host IP 172.28.4.250 as user gpadmin. Select SFTP as the File protocol.
2. On your local host, locate the file you want to copy, and then choose the /home/gpadmin directory on the Master Server.
3. Click Copy.

Cygwin example

1. To use Cygwin, you must have enabled the OpenSSH package when you installed Cygwin. Open a Cygwin terminal window and type the following at the prompt, where local_directory and filename stand for the location and name of the file to copy:

$ cd local_directory
$ scp filename gpadmin@172.28.4.250:/home/gpadmin

2. When prompted, type the gpadmin password (the default is changeme).

Connect to an Interconnect or Administration switch using PuTTY

This section describes how to connect your service laptop to a serial port on an Interconnect or Administration switch. You must perform this procedure if the switch contains factory settings or cannot be accessed by telnet or ssh through the DCA Administration network.

1. Connect one end of a serial cable to the serial port on the switch. Connect the other end of the cable to your workstation.
Note: If the service laptop or workstation does not have a serial port, you can use a USB-to-Serial Adapter.
2. Launch the PuTTY application.
3.
Select Serial in Basic Options under the Session section. Serial option in a PuTTY session EMC Greenplum DCA Maintenance Guide Connect to an Interconnect or Administration switch using PuTTY 181 EMC CONFIDENTIAL Connect a workstation to the DCA 4. Expand the connection section and select Serial. Verify that the settings for the COM port are 9600 Baud, 8 data bits, and no hardware flow control. 5. Click Open to connect. 6. Press to display the login prompt. EMC Greenplum DCA Maintenance Guide Connect to an Interconnect or Administration switch using PuTTY 182 EMC CONFIDENTIAL APPENDIX C Power Off the DCA To safely shut down and power off Greenplum DCA hardware and software, perform the following tasks in sequence: Task 1: Connect to the Greenplum DCA Master Server............................................ 184 Task 2: Stop the Greenplum Database software and shut down the OS.................. 185 Task 3: Place the PDU power switches in the OFF position ..................................... 187 IMPORTANT Stop all running queries and data loading before you power down the DCA. EMC Greenplum DCA Maintenance Guide Power Off the DCA 183 EMC CONFIDENTIAL Power Off the DCA Task 1: Connect to the Greenplum DCA Master Server The fastest method to shut down a DCA is to SSH in to a Master Server through an external network connection. If the external conection is not available and you have a service laptop, connect to the DCA as described in this procedure. This procedure assumes you are using the Windows Operating System. 1. Locate the system rack of the DCA. The systm rack contains the Primary and Standby Master servers. Master servers are highlighted in red in Figure 60. Figure 60 Master Servers in the System rack 2. Locate the red service cable on the laptop tray and connect it to your laptop. The red service cable is connected to port 48 on the Administration switch. 
3. From your Windows laptop, navigate to Start > Control Panel > Network and Internet > Network and Sharing Center.
4. On the left pane, click Change adapter settings.
5. Right-click Local Area Connection and select Properties.
6. From the Networking tab, select Internet Protocol Version 4 (TCP/IPv4).
7. Click Properties.
8. Select Use the following IP address, and then enter the following IP address and subnet mask:
   • IP address: 172.28.3.253
   • Subnet mask: 255.255.248.0
9. Click OK.
10. Click Close.
11. Open an SSH client (such as PuTTY) and enter:
   • Host Name (or IP address): 172.28.4.250
   • Connection type: SSH
12. Click Open. If this is the first time you have connected to this server, a security alert is displayed.
13. Click Yes to continue.
14. Log in as the user root with password changeme. If the default password changeme was changed, enter the current password.

Task 2: Stop the Greenplum Database software and shut down the OS

To ensure data consistency across primary and mirror segments, you must stop the Greenplum Database software correctly.

1. To prevent false dial home messages from being sent to EMC Support during service, disable health monitoring by stopping the healthmon daemon:
   # dca_healthmon_ctl -d
2. Switch to the user gpadmin:
   # su - gpadmin
3. When prompted for the password, enter changeme. If the default password changeme was changed, enter the current password.
4. Stop the Greenplum Database:
   $ gpstop -af
5. Stop Greenplum Command Center:
   $ gpcmdr --stop
6. Switch to the user root:
   $ su -
7. Start the DCA Shutdown utility. Issuing the shutdown command immediately shuts down the DCA. Make sure that you are ready to shut down the DCA before you issue this command.
   # dca_shutdown
8. Verify that the green LED on the power button on each server turns off after 1-2 minutes (see Figure 61 and Figure 62).
9. If a server does not power off, power it off manually by pressing the power button.

Figure 61 Location of power button on a GPDB server (applies also to Hadoop Masters & Workers)
Figure 62 Location of power button on Master, DIA, and Hadoop Compute servers

Task 3: Place the PDU power switches in the OFF position

When the Greenplum Database is stopped and the operating system is shut down on each server, it is safe to power off the system using the eight PDU power switches in each rack.

1. Starting from the rear of the System rack (Rack 1), locate the power switches in the upper and lower Power Zones A and B (see Figure 63).
   Figure 63 Rack power switch locations (power switches and customer-supplied power inputs for Upper Zone A, Upper Zone B, Lower Zone A, and Lower Zone B)
2. First place the power switches in lower Power Zones A and B in the OFF position, and then place the power switches in upper Power Zones A and B in the OFF position.
3. Power off the remaining racks in the same way, one rack at a time, first placing the power switches in the lower zone and then the upper zone in the OFF position.

After a few seconds, there should be no lit LEDs on any components in the system. Shutdown is complete.
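The command order in Task 2 matters: health monitoring is stopped first, the database and Command Center are stopped as gpadmin, and dca_shutdown is issued last as root. The following is a minimal dry-run sketch of that order. It only prints the plan and runs none of the commands; the function names print_step and print_shutdown_plan are illustrative and are not DCA utilities.

```shell
#!/bin/sh
# Dry-run sketch of the Task 2 command order. It only PRINTS the plan;
# no DCA command is executed. print_step and print_shutdown_plan are
# illustrative names, not part of the DCA software.

print_step() {
    # $1 = step number, $2 = user that runs the command, $3 = command
    printf '%s. as %s: %s\n' "$1" "$2" "$3"
}

print_shutdown_plan() {
    # Stop health monitoring first to avoid false dial home messages.
    print_step 1 root "dca_healthmon_ctl -d"
    # Stop the Greenplum Database, then Greenplum Command Center.
    print_step 2 gpadmin "gpstop -af"
    print_step 3 gpadmin "gpcmdr --stop"
    # Last step: this command shuts down the DCA immediately.
    print_step 4 root "dca_shutdown"
}

print_shutdown_plan
```

Reading the printed plan back before starting the procedure is one way to confirm that queries and data loads are stopped and that the irreversible step comes last.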
APPENDIX D Linux and vi Command Reference

This appendix is a quick reference of basic Red Hat Linux commands, Greenplum-specific Linux commands, and common vi text editor commands.

Common Linux command reference

Table 26 Common Linux commands

Linux command | Description

Moving Around
/ | refers to the root directory
.. | refers to the parent directory
Up/down arrows | recalls the previous (up arrow) or next (down arrow) command you typed
pwd | displays the current directory
cd name | changes to the named directory
cd | returns you to your home directory

Basic Commands
ls | lists the contents of the current directory
ls dir_name | lists the contents of the named directory
ls -l | lists the contents of the current directory in long format; this includes file permissions, ownership information, and file size
ls -a | lists all the files in the current directory, including files that start with a period (".")
cat filename | prints the content of the named file to the screen
more filename | prints the content of the named file to the screen one page at a time, with basic scrolling and search facilities
cp source destination | copies the source file to the named destination. For example, cp /misc/temp . copies a file called temp located in the misc directory to the current directory (".")
mv source destination | moves the source file or directory to the named destination. For example, mv /misc/temp . moves a file called temp located in the misc directory to the current directory (".")
rm filename | deletes (removes) the named file
mkdir dir_name | creates a new directory
rmdir dir_name | removes the specified directory (the directory must be empty)
source filename | reads and runs the commands in the named file in the current shell (for example, to load path information)
su | temporarily assumes the superuser (root) identity; useful for system administration tasks
tar | creates or extracts (untars) tape archive files, which may also be compressed
unzip | extracts compressed files from a ZIP archive
grep string filename | prints all the lines in a file that contain the specified string
passwd | allows you to change the password used to access your user account. You are prompted to enter your current password, then enter a new one.
who | displays a list of users currently logged on to this computer

Getting Help
man command | displays the manual page (man) for the specified command, including possible options and switches and more detailed information about using that command

Shutting down and rebooting a Linux machine
/sbin/shutdown -r now | reboots the machine immediately
/sbin/shutdown -h now | shuts down the machine immediately

Greenplum Linux Commands
gpcheck | verifies and validates Greenplum Database platform settings
gpexpand | expands an existing Greenplum Database across new hosts in the array
gpinitsystem | initializes a Greenplum Database system by using configuration parameters specified in the gp_init_config file
gpinitstandby | adds and/or initializes a standby master host for a Greenplum Database system
gpseginstall | installs the Greenplum Database software on multiple hosts
gpscp | copies files between multiple hosts at the same time
gpssh | provides ssh access to multiple hosts at the same time
gpssh-exkeys | exchanges SSH public keys between hosts
gpstate | shows the status of a running Greenplum Database system, including the master server

vi Quick Reference

The following is a quick reference for the vi editor.

Table 27 Common vi commands

vi command | Description

Inserting/Deleting Text (to exit insert mode, press the [ESC] key)
a | append text, after the cursor
i | insert text, before the cursor
R | enter overtype mode
x | delete character
dd | delete current line

Moving Cursor
h, [BACKSPACE] | left one character
l, [SPACE] | right one character
w | forward one word
b | back one word
e | end of word
j | down one line
k | up one line
?pattern | search backward for pattern
/pattern | search forward for pattern
n | repeat last search
N | repeat last search in the opposite direction

Saving File and Exiting
:wq | save file and quit
:q! | force quit the editor, do not save changes

APPENDIX E Replace a Server in the Greenplum DCA Rack

This appendix describes how to replace the servers in a DCA 40U rack. Server types include:
• 1U servers: Master, DIA, and Hadoop Compute servers (EMC SVR-I1U-1208)
• 2U servers: GPDB (EMC SVR-I2U-R2224) and Hadoop Master and Worker servers (EMC SVR-I2U-R2312)

This appendix includes the following sections:
• Mounting kit parts
• Task 1: Remove the server from the rack
• Task 2: Remove the inner rails from the original server
• Task 3: Attach the inner rails to the replacement server
• Task 4: Install the server in the rack

Mounting kit parts

The server mounting kit includes rails and screws as listed in the following table. Verify that these parts are included with the replacement server.
Component | Use
2 universal rail assemblies (each consists of a slide rail for connection to the rack and an inner rail for connection to the server) | Attach back to front on either side between rack channels
Four Phillips pan-head 8-32 x 0.35 in screws | Stabilize the server and rail mounting

You need a #2 Phillips-head screwdriver to complete the installation of the rails and server.

Task 1: Remove the server from the rack

The enclosure is heavy and should be installed into or removed from a rack by two people. To avoid personal injury and/or damage to the equipment, do not attempt to lift and install the enclosure into a rack without a mechanical lift and/or help from another person.

Procedure:

IMPORTANT: When removing the server from the rack, do not hold the server up by its power/control module, which is on the right side of the front of the server.

1. Unplug all power and I/O cables from the back of the server, and label the cables so you can easily identify them when you need to connect them to the replacement server.
2. Remove the stabilizing screw behind the latch bracket on each side (Figure 64).
   Figure 64 Remove the stabilizer screws
3. Pull the server forward until it locks in place (Figure 65).
   Figure 65 Slide server out of the rack to the locked position
4. Slide the blue disconnect tabs forward to release the inner rails from the slide rails (Figure 66). Once you release the inner rails from the slide rails, you must support the full weight of the server.
5. Be prepared to support the full weight of the server, and then slowly pull the server forward and remove it from the rack (Figure 66).
   Figure 66 Release the inner rail locks and remove the server from the rack

Task 2: Remove the inner rails from the original server

1. On the middle of the inner rail, push in and hold the metal latch.
2. Push the rail forward to release the connection studs from the small end of the rail notches.
3. When the connection studs are in the large end of the rail notches, release the metal latch.
4. Pull the inner rails away from the server.

Figure 67 Release the inner rails from the server

Task 3: Attach the inner rails to the replacement server

1. Align the large end of the rail notches on the inner rail with the connection studs on the side of the server.
2. Push the flat side of the inner rail onto the connection studs.
3. Slide the inner rail backwards along the server until the studs fit securely into the small end of the rail notches. An audible click indicates that the rail is secure.

Figure 68 Attach an inner rail to the server

Task 4: Install the server in the rack

The enclosure is heavy and should be installed into or removed from a rack by two people. To avoid personal injury and/or damage to the equipment, do not attempt to lift and install the enclosure into a rack without a mechanical lift and/or help from another person.

Procedure:

IMPORTANT: When installing the server in the rack, do not pick the server up by the rotating power console on the front right side of the server and do not push on the power console.

1. On each slide rail, bring the ball bearing retainer assembly fully to the front, so it rides onto the security knob (Figure 69).
   Figure 69 Correct location for ball bearing retainer assembly
2. From the front of the rack, align the inner rails attached to the server with the white plastic guide block at the front inside of each slide rail (Figure 70).
   Note: For clarity, Figure 70 shows the inner rail without the server attached.
   Figure 70 Align the inner rail with white plastic guide block
3. Slide the server into the chassis so the inner rails extend over the plastic guide blocks and the first part of the ball bearing retainer assemblies (Figure 71).
   Note: For clarity, Figure 71 shows the inner rail without the server attached.
   Figure 71 Inner rail over the first part of ball bearing retainer assembly
4. Once the inner rails are properly engaged with the ball bearing retainer assemblies, push the server into the rack until the slide rails are engaged and locked. An audible click indicates that the slide rails are engaged and locked.
5. On the outside of each rail assembly, slide the blue disconnect tab forward to unlock the server, and push the server completely into the rack (Figure 72).
   Figure 72 Inserting the server completely into the rack
6. To further secure the rail assembly and server in the rack, insert and tighten a small stabilizer screw directly behind each bezel latch (Figure 73).
   Figure 73 Installing the stabilizer screws
7. Reconnect data and power cables as described in the server replacement procedure.
APPENDIX F Install a Switch in a Rack

This appendix describes how to replace the switches in a DCA 40U rack. It includes the following major sections:
• Switch mounting kit parts
• Replace the switch in the rack
• Replace an optical SFP module

Switch types include:
• Interconnect and Aggregation (10GB; SWCH-AR1U-7050S-52)
• Administration (1GB; SWCH-AR1U-7048T)

Switch mounting kit parts

The switch mounting kit includes rail assemblies and screws as listed below.

Component | Use
2 rail assemblies (each consists of an outer rail for connection to the rack and an inner rail for connection to the switch) | Attach back to front on either side between rack channels and to the switch
Eight Phillips pan-head M4 x 6 mm screws | Attach inner rails to switch (4 per rail)
Six Phillips pan-head M5 x 16 mm screws | Stabilize the rail mounting (3 per rail)

You need a Phillips-head screwdriver to complete the installation of the rails and switch.

Replace the switch in the rack

Replacement of non-FRU switch components by unauthorized personnel may void service warranties. If any non-FRU component fails, you must replace the entire switch.
Replacing the switch consists of the following steps:
• Task 1: Unpack the replacement switch
• Task 2: Remove the old switch from the rack
• Task 3: Transfer any optical SFP modules
• Task 4: Transfer the inner rails to the replacement switch
• Task 5: Attach the outer rails to the rack (if necessary)
• Task 6: Install the switch in the rack

Task 1: Unpack the replacement switch

Unpack the replacement switch and place it on a clean, static-free surface near the rack with the faulted switch.

Task 2: Remove the old switch from the rack

1. From the back of the rack:
   a. Unplug all cables from the switch, and label the cables for easy identification when you need to plug them into the replacement switch.
   b. Unplug both switch power cords from the rack's power distribution unit(s).
2. From the front of the rack:
   a. Unplug both power cords from the switch.
   b. Push the end of each power cord through the large hole in the front of each rail assembly so you will be able to slide the switch forward.
   c. Remove the middle stabilizing screw from the front of each rail assembly (Figure 74).
   d. Pull the switch out of the rack and place it near the replacement switch (Figure 75).

Figure 74 Removing the middle stabilizer screws
Figure 75 Removing the switch from the rack

Task 3: Transfer any optical SFP modules

You must transfer any optical SFP modules from the switch ports in the faulted switch to the same numbered switch ports in the replacement switch. For each optical SFP module in a port on the faulted switch:

1. Remove the optical SFP module (Figure 76):
   a. On the SFP module, gently pull down on the spring release latch.
   b. While still holding onto the latch, gently pull out the SFP module.
   Figure 76 Removing an optical SFP module from a switch port
2. On the replacement switch, install the optical SFP module in the port with the same number as the port from which you removed it (Figure 77):
   a. On the SFP module, push the spring release latch up.
   b. Align the SFP module with the switch port.
   c. Slide the SFP module into the switch port until it is securely connected.
   Figure 77 Installing an optical SFP module in a switch port

Task 4: Transfer the inner rails to the replacement switch

Transfer the inner rails from the faulted switch to the replacement switch. Each rail assembly consists of an inner rail and an outer rail.

1. Unscrew the four screws attaching each inner rail to the faulted switch.
2. Attach an inner rail to each side of the replacement switch:
   a. Slide the rail sections apart to separate the inner rail from the outer rail (Figure 78).
      Figure 78 Removing the inner rail from the outer rail
   b. Align the labelled holes on the inner rail with the holes on the side of the replacement switch, and secure the inner rail to the replacement switch with four M4 x 6 mm screws (Figure 79).
      Figure 79 Attaching an inner rail to the switch

Task 5: Attach the outer rails to the rack (if necessary)

In most service situations, the outer rails are already attached to the rack and you do not have to perform this procedure. It is provided here mainly for reference.

1. Attach a switch power cord to each outer rail (Figure 80):
   a. At the rear of the outer rail (the end with the alignment pins), feed the male (prong) end of a switch power cord through the small hole on the outer rail from the outside to the inside of the rail.
   b. Pull enough of the power cord through the hole to allow the cable to be plugged into the AC power outlet in the rack.
   c. Attach the cord loosely to the rail with plastic ties. Anchor the plastic ties through the metal loops on the outside of the rail. The outside of the rail is the side with the two posts.
   Figure 80 Attaching the power cord to the outer rail
2. Attach the outer rails to the rack channels:
   a. From the front of the rack, align the rail alignment posts with the rear channel holes for the selected 1U (1.75 in) of rack space, and insert the rail alignment posts securely into the holes (Figure 81).
      Figure 81 Inserting the rail alignment posts in the rear channel holes
   b. Secure the rail to the front channel with two small stabilizer screws in the top and bottom holes, leaving the screws slightly loose (Figure 82).
      Figure 82 Securing the rails to the front channel

Task 6: Install the switch in the rack

1. Install the switch in the rack (Figure 83):
   a. At the front of the rack, align the rails attached to the switch with the channels of the outer rails.
   b. Slide the switch into the outer rails and push the switch into the rack.
   Figure 83 Installing the switch in the rack
2. Secure the rails in the rack by threading a small stabilizer screw through the front rack channel and into the middle hole of each rail (Figure 84).
   Figure 84 Installing the middle stabilizer screws
3. Firmly tighten the three small stabilizing screws that you previously installed on the front of each rail.
4. At the front of the rack, feed the female end of each switch power cord through the large hole in each rail assembly, and plug the cord into a power connector on the switch (Figure 85). One cord runs to the Power Zone A PDU and the other to the Power Zone B PDU.
   Figure 85 Plugging the switch power cords into the switch
5. At the back of the rack, attach the required power and Ethernet cables as described in “Replace a Switch in the DCA” on page 119.

Replace an optical SFP module

1. Unpack the replacement optical SFP module and place it on a clean, static-free surface near the switch.
2. Identify the faulted optical SFP module in the switch. Consult your product documentation for information on identifying a faulted SFP module.
3. Remove the faulted optical SFP module (Figure 86):
   a. If a cable is connected to the SFP module, disconnect it.
   b. On the SFP module, gently pull down on the spring release latch.
   c. While still holding onto the latch, gently pull out the SFP module.
   Figure 86 Removing an optical SFP module from a switch port
4. Install the replacement optical SFP module (Figure 87):
   a. On the replacement SFP module, push the spring release latch up.
   b. Align the replacement SFP module with the switch port that contained the faulted module.
   c. Slide the SFP module into the switch port until it is securely connected.
   Figure 87 Installing an optical SFP module in a switch port

APPENDIX G Switch Configuration: Backup and Recovery

This appendix explains how to export the current configurations of all switches in the cluster from the DCA, and how to recover a switch from those backups. All switches must be accessible and have ssh keys exchanged. If this procedure fails, report the failure as a new support ticket.
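The recovery procedure later in this appendix copies a saved startup configuration from a Master server to each switch, one switch at a time. As a rough illustration of how that per-switch copy command is put together, the sketch below prints the command for two switch hostnames that appear in this guide (i-sw-1 and a-sw-1). It assumes the backups were written to /root on the Primary Master (172.28.4.250) and follow the [switch].startup_config naming shown in the procedure. The helper build_copy_cmd is hypothetical, not a DCA or switch utility, and the script only prints text; the printed command is typed at the switch prompt.

```shell
#!/bin/sh
# Sketch: print the switch-side copy command used during recovery.
# build_copy_cmd is a hypothetical helper; nothing is sent to any switch.

MASTER_IP="172.28.4.250"   # Primary Master (mdw) address used in this guide
BACKUP_DIR="/root"         # backup location entered in dca_setup (assumed)

build_copy_cmd() {
    # $1 = switch hostname, for example i-sw-1
    printf 'copy scp://root@%s%s/%s.startup_config startup-config\n' \
        "$MASTER_IP" "$BACKUP_DIR" "$1"
}

# Print one command per switch, prefixed with that switch's prompt.
for sw in i-sw-1 a-sw-1; do
    printf '%s# %s\n' "$sw" "$(build_copy_cmd "$sw")"
done
```

If the backups were saved somewhere other than /root on mdw, MASTER_IP and BACKUP_DIR would need to change to match.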
Create Two Files for Switch Recovery

First, create two files for each switch as follows.

1. Log into the master as root.
2. Run the command dca_setup.
3. Select options 2, 13, 4.
4. Enter /root as the location for the switch backups.
5. Verify that the /root folder on the master node now contains two files for each switch: the running configuration and the startup configuration. These files can be used to recover the current configuration later.

Recover the Switch Configurations

Complete these steps to recover the previous configuration of a switch.

1. Log into the switch as user admin:
   # ssh admin@[switch hostname]
   For example, to connect to the lowest switch in the first rack:
   # ssh admin@i-sw-1
2. Copy the correct switch configuration to the startup configuration with the copy command:
   [switch]#copy scp://root@172.28.4.250/root/[switch].startup_config startup-config
   Note: If the switches were backed up to the standby master, use 172.28.4.251 instead of 172.28.4.250.
   For example, if uploading to i-sw-1 from a file stored in /root on mdw:
   i-sw-1#copy scp://root@172.28.4.250/root/i-sw-1.startup_config startup-config
3. Type the root password at the prompt. The startup-config is updated:
   root@172.28.4.250's password:
   i-sw-1.startup_config 100% 8302 8.1KB/s 00:00
4. Type the reload command to reload the switch. If prompted to save changes, answer no, because saving overwrites the startup-config with the contents of the running-config. The switch reboots and comes up with the recovered configuration.
5. Repeat these steps for each switch to be recovered.

APPENDIX H DCA Part Numbers

This appendix lists the part numbers for all field replaceable units (FRUs) in a DCA. Refer to this appendix when ordering replacement parts.
Table 28 DCA Server replacement part numbers and specifications (page 1 of 2) Module Type and EMC Internal Server Name Official Description Part Number (SKU) Disks Memory Volume Name DIA Module (Kylin) PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM 100-585-029-xx (covers each Rev from -01 thru -xx) 6x300GB 64GB etl1, …. HD-Compute Module (Kylin) PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM 100-585-029-xx (covers each Rev from -01 thru -xx) 6x300GB 64GB hdc1, ... Master Server (Kylin) PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM 100-585-029-xx (covers each Rev from -01 thru -xx) 6x300GB 64GB mdw, smdw *DIA 3TB Disk Module (Dragon 12) PV V2 SERVER D2U-12 3TB HDD 64GB MEM 100-585-030-xx (covers each Rev from -01 thru -xx) 12x3TB 64GB etl1, … Hadoop (HD) Master Module (Dragon 12) PV V2 SERVER D2U-12 3TB HDD 64GB MEM 100-585-030-xx (covers each Rev from -01 thru -xx) 12x3TB 64GB hdm1, hdm2, hdm3, hdm4 Hadoop (HD) Data Module (Dragon 12) PV V2 SERVER D2U-12 3TB HDD 64GB MEM 100-585-030-xx (covers each Rev from -01 thru -xx) 12x3TB 64GB hdw1, … GPDB UAP Standard Module (Dragon 24) PV V2 SERVER D2U-24 900GB HDD 64GB MEM 100-585-031-xx (covers each Rev from -01 thru -xx) 24x900GB 64GB sdw1, … *Requires DCA software 2.0.2.0 or greater EMC Greenplum DCA Maintenance Guide DCA Part Numbers 212 EMC CONFIDENTIAL DCA Part Numbers Table 28 DCA Server replacement part numbers and specifications (page 2 of 2) Module Type and EMC Internal Server Name Memory Volume Name Official Description Part Number (SKU) Disks GPDB UAP Compute Module (Dragon 24) PV V2 SERVER D2U-24 300GB HDD 64GB MEM 100-585-035-xx (covers each Rev from -01 thru -xx) 24x300GB 64GB sdw1, … *Master Server with additional NIC (Kylin) PV V2 SVR 2NIC C1U-6 300GB HDD 64GB MEM 100-585-049-xx (covers each Rev from -01 thru -xx) 6x300GB mdw, smdw PV V2 SERVER D2U-24 300GB HDD 256GB MEM 100-585-055-xx (covers each Rev from -01 thru -xx) 24x300GB 256GB 64GB *Requires DCA software 2.0.2.0 or greater *GPDB Memory Module (Dragon 24) sdw1, … 
Table 29 Additional DCA FRU part numbers

Part Number                    | Description | Official Description
-------------------------------|-------------|---------------------
100-585-043, 100-585-062 (sub) | 10GB Ethernet Switch | ARISTA 7050S-52 10GB ETHERNET SWITCH
100-585-045, 100-585-063 (sub) | 1GB Ethernet Switch | ARISTA 7048T-A 1GB ETHERNET SWITCH
105-000-244                    | 750W Power Supply | INTEL 750W POWER SUPPLY ROMLEY
100-585-043                    | Interconnect / Aggregation Switch, 52-port | ARISTA 7050S-52 10GB ETHERNET SWITCH
100-585-062                    | Interconnect / Aggregation Switch, 52-port | ARISTA 7050S-52 10GB ETHERNET SWITCH (CCC certified)
100-585-045                    | Administration Switch, 48-port | ARISTA 7048T-A 1GB ETHERNET SWITCH
100-585-063                    | Administration Switch, 48-port | ARISTA 7048T-A 1GB ETHERNET SWITCH (CCC certified)
100-585-048                    | Arista 10GBASE-SRL SFP+ OPTIC MODULE | ARISTA 10GBASE-SRL SFP+ OPTIC MODULE
105-000-313                    | Fan assembly, Arista switch | ARISTA FAN ASSEMBLY FOR 7048T, 7050S SWITCH
105-000-314                    | Power supply, Arista switch | ARISTA POWER SUPPLY, 460W AC, FOR 7048T, 7050S SWITCH
105-000-222                    | Disk drive assembly for Hadoop Masters & Workers | INTEL DISK ASSEMBLY/3.5” SATA/3TB/7.2K/512BPS
105-000-237                    | Disk drive assembly for GPDB Compute, Master servers, DIA server, Hadoop Compute server | INTEL DISK ASSEMBLY/2.5” SAS/300GB/10K/512BPS
105-000-228                    | Disk drive assembly for GPDB Standard server | INTEL DISK ASSEMBLY/2.5” SAS/900GB/10K/512BPS
105-000-244                    | Power supply for Masters, GPDB Standard & Capacity, DIA, Hadoop Masters & Workers | INTEL 750W POWER SUPPLY ROMLEY
100-563-477                    | Power Distribution Unit (PDU) | PDU: TITAN-D RACK:SINGLE PHASE
038-004-176                    | 1 Meter Interconnect Cable | ACTIVE SFP+ TO SFP+ 1M 8G/10G CABLE
038-004-177                    | 3 Meter Interconnect Cable | ACTIVE SFP+ TO SFP+ 3M 8G/10G CABLE
038-004-186                    |  | 12 INCH PDU JUMPER CABLE
038-003-733                    | 10 Meter Optical Cable | 10M OM3 LC to LC 50μm OPTICAL CABLE
038-003-347                    | 30 Meter Optical Cable | 30m LC to LC OPTICAL 50 MICRON MM CABLE ASSEMBLIES
038-004-224                    |  | PWR CORD 24A SP 15FT 56PA332 BL 4PPP
038-004-293                    |  | CBL, 15 FT SINGLE PHASE, GRAY, N. AMERICA
038-004-294                    |  | CBL, 15 FT SINGLE PHASE, GRAY, IEC, PIN & SLEEVE
038-004-295                    |  | CBL, 15 FT SINGLE PHASE, GRAY, AUSTRALIA
038-004-296                    |  | CBL, 15 FT SINGLE PHASE, GRAY, RUSSELLSTOLL 3750DP
038-004-223                    |  | SINGLE POWER INLET CORD OPTION IEC-309-332P6 INTERNATIONAL 15'
038-004-222                    |  | SINGLE POWER INLET CORD OPTION WITH HUBBELL L6-30P CONNECTOR, NORTH AMERICA/JAPAN 15'
038-004-228                    |  | HUBBELL L6-30R to RUSSELLSTOLL 3750DP CABLE 15'
038-003-888                    | Service Cable - Administration Switch-to-Laptop | ETHERNET CABLE, 71 INCHES, RED
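When ordering a replacement server, the host name (the Volume Name column of Table 28) narrows down which server part numbers can apply. The following is a minimal shell sketch of that lookup, built only from the Table 28 entries. The function name dca_server_skus is hypothetical and is not part of the DCA software. Several module types share a host name prefix (for example, an sdw host can be a Standard, Compute, or Memory module), so the function prints candidates only. Confirm the actual module type on the hardware before ordering.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the DCA): print the candidate
# server part numbers from Table 28 for a given host name.
dca_server_skus() {
    host="$1"
    case "$host" in
        # Master Server (1 NIC) or Master Server with additional NIC
        mdw|smdw)            echo "100-585-029-xx 100-585-049-xx" ;;
        # DIA Module or DIA 3TB Disk Module
        etl[0-9]*)           echo "100-585-029-xx 100-585-030-xx" ;;
        # HD-Compute Module
        hdc[0-9]*)           echo "100-585-029-xx" ;;
        # Hadoop (HD) Master Module or Hadoop (HD) Data Module
        hdm[0-9]*|hdw[0-9]*) echo "100-585-030-xx" ;;
        # GPDB UAP Standard, Compute, or Memory Module
        sdw[0-9]*)           echo "100-585-031-xx 100-585-035-xx 100-585-055-xx" ;;
        *)                   echo "unknown host name: $host" >&2; return 1 ;;
    esac
}
```

For example, after sourcing the function, dca_server_skus smdw prints the two Master Server part numbers, and an unrecognized host name returns a non-zero status.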