EMC CONFIDENTIAL

EMC® Greenplum
Data Computing Appliance
Appliance Version 2.0.0.0/2.0.1.0/2.0.2.0/2.0.3.0

Maintenance Guide
REV 11


Copyright © 2014 EMC Corporation. All rights reserved. Published in the USA.
Published April, 2014
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries.
All other trademarks used herein are the property of their respective owners.
For the most up-to-date regulatory document for your product line, go to the technical documentation and advisories section on the
EMC online support website.


CONTENTS

Chapter 1  Important Information Before You Begin ........ 6
New firmware updates in support of DCA software version 2.0.3.0 ........ 6
Identify the version of the installed DCA software ........ 7
Avoid electrostatic discharge damage (ESD) ........ 7
Handling field replaceable units (FRUs) ........ 8

Chapter 2  Replace a Master Server ........ 10
Required tools ........ 10
Task summary ........ 11
Service tag location ........ 13
Replace the Primary Master server ........ 14
Replace the Standby Master server ........ 23
Identifying a single-NIC master versus a dual-NIC master in a DCAv2 ........ 31
Replace a Master server in a DCA without a Greenplum database ........ 31

Chapter 3  Replace a Segment, DIA, or Hadoop server ........ 39
Required tools ........ 39
Task summary ........ 40
Service tag locations ........ 41
Reseat cables before replacing a server ........ 42
Replace a server in an initialized GPDB module ........ 44
Replace a DIA server or a server in an uninitialized GPDB module ........ 51
Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0) ........ 58
Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later) ........ 64
Remove the failed PHD server and install the replacement PHD server ........ 64
  Replace hdm1 (namenode, DCA version 2.0.1.0) ........ 68
  Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0) ........ 72
  Replace hdm3 (resourcemanager, DCA version 2.0.1.0) ........ 76
  Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0) ........ 79
  Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0) ........ 82

Chapter 4  Replace a Disk Drive ........ 86
Hot spare drives and the Copyback operation ........ 86
Replace a disk drive in a Master, DIA, or Hadoop Compute server ........ 87
Replace a drive in a Segment Server ........ 91
Replace a drive in an Hadoop server ........ 96
  Replace a drive in a Hadoop Master server ........ 97
  Replace a drive in a Hadoop Worker server ........ 101

Chapter 5  Replace a Power Supply in a Server ........ 110
Power supply LEDs ........ 110
Replace a power supply in a server ........ 110

Chapter 6  Replace a Fan Assembly or Power Supply in an Arista Switch ........ 113
Replace a Fan Assembly in an Arista Switch ........ 113
  Fan Assembly Replacement Order Information ........ 113
  Tools ........ 113
  Identify the Failed Fan Assembly ........ 114
  Remove the Failed Fan Assembly and Install the Replacement Part ........ 115
  Parts Return ........ 115
Replace a Power Supply in an Arista Switch ........ 116
  Power Supply Assembly Replacement Order Information ........ 116
  Tools ........ 116
  Identify the Failed Power Supply ........ 117
  Remove the Failed Power Supply and Install the Replacement Part ........ 118
  Parts Return ........ 118

Chapter 7  Replace a Switch in the DCA ........ 119
Requirements ........ 120
Switch hostnames and IP addresses ........ 120
Replace an Arista 7050S Interconnect or Aggregation Switch ........ 122
Replace an Arista 7048T Administration Switch ........ 127

Chapter 8  Replace an Interconnect Switch Cable ........ 134

Appendix A  System Information and Configuration ........ 137
New firmware updates for DCA software version 2.0.3.0 ........ 137
Identify the version of the installed DCA software ........ 138
DCA configuration rules ........ 139
Racking order ........ 139
Racking guidelines ........ 140
Mixed System rack components ........ 141
Hadoop-only System Rack components (minimum config.) ........ 142
HD-Compute System Rack components (minimum config.) ........ 143
Aggregation rack components ........ 144
Expansion rack components ........ 145
Power supply reference ........ 146
BMC Controller interface functionality ........ 151
BMC Controller LED indicators and meanings ........ 151
Network and cabling configurations ........ 152
  Interconnect cabling reference ........ 152
  Administration switch reference ........ 159
  Aggregation switch reference ........ 163
Network hostname and IP configuration ........ 170
Multiple-rack cabling reference ........ 173
Configuration files ........ 174
Location of old core files ........ 174
Default passwords ........ 175

Appendix B  Connect a workstation to the DCA ........ 176
Laptop prerequisites ........ 176
Configure your laptop to connect to the DCA ........ 176
  Configure a Windows 7 laptop ........ 176
  Configure a Windows XP laptop ........ 178
Connect to the Master Server using an SSH client ........ 178
Copy a file to the Master Server using an SCP client ........ 179
Connect to an Interconnect or Administration switch using PuTTY ........ 181

Appendix C  Power Off the DCA ........ 183

Appendix D  Linux and vi Command Reference ........ 189
Common Linux command reference ........ 189
vi Quick Reference ........ 191

Appendix E  Replace a Server in the Greenplum DCA Rack ........ 192
Mounting kit parts ........ 192

Appendix F  Install a Switch in a Rack ........ 200
Switch mounting kit parts ........ 201
Replace the switch in the rack ........ 201
Replace an optical SFP module ........ 208

Appendix G  Switch Configuration: Backup and Recovery ........ 210
Create Two Files for Switch Recovery ........ 210
Recover the Switch Configurations ........ 210

Appendix H  DCA Part Numbers ........ 212

CHAPTER 1
Important Information Before You Begin
For detailed descriptions of DCA components and configurations, see Appendix A,
“System Information and Configuration,” on page 137.
This chapter includes the following major sections:

• New firmware updates in support of DCA software version 2.0.3.0 ........ 6
• Identify the version of the installed DCA software ........ 7
• Avoid electrostatic discharge damage (ESD) ........ 7

New firmware updates in support of DCA software version 2.0.3.0
Customers can apply optional firmware updates prior to upgrading to DCA software
version 2.0.3.0 as follows:

Arista 7050S-52 and Arista 7048T switches
• New firmware version EOS-4.9.8.swi
• Field personnel can access the EOS-4.9.8.swi.zip firmware upgrade package from:
ftp://ftp.aristanetworks.com/emc/certifiedeos/EOS-4.9.8.swi
Field personnel can obtain the following document, available on
http://support.emc.com, for step-by-step instructions:
EMC Greenplum DCA Firmware Upgrade Instructions for the Interconnect Switch
(Arista 7050S-52) and Administration Switch (Arista 7048T)

Intel Servers (Kylin with eight drives, Dragon 12 with twelve drives, and Dragon 24
with 24 drives)
• New BIOS upgrade revision level SE5C600.86B.02.01.0002
• Field personnel can access both the BIOS upgrade package and the
EMC Greenplum DCA Intel BIOS Upgrade Instructions for Intel Servers from
http://support.emc.com.


Identify the version of the installed DCA software
The replacement procedures in this guide pertain only to DCA clusters running software
version 2.0.x.x. DCA documentation is tied to a specific version of the DCA software.
Before beginning any replacement procedure on a DCA, make sure that the version of the
software running on the clusters matches the version of the guide that you are using.
1. Log in to the Primary Master server as the user root (see “Connect a workstation to
the DCA” on page 176).
2. View the contents of the /opt/dca/etc/dca-build-info.txt file.
Verify that the ISO_VERSION value is 2.0.x.x:
# cat /opt/dca/etc/dca-build-info.txt
ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"

If the version of the software is not 2.0.x.x, go to the EMC Online Support site at
http://support.emc.com. From the Support by Product pages, search for Greenplum
Data Computing Appliance and obtain the documentation that matches the software
version running on the Primary Master server.
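The version check above can also be scripted. The following Python sketch is illustrative only and is not part of the DCA software; the function names are hypothetical, and it assumes the file keeps the KEY="value" layout shown in the sample output.

```python
import re

def parse_build_info(text):
    """Parse KEY="value" lines (as in dca-build-info.txt) into a dict."""
    info = {}
    for line in text.splitlines():
        match = re.match(r'^\s*([A-Z_]+)="?(.*?)"?\s*$', line)
        if match:
            info[match.group(1)] = match.group(2)
    return info

def guide_applies(text):
    """Return True when ISO_VERSION has the form 2.0.x.x."""
    version = parse_build_info(text).get("ISO_VERSION", "")
    return re.fullmatch(r"2\.0\.\d+\.\d+", version) is not None

# Sample contents copied from the example output in this section.
SAMPLE = '''ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"
'''
```

To use it, pass the text of /opt/dca/etc/dca-build-info.txt to guide_applies(); a False result means this guide does not match the installed software version.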

Avoid electrostatic discharge damage (ESD)
When you replace and install field replaceable units (FRUs), you can inadvertently damage
sensitive electronic circuits in the equipment simply by touching them. Electrostatic
charge that has accumulated on your body can discharge through the circuits. If the air in
the work area is dry, running a humidifier in the work area can help decrease the risk of
electrostatic discharge (ESD) damage.
Read and understand the following guidelines:

• Provide enough room to work on the equipment. Clear the work site of any
unnecessary materials, especially materials that naturally build up electrostatic
charge such as foam packaging, foam cups, cellophane wrappers, and similar items.

• Do not remove replacement or upgrade FRUs from their antistatic packaging until you
are ready to install them.

• Set up your EMC-issued ESD kit and all other materials that you need before servicing
a Greenplum system. Once you begin service, avoid moving away from the work site;
otherwise, your body can build up an electrostatic charge.

• Use the ESD kit when handling system components.

• Wear an ESD wristband. Attach the clip of the ESD wristband to any bare (unpainted)
metal in the bay, and then place the wristband around your wrist with the metal
button against your skin.


Handling field replaceable units (FRUs)
This section describes the precautions that you must take and the general procedures
that you must follow when removing and storing any field replaceable unit (FRU). The only
FRUs in the server are the disk drive assemblies and power supply modules. Depending
on the product in which the server is used, the disk drive assemblies may be
hot-swappable; that is, you can replace a disk drive assembly while the server is running.
To determine if disk drive assemblies are hot-swappable, refer to your product
documentation. Regardless of the product in which the server is used, the power supply
modules are hot-swappable; that is, you can replace a power supply module while the
server is running.
Do not remove a faulty FRU until you have a replacement available.
When you replace FRUs, you can inadvertently damage the sensitive electronic circuits in
the equipment simply by touching them. Electrostatic charge that has accumulated
on your body discharges through the circuits. If the air in the work area is very dry, running
a humidifier in the work area helps decrease the risk of electrostatic discharge (ESD)
damage. Follow the procedures below to prevent damage to the equipment.

• Provide enough room to work on the equipment. Clear the work site of any
unnecessary materials or materials that naturally build up electrostatic charge, such
as foam packaging, foam cups, cellophane wrappers, and similar items.

• Do not remove replacement or upgrade FRUs from their antistatic packaging until you
are ready to install them.

• Before you service a server, gather together the ESD kit and all other materials you will
need. Once servicing begins, avoid moving away from the work site; otherwise, you
may build up an electrostatic charge.

• Use the ESD kit when handling any FRU. Use an ESD wristband. To use the ESD
wristband (strap), attach the clip of the wristband to any bare (unpainted) metal on
the server; then put the wristband around your wrist with the metal button against
your skin. If an emergency arises and the ESD kit is not available, follow the
procedures in “Emergency procedures (without an ESD kit)” on page 8.

Emergency procedures (without an ESD kit)
In an emergency when an ESD kit is not available, use the procedures below to reduce the
possibility of an electrostatic discharge by ensuring that your body and the subassembly
are at the same electrostatic potential. These procedures are not a substitute for the use
of an ESD kit. Follow them only in the event of an emergency.

• Before touching any FRU, touch a bare (unpainted) metal surface of the cabinet or
server.

• Before removing any FRU from its antistatic bag, place one hand firmly on a bare metal
surface of the server, and at the same time, pick up the FRU while it is still sealed in
the antistatic bag. Once you have done this do not move around the room or touch
other furnishings, personnel, or surfaces until you have installed the FRU.

• When you remove a FRU from the antistatic bag, avoid touching any electronic
components and circuits on it.

• If you must move around the room or touch other surfaces before installing a FRU, first
place the FRU back in the antistatic bag. When you are ready again to install the FRU,
repeat these procedures.


CHAPTER 2
Replace a Master Server
This chapter describes how to replace a Primary or Standby Master server in
a GPDB-only DCA, a mixed DCA, or a Hadoop-only DCA.


(Applies only to version 2.0.1.0 and later)
Additional steps are required if you are servicing a Hadoop-only DCA. Look for the
following notice in the left margin:

**If you are servicing a Hadoop-only DCA**
Topics include:

• Required tools ........ 10
• Task summary ........ 11
• Service tag location ........ 13
• Replace the Primary Master server ........ 14
• Replace the Standby Master server ........ 23
• Replace a Master server in a DCA without a Greenplum database ........ 31

Required tools
You need the following tools to remove and replace a server:

• #2 Phillips screwdriver
• Wrist grounding strap


Task summary
Table 1 Summary of Master server replacement tasks
Tasks

Primary or Standby Master in a DCA
with no initialized GP database

Primary Master

Standby Master

x

x

x

Check with the customer if any custom
configurations have been applied.

x

x

x

Disable health monitoring.

x

x

x

Check Master server sync state.

x

x

Check Greenplum database for errors.

x

x

If necessary, initiate an orchestrated failover from
the Primary server to the Standby server.

x

x

Check BIOS version when replacing a
Master server in the cluster
When installing a replacement server, identify the
BIOS version on the new server (as well as the
versions already running in the DCA). Then
upgrade so that all servers reflect the same
firmware levels.
Go to http://support.emc.com to obtain the
pertinent BIOS upgrade instructions. The upgrade
instructions provide information on how to access
and install the upgrade package.

**If you are replacing a Primary Master in a
Hadoop-only DCA**: include the argument
--deletevip
Verify the success of the failover.

x

Power off the failed server.

x

x

x

Label then remove cables from the failed server.

x

x

x

Install the replacement server.

x

x

x

Transfer drives from the failed server to the
replacement server.

x

x

x

Connect cables to the replacement server (but do
not power it on yet).

x

x

x

Configure the BMC IP address.

x

x

x

Power on the replacement server.

x

x

x

Import foreign disk configurations.

x

x

x

Check the health of the replacement server.

x

x

x

Exchange SSH keys.

x

x

Initialize the replacement server as the acting
Standby Master server (temporarily).

x


Table 1 Summary of Master server replacement tasks
Tasks
Initiate a failover from the replacement server (the
acting Standby Master server).

Primary Master

Standby Master

Primary or Standby Master in a DCA
with no initialized GP database

x

**If you are replacing a Primary Master in a
Hadoop-only DCA**: include the argument
--deletevip
Revert the smdw server to its former standby role.

x

**If you are replacing a Primary or Standby Master
in a Hadoop-only DCA** Recover and rebalance the

x

x

Synchronize the system clock.

x

x

Ask the customer if there are any custom
configurations that must be reapplied to the DCA
(for example, NFS mounts or gateways).

x

x

Re-enable health monitoring.

x

x

GPDB segment instances on the Primary Master
server.


x


Service tag location
When replacing any hardware component, it is important that you properly debrief the
part. The serial number of a Master server is located on the blue service label affixed to
the front left corner of the server.

Figure 1 Service tag location on the Master server (Dragon24 shown)


Replace the Primary Master server
Perform this procedure if the Primary Master server has failed or is failing.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. You may want to consult “Task summary” on page 11 for an overview of the Master
server replacement procedures.
2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray. The red service cable is connected to port 48 on the
Administration switch in the SYSRACK. For instructions on how to configure the IP
address on your laptop, see “Connect a workstation to the DCA” on page 176.
3. If the Primary Master server is still accessible through SSH, perform step a through
step e below. If the Primary Master server is not accessible through SSH, skip to
step 4.
a. Log in to the Primary Master server as the user root (see “Connect a workstation
to the DCA” on page 176).
b. Activate the server identification LED.
# dca_blinker -h mdw -a ON

c. Switch to the user gpadmin and determine whether the Primary and Standby
Master servers are synced:
# su - gpadmin
$ gpstate -f

If the output returns the status synchronized, the master servers are in sync. If
synchronized is not returned in the output or the database is not running, do
not replace the Primary Master server. Contact EMC Support.
d. Switch to the user root and make note of any custom NFS mounts the customer
may have created:
$ su -
# cat /etc/fstab

e. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

4. Before you replace the failed Primary Master server, perform the sub-steps below to
determine whether an automatic failover occurred when the Primary Master failed. If
an automatic failover did not occur, you must initiate an orchestrated (manual)
failover before you replace the failed server (see step 5 below). To determine whether
an automated failover occurred, do the following:
a. Start the DCA Setup utility as the user root:
# dca_setup


b. Select option 2 to Modify DCA Settings.
c. Note the text of option 19:
– If the text is Disable Master Server Auto Failover (currently
enabled), an automatic failover occurred when the Primary Master failed.
Skip to step 6 to determine if the failover was successful.
– If the text is Enable Master Server Auto Failover (currently
disabled), you must initiate an orchestrated (manual) failover as described
in step 5.
d. Enter X to exit the DCA Setup utility.
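The decision in step 4c can be summarized in a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and it assumes the option 19 text contains one of the two phrases quoted above.

```python
def next_step_for_option_19(option_text):
    """Map the text of DCA Setup option 19 to the next step of this procedure.

    Returns 6 when auto failover is enabled (an automatic failover occurred)
    and 5 when it is disabled (an orchestrated failover is required).
    """
    # Collapse line wraps and extra spaces before matching.
    text = " ".join(option_text.split()).lower()
    if "currently enabled" in text:
        return 6
    if "currently disabled" in text:
        return 5
    raise ValueError("Unrecognized option 19 text: %r" % option_text)
```

For example, passing "Enable Master Server Auto Failover (currently disabled)" returns 5, which corresponds to the orchestrated (manual) failover in step 5.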
5. If an automatic failover did occur, proceed to step 6. If an automatic failover did not
occur, initiate an orchestrated (manual) failover as follows:
a. From the Standby Master server, issue the dca_failover command:
# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10
--gateway 10.10.10.1 --netmask 255.255.255.0

**If you are servicing a Hadoop-only DCA**

In a Hadoop-only DCA, make sure to include the option --deletevip:
# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10
--gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip

b. Replace the values shown in bold above with the IP, Gateway, and Netmask of the
virtual IP address provided by the customer. If the customer has not specified a
virtual IP address, do not include the --vip, --gateway, and --netmask
parameters. Wait for the prompt to appear indicating that the failover has
completed before you continue.
c. When the failover has completed, proceed to step 6 to determine if the failover
operation was successful.
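The rules in step 5 for assembling the dca_failover command line can be sketched as follows. The helper is hypothetical and only builds the argument list from the options shown in this step; it does not run the command, and it assumes that the three virtual IP parameters are always supplied together.

```python
def build_failover_command(vip=None, gateway=None, netmask=None, hadoop_only=False):
    """Build the dca_failover argument list described in step 5.

    The --vip, --gateway, and --netmask options are included only when the
    customer has specified a virtual IP address. --deletevip is appended
    for a Hadoop-only DCA.
    """
    args = ["dca_failover", "--stopmasterdb", "--noninteractive"]
    vip_values = (vip, gateway, netmask)
    if any(vip_values):
        if not all(vip_values):
            # Assumption: a partial virtual IP specification is an error.
            raise ValueError("vip, gateway, and netmask must be given together")
        args += ["--vip", vip, "--gateway", gateway, "--netmask", netmask]
    if hadoop_only:
        args.append("--deletevip")
    return args
```

Joining the returned list with spaces for the example values above (10.10.10.10, 10.10.10.1, 255.255.255.0) reproduces the command lines shown in step 5a.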
6. To determine whether the Master server failover operation was successful, switch to
the user gpadmin and issue the following command. Verify that the text in bold is
returned in the output:
# su - gpadmin
$ gpstate -f
[INFO]:-Starting gpstate with args: -f
[INFO]:-local Greenplum Version: 'postgres (Greenplum Database)
4.1.1.3 build 4'
[INFO]:-Obtaining Segment details from master...
[INFO]:-Standby master instance not configured

7. Check the Greenplum Database for errors. If any errors are returned in the output, you
must resolve them before you continue with this procedure:
$ gpstate -e
[INFO]:-----------------------------------------------------
[INFO]:-Segment Mirroring Status Report
[INFO]:-----------------------------------------------------
[INFO]:-All segments are running normally
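If you capture the gpstate output from steps 6 and 7 as text, the two checks can be expressed as simple string tests. The following Python sketch is illustrative only; the function names are hypothetical, and the matched phrases are taken from the example output in this procedure, so other gpstate versions may word them differently.

```python
def _normalize(output):
    """Collapse line wraps and repeated spaces so phrases match reliably."""
    return " ".join(output.split())

def failover_succeeded(gpstate_f_output):
    """Step 6: after the failover, gpstate -f reports no standby master."""
    return "Standby master instance not configured" in _normalize(gpstate_f_output)

def segments_healthy(gpstate_e_output):
    """Step 7: gpstate -e reports that all segments are running normally."""
    return "All segments are running normally" in _normalize(gpstate_e_output)
```

A False result from either function means that you should stop and resolve the problem before continuing with the replacement.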

8. To prevent false dial home messages from being sent to EMC Support during service,
log in to the Standby Master server as the user root and stop the healthmon
daemon to disable health monitoring:
$ su -
# dca_healthmon_ctl -d

9. Shut down the Primary Master server:
• If the failed Primary Master server is accessible through SSH, log in to it as the
user root and issue the shutdown command.
IMPORTANT
Check the prompt to make sure that you are on the Primary Master (mdw) before
you issue the shutdown command!
$ ssh root@mdw
# shutdown -h now

• If the failed server is not accessible through SSH, power it off by pressing the
power button on the front of the server for 5 seconds (see Figure 2 below).
Power button

Figure 2 Power button location on Master server

10. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
11. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
12. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
13. Transfer disk drives one at a time from the failed server to the replacement server.
IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
14. Connect Ethernet and twin-axial cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
15. From the Standby Master server, start the dhcpd service as the user root:
# service dhcpd start

16. Connect the power cables to the replacement server.


17. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays output similar to the following:
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 3 Locating the MAC address on the service tag (Primary Master server shown, Dragon24)

e. Compare the last two digits of the MAC addresses referenced in step c and step d
(for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two
digits of the MAC address in the dhcpd.leases file are four greater than the last two
digits of the MAC address on the replacement server’s service tag.
If this is the case, the IP address in the dhcpd.leases file is the correct one to
associate with the server. In the example above, the check confirms that
172.28.6.170 is the correct address.
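The comparison in steps c through e can be sketched in Python as shown below. This is a hedged illustration rather than an EMC tool: the function names are hypothetical, the last two digits are assumed to be hexadecimal, and the sketch does not handle a last octet that wraps past FF.

```python
import re

def parse_leases(text):
    """Return (ip, mac) pairs found in dhcpd.leases-style text."""
    leases = []
    current_ip = None
    for raw in text.splitlines():
        line = raw.strip()
        lease = re.match(r"lease\s+(\S+)\s*\{", line)
        if lease:
            current_ip = lease.group(1)
            continue
        hw = re.match(r"hardware ethernet\s+([0-9A-Fa-f:]+);", line)
        if hw and current_ip:
            leases.append((current_ip, hw.group(1).lower()))
    return leases

def last_octet(mac):
    """Return the last two digits of a MAC address as an integer."""
    return int(mac.split(":")[-1], 16)

def find_lease_for_tag(text, tag_mac, offset=4):
    """Return the leased IP whose MAC ends `offset` above the service tag MAC."""
    for ip, mac in parse_leases(text):
        if last_octet(mac) - last_octet(tag_mac) == offset:
            return ip
    return None

# Lease entry copied from the example in step b.
SAMPLE_LEASE = """lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
binding state active;
hardware ethernet 00:00:00:00:00:04;
"""
```

With the sample lease and the service tag value 00:00:00:00:00:00 from step d, find_lease_for_tag() returns the address to use in the ipmiutil command that follows; a result of None means that no lease matched and the addresses should be checked again by hand.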


f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
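The MAC address comparison in step e can also be scripted. The following is a minimal sketch and not part of the DCA tooling: the function name mac_offset_ok is hypothetical, and it assumes that the two addresses share their first five octets and that the address in the lease file is exactly four higher (in hexadecimal) in the last octet than MAC1 on the service tag. It does not handle a carry into the fifth octet.

```shell
# Hypothetical helper: succeed (exit status 0) when the lease MAC equals the
# service-tag MAC plus 4 in the last octet (hexadecimal).
mac_offset_ok() {
    # Normalize both addresses to lower case before comparing.
    lease_mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')   # "hardware ethernet" value
    tag_mac=$(printf '%s' "$2" | tr 'A-F' 'a-f')     # MAC1 from the service tag

    # Assumption: the first five octets must be identical.
    [ "${lease_mac%:*}" = "${tag_mac%:*}" ] || return 1

    # Convert the last octet of each address from hexadecimal to decimal.
    lease_last=$(printf '%d' "0x${lease_mac##*:}") || return 1
    tag_last=$(printf '%d' "0x${tag_mac##*:}") || return 1

    [ $((lease_last - tag_last)) -eq 4 ]
}

# Values taken from the example in steps c and d.
mac_offset_ok 00:00:00:00:00:04 00:00:00:00:00:00 && echo "lease matches service tag"
```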
18. Turn off the dhcpd service:
# service dhcpd stop

19. From the Standby Master server (smdw) as the user root, issue the following command
to open a BMC console session on the replacement Master server mdw:
# ipmiutil sol -a -e -N mdw-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
20. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
21. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

22. If the message below appears you will need to power off the server, verify that all LED
lights are off, and repeat steps 20 through 21.
CLIENT MAC ADDR: 00 1E 67 4D C5 1D  GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

23. Monitor the boot process onscreen and verify that the system boots from hard disk.
If the system does boot from hard disk, proceed to step 24.
If the system does not boot from hard disk, perform the following sub-steps to force it
to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
smdw as the user root.
b. Issue the following command from smdw to force the appliance to boot from hard
drive:
# ipmiutil reset -h -N mdw-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on mdw:
# ssh mdw
# syscfg /bbo "emcbios" HDD NW

d. Reboot mdw:
# reboot

e. Following the reboot, issue the following commands to connect to mdw and verify
the boot order:
# ssh mdw
# syscfg /bbosys

f. Exit mdw:
# exit

g. Proceed to step 24. (You have already exited the BMC console in sub-step a
above.)

24. From the Standby Master server (smdw) check the health of the replacement server:
# dcacheck -h mdw

Verify that no errors display.
25. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware level of the RAID controllers with the CmdTool2 utility:
# ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the user root, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log in to the server as root.
j. Use scp to copy the MR56p.rom file from the Master server to the server you are
updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server:
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
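The firmware check above can be wrapped in a small filter when several servers must be checked. This is a minimal sketch only: the function name fw_needs_update is hypothetical, and it flags only the two builds that this procedure names as outdated (23.7.0-0033 and 23.9.0-0026). Any other build string is treated as not requiring the update described here.

```shell
# Hypothetical helper: read CmdTool2 output on standard input and succeed
# (exit status 0) when a controller reports one of the two outdated builds.
fw_needs_update() {
    grep -Eq 'FW Package Build: *(23\.7\.0-0033|23\.9\.0-0026)'
}

# Example use from a Master server (command as shown in the step above):
#   ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | fw_needs_update \
#       && echo "mdw: firmware update required"
```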
26. Switch to the user gpadmin and issue the following commands to initialize the
replacement server as the acting Standby Master server:
# su - gpadmin
$ ssh mdw rm -r /data/master/*
$ gpinitstandby -s mdw

27. At the message Do you want to continue with standby master
initialization? enter Y to continue.
Wait for the message Successfully created standby master on mdw.
28. Log in to the replacement server (now the new acting Standby Master server) as the
user root:
$ ssh root@mdw

29. Issue the following command to initiate the orchestrated (manual) failover:
# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0

**If you are servicing a Hadoop-only DCA**

In a Hadoop-only DCA, make sure to include the option --deletevip:
# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip

30. At the message Do you want to continue? enter y.
IMPORTANT
Initiating a failover stops the Greenplum Database and renders it temporarily
unavailable.
When the failover operation finishes you are returned to the prompt [root@mdw]# .
31. Switch to the user gpadmin:
# su - gpadmin

32. Connect to the Standby Master server and empty the /data/master directory:
$ ssh smdw rm -r /data/master/*

33. Issue the following command to revert the smdw server to its standby role:
$ gpinitstandby -s smdw

34. At the message Do you want to continue with standby master
initialization? enter Y.

**If you are servicing a Hadoop-only DCA**

Perform the next two steps only if you are replacing a Master server in a Hadoop-only DCA.
35. Issue the following command to recover the GPDB segment instances running on the
Primary and Standby Master servers:
$ gprecoverseg

Enter Y when prompted. For example:
Continue with segment recovery procedure Yy|Nn (default=N):
> Y

36. Issue the following command to rebalance the GPDB segment instances running on
the Primary and Standby Master servers:
$ gprecoverseg -r

Enter Y when prompted. For example:
Continue with segment rebalance procedure Yy|Nn (default=N):
> Y

The procedure continues here for all DCA types:
37. Exit from the user gpadmin to the user root:
$ exit

38. Start the DCA Setup utility:
# dca_setup

39. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
40. IMPORTANT - The same DCA system Serial Number (located on a label affixed
to the top rear of the rack) must be included in the following files for Dial Home to
work after replacing a Master application server (mdw and smdw in the case of GPDB,
and hdm and standby hdm in the case of Hadoop):
• /opt/connectemc/ConnectEMC.ini
• /opt/greenplum/serialnumber
First, check the DCA system Serial Number in the connectemc initialization file,
/opt/connectemc/ConnectEMC.ini, as follows:
a. Open the connectemc initialization file:
/opt/connectemc/ConnectEMC.ini
b. Locate the DCA system Serial Number per the following keyword in the file:
SERIAL_NUMBER=

c. Check that this matches the DCA system Serial Number on the label affixed to the
top, rear of the rack. Go to the next step (step d.) if the Serial Number is missing.
d. If missing, enter the Serial Number in the
/opt/connectemc/ConnectEMC.ini file, for example:
SERIAL_NUMBER=APMXXXXXXXXX
41. Next, check that the DCA system Serial Number in the
/opt/greenplum/serialnumber file matches the DCA system Serial Number in
the /opt/connectemc/ConnectEMC.ini file, per step 40 above.
For example:
SERIAL_NUMBER=APM00140732731

Note: After verifying that the DCA system Serial Numbers are identical, remember to save
the /opt/greenplum/serialnumber file if you made any changes.
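The comparison of the two serial number files can be sketched as a short script. This is an illustration under stated assumptions, not an EMC-supplied tool: the function names serial_of and serials_match are hypothetical, and the sketch assumes that both files carry the serial number on a line of the form SERIAL_NUMBER=value, as in the examples above. If the serialnumber file on your system uses a different layout, compare the values manually.

```shell
# Hypothetical helper: print the value of the first SERIAL_NUMBER= line in a file.
serial_of() {
    sed -n 's/^SERIAL_NUMBER=//p' "$1" | head -n 1
}

# Hypothetical helper: succeed only when both files carry the same,
# non-empty serial number.
serials_match() {
    a=$(serial_of "$1")
    b=$(serial_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use on a Master server:
#   serials_match /opt/connectemc/ConnectEMC.ini /opt/greenplum/serialnumber \
#       || echo "Serial numbers differ or are missing"
```

The result still has to be checked against the label on the rack; the script only confirms that the two files agree with each other.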
42. Re-enable health monitoring:
# dca_healthmon_ctl -e
43. You must stop and start the connectemc service (also referred to as Dial Home) to
complete restarting the healthmon daemon.
Enter the command:
service connectemc stop
You will see the message:
Shutting down ConnectEMC

44. When you see the # prompt again, enter:
service connectemc start
You will see the message:
Starting ConnectEMC

The # prompt returns, indicating that you have re-enabled health monitoring.

Replace the Standby Master server
Perform this procedure if the Standby Master server has failed or is failing and the Primary
Master server is in good health.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. You may want to consult “Task summary” on page 11 for an overview of the Master
server replacement procedures.

2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray. For details on how to configure the IP address of your
laptop, see “Connect a workstation to the DCA” on page 176.
3. To prevent false dial home messages from being sent to EMC Support during service,
stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

4. If the Standby Master server is still accessible through SSH, perform step a through
step e below. If the failed Standby Master server is not accessible through SSH, skip to
step 5.
a. Log in to the Primary Master server as the user root (see “Connect a workstation
to the DCA” on page 176).
b. Activate the server identification LED.
# dca_blinker -h smdw -a ON

c. Switch to the user gpadmin and determine whether the Primary and Standby
Master servers are synchronized:
# su - gpadmin
$ gpstate -f

If the output returns the status synchronized, the Master servers are in sync. If
synchronized is not returned in the output, do not replace the Standby Master
server. Contact EMC Support.
d. Switch to the user root and make note of any custom NFS mounts the customer
may have created:
$ su
# cat /etc/fstab

e. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

5. From the Primary Master server, switch to the user gpadmin and remove the Standby
Master server from the configuration:
# su - gpadmin
$ gpinitstandby -r

6. When prompted, enter Y to continue.
7. Shut down the Standby Master server:
• If the failed Standby Master server is accessible through SSH, log in to it as the
user root and issue the shutdown command.
IMPORTANT
Check the prompt to make sure that you are on the Standby Master (smdw) before
you issue the shutdown command!
$ ssh root@smdw
# shutdown -h now

• If the failed server is not accessible through SSH, power it off by pressing the
power button on the front of the server.
8. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
9. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
10. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
11. Transfer disk drives one at a time from the failed server to the replacement server.
IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
12. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
13. From the Primary Master server start the dhcpd service as the user root:
# service dhcpd start

14. Connect the power cables to the replacement server.
15. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays output similar to the following:
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 4 Locating the MAC address on the service tag (Standby Master server shown, Dragon24)

e. Compare the last two digits of the MAC addresses referenced in step c and step d
(for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two
digits of the MAC address in the dhcpd.leases file are four greater (in hexadecimal)
than the last two digits of the MAC address on the replacement server’s service tag.
If this is the case, the IP address in the dhcpd.leases file is the correct one to
associate with the server. In the example above, the comparison confirms that
172.28.6.170 is the correct address.
f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
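When the lease file holds several entries, it can help to list each leased IP address next to its MAC address before making the comparison in step e. The following is a minimal sketch: the function name list_leases is hypothetical, and it assumes the lease file has the layout shown in the example above (a lease line followed by a hardware ethernet line).

```shell
# Hypothetical helper: print "IP MAC" for every lease block in a
# dhcpd.leases-style file. The path defaults to the one used in step a.
list_leases() {
    awk '
        $1 == "lease" { ip = $2 }
        $1 == "hardware" && $2 == "ethernet" {
            mac = $3
            sub(/;$/, "", mac)   # drop the trailing semicolon
            print ip, mac
        }
    ' "${1:-/var/lib/dhcpd/dhcpd.leases}"
}

# Example use on the server that runs dhcpd (most recent entry is listed last):
#   list_leases | tail -n 1
```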
16. Turn off the dhcpd service:
# service dhcpd stop

17. Power on the replacement server by pressing the button on the front panel.

18. Issue the following command to open a BMC console session on the replacement
Master server:
# ipmiutil sol -a -e -N smdw-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
19. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
20. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

21. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 19 (press the F key when
prompted again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D  GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

22. Monitor the boot process onscreen and verify that the system boots from hard disk.
If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
mdw as the user root.
b. Issue the following command from mdw to force the replacement server to boot
from hard drive:
# ipmiutil reset -h -N smdw-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on smdw:
# ssh smdw
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to smdw and verify
the boot order:
# ssh smdw
# syscfg /bbosys

f. Exit smdw:
# exit

23. Check the health of the replacement server:
# dcacheck -h smdw

Verify that no errors display.
24. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware level of the RAID controllers with the CmdTool2 utility:
# ssh smdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the user root, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log in to the server as root.
j. Use scp to copy the MR56p.rom file from the Master server to the server you are
updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server:
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
25. Switch to the user gpadmin and issue the following command from the Primary
Master server (mdw) to initialize the replacement server as the Standby Master server:
# su - gpadmin
$ gpinitstandby -s smdw

**If you are servicing a Hadoop-only DCA**

Perform the next two steps only if you are replacing a Master server in a Hadoop-only DCA.
26. Issue the following command to recover the GPDB segment instances running on the
Primary and Standby Master servers:
$ gprecoverseg

Enter Y when prompted. For example:
Continue with segment recovery procedure Yy|Nn (default=N):
> Y

27. Issue the following command to rebalance the GPDB segment instances running on
the Primary and Standby Master servers:
$ gprecoverseg -r

Enter Y when prompted. For example:
Continue with segment rebalance procedure Yy|Nn (default=N):
> Y

The procedure continues here for all DCA types:
28. Exit from the user gpadmin to the user root:
$ exit

29. Start the dca_setup utility:
# dca_setup

30. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
31. IMPORTANT - The same DCA system Serial Number (located on a label affixed
to the top rear of the rack) must be included in the following files for Dial Home to
work after replacing a Master application server (mdw and smdw in the case of GPDB,
and hdm and standby hdm in the case of Hadoop):
• /opt/connectemc/ConnectEMC.ini
• /opt/greenplum/serialnumber
First, check the DCA system Serial Number in the connectemc initialization file,
/opt/connectemc/ConnectEMC.ini, as follows:
a. Open the connectemc initialization file:
/opt/connectemc/ConnectEMC.ini
b. Locate the DCA system Serial Number per the following keyword in the file:
SERIAL_NUMBER=
c. Check that this matches the DCA system Serial Number on the label affixed to the
top, rear of the rack. Go to the next step (step d.) if the Serial Number is missing.
d. If missing, enter the Serial Number in the
/opt/connectemc/ConnectEMC.ini file, for example:
SERIAL_NUMBER=APMXXXXXXXXX

32. Next, check that the DCA system Serial Number in the
/opt/greenplum/serialnumber file matches the DCA system Serial Number in
the /opt/connectemc/ConnectEMC.ini file, per step 31 above.
For example:
SERIAL_NUMBER=APM00140732731

Note: After verifying that the DCA system Serial Numbers are identical, remember to save
the /opt/greenplum/serialnumber file if you made any changes.
33. Re-enable health monitoring:
# dca_healthmon_ctl -e
34. You must stop and start the connectemc service (also referred to as Dial Home) to
complete restarting the healthmon daemon.
Enter the command:
service connectemc stop
You will see the message:
Shutting down ConnectEMC

35. When you see the # prompt again, enter:
service connectemc start
You will see the message:
Starting ConnectEMC

The # prompt returns, indicating that you have re-enabled health monitoring.

Identifying a single-NIC master versus a dual-NIC master in a
DCAv2
The Master server is a dual-NIC master if both eth6 and eth7 are present on the mdw or
smdw. If the server is powered down, the only way to identify a dual-NIC master is to
inspect it visually and count the number of SFP ports.
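On a running Master server, the presence of eth6 and eth7 can be checked with a short script. This is a minimal sketch, not part of the DCA tooling: the function name is_dual_nic is hypothetical, and it assumes that the operating system exposes network interfaces under /sys/class/net, as Linux systems with sysfs do. Run it on the Master server itself.

```shell
# Hypothetical helper: succeed when both eth6 and eth7 exist.
# The optional argument overrides the sysfs directory (useful for testing).
is_dual_nic() {
    netdir=${1:-/sys/class/net}
    [ -e "$netdir/eth6" ] && [ -e "$netdir/eth7" ]
}

if is_dual_nic; then
    echo "dual-NIC master"
else
    echo "single-NIC master"
fi
```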

Replace a Master server in a DCA without a Greenplum database
Perform this procedure to replace a failed Master server in a DCA in which the Greenplum
database is either not installed or is uninitialized.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 11 for an overview of the Master
server replacement procedures.
2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray. The red service cable is connected to port 48 on the first
Administration switch a-sw-1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial home messages from being sent to EMC Support during service,
stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

4. If the failed server is still accessible by SSH, perform the following steps. If the failed
Master server is not accessible through SSH, skip to step 5.
a. Log in to the functioning Master server as the user root (see “Connect a
workstation to the DCA” on page 176).
b. Activate the server identification LED:
Enter either mdw or smdw for the hostname, whichever applies.
# dca_blinker -h smdw -a ON

c. Make note of any custom NFS mounts the customer may have created:
# cat /etc/fstab

d. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

5. If possible, while connected to the failed Master server, issue the following command
to shut down the server.
IMPORTANT
Check the prompt to make sure that you are on the correct Master server (mdw or
smdw) before you issue the shutdown command!
# shutdown -h now

6. If the failed server is inaccessible through SSH, power it off by pressing the power
button on the front of the server.
7. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
8. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
9. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
10. Transfer disk drives one at a time from the failed server to the replacement server.

IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
11. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
12. From the functional Master server start the dhcpd service:
# service dhcpd start

13. Connect the power cables to the replacement server.

14. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays output similar to the following:
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 5 Locating the MAC address on the service tag (Master server shown, Dragon24)

e. Compare the last two digits of the MAC addresses referenced in step c and step d
(for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two
digits of the MAC address in the dhcpd.leases file are four greater (in hexadecimal)
than the last two digits of the MAC address on the replacement server’s service tag.
If this is the case, the IP address in the dhcpd.leases file is the correct one to
associate with the server. In the example above, the comparison confirms that
172.28.6.170 is the correct address.

f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
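The BMC may take a moment to answer on its new address, so the ping in step g can be repeated a few times before it is treated as a failure. The following is a minimal sketch: the function name wait_for_ping and the variables PING_CMD and PING_DELAY are hypothetical, and the defaults (10 attempts, 3 seconds apart) are illustrative values, not figures from this guide.

```shell
# Hypothetical helper: ping a host once per attempt until it answers or the
# attempt limit is reached. PING_CMD and PING_DELAY can be overridden.
wait_for_ping() {
    host=$1
    attempts=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ${PING_CMD:-ping} -c 1 "$host" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "${PING_DELAY:-3}"
    done
    return 1
}

# Example use, with the address passed to -I in the ipmiutil example above:
#   wait_for_ping 172.28.0.250 || echo "No reply from the new address"
```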
15. Turn off the dhcpd service:
# service dhcpd stop

16. Power on the replacement server by pressing the button on the front panel.
17. Issue the following command to open a console session on the replacement server.
Enter either mdw or smdw for the hostname, whichever applies. For example,
for smdw:
# ipmiutil sol -a -e -N smdw-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
18. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
19. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

20. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 18 (press the F key when
prompted again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D  GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

21. Monitor the boot process onscreen and verify that the system boots from hard disk.
If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
functioning Master server as the user root.
b. Issue the following command from either the Primary or the Standby Master server,
whichever applies.
– If you replaced a Primary Master, issue the command from smdw.
– If you replaced a Standby Master, issue the command from mdw.
For example, for mdw:
# ipmiutil reset -h -N mdw-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on the replacement server. For example, if you replaced mdw:
# ssh mdw
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order. For example, if you replaced mdw:
# ssh mdw
# syscfg /bbosys

f. Check the firmware level of the RAID controllers with the CmdTool2 utility:
# ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps g
through o below to update the firmware.

g. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop.
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
h. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

i. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
j. For each server in need of an update, log into the server as root.
k. Use scp to copy the MR56p.rom file from the master to the server you are updating.
l. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

m. Reboot the server.
# reboot

n. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

o. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
22. Issue the following command to check the health of the replacement server. For
example, if you replaced mdw:
# dcacheck -h mdw

Verify that no errors display.
23. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e
24. You must stop and start the connectemc service to complete restarting the healthmon
daemon:


CHAPTER 3
Replace a Segment, DIA, or Hadoop server
This chapter describes how to replace a server used in GPDB, DIA, and GP HD modules. It
includes the following major sections:
• Required tools
• Task summary
• Service tag locations
• Reseat cables before replacing a server
• Replace a server in an initialized GPDB module
• Replace a DIA server or a server in an uninitialized GPDB module
• Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)
• Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)

Required tools
You need the following tools to remove and replace a server:
• #2 Phillips screwdriver
• Wrist grounding strap


Task summary
Table 2 Segment (GPDB), DIA, and Hadoop server replacement task summary

Tasks                                             Segment server in  DIA server; Segment    Hadoop Master or
                                                  an initialized     server in an           Worker server
                                                  GPDB module        uninitialized GPDB
                                                                     module
Check BIOS version when replacing a server        x                  x                      x
  When installing a replacement server, identify
  the BIOS version on the new server (as well as
  the versions already running in the DCA). Then
  upgrade so that all servers reflect the same
  firmware levels. Go to http://support.emc.com
  to obtain the pertinent BIOS upgrade
  instructions. The upgrade instructions provide
  information on how to access and install the
  upgrade package.
Check and reseat cables.                          x                  x                      x
Connect to the DCA.                               x                  x                      x
Disable health monitoring.                        x                  x                      x
Check number of segments that are showing         x
  Change Tracking.
Activate light bar to locate the failed server.   x                  x                      x
Ask the customer about 3rd party software.                           x
Note MAC address of the adapter eth1.                                x
Note NFS mounts and custom gateways.              x                  x
Power off the failed server.                      x                  x                      x
Install the replacement server.                   x                  x                      x
Transfer drives from the failed server            x                  x                      x
  to the replacement server.
Connect cables to the replacement server.         x                  x                      x
Configure the BMC IP address.                     x                  x                      x
Power on the replacement server.                  x                  x                      x
Import foreign disk configurations.               x                  x                      x
Monitor the boot process and verify that the      x                  x                      x
  replacement server boots from hard disk.
Check the health of the replacement server.       x                  x                      x
Exchange SSH keys.                                x                  x                      x
Launch gprecoverseg utility.                      x
Issue gpstate -m to verify data status of         x
  all segments is Synchronized.
Issue gprecoverseg to restore the server          x
  to its optimal configuration.
Issue gpstate -e to check for errors.             x
Synchronize the system clock.                     x                  x                      x
Verify with the customer that NFS mounts or       x                  x
  gateways (if any) are functioning.
Configure the external IP address (eth1).                            x
Re-enable health monitoring.                      x                  x                      x
Tell customer that they can reinstall 3rd party                      x (DIA server only)
  software.

Service tag locations
When replacing any hardware component, make sure to properly de-brief the part. Locate
the serial number on the blue label affixed to the rear of the rotating power console on the
front of each segment server.

Figure 6 Service tag location on 24-drive Segment server


Reseat cables before replacing a server
Before you replace a failed server, determine whether the problem is caused by a faulty
cable connection. Remove and then reconnect cables as described below.
1. Connect your service laptop to the red service cable located on the laptop tray in
Rack 1. The red service cable is connected to port 48 on the first Administration switch
a-sw-1 (see “Connect a workstation to the DCA” on page 176).
2. Open a console connection to the Primary Master server as the user root using IP
Address 172.28.4.250 and password changeme.
3. To identify the failed server, activate its server identification light by issuing the
following command. Replace the hostname shown in bold below with the hostname of
the server you want to identify:
# dca_blinker -h sdw1 -a ON

Note: If the server is completely non-operational, the light might not work.
4. Shut down the failed server:
• If you can access the server through SSH: Enter the following command. Replace
the hostname shown in bold with the hostname of the segment server you are
working on:
IMPORTANT
Check the prompt to make sure that you are on the correct server before you issue
the shutdown command!
# ssh sdw1
# shutdown -h now

• If you cannot access the server through SSH: Make sure that the server is powered
off. Press the power button on the front of the server if necessary.
5. Once the server is powered off, unplug and then firmly reconnect the administration
network cable, two interconnect cables, and two power supply AC cables. Figure 7
shows the relevant cable connection sites.


Figure 7 Re-seat cables on the back of the server
Callouts in the figure: Power (two connectors); To lower Interconnect switch (10Gb); To upper
Interconnect switch (10Gb); To Administration switch (CM & BMC; 1Gb).
(CM = Cluster Management; BMC = Baseboard Management Controller service port)

6. Power on the server by pressing the power button on the front of the server.
Wait for the server to boot (approximately 5 minutes).
7. From the Primary Master server, issue the ping command to each interface on the
server. Replace the text in bold with the hostname of the server you are evaluating.
# ping sdw1-cm
# ping sdw1

• If there is no response from the interfaces, replace the server by performing the
appropriate procedure listed below:
– “Replace a server in an initialized GPDB module” on page 44
– “Replace a DIA server or a server in an uninitialized GPDB module” on page 51
– “Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)” on
page 58
• If all interfaces on the server respond, you do not need to replace the server. If the
server you are evaluating is a DIA server, you are done. If the server you are
evaluating is part of a GPDB or HD module, issue the following commands to
recover the segment instances:
# su - gpadmin
$ gprecoverseg
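
The interface check in step 7 can be scripted when several servers must be evaluated. The following is a minimal sketch, not part of the DCA tooling: the function name check_interfaces and the PING_CMD variable are hypothetical names introduced here, and the sketch assumes that the two interfaces of a server resolve as <hostname>-cm and <hostname>, as in the sdw1 example above.

```shell
#!/bin/sh
# Sketch only: check_interfaces and PING_CMD are illustrative names,
# not DCA utilities.
# By default, send two echo requests and wait up to 2 seconds per reply.
PING_CMD=${PING_CMD:-"ping -c 2 -W 2"}

# Print one line per interface of the given host and return non-zero
# if any interface does not respond.
check_interfaces() {
    host=$1
    failed=0
    for iface in "${host}-cm" "${host}"; do
        if $PING_CMD "$iface" >/dev/null 2>&1; then
            echo "$iface: responds"
        else
            echo "$iface: no response"
            failed=1
        fi
    done
    return $failed
}
```

For example, from the Primary Master server, after loading the function into the current shell:
# check_interfaces sdw1
A non-zero exit status means that at least one interface did not respond.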


Replace a server in an initialized GPDB module
Perform this procedure to replace a failed or failing Segment Server that is part of an
initialized GPDB module. This procedure describes how to replace the server hardware
and recover segment instances.
To replace a server that is part of a DIA module or part of an uninitialized GPDB module (or
a module in which GPDB is not installed), see page 51.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. You may want to consult “Task summary” on page 40 for an overview of the segment
server replacement procedures.
2. Make sure that you have checked the cable connections (see “Reseat cables before
replacing a server” on page 42).
3. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray in Rack 1. The red service cable is connected to port 48 on
the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on
page 176).
4. To prevent false dial home messages from being sent to EMC Support during service,
disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

5. Log in to the Primary Master Server as the user gpadmin.
6. Issue the gpstate -m command. Verify that no more than eight segment instances
display a status of Change Tracking. In this example, sdw1 has failed:
$ gpstate -m
Mirror   Datadir                Port    Status              Data Status
sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Change Tracking
sdw3-2   /data2/mirror/gpseg1   50003   Acting as Primary   Change Tracking
sdw4-2   /data2/mirror/gpseg2   50003   Acting as Primary   Change Tracking
sdw2-1   /data1/mirror/gpseg3   50000   Acting as Primary   Change Tracking
sdw3-1   /data1/mirror/gpseg4   50000   Acting as Primary   Change Tracking
sdw4-1   /data1/mirror/gpseg5   50000   Acting as Primary   Change Tracking
sdw3-1   /data1/mirror/gpseg6   50000   Acting as Primary   Change Tracking
sdw4-1   /data1/mirror/gpseg7   50000   Acting as Primary   Change Tracking
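
Counting the Change Tracking rows by eye is error-prone on larger clusters. The following is a minimal sketch of the check in step 6, not a Greenplum or DCA utility: the function names count_change_tracking and change_tracking_ok are hypothetical, and the sketch assumes that each affected segment instance appears on one output line containing the text Change Tracking, as in the example output above.

```shell
#!/bin/sh
# Sketch only: both function names are illustrative, not Greenplum utilities.

# Count the lines of gpstate -m output (read from stdin) that report
# a data status of Change Tracking.
count_change_tracking() {
    grep -c 'Change Tracking' || true
}

# Succeed only if the number of segment instances in Change Tracking does
# not exceed the limit (default 8, the limit given in step 6).
change_tracking_ok() {
    limit=${1:-8}
    count=$(count_change_tracking)
    echo "Segments in Change Tracking: $count"
    [ "$count" -le "$limit" ]
}
```

For example, as the user gpadmin:
$ gpstate -m | change_tracking_ok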

7. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.


e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
8. If you can still access the server via SSH, perform sub-steps (a) through (d) below. If
you cannot access the server via SSH, proceed to step 9.
a. Log in to the failed segment server as the user root. Replace the hostname shown
in bold with the hostname of the failed server:
# ssh root@sdw1

b. Make note of any custom NFS mounts the customer may have created:
# cat /etc/fstab

c. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

d. Shut down the failed server:
# shutdown -h now
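
When /etc/fstab is long, the entries noted in sub-steps (b) and (c) can be filtered before the server is shut down in sub-step (d). The following is a minimal sketch, not a DCA utility: the function names nfs_mounts and custom_gateway are hypothetical, and the sketch assumes that custom NFS mounts use the filesystem type nfs or nfs4 and that a custom gateway appears as a GATEWAY= line.

```shell
#!/bin/sh
# Sketch only: nfs_mounts and custom_gateway are illustrative names.

# Print the fstab entries (read from stdin) whose filesystem type is
# nfs or nfs4, skipping comment lines.
nfs_mounts() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs4?$/'
}

# Print the GATEWAY setting, if any, from /etc/sysconfig/network-style
# text read from stdin.
custom_gateway() {
    grep '^GATEWAY=' || true
}
```

For example, on the failed server as the user root:
# nfs_mounts < /etc/fstab
# custom_gateway < /etc/sysconfig/network
Record any lines that are printed so that they can be verified with the customer after the replacement.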

9. If the failed server is inaccessible through SSH, power it off by pressing the power
button on the front of the server.
10. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
11. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
12. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
13. Transfer disk drives one at a time from the failed server to the replacement server.
IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
14. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.


15. From the Primary Master server start the dhcpd service:
# service dhcpd start

16. Connect the power cables to the replacement server.
17. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays (similar to the following):
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 8 Locating the MAC address on the service tag (Dragon24)

e. Compare the last two (hexadecimal) digits of the MAC addresses referenced in step c
and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the
last two digits of the MAC address in the dhcpd.leases file are four greater than the
last two digits of the MAC address on the replacement server’s service tag.


If this is the case, it is certain that the IP address in the dhcpd.leases file is the
correct one to associate with the server. For example, the scenario described
above verifies that 172.28.6.170 is correct in this specific instance.
f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
you’ve identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
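
Because MAC digits are hexadecimal, the "four greater" comparison in step 17 (e) is easy to get wrong by hand (for example, 1D plus four is 21, not 111). The following is a minimal sketch of that arithmetic, not a DCA utility: the function name mac_offset_ok is hypothetical, it expects colon-separated MAC addresses, and it assumes that the sum wraps at FF because the guide compares only the last two digits.

```shell
#!/bin/sh
# Sketch only: mac_offset_ok is an illustrative name, not a DCA utility.

# Usage: mac_offset_ok LEASE_MAC TAG_MAC
# Succeed if the last octet of the lease MAC equals the last octet of the
# service-tag MAC plus four. Octets are hexadecimal, so they are compared
# numerically. Assumption: the sum wraps at 0xFF (for example FE + 4 = 02).
mac_offset_ok() {
    lease_mac=$1
    tag_mac=$2
    lease_last=${lease_mac##*:}
    tag_last=${tag_mac##*:}
    [ $(( 0x$lease_last )) -eq $(( (0x$tag_last + 4) % 256 )) ]
}
```

For example, with the addresses used in this procedure:
# mac_offset_ok 00:00:00:00:00:04 00:00:00:00:00:00 && echo "lease matches service tag"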
18. Turn off the dhcpd service:
# service dhcpd stop

19. From the Primary Master server as user root, issue the following command to open a
BMC console session on the replacement server. Replace the hostname shown in bold
below with the hostname of the replacement server:
# ipmiutil sol -a -e -N sdw1-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
20. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.


Note that after pressing the space bar in the next step you will again be prompted to
press the F key within 15 seconds.
21. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

22. Press the F key when prompted.
23. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.


24. If the message below appears, you will need to power off the server, verify that all
LED lights are off, and repeat steps 21 through 23.
CLIENT MAC ADDR: 00 1E 67 4D C5 1D
001E674DC51D

GUID: 2A9B43A4 A50A 11E1 AAA0

25. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~), and then the period (.)
key.
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the
replacement server to boot from the hard drive. Change the hostname shown in
bold below to the hostname of the server you replaced:
# ipmiutil reset -h -N sdw1-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on the replacement server. For example, on sdw1:
# ssh sdw1
# syscfg /bbo "emcbios" HDD NW

d. Reboot the replacement server:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order. Change the hostname shown in bold below to the
hostname of the server you replaced:
# ssh sdw1
# syscfg /bbosys

26. Check the health of the replacement server. Replace the text in bold with the
hostname of the replacement segment server:
# dcacheck -h sdw1

Verify that no errors display.
27. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware level of the RAID controllers with the CmdTool2 utility. Replace
the text in bold with the hostname of the replacement server:
# ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop.
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a server login name and password.
i. For each server in need of an update, log into the server as root.
j. Use scp to copy the MR56p.rom file from the master to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version. Replace the text in bold
with the hostname of the replacement server:
# ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
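
When the firmware check is repeated across many servers, the decision in sub-step (e) can be automated. The following is a minimal sketch, not a DCA utility: the function name fw_needs_update is hypothetical, and the sketch assumes that CmdTool2 prints the build on a line of the form FW Package Build: <version>, as shown in this procedure.

```shell
#!/bin/sh
# Sketch only: fw_needs_update is an illustrative name, not a DCA utility.

# Read CmdTool2 "FW Package Build" lines from stdin and succeed (exit 0)
# if any line reports one of the two builds that this guide says must be
# updated (23.7.0-0033 or 23.9.0-0026).
fw_needs_update() {
    grep -E -q 'FW Package Build: *(23\.7\.0-0033|23\.9\.0-0026)[[:space:]]*$'
}
```

For example, from the Primary Master server:
# ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | fw_needs_update && echo "sdw1: firmware update required"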
28. Switch to the user gpadmin and launch the gprecoverseg utility to recover the
segment instances:
$ gprecoverseg -a


29. When the gprecoverseg utility is finished, issue the gpstate -m command and
verify that the data status is reported as Resynchronizing in the output.
30. Wait a few minutes, and then issue the gpstate -m command again to verify that
the data status of all segments is reported as Synchronized in the output.
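
Rather than reissuing gpstate -m by hand, the check in steps 29 and 30 can be polled. The following is a minimal sketch, not a Greenplum utility: the function name mirrors_synchronized is hypothetical, and the sketch assumes that each mirror appears on one output line whose data status is the literal text Change Tracking, Resynchronizing, or Synchronized.

```shell
#!/bin/sh
# Sketch only: mirrors_synchronized is an illustrative name.

# Read gpstate -m output from stdin. Succeed only if at least one mirror
# row reports Synchronized and no row reports Change Tracking or
# Resynchronizing.
mirrors_synchronized() {
    awk '
        /Change Tracking|Resynchronizing/ { pending++ }
        /Synchronized/                    { synced++ }
        END { exit !(synced > 0 && pending == 0) }
    '
}
```

For example, as the user gpadmin, to check once a minute until resynchronization is complete:
$ until gpstate -m | mirrors_synchronized; do sleep 60; done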
31. Return the Greenplum system to its optimal configuration:
$ gprecoverseg -ra

IMPORTANT
Issuing gprecoverseg -ra cancels running queries but does not interrupt
database connections.
32. Issue the gpstate -e command and verify that no errors are reported.
33. Synchronize the system clock using the DCA Setup utility:
a. Launch the DCA Setup utility as the user root:
# dca_setup
b. Select option 2 for Modify DCA Settings.
c. Select option 5 for Modify NTP/Clock Configuration Options.
d. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
e. Enter X to exit the DCA Setup utility.
34. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e


Replace a DIA server or a server in an uninitialized GPDB module
Perform this procedure only to replace a failed server that is part of a DIA module or part of
an uninitialized GPDB module (or a module in which GPDB is not installed).
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. Make sure that you have checked the cable connections as described in “Reseat
cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray in Rack 1. The red service cable is connected to port 48 on
the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on
page 176).
3. To prevent false dial home messages from being sent to EMC Support during service,
disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

4. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
5. Log in to the Primary Master Server as the user gpadmin.
6. If you can still access the server via SSH, perform sub-steps (a) through (f) below. If
you cannot access the server via SSH, proceed to step 7.
a. Log in to the failed server as the user root. Replace the hostname shown in bold
with the hostname of the failed server:
$ ssh root@etl1


b. If the server is part of a DIA module, ask the customer if any third-party software is
installed. Discuss with the customer whether any files need to be saved before you
power off the server.
c. Switch to the user root and make note of the MAC address of the adapter eth1.
# ifconfig eth1

d. Make note of any custom NFS mounts the customer may have created:
# cat /etc/fstab

e. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

f. Shut down the failed server:
# shutdown -h now

7. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
8. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
9. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
10. Transfer disk drives one at a time from the failed server to the replacement server.
IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
11. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
12. From the Primary Master server start the dhcpd service:
# service dhcpd start

13. Connect the power cables to the replacement server.


14. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays (similar to the following):
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 9 Locating the MAC address on the service tag (Dragon24)

e. Compare the last two (hexadecimal) digits of the MAC addresses referenced in step c
and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the
last two digits of the MAC address in the dhcpd.leases file are four greater than the
last two digits of the MAC address on the replacement server’s service tag.
If this is the case, it is certain that the IP address in the dhcpd.leases file is the
correct one to associate with the server. For example, the scenario described
above verifies that 172.28.6.170 is correct in this specific instance.


f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
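
The IP address and MAC address needed in step 14 can be pulled out of the lease file instead of being read from the tail output by eye. The following is a minimal sketch, not a DCA utility: the function name last_lease is hypothetical, and the sketch assumes the lease layout shown in the example above (a lease <ip> { line followed by a hardware ethernet <mac>; line).

```shell
#!/bin/sh
# Sketch only: last_lease is an illustrative name, not a DCA utility.

# Read dhcpd.leases text from stdin and print "<ip> <mac>" for the last
# lease entry that contains a "hardware ethernet" line.
last_lease() {
    awk '
        $1 == "lease" { ip = $2 }
        $1 == "hardware" && $2 == "ethernet" {
            mac = $3
            sub(/;$/, "", mac)      # drop the trailing semicolon
            found = ip " " mac
        }
        END { if (found != "") print found }
    '
}
```

For example, on the Primary Master server as the user root:
# last_lease < /var/lib/dhcpd/dhcpd.leases
The MAC address that is printed must still be compared with the service tag as described in sub-step (e).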
15. Turn off the dhcpd service:
# service dhcpd stop

16. Power on the replacement server by pressing the button on the front panel.
17. From the Primary Master server issue the following command to open a BMC console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N etl1-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
18. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.


Note that after pressing the space bar in the next step you may be prompted again to
press the F key within 15 seconds (if the DIA server is a Dragon24).
19. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

20. Press the F key when prompted.
21. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.


22. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 18 (press the F key when
prompted again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D
001E674DC51D
DHCP....\

GUID: 2A9B43A4 A50A 11E1 AAA0

23. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~), and then the period (.)
key.
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the Segment
server to boot from hard drive:
# ipmiutil reset -h -N etl1-sp -U root -P sephiroth
Change the hostname shown in bold above to the hostname of the server you
replaced.
c. Once the operating system is loaded, issue the following command to change the
boot order on the Segment server. For example, on etl1:
# ssh etl1
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh etl1
# syscfg /bbosys

Change the hostname shown in bold above to the hostname of the server you
replaced.
24. Issue the following command to check the health of the replacement server. Replace
the text in bold with the hostname of the replacement DIA or Segment server:
# dcacheck -h etl1

Verify that no errors display.
25. Exchange SSH keys on the replacement server using the DCA Setup utility:

a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
26. Using gpssh, check the firmware level of the RAID controllers with the CmdTool2
utility:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps e
through m below to update the firmware.
e. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
f. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

g. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a server login name and password.
h. For each server in need of an update, log in to the server as root.
i. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
j. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

k. Reboot the server.
# reboot

l. When the server reboots, check the new firmware version. Replace the text in bold
with the hostname of the replacement server:
# ssh sdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:

FW Package Build: 23.12.0-0013

m. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
27. Synchronize the system clock using the DCA Setup utility:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the NTP server.
d. Enter X to exit the DCA Setup utility.
28. (Applies only if you replaced a DIA server): SSH into the replacement DIA server and
configure the external interface IP address. Edit the following file:
# vi /etc/sysconfig/network-scripts/ifcfg-eth1
Set HWADDR to the MAC address of eth1 that you determined in step 6-c above. Change
the values shown in bold below. Do not change the other parameters:
DEVICE=eth1
BOOTPROTO=static

IPADDR=10.6.193.46
NETMASK=255.255.252.0
ONBOOT=YES
MTU=1500
HWADDR=MACADDRESS

29. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

30. If any third-party software was installed, inform the customer that it is now safe to
reinstall and validate the software.
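
Step 28 above changes a single line in ifcfg-eth1 by hand. The following is a minimal sketch of the same edit done non-interactively. The function name set_hwaddr is hypothetical and is not a DCA utility. The sketch assumes that the file already contains one HWADDR= line and that a backup copy named FILE.bak is acceptable.

```shell
#!/bin/sh
# Hypothetical helper (not a DCA utility): set HWADDR in an ifcfg-style file
# and leave every other line untouched.
# Usage: set_hwaddr FILE MAC
set_hwaddr() (
    file=$1
    mac=$2
    # Accept only six colon-separated pairs of hexadecimal digits.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        exit 1
    fi
    # Keep a backup, then rewrite only the HWADDR line from the backup copy.
    cp -p "$file" "$file.bak" || exit 1
    sed "s/^HWADDR=.*/HWADDR=$mac/" "$file.bak" > "$file"
)
```

For example, on the replacement DIA server you might run set_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth1 followed by the MAC address of eth1 from step 6-c, and then review the file before restarting the interface.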

Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)
Perform this procedure to replace a failed server that is part of a Hadoop module in DCA
version 2.0.0.0.
To replace a Hadoop server in DCA version 2.0.1.0 or later, see the procedure “Replace a
server in a Pivotal Hadoop module (version 2.0.1.0 and later)” on page 64.
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. Make sure that you have checked the cable connections as described in “Reseat
cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray in Rack 1. The red service cable is connected to port 48 on
the first Administration switch a-sw-1 (see “Connect a workstation to the DCA” on
page 176).
3. To prevent false dial home messages from being sent to EMC Support during service,
disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

4. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
5. If you can still access the server via SSH, perform sub-steps (a) through (c) below. If
you cannot access the server via SSH, proceed to step 6.

a. Log in to the failed server as the user root. Replace the hostname shown in bold
with the hostname of the failed server. For example, for the first Hadoop Master:
# ssh root@hdm1

b. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

c. Shut down the failed server:
# shutdown -h now

6. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
7. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
8. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
9. Transfer disk drives one at a time from the failed server to the replacement server.
IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
10. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
11. From the Primary Master server start the dhcpd service:
# service dhcpd start

12. Connect the power cables to the replacement server.

13. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays (similar to the following):
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in the
photograph below):
MAC1 00:00:00:00:00:00

Figure 10 Locating the MAC address on the service tag (server shown, Dragon12)

e. Compare the last two digits of the MAC addresses referenced in step c and step d
(for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits
of the MAC address in the dhcpd.leases file are four greater than the last two digits of
the MAC address on the replacement server’s service tag. MAC address digits are
hexadecimal, so count in hexadecimal when you compare them.
If this is the case, the IP address in the dhcpd.leases file is the correct one to
associate with the server. In the example above, this check confirms that 172.28.6.170
is the correct address.
f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
14. Turn off the dhcpd service:
# service dhcpd stop

15. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N hdm1-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
16. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
17. When the following message displays, disregard it and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

19. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 16 (press the F key when
prompted again):

CLIENT MAC ADDR: 00 1E 67 4D C5 1D
001E674DC51D
DHCP....\

GUID: 2A9B43A4 A50A 11E1 AAA0

20. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the replacement
server to boot from hard drive:
# ipmiutil reset -h -N hdm1-sp -U root -P sephiroth
Change the hostname shown in bold above to the hostname of the server you
replaced.
c. Once the operating system is loaded, issue the following command to change the
boot order on the server. For example, on hdm1:
# ssh hdm1
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh hdm1
# syscfg /bbosys

Change the hostname shown in bold above to the hostname of the server you
replaced.
21. Issue the following command to check the health of the replacement server. Replace
the text in bold with the hostname of the replacement Hadoop server:
# dcacheck -h hdm1

Verify that no errors display.
22. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
23. Using gpssh, check the firmware level of the RAID controllers with the CmdTool2
utility:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps e
through m below to update the firmware.
e. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
f. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

g. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a server login name and password.
h. For each server in need of an update, log in to the server as root.
i. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
j. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

k. Reboot the server.
# reboot

l. When the server reboots, check the new firmware version. Replace the text in bold
with the hostname of the replacement server:
# ssh hdm1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

m. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
24. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.

b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
25. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e
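
The firmware decision in step 23 above depends on two specific build strings. The following is a minimal sketch of that decision as a filter on the CmdTool2 output. The function name fw_needs_update is hypothetical, and the sketch assumes that the output contains lines of the form "FW Package Build: <build>", as quoted in this procedure.

```shell
#!/bin/sh
# Hypothetical helper (not a DCA utility): read CmdTool2 output on stdin and
# report whether any controller shows one of the two builds that this
# procedure says to update.
# Exit status 0 = update needed, non-zero = neither listed build was found.
fw_needs_update() {
    grep -Eq 'FW Package Build: *(23\.7\.0-0033|23\.9\.0-0026)[[:space:]]*$'
}
```

For example, on the server being checked:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | fw_needs_update && echo "firmware update required"
A build other than the two listed, including 23.12.0-0013, produces no message.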

Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later)
Perform this procedure to replace a failed server that is part of a Pivotal Hadoop (PHD)
module.
To replace a Greenplum Hadoop server in DCA version 2.0.0.0, see the procedure “Replace
a server in a Greenplum Hadoop module (DCA version 2.0.0.0)” on page 58.
Choose the procedure for the type of PHD server you are replacing (see Table 3 below).
Table 3 Server replacement procedures by PHD module type

Hostname  Server Module / Role                             Use this procedure
hdm1      Master Module - Namenode                         Replace hdm1 (namenode, DCA version 2.0.1.0)
hdm2      Master Module - Secondary Namenode, Zookeeper    Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0)
hdm3      Master Module - resourcemanager                  Replace hdm3 (resourcemanager, DCA version 2.0.1.0)
hdm4      Master Module - Zookeeper, Hive, Hive-Metastore  Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0)
hdw#      Worker Module - Nodemanager, Datanode            Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0)

Remove the failed PHD server and install the replacement PHD server
IMPORTANT
This procedure directs you to transfer drives from the failed server to the replacement
server. Take great care when transferring drives. Transfer only one drive at a time. Insert
drives in the same slots that they occupied in the failed server.
1. Make sure that you have checked the cable connections as described in “Reseat
cables before replacing a server” on page 42.
2. If it is not already connected, connect your service laptop to the red service cable
located on the laptop tray in Rack 1 (see “Connect a workstation to the DCA” on
page 176).

3. To prevent false dial home messages from being sent to EMC Support during service,
disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

4. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
5. If you can still access the server via SSH, perform sub-steps (a) through (c) below. If
you cannot access the server via SSH, proceed to step 6.
a. Log in to the failed server as the user root. Replace the hostname shown in bold
with the hostname of the failed server:
# ssh root@hdm1

b. Make note of any custom network gateways the customer may have created:
# cat /etc/sysconfig/network

c. Shut down the failed server:
# shutdown -h now

6. Label all the cables connected to the failed server so that you’ll know where to
connect them on the replacement server.
7. Remove all power, Ethernet, and twin-axial cables from the back of the server.
Note: If the system has Dual NICs installed, note the connections for customer and
interconnect networks prior to disconnecting.
8. Remove the failed server and install the replacement server (see Appendix E, “Replace
a Server in the Greenplum DCA Rack,” on page 192).
9. Transfer disk drives one at a time from the failed server to the replacement server.

IMPORTANT
Use caution when transferring drives. Transfer only one drive at a time. Insert the
drives in the same slots that they occupied in the failed server.
10. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on
the cables for proper connectivity.
IMPORTANT
Do not connect power to the replacement server yet.
11. From the Primary Master server start the dhcpd service:
# service dhcpd start

12. Connect the power cables to the replacement server.
13. Next, use these steps to identify the IP address assigned to the server.
a. Issue the following command to obtain the lease information provided in the
dhcpd.leases file:
# tail /var/lib/dhcpd/dhcpd.leases

b. The dhcpd.leases file displays (similar to the following):
Example
lease 172.28.6.170 {
starts 4 2012/10/18 20:09:08;
ends 5 2013/10/18 20:09:08;
cltt 4 2012/10/18 20:09:08;
binding state active;
next binding state free;
hardware ethernet 00:00:00:00:00:04;
uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example
dhcpd.leases file above:
00:00:00:00:00:04
d. Locate the MAC address on the replacement server’s service tag (highlighted in
Figure 11):

MAC1 00:00:00:00:00:00

Figure 11 Locating the MAC address on the service tag (server shown, Dragon12)

e. Compare the last two digits of the MAC addresses referenced in step c and step d
(for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits
of the MAC address in the dhcpd.leases file are four greater than the last two digits of
the MAC address on the replacement server’s service tag. MAC address digits are
hexadecimal, so count in hexadecimal when you compare them.
If this is the case, the IP address in the dhcpd.leases file is the correct one to
associate with the server. In the example above, this check confirms that 172.28.6.170
is the correct address.

f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)
identified in the previous steps using the example above as a guide:
Note: Disregard the long, detailed output after this command is executed.
# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:
# ping 172.28.0.#

Where # is the number of the server you are replacing.
14. Turn off the dhcpd service:
# service dhcpd stop

15. Power on the replacement server by pressing the button on the front panel.
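
Steps 13-a through 13-e above can be sketched as two small shell functions: one that pulls the most recent lease out of the dhcpd.leases text, and one that checks the "four greater" rule. Both function names are hypothetical. The sketch assumes the lease layout shown in the example above and requires the first five octets of the two MAC addresses to match, which the procedure does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helpers (not DCA utilities).

# Read dhcpd.leases text on stdin and print "IP MAC" for the last lease
# entry that carries a "hardware ethernet" line.
last_lease() {
    awk '$1 == "lease" { ip = $2 }
         $1 == "hardware" && $2 == "ethernet" {
             mac = $3
             sub(/;$/, "", mac)
             last = ip " " mac
         }
         END { if (last != "") print last }'
}

# Succeed only if LEASE_MAC equals TAG_MAC plus four in the last octet.
# A service tag MAC ending in FC or higher would carry into the next octet;
# this sketch does not handle that case.
# Usage: mac_is_tag_plus_four LEASE_MAC TAG_MAC
mac_is_tag_plus_four() (
    lease=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    tag=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    # The first five octets must be identical.
    [ "${lease%:*}" = "${tag%:*}" ] || exit 1
    # Compare the last octet numerically, in hexadecimal.
    [ $(( 0x${lease##*:} - 0x${tag##*:} )) -eq 4 ]
)
```

For example, last_lease < /var/lib/dhcpd/dhcpd.leases prints the candidate IP address and MAC address. Pass that MAC address and the MAC address from the service tag to mac_is_tag_plus_four before you use the IP address in step 13-f.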

Replace hdm1 (namenode, DCA version 2.0.1.0)
1. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N hdm1-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
2. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
3. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

4. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 2 (press the F key when prompted
again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D
001E674DC51D
DHCP....\

GUID: 2A9B43A4 A50A 11E1 AAA0

5. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the replacement
server to boot from hard drive:
# ipmiutil reset -h -N hdm1-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on the server. For example, on hdm1:
# ssh hdm1
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh hdm1
# syscfg /bbosys

Change the hostname shown in bold above to the hostname of the server you
replaced.
6. Issue the following command to check the health of the replacement server:
# dcacheck -h hdm1

Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware level of the RAID controllers. Replace the text in bold
with the hostname of the replacement server:
# ssh hdm1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log in to the server as root.
j. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
8. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

10. Connect to hdm1:
# ssh hdm1

11. Issue the following command to view the status of the PHD cluster:
# dca_hadoop --status all

Verify that the namenode module on hdm1 is stopped:
module namenode(service hadoop-namenode) is stopped on host hdm1

12. Start the namenode service:
# dca_hadoop --start namenode

When the namenode service starts, the PHD cluster is in safemode.
13. Switch to the user hdfs and issue the following command:
# su - hdfs
$ hadoop fsck /

The following message indicates that the filesystem has an error:
The filesystem under path '/' is CORRUPT

14. Exit safemode and return the filesystem to a normal state:
$ hadoop dfsadmin -safemode leave

15. Verify that the filesystem is healthy:
$ hadoop fsck /

The following message indicates that the filesystem is healthy:
The filesystem under path '/' is HEALTHY
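
Steps 13 through 15 above rely on reading one summary line from the fsck output. The following is a minimal sketch that reduces the output to a single word. The function name fsck_state is hypothetical, and the sketch assumes that the summary line has the wording quoted in this procedure.

```shell
#!/bin/sh
# Hypothetical helper (not a DCA or Hadoop utility): read `hadoop fsck /`
# output on stdin and print HEALTHY, CORRUPT, or UNKNOWN, based on the
# summary line quoted in the steps above.
fsck_state() {
    awk '/The filesystem under path .* is HEALTHY/ { s = "HEALTHY" }
         /The filesystem under path .* is CORRUPT/ { s = "CORRUPT" }
         END { print (s == "" ? "UNKNOWN" : s) }'
}
```

For example, as the user hdfs:
$ hadoop fsck / | fsck_state
UNKNOWN means that neither summary line was found, in which case review the full fsck output.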

Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0)
1. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N hdm2-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
2. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
3. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

4. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 2 (press the F key when prompted
again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D
001E674DC51D
DHCP....\

GUID: 2A9B43A4 A50A 11E1 AAA0

5. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period
key (.).
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the replacement
server to boot from hard drive:
# ipmiutil reset -h -N hdm2-sp -U root -P sephiroth
Change the hostname shown in bold above to the hostname of the server you
replaced.

c. Once the operating system is loaded, issue the following command to change the
boot order on the server. For example, on hdm2:
# ssh hdm2
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh hdm2
# syscfg /bbosys

6. Issue the following command to check the health of the replacement server:
# dcacheck -h hdm2

Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.

d. Enter X to exit the DCA Setup utility.
e. Check the firmware level of the RAID controllers. Replace the text in bold
with the hostname of the replacement server:
# ssh hdm2 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the root user, copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log in to the server as root.
j. Use scp to copy the MR56p.rom file from the Master server to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
8. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.

c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

10. Connect to hdm1:
# ssh hdm1

11. Issue the following command to view the status of the Hadoop cluster:
# dca_hadoop --status all

The status of the secondary-namenode and zookeeper modules on hdm2
should be:
module secondary-namenode(service hadoop-secondarynamenode) has error on host hdm2
module zookeeper(service zookeeper-server) has error on host hdm2

12. Start the secondary-namenode and zookeeper services:
# dca_hadoop --start secondary-namenode
# dca_hadoop --start zookeeper

13. Switch to the user hdfs:
# su - hdfs

14. Verify that the filesystem is healthy:
$ hadoop fsck /

The following message indicates that the filesystem is healthy:
The filesystem under path ‘/’ is HEALTHY
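
The firmware decision in step 7 can be expressed as a small script. The following is a minimal sketch, not part of the DCA software: the function names fw_build_from_line and fw_needs_update are hypothetical, and the sketch assumes the version is printed on a line that begins with "FW Package Build:" as shown in this guide. It encodes only the two outdated builds named in step 7.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the DCA): extract the build string
# from each "FW Package Build:" line passed as the first argument.
fw_build_from_line() {
    printf '%s\n' "$1" | sed -n 's/^[[:space:]]*FW Package Build:[[:space:]]*//p'
}

# Hypothetical helper: return 0 (true) if the build is one of the two
# outdated builds named in step 7, and 1 (false) for any other build.
fw_needs_update() {
    case "$1" in
        23.7.0-0033|23.9.0-0026) return 0 ;;   # outdated build, update required
        *)                       return 1 ;;   # not listed as outdated in this guide
    esac
}
```

For example, after capturing the grep output from step 7e in a variable named line, running fw_needs_update "$(fw_build_from_line "$line")" succeeds only when the update steps are required.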

Replace hdm3 (resourcemanager, DCA version 2.0.1.0)
1. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N hdm3-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
2. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
3. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

4. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 2 (press the F key when prompted
again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

5. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period (.)
key.
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the replacement
server to boot from hard drive:
# ipmiutil reset -h -N hdm3-sp -U root -P sephiroth
Change the hostname shown in bold above to the hostname of the server you
replaced.

c. Once the operating system is loaded, issue the following command to change the
boot order on the server. For example, on hdm3:
# ssh hdm3
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh hdm3
# syscfg /bbosys

6. Issue the following command to check the health of the replacement server:
# dcacheck -h hdm3

Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware version on the replacement server. Replace the hostname in the
command with the hostname of the replacement server:
# ssh hdm3 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As a root user copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log into the server as root.
j. SCP the MR56p.rom file from the master to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
8. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

10. Connect to hdm1:
# ssh hdm1

11. Issue the following command to view the status of the Hadoop cluster:
# dca_hadoop --status all

The status of the resourcemanager and zookeeper modules on hdm3 should be:
module resourcemanager(service hadoop-resourcemanager) has error on host hdm3
module zookeeper(service zookeeper-server) has error on host hdm3

12. Start the resourcemanager and zookeeper services:
# dca_hadoop --start resourcemanager
# dca_hadoop --start zookeeper

13. Verify that all services are shown as started:
# dca_hadoop --status all

14. Switch to the user hdfs:
# su - hdfs

15. Verify that the filesystem is healthy:
$ hadoop fsck /

The following message indicates that the filesystem is healthy:
The filesystem under path ‘/’ is HEALTHY
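
The final health check can be automated by searching the fsck output for the HEALTHY status line. The following is a minimal sketch: the function name fsck_is_healthy is hypothetical, and the sketch assumes the status line ends with "is HEALTHY" as shown in this procedure.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop fsck /` output on stdin and return 0 only
# if it contains the "is HEALTHY" status line shown in this guide.
fsck_is_healthy() {
    grep -q "is HEALTHY"
}
```

As the user hdfs, it could be used as: hadoop fsck / | fsck_is_healthy && echo "HDFS is healthy"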

Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0)
1. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement server:
# ipmiutil sol -a -e -N hdm4-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
2. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
3. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

4. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 2 (press the F key when prompted
again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

5. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period (.)
key.

Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the replacement
server to boot from hard drive:
# ipmiutil reset -h -N hdm4-sp -U root -P sephiroth
Change the hostname shown in bold above to the hostname of the server you
replaced.
c. Once the operating system is loaded, issue the following command to change the
boot order on the server. For example, on hdm4:
# ssh hdm4
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order:
# ssh hdm4
# syscfg /bbosys

6. Issue the following command to check the health of the replacement server:
# dcacheck -h hdm4

Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware version on the replacement server. Replace the hostname in the
command with the hostname of the replacement server:
# ssh hdm4 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance

g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As a root user copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log into the server as root.
j. SCP the MR56p.rom file from the master to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
8. Synchronize the system clock:
a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

10. Connect to hdm1:
# ssh hdm1

11. Issue the following command to view the status of the Hadoop cluster:
# dca_hadoop --status all

The status of the zookeeper, hive, and hive-metastore modules on hdm4
should be:
module hive-server(service hive-server) is stopped on host hdm4
module hive-server(service hive-metastore) is stopped on host hdm4
module zookeeper(service zookeeper-server) is stopped on host hdm4

12. Start the stopped services:
# dca_hadoop --start zookeeper
# dca_hadoop --start hive-server

13. Verify that all services are shown as started:
# dca_hadoop --status all

14. Switch to the user hdfs:
# su - hdfs

15. Verify that the filesystem is healthy:
$ hadoop fsck /

The following message indicates that the filesystem is healthy:
The filesystem under path ‘/’ is HEALTHY
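
The status lines in steps 11 through 13 can be filtered so that only the services needing attention on the replaced host are listed. The following is a minimal sketch: the function name hadoop_problem_services is hypothetical, and the sketch assumes the only problem formats are the "is stopped on host" and "has error on host" lines shown in this guide. The format of lines for running services is not shown here, so such lines are simply ignored.

```shell
#!/bin/sh
# Hypothetical helper: read `dca_hadoop --status all` output on stdin and print
# "<service> <state>" for every line that reports "is stopped" or "has error"
# on the host given as the first argument.
hadoop_problem_services() {
    host="$1"
    awk -v host="$host" '
        $1 == "module" && $NF == host && /(is stopped|has error) on host/ {
            svc = $0
            sub(/^.*\(service /, "", svc)   # drop everything up to "(service "
            sub(/\).*$/, "", svc)           # drop ")" and the rest of the line
            state = ($0 ~ /is stopped/) ? "stopped" : "error"
            print svc, state
        }'
}
```

On hdm1 it could be used as: dca_hadoop --status all | hadoop_problem_services hdm4. An empty result after step 12 suggests that no service on that host is reported as stopped or in error.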

Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0)
1. From the Primary Master server issue the following command to open a console
session on the replacement server. Replace the hostname shown in bold with the
hostname of the replacement Hadoop Worker server.
# ipmiutil sol -a -e -N hdw1-sp -U root -P sephiroth


You will need to press the F key within 15 seconds after seeing this WARNING message:
Foreign configuration(s) found on adapter
Press any key to continue or ‘C’ load the configuration
utility, or ‘F’ to import foreign configuration(s) and
continue.
2. Power on the replacement server by pressing the power button on the front panel, and
press the F key when prompted.
3. When the following message displays, disregard and press the space bar:
All of the disks from your previous configuration are gone. If this
is an unexpected message, then please power off your system and
check your cables to ensure all disks are present. Press any key to
continue, or 'C' to load the configuration utility.

4. If the message below appears it indicates that the server did not accept the “F” key
request per the above WARNING. This means that you will need to power off the server,
verify that all LED lights are off, and go back to step 2 (press the F key when prompted
again):
CLIENT MAC ADDR: 00 1E 67 4D C5 1D GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D
DHCP....\

5. Monitor the boot process onscreen and verify that the replacement server boots from
hard disk. If it does not, do the following to force it to boot from hard disk:
a. Exit the BMC console utility by pressing the tilde key (~) and then the period (.)
key.
Note: When you exit the BMC console you are returned to your connection on the
Primary Master server as the user root.
b. Issue the following command from the Primary Master server to force the
replacement Hadoop Worker server to boot from hard drive. Change the hostname
shown in bold to the hostname of the server you replaced:
# ipmiutil reset -h -N hdw1-sp -U root -P sephiroth
c. Once the operating system is loaded, issue the following command to change the
boot order on the server. Change the hostname shown in bold to the hostname of
the server you replaced:
# ssh hdw1
# syscfg /bbo "emcbios" HDD NW

d. Reboot the system:
# reboot

e. Following the reboot, issue the following commands to connect to the replacement
server and verify the boot order. Change the hostname shown in bold to the
hostname of the server you replaced:
# ssh hdw1
# syscfg /bbosys

6. Issue the following command to check the health of the replacement server. Change
the hostname shown in bold to the hostname of the server you replaced:
# dcacheck -h hdw1

Verify that no errors display.
7. Exchange SSH keys on the replacement server using the DCA Setup utility:
a. Start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 6 to Generate SSH Keys.
d. Enter X to exit the DCA Setup utility.
e. Check the firmware version on the replacement server. Replace the hostname in the
command with the hostname of the replacement server:
# ssh hdw1 /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW
Package Build: 23.9.0-0026, your firmware needs updating. Follow steps f
through n below to update the firmware.
f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to
your laptop:
https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Appliance
g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:
unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As a root user copy the MR56p.rom file to the Master server (mdw) and place the
file in /root. You can use WinSCP or a similar utility.
Note: You may be required to provide a login to the destination server.
i. For each server in need of an update, log into the server as root.
j. SCP the MR56p.rom file from the master to the server you are updating.
k. Install the new firmware using the following command:
Note: This will take longer on 24-disk servers.
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.
# reboot

m. When the server reboots, check the new firmware version:
# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been
updated on this server:
FW Package Build: 23.12.0-0013

n. Repeat these alphabetic steps to check/update the remaining servers in the
cluster.
8. Synchronize the system clock:

a. Select option 2 for Modify DCA Settings.
b. Select option 5 for Modify NTP/Clock Configuration Options.
c. Select option 3 for Synchronize clocks across the cluster to the
NTP server.
d. Enter X to exit the DCA Setup utility.
9. Re-enable health monitoring by restarting the healthmon daemon:
# dca_healthmon_ctl -e

10. Connect to hdm1:
# ssh hdm1

11. Issue the following command to view the status of the Hadoop cluster:
# dca_hadoop --status all

The status of the datanode and nodemanager modules on all hdw’s should be:
module hive-server(service hadoop-datanode) is stopped on host hdw#
module hive-server(service hadoop-nodemanager) is stopped on host hdw#

12. Start the stopped datanode and nodemanager services:
# dca_hadoop --start datanode
# dca_hadoop --start nodemanager

13. Verify that all services are shown as started:
# dca_hadoop --status all

14. Switch to the user hdfs:
# su - hdfs

15. Verify that the filesystem is healthy:
$ hadoop fsck /

The following message indicates that the filesystem is healthy:
The filesystem under path ‘/’ is HEALTHY
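
Because this procedure applies to any Hadoop Worker server, the hostname substitution in steps 1 and 5b can be made explicit. The following is a minimal sketch: the function name print_ipmi_commands and the IPMI_PASSWORD placeholder are hypothetical, and the "-sp" suffix is taken from the examples in this guide (hdw1 becomes hdw1-sp). The function only prints the commands for review and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the console and reset commands from
# steps 1 and 5b for the server hostname given as the first argument.
# IPMI_PASSWORD is a placeholder, so no password is embedded in the output.
print_ipmi_commands() {
    host="$1"
    sp="${host}-sp"    # service processor name, following the examples in this guide
    printf 'ipmiutil sol -a -e -N %s -U root -P "$IPMI_PASSWORD"\n' "$sp"
    printf 'ipmiutil reset -h -N %s -U root -P "$IPMI_PASSWORD"\n' "$sp"
}
```

For example, print_ipmi_commands hdw3 prints both commands with hdw3-sp as the target.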

CHAPTER 4
Replace a Disk Drive
This chapter describes how to replace a failed drive in a Master, Segment, DIA, or Hadoop
server. It includes the following major sections:





• Hot spare drives and the Copyback operation
• Replace a disk drive in a Master, DIA, or Hadoop Compute server
• Replace a drive in a Segment Server
• Replace a drive in an Hadoop server

Hot spare drives and the Copyback operation
(Does not apply to drives in an Hadoop Worker server) When a drive fails, the RAID
controller begins the rebuild process and writes data to the hot spare disk drive in the
server. A slowly blinking amber LED on the hot spare drive indicates that the drive has
been invoked as the rebuild drive. You must allow the rebuild process to complete on the
hot spare before you remove the failed drive and replace it with a replacement drive.
When the rebuild process is finished and you replace the failed drive with a replacement
drive, data is copied automatically from the hot spare drive to the replacement drive in a
process called the Copyback operation. When the Copyback operation is complete, the hot
spare drive ends its role as the rebuild drive and resumes its original role as the hot spare
drive. Returning the hot spare to its original role ensures that the hot spare drive always
occupies the same slot in the server. Hot spare locations are shown in Table 4 below.
Table 4 Hot spare drive locations per server type

Server type                               Hot spare drive location(s)
Master, DIA, or Hadoop Compute server,    Slot 5 (see Figure 12 on page 88)
8 disk slots (slots 6 and 7 are empty)
GPDB server, 24 disk slots                Slot 11 and Slot 23 (see Figure 15 on page 92)
Hadoop Master server, 12 disk slots       Slot 11 (see Figure 18 on page 98)

Note: Hadoop Worker servers do not have a hot spare drive.

The Copyback operation runs in the background. During the operation the virtual drive is
still available online to the host.
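
The slot assignments in Table 4 can be captured in a small lookup for use in service scripts. The following is a minimal sketch: the function name hot_spare_slots and the server type keywords (master, dia, hadoop-compute, segment, hadoop-master, hadoop-worker) are illustrative labels, not names used by the DCA software.

```shell
#!/bin/sh
# Hypothetical lookup based on Table 4: print the hot spare slot number(s)
# for a server type, or nothing for a server type without a hot spare.
hot_spare_slots() {
    case "$1" in
        master|dia|hadoop-compute) echo "5" ;;       # 8 disk slots
        segment)                   echo "11 23" ;;   # GPDB server, 24 disk slots
        hadoop-master)             echo "11" ;;      # 12 disk slots
        hadoop-worker)             echo "" ;;        # no hot spare drive
        *) echo "unknown server type: $1" >&2; return 1 ;;
    esac
}
```

For example, hot_spare_slots segment prints the two hot spare slots of a 24-disk GPDB server.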

Replace a disk drive in a Master, DIA, or Hadoop Compute server
All drives are installed at the front of the server and connect to the system board through
the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that
fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this
procedure also describes how to do the following:


• Determine if the RAID group is still rebuilding and how to monitor the rebuild process.
• Verify that the Copyback operation is in progress and how to monitor it.
• Manually initiate the Copyback operation if necessary.

1. Connect your service laptop to the DCA and log in to the Primary Master as the user
root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
3. Locate the failed drive.
Note: In this procedure, Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive within
it. A failed drive is indicated by an amber LED or no LED. (A drive in the rebuild
process also displays an amber LED.)

If the dial-home information includes a drive number, locate the drive with the
help of the following illustration:
[Figure: front view of the drive slots. Disk 0 through Disk 4 hold data drives, Disk 5 is
the hot spare drive, and Disk 6 and Disk 7 are empty.]

Figure 12 Master and DIA server drive slot numbering

Verify the state of the RAID group rebuild process.

4. Before you remove the faulted drive, read the topic “Hot spare drives and the
Copyback operation” on page 86. Then issue the following command to determine
whether the RAID group is still being rebuilt:

# CmdTool2 -PDList -aALL | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 252
Slot Number: 1
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 2
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 3
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 4
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 5
Enclosure position: 0
Firmware state: Rebuild

In the example output above, note that the rebuild process is still in progress.
• If Rebuild appears anywhere in the output, the rebuild process is in progress. Do
not remove the faulted drive yet. Monitor the rebuild process as described in step 5.
• If all drives in the output are shown as Online, Spun Up, the rebuild process is
complete. Proceed to removing the failed drive as described in step 6.

Monitor the rebuild process.

5. To monitor the rebuild process if it is in progress, issue the following command.
Change the values in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[252:5] -a0

The values in the above example refer to the following parameters:
• 252 refers to the Enclosure Device ID.
• 5 refers to the Slot Number of the hotspare drive invoked as the rebuild drive.
• 0 refers to the Adapter Number.

6. If the rebuild is complete, remove the failed drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.

Figure 13 Removing a drive from a Master Server

IMPORTANT
Make sure that adjacent drive carriers are fully installed and locked in place before
you remove or replace a drive carrier. Replacing a drive carrier and attempting to lock
its handle when the adjacent drive is only partially-installed can damage the drives.
7. To replace a drive in the server:
a. Press the button on the front of the drive carrier and open the handle.
b. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
c. Close the handle to lock the drive carrier in place. The LED on the drive turns green
and blinks while it automatically starts the Copyback operation.

Figure 14 Replacing a drive in a Master Server

8. The Copyback operation should begin automatically when you insert the replacement
drive. (For details about Copyback, see “Hot spare drives and the Copyback operation”
on page 86.) Wait for the Copyback operation to complete. The hot spare always
occupies slot 5 on a Master server (see Figure 12 on page 88).

Verify that the Copyback operation is in progress.

9. Issue the following command to verify that the Copyback operation is in progress:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 252
Slot Number: 0
Enclosure position: 0
Firmware state: Copyback
Enclosure Device ID: 252
Slot Number: 1
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 2
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 3
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 4
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 252
Slot Number: 5
Enclosure position: 0
Firmware state: Online, Spun Up

In the example output above, note that the firmware state of the drive in
Slot Number 0 is shown as Copyback. This indicates that the copyback operation is
in progress and that data is being restored to the new drive in Slot 0.
If no firmware states are shown as Copyback (for example, if the firmware states are
shown as Online, Spun Up or Hotspare, Spun Up), the Copyback operation
is complete.

Monitor the Copyback
operation.

10. To monitor the progress of the Copyback operation, issue the following command.
Change the values shown in bold below to the actual values from your output.
# CmdTool2 -pdcpybk -progdsply -PhysDrv[252:0] -a0

The values in the above example refer to the following parameters:
• 252 refers to the Enclosure Device ID.
• 0 refers to the Slot Number of the replacement drive that is receiving the Copyback data.
• 0 refers to the Adapter Number.
The following is an example output of the Copyback progress.
Copyback Progress of Physical Drive...
Enclosure:Slot    Percent Complete                         Time Elps
252 :00           ##############*********29 %***********   00:10:38
Press <ESC> key to quit...

If necessary, manually initiate the Copyback operation.

If the firmware state is reported as Unconfigured(Good), then the Copyback
operation did not occur automatically. In this unlikely event, you must initiate Copyback
manually by issuing the following command:

# CmdTool2 -pdcpybk -start -PhysDrv[252:5,252:0] -a0

View the Copyback progress as described in step 10.
If no firmware states are shown as Copyback, the Copyback operation is complete.
11. Issue the following command to list the firmware state of each drive:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

12. In the output verify that the firmware state of the drives is reported as follows:
• Drives 0 - 4: Online, Spun Up
• Drive 5: Hotspare, Spun Up
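
The checks in steps 4, 5, 9, and 10 all read the same filtered drive list, so the Enclosure Device ID, Slot Number, and Adapter Number can be extracted mechanically. The following is a minimal sketch: the function names pd_states and pd_progress_commands are hypothetical, and the sketch assumes the output has the line layout shown in the examples of this procedure. It only prints the monitoring commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: read the filtered `CmdTool2 -pdlist -aall | egrep ...`
# output on stdin and print one line per drive: adapter enclosure slot state
pd_states() {
    awk '
        /^[ \t]*Adapter #/            { adapter = substr($2, 2) }   # strip the leading "#"
        /^[ \t]*Enclosure Device ID:/ { encl = $NF }
        /^[ \t]*Slot Number:/         { slot = $NF }
        /^[ \t]*Firmware state:/      {
            state = $0
            sub(/^[ \t]*Firmware state:[ \t]*/, "", state)
            print adapter, encl, slot, state
        }'
}

# Hypothetical helper: print the progress command from step 5 or step 10 for
# every drive whose firmware state is Rebuild or Copyback.
pd_progress_commands() {
    pd_states | while read -r adapter encl slot state; do
        case "$state" in
            Rebuild*)  echo "CmdTool2 -pdrbld -progdsply -PhysDrv[$encl:$slot] -a$adapter" ;;
            Copyback*) echo "CmdTool2 -pdcpybk -progdsply -PhysDrv[$encl:$slot] -a$adapter" ;;
        esac
    done
}
```

If pd_progress_commands prints nothing, no drive in the list is in the Rebuild or Copyback state.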

Replace a drive in a Segment Server
All drives are installed at the front of the server and connect to the system board through
the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that
fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this
procedure also describes how to do the following:


• Determine if the RAID group is still rebuilding and how to monitor the rebuild process.
• Verify that the Copyback operation is in progress and how to monitor it.
• Manually initiate the Copyback operation if necessary.

1. Connect your service laptop to the DCA and log in to the Primary Master as the user
root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.

Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
g. Locate the failed drive.
Note: In this procedure, Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive
within it. A failed drive is indicated by a solid (unblinking) amber LED.
If the dial-home information includes a drive number, locate the drive with the
help of the following illustration:

[Illustration: front view of the server with drive slots numbered 0 through 23; slots 11 and 23 hold the hot spare drives.]

Figure 15 Segment server drive slot numbering

Verify the state of the RAID group rebuild process.

3. Before you remove the faulted drive, read the topic “Hot spare drives and the
Copyback operation” on page 86.
4. Issue the following command to determine whether the RAID group is still being
rebuilt:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.


Note from the output that 24-disk GPDB servers have two Adapters (#0 and #1) that
control 12 slots each. Make sure that you investigate the correct area of the output for the
drive that you are replacing.
Adapter #0
Enclosure Device ID: 28
Slot Number: 1
Enclosure position: 0
Firmware state: Online, Spun Up
. . .
Enclosure Device ID: 28
Slot Number: 10
Enclosure position: 0
Firmware state: Online, Spun Up

Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Rebuild

Adapter #1
Enclosure Device ID: 13
Slot Number: 0
Enclosure position: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 13
. . .
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0
Firmware state: Hotspare, Spun Up

In the example output above, note that the rebuild is still in progress.
• If Rebuild appears anywhere in the output, the rebuild is in progress. Do not
remove the faulted drive yet. Monitor the rebuild process as described in step 5.
• If all drives in the output are shown as Online, Spun Up the rebuild is
complete. Proceed to removing the failed drive as described in step 6.
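
The check in step 4 can also be scripted. The following is a minimal sketch and not part of the DCA software: the function name rebuild_in_progress is hypothetical, and the sketch assumes that the filtered output uses the same line format as the example output above.

```shell
#!/bin/sh
# Hypothetical helper: reads the filtered "CmdTool2 -pdlist" output on stdin
# and returns exit status 0 if any drive reports the Rebuild firmware state.
rebuild_in_progress() {
    grep -Eq '^[[:space:]]*Firmware state: Rebuild'
}

# Example use on a server (commented out so the sketch has no side effects):
# if CmdTool2 -pdlist -aall | rebuild_in_progress; then
#     echo "Rebuild in progress: do not remove the faulted drive yet."
# else
#     echo "No drive reports Rebuild."
# fi
```

The function only reads text, so it can be tried against saved command output before it is used on a live server.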

Monitor the rebuild process.

5. To monitor the rebuild process, issue the following command. Change the values
shown in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[28:11] -a0

IMPORTANT
Remember that GPDB servers have two Adapters (#0 and #1) that control 12 slots
each. Make sure to specify the correct adapter number, slot number, and enclosure
device ID when issuing the above command.
The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 11 refers to the Slot Number of the hotspare drive invoked as the rebuild drive.
• 0 refers to the Adapter Number.
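
To illustrate how the three values fit into the command, the following sketch assembles the command line from an Enclosure Device ID, a Slot Number, and an Adapter Number. The function name rebuild_progress_cmd is hypothetical. The function only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: prints the rebuild-progress command for one drive.
# Arguments: Enclosure Device ID, Slot Number, Adapter Number.
rebuild_progress_cmd() {
    enclosure=$1
    slot=$2
    adapter=$3
    printf 'CmdTool2 -pdrbld -progdsply -PhysDrv[%s:%s] -a%s\n' \
        "$enclosure" "$slot" "$adapter"
}

# Values from the example in step 5 (enclosure 28, slot 11, adapter 0).
rebuild_progress_cmd 28 11 0
```

Printing the command first makes it easy to confirm the adapter, enclosure, and slot against the pdlist output before issuing it.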
6. If the rebuild is complete, remove the failed drive carrier from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.

c. Pull the drive carrier out of the server.

Figure 16 Removing a drive from a Segment server

7. Make sure that the capacity of the replacement drive matches the capacity of the
failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT
Do not mix drives of different capacities within a server.
8. To replace a drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green
and blinks while it automatically starts the Copyback operation.

Figure 17 Replacing a drive in a Segment server

9. The Copyback operation should begin automatically when you insert the replacement
drive. (For details about Copyback, see “Hot spare drives and the Copyback operation”
on page 86.) Wait for the Copyback operation to complete. The hot spare drives always
occupy slot 11 and slot 23 in a 24-disk segment server (see Figure 15 on page 92).

Verify that the Copyback operation is in progress.

10. Issue the following command to verify that the Copyback operation is in progress:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 28
Slot Number: 0
Enclosure position: 0
Firmware state: Copyback
. . .
Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Online, Spun Up

Adapter #1
Enclosure Device ID: 13
Slot Number: 0
Enclosure position: 0
Firmware state: Online, Spun Up
. . .
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0
Firmware state: Hotspare, Spun Up

In the example output above, note that the firmware state of the drive in Adapter #0,
Slot Number 0 is shown as Copyback. This indicates that the copyback operation is
in progress and that data is being restored to the new drive in Slot 0.
If no firmware states are shown as Copyback (for example, the firmware states are
Online, Spun Up or Hotspare, Spun Up), the Copyback operation is complete.
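
On a 24-disk server the output spans two adapters, so it can help to list only the drives in a given state together with their adapter, enclosure, and slot. The following is a minimal sketch under the assumption that the output uses the line format shown above; the function name drives_in_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads filtered pdlist output on stdin and prints
# "a<adapter> <enclosure>:<slot>" for every drive whose firmware state
# begins with the word given as the first argument (for example Copyback).
drives_in_state() {
    awk -v want="$1" '
        /^Adapter #/            { adapter = substr($2, 2) }
        /^Enclosure Device ID:/ { enclosure = $4 }
        /^Slot Number:/         { slot = $3 }
        /^Firmware state:/ {
            state = $0
            sub(/^Firmware state:[ ]*/, "", state)
            if (index(state, want) == 1)
                print "a" adapter, enclosure ":" slot
        }
    '
}

# Example use on a server:
# CmdTool2 -pdlist -aall | drives_in_state Copyback
```

An empty result means that no drive reports the requested state. The printed values are the ones needed for the -PhysDrv[] and -a arguments in the next step.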

Monitor the Copyback operation.

11. To view the Copyback progress, issue the following command. Change the values
shown in bold below to the actual values from your output.
# CmdTool2 -pdcpybk -progdsply -PhysDrv[28:0] -a0

The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 0 refers to the Slot Number of the replacement drive, which is the Copyback target.
• 0 refers to the Adapter Number.
The following is an example output of the Copyback progress.
Copyback Progress of Physical Drive...
Enclosure:Slot    Percent Complete                          Time Elps
28 :00            ##############*********29 %***********    00:10:38
Press <ESC> key to quit...


If necessary, manually initiate the Copyback operation.

If the firmware state is reported as Unconfigured(Good) then the Copyback
operation did not occur automatically. In this unlikely event, you must initiate Copyback
manually by issuing the following command:
# CmdTool2 -pdcpybk -start -PhysDrv[28:11,28:0] -a0

View the Copyback progress as described in step 11.
If no firmware states are shown as Copyback, the Copyback operation is complete.
12. Issue the following command:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

13. In the output verify that the firmware state of the drives is reported as follows:
• Drives 0 - 10 and 12 - 22: Online, Spun Up
• Drives 11 and 23: Hotspare, Spun Up
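
The final verification can be sketched as a script. This is an illustration only, with two assumptions: drives 0 through 11 are reported as slots 0 through 11 on Adapter #0 and drives 12 through 23 as slots 0 through 11 on Adapter #1 (as the example output in this procedure suggests), and the state strings match the example output exactly. The function name check_final_states is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads filtered pdlist output on stdin. For each
# adapter, slot 11 is expected to be "Hotspare, Spun Up" and every other
# slot "Online, Spun Up". Prints each unexpected drive and returns a
# non-zero exit status if any is found.
check_final_states() {
    awk '
        /^Slot Number:/ { slot = $3 }
        /^Firmware state:/ {
            state = $0
            sub(/^Firmware state:[ ]*/, "", state)
            expected = (slot == 11) ? "Hotspare, Spun Up" : "Online, Spun Up"
            if (state != expected) {
                bad = 1
                print "slot " slot ": " state
            }
        }
        END { exit bad + 0 }
    '
}

# Example use on a server:
# CmdTool2 -pdlist -aall | check_final_states && echo "All drives as expected."
```

Any line that the function prints identifies a slot to investigate before you close the service call.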

Replace a drive in a Hadoop server
This section describes how to replace a drive in a Hadoop server in DCA version 2.0.1.0 or
later. For details on replacing a drive in a Hadoop server in DCA version 2.0.0.0, see the
EMC Data Computing Appliance Maintenance Guide for 2.0.0.0, Rev A02.


Hadoop-related procedures were not available at the time of this document’s publication.
The document will be updated in the short term and re-released. Until that time, please
contact platform-eng-support@gopivotal.com for Hadoop-related service questions.
The replacement procedure you use to replace a drive in a Hadoop server depends on the
type of Hadoop server it is, and—in the case of a Hadoop Worker—whether the faulted
drive is a System drive or a Data drive.


• Hadoop Master servers: Drives are part of a RAID 5 configuration which can be rebuilt
  automatically by the server’s RAID controller. For instructions, see “Replace a drive in
  a Hadoop Master server” on page 97.
• Hadoop Worker servers: The server has two different RAID configurations:
  • System disks 0 – 1 are configured as RAID 1. If one of the System disks fails, the
    RAID is rebuilt automatically by the server’s RAID controller after the replacement
    drive is inserted. For instructions, see “Replace a System Disk (0 through 1)” on
    page 101.
  • Data disks 2 – 11 are each configured as a single-disk RAID 0. You must recover data
    through the Hadoop filesystem after you replace the drive. For instructions, see
    “Replace a failed Data Disk (2 through 11)” on page 104.

Replace a drive in a Hadoop Master server
All drives are installed at the front of the server and connect to the system board through
the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that
fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this
procedure also describes how to do the following:


• Determine if the RAID group is still being rebuilt and how to monitor the rebuild
  process.
• Verify that the Copyback operation is in progress and how to monitor the Copyback
  operation.
• Manually initiate the Copyback operation if necessary.

1. Connect your service laptop to the DCA and log in to the Primary Master as the user
root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
g. Locate the failed drive.
Note: In this procedure, Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive
within it. A faulted drive is indicated by a solid (unblinking) amber LED.

If the dial-home information includes a drive number, locate the drive with the
help of the following illustration:
Disk 2    Disk 5    Disk 8    Disk 11 (hot spare drive, Hadoop Masters only)
Disk 1    Disk 4    Disk 7    Disk 10
Disk 0    Disk 3    Disk 6    Disk 9

Figure 18 Hadoop Master server drive slot numbering

Verify the state of the RAID group rebuild process.

3. Before you remove the faulted drive, read the topic “Hot spare drives and the
Copyback operation” on page 86.
4. Issue the following command to determine whether the RAID group is still being
rebuilt:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 28
Slot Number: 1
Enclosure position: 0
Firmware state: Online, Spun Up
. . .

Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Rebuild
In the example output above, note that the rebuild is still in progress.
• If Rebuild appears anywhere in the output, the rebuild is in progress. Do not
remove the faulted drive yet. Monitor the rebuild process as described in step 5.
• If all drives in the output are shown as Online, Spun Up the rebuild is
complete. Proceed to removing the failed drive as described in step 6.

Monitor the rebuild process.

5. To monitor the rebuild process, issue the following command. Change the values
shown in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[28:11] -a0

The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 11 refers to the Slot Number of the hotspare drive invoked as the rebuild drive.
• 0 refers to the Adapter Number.
6. If the rebuild is complete, remove the failed drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.

Figure 19 Removing a drive from a Hadoop Master server

7. Make sure that the capacity of the replacement drive matches the capacity of the
failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT
Do not mix drives of different capacities within a server.
8. To replace a drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green
and blinks while it automatically starts the Copyback operation.

Figure 20 Replacing a drive in a Hadoop Master server

9. The Copyback operation should begin automatically when you insert the replacement
drive. (For details about Copyback, see “Hot spare drives and the Copyback operation”
on page 86.) Wait for the Copyback operation to complete. The hot spare always
occupies Slot 11 on a Hadoop Master server (see Figure 18 on page 98).

Verify that the Copyback operation is in progress.

10. Issue the following command to verify that the Copyback operation is in progress:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 28
Slot Number: 0
Enclosure position: 0
Firmware state: Copyback
. . .
Enclosure Device ID: 28
Slot Number: 11
Enclosure position: 0
Firmware state: Online, Spun Up

In the example output above, note that the firmware state of the drive in
Slot Number 0 is shown as Copyback. This indicates that the copyback operation is
in progress and that data is being restored to the new drive in Slot 0.
If no firmware states are shown as Copyback (for example, the firmware states are
Online, Spun Up or Hotspare, Spun Up), the Copyback operation is complete.

Monitor the Copyback operation.

11. To view the Copyback progress, issue the following command. Change the values
shown in bold below to the actual values from your output.
# CmdTool2 -pdcpybk -progdsply -PhysDrv[28:0] -a0

The values in the above example refer to the following parameters:
• 28 refers to the Enclosure Device ID.
• 0 refers to the Slot Number of the replacement drive, which is the Copyback target.
• 0 refers to the Adapter Number.
The following is an example output of the Copyback progress.
Copyback Progress of Physical Drive...
Enclosure:Slot    Percent Complete                          Time Elps
28 :00            ##############*********29 %***********    00:10:38
Press <ESC> key to quit...


If necessary, manually initiate the Copyback operation.

If the firmware state is reported as Unconfigured(Good) then the Copyback operation
did not occur automatically. In this unlikely event, you must initiate Copyback manually by
issuing the following command:
# CmdTool2 -pdcpybk -start -PhysDrv[28:11,28:0] -a0

View the Copyback progress as described in step 11.
If no firmware states are shown as Copyback, the Copyback operation is complete.
12. Issue the following command:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

13. In the output verify that the firmware state of the drives is reported as follows:
• Drives 0 - 10: Online, Spun Up
• Drive 11: Hotspare, Spun Up

Replace a drive in a Hadoop Worker server
There are two types of RAID configurations in a Hadoop Worker server:
• System disks 0 – 1 are configured as RAID 1. If one of the System disks fails, the
RAID is rebuilt automatically by the server’s RAID controller after the replacement
drive is inserted. For instructions, see “Replace a System Disk (0 through 1)” on
page 101.
• Data disks 2 – 11 are each configured as individual RAID 0 disks. If a drive fails,
data must be recovered through the Hadoop filesystem after you replace the drive.
For instructions, see “Replace a failed Data Disk (2 through 11)” on page 104.
Note: Unlike other server types, Hadoop Workers do not have a hot spare drive.

[Illustration: front view of a Hadoop Worker server. Physical disks 0:0:0 and 0:0:1 are System Disks; physical disks 0:0:2 through 0:0:11 are Data disks.]

Figure 21 Hadoop Worker server drive types and locations

Replace a System Disk (0 through 1)
All drives are installed at the front of the server and connect to the system board through
the backplane. Hard drives are supplied in special hot-swappable hard-drive carriers that
fit in the hard-drive slots.
In addition to describing how to physically remove and insert the disk drive, this
procedure also describes how to determine if the RAID group is still rebuilding and how to
monitor the rebuild process.
1. Connect your service laptop to the DCA and log in to the Primary Master as the user
root (see “Connect a workstation to the DCA” on page 176).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.

f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
g. Locate the failed drive.
Note: In this procedure, System Disk 0 is the failed drive.
LED indicators on each drive carrier indicate the current status of the drive
within it. A failed drive is indicated by a solid (unblinking) amber LED.
If the dial-home information includes a drive number, locate the drive with the
help of the following illustration:
[Illustration: front view of a Hadoop Worker server showing physical disks 0:0:0 through 0:0:11. Physical disks 0:0:0 and 0:0:1 are System Disks, the remaining disks are Data disks, and the failed disk in this example is marked.]

Figure 22 Hadoop Worker server drive slot numbering

3. Remove the failed drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.

Figure 23 Removing a drive from a Hadoop Master server

4. Make sure that the capacity of the replacement drive matches the capacity of the
failed drive. The drive capacity is printed on the label on each drive.

IMPORTANT
Do not mix drives of different capacities within a server.
5. To replace a drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green
and blinks while it automatically starts the Copyback operation.

Figure 24 Replacing a drive in a Hadoop Master server

After the replacement drive is inserted the RAID controller automatically begins the rebuild
process and writes data to the replacement drive, as indicated by a slowly blinking amber
LED on the drive. Do not interrupt the rebuild process.

Verify the state of the RAID group rebuild process.

6. To determine whether the RAID group is still being rebuilt, issue the following
command:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

Example output is shown below. Focus on the items in bold.
Adapter #0
Enclosure Device ID: 21
Slot Number: 0
Enclosure position: 0
Firmware state: Rebuild

In the example output above, note that the firmware state indicates that the rebuild is
still in progress.
• If the firmware state of the replacement drive is shown as Rebuild, the rebuild is
in progress.
• If the firmware state of the replacement drive is shown as Online, Spun Up the
rebuild is complete.

Monitor the rebuild process.

7. To monitor the rebuild process, issue the following command. Change the values
shown in bold below to the actual values from your output. For example:
# CmdTool2 -pdrbld -progdsply -PhysDrv[21:0] -a0

The values in the above example refer to the following parameters:
• 21 refers to the Enclosure Device ID.
• 0 refers to the slot from which you removed the faulted drive and inserted a
replacement drive.
• -a0 refers to Adapter Number 0.
8. Issue the following command:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state"

9. In the output verify that the firmware state of the drives is reported as:
Online, Spun Up

If necessary, manually initiate the rebuild process.


If the firmware state is reported as Unconfigured(Good) then the rebuild did not
occur automatically. In this unlikely event, you must initiate the rebuild manually by
issuing the following command:
# CmdTool2 -pdrbld -start -PhysDrv[21:0] -a0

Monitor the rebuild as described in step 7.
If the firmware state of the replacement drive is shown as Online, Spun Up, the
rebuild is complete.
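
Instead of reissuing the status command by hand, the wait can be sketched as a polling loop. This is an illustration only: the function name wait_while_state is hypothetical, the status command is passed in as arguments so that the loop can be tried with any command that prints Firmware state lines, and the 60-second interval in the usage comment is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: wait_while_state STATE INTERVAL COMMAND [ARGS...]
# Runs COMMAND repeatedly and returns once none of its output lines starts
# with "Firmware state: STATE". Sleeps INTERVAL seconds between runs.
wait_while_state() {
    state=$1
    interval=$2
    shift 2
    while "$@" | grep -Eq "^[[:space:]]*Firmware state: $state"; do
        sleep "$interval"
    done
}

# Example use on a server (polls once a minute until no drive is rebuilding):
# wait_while_state Rebuild 60 CmdTool2 -pdlist -aall
```

The loop has no timeout, so interrupt it manually if the state does not change within the time you expect.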

Replace a failed Data Disk (2 through 11)
Data disks 2 through 11 are each configured as individual RAID 0 disks. Because data
from a RAID 0 cannot be recovered automatically by the server’s RAID controller, you must
recover it through the Hadoop software as described in this procedure.
1. Log in to the Primary Master Server as the user gpadmin (see “Connect to the Master
Server using an SSH client” on page 178).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.
Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
3. Locate the failed data disk (see Figure 22). Make note of the drive position in the
format 0:0:X, where X is the slot number of the faulted data drive.
4. Remove the faulted drive from the server:
a. Press the button on the front of the drive carrier to release the drive handle.
b. Wait 10 seconds to allow the platter in the drive to stop spinning.
c. Pull the drive carrier out of the server.
5. Make sure that the capacity of the replacement drive matches the capacity of the
failed drive. The drive capacity is printed on the label on each drive.
IMPORTANT
Do not mix drives of different capacities within a server.
6. Install the replacement drive carrier in the server:
a. Insert the drive carrier into the drive bay until the carrier contacts the backplane.
b. Close the handle to lock the drive carrier in place. The LED on the drive turns green
and blinks while it automatically starts the Copyback operation.
7. Log in to the Primary Master Server as the user root (see “Connect to the Master
Server using an SSH client” on page 178).
8. Connect as the user root to the server with the new disk. For example, if you replaced
a disk in hdw1:
$ ssh root@hdw1

Create a new virtual disk

9. Create a new virtual disk on the replacement drive:
a. For the disk that you replaced, determine the Enclosure Device ID of the server and
the Adapter Number:
# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number"

The following output is returned:
Adapter #0
Enclosure Device ID: 13
Slot Number: 0
Enclosure position: 0
. . .
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0

b. Using information from the output above, issue the command from Table 5 that
corresponds to the physical disk that you installed.
For example, to create a virtual disk on physical disk 0:0:11, issue:
# CmdTool2 -CfgLdAdd r0 '[13:11]' -a0

The values in the above command example refer to the following:
• r0 refers to the RAID level (which is always 0 for a Hadoop Worker data disk).
• 13 refers to the Enclosure Device ID.
• 11 refers to the Slot Number of the physical disk that you replaced.
• -a0 refers to the Adapter Number.
Table 5 Virtual disk creation commands per physical disk slot

Physical Disk    Command
0:0:0, 0:0:1     # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 102400 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 65536 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -sz 102400 -a0
                 # CmdTool2 -CfgLdAdd r1 '[13:0,13:1]' -a0
0:0:2            # CmdTool2 -CfgLdAdd r0 '[13:2]' -a0
0:0:3            # CmdTool2 -CfgLdAdd r0 '[13:3]' -a0
0:0:4            # CmdTool2 -CfgLdAdd r0 '[13:4]' -a0
0:0:5            # CmdTool2 -CfgLdAdd r0 '[13:5]' -a0
0:0:6            # CmdTool2 -CfgLdAdd r0 '[13:6]' -a0
0:0:7            # CmdTool2 -CfgLdAdd r0 '[13:7]' -a0
0:0:8            # CmdTool2 -CfgLdAdd r0 '[13:8]' -a0
0:0:9            # CmdTool2 -CfgLdAdd r0 '[13:9]' -a0
0:0:10           # CmdTool2 -CfgLdAdd r0 '[13:10]' -a0
0:0:11           # CmdTool2 -CfgLdAdd r0 '[13:11]' -a0

0:0:11 is used as the example replacement drive throughout this procedure.
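
The data disk rows of Table 5 differ only in the Slot Number, so the command for a given slot can be generated. The following sketch is an illustration only: the function name data_disk_create_cmd is hypothetical, and the Enclosure Device ID and Adapter Number must come from your own output in step 9a (13 and 0 are the example values). The function prints the command without running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the virtual disk creation command for one
# Hadoop Worker data disk.
# Arguments: Enclosure Device ID, Slot Number (2 through 11), Adapter Number.
data_disk_create_cmd() {
    enclosure=$1
    slot=$2
    adapter=$3
    if [ "$slot" -lt 2 ] || [ "$slot" -gt 11 ]; then
        echo "slot $slot is not a data disk slot (2 through 11)" >&2
        return 1
    fi
    printf "CmdTool2 -CfgLdAdd r0 '[%s:%s]' -a%s\n" "$enclosure" "$slot" "$adapter"
}

# Values from the example in this procedure (physical disk 0:0:11).
data_disk_create_cmd 13 11 0
```

Slots 0 and 1 are rejected because the System disks use the RAID 1 commands in the first row of the table.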

Table 6 below matches each physical disk in the server with its corresponding virtual
disk. Note that the virtual disk for physical disk 0:0:11 is 13.
Table 6 Disk attributes in a Hadoop Worker server

Physical Disk    Virtual Disk    Mount (Device name)    Label
0:0:0, 0:0:1     0               /sda1                  /boot
                 1               /sda2                  /
                 2               /sdbswap               swap
                 3               /sdc1                  crash
0:0:2            4               /sde                   /data1
0:0:3            5               /sdf                   /data2
0:0:4            6               /sdg                   /data3
0:0:5            7               /sdh                   /data4
0:0:6            8               /sdi                   /data5
0:0:7            9               /sdj                   /data6
0:0:8            10              /sdk                   /data7
0:0:9            11              /sdl                   /data8
0:0:10           12              /sdm                   /data9
0:0:11           13              /sdn                   /data10

Determine the automatically-assigned device name.

10. Confirm the device name that was assigned to the new virtual disk. When issuing the
following command, replace the values in bold below with the values specific to your
situation:
# CmdTool2 -LDInfo -L13 -a0

The values in the above command example refer to the following:
• 13 refers to the virtual disk created on physical disk 0:0:11.
• 0 refers to the Adapter Number (i.e., the RAID controller).
Virtual Drive: 13 (Target Id: 13)
Name                 :sdn
RAID Level           : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                 : 2.727 TB
Parity Size          : 0
State                : Optimal
Strip Size           : 256 KB
Number Of Drives     : 1
Span Depth           : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy    : Disk's Default
Encryption Type      : None
PI type: No PI
Is VD Cached: No

In the example output above, note the value next to Name. This is the device name
that you will use in step 11 to format a filesystem on the new virtual disk.
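
The Name value can be picked out of the output with a short filter. The following is a minimal sketch that assumes the Name line has the form shown in the example output above; the function name vd_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "CmdTool2 -LDInfo" output on stdin and prints
# the text that follows the colon on the first Name line. Prints an empty
# line if the Name field is present but empty.
vd_name() {
    awk '/^Name[ \t]*:/ {
        sub(/^Name[ \t]*:[ \t]*/, "")
        print
        exit
    }'
}

# Example use on a server:
# CmdTool2 -LDInfo -L13 -a0 | vd_name
```

An empty result corresponds to the case in which the name was not assigned automatically and must be set by hand.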


If the name does not appear in your output as it did in the above example, refer to Table 6
on page 106 and look up the device name value that corresponds to the physical disk that
you replaced. Then issue the following command to set the device name. For example, to
set the virtual disk device name of physical disk 0:0:11 to sdn (as specified in Table 6 on
page 106):
# CmdTool2 -ldsetprop -name sdn -L13 -a0

11. Format and label the filesystem on the new virtual disk. Replace the text in bold with
values from the Label column and the Mount/Device name column in Table 6 that
correspond to the physical disk that you replaced.
# mkfs -t xfs -L /data10 -f /dev/sdn
# mount /data10
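
For the data disks, the values in Table 6 follow a regular pattern, so the virtual disk number, device name, and label for a slot can be derived instead of looked up. The following sketch encodes only the data disk rows of Table 6 (slots 2 through 11); the function name data_disk_attrs is hypothetical. Always confirm the result against the table and against the LDInfo output from step 10 before formatting a device.

```shell
#!/bin/sh
# Hypothetical helper: prints "<virtual disk> <device name> <label>" for a
# Hadoop Worker data disk slot, following the data disk rows of Table 6.
data_disk_attrs() {
    slot=$1
    if [ "$slot" -lt 2 ] || [ "$slot" -gt 11 ]; then
        echo "slot $slot is not a data disk slot (2 through 11)" >&2
        return 1
    fi
    vd=$((slot + 2))                 # slot 2 -> virtual disk 4
    label="/data$((slot - 1))"       # slot 2 -> /data1
    # Device names run from sde (slot 2) to sdn (slot 11).
    dev="sd$(printf '%s' efghijklmn | cut -c "$((slot - 1))")"
    echo "$vd $dev $label"
}

data_disk_attrs 11   # prints: 13 sdn /data10
```

For slot 11 the result matches the values used in the mkfs and mount commands of step 11.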

If necessary, clear the state of the disk


If for any reason you eject and then reseat an existing drive, you may have put the drive
into a foreign state. If so, you must clear the state of the drive.
a. To check the state of the drive in this case, switch to the user root and then issue
the following command:

# CmdTool2 -pdlist -aall | egrep "Adapter|Enclosure|Slot Number|Firmware state|Foreign State"

Example output for a drive with a foreign state would look like this:
Enclosure Device ID: 13
Slot Number: 11
Enclosure position: 0
Firmware state: foreign

b. Examine the output carefully. If the output shows the drive state to be foreign,
clear the state by issuing the following command, making sure to replace the
Enclosure Device and Slot Number shown in bold below with the Enclosure Device
ID and Slot Number given in your output.
# CmdTool2 -PDMakeGood -PhysDrv[13:11] -Force -aALL

c. Then, issue the following command. Replace the values in bold with the Enclosure
Device ID and Slot Number given in your output:
# CmdTool2 -PDClear -Start -PhysDrv[13:11] -a0

d. Wait 5 minutes, then issue the following command, making sure to replace the
values in bold with the Enclosure Device ID and Slot Number given in your output:
# CmdTool2 -PDClear -Stop -PhysDrv[13:11] -a0

Introduce the replacement drive to the Hadoop cluster and restart Hadoop

12. Make the following directories on the replacement drive:
# mkdir /data10/hadoop
# mkdir /data10/hadoop/data
# mkdir /data10/hadoop/local

13. Set the ownership of the new directories that you made in the previous step:
# chown hdfs:hadoop /data10/hadoop/data
# chown mapred:hadoop /data10/hadoop/local

14. Set the permissions on the Hadoop directories to 755:
# chmod 755 /data10/hadoop
# chmod 755 /data10/hadoop/local
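
Steps 12 through 14 can be combined into one function that takes the mount point as a parameter. This is a sketch only: the function name prepare_hadoop_dirs is hypothetical, and the ownership changes are skipped unless the function runs as root on a host where the hdfs and mapred accounts exist, so that it can be tried safely in a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: creates the Hadoop directories under the given mount
# point and applies the ownership and mode settings from steps 13 and 14.
prepare_hadoop_dirs() {
    root=$1
    mkdir -p "$root/hadoop/data" "$root/hadoop/local" || return 1
    # Ownership changes need root and the hdfs and mapred accounts;
    # they are skipped otherwise.
    if [ "$(id -u)" -eq 0 ] && id hdfs >/dev/null 2>&1 && id mapred >/dev/null 2>&1; then
        chown hdfs:hadoop "$root/hadoop/data"
        chown mapred:hadoop "$root/hadoop/local"
    fi
    chmod 755 "$root/hadoop" "$root/hadoop/local"
}

# Example use on the Hadoop Worker, as the user root:
# prepare_hadoop_dirs /data10
```

Passing the mount point as an argument avoids retyping the path in each command, which is where typing errors in the label are most likely.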

15. Restart Hadoop on the recovered Hadoop Worker node:
# service hadoop-datanode restart
# service hadoop-tasktracker restart

16. Connect to the host hdm1:
# ssh hdm1

17. Switch to the user hdfs and verify that the Hadoop filesystem is healthy:
# su - hdfs
$ hadoop fsck /

The following message in the output indicates a healthy filesystem:
The filesystem under path '/' is HEALTHY
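
The health check in step 17 can be reduced to an exit status for use in a script. This is a minimal sketch that assumes the status line appears exactly as quoted above. The function name fsck_is_healthy is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `hadoop fsck /` output on stdin and succeeds
# only if the HEALTHY status line quoted in this procedure is present.
fsck_is_healthy() {
    grep -Fq "The filesystem under path '/' is HEALTHY"
}

# Usage on hdm1, as the user hdfs:
#   hadoop fsck / | fsck_is_healthy && echo "HDFS is healthy"
```

A non-zero exit status only means that the HEALTHY line was not found. Review the full fsck output before taking any action.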


CHAPTER 5
Replace a Power Supply in a Server
This chapter describes how to replace power supplies in DCA UAP servers.

Power supply LEDs
Each server in the UAP DCA has two power supplies that share the power load and provide
redundancy. When the power load is below a certain threshold, only one power supply in
each server is active and its LED is solid green to reflect the active state. The other power
supply is in standby mode and its LED flashes green to reflect the standby state.
Table 7 Power supply LED behavior
LED behavior            Definition
Solid green             Active mode
Blinking green, slow    Standby mode
Blinking green, rapid   Power supply firmware updating
Off                     No AC power to both power supplies in the server
Amber                   No AC power to this power supply, other power supply has AC power

Replace a power supply in a server
The servers in the DCA rack are powered by dual redundant hot-swappable power supplies
located in the rear of the appliance. To replace a power supply in a server, perform the
following procedure.
1. Log in to the Primary Master Server as the user root (see “Connect to the Master
Server using an SSH client” on page 178).
2. To locate the server, use the DCA Setup Utility to activate the green lightbar on the DCA
door and the blue server identification LED as the user root:
a. Launch the DCA Setup utility:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 3 for Blink the light bar.
e. Enter the hostname of the server and press ENTER.
f. Enter X to exit the DCA Setup utility.
The green lightbar on the DCA door and the blue server identification LED begin to
blink.

Note: If the DCA door does not have a lightbar, an error message displays. You can
safely ignore the error message. To identify the failed server, locate the blue server
identification LED.
3. Locate the failed power supply in the server.
The LED on a failed power supply is either amber or, if the power supply has
completely failed, the LED is off.
4. Verify that the LED on the functioning (redundant) power supply is solid green.
5. Check the connection of the AC power cable. Make sure that both ends of the cable are
securely connected.
6. If the power supply still appears to be failed, try a known-good power cable.
7. If the power supply still appears to be failed, disengage the retaining clip that secures
the AC power cable, and then disconnect the AC power cable from the power supply.
8. While pressing the green release latch leftward, pull on the handle to slide the power
supply out of the chassis.

Figure 25 Remove a server power supply

9. Make sure that the replacement power supply is the correct part number:
EMC P/N 105-000-244
The part number is located on the packaging, not on the power supply itself. Both
power supplies in a server must provide the same maximum output power.
10. Install the replacement power supply in the chassis.
Slide the replacement power supply into the chassis until the power supply is fully
seated and the release latch clicks into place.

Figure 26 Insert a server power supply

11. Connect the AC power cable to the power supply and secure it with the retaining clip.
12. Verify that the power supply LED indicator is solid or blinking green.
13. Turn off the green lightbar on the DCA door and the blue server identification LED:
a. If you are not already in it, start the DCA Setup utility as the user root:
# dca_setup

b. Select option 2 to Modify DCA Settings.
c. Select option 18 for Light Bar Controls.
d. Select option 2 for Turn off the light bar.
e. Enter the hostname of the server and press ENTER. For example, sdw1.
The lightbar on the DCA door and the blue server identification LED stop blinking.


CHAPTER 6
Replace a Fan Assembly or Power Supply in an
Arista Switch
Refer to the appropriate section to replace a fan assembly or a power supply in an Arista
switch.

Replace a Fan Assembly in an Arista Switch
The Admin, Interconnect, and Aggregate switches (Arista 7048/7050 series) in a DCA rack
are cooled by four modular fan assemblies that can be replaced individually upon
failure, as shown in Figure 27. The fans provide rear-to-front airflow.

• Leave any failed fan assembly installed until it can be replaced immediately.
• Do not remove a fan assembly from the chassis until you are ready to replace it.
• Follow ESD precautions, including the use of a wrist grounding strap, when you
replace components.
• The cooling system requires pressurized air in order to function properly. Do not leave
any fan assembly slot unoccupied for longer than two minutes when the appliance is
operating.

Fan Assembly Replacement Order Information
Use the following information to order the Arista Fan Assembly for the 7048T and 7050S
Switch:
• Description: Fan Assembly for the Arista Switch (rear-to-front air flow)
• Model Number: FAN-7000-R
• EMC Part Number: 105-000-313

Tools
The procedure only requires a wrist grounding strap (there are no screws to remove).

Identify the Failed Fan Assembly
1. Once on site, locate the DCA rack and confirm the physical location of the switch
within the cabinet that has the top level assembly (TLA) serial number reported in the
service request. Arista switches (7050S) function as server interconnect or rack
aggregation switches and are mounted at the midpoint of the 40RU rack. The Admin (7048T)
switch is mounted at the top of the rack.
2. Confirm that the fan assembly specified for replacement is the failed fan
assembly, and that the LEDs on the other fans are steady green
(Figure 27 and Figure 28). Access to the fans is from the front of the rack.

Figure 27 Four fan assemblies in the Arista switch

Figure 28 Fan Status LED location

3. Analyze the failure using Table 8 for possible actions.
Table 8 Fan assembly LED Indicators
LED Behavior          Possible State                                         Action
No light              Fan assembly is not receiving power.                   Verify that the fan assembly is seated correctly, and there is no air movement.
Steady Green          Fan assembly is operating normally.                    No action
Steady Red or Amber   Fan has failed or power supply was removed from switch.   Failed and must be replaced.

Remove the Failed Fan Assembly and Install the Replacement Part

The cooling system relies on pressurized air. Do not leave any fan assembly slot
empty for longer than two minutes when the switch is in operation.
1. Remove the fan assembly: while pressing the black release latch leftward, grip the
blue pull-ring and slide the fan assembly out of the chassis.
2. Slide the new fan assembly into the chassis until the unit is fully seated and the
release latch snaps back into its original position.

Do not force the insertion, as damage can occur. If the fan assembly resists, make sure
that it is oriented correctly for a smooth slide and fit.
3. Verify that the fan status LED is lit (steady green) to indicate normal operation.

Parts Return
1. Locate the Parts Return Label package. Fill out the shipping label. Apply the shipping
label to the box for return to EMC.
2. Read the enclosed Shipping Instructions sheet.
3. Apply any other labels appropriate to the returned part to the box, including the
Failure Analysis (FA) tag, which is currently required for all DCA replacement parts.
4. Securely tape the box and ship the failed part back to EMC.
5. Send questions regarding this return shipment to:
CS_Logistics_IC@emc.com
This completes the fan assembly replacement.

Replace a Power Supply in an Arista Switch
The Admin, Interconnect, and Aggregate switches (Arista 7048/7050 series) in a DCA rack
are powered by dual redundant modular power supply assemblies that can be replaced
individually upon failure, as shown in Figure 29.

• Leave any failed power supply installed until it can be replaced immediately.
• Follow ESD precautions, including the use of a wrist grounding strap, when you
replace components.
• The cooling system requires pressurized air in order to function properly. Do not leave
any power supply slot unoccupied for longer than two minutes when the appliance is
operating.

Power Supply Assembly Replacement Order Information
Use the following information to order the Arista Power Supply Assembly for the 7048T
and 7050S switch:
• Description: Power Supply Assembly for the Arista Switch, 460W AC (rear-to-front air
flow)
• Model Number: PWR-460AC-R
• EMC Part Number: 105-000-314

Tools
The procedure only requires a wrist grounding strap (there are no screws to remove).

Identify the Failed Power Supply
1. Once on site, locate the DCA rack and confirm the physical location of the switch
within the cabinet that has the top level assembly (TLA) serial number reported in the
service request. Arista switches (7050S) function as server interconnect or rack
aggregation switches and are mounted at the midpoint of the 40RU rack. The Admin (7048T)
switch is mounted at the top of the rack.
2. Confirm that the power supply specified for replacement is not working, and
that the LED on the other power supply is steady green (Figure 30). Access to the
power supplies is from the front of the rack.

Figure 29 Dual power supply assemblies in the Arista switch

Figure 30 Power Supply Status LED location

3. Analyze the failure using Table 9 for possible actions.
Table 9 Power Supply LED indicators
LED Behavior          Possible State                                                                 Action
No light              Power supply is not connected to rack AC power source, or not inserted fully.   Remove and reinsert firmly.
Steady Green          Power supply is operating normally.                                            No action
Steady Red or Amber   Power supply overheated or has failed.                                         Failed and must be replaced.

Remove the Failed Power Supply and Install the Replacement Part


The cooling system relies on pressurized air. Do not leave any of the power supply slots
empty longer than two minutes when the switch is in operation.
1. Unplug the AC power cable from the power supply you intend to remove (see Figure
30).
2. Locate the black release latch (lower right). While pressing on the release latch
leftward, grip the blue pull-ring and slide the power supply out of the chassis.
3. Slide the new power supply into the chassis until it is fully seated. The release latch
snaps into place.


Do not force the insertion as damage can occur. If it resists, ensure that it is oriented
correctly for a smooth slide and fit.
4. Reconnect the AC power cable to the power supply.
Note: When applying power to a new power supply, allow for the system to recognize the
power supply and determine its status. The power supply status indicator turns green to
signify that it is functioning properly.
5. Verify that the power supply status LED is lit to indicate normal operation.

Parts Return
1. Locate the Parts Return Label package. Fill out the shipping label. Apply the shipping
label to the box for return to EMC.
2. Read the enclosed Shipping Instructions sheet.
3. Apply any other labels appropriate to the returned part to the box, including the
Failure Analysis (FA) tag, which is currently required for all DCA replacement parts.
4. Securely tape the box and ship the failed part back to EMC.
5. Send questions regarding this return shipment to:
CS_Logistics_IC@emc.com
This completes the power supply replacement.


CHAPTER 7
Replace a Switch in the DCA
This chapter describes how to replace Arista 7048T and 7050S-52 switches in a DCA. It
also describes how you can use the DCA Setup Utility to upload switch configuration files
from the Master server to switches. Switch types include:
• Interconnect and Aggregation (10GB; SWCH-AR1U-7050S-52)
• Administration (1GB; SWCH-AR1U-7048T)

Note: Beginning in DCA 2.0.0.0, you cannot use the DCA Setup Utility to back up switch
configuration files to the Master server.
Major topics include:
• Requirements
• Switch hostnames and IP addresses
• Replace an Arista 7050S Interconnect or Aggregation Switch
• Replace an Arista 7048T Administration Switch

Requirements
• Wrist grounding strap
• 9-pin serial cable (RJ-45 to 9-pin D-sub connector)
IMPORTANT
If your laptop does not have a serial port, you must use a USB-to-serial adapter cable.
• Materials to label 20 cables
• Phillips #2 screwdriver
• 1/4-inch flathead screwdriver

Switch hostnames and IP addresses
You must configure the replacement switch with the correct hostname and IP address for
the type of rack it inhabits and its position within the rack, as detailed in Table 10 below.
For a table containing IP addresses for all configurations, see Appendix A, “Network
Configuration Information.”
Table 10 Switch hostnames and IP addresses
Rack                    Hostname                    IP Address

Arista 7050S Interconnect switches (10GB)
SYSRACK (Rack 1)        i-sw-2 (Upper switch)       172.28.0.180
                        i-sw-1 (Lower switch)       172.28.0.170
AGGREG (Rack 2)         i-sw-4 (Upper switch)       172.28.0.181
                        i-sw-3 (Lower switch)       172.28.0.171
EXPAND (Rack 3)         i-sw-6 (Upper switch)       172.28.0.182
                        i-sw-5 (Lower switch)       172.28.0.172
EXPAND (Rack 4)         i-sw-8 (Upper switch)       172.28.0.183
                        i-sw-7 (Lower switch)       172.28.0.173
EXPAND (Rack 5)         i-sw-10 (Upper switch)      172.28.0.184
                        i-sw-9 (Lower switch)       172.28.0.174
EXPAND (Rack 6)         i-sw-12 (Upper switch)      172.28.0.185
                        i-sw-11 (Lower switch)      172.28.0.175
EXPAND (Rack 7)         i-sw-14 (Upper switch)      172.28.0.186
                        i-sw-13 (Lower switch)      172.28.0.176
EXPAND (Rack 8)         i-sw-16 (Upper switch)      172.28.0.187
                        i-sw-15 (Lower switch)      172.28.0.177
EXPAND (Rack 9)         i-sw-18 (Upper switch)      172.28.0.188
                        i-sw-17 (Lower switch)      172.28.0.178
EXPAND (Rack 10)        i-sw-20 (Upper switch)      172.28.0.189
                        i-sw-19 (Lower switch)      172.28.0.179
EXPAND (Rack 11)        i-sw-22 (Upper switch)      172.28.1.180
                        i-sw-21 (Lower switch)      172.28.1.170

Arista 7050S Aggregation switches (10GB)
AGGREG (Rack 2 only)    aggr-sw-2 (Upper switch)    172.28.0.249
                        aggr-sw-1 (Lower switch)    172.28.0.248

Arista 7048T Administration switches (1GB)
SYSRACK (Rack 1)        a-sw-1                      172.28.0.190
AGGREG (Rack 2)         a-sw-2                      172.28.0.191
EXPAND (Rack 3)         a-sw-3                      172.28.0.192
EXPAND (Rack 4)         a-sw-4                      172.28.0.193
EXPAND (Rack 5)         a-sw-5                      172.28.0.194
EXPAND (Rack 6)         a-sw-6                      172.28.0.195
EXPAND (Rack 7)         a-sw-7                      172.28.0.196
EXPAND (Rack 8)         a-sw-8                      172.28.0.197
EXPAND (Rack 9)         a-sw-9                      172.28.0.198
EXPAND (Rack 10)        a-sw-10                     172.28.0.199
EXPAND (Rack 11)        a-sw-11                     172.28.1.190
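
The Interconnect and Administration entries in Table 10 follow a regular pattern, so a lookup can be sketched in a few lines of shell. The arithmetic below is inferred from the table rows for Racks 1 to 11 and is not stated anywhere in this guide, and the function names admin_switch and interconnect_switch are hypothetical. Treat Table 10 as the authority and use the output only as a cross-check.

```shell
#!/bin/sh
# Hypothetical helpers that reproduce the Table 10 rows for Racks 1 to 11.
# The arithmetic is inferred from the table, not taken from an EMC tool.

# admin_switch <rack>  ->  prints "hostname ip" for the Administration switch
admin_switch() {
    rack=$1
    case $rack in
        [1-9]|10) echo "a-sw-$rack 172.28.0.$((189 + rack))" ;;
        11)       echo "a-sw-11 172.28.1.190" ;;
        *)        echo "rack $rack is not listed in Table 10" >&2; return 1 ;;
    esac
}

# interconnect_switch <rack> <upper|lower>  ->  prints "hostname ip"
interconnect_switch() {
    rack=$1
    pos=$2
    case $rack in
        [1-9]|10) subnet=0; n=$rack ;;   # Racks 1-10 use 172.28.0.x
        11)       subnet=1; n=1 ;;       # Rack 11 restarts in 172.28.1.x
        *)        echo "rack $rack is not listed in Table 10" >&2; return 1 ;;
    esac
    case $pos in
        upper) echo "i-sw-$((2 * rack)) 172.28.$subnet.$((179 + n))" ;;
        lower) echo "i-sw-$((2 * rack - 1)) 172.28.$subnet.$((169 + n))" ;;
        *)     echo "position must be upper or lower" >&2; return 1 ;;
    esac
}

# Usage:
#   admin_switch 3
#   interconnect_switch 2 upper
```

The Aggregation switches are left out because Table 10 lists them for Rack 2 only.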

Replace an Arista 7050S Interconnect or Aggregation Switch
Summary of main tasks:
• When installing a replacement switch, identify the firmware version on the new switch
(as well as the versions already running in the DCA). Then upgrade so that all switches
reflect the same firmware levels.
Go to http://support.emc.com to obtain the pertinent firmware upgrade instructions.
The upgrade instructions provide information on how to access and install the
firmware upgrade package.
• Remove the failed switch and install the replacement switch
• Establish a serial connection and log in to the replacement switch
• Configure the switch management port and password
• Check the current firmware version
• Update the firmware if necessary
• Upload the switch configuration through DCA Setup
• Check the health of the GPDB

1. Identify the type and location of the switch that you are going to replace.

Figure 31 Location of Interconnect and Aggregation switches (Interconnect switches in the System Rack; Aggregation switches in the Aggregation Rack)

2. Connect your service laptop to the red service cable located on the laptop tray in
Rack 1. The red service cable is connected to port 48 on the Administration switch in
Rack 1 (see “Connect a workstation to the DCA” on page 176).
3. To prevent false dial home messages from being sent to EMC Support during service,
stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

Remove the failed switch and install the replacement switch

4. Label all cables connected to the switch.
On the label, include the server and server port from which each cable originates and
the switch and switch port to which each cable connects. For connectivity details, see
the following:
• Interconnect Switch cabling—see “” on page 152.
• Aggregation Switch cabling—see “Aggregation switch reference” on page 163.

5. Power off the switch by removing both AC power cables from the power supplies on
the back of the switch.
6. Make sure that the interconnect cables are labeled as described in step 4 above, and
then remove the interconnect cables from the Interconnect switch.
7. Remove the failed switch and install the replacement switch (see “Install a Switch in a
Rack” on page 200).
8. Connect all data cables to the correct ports on the switch.
Refer to the labels for the correct connectivity information. For more information, see
“Network and cabling configurations” on page 152.
9. Power on the switch by connecting the AC power cables to the power supplies on the
back of the replacement switch. The switch powers up as soon as AC power is applied.
Verify that the power supply LEDs are solid green after a few seconds.
IMPORTANT
Each switch power supply should be connected to a separate AC power zone on the
rack. See “Power supply reference” on page 146.
Figure 32 Connecting switch power cords to AC power (callouts: To Power Zone A PDU; To Power Zone B PDU; Rear of rack; Front of rack)

Note: Because the replacement switch was configured at the factory, it is not yet
accessible through SSH, so you must configure the replacement switch through a serial
connection as described in “Connect to an Interconnect or Administration switch using
PuTTY” on page 181.

Establish a serial connection and log in to the replacement switch

10. Connect your service laptop to the serial console port on the replacement switch using
a native RJ-45 serial cable. If you do not have a native RJ-45 serial cable, use a
DB-9-to-RJ45 or USB-to-RJ45 serial adapter.

Figure 33 Arista 7050S serial port location

11. Using a terminal emulator such as HyperTerminal, log in to the switch as the user
admin with no password, using the following settings:
• Connection type: serial
• Data rate: 9600
• Data bits: 8
• Parity: none
• Stop bits: 1
• Hardware flow control: none
12. At the localhost prompt, issue the following commands to disable the Arista zerotouch
feature:
# enable
# zerotouch cancel

Wait as the switch reboots.
13. When the switch is finished booting, log in again as user admin and no password.

Configure the switch management port, password, hostname, and IP address

14. At the localhost prompt, issue the following commands to configure the hostname,
management port, and password:
Note: Change the hostname (i-sw-1) and IP address (172.28.0.170) shown below as
appropriate for the type and location of the switch you are configuring. For details
see “Switch hostnames and IP addresses” on page 120.
# enable
# conf t
(config)# hostname i-sw-1
(config)# interface management 1
(config-if-Ma1)# ip address 172.28.0.170/21
(config-if-Ma1)# exit
(config)# user admin secret 0 changeme
(config)# write mem
(config)# exit

15. Connect all data cables to the correct ports and the ethernet cable to the management
port of the switch.
16. Determine whether you need to update the Extensible Operating System (EOS)
firmware on the switch:
# show boot-config

In the output, focus on the value shown in bold below:
Software image: flash:/EOS-4.9.3.2.swi

• If 4.9.3.2 is returned, you do not need to update the switch firmware. Proceed to
step 18 to complete the switch configuration.
• If 4.9.3.2 is not returned, you must update the switch firmware. Proceed to
step 17.
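
If you capture the show boot-config output from the terminal session to a file on the service laptop, the version comparison in step 16 can be sketched as follows. This is a minimal sketch, assuming the Software image line has the form shown above. The function names are hypothetical, and the required version is passed in as an argument so that nothing is hard-coded.

```shell
#!/bin/sh
# Hypothetical helpers for the decision in step 16. Input is the text
# of `show boot-config`, captured from the serial session.

# Prints the EOS version parsed from the "Software image" line.
eos_version_from_boot_config() {
    sed -n 's|^Software image: .*EOS-\(.*\)\.swi.*$|\1|p'
}

# needs_eos_update <required-version>
# Succeeds (exit 0) when the version on stdin differs from the required one.
needs_eos_update() {
    required=$1
    current=$(eos_version_from_boot_config)
    [ "$current" != "$required" ]
}

# Usage (boot-config.txt is an assumed file name for the captured output):
#   needs_eos_update 4.9.3.2 < boot-config.txt && echo "update the firmware"
```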

Update the firmware if necessary

17. If you determined in the previous step that you need to update the switch firmware:
a. Before proceeding, back up the current switch configuration. Please read
Appendix G for instructions on backing up the switch configurations.
b. Download the Arista firmware from http://support.emc.com and place in
/opt/dca/etc/arista_fw/
You may need to create the directory if it does not exist.
c. Next, download the current switch configuration. Customizing the switches is a
common practice. This procedure will reload the switches with default
configurations. Backing up the switch configurations allows for easy restoration
after installing the new firmware.
d. Issue the following commands to copy the EOS firmware file from the Primary
Master Server to the switch:

# copy scp://root@172.28.4.250/opt/dca/etc/arista_fw/EOS-4.9.3.2.swi flash:/EOS-4.9.3.2.swi

e. When prompted, enter the password changeme, and then set the boot image:
root@172.28.4.250's password:
# conf t
(config)# boot system flash:/EOS-4.9.3.2.swi
(config)# exit

f. Check the EOS firmware version that you installed.
# show boot-config

The following output is returned:
Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)

g. Save the EOS configuration and reload. The switch reboots.
# write mem
# reload

h. Recover the switch config using the instructions in Appendix G.
18. Disconnect the serial cable.
19. Connect your service laptop to the red service cable located on the laptop tray in
Rack 1 and log in to the Primary Master as the user root (see “Connect a workstation
to the DCA” on page 176).

Check the health of the GPDB

20. Log in to the Primary Master as gpadmin and issue the following command to verify
that the database is healthy:
$ gpstate -m

Verify that all segments are reported as Synchronized:
Mirror   Datadir                Port    Status              Data Status
sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Synchronized

21. Re-enable health monitoring:

# dca_healthmon_ctl -e

Replace an Arista 7048T Administration Switch
Summary of main tasks:
• When installing a replacement switch, identify the firmware version on the new switch
(as well as the versions already running in the DCA). Then upgrade so that all switches
reflect the same firmware levels.
Go to http://support.emc.com to obtain the pertinent firmware upgrade instructions.
The upgrade instructions provide information on how to access and install the
firmware upgrade package.
• Remove the failed switch and install the replacement switch
• Establish a serial connection and log in to the replacement switch
• Configure the switch management port
• Configure the switch password
• Check the current firmware version
• Update the firmware if necessary
• Upload the switch configuration through DCA Setup
• Check the health of the GPDB

1. Identify the Administration switch in the rack.

Figure 34 Location of the Administration switch (Administration switch in the System Rack)

2. To prevent false dial home messages from being sent to EMC Support during service,
stop the healthmon daemon to disable health monitoring:
# dca_healthmon_ctl -d

Remove the failed switch and install the replacement switch

3. Label all cables connected to the switch.
On the label, include the server and server port from which each cable originates and
the switch and switch port to which each cable connects. For connectivity details, see
“Administration switch reference” on page 159.
4. Power off the switch by removing both AC power cables from the power supplies on
the back of the switch.
5. Make sure that the cables are labeled as described in step 3 above, and then remove
the cables from the Administration switch.
6. Remove the failed switch and install the replacement switch (see “Install a Switch in a
Rack” on page 200).

7. Connect all data cables to the correct ports on the switch.
Refer to the labels for the correct connectivity information. For more information, see
“Network and cabling configurations” on page 152.
8. Power on the switch by connecting the AC power cables to the power supplies on the
back of the replacement switch. The switch powers up as soon as AC power is applied.
Verify that the power supply LEDs are solid green after a few seconds.
IMPORTANT
Each switch power supply should be connected to a separate AC power zone on the
rack. See “Power supply reference” on page 146.

IMPORTANT
Because the replacement switch was configured at the factory, it is not yet accessible
through SSH, so you must configure it through a serial connection as described in
“Connect to an Interconnect or Administration switch using PuTTY” on page 181.

Establish a serial connection and log in to the replacement switch

9. Connect your service laptop to the serial console port on the replacement switch using
a native RJ-45 serial cable. If you do not have a native RJ-45 serial cable, use a
DB-9-to-RJ45 or USB-to-RJ45 serial adapter.

Figure 35 Arista 7048T serial port location

10. Using a terminal emulator such as HyperTerminal, log in to the switch as the user
admin with no password, using the following settings:
• Connection type: serial
• Data rate: 9600
• Data bits: 8
• Parity: none
• Stop bits: 1
• Hardware flow control: none
11. At the localhost prompt, issue the following commands to disable the Arista zerotouch
feature:
# enable
# zerotouch cancel

Wait as the switch reboots.
12. When the switch is finished booting, log in again as user admin and no password.

Configure the switch VLAN port service, hostname, and IP address

13. At the localhost prompt, issue the following commands to set the VLAN port service:
Change the hostname and IP address shown in bold below as appropriate for the
switch you are configuring. For details see “Switch hostnames and IP addresses” on
page 120.
IMPORTANT
This step differs slightly according to the specific Administration switch that you are
replacing.

Administration switch in Rack 1 (a-sw-1)
# enable
# conf t
(config)# hostname a-sw-1
(config)# interface vlan 3

The following output displays:
! Access VLAN does not exist.

Creating vlan 3

Continue:
(config-if-Vl3)# ip address 172.28.0.190/21
(config-if-Vl3)#interface ethernet 1-48
(config-if-Et1-48)#switchport access vlan 3

Administration switch in Rack 2 (a-sw-2)
# enable
# conf t
(config)# hostname a-sw-2
(config)# interface vlan 3

The following output displays:
! Access VLAN does not exist.

Creating vlan 3

Continue:
(config-if-Vl3)# ip address 172.28.0.191/21
(config-if-Vl3)#interface ethernet 1-48
(config-if-Et1-48)#switchport access vlan 3
(config-if-Et1-48)#interface port-Channel 1000
(config-if-Po1000)#switchport mode trunk
(config-if-Po1000)#switchport trunk group mlagpeerlink
(config-if-Po1000)#interface ethernet 45-46

(config-if-Et45-46)#channel-group 1000 mode active

Administration switch in Racks 3 to 12 (a-sw-3 to a-sw-12)
Change the hostname and IP address shown in bold below as appropriate for the specific
Administration switch you are configuring. For details see Appendix A, “Network
Configuration Information.”
# enable
# conf t
(config)# hostname a-sw-3
(config)# interface vlan 3

The following output displays:
! Access VLAN does not exist.

Creating vlan 3

Continue:
(config-if-Vl3)# ip address 172.28.0.192/21
(config-if-Vl3)#interface ethernet 1-48
(config-if-Et1-48)#switchport access vlan 3
(config-if-Et1-48)#interface port-Channel 900
(config-if-Po900)#switchport access vlan 3
(config-if-Po900)#interface ethernet 45-46
(config-if-Et45-46)#channel-group 900 mode active

14. Verify that the ports were added to vlan 3:
(config-if-Et1-48)#show vlan

The following output displays:
VLAN   Name       Status   Ports
-----  -----      -----    -----
1      default    active
3      VLAN0003   active   Cpu, Et1, Et2, Et3, Et4, Et5, Et6, Et7, Et8,
                           Et17, Et18, Et19, Et20

Note: Only active ports display in the above output. You may see different output.

Configure the switch password

15. Configure the switch password:
(config-if-Et1-48)# exit
(config)# user admin secret 0 changeme
(config)# write mem
(config)# exit

16. Connect all data cables to the correct ports and the ethernet cable to the management
port of the switch.
17. Determine whether you need to update the Extensible Operating System (EOS)
firmware on the switch:
# show boot-config

In the output, focus on the value shown in bold below:
Software image: flash:/EOS-4.9.3.2.swi

• If 4.9.3.2 is returned, you do not need to update the switch firmware. Proceed to
step 19 to complete the switch configuration.
• If 4.9.3.2 is not returned, you must update the switch firmware. Proceed to
step 18.

Update the firmware if necessary

18. If you determined in the previous step that you need to update the switch firmware:
a. Before proceeding, back up the current switch configuration. Please read
Appendix G for instructions on backing up the switch configurations.
b. Download the Arista firmware from http://support.emc.com and place in
/opt/dca/etc/arista_fw/
You may need to create the directory if it does not exist.
c. Next, download the current switch configuration. Customizing the switches is a
common practice. This procedure will reload the switches with default
configurations. Backing up the switch configurations allows for easy restoration
after installing the new firmware.
d. Issue the following commands to copy the EOS firmware file from the Primary
Master Server to the switch:

# copy scp://root@172.28.4.250/opt/dca/etc/arista_fw/EOS-4.9.3.2.swi flash:/EOS-4.9.3.2.swi

e. When prompted, enter password changeme.
root@172.28.4.250's password:
# conf t
(config)# boot system flash:/EOS-4.9.3.2.swi
(config)# exit

f. Check the EOS firmware version that you installed.
# show boot-config

The following output is returned:
Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)

g. Save the EOS configuration and reload. The switch reboots.
# write mem
# reload

h. Recover the switch config using the instructions in Appendix G.
19. Disconnect the serial cable.
20. Connect your service laptop to the red service cable located on the laptop tray in
Rack 1 and log in to the Primary Master as the user root (see “Connect a workstation
to the DCA” on page 176).

Check the health
of the GPDB

21. Log in to the Primary Master as gpadmin and issue the following command to verify
that the database is healthy:
$ gpstate -m

Verify that all segments are reported as Synchronized:
Mirror   Datadir                Port    Status              Data Status
sdw2-2   /data2/mirror/gpseg0   50003   Acting as Primary   Synchronized

22. Re-enable health monitoring:
# dca_healthmon_ctl -e
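
The firmware decision in step 17 can be sketched in Python. This is a minimal illustration only: it assumes the text of the show boot-config output has already been captured as a string, and the function names eos_version and needs_firmware_update are hypothetical helpers, not part of any DCA or Arista tool.

```python
import re

# Version named in step 17 of this procedure.
REQUIRED_EOS = "4.9.3.2"


def eos_version(boot_config):
    """Return the EOS version from `show boot-config` text, or None if absent."""
    match = re.search(r"Software image:\s*flash:/EOS-([0-9.]+)\.swi", boot_config)
    return match.group(1) if match else None


def needs_firmware_update(boot_config, required=REQUIRED_EOS):
    """True when the boot image is missing or differs from the required version."""
    return eos_version(boot_config) != required


sample = """Software image: flash:/EOS-4.9.3.2.swi
Console speed: (not set)
Aboot password (encrypted): (not set)"""

print(needs_firmware_update(sample))
```

A result of False corresponds to proceeding directly to step 19; True corresponds to step 18.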


CHAPTER 8
Replace an Interconnect Switch Cable
This chapter describes how to replace a twin-ax cable used to connect servers and
interconnect switches in the DCA.
Note: Some failed cables may be part of a cable bundle. Plan for multiple systems losing
connectivity. It is recommended to disable the database and healthmon until cables are
replaced.
Locate and replace the failed cable
1. Log in to the Primary Master server as the user root. Refer to “Connect to the Master
Server using an SSH client” on page 178 for details.
2. Activate the server identification LED on the server with the failed cable. For example,
on sdw8:
# dca_blinker -h sdw8 -a ON

3. From the rear of the system, locate the Converged Network Adapter (CNA) card in the
server's expansion slot.

Figure 36 CNA card location in DCA servers (two panels: Master or DIA server and Hadoop Compute server; Segment server and Hadoop master and worker servers. Each panel labels the CNA ports that connect to the lower and upper Interconnect switches.)

Figure 37 Master server with extra 10Gb NICs


4. Observe the Link and Act LEDs adjacent to each port on the card. A single, steadily
flashing LED indicates that the attached cable has failed. If both LEDs are flashing,
further diagnosis is required. DO NOT replace the cables in this case. Instead, contact
EMC technical support services for assistance.

Figure 38 Interconnect switch CNA port LEDs

5. Shut down the database before connecting the replacement cable. This is required
because of the cable bundling introduced in release 2.0.2.0. Shutting down the
database prevents false dial home messages from being sent to EMC Support
during service.
To shut down the database:
a. Disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

b. Switch to the user gpadmin:
# su - gpadmin

c. When prompted for the password, enter changeme.
If the default password changeme was changed, enter the current password.
d. Stop the Greenplum Database:
$ gpstop -af

e. Switch to the user root:
$ su -

6. Disconnect both ends of the cable and remove the cable from the cable bundle. For
Interconnect cable diagrams, refer to “” on page 152.
7. Connect one end of the new cable to the CNA port on the server and the other end to
the correct port on the appropriate Interconnect switch.
8. Verify that the Link LED on the CNA card is solid green.
9. Secure the cable back into the cable bundle.
10. Verify that the eth4 interface on the affected server is UP and RUNNING:
# ifconfig eth4

The following output should be returned:
eth4      Link encap:Ethernet  HWaddr 8C:7C:FF:20:93:32
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:921818 errors:0 dropped:0 overruns:0 frame:0
          TX packets:908966 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:129140405 (123.1 MiB)  TX bytes:88721575 (84.6 MiB)
          Memory:ec440000-ec47ffff

If eth4 is not UP, bring it up by issuing:
# ifup eth4

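11. Once all of the connections are fixed, start the database as the user gpadmin (for
example, with gpstart -a), then re-enable health monitoring as the user root with
dca_healthmon_ctl -e.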
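
The check in step 10 can be sketched in Python. This is an illustration under stated assumptions: it expects the legacy ifconfig layout shown in this chapter (flag words on the same line as the MTU field), and the helper names interface_flags and is_up_and_running are hypothetical.

```python
def interface_flags(ifconfig_output):
    """Return the flag words from the line that carries the MTU field."""
    for line in ifconfig_output.splitlines():
        words = line.split()
        if any(word.startswith("MTU:") for word in words):
            # Flag words have no colon; MTU:1500 and Metric:1 do.
            return {word for word in words if ":" not in word}
    return set()


def is_up_and_running(ifconfig_output):
    """True when both the UP and RUNNING flags are present."""
    flags = interface_flags(ifconfig_output)
    return "UP" in flags and "RUNNING" in flags


SAMPLE = """eth4 Link encap:Ethernet HWaddr 8C:7C:FF:20:93:32
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:921818 errors:0 dropped:0 overruns:0 frame:0"""

print(is_up_and_running(SAMPLE))
```

If the function returns False for eth4, bring the interface up with ifup eth4 as described in step 10.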


APPENDIX A
System Information and Configuration
This appendix includes the following sections:


• Greenplum DCA configurations
• Power supply reference
• Network and cabling configuration
• Network hostname and IP configuration
• Multiple-rack cabling reference
• Configuration files
• Default passwords

New firmware updates for DCA software version 2.0.3.0
Customers can apply optional firmware updates prior to upgrading to DCA software
version 2.0.3.0 as follows:


• Arista 7050S-52 and Arista 7048T switches
  • New firmware version EOS-4.9.8.swi
  • Field personnel can access the EOS-4.9.8.swi.zip firmware upgrade
    package from:
    ftp://ftp.aristanetworks.com/emc/certifiedeos/EOS-4.9.8.swi
    Field personnel can obtain the following document, available on
    http://support.emc.com, for step-by-step instructions:
    EMC Greenplum DCA Firmware Upgrade Instructions for the Interconnect Switch
    (Arista 7050S-52) and Administration Switch (Arista 7048T)
• Intel Servers (Kylin with eight drives, Dragon 12 with twelve drives, and Dragon 24
  with 24 drives)
  • New BIOS upgrade revision level SE5C600.86B.02.01.0002
  • Field personnel can access both the BIOS upgrade package and the
    EMC Greenplum DCA Intel BIOS Upgrade Instructions for Intel Servers from
    http://support.emc.com.


Identify the version of the installed DCA software
DCA documentation is tied to a specific version of the DCA software. To identify the version
of the software running on a particular DCA, perform this procedure:
1. Log in to the Primary Master server as the user root.
2. View the contents of the /opt/dca/etc/dca-build-info.txt file.
For example:
# cat /opt/dca/etc/dca-build-info.txt

In the output, see the ISO_VERSION value.
## =============================================
ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"
## =============================================
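
Reading the version programmatically can be sketched in Python as follows. This is a minimal sketch that assumes the file uses the KEY="value" layout shown above; parse_build_info is a hypothetical helper name, and the sample text is copied from the example output rather than read from a live system.

```python
def parse_build_info(text):
    """Parse KEY="value" lines, skipping blank lines and '#' comment lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"')
    return info


SAMPLE = '''## =============================================
ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"
## ============================================='''

print(parse_build_info(SAMPLE)["ISO_VERSION"])
```

On a DCA, the same function could be given the contents of /opt/dca/etc/dca-build-info.txt read as the user root.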


DCA configuration rules
Manufacturing ships three basic types of racks for the UAP DCA:


• System - DCA2-SYSRACK
• Aggregation - DCA2-AGGREG
• Expansion - DCA2-EXPAND

Supported DCA modules

Module type                              Server/drive types and quantities

Greenplum Database (GPDB)                Four Dragon 24 servers:
                                         • Compute: x24 300GB drives per server
                                         • Standard: x24 900GB drives per server
                                         • Compute High Memory: x24 300GB drives per server,
                                           256GB of Memory
                                         • Two Kylin servers: x6 300GB drives per server

Data Integration Accelerator (DIA)       • Two Kylin servers: x6 300GB drives per server
(One of these items)                     • Two Dragon 12 servers: x12 3TB drives per server
                                         • Two Dragon 24 servers: 256GB of memory
                                         • Two DIA High Memory servers: x24 300GB drives per
                                           server (256GB of memory)

Hadoop (HD) (master or worker)           Four Dragon 12 servers: x12 3TB drives per server

Hadoop Compute option (referred to as    Two Kylin servers: x6 300GB drives per server
HD-C module and used for Hadoop with
Isilon)

Racking order
All master nodes and switches are racked first. All other nodes are racked in the following
order.
Table 11 Approved DCA Racking Sequence

SKU                                                                  Rack Priority (when present)
Dragon 24, 900GB disks, 64GB RAM (100-585-031-07)                    First
Dragon 24, 300GB disks, 64GB RAM (100-585-035-06)                    Second
Dragon 12, 3TB disks, 64GB RAM; HDM, HDW, or DIA (100-585-030-06)    Third
Dragon 24, 900GB disks, 256GB RAM; SDW or DIA (100-585-055-01)       Fourth
Kylin, 64GB RAM; DIA, HDC, or HDM (100-585-029-05)                   Fifth
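
The racking sequence in Table 11 can be applied to a list of servers with a simple sort. The following Python sketch is illustrative only: the part numbers and priorities come from Table 11, while the RACK_PRIORITY mapping and the racking_sequence function are hypothetical names introduced for this example.

```python
# Rack priority per SKU, as listed in Table 11 (1 = racked first).
RACK_PRIORITY = {
    "100-585-031-07": 1,  # Dragon 24, 900GB disks, 64GB RAM
    "100-585-035-06": 2,  # Dragon 24, 300GB disks, 64GB RAM
    "100-585-030-06": 3,  # Dragon 12, 3TB disks, 64GB RAM
    "100-585-055-01": 4,  # Dragon 24, 900GB disks, 256GB RAM
    "100-585-029-05": 5,  # Kylin, 64GB RAM
}


def racking_sequence(skus):
    """Return the SKUs in racking order; servers with equal priority keep their input order."""
    return sorted(skus, key=lambda sku: RACK_PRIORITY[sku])


order = racking_sequence(["100-585-029-05", "100-585-031-07", "100-585-030-06"])
print(order)
```

Master nodes and switches are not part of this ordering because they are always racked first.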


Racking guidelines


• GPDB Compute, Standard, or High Memory modules must not occupy the same DCA.
• The minimum Hadoop configuration must include two Hadoop modules: one serving
  as the Hadoop Master module (hdm) and a second serving as the Hadoop Worker
  (data) module (hdw). For Hadoop Compute with Isilon, the minimum requirement is
  8 Kylin servers (four Hadoop Compute modules of two servers each).
• The 2nd rack (if present) is always an Aggregation rack.
• Racks 3 through 11 (if present) are Expansion racks.
• Any rack containing even one High Memory server (100-585-055-01) is limited to
  thirty rack units (30U) for servers. Switches remain in the standard locations.
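
The rack-position rules above can be expressed as a small lookup. The Python sketch below is an assumption-labeled illustration: the guidelines state the types for racks 2 through 11, and treating Rack 1 as the System rack is inferred from the SYS-RACK references elsewhere in this appendix. The function name rack_type is hypothetical.

```python
def rack_type(rack_number):
    """Return the rack type for a rack position in a DCA of up to 11 racks."""
    if rack_number == 1:
        # Assumption: Rack 1 is the System rack (inferred from this appendix).
        return "DCA2-SYSRACK"
    if rack_number == 2:
        return "DCA2-AGGREG"
    if 3 <= rack_number <= 11:
        return "DCA2-EXPAND"
    raise ValueError("rack number must be between 1 and 11")


print([rack_type(n) for n in (1, 2, 3, 11)])
```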

System

Aggregation
Expansion

Figure 39 11-rack configuration

Figure 40 Aggregation switch locations in a multi-rack DCA


Mixed System rack components

Figure 41 Greenplum DCA2-SYSRACK
Table 12 Greenplum DCA2-SYSRACK - System rack components

DCA Component                               Quantity
Hadoop Servers (Dragon 12, 2U)              16 (8 minimum, 4 hdw + 4 hdm) or 12 High Memory Systems
Master Servers (Kylin, 1U)                  2 (1 Primary + 1 Standby)
GPDB (Segment) Servers (Dragon 24, 2U)      16 or 12 High Memory Systems
Interconnect Switches (Arista 7050S-52)     2
Administration Switches (Arista 7048T-A)    1


Hadoop-only System Rack components (minimum config.)
Note: Supported in DCA version 2.0.1.0 and later.

Figure 42 Hadoop-only System rack
Table 13 Hadoop-only System Rack components

DCA Component                               Quantity
Hadoop Master Servers (hdm)                 4 minimum
Hadoop Worker Servers (hdw)                 4 minimum
Master Servers (Kylin, 1U)                  2
Interconnect Switches (Arista 7050S-52)     2
Administration Switch (Arista 7048T-A)      1


HD-Compute System Rack components (minimum config.)
Note: Supported in DCA version 2.0.2.0 and later.

Figure 43 HDC-Compute System rack
Table 14 HDC-Compute System rack components

DCA Component                               Quantity
Hadoop Compute Servers (hdc)                8 minimum, 22 maximum


Aggregation rack components

Figure 44 Greenplum DCA2-AGGREG
Table 15 Greenplum DCA2-AGGREG - Aggregation rack components

DCA Component                               Quantity
Segment Servers                             16 maximum (or 12 maximum with High Memory Modules)
Master Servers (Kylin, 1U)                  0
Interconnect Switches (Arista 7050S-52)     4 (2 for the Interconnect network; 2 for the Aggregation network)
Administration Switch (Arista 7048T-A)      1


Expansion rack components

Figure 45 Greenplum DCA2-EXPAND
Table 16 Greenplum DCA2-EXPAND - Expansion rack components

Component                                   Quantity
Segment Servers                             16 maximum (or 12 maximum with High Mem Module)
Master Servers (Kylin, 1U)                  0
Interconnect Switches (Arista 7050S-52)     2
Administration Switch (Arista 7048T-A)      1


Power supply reference
Figure 46 shows four external customer-supplied power input circuits connected to DCA
Power Distribution Units (PDUs). The figure shows a full System rack.

Figure 46 Greenplum DCA power cable configuration, full System rack (labels: power switches; customer-supplied power to the upper Zone A, upper Zone B, lower Zone A, and lower Zone B inputs)

Figure 47 Greenplum DCA power cable configuration, 1/2 System rack (labels: power switches; customer-supplied power to the upper Zone A, upper Zone B, lower Zone A, and lower Zone B inputs)

Figure 48 Greenplum DCA power cable configuration, 1/4 System rack (labels: power switches; customer-supplied power to the lower Zone A and lower Zone B inputs; customer-supplied power is not needed to upper Zone A or upper Zone B in a 1/4 rack)


Figure 49 Dense rack configuration


Figure 50 High memory system rack configuration


BMC Controller interface functionality
The baseboard management controller (BMC) is a built-in interface included in most
DCA servers. The BMC provides out-of-band system management facilities. The
controller integrates its own processor, memory, battery, network connection, and
access to the system bus. Key features (available through a supported web browser)
include:


• Power management
• Virtual media access
• Remote console capabilities

BMC gives system administrators the ability to manage a machine as if they were
sitting at the local console.

BMC Controller LED indicators and meanings
Table 17 lists BMC LED states and components to check.
Table 17 BMC LED indicator status and possible required action

Color   State                  Criticality    Description
Green   Solid On               Normal         No action required by Field Support.
                                              BMC is operating in a healthy state.
Green   Blink (1 per second)   Degraded       Redundancy is lost or a non-critical warning/error.
                                              Check for these possible issues:
                                              • Redundancy loss such as power-supply or fan
                                              • Correctable ECC memory error
                                              • Non-critical threshold crossed (Temp, Voltage, input power)
Amber   Solid On               Non-critical   Non-fatal alarm.
                                              Check for critical thresholds surpassed on:
                                              • Temp
                                              • Voltage
                                              • Input power
                                              • Hard Drives
                                              • Fans (minimum number of fans not present)
Amber   Blink (1 per second)   Critical       Critical error.
                                              Check for:
                                              • Power fault
                                              • Insufficient memory present
                                              • CPU thermal trip
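
For scripted triage notes, the color and state combinations in Table 17 can be held in a lookup table. The Python sketch below is illustrative only: the four entries mirror Table 17, while BMC_LED_CRITICALITY and led_criticality are hypothetical names, and the sketch does not read the LED from hardware.

```python
# (color, state) -> criticality, as listed in Table 17.
BMC_LED_CRITICALITY = {
    ("green", "solid on"): "Normal",
    ("green", "blink"): "Degraded",
    ("amber", "solid on"): "Non-critical",
    ("amber", "blink"): "Critical",
}


def led_criticality(color, state):
    """Return the Table 17 criticality for an observed LED color and state."""
    key = (color.strip().lower(), state.strip().lower())
    if key not in BMC_LED_CRITICALITY:
        raise KeyError("LED combination not listed in Table 17: %r" % (key,))
    return BMC_LED_CRITICALITY[key]


print(led_criticality("Amber", "Blink"))
```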


Network and cabling configurations
This section describes the network cabling configurations for the Interconnect and
administration switches.

Interconnect cabling reference
Each rack in the DCA contains two Interconnect switches which provide the Greenplum
Interconnect network. Topics in this section include:


• “Lower Interconnect switch cabling reference”
• “Upper Interconnect switch cabling reference”
• “Dense rack switch cabling reference”
• “Dense rack Interconnect 2 configuration (dual NIC)”

Figure 51 Interconnect switch port map
• Ports 1-8 to servers sdw1-sdw8
• Ports 9-16 to servers sdw9-sdw16
• Port 17 to Primary Master Server (mdw)
• Port 18 to Standby Master Server (smdw)
• Ports 41-44 to customer network (single rack DCA only)
• Ports 45-46 to lower Aggregation switch (aggr-sw-1)
• Ports 47-48 to upper Aggregation switch (aggr-sw-2)
• mLAG peer connections to the other Interconnect switch in the rack
• Serial console
• To Administration switch


Lower Interconnect switch cabling reference
The lower Interconnect switch connects servers to the first Interconnect. Lower
Interconnect switches are always odd-numbered hostnames (for example, i-sw-1, i-sw-3,
i-sw-5, etc.).

Figure 52 Lower Interconnect switch cabling reference


Upper Interconnect switch cabling reference
The upper Interconnect Switch connects servers to the second Interconnect. Upper
Interconnect switches are always even-numbered hostnames (for example, i-sw-2, i-sw-4,
i-sw-6, etc.).
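
The odd and even hostname convention can be sketched in Python. This is an illustration under an assumption: the rack-to-hostname pattern (Rack 1 holds i-sw-1 and i-sw-2, Rack 2 holds i-sw-3 and i-sw-4, and so on) is inferred from the network configuration tables in this appendix. The function names are hypothetical.

```python
def interconnect_hostnames(rack_number):
    """Return (lower, upper) Interconnect switch hostnames for a rack."""
    if rack_number < 1:
        raise ValueError("rack number must be 1 or greater")
    lower = 2 * rack_number - 1  # lower switches are odd-numbered
    upper = 2 * rack_number      # upper switches are even-numbered
    return ("i-sw-%d" % lower, "i-sw-%d" % upper)


def is_lower_switch(hostname):
    """True when an i-sw-N hostname is odd-numbered (a lower switch)."""
    return int(hostname.rsplit("-", 1)[1]) % 2 == 1


print(interconnect_hostnames(3))
```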

Figure 53 Upper Interconnect switch cabling reference


Dense rack switch cabling reference

Figure 54 Dense rack Interconnect 1 configuration (dual NIC)


Dense rack Interconnect 2 configuration (dual NIC)

Figure 55 Dense rack Interconnect 2 configuration (dual NIC)


Table 18 Interconnect switch cable routing, 3-rack DCA

IC switch   SYS-RACK                                AGGR-RACK                               EXPAND-RACK
port        i-sw-1              i-sw-2              i-sw-3              i-sw-4              i-sw-5              i-sw-6
            Server CNA port 0   Server CNA port 1   Server CNA port 0   Server CNA port 1   Server CNA port 0   Server CNA port 1

1           server 1            server 1            server 1            server 1            server 1            server 1
2           server 2            server 2            server 2            server 2            server 2            server 2
3           server 3            server 3            server 3            server 3            server 3            server 3
4           server 4            server 4            server 4            server 4            server 4            server 4
5           server 5            server 5            server 5            server 5            server 5            server 5
6           server 6            server 6            server 6            server 6            server 6            server 6
7           server 7            server 7            server 7            server 7            server 7            server 7
8           server 8            server 8            server 8            server 8            server 8            server 8
9           server 9            server 9            server 9            server 9            server 9            server 9
10          server 10           server 10           server 10           server 10           server 10           server 10
11          server 11           server 11           server 11           server 11           server 11           server 11
12          server 12           server 12           server 12           server 12           server 12           server 12
13          server 13           server 13           server 13           server 13           server 13           server 13
14          server 14           server 14           server 14           server 14           server 14           server 14
15          server 15           server 15           server 15           server 15           server 15           server 15
16          server 16           server 16           server 16           server 16           server 16           server 16
17          mdw                 mdw                 server 17           server 17           server 17           server 17
18          smdw                smdw                server 18           server 18           server 18           server 18
19          server 17           server 17           server 19           server 19           server 19           server 19
20          server 18           server 18           server 20           server 20           server 20           server 20
21          server 19           server 19
22          server 20           server 20
23 to 40
41 to 44    Customer network (in single-rack DCA)
45          aggr-sw-1 port 1    aggr-sw-1 port 3    aggr-sw-1 port 5    aggr-sw-1 port 7    aggr-sw-1 port 9    aggr-sw-1 port 11
46          aggr-sw-1 port 2    aggr-sw-1 port 4    aggr-sw-1 port 6    aggr-sw-1 port 8    aggr-sw-1 port 10   aggr-sw-1 port 12
47          aggr-sw-2 port 1    aggr-sw-2 port 3    aggr-sw-2 port 5    aggr-sw-2 port 7    aggr-sw-2 port 9    aggr-sw-2 port 11
48          aggr-sw-2 port 2    aggr-sw-2 port 4    aggr-sw-2 port 6    aggr-sw-2 port 8    aggr-sw-2 port 10   aggr-sw-2 port 12
49          mLAG peer link: i-sw-1 to i-sw-2        mLAG peer link: i-sw-3 to i-sw-4        mLAG peer link: i-sw-5 to i-sw-6
50          mLAG peer link: i-sw-1 to i-sw-2        mLAG peer link: i-sw-3 to i-sw-4        mLAG peer link: i-sw-5 to i-sw-6
51          mLAG peer link: i-sw-1 to i-sw-2        mLAG peer link: i-sw-3 to i-sw-4        mLAG peer link: i-sw-5 to i-sw-6
52          mLAG peer link: i-sw-1 to i-sw-2        mLAG peer link: i-sw-3 to i-sw-4        mLAG peer link: i-sw-5 to i-sw-6


Administration switch reference
The DCA contains one Administration switch per rack. The Administration switch routes
management traffic, connects all of the servers and switches in a DCA, and provides
service connectivity through a red service cable.
Topics in this section include:


• “Rack 1 Administration switch cabling reference”
• “Dense rack Administration switch port mapping reference”
• “Administration switch cabling routing reference”

Figure 56 Administration switch port map, single rack DCA
• Ports 1-8: to servers 1 through 8
• Ports 9-16: to servers 9 through 16
• Port 17: to Primary Master server (mdw)
• Port 18: to Standby Master server (smdw)
• Port 43: to lower Interconnect switch
• Port 44: to upper Interconnect switch
• Ports 45 and 46: to a-sw-1 and a-sw-2 in multi-rack DCA
• Port 48: red service cable for cluster management (a-sw-1 only)
• Other Administration switches in a multi-rack DCA
• Serial console


Rack 1 Administration switch cabling reference

Port 47: Customer Admin
network access (optional)
Port 48: Cluster Management
(red service cable)

Figure 57 Rack 1 Administration switch cabling reference


Dense rack Administration switch port mapping reference

Figure 58 Dense rack Administration switch port mapping to servers 9 - 16


Administration switch cabling routing reference
Table 19 Administration switch cable routing
Admin
Switch Port

a-sw-1 in
a-sw-2 in
a-sw-3 in
SYS-RACK AGGR-RACK EXPAND-RACK

Admin
Switch Port

a-sw-1 in
SYS-RACK

a-sw-2 in
AGGR-RACK

1

server 1

server 1

server 1

25

a-sw-3, port 45

a-sw-3, port 46

n/a

2

server 2

server 2

server 2

26

a-sw-4, port 45

a-sw-4, port 46

n/a

3

server 3

server 3

server 3

27

a-sw-5, port 45

a-sw-5, port 46

n/a

4

server 4

server 4

server 4

28

a-sw-6, port 45

a-sw-6, port 46

n/a

5

server 5

server 5

server 5

29

a-sw-7, port 45

a-sw-7, port 46

n/a

6

server 6

server 6

server 6

30

a-sw-8, port 45

a-sw-8, port 46

n/a

7

server 7

server 7

server 7

31

a-sw-9, port 45

a-sw-9, port 46

n/a

8

server 8

server 8

server 8

32

a-sw-10, port 45

a-sw-10, port 46

n/a

9

server 9

server 9

server 9

33

a-sw-11, port 45

a-sw-11, port 46

n/a

10

server 10

server 10

server 10

34

a-sw-12, port 45

a-sw-12, port 46

n/a

11

server 11

server 11

server 11

35

n/a

n/a

n/a

12

server 12

server 12

server 12

36

n/a

n/a

n/a

13

server 13

server 13

server 13

37

n/a

n/a

n/a

14

server 14

server 14

server 14

38

n/a

n/a

n/a

15

server 15

server 15

server 15

39

n/a

n/a

n/a

16

server 16

server 16

server 16

40

n/a

n/a

n/a

17

mdw

server 17

server 17

41

n/a

n/a

n/a

18

smdw

server 18

server 18

42

n/a

n/a

n/a

19

server 17

server 19

server 19

43

Lower Interconnect switch <...> port

20

server 18

server 20

server 20

44

Upper Interconnect switch <...> port

21

server 19

—

—

45

a-sw-2
peer

a-sw-1
peer

a-sw-1,
port 25

22

server 20

—

n/a

46

a-sw-2
peer

a-sw-1
peer

a-sw-2,
port 25

23

—

—

n/a

47

Customer Admin
network access
(optional)

24

—

—

n/a

48

Cluster
management
(red service cable)

Customer Admin
network access
(optional)
n/a

a-sw-3 in
EXPAND-RACK

n/a

n/a

Note: A dash (-) indicates cable connections that vary depending on the specific type(s) and quantity of servers and racks in the DCA.


Aggregation switch reference
Servers in a multiple-rack configuration communicate through the two Aggregation
switches located in Rack 2. The following diagram and table show the proper connectivity.

Figure 59 Aggregation switch port map


Interconnect switch-to-Aggregation switch port mapping
Table 20 Interconnect switch-to-Aggregation switch port mapping

Both Aggregation switches (lower aggr-sw-1 and upper aggr-sw-2) are located in Rack 2 (AGGR rack). Each entry reads Interconnect switch port <.......> Aggregation switch port.

                                       Lower Aggregation switch (aggr-sw-1)   Upper Aggregation switch (aggr-sw-2)
Rack             Interconnect switch   Port 45          Port 46               Port 47          Port 48
1                Lower (i-sw-1)        45 <.......> 1   46 <.......> 2        47 <.......> 1   48 <.......> 2
1                Upper (i-sw-2)        45 <.......> 3   46 <.......> 4        47 <.......> 3   48 <.......> 4
2 (AGGR)         Lower (i-sw-3)        45 <.......> 5   46 <.......> 6        47 <.......> 5   48 <.......> 6
2 (AGGR)         Upper (i-sw-4)        45 <.......> 7   46 <.......> 8        47 <.......> 7   48 <.......> 8
3 (Expansion)    Lower (i-sw-5)        45 <.......> 9   46 <.......> 10       47 <.......> 9   48 <.......> 10
3 (Expansion)    Upper (i-sw-6)        45 <.......> 11  46 <.......> 12       47 <.......> 11  48 <.......> 12
4 (Expansion)    Lower (i-sw-7)        45 <.......> 13  46 <.......> 14       47 <.......> 13  48 <.......> 14
4 (Expansion)    Upper (i-sw-8)        45 <.......> 15  46 <.......> 16       47 <.......> 15  48 <.......> 16
5 (Expansion)    Lower (i-sw-9)        45 <.......> 17  46 <.......> 18       47 <.......> 17  48 <.......> 18
5 (Expansion)    Upper (i-sw-10)       45 <.......> 19  46 <.......> 20       47 <.......> 19  48 <.......> 20
6 (Expansion)    Lower (i-sw-11)       45 <.......> 21  46 <.......> 22       47 <.......> 21  48 <.......> 22
6 (Expansion)    Upper (i-sw-12)       45 <.......> 23  46 <.......> 24       47 <.......> 23  48 <.......> 24
7 (Expansion)    Lower (i-sw-13)       45 <.......> 25  46 <.......> 26       47 <.......> 25  48 <.......> 26
7 (Expansion)    Upper (i-sw-14)       45 <.......> 27  46 <.......> 28       47 <.......> 27  48 <.......> 28
8 (Expansion)    Lower (i-sw-15)       45 <.......> 29  46 <.......> 30       47 <.......> 29  48 <.......> 30
8 (Expansion)    Upper (i-sw-16)       45 <.......> 31  46 <.......> 32       47 <.......> 31  48 <.......> 32
9 (Expansion)    Lower (i-sw-17)       45 <.......> 33  46 <.......> 34       47 <.......> 33  48 <.......> 34
9 (Expansion)    Upper (i-sw-18)       45 <.......> 35  46 <.......> 36       47 <.......> 35  48 <.......> 36
10 (Expansion)   Lower (i-sw-19)       45 <.......> 37  46 <.......> 38       47 <.......> 37  48 <.......> 38
10 (Expansion)   Upper (i-sw-20)       45 <.......> 39  46 <.......> 40       47 <.......> 39  48 <.......> 40
11 (Expansion)   Lower (i-sw-21)       45 <.......> 41  46 <.......> 42       47 <.......> 41  48 <.......> 42
11 (Expansion)   Upper (i-sw-22)       45 <.......> 43  46 <.......> 44       47 <.......> 43  48 <.......> 44
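
The port pairs in Table 20 follow a regular pattern: for Interconnect switch i-sw-k, ports 45 and 46 land on aggr-sw-1 ports 2k-1 and 2k, and ports 47 and 48 land on aggr-sw-2 ports 2k-1 and 2k. The Python sketch below encodes that observed pattern as a cross-check when tracing cables. It is not an EMC tool, the function name aggregation_uplinks is hypothetical, and the printed table remains the authority.

```python
def aggregation_uplinks(switch_number):
    """Map uplink ports 45-48 of i-sw-<switch_number> to (aggregation switch, port)."""
    if not 1 <= switch_number <= 22:
        raise ValueError("Table 20 covers i-sw-1 through i-sw-22")
    odd_port = 2 * switch_number - 1
    even_port = 2 * switch_number
    return {
        45: ("aggr-sw-1", odd_port),
        46: ("aggr-sw-1", even_port),
        47: ("aggr-sw-2", odd_port),
        48: ("aggr-sw-2", even_port),
    }


# Example: uplinks for the upper Interconnect switch in Rack 3 (i-sw-6).
print(aggregation_uplinks(6))
```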


Network hostname and IP configuration
Table 21 DCA network configuration

Switches

Rack     Component                    hostname    IP address
Rack 1   Administration Switch        a-sw-1      172.28.0.190
Rack 2   Administration Switch        a-sw-2      172.28.0.191
Rack 3   Administration Switch        a-sw-3      172.28.0.192
Rack 4   Administration Switch        a-sw-4      172.28.0.193
Rack 5   Administration Switch        a-sw-5      172.28.0.194
Rack 6   Administration Switch        a-sw-6      172.28.0.195
Rack 7   Administration Switch        a-sw-7      172.28.0.196
Rack 8   Administration Switch        a-sw-8      172.28.0.197
Rack 9   Administration Switch        a-sw-9      172.28.0.198
Rack 10  Administration Switch        a-sw-10     172.28.0.199
Rack 11  Administration Switch        a-sw-11     172.28.1.190
Rack 1   Interconnect Switch, lower   i-sw-1      172.28.0.170
Rack 1   Interconnect Switch, upper   i-sw-2      172.28.0.180
Rack 2   Interconnect Switch, lower   i-sw-3      172.28.0.171
Rack 2   Interconnect Switch, upper   i-sw-4      172.28.0.181
Rack 3   Interconnect Switch, lower   i-sw-5      172.28.0.172
Rack 3   Interconnect Switch, upper   i-sw-6      172.28.0.182
Rack 4   Interconnect Switch, lower   i-sw-7      172.28.0.173
Rack 4   Interconnect Switch, upper   i-sw-8      172.28.0.183
Rack 5   Interconnect Switch, lower   i-sw-9      172.28.0.174
Rack 5   Interconnect Switch, upper   i-sw-10     172.28.0.184
Rack 6   Interconnect Switch, lower   i-sw-11     172.28.0.175
Rack 6   Interconnect Switch, upper   i-sw-12     172.28.0.185
Rack 7   Interconnect Switch, lower   i-sw-13     172.28.0.176
Rack 7   Interconnect Switch, upper   i-sw-14     172.28.0.186
Rack 8   Interconnect Switch, lower   i-sw-15     172.28.0.177
Rack 8   Interconnect Switch, upper   i-sw-16     172.28.0.187
Rack 9   Interconnect Switch, lower   i-sw-17     172.28.0.178
Rack 9   Interconnect Switch, upper   i-sw-18     172.28.0.188
Rack 10  Interconnect Switch, lower   i-sw-19     172.28.0.179
Rack 10  Interconnect Switch, upper   i-sw-20     172.28.0.189
Rack 11  Interconnect Switch, lower   i-sw-21     172.28.1.170
Rack 11  Interconnect Switch, upper   i-sw-22     172.28.1.180
Rack 2   Aggregation Switch, lower    aggr-sw-1   172.28.0.248
Rack 2   Aggregation Switch, upper    aggr-sw-2   172.28.0.249

Reserved addresses and servers

Rack     Component                             hostname    BMC IP (host-sp)               NIC 1 IP (host-cm)             Interconnect
         Reserved for DHCP                     n/a         n/a                            172.28.6.170 - 172.28.6.179    n/a
Rack 1   Primary Master Server, lower server   mdw         172.28.0.250                   172.28.4.250                   172.28.8.250
Rack 1   Standby Master Server, upper server   smdw        172.28.0.251                   172.28.4.251                   172.28.8.251
         GPDB Segment Server 1-160             sdw#        172.28.0.#                     172.28.4.#                     172.28.8.#
         GPDB Segment Server 161-176           sdw#        172.28.1.1 - 172.28.1.16       172.28.5.1 - 172.28.5.16       172.28.9.1 - 172.28.9.16
         DIA Server 1-16                       etl#        172.28.0.20#                   172.28.4.20#                   172.28.8.20#
         DIA Server 17-32                      etl#        172.28.1.201 - 172.28.1.216    172.28.5.201 - 172.28.5.216    172.28.9.201 - 172.28.9.216
         DIA Server 33-48                      etl#        172.28.2.231 - 172.28.2.246    172.28.6.231 - 172.28.6.246    172.28.10.231 - 172.28.10.246
         DIA Server 49-64                      etl#        172.28.3.231 - 172.28.3.246    172.28.7.231 - 172.28.7.246    172.28.11.231 - 172.28.11.246
         Hadoop Master Node 1-8                hdm1        172.28.1.250                   172.28.5.250                   172.28.9.250
                                               hdm2        172.28.1.251                   172.28.5.251                   172.28.9.251
                                               hdm3        172.28.1.252                   172.28.5.252                   172.28.9.252
                                               hdm4        172.28.1.253                   172.28.5.253                   172.28.9.253
                                               hdm5        172.28.2.250                   172.28.6.250                   172.28.10.250
                                               hdm6        172.28.2.251                   172.28.6.251                   172.28.10.251
                                               hdm7        172.28.3.250                   172.28.7.250                   172.28.11.250
                                               hdm8        172.28.3.251                   172.28.7.251                   172.28.11.251
         Hadoop Worker Node 1-160              hdw1-160    172.28.2.#                     172.28.6.#                     172.28.10.#
         Hadoop Worker Node 161-320 [1]        hdw161-320  172.28.3.#                     172.28.7.#                     172.28.11.#
         For hdw161-320, # = node number minus 160. Examples: hdw162-sp = 172.28.3.2, hdw162-cm = 172.28.7.2, hdw162-1 = 172.28.11.2
         Hadoop Compute Node 1-60              hdc1-60     172.28.2.170 - 172.28.2.229    172.28.6.170 - 172.28.6.229    172.28.10.170 - 172.28.10.229
         Hadoop Compute Node 61-120            hdc61-120   172.28.3.170 - 172.28.3.229    172.28.7.170 - 172.28.7.229    172.28.11.170 - 172.28.11.229
         IP Addresses reserved for Isilon [2]

1. Hadoop Worker nodes are numbered 1-320. To accommodate the required number of hosts, the third IP address octet is
incremented by 1 and the fourth octet restarts at 1 when the node number reaches 161. For example, the host hdw160-sp uses a
third octet of 2 and a fourth octet of 160, and the host hdw161-sp uses a third octet of 3 and a fourth octet of 1. To see a complete list of
IP addresses and hostnames, view the /etc/hosts file.
2. 172.28.8.217 through 172.28.8.246 and 172.28.9.217 through 172.28.9.246
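
The Hadoop Worker numbering rule in Table 21 and footnote 1 can be sketched as a small shell function. This is an illustration only: hdw_ip is a hypothetical helper name, not a DCA utility, and the network labels sp, cm, and ic are shorthand chosen here for the BMC, NIC 1, and Interconnect columns. The /etc/hosts file on the DCA remains the authoritative list.

```shell
# Sketch: derive a Hadoop Worker address from its node number (Table 21, footnote 1).
# hdw_ip is a hypothetical helper for illustration, not part of the DCA software.
hdw_ip() {
    # $1 = node number (1-320)
    # $2 = network: sp (BMC), cm (NIC 1), or ic (Interconnect)
    n=$1
    case "$2" in
        sp) base=2 ;;
        cm) base=6 ;;
        ic) base=10 ;;
        *)  echo "unknown network: $2" >&2; return 1 ;;
    esac
    if [ "$n" -lt 1 ] || [ "$n" -gt 320 ]; then
        echo "node number out of range: $n" >&2
        return 1
    fi
    if [ "$n" -gt 160 ]; then
        # Nodes 161-320: third octet is incremented by 1, fourth octet restarts at 1.
        echo "172.28.$((base + 1)).$((n - 160))"
    else
        echo "172.28.$base.$n"
    fi
}

hdw_ip 160 sp   # prints 172.28.2.160
hdw_ip 162 sp   # prints 172.28.3.2
hdw_ip 162 cm   # prints 172.28.7.2
hdw_ip 162 ic   # prints 172.28.11.2
```

The printed values match the examples given in the table (hdw162-sp, hdw162-cm, and hdw162-1).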

Multiple-rack cabling reference
Table 22 Cabling kit contents and part numbers

Table 23 Cable kits for a 7-to-11-rack DCA

Connect from:      To:                     Use cable kit:
Rack 2 - AGGREG    Rack 1 - SYSRACK        DCA2-CBL10
Rack 2 - AGGREG    Rack 2 - AGGREG         DCA2-CBL10
Rack 2 - AGGREG    Rack 3 - 1st EXPAND     DCA2-CBL10
Rack 2 - AGGREG    Rack 4 - 2nd EXPAND     DCA2-CBL10
Rack 2 - AGGREG    Rack 5 - 3rd EXPAND     DCA2-CBL10
Rack 2 - AGGREG    Rack 6 - 4th EXPAND     DCA2-CBL10
Rack 2 - AGGREG    Rack 7 - 5th EXPAND     DCA2-CBL30
Rack 2 - AGGREG    Rack 8 - 6th EXPAND     DCA2-CBL30
Rack 2 - AGGREG    Rack 9 - 7th EXPAND     DCA2-CBL30
Rack 2 - AGGREG    Rack 10 - 8th EXPAND    DCA2-CBL30
Rack 2 - AGGREG    Rack 11 - 9th EXPAND    DCA2-CBL30

Configuration files
Configuration files are text files that contain the hostnames of servers that occupy quarter,
half, or full rack configurations. The file used depends on the desired function. Refer to the
table below for a description of each configuration and host file. The hostfiles are located
in /home/gpadmin/gpconfigs:
Table 24 Hostfiles created by the DCA Setup utility

File                 Description
gpexpand_map         Expansion MAP file created during the dca_setup option Expand the DCA.
                     Its purpose is to reallocate GPDB primary and mirror instances on the
                     new hardware.
gpinitsystem_map     MAP file used during installation of GPDB blocks to assign primary and
                     mirror segments to each server.
hostfile             Contains one hostname per server for ALL servers in the system.
                     Includes GPDB, DIA and HD (if present).
hostfile_segments    Contains the hostnames of the segment servers of all GPDB blocks.
hostfile_gpdb        Contains the hostnames for GPDB servers.
hostfile_dia         Contains the hostnames of the DIA servers.
hostfile_hadoop      Contains the hostnames of the Hadoop servers.
hostfile_hdm         Contains the hostnames of all Hadoop Master servers.
hostfile_hdw         Contains the hostnames of all Hadoop Worker servers.
hostfile_hdc         Contains the hostnames of all Hadoop Compute servers.
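
Because a hostfile holds one hostname per line, it can be read with an ordinary shell loop. The sketch below is illustrative only: list_hosts is a hypothetical helper, and the sample file is created on the fly so the example does not depend on a real DCA. On an appliance you would pass one of the files from /home/gpadmin/gpconfigs instead.

```shell
# Sketch: read a hostfile that lists one hostname per line.
# list_hosts is a hypothetical helper, not a DCA utility.
list_hosts() {
    # $1 = path to a hostfile; blank lines are skipped.
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue
        printf '%s\n' "$host"
    done < "$1"
}

# Stand-in file for illustration; a real one would be, for example,
# /home/gpadmin/gpconfigs/hostfile_segments.
sample=$(mktemp)
printf 'sdw1\nsdw2\n\nsdw3\n' > "$sample"

list_hosts "$sample"
echo "hosts: $(list_hosts "$sample" | wc -l | tr -d ' ')"   # prints hosts: 3

rm -f "$sample"
```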

Location of old core files
(Applies to DCA version 2.0.1.0 and later) Old core files are moved automatically to a
separate directory to prevent them from being sent to Support following a healthmon
restart. For example, for sdw1, old core files are moved to /var/crash/user.
[root@sdw1 user-processed]# ls -l /var/crash/user

Default passwords
The following table lists default passwords for all the components in a DCA.
Table 25 Default user names and passwords

Component                        User             Password
Master Servers                   BMC root user    For a new unconfigured server: password
                                                  For an existing configured server: sephiroth
                                 root             changeme
                                 gpadmin          changeme
Interconnect, Administration,    admin            changeme
and Aggregation switches

APPENDIX B
Connect a workstation to the DCA
This section describes how to connect a workstation to the DCA in preparation for
performing various maintenance tasks. Administration is always performed from the
Primary Master server (hostname mdw). A Windows laptop with the PuTTY application
installed is required.

Laptop prerequisites
The laptop you use to connect to the Greenplum DCA must have the following capabilities
to perform Greenplum DCA administration:
• RJ-45 Ethernet port
• Administrator access on the laptop
• An ssh client such as PuTTY or Cygwin with the OpenSSH package enabled
• An scp client such as WinSCP, PuTTY PSCP or Cygwin with the OpenSSH package enabled

Configure your laptop to connect to the DCA
Perform the appropriate procedure to configure your laptop to connect to the DCA
Administration Network.

Configure a Windows 7 laptop
1. Locate the red service cable on the laptop tray. The cable is connected to port 48 of
the first administration switch (a-sw-1). Connect the service cable to your laptop.
2. Click Start > Control Panel > Network and Internet > Network and Sharing Center.
3. On the left pane click Change adapter settings.
4. Right-click the connection that you want to change, and then click Properties. If you
are prompted for an administrator password or confirmation, type the password or
provide confirmation.

5. From the Networking tab select Internet Protocol Version 4 (TCP/IPv4).
6. Click Properties.
7. Select Use the following IP address, and then type the following IP address and subnet
mask:
• IP address: 172.28.3.253
• Subnet mask: 255.255.248.0
Note: Leave the Default gateway field blank. Do not configure a gateway.

8. Click OK.
9. Click Close.
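
The subnet mask 255.255.248.0 is what lets the laptop address 172.28.3.253 reach the Master Server at 172.28.4.250 without a gateway: both addresses fall in the same network once the mask is applied. The arithmetic can be sketched as follows. The helper names ip_to_int and same_subnet are hypothetical and exist only for this illustration.

```shell
# Sketch: check that two IPv4 addresses share a subnet under a given mask.
# ip_to_int and same_subnet are hypothetical helpers for illustration.
ip_to_int() {
    # Split the dotted-quad in $1 on "." and combine the four octets.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

same_subnet() {
    # $1, $2 = addresses to compare; $3 = subnet mask
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Laptop address and Master Server NIC 1 address from this guide.
if same_subnet 172.28.3.253 172.28.4.250 255.255.248.0; then
    echo "same subnet"        # this branch is taken
else
    echo "different subnet"
fi
```

With this mask the third octet is compared only on its top five bits, so 3 and 4 both reduce to 0, whereas an Interconnect address such as 172.28.8.250 reduces to 8 and is not directly reachable from the laptop.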

Configure a Windows XP laptop
1. Locate the red service cable on the laptop tray. The cable is connected to port 48 of
the first administration switch (a-sw-1). Connect the service cable to your laptop.
2. On your Windows laptop, open Control Panel.
3. Double-click Network Connections.
4. Right-click Local Area Connection and then select Properties.
5. Select Internet Protocol (TCP/IP) and then click Properties.
6. Enter the IP address and subnet mask:
• IP address: 172.28.3.253
• Subnet mask: 255.255.248.0
Note: Leave the Default gateway field blank. Do not configure a gateway.
7. Click OK.

Connect to the Master Server using an SSH client
The method you use to establish an ssh connection to the Master Server depends on your
chosen ssh client (PuTTY, Cygwin, etc.). Regardless of the ssh client, connect using the
following values:
• hostname: 172.28.4.250
• username: root
• root password: changeme (or whatever the customer’s root password is)

PuTTY example
1. Open PuTTY and enter 172.28.4.250 in the Host Name (or IP address) field. Select
SSH as the Connection type.

2. Click Open.
3. If this is the first time you have connected to this server, a security alert will display.
Click Yes to continue.
4. At the SSH Login window, enter your username and password. For example:
login as: root
root@172.28.4.250 password: changeme

Cygwin example
To use Cygwin, you must have enabled the OpenSSH package when you installed Cygwin.
Open a Cygwin terminal window and type the following at the prompt:
$ ssh root@172.28.4.250

When prompted, type the root password (default is changeme).

Copy a file to the Master Server using an SCP client
The method you use to copy a file from your local laptop to the Greenplum master server
depends on your chosen scp client (WinSCP, Cygwin, etc.). Regardless of the scp client,
connect using the following values:
• hostname: 172.28.4.250
• username: gpadmin
• gpadmin password: changeme (or whatever the customer’s gpadmin password is)
• Destination on the master: /home/gpadmin

WinSCP Example
1. Log in to the master host IP 172.28.4.250 as user gpadmin. Select SFTP as the File
protocol.

2. On your local host, locate the file you want to copy and then choose the
/home/gpadmin directory on the master server.

3. Click Copy.

Cygwin example
1. To use Cygwin, you must have enabled the OpenSSH package when you installed
Cygwin. Open a Cygwin terminal window and type the following at the prompt:
$ cd 
$ scp  gpadmin@172.28.4.250:/home/gpadmin

2. When prompted, type the gpadmin password (the default is changeme).

Connect to an Interconnect or Administration switch using PuTTY
This section describes how to connect your service laptop to a serial port on an
Interconnect or Administration switch. You must perform this procedure if the switch
contains factory settings or cannot be accessed by telnet or ssh through the DCA
Administration network.
1. Connect one end of a serial cable to the serial port on the switch. Connect the other
end of the cable to your workstation.
Note: If the service laptop or workstation does not have a serial port, you can use a
USB-to-Serial Adapter.
2. Launch the PuTTY application.
3. Select Serial in Basic Options under the Session section.
Serial option in a PuTTY session

4. Expand the connection section and select Serial. Verify that the settings for the COM
port are 9600 Baud, 8 data bits, and no hardware flow control.

5. Click Open to connect.
6. Press  to display the login prompt.

APPENDIX C
Power Off the DCA
To safely shut down and power off Greenplum DCA hardware and software, perform the
following tasks in sequence:
• Task 1: Connect to the Greenplum DCA Master Server
• Task 2: Stop the Greenplum Database software and shut down the OS
• Task 3: Place the PDU power switches in the OFF position

IMPORTANT
Stop all running queries and data loading before you power down the DCA.

Task 1: Connect to the Greenplum DCA Master Server
The fastest method to shut down a DCA is to SSH in to a Master Server through an external
network connection.
If the external connection is not available and you have a service laptop, connect to the DCA
as described in this procedure. This procedure assumes you are using the Windows
operating system.
1. Locate the system rack of the DCA.
The system rack contains the Primary and Standby Master servers. Master servers are
highlighted in red in Figure 60.

Figure 60 Master Servers in the System rack

2. Locate the red service cable on the laptop tray and connect it to your laptop. The red
service cable is connected to port 48 on the Administration switch.

3. From your Windows laptop navigate to Start > Control Panel > Network and Internet >
Network and Sharing Center.
4. On the left pane click Change adapter settings.
5. Right-click Local Area Connection and select Properties.
6. From the Networking tab select Internet Protocol Version 4 (TCP/IPv4).
7. Click Properties.
8. Select Use the following IP address, and then enter the following IP address and
subnet mask:
• IP address: 172.28.3.253
• Subnet mask: 255.255.248.0
9. Click OK.
10. Click Close.
11. Open an SSH client (such as PuTTY) and enter:
• Host Name (or IP address): 172.28.4.250
• Connection type: SSH
12. Click Open.
If this is the first time you have connected to this server, a security alert will display.
13. Click Yes to continue.
14. Log in as the user root with password changeme.
If the default password changeme was changed, enter the current password.

Task 2: Stop the Greenplum Database software and shut down the OS
To ensure data consistency across primary and mirror segments, you must stop the
Greenplum Database software correctly.
1. To prevent false dial home messages from being sent to EMC Support during service,
disable health monitoring by stopping the healthmon daemon:
# dca_healthmon_ctl -d

2. Switch to the user gpadmin:
# su - gpadmin

3. When prompted for the password, enter changeme.
If the default password changeme was changed, enter the current password.
4. Stop the Greenplum Database:
$ gpstop -af

5. Stop Greenplum Command Center:
$ gpcmdr --stop

6. Switch to the user root:
$ su -

7. Start the DCA Shutdown utility:


Issuing the shutdown command immediately shuts down the DCA. Make sure that
you are ready to shut down the DCA before you issue this command.
# dca_shutdown

8. Verify that the green LED on the power button on each server turns off after 1-2
minutes (see Figure 61 and Figure 62).
9. If a server does not power off, power it off manually by pressing the power button.
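
As a checklist, the command order in Task 2 can be summarized in a dry-run sketch. Nothing is executed here: each step is only printed, labeled with the user that the procedure runs it as. The function names run_step and shutdown_plan are hypothetical; the commands themselves are the ones listed in the steps above.

```shell
# Dry-run sketch of the Task 2 command order. Commands are printed, never run.
# run_step and shutdown_plan are hypothetical names used for illustration.
run_step() {
    # $1 = user the procedure runs the command as, $2 = command line
    printf '[%s] %s\n' "$1" "$2"
}

shutdown_plan() {
    run_step root    "dca_healthmon_ctl -d"   # disable health monitoring first
    run_step gpadmin "gpstop -af"             # stop the Greenplum Database
    run_step gpadmin "gpcmdr --stop"          # stop Greenplum Command Center
    run_step root    "dca_shutdown"           # shuts down the DCA immediately
}

shutdown_plan
```

Keeping dca_shutdown last reflects the warning in step 7: once it is issued, the DCA shuts down immediately.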
Figure 61 Location of power button on a GPDB server (applies also to Hadoop Masters & Workers)

Figure 62 Location of power button on Master, DIA, and Hadoop Compute servers

Task 3: Place the PDU power switches in the OFF position
When the Greenplum Database is stopped and the operating system is shut down on each
server, it is safe to power off the system via the eight PDU power switches in each rack.
1. Starting from the rear of the System rack (Rack 1), locate the power switches in the
upper and lower Power Zones A and B (see Figure 63).

Figure 63 Rack power switch locations (callouts: power switches and customer-supplied power inputs for upper and lower Zones A and B)

2. First place the power switches in lower Power Zones A and B in the OFF position, and
then place the power switches in upper Power Zones A and B in the OFF position.
3. Power off the remaining racks in the same way, one rack at a time, first placing the
power switches in the lower zone and then the upper zone in the OFF position.
After a few seconds, there should be no lit LEDs on any components in the system.
Shutdown is complete.

APPENDIX D
Linux and vi Command Reference
This appendix is a quick reference of basic Red Hat Linux commands, Greenplum-specific
Linux commands, and common Vi text editor commands.

Common Linux command reference
Table 26 Common Linux commands

Linux command            Description

Moving Around
/                        refers to the root directory
..                       refers to the parent directory
Up/down arrows           repeats the last (up arrow) or next (down arrow) command you typed
pwd                      displays the current directory
cd name                  changes to the named directory
cd                       returns you to your home directory

Basic Commands
ls                       lists the contents of the current directory
ls dir_name              lists the contents of the named directory
ls -l                    lists the contents of the current directory in long format; this
                         includes file permissions, ownership information, and file size
ls -a                    lists all the files in the current directory, including files that
                         start with a period (“.”)
cat filename             prints the entire content of the named file to the screen
more filename            prints the content of the named file to the screen one page at a
                         time, with scrolling and search facilities
cp source destination    copies the source file to the named destination
                         for example: cp /misc/temp .
                         copies a file called temp, located in the misc directory, to the
                         current directory (“.”)
mv source destination    moves the source file or directory to the named destination
                         for example: mv /misc/temp .
                         moves a file called temp, located in the misc directory, to the
                         current directory (“.”)
rm filename              deletes (removes) the named file
mkdir dir_name           creates a new directory
rmdir dir_name           removes the specified directory (directory must be empty)
source                   reads and executes a file in the current shell, typically to load
                         path and environment settings
su                       temporarily assumes the superuser (root) identity; useful for
                         system administration tasks
tar                      extracts (untars) a tar (tape archive) file, which may also be
                         compressed
unzip                    extracts compressed files from a ZIP archive
grep string filename     prints all the lines in a file that contain the specified string
passwd                   allows you to change the password used to access your user
                         account. You are prompted to enter your current password, then
                         enter a new one.
who                      displays a list of users currently logged on to this computer

Getting Help
man command              displays the manual page (man page) for the specified command,
                         including possible options and switches and more detailed
                         information about using that command

Shutting down and rebooting a Linux machine
/sbin/shutdown -r now    reboots the machine immediately
/sbin/shutdown -h now    shuts down the machine immediately

Greenplum Linux Commands
gpcheck                  verifies and validates Greenplum Database platform settings
gpexpand                 expands an existing Greenplum Database across new hosts in the
                         array
gpinitsystem             initializes a Greenplum Database system by using configuration
                         parameters specified in the gp_init_config file
gpinitstandby            adds and/or initializes a standby master host for a Greenplum
                         Database system
gpseginstall             installs the Greenplum Database software on multiple hosts
gpscp                    copies files between multiple hosts at the same time
gpssh-exkeys             exchanges SSH keys between multiple hosts so that they can
                         connect to each other without a password prompt
gpstate                  verifies the DCA master server status

vi Quick Reference
The following is a quick reference for the vi editor.
Table 27 Common vi commands

vi command        Description

Inserting/Deleting Text (to exit insert mode, press the [ESC] key)
a                 append text, after the cursor
i                 insert text, before the cursor
R                 enter overtype mode
x                 delete character
dd                delete current line

Moving Cursor
h, [BACKSPACE]    left one character
l, [SPACE]        right one character
w                 forward one word
b                 back one word
e                 end of word
j                 down one line
k                 up one line
?pattern          search backward for pattern
/pattern          search forward for pattern
n                 repeat last search
N                 repeat last search in the opposite direction

Saving File and Exiting
:wq               save file and quit
:q!               force quit the editor, do not save changes

APPENDIX E
Replace a Server in the Greenplum DCA Rack
This appendix describes how to replace the servers in a DCA 40U rack. Server types
include:
• 1U servers: Master, DIA, and Hadoop compute servers (EMC SVR-I1U-1208)
• 2U servers: GPDB (EMC SVR-I2U-R2224) and Hadoop Master and Worker servers
(EMC SVR-I2U-R2312)

This appendix includes the following sections:
• Mounting kit parts
• Task 1: Remove the server from the rack
• Task 2: Remove the inner rails from the original server
• Task 3: Attach the inner rails to the replacement server
• Task 4: Install the server in the rack

Mounting kit parts
The server mounting kit includes rails and screws as listed in the following table. Verify
that these parts are included with the replacement server.
Component                                        Use
2 universal rail assemblies (consists of         Attach back to front on either side between
slide rails for connection to the rack and       rack channels
inner rails for connection to server)
Four Phillips pan-head 8-32 x 0.35 in screws     Stabilize the server and rail mounting

You need a #2 Phillips-head screwdriver to complete the installation of the rails and
server.

Task 1: Remove the server from the rack

The enclosure is heavy and should be installed into or removed from a rack by two
people. To avoid personal injury and/or damage to the equipment, do not attempt to lift
and install the enclosure into a rack without a mechanical lift and/or help from another
person.
Procedure:
IMPORTANT
When removing the server from the rack, do not hold the server up by its power/control
module, which is on the right side of the front of the server.
1. Unplug all power and I/O cables from the back of the server, and label the cables so
you can easily identify them when you need to connect them to the replacement
server.
2. Remove the stabilizing screw behind the latch bracket on each side (Figure 64).

Figure 64 Remove the stabilizer screws

3. Pull the server forward until it locks in place (Figure 65).

Figure 65 Slide server out of the rack to the locked position

4. Slide the blue disconnect tabs forward to release the inner rails from the slide rails
(Figure 66).


Once you release the server from the inner rails, you must support the full weight of
the server.
5. Be prepared to support the full weight of the server, and then slowly pull the server
forward and remove it from the rack (Figure 66).

Figure 66 Release the inner rail locks and remove the server from the rack

Task 2: Remove the inner rails from the original server
1. On the middle of the inner rail, push in and hold the metal latch.
2. Push the rail forward to release the connection studs from the small end of the rail
notches.
3. When the connection studs are in the large end of the rail notches, release the metal
latch.
4. Pull the inner rails away from the server.

Figure 67 Release the inner rails from the server

Task 3: Attach the inner rails to the replacement server
1. Align the large end of the rail notches on the inner rail with the connection studs on
the side of the server.
2. Push the flat side of the inner rail onto the connection studs.
3. Slide the inner rail backwards along the server until the studs fit securely into the
small end of the rail notches.
An audible click indicates that the rail is secure.

Figure 68 Attach an inner rail to the server

Task 4: Install the server in the rack

The enclosure is heavy and should be installed into or removed from a rack by two
people. To avoid personal injury and/or damage to the equipment, do not attempt to lift
and install the enclosure into a rack without a mechanical lift and/or help from another
person.
Procedure:
IMPORTANT
When installing the server in the rack, do not pick the server up by the rotating power
console on the front right side of the server and do not push on the power console.

1. On each slide rail bring the ball bearing retainer assembly fully to the front, so it rides
onto the security knob (Figure 69).

Figure 69 Correct location for ball bearing retainer assembly

2. From the front of the rack, align the inner rails attached to the server with the white
plastic guide block at the front inside of each slide rail (Figure 70).
Note: For clarity Figure 70 shows the inner rail without the server attached.

Figure 70 Align the inner rail with white plastic guide block

3. Slide the server into the chassis so the inner rails extend over the plastic guide blocks
and the first part of the ball bearing retainer assemblies (Figure 71).
Note: For clarity Figure 71 shows the inner rail without the server attached.

Figure 71 Inner rail over the first part of ball bearing retainer assembly

4. Once the inner rails are properly engaged with the ball bearing retainer assemblies,
push the server into the rack until the slide rails are engaged and locked.
An audible click indicates that the slide rails are engaged and locked.
5. On the outside of each rail assembly, slide the blue disconnect tab forward to unlock
the server, and push the server completely into the rack (Figure 72).

Figure 72 Inserting the server completely into the rack

6. To further secure the rail assembly and server in the rack, insert and tighten a small
stabilizer screw directly behind each bezel latch (Figure 73).

Figure 73 Installing the stabilizer screws

7. Reconnect data and power cables as described in the server replacement procedure.

APPENDIX F
Install a Switch in a Rack
This appendix describes how to replace the switches in a DCA 40U rack. It includes the
following major sections:
• Switch mounting kit parts
• Replace the switch in the rack
• Replace an optical SFP module

Switch types include:
• Interconnect and Aggregation (10GB; SWCH-AR1U-7050S-52)
• Administration (1GB; SWCH-AR1U-7048T)

Switch mounting kit parts
The switch mounting kit includes rail assemblies and screws as listed below.
Component                                        Use
2 rail assemblies (consists of outer rails       Attach back to front on either side between
for connection to the rack and inner rails       rack channels and to the switch.
for connection to switch)
Eight Phillips pan-head 4M x 6 mm screws         Attach inner rails to switch (4 per rail)
Six Phillips pan-head 5M x 16 mm screws          Stabilize the rail mounting (3 per rail)

You need a Phillips-head screwdriver to complete the installation of the rails and switch.

Replace the switch in the rack
Replacement of non-FRU switch components by unauthorized personnel may void service
warranties. If any non-FRU component fails, you must replace the entire switch.
Replacing the switch consists of the following steps:
• Task 1: Unpack the replacement switch
• Task 2: Remove the old switch from the rack
• Task 3: Transfer any optical SFP modules
• Task 4: Transfer the inner rails to the replacement switch
• Task 6: Install the switch in the rack

Task 1: Unpack the replacement switch
Unpack the replacement switch and place it on a clean, static-free surface near the rack
with the faulted switch.

Task 2: Remove the old switch from the rack
1. From the back of the rack:
a. Unplug all cables from the switch, and label the cables for easy identification when
you need to plug them into the replacement switch.

b. Unplug both switch power cords from the rack’s power distribution unit(s).
2. From the front of the rack:
a. Unplug both power cords from the switch.
b. Push the end of each power cord through the large hole in the front of each rail
assembly so that you can slide the switch forward.
c. Remove the middle stabilizing screw from the front of each rail assembly
(Figure 74).
d. Pull the switch out of the rack and place it near the replacement switch
(Figure 75).

Figure 74 Removing the middle stabilizer screws

Figure 75 Removing the switch from the rack

Task 3: Transfer any optical SFP modules
You must transfer any optical SFP modules from the switch ports in the faulted switch to
the same numbered switch ports in the replacement switch.
For each optical SFP module in a port on the faulted switch:
1. Remove the optical SFP module (Figure 76):
a. On the SFP module, gently pull down on the spring release latch.
b. While still holding onto the latch, gently pull out the SFP module.

Figure 76 Removing an optical SFP module from a switch port

2. On the replacement switch, install the optical SFP module in the port with the same
number as the port from which you removed it (Figure 77):
a. On the SFP module, push the spring release latch up.
b. Align the SFP module with the switch port.
c. Slide the SFP module into the switch port until it is securely connected.

Figure 77 Installing an optical SFP module in a switch port

Task 4: Transfer the inner rails to the replacement switch
Transfer the inner rails from the faulted switch to the replacement switch.
1. Unscrew the four screws attaching each inner rail to the faulted switch.
Each rail assembly consists of an inner rail and an outer rail.
2. Attach an inner rail to each side of the replacement switch:
a. Slide the rail sections apart to separate the inner rail from the outer rail (Figure 78).

Figure 78 Removing the inner rail from the outer rail

b. Align the holes labelled on the inner rail with the holes on the side of the
replacement switch and secure the inner rail to the replacement switch with four
M4 x 6mm screws (Figure 79).

Figure 79 Attaching an inner rail to the switch

Task 5: Attach the outer rails to the rack (if necessary)

In most service situations, the outer rails are already attached to the rack and you do not
have to perform this procedure. It is provided here mainly for reference.
1. Attach a switch power cord to each outer rail (Figure 80):
a. At the rear of the outer rail (the end with the alignment pins), feed the male (prong)
end of a switch power cord through the small hole on the outer rail from the outside to
the inside of the rail.
b. Pull enough of the power cord through the hole to allow the cable to be plugged
into the AC power outlet in the rack.

Figure 80 Attaching the power cord to the outer rail

c. Attach the cord loosely to the rail with plastic ties. Anchor the plastic ties through
the metal loops on the outside of the rail.
The outside of the rail is the side with the two posts.

2. Attach the outer rails to the rack channels:
a. From the front of the rack, align the rail alignment posts with the rear channel holes for
the selected 1U (1.75 in) of rack space, and insert the posts securely into the holes
(Figure 81).

Figure 81 Inserting the rail alignment posts in the rear channel holes

b. Secure the rail to the front channel with two small stabilizer screws in the top and
bottom holes, leaving the screws slightly loose (Figure 82).

Figure 82 Securing the rails to the front channel

Task 6: Install the switch in the rack
1. Install the switch in the rack (Figure 83):
a. At the front of the rack, align the rails attached to the switch with the channels of
the outer rails.
b. Slide the switch into the outer rails and push the switch into the rack.

Figure 83 Installing the switch in the rack

2. Secure the rails in the rack by threading a small stabilizer screw through the front rack
channel and into the middle hole of each rail (Figure 84).

Figure 84 Installing the middle stabilizer screws

3. Firmly tighten the three small stabilizer screws that you previously installed on the front
of each rail.

4. At the front of the rack, feed the female end of each switch power cord through the
large hole in each rail assembly, and plug the cord into a power connector on the
switch (Figure 85).

Figure 85 Plugging the switch power cords into the switch (front of the rack; callouts: To Power Zone A PDU, To Power Zone B PDU)

5. At the back of the rack, attach the required power and Ethernet cables as described in
“Replace a Switch in the DCA”.

Replace an optical SFP module
1. Unpack the replacement optical SFP module and place it on a clean, static-free surface
near the switch.
2. Identify the faulted optical SFP module in the switch.
Consult your product documentation for information on identifying a faulted SFP
module.
3. Remove the faulted optical SFP module (Figure 86):
a. If a cable is connected to the SFP module, disconnect it.
b. On the SFP module, gently pull down on the spring release latch.
c. While still holding onto the latch, gently pull out the SFP module.

Figure 86 Removing an optical SFP module from a switch port

4. Install the replacement optical SFP module (Figure 87):
a. On the replacement SFP module, push the spring release latch up.
b. Align the replacement SFP module with the switch port that contained the faulted
module.
c. Slide the SFP module into the switch port until it is securely connected.

Figure 87 Installing an optical SFP module in a switch port

APPENDIX G
Switch Configuration: Backup and Recovery
This appendix explains how to export the current configuration of every switch in the
cluster from the DCA and how to recover a switch from those exports. All switches must be
accessible and have SSH keys exchanged. If any step in this procedure fails, open a new
support ticket.

Create Two Files for Switch Recovery
First, create two files for each switch as follows.
1. Log into the master as root.
2. Run the command dca_setup.
3. Select options 2, 13, 4.
4. Enter /root as the location for the switch backups.
5. When the backup completes, the /root folder on the master node contains two files for
each switch (the running config and the startup config). Use these files to recover the
current configuration later.
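
Before relying on the backups, it can help to confirm that both files exist for every switch. The following Python sketch is one hedged way to do that on the master. The startup file suffix comes from the recovery example in this appendix (i-sw-1.startup_config); the running-config suffix, the function name missing_backups, and the host name i-sw-2 are assumptions made for illustration and should be adjusted to match the files that dca_setup actually writes.

```python
import os

# Suffix taken from the recovery example in this appendix (i-sw-1.startup_config).
STARTUP_SUFFIX = ".startup_config"
# ASSUMPTION: the running-config export follows the same naming pattern.
RUNNING_SUFFIX = ".running_config"


def missing_backups(backup_dir, switches):
    """Return the expected backup file names that are absent from backup_dir."""
    missing = []
    for switch in switches:
        for suffix in (STARTUP_SUFFIX, RUNNING_SUFFIX):
            name = switch + suffix
            # os.path.isfile returns False for missing or unreadable paths.
            if not os.path.isfile(os.path.join(backup_dir, name)):
                missing.append(name)
    return missing


if __name__ == "__main__":
    # i-sw-1 appears in this guide; i-sw-2 is a hypothetical second switch.
    for name in missing_backups("/root", ["i-sw-1", "i-sw-2"]):
        print("missing backup file:", name)
```

An empty result means every listed switch has both files in the backup folder. The check looks only at file names, not at file contents.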

Recover the Switch Configurations
Complete these steps to recover the previous configuration of a switch.
1. Log into the switch as user admin:
# ssh admin@[switch hostname]

For example, to connect to the lowest switch in the first rack:
# ssh admin@i-sw-1

2. Copy the correct switch configuration to the startup configuration with the copy
command:
[switch]#copy scp://root@172.28.4.250/root/[switch].startup_config startup-config

Note: If the switches were backed up to the standby master, use 172.27.4.251 instead of
172.28.4.250.

For example, if uploading to i-sw-1 from a file stored in /root on mdw:
i-sw-1#copy scp://root@172.28.4.250/root/i-sw-1.startup_config startup-config

3. At the prompt, type the root password. The startup-config is updated:
root@172.28.4.250's password:
i-sw-1.startup_config 100% 8302 8.1KB/s 00:00

4. Type the reload command to reload the switch.
If prompted to save changes, answer no. Saving would overwrite the recovered startup-config
with the contents of the running-config. The switch reboots and comes up with the recovered
configuration.
5. Repeat these steps for each switch to be recovered.
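
When several switches must be recovered, typing the copy command by hand for each one invites mistakes in the host name or file name. The Python sketch below only builds the command strings shown in the steps above so that they can be reviewed and pasted; it does not connect to any switch. The function names copy_command and recovery_steps are hypothetical, and the default address is the one used in this appendix's examples.

```python
# Address of the master used in the examples in this appendix.
MASTER_IP = "172.28.4.250"


def copy_command(switch, backup_host=MASTER_IP, backup_dir="/root"):
    """Build the switch copy command that loads a saved startup-config over scp."""
    source = f"scp://root@{backup_host}{backup_dir}/{switch}.startup_config"
    return f"copy {source} startup-config"


def recovery_steps(switch, backup_host=MASTER_IP):
    """List the commands for one switch, in the order given in this procedure."""
    return [
        f"ssh admin@{switch}",               # step 1: log in to the switch
        copy_command(switch, backup_host),   # step 2: restore the startup-config
        "reload",                            # step 4: answer no if asked to save
    ]


if __name__ == "__main__":
    for line in recovery_steps("i-sw-1"):
        print(line)
```

Pass the backup_host argument explicitly if the backups were written somewhere other than the master.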

APPENDIX H
DCA Part Numbers
This appendix lists the part numbers for all field replaceable units (FRU) in a DCA.
Refer to this appendix when ordering replacement parts.

Table 28 DCA Server replacement part numbers and specifications

| Module Type and EMC Internal Server Name | Official Description | Part Number (SKU) | Disks | Memory | Volume Name |
|---|---|---|---|---|---|
| DIA Module (Kylin) | PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM | 100-585-029-xx | 6x300GB | 64GB | etl1, … |
| HD-Compute Module (Kylin) | PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM | 100-585-029-xx | 6x300GB | 64GB | hdc1, … |
| Master Server (Kylin) | PV V2 SVR 1NIC C1U-6 300GB HDD 64GB MEM | 100-585-029-xx | 6x300GB | 64GB | mdw, smdw |
| *DIA 3TB Disk Module (Dragon 12) | PV V2 SERVER D2U-12 3TB HDD 64GB MEM | 100-585-030-xx | 12x3TB | 64GB | etl1, … |
| Hadoop (HD) Master Module (Dragon 12) | PV V2 SERVER D2U-12 3TB HDD 64GB MEM | 100-585-030-xx | 12x3TB | 64GB | hdm1, hdm2, hdm3, hdm4 |
| Hadoop (HD) Data Module (Dragon 12) | PV V2 SERVER D2U-12 3TB HDD 64GB MEM | 100-585-030-xx | 12x3TB | 64GB | hdw1, … |
| GPDB UAP Standard Module (Dragon 24) | PV V2 SERVER D2U-24 900GB HDD 64GB MEM | 100-585-031-xx | 24x900GB | 64GB | sdw1, … |
| GPDB UAP Compute Module (Dragon 24) | PV V2 SERVER D2U-24 300GB HDD 64GB MEM | 100-585-035-xx | 24x300GB | 64GB | sdw1, … |
| *Master Server with additional NIC (Kylin) | PV V2 SVR 2NIC C1U-6 300GB HDD 64GB MEM | 100-585-049-xx | 6x300GB | 64GB | mdw, smdw |
| *GPDB Memory Module (Dragon 24) | PV V2 SERVER D2U-24 300GB HDD 256GB MEM | 100-585-055-xx | 24x300GB | 256GB | sdw1, … |

Each part number ending in -xx covers each Rev from -01 thru -xx.
*Requires DCA software 2.0.2.0 or greater
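
As an ordering aid, the volume names in Table 28 can be turned into a small lookup. The Python sketch below is illustrative only: the dictionary is transcribed from Table 28, the function name candidate_skus is hypothetical, and a volume name alone does not identify a unique part (for example, an sdw host may be a Standard, Compute, or Memory module), so always confirm the module type on the installed hardware before ordering.

```python
import re

# Candidate server part numbers by volume-name prefix, transcribed from Table 28.
CANDIDATE_SKUS = {
    "etl": ["100-585-029-xx", "100-585-030-xx"],
    "hdc": ["100-585-029-xx"],
    "hdm": ["100-585-030-xx"],
    "hdw": ["100-585-030-xx"],
    "mdw": ["100-585-029-xx", "100-585-049-xx"],
    "smdw": ["100-585-029-xx", "100-585-049-xx"],
    "sdw": ["100-585-031-xx", "100-585-035-xx", "100-585-055-xx"],
}


def candidate_skus(volume_name):
    """Return the Table 28 part numbers that can apply to a volume name."""
    # Strip the trailing host index, for example "sdw12" becomes "sdw".
    prefix = re.sub(r"\d+$", "", volume_name.strip().lower())
    return CANDIDATE_SKUS.get(prefix, [])


if __name__ == "__main__":
    for host in ("mdw", "hdm3", "sdw12"):
        print(host, candidate_skus(host))
```

An empty list means the name does not match any volume name listed in Table 28.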

Table 29 Additional DCA FRU part numbers

| Part Number | Description | Official Description |
|---|---|---|
| 100-585-043, 100-585-062 (sub) | 10GB Ethernet Switch | ARISTA 7050S-52 10GB ETHERNET SWITCH |
| 100-585-045, 100-585-063 (sub) | 1GB Ethernet Switch | ARISTA 7048T-A 1GB ETHERNET SWITCH |
| 105-000-244 | 750W Power Supply | INTEL 750W POWER SUPPLY ROMLEY |
| 100-585-043 | Interconnect / Aggregation Switch, 52-port | ARISTA 7050S-52 10GB ETHERNET SWITCH |
| 100-585-062 | Interconnect / Aggregation Switch, 52-port | ARISTA 7050S-52 10GB ETHERNET SWITCH (CCC certified) |
| 100-585-045 | Administration Switch, 48-port | ARISTA 7048T-A 1GB ETHERNET SWITCH |
| 100-585-063 | Administration Switch, 48-port | ARISTA 7048T-A 1GB ETHERNET SWITCH (CCC certified) |
| 100-585-048 | Arista 10GBASE-SRL SFP+ OPTIC MODULE | ARISTA 10GBASE-SRL SFP+ OPTIC MODULE |
| 105-000-313 | Fan assembly, Arista switch | ARISTA FAN ASSEMBLY FOR 7048T, 7050S SWITCH |
| 105-000-314 | Power supply, Arista switch | ARISTA POWER SUPPLY, 460W AC, FOR 7048T, 7050S SWITCH |
| 105-000-222 | Disk drive assembly for Hadoop Masters & Workers | INTEL DISK ASSEMBLY/3.5” SATA/3TB/7.2K/512BPS |
| 105-000-237 | Disk drive assembly for GPDB Compute, Master servers, DIA server, Hadoop Compute server | INTEL DISK ASSEMBLY/2.5” SAS/300GB/10K/512BPS |
| 105-000-228 | Disk drive assembly for GPDB Standard server | INTEL DISK ASSEMBLY/2.5” SAS/900GB/10K/512BPS |
| 105-000-244 | Power supply for Masters, GPDB Standard & Capacity, DIA, Hadoop Masters & Workers | INTEL 750W POWER SUPPLY ROMLEY |
| 100-563-477 | Power Distribution Unit (PDU) | PDU: TITAN-D RACK:SINGLE PHASE |
| 038-004-176 | 1 Meter Interconnect Cable | ACTIVE SFP+ TO SFP+ 1M 8G/10G CABLE |
| 038-004-177 | 3 Meter Interconnect Cable | ACTIVE SFP+ TO SFP+ 3M 8G/10G CABLE |
| 038-004-186 | | 12 INCH PDU JUMPER CABLE |
| 038-003-733 | 10 Meter Optical Cable | 10M OM3 LC to LC 50μm OPTICAL CABLE |
| 038-003-347 | 30 Meter Optical Cable | 30m LC to LC OPTICAL 50 MICRON MM CABLE ASSEMBLIES |
| 038-004-224 | | PWR CORD 24A SP 15FT 56PA332 BL 4PPP |
| 038-004-293 | | CBL, 15 FT SINGLE PHASE, GRAY, N. AMERICA |
| 038-004-294 | | CBL, 15 FT SINGLE PHASE, GRAY, IEC, PIN & SLEEVE |
| 038-004-295 | | CBL, 15 FT SINGLE PHASE, GRAY, AUSTRALIA |
| 038-004-296 | | CBL, 15 FT SINGLE PHASE, GRAY, RUSSELLSTOLL 3750DP |
| 038-004-223 | | SINGLE POWER INLET CORD OPTION IEC-309-332P6 INTERNATIONAL 15' |
| 038-004-222 | | SINGLE POWER INLET CORD OPTION WITH HUBBELL L6-30P CONNECTOR, NORTH AMERICA/JAPAN 15' |
| 038-004-228 | | HUBBELL L6-30R to RUSSELLSTOLL 3750DP CABLE 15' |
| 038-003-888 | Service Cable - Administration Switch-to-Laptop | ETHERNET CABLE, 71 INCHES, RED |


