Greenplum DCA Maintenance Guide 2.0.0.0 / 2.0.1.0 Data Computing Appliance 2.0.2.0 2.0.3.0

EMC CONFIDENTIAL

EMC® Greenplum

Data Computing Appliance

Appliance Version 2.0.0.0/2.0.1.0/2.0.2.0/2.0.3.0

Maintenance Guide

REV 11

Published April, 2014

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

For the most up-to-date regulatory document for your product line, go to the technical documentation and advisories section on the EMC online support website.

EMC Greenplum DCA Maintenance Guide

CONTENTS

Chapter 1 Important Information Before You Begin .............................................6

New firmware updates in support of DCA software version 2.0.3.0 ................ 6

Identify the version of the installed DCA software.......................................... 7

Avoid electrostatic discharge damage (ESD).................................................. 7

Handling field replaceable units (FRUs) ................................................... 8

Chapter 2 Replace a Master Server ...................................................................10

Required tools ............................................................................................ 10

Task summary............................................................................................. 11

Service tag location..................................................................................... 13

Replace the Primary Master server............................................................... 14

Replace the Standby Master server ............................................................. 23

Identifying a single-NIC master versus a dual-NIC master in a DCAv2 ........... 31

Replace a Master server in a DCA without a Greenplum database ................ 31

Chapter 3 Replace a Segment, DIA, or Hadoop server .......................................39

Required tools ............................................................................................ 39

Task summary............................................................................................. 40

Service tag locations................................................................................... 41

Reseat cables before replacing a server....................................................... 42

Replace a server in an initialized GPDB module........................................... 44

Replace a DIA server or a server in an uninitialized GPDB module................ 51

Replace a server in a Greenplum Hadoop module (DCA version 2.0.0.0)...... 58

Replace a server in a Pivotal Hadoop module (version 2.0.1.0 and later) ..... 64

Remove the failed PHD server and install the replacement PHD server......... 64

Replace hdm1 (namenode, DCA version 2.0.1.0)................................... 68

Replace hdm2 (zookeeper/secondary-namenode, DCA version 2.0.1.0) 72

Replace hdm3 (resourcemanager, DCA version 2.0.1.0) ........................ 76

Replace hdm4 (zookeeper/hive/hive-metastore, DCA version 2.0.1.0) .. 79

Replace hdw# (datanode, nodemanager, DCA version 2.0.1.0).............. 82

Chapter 4 Replace a Disk Drive .........................................................................86

Hot spare drives and the Copyback operation.............................................. 86

Replace a disk drive in a Master, DIA, or Hadoop Compute server ................ 87

Replace a drive in a Segment Server............................................................ 91

Replace a drive in an Hadoop server............................................................ 96

Replace a drive in a Hadoop Master server ............................................ 97

Replace a drive in a Hadoop Worker server.......................................... 101

Chapter 5 Replace a Power Supply in a Server ................................................110

Power supply LEDs.................................................................................... 110

Replace a power supply in a server............................................................ 110

Chapter 6 Replace a Fan Assembly or Power Supply in an Arista Switch ..........113

Replace a Fan Assembly in an Arista Switch............................................... 113

Fan Assembly Replacement Order Information .................................... 113

Tools................................................................................................... 113

Identify the Failed Fan Assembly ......................................................... 114

Remove the Failed Fan Assembly and Install the Replacement Part...... 115

Parts Return ........................................................................................ 115

Replace a Power Supply in an Arista Switch ............................................... 116

Power Supply Assembly Replacement Order Information ..................... 116

Tools................................................................................................... 116

Identify the Failed Power Supply.......................................................... 117

Remove the Failed Power Supply and Install the Replacement Part...... 118

Parts Return ........................................................................................ 118

Chapter 7 Replace a Switch in the DCA ...........................................................119

Requirements ........................................................................................... 120

Switch hostnames and IP addresses ......................................................... 120

Replace an Arista 7050S Interconnect or Aggregation Switch..................... 122

Replace an Arista 7048T Administration Switch......................................... 127

Chapter 8 Replace an Interconnect Switch Cable ............................................134

Appendix A System Information and Configuration............................................137

New firmware updates for DCA software version 2.0.3.0...................... 137

Identify the version of the installed DCA software................................ 138

DCA configuration rules....................................................................... 139

Racking order...................................................................................... 139

Racking guidelines.............................................................................. 140

Mixed System rack components ......................................................... 141

Hadoop-only System Rack components (minimum config.).................. 142

HD-Compute System Rack components (minimum config.).................. 143

Aggregation rack components............................................................. 144

Expansion rack components ............................................................... 145

Power supply reference ............................................................................. 146

BMC Controller interface functionality ....................................................... 151

BMC Controller LED indicators and meanings ............................................ 151

Network and cabling configurations .......................................................... 152

Interconnect cabling reference ............................................................ 152

Administration switch reference .......................................................... 159

Aggregation switch reference .............................................................. 163

Network hostname and IP configuration ................................................... 170

Multiple-rack cabling reference ................................................................. 173

Configuration files..................................................................................... 174

Location of old core files ........................................................................... 174

Default passwords .................................................................................... 175

Appendix B Connect a workstation to the DCA ...................................................176

Laptop prerequisites................................................................................. 176

Configure your laptop to connect to the DCA.............................................. 176

Configure a Windows 7 laptop............................................................. 176

Configure a Windows XP laptop........................................................... 178

Connect to the Master Server using an SSH client...................................... 178

Copy a file to the Master Server using an SCP client................................... 179

Connect to an Interconnect or Administration switch using PuTTY .............. 181

Appendix C Power Off the DCA...........................................................................183

Appendix D Linux and vi Command Reference....................................................189

Common Linux command reference........................................................... 189

vi Quick Reference .................................................................................... 191

Appendix E Replace a Server in the Greenplum DCA Rack ..................................192

Mounting kit parts..................................................................................... 192

Appendix F Install a Switch in a Rack.................................................................200

Switch mounting kit parts ......................................................................... 201

Replace the switch in the rack ................................................................... 201

Replace an optical SFP module.................................................................. 208

Appendix G Switch Configuration: Backup and Recovery....................................210

Create Two Files for Switch Recovery.......................................................... 210

Recover the Switch Configurations ............................................................ 210

Appendix H DCA Part Numbers ..........................................................................212

CHAPTER 1

Important Information Before You Begin

For detailed descriptions of DCA components and configurations, see Appendix A,

“System Information and Configuration,” on page 137.

This chapter includes the following major sections:

New firmware updates in support of DCA software version 2.0.3.0 ............................ 6

Identify the version of the installed DCA software...................................................... 7

Avoid electrostatic discharge damage (ESD) .............................................................. 7

New firmware updates in support of DCA software version 2.0.3.0

Customers can apply optional firmware updates prior to upgrading to DCA software

version 2.0.3.0 as follows:

Arista 7050S-52 and Arista 7048T switches

• New firmware version EOS-4.9.8.swi

• Field personnel can access the EOS-4.9.8.swi.zip firmware upgrade

package from:

ftp://ftp.aristanetworks.com/emc/certifiedeos/EOS-4.9.8.swi

Field personnel can obtain the following document available on

http://support.emc.com for step-by-step instructions:

EMC Greenplum DCA Firmware Upgrade Instructions for the Interconnect Switch

(Arista 7050S-52) and Administration Switch (Arista 7048T)

Intel Servers (Kylin with eight drives, Dragon 12 with twelve drives, and Dragon 24

with 24 drives)

• New BIOS upgrade revision level SE5C600.86B.02.01.0002

• Field personnel can access both the BIOS upgrade package and the EMC Greenplum DCA Intel BIOS Upgrade Instructions for Intel Servers from http://support.emc.com.

Identify the version of the installed DCA software

The replacement procedures in this guide pertain only to DCA clusters running software

version 2.0.x.x. DCA documentation is tied to a specific version of the DCA software.

Before beginning any replacement procedure on a DCA, make sure that the version of the

software running on the clusters matches the version of the guide that you are using.

1. Log in to the Primary Master server as the user root (see “Connect a workstation to

the DCA” on page 176).

2. View the contents of the /opt/dca/etc/dca-build-info.txt file.

Verify that the ISO_VERSION value is 2.0.x.x

# cat /opt/dca/etc/dca-build-info.txt

ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"

ISO_VERSION="2.0.2.0"

ISO_BUILD_VERSION="4"

ISO_INSTALL_TYPE="iso"

If the version of the software is not 2.0.x.x, go to the EMC Online Support site

http://support.emc.com. From the Support by Product pages, search for Greenplum

Data Computing Appliance and obtain the documentation that matches the software

version running on the Primary Master server.
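The version check above can be sketched as a small script. This is only an illustration, assuming the file keeps the KEY="value" layout shown in the sample output; the function names parse_build_info and guide_matches are hypothetical and are not part of the DCA software.

```python
def parse_build_info(text):
    """Parse KEY="value" lines (the layout shown for dca-build-info.txt) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or unrecognized lines
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"')
    return info


def guide_matches(info, prefix="2.0."):
    """Return True when ISO_VERSION belongs to the 2.0.x.x family covered by this guide."""
    return info.get("ISO_VERSION", "").startswith(prefix)


# Sample content copied from the procedure above.
sample_build_info = '''ISO_BUILD_DATE="Wed Oct 15 21:59:56 PST 2013"
ISO_VERSION="2.0.2.0"
ISO_BUILD_VERSION="4"
ISO_INSTALL_TYPE="iso"'''

build_info = parse_build_info(sample_build_info)
print(build_info["ISO_VERSION"], guide_matches(build_info))
```

If guide_matches returns False, obtain the documentation that matches the installed version before you begin any replacement procedure.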

Avoid electrostatic discharge damage (ESD)

When you replace and install field replaceable units (FRUs), you can inadvertently damage

sensitive electronic circuits in the equipment simply by touching them. Electrostatic

charge that has accumulated on your body can discharge through the circuits. If the air in

the work area is dry, running a humidifier in the work area can help decrease the risk of

electrostatic discharge (ESD) damage.

Read and understand the following guidelines:

Provide enough room to work on the equipment. Clear the work site of any

unnecessary materials, especially materials that naturally build up electrostatic

charge such as foam packaging, foam cups, cellophane wrappers, and similar items.

Do not remove replacement or upgrade FRUs from their antistatic packaging until you

are ready to install them.

Set up your EMC-issued ESD kit and all other materials that you need before servicing

a Greenplum system. Once you begin service, avoid moving away from the work site;

otherwise, your body can build up an electrostatic charge.

Use the ESD kit when handling system components.

Wear an ESD wristband. Attach the clip of the ESD wristband to any bare (unpainted)

metal in the bay, and then place the wristband around your wrist with the metal

button against your skin.

Handling field replaceable units (FRUs)

This section describes the precautions that you must take and the general procedures

that you must follow when removing and storing any field replaceable unit (FRU). The only

FRUs in the server are the disk drive assemblies and power supply modules. Depending

on the product in which the server is used, the disk drive assemblies may be

hot-swappable; that is, you can replace a disk drive assembly while the server is running.

To determine if disk drive assemblies are hot-swappable, refer to your product

documentation. Regardless of the product in which the server is used, the power supply

modules are hot-swappable; that is, you can replace a power supply module while the

server is running.

You should not remove a faulty FRU until you have a replacement available.

When you replace FRUs, you can inadvertently damage the sensitive electronic circuits in

the equipment by simply touching them. Electrostatic charge that has accumulated

on your body discharges through the circuits. If the air in the work area is very dry, running

a humidifier in the work area will help decrease the risk of ESD damage. Follow the

procedures below to prevent damage to the equipment.

Provide enough room to work on the equipment. Clear the work site of any

unnecessary materials or materials that naturally build up electrostatic charge, such

as foam packaging, foam cups, cellophane wrappers, and similar items.

Do not remove replacement or upgrade FRUs from their antistatic packaging until you

are ready to install them.

Before you service a server, gather together the ESD kit and all other materials you will

need. Once servicing begins, avoid moving away from the work site; otherwise, you

may build up an electrostatic charge.

Use the ESD kit when handling any FRU. Use an ESD wristband. To use the ESD

wristband (strap), attach the clip of the wristband to any bare (unpainted) metal on

the server; then put the wristband around your wrist with the metal button against

your skin. If an emergency arises and the ESD kit is not available, follow the

procedures in “Emergency procedures (without an ESD kit)” on page 8.

Emergency procedures (without an ESD kit)

In an emergency when an ESD kit is not available, use the procedures below to reduce the

possibility of an electrostatic discharge by ensuring that your body and the subassembly

are at the same electrostatic potential. These procedures are not a substitute for the use

of an ESD kit. Follow them only in the event of an emergency.

Before touching any FRU, touch a bare (unpainted) metal surface of the cabinet or

server.

Before removing any FRU from its antistatic bag, place one hand firmly on a bare metal

surface of the server, and at the same time, pick up the FRU while it is still sealed in

the antistatic bag. Once you have done this, do not move around the room or touch

other furnishings, personnel, or surfaces until you have installed the FRU.

When you remove a FRU from the antistatic bag, avoid touching any electronic

components and circuits on it.

If you must move around the room or touch other surfaces before installing a FRU, first

place the FRU back in the antistatic bag. When you are ready again to install the FRU,

repeat these procedures.

CHAPTER 2

Replace a Master Server

This chapter describes how to replace a Primary or Standby Master server in

a GPDB-only DCA, a mixed DCA, or a Hadoop-only DCA.



(Applies only to version 2.0.1.0 and later)

Additional steps are required if you are servicing a Hadoop-only DCA. Look for the

following notice in the left margin:

**If you are servicing a Hadoop-only DCA**

Topics include:

Required tools ........................................................................................................ 10

Task summary......................................................................................................... 11

Service tag location................................................................................................. 13

Replace the Primary Master server........................................................................... 14

Replace the Standby Master server ......................................................................... 23

Replace a Master server in a DCA without a Greenplum database ............................ 31

Required tools

You need the following tools to remove and replace a server:

#2 Phillips screwdriver

Wrist grounding strap

Task summary

Table 1 Summary of Master server replacement tasks

Tasks Primary Master Standby Master Primary or Standby Master in a DCA with no initialized GP database

Check BIOS version when replacing a Master server in the cluster. When installing a replacement server, identify the BIOS version on the new server (as well as the versions already running in the DCA). Then upgrade so that all servers reflect the same firmware levels. Go to http://support.emc.com to obtain the pertinent BIOS upgrade instructions. The upgrade instructions provide information on how to access and install the upgrade package. xx x

Check with the customer if any custom configurations have been applied. xx x

Disable health monitoring. xx x

Check Master server sync state. xx

Check Greenplum database for errors. xx

If necessary, initiate an orchestrated failover from the Primary server to the Standby server. **If you are replacing a Primary Master in a Hadoop-only DCA**: include the argument --deletevip

Verify the success of the failover. x

Power off the failed server. xx x

Label then remove cables from the failed server. xx x

Install the replacement server. xx x

Transfer drives from the failed server to the replacement server. xx x

Connect cables to the replacement server (but do not power it on yet). xx x

Configure the BMC IP address. xx x

Power on the replacement server. xx x

Import foreign disk configurations. xx x

Check the health of the replacement server. xx x

Exchange SSH keys. xx

Initialize the replacement server as the acting Standby Master server (temporarily).

Initiate a failover from the replacement server (the acting Standby Master server). **If you are replacing a Primary Master in a Hadoop-only DCA**: include the argument --deletevip

Revert the smdw server to its former standby role. x

**If you are replacing a Primary or Standby Master in a Hadoop-only DCA**

Recover and rebalance the GPDB segment instances on the Primary Master server.

Synchronize the system clock. xx

Ask the customer if there are any custom configurations that must be reapplied to the DCA (for example, NFS mounts or gateways).

Re-enable health monitoring. xx x

Service tag location

When replacing any hardware component, it is important that you properly debrief the

part. The serial number of a Master server is located on the blue service label affixed to

the front left corner of the server.

Figure 1 Service tag location on the Master server (Dragon24 shown)

Replace the Primary Master server

Perform this procedure if the Primary Master server has failed or is failing.

IMPORTANT

This procedure directs you to transfer drives from the failed server to the replacement

server. Take great care when transferring drives. Transfer only one drive at a time. Insert

drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 11 for an overview of the Master

server replacement procedures.

2. If it is not already connected, connect your service laptop to the red service cable

located on the laptop tray. The red service cable is connected to port 48 on the

Administration switch in the SYSRACK. For instructions on how to configure the IP

address on your laptop, see “Connect a workstation to the DCA” on page 176.

3. If the Primary Master server is still accessible through SSH, perform step a through

step e below. If the Primary Master server is not accessible through SSH, skip to

step 4.

a. Log in to the Primary Master server as the user root (see “Connect a workstation to the DCA” on page 176).

b. Activate the server identification LED.

# dca_blinker -h mdw -a ON

c. Switch to the user gpadmin and determine whether the Primary and Standby

Master servers are synced:

# su - gpadmin

$ gpstate -f

If the output returns the status synchronized, the master servers are in sync. If

synchronized is not returned in the output or the database is not running, do

not replace the Primary Master server. Contact EMC Support.

d. Switch to the user root and make note of any custom NFS mounts the customer

may have created:

$ su -

# cat /etc/fstab

e. Make note of any custom network gateways the customer may have created:

# cat /etc/sysconfig/network

4. Before you replace the failed Primary Master server, perform the sub-steps below to

determine whether an automatic failover occurred when the Primary Master failed. If

an automatic failover did not occur, you must initiate an orchestrated (manual)

failover before you replace the failed server (see step 5 below). To determine whether

an automatic failover occurred, do the following:

a. Start the DCA Setup utility as the user root:

# dca_setup

b. Select option 2 to Modify DCA Settings.

c. Note the text of option 19:

– If the text is Disable Master Server Auto Failover (currently enabled), an automatic failover occurred when the Primary Master failed. Skip to step 6 to determine if the failover was successful.

– If the text is Enable Master Server Auto Failover (currently disabled), you must initiate an orchestrated (manual) failover as described in step 5.

d. Enter X to exit the DCA Setup utility.

5. If an automatic failover did occur, proceed to step 6. If an automatic failover did not occur, initiate an orchestrated (manual) failover as follows:

a. From the Standby Master server, issue the dca_failover command:

# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0

**If you are servicing a Hadoop-only DCA**

In a Hadoop-only DCA, make sure to include the option --deletevip:

# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip

b. Replace the example values shown above (10.10.10.10, 10.10.10.1, and 255.255.255.0) with the IP address, gateway, and netmask of the virtual IP address provided by the customer. If the customer has not specified a virtual IP address, do not include the --vip, --gateway, and --netmask parameters. Wait for the prompt to appear indicating that the failover has completed before you continue.

c. When the failover has completed, proceed to step 6 to determine if the failover

operation was successful.
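The rules in step 5 for assembling the dca_failover command line can be summarized in a short sketch. It uses only the options named in this procedure. The helper build_failover_command is hypothetical, and the check that the three virtual IP values are supplied together is an assumption drawn from the instruction to omit all three when no virtual IP address is specified. The sketch only builds the argument list; it does not run the command.

```python
def build_failover_command(vip=None, gateway=None, netmask=None, hadoop_only=False):
    """Assemble the dca_failover arguments described in step 5 (does not execute anything)."""
    cmd = ["dca_failover", "--stopmasterdb", "--noninteractive"]
    vip_parts = (vip, gateway, netmask)
    if any(vip_parts):
        # Assumption: the three virtual IP values are only meaningful as a set.
        if not all(vip_parts):
            raise ValueError("vip, gateway, and netmask must be given together")
        cmd += ["--vip", vip, "--gateway", gateway, "--netmask", netmask]
    if hadoop_only:
        # Hadoop-only DCA: the procedure calls for the extra --deletevip option.
        cmd.append("--deletevip")
    return cmd


# Example values from the procedure; replace them with the customer's values.
print(" ".join(build_failover_command("10.10.10.10", "10.10.10.1", "255.255.255.0")))
print(" ".join(build_failover_command(hadoop_only=True)))
```

Printing the assembled command before typing it on the Standby Master server is one way to confirm that the virtual IP values and the Hadoop-only option are all accounted for.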

6. To determine whether the Master server failover operation was successful, switch to the user gpadmin and issue the following command. Verify that the line Standby master instance not configured is returned in the output:

# su - gpadmin

$ gpstate -f

[INFO]:-Starting gpstate with args: -f

[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.1.1.3 build 4'

[INFO]:-Obtaining Segment details from master...

[INFO]:-Standby master instance not configured

7. Check the Greenplum Database for errors. If any errors are returned in the output, you

must resolve them before you continue with this procedure:

$ gpstate -e

[INFO]:-----------------------------------------------------

[INFO]:-Segment Mirroring Status Report

[INFO]:-----------------------------------------------------

[INFO]:-All segments are running normally
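As a rough illustration of the check in step 7, the following sketch scans saved gpstate -e output for the healthy-state line shown in the sample above. The function name segments_running_normally is hypothetical. The sketch recognizes only that one line, so treat any other result as a reason to read the full report rather than as proof of a fault.

```python
HEALTHY_LINE = "All segments are running normally"


def segments_running_normally(gpstate_output):
    """Return True if the healthy-state line from the sample report is present."""
    return any(
        line.strip().endswith(HEALTHY_LINE)
        for line in gpstate_output.splitlines()
    )


# Sample report copied from step 7.
sample_report = """[INFO]:-----------------------------------------------------
[INFO]:-Segment Mirroring Status Report
[INFO]:-----------------------------------------------------
[INFO]:-All segments are running normally"""

print(segments_running_normally(sample_report))
```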

8. To prevent false dial home messages from being sent to EMC Support during service, log in to the Standby Master server as the user root and stop the healthmon daemon to disable health monitoring:

$ su -

# dca_healthmon_ctl -d

9. Shut down the Primary Master server:

• If the failed Primary Master server is accessible through SSH, log in to it as the

user root and issue the shutdown command.

IMPORTANT

Check the prompt to make sure that you are on the Primary Master (mdw) before

you issue the shutdown command!

$ ssh root@mdw

# shutdown -h now

• If the failed server is not accessible through SSH, power it off by pressing the

power button on the front of the server for 5 seconds (see Figure 2 below).

Figure 2 Power button location on Master server

10. Label all the cables connected to the failed server so that you’ll know where to

connect them on the replacement server.

11. Remove all power, Ethernet, and twin-axial cables from the back of the server.

Note: If the system has Dual NICs installed, note the connections for customer and

interconnect networks prior to disconnecting.

12. Remove the failed server and install the replacement server (see Appendix E, “Replace

a Server in the Greenplum DCA Rack,” on page 192).

13. Transfer disk drives one at a time from the failed server to the replacement server.

IMPORTANT

Use caution when transferring drives. Transfer only one drive at a time. Insert the

drives in the same slots that they occupied in the failed server.

14. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on

the cables for proper connectivity.

IMPORTANT

Do not connect power to the replacement server yet.

15. From the Standby Master server, start the dhcpd service as the user root:

# service dhcpd start

16. Connect the power cables to the replacement server.

17. Next, use these steps to identify the IP address assigned to the server.

a. Issue the following command to obtain the lease information provided in the

dhcpd.leases file:

# tail /var/lib/dhcpd/dhcpd.leases

b. The end of the dhcpd.leases file is displayed, similar to the following:

Example

lease 172.28.6.170 {

starts 4 2012/10/18 20:09:08;

ends 5 2013/10/18 20:09:08;

cltt 4 2012/10/18 20:09:08;

binding state active;

next binding state free;

hardware ethernet 00:00:00:00:00:04;

uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example

dhcpd.leases file above:

00:00:00:00:00:04

d. Locate the MAC address on the replacement server’s service tag (highlighted in the

photograph below):

MAC1 00:00:00:00:00:00

Figure 3 Locating the MAC address on the service tag (Primary Master server shown, Dragon24)

e. Compare the last two digits in the MAC addresses referenced in step c and step d (for example, 00:00:00:00:00:04 and 00:00:00:00:00:00). Verify that the last two digits of the MAC address in the dhcpd.leases file are four greater than the last two digits of the MAC address on the replacement server’s service tag.

If this is the case, it is certain that the IP address in the dhcpd.leases file is the

correct one to associate with the server. For example, the scenario described

above verifies that 172.28.6.170 is correct in this specific instance.

f. Issue the following ipmiutil command. Insert the IP address (172.28.6.170)

identified in the previous steps using the example above as a guide:

Note: Disregard the long, detailed output after this command is executed.

# ipmiutil lan -e -l -I 172.28.0.250 -S 255.255.248.0 -N 172.28.6.170 -U root -P sephiroth

g. Ping the new address to verify that the change was applied:

# ping 172.28.0.#

Where # is the number of the server you are replacing.
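The lookup and comparison in sub-steps a through e can also be scripted. The following is a minimal sketch, assuming a POSIX shell with awk available and lease text in the format shown in the example above. The function names last_lease, mac_last_octet, and lease_matches_service_tag are hypothetical helpers and are not part of the DCA software. Only the last two hexadecimal digits are compared, as the procedure describes.

```shell
# Hypothetical helpers (not part of the DCA tooling); a sketch only.

# Read dhcpd.leases text on stdin and print "IP MAC" for the last lease seen.
last_lease() {
    awk '
        $1 == "lease" { ip = $2 }
        $1 == "hardware" && $2 == "ethernet" { mac = $3; sub(/;$/, "", mac) }
        END { if (ip != "" && mac != "") print ip, mac }
    '
}

# Print the last two hexadecimal digits of a MAC address as a decimal number.
mac_last_octet() {
    printf '%d\n' "0x${1##*:}"
}

# Succeed when the lease MAC ($1) ends exactly four higher than the
# service-tag MAC ($2). Assumption: wrap-around past FF is not handled.
lease_matches_service_tag() {
    [ $(( $(mac_last_octet "$1") - $(mac_last_octet "$2") )) -eq 4 ]
}
```

For example, piping the output of tail /var/lib/dhcpd/dhcpd.leases into last_lease prints the address pair, and lease_matches_service_tag 00:00:00:00:00:04 00:00:00:00:00:00 succeeds for the values used in the example. A visual check against the service tag is still advisable.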

18. Turn off the dhcpd service:

# service dhcpd stop

19. From the Standby Master server (smdw) as the user root, issue the following command to open a BMC console session on the replacement Master server (mdw):

# ipmiutil sol -a -e -N mdw-sp -U root -P sephiroth



You will need to press the F key within 15 seconds after seeing this WARNING message:

Foreign configuration(s) found on adapter

Press any key to continue or ‘C’ load the configuration

utility, or ‘F’ to import foreign configuration(s) and

continue.

20. Power on the replacement server by pressing the power button on the front panel, and

press the F key when prompted.

21. When the following message displays, disregard and press the space bar:

All of the disks from your previous configuration are gone. If this

is an unexpected message, then please power off your system and

check your cables to ensure all disks are present. Press any key to

continue, or 'C' to load the configuration utility.

22. If the message below appears, power off the server, verify that all LEDs are off, and repeat steps 20 through 21.

CLIENT MAC ADDR: 00 1E 67 4D C5 1D GUID: 2A9B43A4 A50A 11E1 AAA0 001E674DC51D

DHCP....\

23. Monitor the boot process onscreen and verify that the system boots from hard disk.

If the system does boot from hard disk, proceed to step 24.

If the system does not boot from hard disk, perform the following sub-steps to force it

to boot from hard disk:

a. Exit the BMC console utility by pressing the tilde key (~), and then the period key (.).

Note: When you exit the BMC console you are returned to your connection on the

smdw as the user root.

b. Issue the following command from smdw to force the appliance to boot from hard

drive:

# ipmiutil reset -h -N mdw-sp -U root -P sephiroth

c. Once the operating system is loaded, issue the following command to change the

boot order on mdw:

# ssh mdw

# syscfg /bbo "emcbios" HDD NW

d. Reboot mdw:

# reboot

e. Following the reboot, issue the following commands to connect to mdw and verify

the boot order:

# ssh mdw

# syscfg /bbosys

f. Exit mdw:

# exit

g. Proceed to step 24. You do not need to exit the BMC console again, because you already exited it in sub-step a above.

24. From the Standby Master server (smdw) check the health of the replacement server:

# dcacheck -h mdw

Verify that no errors display.

25. Exchange SSH keys on the replacement server using the DCA Setup utility:

a. Start the DCA Setup utility as the user root:

# dca_setup

b. Select option 2 to Modify DCA Settings.

c. Select option 6 to Generate SSH Keys.

d. Enter X to exit the DCA Setup utility.

e. Check the firmware level of the RAID controllers with the CmdTool2 utility:

# ssh mdw /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

If the above command returns either FW Package Build: 23.7.0-0033 or FW Package Build: 23.9.0-0026, your firmware needs updating. Follow sub-steps f through n below to update the firmware.

f. Download the ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip file from Support Zone to your laptop.

https://support.emc.com/downloads/9507_Greenplum-Data-Computing-Applian

g. Extract the files to your laptop using unzip or a similar unpacking tool. For example:

unzip ir3_2208SASHWR_FWPKG-v23.12.0-0013.zip

h. As the user root, copy the MR56p.rom file to the Master server (mdw) and place the file in /root. You can use WinSCP or a similar utility.

Note: You may be required to provide a login to the destination server.

i. For each server in need of an update, log into the server as root.

j. SCP the MR56p.rom file from the master to the server you are updating.

k. Install the new firmware using the following command:

Note: This will take longer on 24-disk servers.

# /opt/MegaRAID/CmdTool2/CmdTool2 -adpfwflash -f /root/MR56p.rom -aall

l. Reboot the server.

# reboot

m. When the server reboots, check the new firmware version:

# /opt/MegaRAID/CmdTool2/CmdTool2 -adpallinfo -aall | grep "FW Package Build"

The following should be returned, indicating your firmware has successfully been

updated on this server:

FW Package Build: 23.12.0-0013

n. Repeat the firmware check and update sub-steps above for the remaining servers in the

cluster.
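The version check in sub-step e lends itself to a small script when several servers must be checked. A minimal sketch follows. The name fw_needs_update is a hypothetical helper, and the sketch assumes each CmdTool2 output line has the form shown above (FW Package Build: followed by the version, with no trailing characters). The two builds it flags are the ones this procedure lists as requiring an update.

```shell
# Hypothetical helper (not part of the DCA tooling); a sketch only.
# $1 is one output line, for example "FW Package Build: 23.7.0-0033".
# Succeeds (exit status 0) when that build is one the procedure says to update.
fw_needs_update() {
    case "${1##*: }" in
        23.7.0-0033|23.9.0-0026) return 0 ;;
        *) return 1 ;;
    esac
}
```

The helper expects a single line at a time, so output from a server that reports more than one controller would need to be passed line by line.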

26. Switch to the user gpadmin and issue the following commands to initialize the replacement server as the acting Standby Master server:

# su - gpadmin

$ ssh mdw rm -r /data/master/*

$ gpinitstandby -s mdw

27. At the message Do you want to continue with standby master initialization? enter Y to continue.

Wait for the message Successfully created standby master on mdw.

28. Log in to the replacement server (now the new acting Standby Master server) as the user root:

$ ssh root@mdw

29. Issue the following command to initiate the orchestrated (manual) failover:

# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0

**If you are servicing a Hadoop-only DCA**

In a Hadoop-only DCA, make sure to include the --deletevip option:

# dca_failover --stopmasterdb --noninteractive --vip 10.10.10.10 --gateway 10.10.10.1 --netmask 255.255.255.0 --deletevip

30. At the message Do you want to continue? enter y.

IMPORTANT

Initiating a failover stops the Greenplum Database and renders it temporarily

unavailable.

When the failover operation finishes you are returned to the prompt [root@mdw]# .

31. Switch to the user gpadmin:

# su - gpadmin

32. Connect to the Standby Master server and empty the /data/master directory:

$ ssh smdw rm -r /data/master/*

33. Issue the following command to revert the smdw server to its standby role:

$ gpinitstandby -s smdw

34. At the message Do you want to continue with standby master initialization? enter Y.

**If you are servicing a Hadoop-only DCA**

Perform the next two steps only if you are replacing a Master server in a Hadoop-only DCA.

35. Issue the following command to recover the GPDB segment instances running on the

Primary and Standby Master servers:

$ gprecoverseg

Enter Y when prompted. For example:

Continue with segment recovery procedure Yy|Nn (default=N):

> Y

36. Issue the following command to rebalance the GPDB segment instances running on

the Primary and Standby Master servers:

$ gprecoverseg -r

Enter Y when prompted. For example:

Continue with segment rebalance procedure Yy|Nn (default=N):

> Y

The procedure continues here for all DCA types:

37. Exit the user gpadmin:

$ exit

38. Start the DCA Setup utility:

# dca_setup

39. Synchronize the system clock:

a. Select option 2 for Modify DCA Settings.

b. Select option 5 for Modify NTP/Clock Configuration Options.

c. Select option 3 for Synchronize clocks across the cluster to the

NTP server.

Enter X to exit the DCA Setup utility.

IMPORTANT

The same DCA system Serial Number (located on a label affixed to the top rear of the rack) must be included in the following files for Dial Home to work after replacing a Master application server (mdw and smdw in the case of GPDB; hdm and standby hdm in the case of Hadoop):

• /opt/connectemc/ConnectEMC.ini

• /opt/greenplum/serialnumber

40. Check the DCA system Serial Number in the connectemc initialization file, /opt/connectemc/ConnectEMC.ini, as follows:

a. Open the connectemc initialization file:

/opt/connectemc/ConnectEMC.ini

b. Locate the DCA system Serial Number per the following keyword in the file:

SERIAL_NUMBER=

c. Check that this matches the DCA system Serial Number on the label affixed to the top rear of the rack. If the Serial Number is missing from the file, go to step d.

d. If the Serial Number is missing, enter it in the /opt/connectemc/ConnectEMC.ini file, for example:

SERIAL_NUMBER=APMXXXXXXXXX

41. Next, check that the DCA system Serial Number in the /opt/greenplum/serialnumber file matches the DCA system Serial Number in the /opt/connectemc/ConnectEMC.ini file, per step 40 above.

For example:

SERIAL_NUMBER=APM00140732731

Note: After verifying that the DCA system Serial Numbers are identical, remember to save

the /opt/greenplum/serialnumber file if you made any changes.
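The comparison in steps 40 and 41 can be sketched as a script. This assumes that both files hold the value on a line beginning with SERIAL_NUMBER=, as the examples above suggest. The names serial_from_file and serials_match are hypothetical helpers, not DCA utilities.

```shell
# Hypothetical helpers (not part of the DCA tooling); a sketch only.

# Print the value of the first SERIAL_NUMBER= line in the file named by $1.
serial_from_file() {
    sed -n 's/^SERIAL_NUMBER=//p' "$1" | head -n 1
}

# Succeed only when both files contain the same non-empty Serial Number.
serials_match() {
    [ -n "$(serial_from_file "$1")" ] &&
        [ "$(serial_from_file "$1")" = "$(serial_from_file "$2")" ]
}
```

Running serials_match /opt/connectemc/ConnectEMC.ini /opt/greenplum/serialnumber would then report, through its exit status, whether the two files agree. The value must still be checked by eye against the label on the rack.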

42. Re-enable health monitoring:

# dca_healthmon_ctl -e

43. You must stop and start the connectemc service (also referred to as Dial Home) to

complete restarting the healthmon daemon.

Enter the command:

service connectemc stop

You will see the message:

Shutting down ConnectEMC

44. When you see the # prompt again, enter:

service connectemc start

You will see the message:

Starting ConnectEMC

The # prompt returns, indicating that you have re-enabled health monitoring.

Replace the Standby Master server

Perform this procedure if the Standby Master server has failed or is failing and the Primary

Master server is in good health.

IMPORTANT

This procedure directs you to transfer drives from the failed server to the replacement

server. Take great care when transferring drives. Transfer only one drive at a time. Insert

drives in the same slots that they occupied in the failed server.

1. You may want to consult “Task summary” on page 11 for an overview of the Master server replacement procedures.

2. If it is not already connected, connect your service laptop to the red service cable

located on the laptop tray. For details on how to configure the IP address of your

laptop, see “Connect a workstation to the DCA” on page 176.

3. To prevent false dial home messages from being sent to EMC Support during service,

stop the healthmon daemon to disable health monitoring:

# dca_healthmon_ctl -d

4. If the Standby Master server is still accessible through SSH, perform step a through

step e below. If the failed Standby Master server is not accessible through SSH, skip to

step 5.

a. Log in to the Primary Master server as the user root (see “Connect a workstation to the DCA” on page 176).

b. Activate the server identification LED.

# dca_blinker -h smdw -a ON

c. Switch to the user gpadmin and determine whether the Primary and Standby

Master servers are synchronized:

# su - gpadmin

$ gpstate -f

If the output returns the status synchronized, the Master servers are in sync. If

synchronized is not returned in the output, do not replace the Standby Master

server. Contact EMC Support.

d. Switch to the user root and make note of any custom NFS mounts the customer

may have created:

$ su -

# cat /etc/fstab

e. Make note of any custom network gateways the customer may have created:

# cat /etc/sysconfig/network
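The synchronization check in sub-step c can be sketched as a filter over the gpstate -f output. The exact wording of that output is not reproduced in this guide, so the sketch only looks for the word synchronized and treats the phrase not synchronized as a failure. The name masters_in_sync is a hypothetical helper, not a Greenplum or DCA utility.

```shell
# Hypothetical helper (not part of the DCA tooling); a sketch only.
# Reads gpstate -f output on stdin. Succeeds when the text contains
# "synchronized" (any case) and does not contain "not synchronized".
masters_in_sync() {
    case "$(tr '[:upper:]' '[:lower:]')" in
        *"not synchronized"*) return 1 ;;
        *synchronized*) return 0 ;;
        *) return 1 ;;
    esac
}
```

As the user gpadmin, the output of gpstate -f could be piped into masters_in_sync and the exit status tested before continuing. If the helper fails, read the full gpstate output rather than relying on the script, and contact EMC Support as sub-step c directs.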

5. From the Primary Master server, switch to the user gpadmin and remove the Standby Master server from the configuration:

# su - gpadmin

$ gpinitstandby -r

6. When prompted, enter Y to continue.

7. Shut down the Standby Master server:

• If the failed Standby Master server is accessible through SSH, log in to it as the user root and issue the shutdown command.

IMPORTANT

Check the prompt to make sure that you are on the Standby Master (smdw) before

you issue the shutdown command!

$ ssh root@smdw

# shutdown -h now

• If the failed server is not accessible through SSH, power it off by pressing the

power button on the front of the server.

8. Label all the cables connected to the failed server so that you’ll know where to

connect them on the replacement server.

9. Remove all power, Ethernet, and twin-axial cables from the back of the server.

Note: If the system has Dual NICs installed, note the connections for customer and

interconnect networks prior to disconnecting.

10. Remove the failed server and install the replacement server (see Appendix E, “Replace

a Server in the Greenplum DCA Rack,” on page 192).

11. Transfer disk drives one at a time from the failed server to the replacement server.

IMPORTANT

Use caution when transferring drives. Transfer only one drive at a time. Insert the

drives in the same slots that they occupied in the failed server.

12. Connect Ethernet and twin-ax cables to the replacement server. Refer to the labels on

the cables for proper connectivity.

IMPORTANT

Do not connect power to the replacement server yet.

13. From the Primary Master server start the dhcpd service as the user root:

# service dhcpd start

14. Connect the power cables to the replacement server.

15. Next, use these steps to identify the IP address assigned to the server.

a. Issue the following command to obtain the lease information provided in the

dhcpd.leases file:

# tail /var/lib/dhcpd/dhcpd.leases

b. The contents of the dhcpd.leases file display, similar to the following:

Example

lease 172.28.6.170 {

starts 4 2012/10/18 20:09:08;

ends 5 2013/10/18 20:09:08;

cltt 4 2012/10/18 20:09:08;

binding state active;

next binding state free;

hardware ethernet 00:00:00:00:00:04;

uid "\001\000\036g,\242\014";

c. Locate the MAC address labelled hardware ethernet in the example

dhcpd.leases file above:

00:00:00:00:00:04

d. Locate the MAC address on the replacement server’s service tag (highlighted in the

photograph below):

MAC1 00:00:00:00:00:00

Figure 4 Locating the MAC address on the service tag (Standby Master server shown, Dragon24)