BFC Troubleshooting

This section contains the following topics:

Deployment Failure

IPMI Remote Power Management Problems

Grid Failure Logs

Grid Recovery Procedures

Deployment Failure

Problem

If you redeploy a node multiple times, or destroy and recreate a grid several times, a node boots from the utility image, but the CA 3Tera AppLogic deployment fails. The following messages appear in the node console:

  EXECUTING INSTRUCTIONS...
/etc/rcS.d/S70run: line 6: ./run: Permission denied
BFC utility image processing is complete. Console login is disabled.
udevd-event[1000]: wait_for_sysfs: waiting for /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/ioerr_cnt failed
udevd-event[99?]: wait_for_sysfs: waiting for /sys/devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/ioerr_cnt failed
udevd-event[1005]: wait_for_sysfs: waiting for /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/ioerr_cnt failed
[ 33.62331?] sd 0:0:0:0: Attached scsi generic sg0 type 0

Reason

An incomplete cleanup of the old configuration in the BFC causes this behavior. The BFC caches the MAC addresses of the deployed servers.

Solution

The BFC also keeps a copy under the /boot_command/config directory. Look in that directory for the file named after the MAC address of the affected server. Delete that file and retry the deployment.
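The cleanup described above can be scripted. The following Python sketch is illustrative only: the directory comes from the text, but the exact file naming is not documented here, so the sketch assumes the file name is the MAC address written with colons, dashes, or no separator, in either case. The function names are hypothetical. By default it only lists matches (dry run) so that you can review them before deleting anything.

```python
import re
from pathlib import Path

CONFIG_DIR = Path("/boot_command/config")  # directory named in the text


def mac_variants(mac):
    """Return common spellings of a MAC address.

    Assumption: the BFC names the file after the MAC address in one of
    these forms. The real naming scheme is not confirmed by the text.
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    variants = set()
    for sep in (":", "-", ""):
        joined = sep.join(pairs)
        variants.update({joined.lower(), joined.upper()})
    return variants


def find_stale_files(mac, config_dir=CONFIG_DIR):
    """List files whose name (with or without extension) matches the MAC."""
    names = mac_variants(mac)
    return sorted(
        p for p in Path(config_dir).iterdir()
        if p.is_file() and (p.name in names or p.stem in names)
    )


def remove_stale_files(mac, config_dir=CONFIG_DIR, dry_run=True):
    """Return the matching files; delete them only when dry_run is False."""
    matches = find_stale_files(mac, config_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

For example, call `remove_stale_files("00:1a:2b:3c:4d:5e")` to preview the matches, then repeat the call with `dry_run=False` once you have confirmed that the listed file belongs to the affected server.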

IPMI Remote Power Management Problems

As part of the discovery and inventory process, the BFC attempts to configure each server it discovers for remote power management through IPMI. When the configuration succeeds, the server is 'power controlled' and the BFC can intelligently control the power management operations on the server. Failure to configure the BMC (Baseboard Management Controller) for IPMI LAN access results in a server in 'manual power' mode.

You can identify a server in 'manual power' mode wherever the server status is identified in the user interface. Specifically:

When a 'power controlled' server does not respond to remote power status or action requests, it is considered 'degraded'. The 'degraded' state is an intermittent state that results from sporadic communication failures or a temporarily non-responsive BMC. Typically these conditions self-correct.
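To check by hand whether a BMC answers power status requests, you can query it from another machine. The sketch below is independent of the BFC and makes several assumptions: the open-source ipmitool utility is installed, the BMC speaks IPMI v2.0 (the lanplus interface), and you supply the host, user, and password yourself. The function names are hypothetical. The password is passed through the IPMI_PASSWORD environment variable (the ipmitool -E option) so that it does not appear on the command line.

```python
import os
import subprocess


def power_status_command(host, user):
    """Build an ipmitool call that asks a BMC for its chassis power state."""
    return [
        "ipmitool",
        "-I", "lanplus",   # assumption: IPMI v2.0; older BMCs may need "lan"
        "-H", host,
        "-U", user,
        "-E",              # read the password from IPMI_PASSWORD
        "chassis", "power", "status",
    ]


def bmc_responds(host, user, password, timeout=10):
    """Return True if the BMC answered the power status request in time."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    try:
        result = subprocess.run(
            power_status_command(host, user),
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

A False result on repeated attempts suggests that the problem is more than a sporadic communication failure.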

However, another condition can also lead to the 'degraded' state: the BFC is unable to set up the BFC PowerAdmin user account that is used to authenticate the remote IPMI calls.

When a server is discovered, the BFC attempts to add its own user (PowerAdmin_BFC) to the power controller. If that attempt fails, the BFC falls back to the system-wide IPMI password. The server is then degraded and, as noted above, the user configuration failure message appears.

There are two cases in which the fallback to the system-wide IPMI password may fail:

  1. A system-wide IPMI password is not set.
  2. A non-BFC user/password is already specified for the server. Note: Server-specific credentials are always used irrespective of the ability to configure the PowerAdmin_BFC user.
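The order in which credentials are chosen can be summarized as a small decision function. This Python sketch is one reading of the description above, not the BFC's actual implementation; the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerPowerConfig:
    server_credentials_set: bool  # a non-BFC user/password is specified for this server
    bfc_user_added: bool          # adding PowerAdmin_BFC to the power controller succeeded
    system_password_set: bool     # a system-wide IPMI password is configured


def credential_source(cfg: ServerPowerConfig) -> Optional[str]:
    """Return which credentials would be used for IPMI calls, or None."""
    if cfg.server_credentials_set:
        # Server-specific credentials are always used, whether or not
        # the PowerAdmin_BFC user could be configured.
        return "server-specific"
    if cfg.bfc_user_added:
        return "PowerAdmin_BFC"
    if cfg.system_password_set:
        # Fallback when the BFC user could not be added.
        return "system-wide"
    # Case 1 in the list above: nothing to fall back to.
    return None
```

In this reading, a result of None corresponds to case 1, and a result of "server-specific" after a failed attempt to add the BFC user corresponds to case 2.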

Do one of the following to change the power state from 'degraded' to 'power controlled'. In either case, you are entering credentials for an existing user.

Note: Although the BFC may attempt to put its own user on the power controller, the BFC never changes or deletes any existing users already configured on a power controller.