Q: What is IPMI and why would I want it?
A: IPMI stands for Intelligent Platform Management Interface. It is an industry-standard interface that, among other things, allows the power state of a server (on, off, cycle) to be controlled remotely across an IP network. The BFC uses this capability to manage power for the servers under its control, which lets it power off servers that are not currently being used by AppLogic grids. If you want your servers to be IPMI power-controlled, they should be on a network that is accessible to both the BFC and the AppLogic grid controller. For the most secure setup, allocate addresses from the backbone subnet for this purpose, because that subnet is accessible to the BFC and the AppLogic controller but is not generally available to the outside environment.
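To make the power operations concrete, here is a minimal sketch that builds a command line for the open-source ipmitool utility. This is only an illustration of IPMI power control in general; it assumes ipmitool is installed and is not a description of how the BFC itself talks to the controller. The host address and user name are hypothetical.

```python
import shlex

# Power actions mentioned in this FAQ, plus a status query.
VALID_ACTIONS = {"status", "on", "off", "cycle"}


def ipmi_power_command(host, user, action, interface="lan"):
    """Build an ipmitool command line for a chassis power action.

    The -E option makes ipmitool read the password from the
    IPMI_PASSWORD environment variable, so it is not passed as an argument.
    The "lan" interface is ipmitool's IPMI v1.5 LAN interface.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E",
            "chassis", "power", action]


if __name__ == "__main__":
    # Hypothetical controller address and user name.
    print(shlex.join(ipmi_power_command("192.168.0.50", "admin", "status")))
```

The function only builds the argument list; running it (for example with subprocess.run) is left to the caller so that nothing is powered off by accident.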
Q: My machines are IPMI 1.0 compliant, will they work?
A: No. IPMI version 1.5 or later is required for IPMI support.
Q: My server doesn’t have an IPMI power controller, can I still use it?
A: Yes, but your level of power management will be more limited. With an IPMI power controller, a server that is not in use is powered off to save power and cooling in the datacenter. Without an IPMI controller, the server must be left running a basic operating system (this is done automatically) so that the BFC can issue a remote command to reboot the server when it needs to be put into service. Additionally, an IPMI power controller can be used to force a power cycle in cases where the operating system becomes non-responsive.
Q: I’ve been using previous versions of AppLogic. Will the hardware work with the new version?
A: It depends. The new version of AppLogic requires that the server's NIC attached to the backbone network be PXE bootable. Beyond that, it depends on how you want to use the new version. If you intend to use Xen configurations in your backbone, you will need to verify that the existing hardware is on the Hardware Compatibility List (HCL).
Q: Do I need to change anything on my server to make it work with the 3.x versions of AppLogic?
A: Yes, you need to make a couple of changes in the BIOS and verify the access methods for the IPMI power if present. Specifically, check the following:
Q: How does server discovery work?
A: Server discovery is accomplished through the use of a number of industry standards (DHCP, PXE booting, IPMI…). The process is:
The server is then powered on and PXE boots on the backbone network.
Q: What is the difference between User and System mode when configuring a power network?
A: User mode tells the BFC to not configure the IPMI power controller but rather to honor the networking information that the user has previously configured. This is the most often used mode (and the default when creating a power network in the UI) as it keeps the IP address of the power controllers fixed at the pre-configured IP addresses so that they can be accessed directly.
In contrast, system mode should be selected if the user does not want to manually configure each IPMI power controller prior to starting the discovery process. In this mode, the BFC will automatically configure the networking parameters for the IPMI power controllers as a part of the discovery/inventory process, based on the power network configuration entered by the user under the Administration->Networks->Power tab. Once the network configuration is assigned as a part of discovery, it will not change. As a result, the user is free to use the IP address shown for the power controller in the BFC UI’s server list to access the IPMI interface externally if desired.
Q: When would I use Manual Configuration ("whitelist") mode for discovery?
A: Manual Configuration mode is typically used in environments where the backbone network is not dedicated to the set of servers to be managed by the BFC installation. In this mode, the user must explicitly enter the MAC address of the NIC that will be used to boot the server on the backbone network. The BFC will only respond to DHCP requests from MAC addresses configured in the whitelist. This ensures that the BFC will never manage a server that has not been explicitly added for inclusion. This is the safest mode in which to run the BFC, but it requires the most work when adding a new server: the MAC address of the boot NIC must be manually gathered and entered into the Manual Configuration list before the BFC will proceed with the discovery/inventory process. The BFC is in Manual Configuration mode at installation (with an empty list).
Q: When would I use Auto Discovery ("blacklist") mode for discovery?
A: Auto Discovery mode is typically used in environments where the backbone network is dedicated to the set of servers to be managed by the BFC installation. In this mode, the BFC will respond to all DHCP requests seen on the backbone network except those from MAC addresses you entered into the list on the Administration page, Discovery tab. Auto Discovery mode allows additional server capacity to be racked and used without the need to manually gather the MAC addresses for each server, determine which NIC is on the backbone network, and then hand enter that MAC address into a whitelist. Avoiding this data gathering and data entry substantially reduces the overhead of adding capacity to a backbone. In this mode, the Auto Discovery list is used operationally to work around configuration issues, such as a server that was added to the wrong network. Rather than waiting for the server to be correctly configured, the BFC administrator can simply add the MAC address of the server to the list on the Administration page, Discovery tab, until the reconfiguration is completed.
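The difference between the two discovery modes comes down to how the configured MAC list is interpreted. The following is a minimal sketch of that decision, not the BFC's actual implementation; the function names and the mode strings "manual" and "auto" are hypothetical.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and use ':' separators for comparison."""
    return mac.strip().lower().replace("-", ":")


def should_answer_dhcp(mac, mode, mac_list):
    """Decide whether a DHCP request from `mac` would be answered.

    mode "manual": the list is a whitelist, only listed MACs are answered.
    mode "auto":   the list is a blacklist, every MAC except listed ones
                   is answered.
    """
    listed = normalize_mac(mac) in {normalize_mac(m) for m in mac_list}
    if mode == "manual":
        return listed
    if mode == "auto":
        return not listed
    raise ValueError(f"unknown discovery mode: {mode}")


if __name__ == "__main__":
    # With an empty list, Manual Configuration mode answers nobody,
    # which matches the state of the BFC right after installation.
    print(should_answer_dhcp("b8:ac:6f:8f:2d:a3", "manual", []))
```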
Q: I’ve powered on a server but it didn’t display in the server list.
A: When this occurs, perform the following checks before trying to discover the server again.
Now that everything is verified, boot the server again while watching both the console of the server and the DHCP server output on the BFC control server. One or the other will point you in the right direction if the server is still not discovered. For server console access, you can typically use an external KVM attached to the server; many IPMI controllers also provide KVM access through the IPMI web interface. Pick the appropriate access mechanism to view the console of the server in question. For access to the DHCP server output, log in to the BFC control node as root and type "tail -100f /var/log/messages" in a terminal session. This shows the DHCP requests from the grid servers as they boot.
Now that you have everything ready to go, power the server back on (either physically by pressing the power button or remotely via the IPMI interface).
Watch the console of the server and verify that after completing the POST operations, the correct NIC is seen to issue a PXE request.
If you get an error before the node PXE boots with text like
Link Failure, Check Cable?
You may have a wiring or switch issue, as the NIC in question is not getting link status with the switch. Please ask your hosting provider or local IT staff to verify the server's network connectivity.
If the right NIC is PXE booting but it eventually times out then there are a couple of items to check. First, while the server is trying to PXE boot, check the /var/log/messages output and see if an entry such as
Apr 30 19:52:40 bfc dhcpd: DHCPREQUEST for 192.168.0.26 (192.168.0.11) from b8:ac:6f:8f:2d:a3 via eth0
Apr 30 19:52:40 bfc dhcpd: DHCPACK on 192.168.0.26 to b8:ac:6f:8f:2d:a3 via eth0
Apr 30 19:52:40 bfc xinetd[942]: START: tftp pid=1367 from=192.168.0.26
Apr 30 19:52:40 bfc in.tftpd[1368]: tftp: client does not accept options
appears in the log with the MAC address of the server in question.
If the entry is not in the DHCP output, then either you have the wrong NIC configured to PXE boot, or the correct NIC is configured but the wiring or switch configuration is incorrect. Please ask your hosting provider or local IT staff to verify the server's network configuration.
If you see the DHCP request in the log but you see
Apr 30 12:23:53 bfc dhcpd: DHCPDISCOVER from f2:32:1d:00:22:00 via eth0: network 192.168.0/24: no free leases
You may have either forgotten to configure the IP address pool in the backbone network or exhausted it. Refer to step 3 above.
If you have verified that you have IP addresses available, then re-verify step 2 to ensure that the system is prepared to respond to your server's MAC address.
If you see in the DHCP log text like
Apr 30 00:40:15 bfc dhcpd: Abandoning IP address 192.168.0.20: pinged before offer
The IP address range you have configured into the BFC for allocation is currently in use by servers on the backbone network. Please refer to step 4 above.
If you’ve gotten this far then you should have seen text in the DHCP log that looks like
Apr 30 19:52:40 bfc dhcpd: DHCPREQUEST for 192.168.0.26 (192.168.0.11) from b8:ac:6f:8f:2d:a3 via eth0
Apr 30 19:52:40 bfc dhcpd: DHCPACK on 192.168.0.26 to b8:ac:6f:8f:2d:a3 via eth0
Apr 30 19:52:40 bfc xinetd[942]: START: tftp pid=1367 from=192.168.0.26
Apr 30 19:52:40 bfc in.tftpd[1368]: tftp: client does not accept options
If you see an error on the server console then the issue is likely with the Linux inventory image not recognizing some aspect of your server. Please capture the error output and contact CA support for further diagnostic steps.
If you get to a prompt on the server console like
Please press Enter to activate this console.
The server has correctly PXE booted the discovery/inventory image. If you still do not see the server in the BFC UI, please contact CA support for further diagnostic steps.
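The log checks described above can be partly automated. The sketch below scans lines from /var/log/messages and labels the three dhcpd messages discussed in this answer. It relies only on the message text shown in the samples above; the function names and the labels it returns are hypothetical.

```python
import re

# Matches a MAC address such as b8:ac:6f:8f:2d:a3.
MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}", re.IGNORECASE)


def classify_dhcp_line(line):
    """Return a label for a dhcpd log line, or None if it is not of interest."""
    if "dhcpd:" not in line:
        return None
    if "no free leases" in line:
        return "no_free_leases"      # address pool missing or exhausted
    if "Abandoning IP address" in line and "pinged before offer" in line:
        return "address_in_use"      # pool overlaps addresses already in use
    if "DHCPACK" in line:
        return "lease_granted"       # the server was given an address
    return None


def diagnose(lines, mac):
    """Collect labels, in log order, for lines that concern `mac`.

    "Abandoning" lines carry no MAC address, so they are always reported.
    """
    mac = mac.lower()
    findings = []
    for line in lines:
        label = classify_dhcp_line(line)
        if label is None:
            continue
        found = MAC_RE.search(line)
        if found is None or found.group(0).lower() == mac:
            findings.append(label)
    return findings


if __name__ == "__main__":
    with open("/var/log/messages") as log:
        print(diagnose(log, "b8:ac:6f:8f:2d:a3"))
```

An empty result for your server's MAC address corresponds to the case where no DHCP entry appears in the log at all, which points back to the NIC, wiring, or switch configuration.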
Q: My server is viewable in the UI but the power type is manual even though it has an IPMI controller.
A: This is usually caused by one of the following issues.
Q: My server was discovered correctly and the power controller was correctly recognized as IPMI but the power controller now shows degraded.
A: This means that the server was correctly inventoried and the IPMI controller was correctly configured (in either User or System mode). Unfortunately, the BFC was unable to contact the IPMI power controller on the configured IP address when performing its routine health and status check. Please verify that the correct IP address range was entered for the power network (Administration->Networks->Power) and that the network is routable from the BFC control node. You can verify this by logging in to the BFC control node and trying to ping the IP address of one of the IPMI power controllers. If this does not work, please contact the hosting company or your local IT staff to verify the network connectivity.
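Both checks can be scripted from the BFC control node. The sketch below assumes the power network can be written in CIDR notation (the BFC UI may express the range differently) and that the Linux ping command is available; the function names and addresses are hypothetical.

```python
import ipaddress
import subprocess


def in_power_network(controller_ip, power_network_cidr):
    """Check that a power controller address falls inside the power network."""
    network = ipaddress.ip_network(power_network_cidr)
    return ipaddress.ip_address(controller_ip) in network


def ping_once(ip, timeout_s=2):
    """Send a single ping (Linux ping flags) and report whether it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical controller address and power network.
    ip, net = "10.0.1.25", "10.0.1.0/24"
    print("in configured range:", in_power_network(ip, net))
    print("answers ping:", ping_once(ip))
```

If the address is inside the range but does not answer ping, the problem is more likely routing or cabling than the BFC configuration.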
Q: I get an error when re-installing the BFC that it cannot re-install because grids have not been deleted, but there are no grids running.
A: This is a known issue, and the workaround is to run the installer with a "-f" flag. This forces the installer to ignore this check.
Q: I get an error when installing or re-installing the BFC that there is not enough disk space in the "/" file system.
A: The BFC and CA AppLogic require at least 25 GB of free disk space on the "/" file system to hold the BFC installation. Please ensure that the required disk space is available and then start the installation again.
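You can check the free space before starting the installer. This is a minimal sketch, not part of the BFC installer; it assumes that 25 GB means 25 * 1024^3 bytes, and the function names are hypothetical.

```python
import shutil

# Assumption: the 25 GB requirement is interpreted as 25 * 1024**3 bytes.
REQUIRED_FREE_BYTES = 25 * 1024**3


def root_free_bytes(path="/"):
    """Return the free space, in bytes, of the file system holding `path`."""
    return shutil.disk_usage(path).free


def has_enough_space(free_bytes, required=REQUIRED_FREE_BYTES):
    """Compare a free-space figure against the installation requirement."""
    return free_bytes >= required


if __name__ == "__main__":
    free = root_free_bytes()
    print(f"free on /: {free / 1024**3:.1f} GB, enough: {has_enough_space(free)}")
```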
Copyright © 2013 CA Technologies.
All rights reserved.