

How AppLogic Handles NIC failures - Network HA
In this case, the NICs on two servers were known to have failed. The customer asked why AppLogic did not fail over to the other card, given that the servers are in a network-HA configuration (2 NICs in each network).
- First and foremost, network HA in AppLogic targets failover on a "no carrier" status, which occurs on a switch power loss, a broken modem ("phy") on either side, an unplugged or loose cable, etc. Without network HA, these would be our single points of failure. There are other failures that prevent data transfer but are not seen as carrier loss. Those count as a failure of the server and are handled by the N+1 redundancy: whether the server’s power supply, memory, disk or NIC failed, it is still a failed server. So, overall, by requirement we don’t target failover for a NIC failure (but read on…)
- Since we use a standard teaming driver on the servers, we benefit from its ability to fail over to another card in the team if one card fails. This gives us an unintended, but not undesirable, additional ability to fail over from a bad NIC within the server. The driver fails over automatically and only reports it to AppLogic (i.e., it is not AppLogic that drives the failover for the NIC). The driver switches to the standby NIC *only* if the active one is seen as disconnected, which triggers a "no carrier" status. If a NIC fails in a way that the standard teaming driver does not detect, it does not fail over.
In this case the failed NICs were still powered up but in a hung state, so they were not seen as physically disconnected from the network device and no "no carrier" status was triggered. AppLogic therefore shut down (disabled and powered off) the failed servers, restarted the appliances that were running on them on other servers, and repaired the volume streams to other, working servers.
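The two layers described above can be illustrated with a small Python model. This is only a sketch of the decision logic as this article describes it, not AppLogic or teaming-driver code. The class and function names, the interface names `eth0`/`eth1`, and the outcome strings are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class NicState(Enum):
    HEALTHY = "healthy"        # carrier present, traffic flows
    NO_CARRIER = "no carrier"  # e.g. unplugged cable or switch power loss
    HUNG = "hung"              # powered up and linked, but passes no traffic


@dataclass
class Nic:
    name: str  # hypothetical interface name
    state: NicState

    @property
    def carrier(self) -> bool:
        # A hung NIC still reports carrier, so it does not look disconnected.
        return self.state is not NicState.NO_CARRIER

    @property
    def passes_traffic(self) -> bool:
        return self.state is NicState.HEALTHY


def select_active(active: Nic, standby: Nic) -> Nic:
    """Toy teaming driver: switch to the standby NIC only on carrier loss."""
    if not active.carrier and standby.carrier:
        return standby
    return active


def server_outcome(active: Nic, standby: Nic) -> str:
    """Combine NIC-level failover with the server-level (N+1) fallback."""
    chosen = select_active(active, standby)
    if chosen.passes_traffic:
        if chosen is active:
            return "no action"
        return f"NIC failover to {chosen.name}"
    # No usable path: the server as a whole is treated as failed.
    return "server failed, handled by N+1 redundancy"


if __name__ == "__main__":
    standby = Nic("eth1", NicState.HEALTHY)
    print(server_outcome(Nic("eth0", NicState.NO_CARRIER), standby))
    print(server_outcome(Nic("eth0", NicState.HUNG), standby))
```

In this model an unplugged `eth0` results in a NIC failover, while a hung `eth0` keeps its carrier, is never replaced by the standby, and the server ends up handled as a failed server. That matches the behavior observed in this case.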
Copyright © 2013 CA Technologies.
All rights reserved.
 