

Brownbag Session - 2012-04-05 - Backbone Fabric Controller 3.1: BFC/AppLogic Interaction
This session is to help you understand:
- How the BFC/BFC GUI and AppLogic interact for grid operations
- What OS services the BFC uses
- What can be done about "stuck" BFC operations
These topics provide a knowledge base for Support that will help in resolving customer issues with the BFC.
- BFC and AppLogic
- BFC GUI and AppLogic
- BFC and CentOS
- BFC and "Stuck" Operations
1. BFC and AppLogic
- The BFC creates and manages AppLogic grids
- Each BFC can manage up to 31 grids
- The BFC uses a common pool of commodity servers to create grids
- To create/modify grids, the BFC today executes ALDO commands
- "aldo new"
- "aldo set"
- "aldo addserv/remserv"
- "aldo clean"
- "aldo upgrade1/upgrade1a/upgrade2"
- Grid start in the BFC just powers on servers
- Grid stop issues a shutdown on all grid servers
- The commands executed can be found in the ContainerX_python.log.x file
- Communication with grid controllers occurs on the backbone network
- Each grid controller is reached on its 192.168.<grid ID>.254 address
- Grid servers are monitored by the BFC via the grid controller, but today we only show AppLogic server status. The BFC does not act on this status
- Once every minute the BFC does a "health" check on the grid
- Every 5 minutes the BFC does a "srv list" to get server states
- Once a grid is running, the BFC does not interact with the grid controller or the grid servers apart from that monitoring and any user-initiated actions.
- If the BFC service is stopped on the BFC server, this does not affect any running grids. They continue to run
- Commands run in the grid shell do not affect the BFC
- Grid servers cannot be quarantined or power-controlled via the BFC GUI while in a grid
- Grid updates
- Any grid update (grid name/description, application IPs, VLAN settings, "GTB" entries, etc.) runs an "aldo set" command, which causes the grid to "regulate" until the command completes.
- Some of these updates (grid name, controller IP, etc.) require a grid reboot
- If the update fails, there are two options:
- Re-issue the update, correcting any parameter issues
- "Clear Failure" will return the grid to its running state
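When chasing a failed update or an unreachable controller, two of the points above can be turned into a small helper script. The following is a minimal sketch, not part of the BFC: it derives a grid controller's backbone address from the grid ID, and it scans log lines for the aldo subcommands listed earlier. The function names are hypothetical, and the log line layout is an assumption (the sketch only relies on the literal text "aldo <subcommand>" appearing somewhere in the line).

```python
import ipaddress
import re

# Subcommands taken from the list of aldo commands in this section.
ALDO_SUBCOMMANDS = ("new", "set", "addserv", "remserv", "clean",
                    "upgrade1a", "upgrade1", "upgrade2")

# Assumption: the command appears literally in the log line as "aldo <subcommand>".
ALDO_PATTERN = re.compile(r"\baldo\s+(" + "|".join(ALDO_SUBCOMMANDS) + r")\b")


def grid_controller_address(grid_id):
    """Return the backbone address 192.168.<grid ID>.254 for a grid.

    Only checks that the result is a valid IPv4 address; the range of
    valid grid IDs is not stated in these notes.
    """
    return str(ipaddress.IPv4Address("192.168.{}.254".format(grid_id)))


def find_aldo_commands(lines):
    """Return (line_number, subcommand, text) for each line that mentions an aldo command."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = ALDO_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits
```

To use it against a real log, open the ContainerX_python.log.x file and pass the file object to `find_aldo_commands`; filtering the result for the `set` subcommand narrows the output to grid updates.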
2. BFC GUI and AppLogic
The BFC GUI does not directly issue aldo commands, but uses the Client Interface
BFC Client Interface
- Provides a set of interfaces to allow the GUI to do all BFC operations
- This is not a publicly available interface (it requires a lot of internal knowledge of the infrastructure)
- In 3.5, we have the BFC API to allow non-BFC developers to create scripts, etc. to interact with the BFC
3. BFC and CentOS
The BFC currently runs on CentOS 5.5
Linux services utilized by the BFC:
Discovery
- LDAP
- Discovery configuration (whitelist and blacklist) and BFC users
- TFTP
- Allows the BFC to define whether a machine is booting from the BFC utility image or local disk
- SSH/SCP
- Used to issue shutdown commands to grid servers, power control non-IPMI servers, and send inventory/deployer data back to the BFC
- NFS
- The NFS service is set up so that the download directory and the replica directory can be on an NFS mount
- Will also be used in 3.5 for NFS storage by AppLogic
- IPTables
- The BFC just configures iptables to allow only HTTPS access to the web server
As of 3.1, the BFC serves as an NTP server for the grid servers
- BFC can be configured to use an NTP server for the BFC itself
- BFC is then an NTP server for grid servers
- Sets up BFC as the NTP server in the gridOS during grid server deployment
- Sets up BFC as the NTP server in applogic.conf on grid controller
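If a grid server's clock looks wrong, one quick check is whether the BFC actually answers NTP queries. The sketch below is a generic SNTP client probe using only the Python standard library; it is not a BFC tool, the function names are hypothetical, and the host passed to `query_ntp` would be whatever address the BFC has on the network you are testing from.

```python
import socket
import struct

NTP_PORT = 123
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_sntp_request():
    """Build a 48-byte client request: LI=0, VN=3, Mode=3 (client)."""
    return b"\x1b" + 47 * b"\x00"


def parse_sntp_reply(packet):
    """Return (mode, stratum, unix_seconds) from an NTP reply packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet is shorter than 48 bytes")
    mode = packet[0] & 0x07          # 4 means "server"
    stratum = packet[1]
    # Integer part of the transmit timestamp lives in bytes 40..43.
    tx_seconds = struct.unpack("!I", packet[40:44])[0]
    return mode, stratum, tx_seconds - NTP_UNIX_DELTA


def query_ntp(host, timeout=2.0):
    """Send one SNTP request to host and return the parsed reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (host, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_sntp_reply(packet)
```

A `socket.timeout` from `query_ntp` suggests that nothing is answering on UDP port 123 at that address; a reply with mode 4 means a server responded, and the returned time can be compared with the local clock.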
4. BFC and Stuck Operations
In certain error conditions, servers can appear to be "stuck" during grid operations
- BFC has a couple of timeouts for servers going into a grid which lead to servers appearing to be stuck if they are not booting properly
- Can also hit issues coming out of a grid while we are running the "sanitizer"
- In 3.1/3.5, the BFC does not have a good way to shortcut these processes, but the dev team definitely recognizes this as a problem
- Big Red Button (proposed)
- Would shortcut operations and take the grid back to an offline state
- Could just power off servers as part of this process
- Issue: During the "sanitizer" process, if we just power off boxes we could leave data on disks
Copyright © 2013 CA Technologies.
All rights reserved.
 