Notes from the meeting.
Overview: BFC Log Naming
How do logs get named? It seems to make no sense.
Roll based on file size:
ContainerX_python.log
Purpose
2012-02-07 15:29:32,045|download_resource_driver|DEBUG|:1267 - tick() 2012-02-07 15:29:32,048|download_resource_driver|DEBUG|:1267 - verify_versions 2012-02-07 15:29:32,064|download_resource_driver|DEBUG|:1267 - verify_hotfixes 2012-02-07 15:29:32,105|download_resource_driver|DEBUG|:1267 - add_grid_versions(),dir: /opt/downloads 2012-02-07 15:29:32,147|download_resource_driver|DEBUG|:1267 - add_grid_hotfixes(),dir: /opt/downloads
BFCLog.x
Purpose
grep -H XEN_RESOURCE BFC* grep -H ESX_RESOURCE BFC*
Erlang.log.x
Purpose
ContainerX_erlang.log.x
Purpose
Stream of Consciousness (SOC)
Purpose
2012-02-07 10:17:28 : Grid Skynet6GridA (1561) - State information is now: Installation resources are now available. Continuing.. 2012-02-07 10:17:28 : Grid Skynet6GridA (1561) - State information is now: Waiting for other installation operations to finish before continuing. 2012-02-07 10:17:28 : Xen Server (1626) is now running. 2012-02-07 10:17:28 : Xen Server (1626) - State information has been cleared. 2012-02-07 10:16:57 : Xen Server (1626) - State information is now: Booting deployed image. 2012-02-07 10:16:38 : Xen Server (1634) is now running. 2012-02-07 10:16:37 : Xen Server (1634) - State information has been cleared. 2012-02-07 10:16:07 : Xen Server (1634) - State information is now: Booting deployed image. 2012-02-07 10:08:58 : Xen Server (1634) is now being booted. 2012-02-07 10:08:58 : Linux Control Utility (1633) is now running. 2012-02-07 10:08:58 : Xen Server (1634) - State information is now: Deploying image to host. 2012-02-07 10:08:58 : Xen Server (1626) is now being booted. 2012-02-07 10:08:58 : Xen Server (1626) - State information is now: Deploying image to host. 2012-02-07 10:08:58 : Linux Control Utility (1625) is now running 2012-02-07 10:08:58 : Linux Control Utility (1633) is now being booted. 2012-02-07 10:08:58 : Raw Hardware (1520) is now running. 2012-02-07 10:08:58 : Raw Hardware (1520) - State information is now: Compute Host Power Status: Compute host is currently under active power control.. 2012-02-07 10:08:58 : Linux Control Utility (1625) is now being booted. 2012-02-07 10:08:58 : Raw Hardware (1462) is now running. 2012-02-07 10:08:58 : Raw Hardware (1520) is now being booted. 2012-02-07 10:08:58 : Raw Hardware (1462) - State information is now: Compute Host Power Status: Compute host is currently under active power control.. 2012-02-07 10:08:58 : Raw Hardware (1520) - State information has been cleared. 2012-02-07 10:08:58 : Raw Hardware (1462) is now being booted. 2012-02-07 10:08:58 : Linux Control Utility (1633) is now waiting for other resources to boot before booting. 2012-02-07 10:08:58 : Xen Server (1634) is now waiting for other resources to boot before booting. 2012-02-07 10:08:58 : Raw Hardware (1462) - State information has been cleared. 2012-02-07 10:08:58 : Raw Hardware (1462) is now waiting for other resources to boot before booting. 2012-02-07 10:08:58 : Linux Control Utility (1625) is now waiting for other resources to boot before booting. 2012-02-07 10:08:58 : Xen Server (1626) is now waiting for other resources to boot before booting. 2012-02-07 10:08:58 : Grid Skynet6GridA (1561) - State information is now: Creating Grid: Installing the grid servers. 2012-02-07 10:08:58 : Grid Skynet6GridA (1561) - State information is now: Creating Grid: There were no unneeded servers.. 2012-02-07 10:08:58 : Grid Skynet6GridA (1561) - State information is now: Creating Grid: Checking for and removing any unneeded servers. 2012-02-07 10:08:58 : Grid Skynet6GridA (1561) is now being booted.
Boot sequence for a server going into a XEN grid:
Boot sequence for a server going into a VMware grid:
Inventory and Deployer Logs
Miscellaneous Logs
Output from installation
Mnesia and Erlang Processes
Mnesia
Erlang Processes
bfcadmin 16713 16711 0 Feb06 pts/2 00:13:21 /opt/bfc/erts-5.8.1/bin/beam.smp -- -root /opt/bfc -progname start_erl -- -home /home/bfcadmin -- -boot/opt/bfc/releases/A/start -config /opt/bfc/releases/A/sys -sname bfc -heart
-mnesia dir "/opt/backup"
/opt/bfc/lib/containers-1/ebin -run container_main run_container 33873 Container1_erlang /opt/bfc/logs
Heartbeat Process
/opt/bfc/releases/A/sys -sname bfc -heart
Common Issues
Feb 8 13:02:57 heehaw-goober-ctrl1 dhcpd: DHCPDISCOVER from 00:15:60:0e:26:26 via eth0:network 192.168.0/24: no free leases Feb 8 13:03:54 heehaw-goober-ctrl1 last message repeated 3 times
SOC
Server is quarantined due to no external IP reserved for this server:
2012-02-08 14:17:12 : Raw Hardware (1667) - State information is now: Compute Host Power Status: Compute host is currently under active power control.. 2012-02-08 14:17:12 : Raw Hardware (1667) - State information is now: IP Assignment: Unable to find a Subnet with IP capacity for: 'bbc/applogic_external_network' usage.. 2012-02-08 14:17:07 : Raw Hardware (1667) - State information is now : Compute Host Inventory Status: Configured IPMI power user: 'PowerAdmin__BFC' in slot: '3' with a system generated password.. 2012-02-08 14:17:07 : Raw Hardware (1667) - State information is now: Compute Host Inventory Status: Received inventory data from server.. 2012-02-08 14:16:50 : Raw Hardware (1667) - State information is now: Power IP Assignment: Reserved Power IP: '10.10.175.22' for: '00:19:B9:DD:C4:CD'.. 2012-02-08 14:16:50 : Raw Hardware (1667) - State information is now: Compute Host Inventory Status: Awaiting inventory data from server.. 2012-02-08 14:16:50 : Raw Hardware (1667) - State information is now: IP Assignment: Allocated:'192.168.0.225' to: '00:19:B9:DD:C4:CD' for: 'bbc/applogic_backbone_network' usage..
BFCLog.x
Search for the component ID of the server (1667 in this case)
Grep -A 12 :1667 BFCLog.*
2012-02-08 14:17:12.426 INFO ['core/resources.compute_host':1667]
compute_host_resource_driver:598
Setting State Info Badge for: 1667, to: [{<<"args">>,
[{1,
<<"bbc/applogic_external_network">>}]},
{<<"category_code">>,
<<"core/services.compute_host.category_compute_host_ips">>},
{<<"category_fmt>>,
<<"IP Assignment">>},
{<<"fmt">>,
<<"Unable to find a Subnet with IP capacity for: '{0}' usage.">>}]
SOC
2012-02-07 13:56:33 : Grid xen (1469) is now failed but remains partially running. 2012-02-07 13:56:33 : Grid xen (1469) - State information is now: Failed in starting the resource: Could not acquire enough nodes to meet min. 2012-02-07 13:56:26 : Xen Server (1542) is now failed but remains partially running. 2012-02-07 13:56:26 : Xen Server (1542) - State information is now: Failed in starting the resource: Utility Image Deployer Results Status never recieved, file: "/var/bfcadmin/deployer/results.192.168.0.33". 2012-02-07 13:56:18 : Xen Server (1533) is now failed but remains partially running. 2012-02-07 13:56:18 : Xen Server (1533) - State information is now: Failed in starting the resource: Utility Image Deployer Results Status never received, file: "/var/bfcadmin/deployer/results.192.168.0.34". 2012-02-07 13:34:42 : Xen Server (1533) is now being booted. 2012-02-07 13:34:42 : Xen Server (1542) is now being booted. 2012-02-07 13:34:42 : Xen Server (1533) - State information is now: Deploying image to host. 2012-02-07 13:34:42 : Linux Control Utility (1532) is now running. 2012-02-07 13:34:42 : Xen Server (1542) - State information is now: Deploying image to host. 2012-02-07 13:34:42 : Linux Control Utility (1541) is now running. 2012-02-07 13:34:42 : Linux Control Utility (1532) is now being booted. 2012-02-07 13:34:42 : Raw Hardware (1361) is now running. 2012-02-07 13:34:42 : Linux Control Utility (1541) is now being booted. 2012-02-07 13:34:42 : Raw Hardware (1423) is now running. 2012-02-07 13:34:42 : Raw Hardware (1361) - State information is now: Compute Host Power Status: Compute host has no power controller and will never be shut down. If a power failure occurs and the host does not return to online status, user intervention will be required. 2012-02-07 13:34:42 : Raw Hardware (1423) - State information is now: Compute Host Power Status: Compute host's power controller is currently unreponsive. It will be reconfigured at the next reboot. If a power failure occurs prior to the next reboot and the host does not return to online status, user intervention will be required. 2012-02-07 13:34:42 : Raw Hardware (1361) is now being booted. 2012-02-07 13:34:42 : Raw Hardware (1361) - State information has been cleared. 2012-02-07 13:34:42 : Raw Hardware (1423) is now being booted. 2012-02-07 13:34:42 : Raw Hardware (1423) - State information has been cleared. 2012-02-07 13:34:42 : Linux Control Utiltiy (1541) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Raw Hardware (1423) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Xen Server (1533) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Raw Hardware (1361) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Linux Control Utiltiy (1532) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Xen Server (1533) is now waiting for other resources to boot before booting. 2012-02-07 13:34:42 : Grid xen (1469) - State information is now: Creating Grid: Installing the grid servers. 2012-02-07 13:34:42 : Grid xen (1469) - State information is now: Creating Grid: There were no unneeded servers.. 2012-02-07 13:34:42 : Grid xen (1469) - State information is now: Creating Grid: Checking for and removing any unneeded servers. 2012-02-07 13:34:42 : Grid xen (1469) is now being booted.
/var/bfc/admin/deployer
/tmp on server
/opt/bfc/logs/ContainerX_python.log
Download directory does not contain any AppLogic versions, and/or contains extraneous directories:
2012-02-08 12:47:57,895|grid_service_driver|ERROR|:1257 - Invalid AppLogic version directory:/tmp/3.5-522 2012-02-08 12:47:57,896|grid_service_driver|ERROR|:1257 - Missing packing.slip 2012-02-08 12:47:57,900|grid_service_driver|ERROR|:1257 - Invalid AppLogic version directory:/tmp/wmtrace 2012-02-08 12:47:57,902|grid_service_driver|ERROR|:1257 - Missing packing.slip 2012-02-08 12:47:57,906|grid_service_driver|ERROR|:1257 - Invalid AppLogic version directory:/tmp/orbit-root 2012-02-08 12:47:57,907|grid_service_driver|ERROR|:1257 - Missing packing.slip 2012-02-08 12:47:57,911|grid_service_driver|ERROR|:1257 - Invalid AppLogic version directory:/tmp/3.1-ga 2012-02-08 12:47:57,913|grid_service_driver|ERROR|:1257 - Missing packing.slip
SOC
No activity for a long time (in this case, these entries were taken at 1:55) (could also indicate the deployer hanging)
2012-02-08 12:55:38 : Grid xen (1469) - State information is now: Creating Grid: Installing the grid servers.
2012-02-08 12:55:38 : Grid xen (1469) - State information is now: Creating Grid: There were no unneeded servers..
2012-02-08 12:55:38 : Grid xen (1469) - State information is now: Creating Grid: Checking for and removing any uneeded servers.
2012-02-08 12:55:38 : Grid xen (1469) is now being booted.
BFCLog.x
2012-02-07 13:34:26,172|grid_resource_driver|DEBUG|:1469 - attempting to connect to version 1294
2012-02-07 13:34:26,175|grid_resource_driver|DEBUG|:1469 - versions before []
2012-02-07 13:34:26,179|grid_resource_driver|DEBUG|:1469 - versions after []
2012-02-07 13:34:26,190|grid_version_service_driver|DEBUG|:1227 - consumeCapacity()
2012-02-07 13:34:26,193|fs_storage_service_driver|DEBUG|:1203 - fs_storage_service_driver.getInstances()
2012-02-07 13:34:26,211|fs_storage_service_driver|DEBUG|:1203 - fs_storage_service_driver.getconsumeCapacity()
2012-02-07 13:34:26,249|grid_version_service_driver|DEBUG|:1227 - releaseCapacity()
2012-02-07 13:34:26,250|fs_storage_service_driver|DEBUG|:1203 - fs_storage_service_drive.releaseCapacity()
2012-02-07 13:34:27,256|grid_resource_driver|DEBUG|:1469 - connected resource to 1294 - new conn 1498 - conn list [1498]
2012-02-07 13:34:27,264|grid_resource_driver|DEBUG|:1469 - setPrivateHotfixes(), hotfixes: []
2012-02-07 13:34:27,265|grid_resource_driver|DEBUG|:1469 - create_hotfix_connections(), hotfixes: []
2012-02-07 13:34:27,267|grid_resource_driver|DEBUG|:1469 - getHotfixes
2012-02-07 13:34:27,275|grid_service_driver|DEBUG|:1257 - reserveIPs(), newGrid: 1469
2012-02-07 13:34:27,276|grid_service_driver|DEBUG|:1257 - reserveIPs(), AppIPs:[(Atom('grid_ip_range'), 1422,'public','10.10.75.143', '10.10.75.143')]
2012-02-07 13:34:27,277|grid_service_driver|DEBUG|:1257 - extractIpRange(), ipRange:(Atom('grid_ip_range'), 1422,'public','10.10.75.143', '10.10.75.143')
2012-02-07 13:34:27,278|grid_service_driver|DEBUG|:1257 - ipRangeList:[(1422,'10.10.75.143','10.10.75.143')]
2012-02-07 13:34:27,304|grid_service_driver|DEBUG|:1257 - reserveIPs,appIpSubPoolId: 1463
2012-02-07 13:34:27,305|grid_util|DEBUG|:1257 - grid_ip_pool_name() Name: GridIpPool_1469_1422
2012-02-07 13:34:28,697|grid_service_driver|DEBUG|:1257 - gridSubPoolId:1499
2012-02-07 13:34:28,700|grid_resource_driver|DEBUG|:1469 - addIpResPoolConn(), resPoolId: 1499
2012-02-07 13:34:31,286|grid_service_driver|DEBUG|:1257 - createIpPoolConnections() gridSubpoolId: 1499
2012-02-07 13:34:31,287|grid_service_driver|DEBUG|:1257 - grid res pool conn: 1539
2012-02-07 13:34:31,288|grid_service_driver|DEBUG|:1257 - validateControllerIp(),gridId: 1469, controllerIP: 10.10.75.142
2012-02-07 13:34:31,324|grid_util|DEBUG|:1257 - grid_ip_pool_name() Name: GridIpPool_1469_1422
2012-02-07 13:34:31,356|grid_util|DEBUG|:1257 - grid_ip_pool_name() Name: GridIpPool_1469_1422
2012-02-07 13:34:33,524|grid_util|DEBUG|:1257 - controller subpool ID: 1510
2012-02-07 13:34:33,528|grid_service_driver|DEBUG|:1257 - set_private_resource_fields()
2012-02-07 13:34:33,535|grid_service_driver|DEBUG|:1257 - set_private_resource_fields(),acctId: undefined, defaultAccountId:dev_optimize
2012-02-07 13:34:33,543|grid_resource_driver|DEBUG|:1469 - setPrivateFields(),args: cUser: cadev@ca.com, accountKeyfile: undefined
2012-02-07 13:34:33,563|grid_service_driver|DEBUG|:1257 - releaseCapacity()
2012-02-07 13:34:33,727|grid_resource_driver|DEBUG|:1469 - ready()
2012-02-07 13:34:33,738|grid_resource_driver|DEBUG|:1469 - adding 2 xen and 0 esx nodes to connection
2012-02-07 13:34:33,743|grid_resource_driver|DEBUG|:1469 - adding #1
2012-02-07 13:34:35,226|grid_resource_driver|DEBUG|:1469 - No cached version found, trying new
2012-02-07 13:34:37,561|grid_resource_driver|DEBUG|:1469 - Configure GridOS:
2012-02-07 13:34:37,562|grid_resource_driver|DEBUG|:1469 - Version => 1294
2012-02-07 13:34:37,564|grid_resource_driver|DEBUG|:1469 - Striping enabled => False
2012-02-07 13:34:37,565|grid_resource_driver|DEBUG|:1469 - Stripe size =>64
2012-02-07 13:34:37,566|grid_resource_driver|DEBUG|:1469 - add result for bbc/properties.gridos_node_xen: 1543
2012-02-07 13:34:37,568|grid_resource_driver|DEBUG|:1469 - adding #2
2012-02-07 13:34:38,619|grid_resource_driver|DEBUG|:1469 - No cached version found, trying new
2012-02-07 13:34:41,870|grid_resource_driver|DEBUG|:1469 - Configure GridOS:
2012-02-07 13:34:41,872|grid_resource_driver|DEBUG|:1469 - Version => 1294
2012-02-07 13:34:41,873|grid_resource_driver|DEBUG|:1469 - Striping enabled => False
2012-02-07 13:34:41,874|grid_resource_driver|DEBUG|:1469 - Stripe size => 64
2012-02-07 13:34:41,876|grid_resource_driver|DEBUG|:1469 - add result for bbc/properties.gridos_node_xen: 1534
2012-02-07 13:34:41,877|grid_resource_driver|DEBUG|:1469 - added 2 xen and 0 esx nodes to connection
2012-02-07 13:34:41,928|grid_version_resource_driver|DEBUG|:1294 - getGridOSHost()
2012-02-07 13:34:41,946|grid_version_resource_driver|DEBUG|:1294 - getGridOSHost(),osHost: 192.168.0.32
2012-02-07 13:34:41,950|grid_version_resource_driver|DEBUG|:1294 - getGridOSHost()
2012-02-07 13:34:41,956|grid_version_resource_driver|DEBUG|:1294 - getGridOSHost(),osHost: 192.168.0.32
2012-02-07 13:34:41,959|grid_version_resource_driver|DEBUG|:1294 - getGridOSHost()
2012-02-07 13:34:41,963|fs_storage_resource_driver|DEBUG|:1293 - getStorageBaseDir(), rootdir: /opt/bfc/applogic_versions
2012-02-07 13:34:41,965|grid_version_resource_driver|DEBUG|:1294 - imageBaseDir: /opt/bfc/applogic_versions
2012-02-07 13:34:41,980|grid_version_resource_driver|DEBUG|:1294 - getGridOSPath()
2012-02-07 13:34:41,985|fs_storage_resource_driver|DEBUG|1293 - getStorageBaseDir(), rootdir: /opt/bfc/applogic_viersions
2012-02-07 13:34:41,990|grid_version_resource_driver|DEBUG|:1294 - imageBaseDir: /opt/bfc/applogic_versions
2012-02-07 13:34:42,188|grid_resource_driver|DEBUG|:1469 - removing resources, removal field: []
And then nothing else.
|
Copyright © 2013 CA Technologies.
All rights reserved.
|
|