Installation Problem at 3.7.14: Error Message
2013-06-26 21:13:10 : Grid testgrid (3670) is now waiting for other resources to boot before booting.
2013-06-26 15:32:52 : Grid testgrid (3670) is now failed but remains partially running.
2013-06-26 15:32:49 : Grid testgrid (3670) - State information is now: Failed in starting the resource: Error running aldo command: ["new","grid=testgrid","controller_ip=202.158.107.11",
"servers=192.168.200.10:10.64.71.113:PowerAdmin__BFC:******,192.168.200.11:10.64.71.112:PowerAdmin__BFC:******",
"grid_id=27/27","account_id=cbn","account_key=/opt/bfc/fcaccountkeyfile.pri",
"ips=202.158.107.12-202.158.107.50/24+202.158.107.1","ipbase=10.0.0.0",
"answer=yes","ext_network=202.158.107.0/24","ext_gateway=202.158.107.1",
"ctl_cfg=/opt/bfc/applogic_versions/configs/ctl_cfg_3670",
"file=/opt/bfc/applogic_versions/configs/globaldir_cfg_3670",
"sm_type=ipmi_public","ext_dns1=202.158.3.7","ext_dns2=202.158.3.6",
"file=/opt/bfc/applogic_versions/configs/config_3670",
"time_servers=192.168.200.12",
"appsnet_cfg=/opt/bfc/applogic_versions/configs/apps_network.3670"] -- returned 25
2013-06-26 15:32:46 : Grid testgrid (3670) - State information is now: Creating Grid: Error running aldo command: ['new', 'grid=testgrid', 'controller_ip=202.158.107.11', 'servers=192.168.200.10:10.64.71.113:PowerAdmin__BFC:******,192.168.200.11:10.64.71.112:PowerAdmin__BFC:******', 'grid_id=27/27', 'account_id=cbn', 'account_key=/opt/bfc/fcaccountkeyfile.pri', 'ips=202.158.107.12-202.158.107.50/24+202.158.107.1', 'ipbase=10.0.0.0', 'answer=yes', 'ext_network=202.158.107.0/24', 'ext_gateway=202.158.107.1', 'ctl_cfg=/opt/bfc/applogic_versions/configs/ctl_cfg_3670', 'file=/opt/bfc/applogic_versions/configs/globaldir_cfg_3670', 'sm_type=ipmi_public', 'ext_dns1=202.158.3.7', 'ext_dns2=202.158.3.6', 'file=/opt/bfc/applogic_versions/configs/config_3670', 'time_servers=192.168.200.12', 'appsnet_cfg=/opt/bfc/applogic_versions/configs/apps_network.3670'] -- returned testing the target servers
connecting the server (maximum attempts:1)
connected to 192.168.200.10
verifying connection to 192.168.200.10
testing OS and distro version on 192.168.200.10
checking network setup on 192.168.200.10
server check phase 1 completed
connecting the server (maximum attempts:1)
connected to 192.168.200.11
verifying connection to 192.168.200.11
testing OS and distro version on 192.168.200.11
checking network setup on 192.168.200.11
server check phase 1 completed
testing the target servers OK
detecting network layout
detecting network layout OK
Switches:
N Identifier name model
--------------------------------------------------
a 74:8e:f8:20:98:f2:0001 CYB2-SW-BRCD-TX24... 74:8e:f8
b 00:05:33:67:c5:2c:0000 00:05:33
LANs:
N Role Switches
----------------------
l1 backbone a
l2 external b
Connections
               | l1   | l2   |
               | a    | b    |
192.168.200.10 | eth0 | eth1 |
192.168.200.11 | eth0 | eth1 |
INFO: no redundant wiring found -> no network HA
preparing new grid configuration
re-checking controller addr from srv1 (192.168.200.10) on eth0 eth1
looking for existing grids visible from 192.168.200.10...
Found active grid IDs: 1 10 21
Using operator-specified value 27 as the new grid identifier
getting timezone info from localhost
preparing new grid configuration OK
testing connections between the servers
checking for direct Ethernet connection from 192.168.200.10
using 192.168.27.250 192.168.27.249 for server conn tst
link speed 192.168.200.10->192.168.200.11 verified
server check phase 2 completed
testing connections between the servers OK
installing first grid server
downloading kernel-mode packages
downloading user-mode packages
running package install
server install on 192.168.200.10 completed
installing first grid server OK
installing controller
installing the controller image
building import/export workarea: 65536 Mb
metadata vol build started
meta volume created and mounted
applogic.conf prepared
copy ssh-keys
metadata vol build done
create impex volume
warning: Unable to get device geometry for /var/applogic/volumes/vols/v-ctl-impex
create boot volume
controller volumes done
controller install on 192.168.200.10 completed successfully
starting grid on 192.168.200.10, please wait...
closing connection to 192.168.200.10
waiting for 192.168.200.10 to reboot
.
.
ping succeeded, re-connecting SSH session...
.
connected to 192.168.200.10
waiting for controller to start
waiting for VMM daemon (xend) to start
.
waiting for the ctl vm to start
ping OK
connecting to controller
connected to 192.168.27.254
connected to controller for 'testgrid' (192.168.27.254), id=27, srv1
saving version tag
applying customization settings from /opt/bfc/applogic_versions/configs/ctl_cfg_3670
reading cfg template from 192.168.200.10
updating controller configuration
saving sm_id for 192.168.200.10
installing controller OK
preparing to add new servers to grid
getting timezone info from Asia/Jakarta
Looking for existing servers
servers currently on 192.168.27.254: 1
preparing to add new servers to grid OK
installing srv2 on 192.168.200.11
downloading kernel-mode packages
downloading user-mode packages
running package install
server install on 192.168.200.11 completed
installing srv2 on 192.168.200.11 OK
starting applogic on the installed servers
closing connection to 192.168.200.11
closing connection to 192.168.200.10
waiting for reboot
waiting for the servers to reboot...
all new servers responded to ping, waiting for them to join the grid
starting applogic on the installed servers FAILED
aldo: error: 1 servers did not come up, final grid status:
Name State CPU Mem(MB) BW(Mbps) Role
Alloc Free Alloc Free Alloc Free
---------------------------------------------------------------------
srv1 up 0.00 6.00 0 95085 0 20000 primary
srv2 unknown 0.00 0.00 0 0 0 0 none
cleanup
closing connection to 192.168.27.254
cleanup FAILED
Cleaning up after a failed install is incomplete, the following servers
may be in an indeterminate state: 192.168.200.10 192.168.200.11
NOTE: aldo will now exit, the remote ssh access to the failed servers
is down and they will not be contacted for cleanup; use
'Remove' from grid and 'Clear Failure' to clean the servers after
checking them and restoring network access to them
**** aldo new: command FAILED
2013-06-26 15:32:26 : Grid testgrid (3670) - State information has been cleared.
Solution:
From 3.7 onward, jumbo frames are turned ON by default when a 10G grid is created or upgraded. See the link below for more information:
http://doc.3tera.com/AppLogic37/2130178.html#2125027
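Before changing the switch configuration, it is worth confirming what MTU the server NICs are actually using. Below is a minimal sketch that parses the `mtu` field out of `ip -o link show` output; the sample line is canned for illustration, so in practice pipe the real command output through the same `sed`, substituting the backbone interface shown in the wiring table above (eth0 here is an assumption):

```shell
# Sample `ip -o link show eth0` output (canned here for illustration):
line='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000'
# Extract the numeric MTU field:
mtu=$(printf '%s\n' "$line" | sed -n 's/.* mtu \([0-9][0-9]*\).*/\1/p')
echo "$mtu"   # a value below 9000 means the NIC is not using jumbo frames
```

Run the same extraction against the live command on each grid server to see whether the NIC and switch MTUs disagree.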
To enable jumbo frames on the switch, raise the MTU on the 10G interfaces that carry grid traffic, for example:

LODISW10G(config)#interface TenGigabitEthernet2/6
LODISW10G(config-if)#mtu 9198
LODISW10G(config-if)#end

Note: it is best to refer to the Cisco documentation and consult your network administrator on how to make this change.
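Once jumbo frames are enabled on the switch, you can verify the path end to end with a non-fragmentable ping whose payload fills the MTU. This is a sketch, assuming Linux `ping` and a grid-side MTU of 9000 (adjust to your actual NIC MTU); the payload is the MTU minus 28 bytes of IP and ICMP headers:

```shell
# Max ICMP payload = MTU - 20 (IP header) - 8 (ICMP header)
MTU=9000
PAYLOAD=$((MTU - 28))
echo "$PAYLOAD"
# From srv1, a non-fragmenting ping of this size to srv2 should succeed
# once the switch ports accept jumbo frames:
#   ping -M do -c 3 -s "$PAYLOAD" 192.168.200.11
```

If the large ping fails while a default-size ping succeeds, a device on the path is still dropping jumbo frames.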
Copyright © 2013 CA Technologies.
All rights reserved.