This section describes the known issues and limitations at this time.
This means that an appliance can only talk to appliances connected to it (plus its own server and the grid controller). Nevertheless, protocols on new appliances should be properly specified to help ensure application design integrity and compatibility with future versions of CA AppLogic.
The total available disk space reported by the grid info command is a raw estimate and does not take volume mirroring into account. The true available disk space is the reported available amount divided by the number of mirrors (2 mirrors by default). For example, if 1000GB of disk space is reported as available and the grid is configured for 2 mirrors, the true available disk space is 500GB. Also, to successfully mirror volumes, there must be enough disk space on at least X servers, where X is the number of mirrors. CA AppLogic will not fail to create a volume if any one of its mirrors cannot be created; instead, it displays a warning that the volume could not be mirrored.
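The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the per-server free-space list are assumptions made for the example, not part of any CA AppLogic tool, and the input figure must be read manually from the grid info output.

```python
def usable_disk_gb(reported_gb: float, mirrors: int = 2) -> float:
    """Estimate true available space from the raw figure reported by grid info."""
    if mirrors < 1:
        raise ValueError("mirrors must be at least 1")
    return reported_gb / mirrors


def can_mirror(free_gb_per_server, volume_gb: float, mirrors: int = 2) -> bool:
    """Return True if at least `mirrors` servers each have room for one copy."""
    servers_with_room = sum(1 for free in free_gb_per_server if free >= volume_gb)
    return servers_with_room >= mirrors


# Example from the text: 1000GB reported, 2 mirrors.
print(usable_disk_gb(1000))  # 500.0
```

The second helper mirrors the "at least X servers" condition: a volume can only be mirrored when enough separate servers have space for a full copy.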
If an application is started and one of the grid's servers fails, the application start will fail if one or more of the application's appliances were scheduled to run on the failed server. If this situation occurs, simply restart the application.
To upload larger files to your volume, use the vol manage shell command; don't forget to specify the external IP settings for this command to enable remote access from within the volume manager. For more information, see the reference for the vol manage command.
The new dhcp configuration mode does not support the property markup for appliance configuration. When porting appliances from volfix to dhcp configuration modes, the APK documentation describes how to deal with appliances that depend upon the property markup for appliance configuration. See the Appliance Kit (APK) for more information.
To see the validation flags for an application, open the application in edit mode. The validation flags are used to flag appliances that do not have all of their mandatory properties/terminals/volumes properly configured.
Therefore, the graphical console cannot be used with these appliances. This is done on purpose to make the appliances as compact as possible. Using the new iso2class utility, users may create their own appliances with full desktop support.
This error occurs because CA AppLogic sets the computer name of an appliance to its instance name. Therefore, if more than one appliance running on a grid has the same instance name, the duplicate name error is displayed in Windows on the graphical console. This error is simply a warning and does not affect the grid or its operation. However, if you need to use Windows as a domain controller, you will need to set a unique computer name for each appliance. You may use the wincfg utility to set the computer name in your appliance.
If the latest version of Java is not used, the graphical console may not work correctly (it will hang while trying to load). Before reporting graphical console errors to CA, verify that you are using the latest Java version. If you need to upgrade Java in your browser, be sure to re-open your browser afterwards for the graphical console to work correctly.
When a secondary server takes over as the new primary server, if there are not enough resources available on the server to start the grid controller, CA AppLogic restarts appliances which are running on the new primary server on other servers within the grid so the grid controller can be started on the new primary server. Note that this may break appliance failover groups. If CA AppLogic stops one of these appliances it may not be able to restart the appliance on another server because there may not be enough resources to satisfy the failover group.
All HVM-based appliances (Windows, etc.) use more memory on the server than what they are configured to use. Typically, depending upon the amount of memory assigned to an HVM-based appliance, the appliance uses additional memory on the server in which it is running (this additional memory is required by the virtualization hypervisor running on the servers and is known as shadow memory). Therefore it is possible that even though a server might have enough available memory as compared to what is assigned for the appliance, the appliance will not be able to run on that server due to the additional shadow memory needed for HVM-based appliances that is not available on the server. The CA AppLogic scheduler does take this extra shadow memory into account when scheduling appliances during application start.
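The effect of shadow memory on placement can be illustrated with a minimal sketch. The function name is hypothetical and the 64MB overhead used below is an invented figure for the example only; the text does not state how much shadow memory a given appliance needs.

```python
def fits_on_server(free_mb: int, assigned_mb: int, shadow_mb: int) -> bool:
    """Return True if the server can hold the appliance plus its shadow memory."""
    return free_mb >= assigned_mb + shadow_mb


# A server with exactly the assigned amount free looks sufficient...
print(fits_on_server(4096, 4096, 0))   # True
# ...but not once a (hypothetical) 64MB of shadow memory is added.
print(fits_on_server(4096, 4096, 64))  # False
```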
Any other browser may be used instead.
Shared interfaces should work with all other operating systems.
The following are the known issues in this release:
While the grid and the grid controller are under heavy load, various grid controller commands (for example, app provision or vol resize) may fail and network errors may occur in the GUI. If this issue is encountered, increase the grid controller CPU to 1 and the memory to at least 2GB; this should work around the issue.
This issue will be fixed in a subsequent release.
In order to work around this issue, unpin the appliance and restart the application. This issue will be fixed in a subsequent release.
Currently the filer does not support managing two ext3-snapshot volumes at the same time. This issue will be fixed in a future release.
When using the HP Smart Array RAID controller without the write cache enabled, there is a 50% reduction in performance. This issue has been verified on an HP DL 580 G7 Server with a Smart Array P410i 256MB. These cards require a battery or capacitor to be installed to enable the write cache.
When using ServerEngines Corp. Emulex OneConnect 10Gb NIC (be3) (rev 01) NICs with CA AppLogic, these NICs incorrectly bounce packets if the SR-IOV BIOS option is enabled. These bounced packets alter the bridge's forwarding cache, causing the bridge to drop packets instead of forwarding them to the correct destination. This causes instability in CA AppLogic which results in intermittent application start failures. Therefore, please ensure that the SR-IOV BIOS setting is DISABLED for all Emulex 10G NICs on all servers within the grid.
Very rarely an application will fail to start due to a stuck volume mount on one of the servers. CA AppLogic detects stuck volume mounts and reports them to the user on the grid's dashboard. If this problem occurs on your grid, notify CA Support. Optionally, disabling the server or rebooting the server that has the stuck mounts will resolve this issue.
If this situation occurs, rebooting the primary server will restore the grid to an operational state. Note that this issue has not been observed in CA AppLogic 3.5 or 3.7.
The GUI no longer automatically logs the user out when there is heavy load on the grid controller. Instead, the user will receive a message stating that there was a network error. In this case however, the GUI is still fully functional. The network error message will only be received when there is heavy load on the controller, such as starting 4 applications at the same time AND copying a large multi-GB volume. In large grids, try assigning up to a full CPU core and 1GB RAM to the controller.
If a grid is rebooted using the grid reboot command, one or more of the system volumes may become degraded when the grid comes back up. CA AppLogic automatically repairs these volumes as the highest priority.
When migrating a volume, verify that at least one of its streams is on an enabled server or else the migration command will fail. The volume can be completely migrated off of its original set of servers by migrating the volume twice.
Some physical servers may take a long time to reboot, which may cause CA AppLogic's automated grid recovery to fail. As a result, not all applications may be restarted automatically after the grid recovers from a failure. This is because the grid controller waits a maximum of 10 minutes for all servers to reboot and reconnect to the grid controller, which may not be enough time for all servers to reboot. To work around this, manually restart the applications after all servers have reconnected to the grid controller. Execute "list srv" to verify that all servers are connected to the grid controller; they should all be in the UP state. In CA AppLogic 2.1, with a server boot timeout of 10 minutes, this may occur primarily if a server fails to boot due to a hardware or BIOS malfunction.
When the operator reboots the grid, the grid flapping state is supposed to be reset and a message should be displayed on the dashboard stating that the operator rebooted the grid intentionally ("Grid has been restarted by operator on ..."). Occasionally when rebooting the grid, the grid file is not reset nor is the dashboard message displayed. The only problem that this may cause is that, upon the next grid failure, the applications may not be automatically restarted (depending on how many times the grid has failed when this bug occurs). To work around this problem, if no dashboard message is displayed after an intentional grid reboot, contact CA Support to have the grid flapping state reset on your grid.
The reason for the slightly reduced resources is related to allocation for service areas. For memory, it is likely due to Xen related to the memory map table for a virtual machine. For disk, it is due to normal file system service areas (this is the same as on regular Linux servers).
In this case, the application is not opened for editing by any other user but the CA AppLogic editor erroneously thinks somebody else has the application open for editing. If this occurs, simply override the application lock when prompted by the editor upon opening the application.
The main slowdown occurs when opening an application in the CA AppLogic infrastructure editor.
If the client has the graphical console open and they lose connection to the internet (client network card failure, client computer crash, internet access is unavailable, etc.), it will take 15 minutes to re-open the graphical console.
The mouse is hard to use in Ubuntu when using the CA AppLogic graphical console. This is due to a limitation of the Xen VNC support (mouse acceleration is not supported). Some users report that adjusting the mouse settings in Ubuntu resolves the issue. Also, rarely keystrokes will be repeated several times when typing in text from the keyboard (in such cases, simply delete the extra characters that are displayed).
This includes passwords when logging into an appliance. The text boot console should only be used for debugging purposes. The SSH console can be used instead for all other purposes.
If a user re-opens the text boot console for an appliance after it has already been opened, they must press the enter key to see either the login prompt or the command prompt. This is because the boot console is waiting for user input (either for login information or a command to be executed).
If a grid has an appliance that is part of a failover group running on a secondary server where the grid controller needs to be restarted, CA AppLogic may stop that appliance which could break the failover group.
After upgrading a grid to the latest release, a dashboard message is posted stating that the grid failed due to a hardware issue. This message can be safely ignored and removed from the dashboard.
If using a network HA configuration with CA AppLogic and there is an external network failure, applications/appliances that use external interfaces may become inaccessible for up to 5 minutes. This appears to be caused by the external router caching MAC addresses. Waiting for the router to flush its ARP cache or sending an ARP response with arping from the application restores operation. This only affects the external network (the backbone network is not affected).
The recovery GUI only works on Xen-based servers.
Shared interfaces do not support appliance counters.
If a user power-cycles a grid, the system uptime is not reset. If the grid is rebooted, the system uptime should be reset.
If a user power-cycles a grid using the grid power_cycle command, the primary server may fail to reboot. This only occurs when the command is executed after a new grid install and the grid was never rebooted before the power cycle command was executed. Rebooting the grid at some point after a new grid install will avoid this issue.
When a grid that used a SAN is destroyed, CA AppLogic deletes the contents of the grid’s folder on the SAN, but leaves behind the empty folder. This issue will be resolved in a future release.
Very rarely, an upgrade to 3.7 from either 3.0 or 3.1 may fail. In this particular upgrade failure case, the following messages are present in the grid’s status log accessed using the BFC (click on the status of the grid to open the log).
installing the controller image
ioctl: LOOP_SET_FD: Device or resource busy
installing new controller FAILED, aborting
If these messages are present in the log, rerun the upgrade; it should succeed.
Note: This issue is actually a bug in both CA AppLogic 3.0 and 3.1, and is resolved in CA AppLogic 3.7.
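Recognizing this particular upgrade failure amounts to checking the status log for the messages quoted above. The following is a minimal sketch that assumes the log text has already been copied out of the BFC into a string; the function name and the choice of marker strings are illustrative assumptions, not part of any CA AppLogic tool.

```python
# Marker strings taken from the messages quoted above.
FAILURE_MARKERS = (
    "ioctl: LOOP_SET_FD: Device or resource busy",
    "installing new controller FAILED, aborting",
)


def is_known_upgrade_failure(log_text: str) -> bool:
    """Return True when every marker message appears in the log text."""
    return all(marker in log_text for marker in FAILURE_MARKERS)


sample = """installing the controller image
ioctl: LOOP_SET_FD: Device or resource busy
installing new controller FAILED, aborting"""
print(is_known_upgrade_failure(sample))  # True
```

If the check returns True, the documented remedy applies: rerun the upgrade.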
The rollback command does not work from 3.5 to 3.1 for an ESX-based grid. However, as a workaround, the downgrade command can be used (note that downgrade takes a bit longer than rollback). This issue will be resolved in a future release.
Ext3-snapshot based volumes do not work on ESX-based grids. However these volumes work on Xen-based grids. If you are using an ESX-based grid and you need to use an ext3-snapshot volume, you can add a Xen-based node to your grid and use that node to create/manage your ext3-snapshot volumes (when running the volume commands, disable all of the ESX servers so the CA AppLogic filer will run on the Xen-based node). This issue will be resolved in a future release.
An attempt to migrate a volume stream on the local SAN might fail on grids that are configured to use an external SAN. Instead of migrating the volume stream to the local SAN, CA AppLogic incorrectly tries to migrate the stream to the external SAN. If you encounter this failure, use the store=local option with the vol migrate command. This issue will be resolved in a future release.
When CA AppLogic is upgraded from 3.0.30 to 3.5.x, the grid controller intermittently hangs and any 3tshell command executed returns a low memory condition error message.
To work around the issue, reboot the grid controller. This issue will be resolved in a future release.
Note: This could affect the 3.7 release as well.
While resizing very large NTFS-based volumes (many GBs in size), the resize operation may stop reporting progress and will appear to be stuck. However the resize operation is indeed progressing and will be completed successfully. This reporting issue will be fixed in a future release.
In CA AppLogic 3.1 and later, the megaraid SAS driver's performance is degraded and operates ~75% slower as compared to a physical server. CA is currently working on resolving this issue and will release a hotfix as soon as the issue is identified and fixed. Until this issue is resolved, it is strongly recommended to use a different type of disk controller.
There is an issue with upgrading Cygwin while trying to upgrade to the latest Windows APK that is distributed with CA AppLogic 3.7. It is recommended to build a new Windows appliance rather than upgrading until this issue is fixed. This issue will be fixed in a future release.
When executing 3t commands over ssh, the parameters are being split either on a space or on a back tick (`), depending on the way the command is invoked. If a 3t command has a property value with a space in it, the characters after the space will incorrectly be treated as a separate argument. This issue will be fixed in a future release.
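The effect of splitting on a space can be illustrated locally. This sketch does not reproduce the ssh code path or suggest a fix; it only shows how a value containing a space ends up as two arguments when split naively, compared with the grouping the user intended. The command name and property below are hypothetical placeholders, not real 3t syntax.

```python
import shlex

# Hypothetical command line; "some_command" and "name" are placeholders.
raw = "some_command name=John Smith"

# Naive splitting on spaces cuts the value: "Smith" becomes its own argument.
naive_args = raw.split(" ")
print(naive_args)  # ['some_command', 'name=John', 'Smith']

# The intended grouping keeps the whole value in one argument.
intended_args = shlex.split('some_command "name=John Smith"')
print(intended_args)  # ['some_command', 'name=John Smith']
```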
The http_port property is ignored; as such the port will always be 8080. This issue will be fixed in a future release.
Attempts to start more than 90 HVM-based appliances on Xen-based grids may fail with mount or appliance start errors. This is a known issue and will be fixed in a future release.
Please use an older version of Safari or refer to the following link for a possible workaround for this issue.
The following are the key known problems with Windows appliances in this release. Also, see the Windows Appliance Installation Reference for additional procedures and notes.
32-bit Windows 8 is currently not supported by the Halsign Turbogate Drivers; however the 64-bit version of Windows 8 is supported. This issue will be fixed in a subsequent release.
The Windows APK currently does not correctly detect duplicate IP address assignments. Therefore it is up to the user to determine if they have accidentally assigned duplicate IP addresses. This issue will be fixed in a subsequent release.
The Windows filer can fail a volume resize operation if the source volume contains a corrupt directory entry/file. The main source of this problem comes from the fact that some of the Microsoft software installations purposely contain invalid directory entries (we are not sure why this is; this has been observed when a user installed a version of Microsoft SQL Server in their appliance). Additionally, the source volume can be corrupt due to normal wear and tear. This issue can be worked around by running a file system repair on the volume (vol fsrepair) before resizing the volume.
It has been observed by CA that the NTFS volume resize operation fails about two times out of 100. These two failures occurred because the Windows filer failed to start correctly on the grid. If this issue is observed, repeating the resize operation a second time should succeed. This issue however should be resolved in this release; if this issue is observed, notify CA technical support.
The Windows filer uses a Microsoft utility named diskpart to deal with the Windows NTFS volumes. Occasionally diskpart fails to obtain volume information or may fail to mount the volume. This is a very rare failure and may cause either vol create or vol resize to fail over NTFS volumes.
If the user has an application that contains a Windows appliance and one or more Windows appliances are added to the application, or terminals are added to or removed from the Windows appliances, some of the Windows appliances may detect duplicate IPs on their internal network during the first application start (this can only happen during the first application start after the application is modified). This should not cause any operational failure of the application or require user intervention; the duplicate IP addresses are purely temporary. In the worst case, some of the network communication involving any of the Windows appliances may be delayed for up to 30-60 seconds.
Occasionally zeros are reported for the following disk I/O counters for Windows appliances (even though sustained I/O is being generated): Total bytes written/read, # of volume writes/reads, time spent in writes/reads. This is due to a bug in the Windows perfmon API; the zero values are what the Windows perfmon API reports.
Other than the filer MSI, localized Japanese Windows should work under CA AppLogic.
A Windows appliance fails to start if the MagicISO virtual DVD-ROM device is installed. Virtual DVD-ROM devices are not currently supported in CA AppLogic for Windows-based appliances.
Occasionally it takes several minutes for Windows to detect new NICs inside of an appliance. This occurs when the user adds/removes terminals for a Windows appliance singleton. The extra time it takes to detect these new NICs may cause appliance boot timeouts. To work around this, increase the boot timeout of your Windows appliance.
If a user has a Windows appliance on their grid and they migrate the appliance to another grid that has different hardware, the Windows appliance may require re-activation (Microsoft's Windows re-activation). The re-activation is triggered when a specific amount of hardware has changed (it is unknown to CA exactly what hardware changes trigger the re-activation). Note that re-activation may require access to the internet from within the Windows appliance. This particular problem was observed after resizing the Windows appliance boot volume and migrating the appliance to a different grid.
This issue only affects Windows 2008 Server 32/64-bit (Windows 2003 Server works OK). When accessing a Windows 2008 volume either through the filer or using ssh to an appliance, the user may not be able to access/modify files due to permission issues. To access and modify files using the command shell, log in through the graphical console to the Windows desktop and open a command shell. The command shell can be used to access and modify files.
Currently Windows 2008 DataCenter edition appliances fail to start if configured to use more than eight CPUs (only on Xen-based grids).
Copyright © 2013 CA Technologies.
All rights reserved.