

How to Maintain a Grid

As a Backbone Administrator, you want to perform administrative tasks and advanced operations on a grid. These operations include retrieving server information, checking the network health, and reviewing grid failures. BFC Maintainers can perform these tasks regularly.

The following diagram shows how to maintain a grid in a typical environment:

Diagram that shows how to maintain a grid.

  1. Retrieve and modify the server information.
  2. If you have too many servers, remove a server from the grid.

    For example, you want to change the minimum number of required servers from 5 to 4 because of underutilized resources.

  3. If you want to detect problems in the grid, check the network health.
  4. If the grid fails, complete the following steps:
    1. Review the grid failure logs.
    2. Restore grid controller operation.

Retrieve and Modify the Server Information

The 3tsrv utility resides in dom0 of each server. Only grid administrators have access to this utility. Use this utility to retrieve and modify server information and perform functions such as log collection for failure analysis.

Follow these steps:

  1. Execute the following command:
    3tsrv command [ prop[=val] ]* [ --batch ] [ --force ]
    
    command

    Specifies the command that you want to execute.

    prop=val

    Specifies additional command-specific arguments. If you do not specify val, the product assumes a Boolean property with a value of TRUE.

    --batch

    Specifies that the utility executes from a script and should not display lengthy error messages. This option also causes the output to be displayed in UDL format.

    --force

    Forces the specified operation.

  2. Consider the following supported commands:
    info

    Retrieves detailed server information.

    set

    Sets server information.

    reboot

    Reboots the server.

    shutdown

    Shuts down the server.

    applogic activate

    Activates the product on the server.

    applogic deactivate

    Deactivates the product on the server.

    applogic start

    Starts the product on the server.

    applogic stop

    Stops the product on the server.

    diskchk enable

    Enables disk failure detection on the server.

    diskchk disable

    Disables disk failure detection on the server.

    bd list

    Lists the active block devices used by the product on the server.

    sd get

    Displays the contents of the server data file.

    sd set

    Updates the server data file.

    logs collect

    Collects server logs and information.

    help

    Displays help for the utility.

    For example, to view detailed information about the server, execute the following command:

    3tsrv info [ --batch ]
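
The syntax above can be sketched in a small script. This is an illustration only, not the utility's own parser: normalize_prop mirrors the documented rule that a property without a value is treated as a Boolean TRUE, and run_3tsrv is a hypothetical helper that prints the command line when 3tsrv is not installed (for example, outside dom0). The property names verbose and name are placeholders, not documented 3tsrv properties.

```shell
#!/bin/sh
# Illustration only: mirrors the documented rule that a property given
# without a value is treated as a Boolean TRUE. Not the utility's parser.
normalize_prop() {
    case "$1" in
        *=*) printf '%s\n' "$1" ;;        # value supplied: keep as written
        *)   printf '%s=TRUE\n' "$1" ;;   # no value: Boolean TRUE
    esac
}

# Hypothetical helper: run 3tsrv when it is available (dom0 of a grid
# server); otherwise only print the command line that would be run.
run_3tsrv() {
    if command -v 3tsrv >/dev/null 2>&1; then
        3tsrv "$@"
    else
        printf 'would run: 3tsrv %s\n' "$*"
    fi
}

normalize_prop verbose       # placeholder property name
normalize_prop name=value    # placeholder property and value
run_3tsrv info --batch       # documented command and option
```

Passing --batch from a script follows the option description above, which states that it suppresses lengthy error messages and produces UDL output.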
    
Remove a Server from the Grid

You can remove a server from a grid using Server Actions in the Grid Properties page.

If the minimum and target values are equal, you may need to lower the minimum server value before you remove a server. For example, your grid has five servers with the values set to 5/5/5 for minimum/target/maximum. You want to set the minimum value to 4 because srv5 is underutilized.

Note: If the minimum server value is less than the target value, you do not have to adjust the minimum value first. For example, if you indicated 5/7/7 for minimum/target/maximum, you can remove two servers without adjusting the minimum value.
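
The two examples above reduce to simple arithmetic. The following sketch assumes that the grid currently holds exactly the target number of servers; the function name removable_without_adjust is hypothetical and is not part of the product.

```shell
#!/bin/sh
# removable_without_adjust MIN TARGET
# Prints how many servers can be removed before the minimum value must be
# lowered. Assumes the grid currently holds TARGET servers.
removable_without_adjust() {
    min=$1
    target=$2
    if [ "$target" -gt "$min" ]; then
        echo $((target - min))
    else
        echo 0
    fi
}

removable_without_adjust 5 7   # the 5/7/7 example: prints 2
removable_without_adjust 5 5   # the 5/5/5 example: prints 0
```

A result of 0 corresponds to the 5/5/5 case, where the minimum value has to be lowered first.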

Follow these steps:

  1. From the BFC UI, open the Xen or VMware tab of the Grid Properties page.
  2. Enter 4 as the minimum value.
  3. Open the grid shell.
  4. Execute the following command:
    3t srv disable srv<n>
    
    n

    The server number that you want to remove from the grid.

    Note: This step is optional if you select the Force Removal option in the last substep of Step 5.

  5. Complete the following steps:
    1. From the BFC UI, select the Servers tab on the Grid Properties page.
    2. Select the server that you want to remove, then click Remove from the Server Actions drop-down list.
    3. If you do not want another grid to select the server, enable the Quarantine option.
    4. To remove servers that are not disabled in CA AppLogic®, select the Force Removal option.

Check the Network Health

The 3tnetha utility lets you script various network and switch-related checks. The product invokes the 3tnetha script periodically as part of its health checks.

Follow these steps:

  1. Locate the script within the following directory on the controller:
    /var/applogic/scripts
    

    By default, the script does nothing except exit with a status of 0.

  2. Verify the output of the script:
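
Because the default script only exits with 0, any real check has to be added by the administrator. The following is a minimal sketch of one possible check, under several assumptions that do not come from the product documentation: that a non-zero exit status is how a problem should be signaled, that the interface names are known, and that the Linux sysfs operstate file is an acceptable indicator of link state. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of a custom check. Assumptions: interface names, the sysfs path,
# and the meaning of a non-zero status are NOT taken from the product docs.

# iface_is_up NAME [SYSFS_ROOT]: succeeds when the interface reports "up".
iface_is_up() {
    state_file="${2:-/sys/class/net}/$1/operstate"
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "up" ]
}

# check_ifaces SYSFS_ROOT NAME...: returns 0 only if every interface is up.
check_ifaces() {
    root=$1
    shift
    status=0
    for name in "$@"; do
        if ! iface_is_up "$name" "$root"; then
            echo "interface $name is not up" >&2
            status=1
        fi
    done
    return "$status"
}

# In a real script, the last line could be something like:
#   check_ifaces /sys/class/net eth0; exit $?
```

Taking the sysfs root as a parameter keeps the check testable against a temporary directory before it is placed on the controller.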
Review the Grid Failure Logs

You can troubleshoot failures based on the behavior of the grid. Report any unexpected grid failures to CA Support. However, before you submit a bug report, review the Release Notes to verify that your problem is not a known issue.

Follow these steps:

  1. Verify whether you experience any of the following grid behaviors:

    In this example, you cannot verify your problem as a known issue. You decide to file a bug report.

    You want to collect all the logs from the grid, including the backups. For example, xxxx.1, xxxx.2, and so on.

    The grid and server logs require administrator access. You send these logs to CA Support.

    Note: You can use the 3tsrv utility on each server to collect the server-specific logs and information.

  2. Collect the grid controller logs from the following directories:
  3. Collect the following information about each server within the grid (dom0):
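
Gathering a log together with its numbered backups (xxxx.1, xxxx.2, and so on) can be scripted. The sketch below is one way to do it and is not a product tool: collect_logs is a hypothetical helper, the paths in the usage comment are placeholders, and the -T option assumes GNU tar or bsdtar.

```shell
#!/bin/sh
# collect_logs DEST_TARBALL LOG_FILE...
# Archives each named log together with its numbered backups
# (LOG.1, LOG.2, ...). Hypothetical helper, not part of the product.
collect_logs() {
    dest=$1
    shift
    list=$(mktemp) || return 1
    for log in "$@"; do
        # The glob stays literal when nothing matches; -f filters that out.
        for f in "$log" "$log".[0-9]*; do
            if [ -f "$f" ]; then
                printf '%s\n' "$f" >> "$list"
            fi
        done
    done
    rc=0
    tar -czf "$dest" -T "$list" || rc=$?
    rm -f "$list"
    return "$rc"
}

# Usage (placeholder paths; substitute the real log locations):
#   collect_logs /tmp/grid-logs.tar.gz /path/to/xxxx
```

On the servers themselves, the documented 3tsrv logs collect command remains the supported way to gather server-specific logs.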
Restore Grid Controller Operation

If the grid controller server fails, the product detects grid controller recovery issues. These issues can cause the grid controller to become inaccessible.

Follow these steps:

  1. If the grid does not have controller HA because one or more grid controller servers are down, consider the following information:
  2. If you configured the HA grid controller improperly, consider the following information:
  3. If single-server grids do not have HA features, consider the following information:
  4. If you did not configure the grid with the appropriate amount of controller memory, controller CPU, or server memory, consider the following information:

You have successfully performed grid maintenance.