Key Points About Metrics Collection
To make informed decisions when you select metrics, review these points to understand CA Server Automation performance and application metrics collection:
- How does CA Server Automation collect metrics data? CA Server Automation communicates with the CA Performance Agent or with the SystemEDGE agent on the remote computer to collect the specified system metrics. The CA Performance Agent is a lightweight version of the CA NSM Performance Agent. If you have installed the CA NSM Performance Agent, CA Server Automation can also poll that agent.
Note: The CA Performance Agent works differently from the CA NSM Performance Agent, so the metrics available for collection on Linux can be different, depending on which agent you use.
To collect the base system metrics from a server, an agent must be installed on that server: the CA Performance Agent, the SystemEDGE agent, or the CA NSM Performance Agent. If a CA NSM Performance Agent or a SystemEDGE agent is already present, the CA Performance Agent is not required. If necessary, you can install the SystemEDGE agent using the product user interface. All performance metrics are stored in the Performance DB.
- How is overall utilization calculated? Overall utilization is an aggregate calculation of all the metrics that are currently being collected for servers managed by CA Server Automation. The calculation is based on the value of the metrics and the user-defined thresholds that define the parameters for normal operation. Any new metric that you select for collection is not used in the overall utilization calculation unless you select Include for Overall Calculation in the Policy, Metrics, Thresholds section of the user interface. When you do this, CA Server Automation does not provide false results when evaluating the state of the servers.
- How is overall utilization impacted by metric evaluations? The metric details provided in the tables help you understand how CA Server Automation evaluates the different metrics. Each metric has a method property set to either exact or complement. For an exact metric, a higher value indicates an increase in overall utilization, so a high value is the worse scenario and a low value is the better one. For a complement metric, a higher value indicates a decrease in overall utilization, so a high value is the better scenario and a low value is the worse one. For example, if the value of Memory: Percentage Committed Bytes In Use increases, overall utilization of the system increases. If the value of Memory: Available MB increases, overall utilization decreases.
- What are the default metrics? The default metric definitions are located in the metric list in the Filter section for all supported platforms. You can find the default metrics indicator on the metric list with the value Yes in the Default column. CA Server Automation uses this list to obtain the metric definitions when you add a new server. You can configure platforms, types, subtypes, instances, and the type of data to collect in the Filter section. The metric filter and definitions for each server are stored in the Performance DB.
- Is performance data currently available for my systems? By default, if data cannot be collected, CA Server Automation does not downgrade the server state, because a lack of data does not reflect server criticality. To determine whether metric data is being collected, review the Events list or select a specific system. If you need a more immediate indication, or if performance data is critical, you can configure CA Server Automation to change the state of a system automatically to Warning or Critical when performance data cannot be collected. To do so, modify the caaipconf.cfg file located in the CA Server Automation install_path\conf directory. Open the file with a text editor and locate the health state property as follows:
<property name="CONFIG_KEY_DEFAULT_HEALTH_STATE">
<!-- Valid values: 0 (Unknown); 5 (OK); 10 (Warning); 15 (Minor Failure); 20 (Major Failure); 25 (CriticalFailure) -->
<!-- Changes the value of HealthState for the CA_CollectionState object associated to the CA_ComputerSystem -->
<!-- If set to 30, CE will not set the HealthState. -->
<value>5</value>
<displayName>Default node health state when problem encountered in metric or data collection</displayName>
</property>
Change the content of the <value> element to one of the other supported values, such as 10 (Warning) or 25 (CriticalFailure), so that CA Server Automation reflects the desired state when performance data cannot be collected. For example:
<property name="CONFIG_KEY_DEFAULT_HEALTH_STATE">
<!-- Valid values: 0 (Unknown); 5 (OK); 10 (Warning); 15 (Minor Failure); 20 (Major Failure); 25 (CriticalFailure) -->
<!-- Changes the value of HealthState for the CA_CollectionState object associated to the CA_ComputerSystem -->
<!-- If set to 30, CE will not set the HealthState. -->
<value>10</value>
<displayName>Default node health state when problem encountered in metric or data collection</displayName>
</property>
Because the <value> element is now set to 10, systems that do not have performance data available are displayed in the Warning state in the CA Server Automation user interface.
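If you prefer to script the change rather than edit the file by hand, the following Python sketch shows one possible approach. It is an illustration only, not a tool supplied with the product. It assumes that the property shown above sits inside well-formed XML; the <config> root element in the sample and the set_default_health_state function name are hypothetical, and the overall layout of caaipconf.cfg may differ. Back up the file before changing it.

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "CONFIG_KEY_DEFAULT_HEALTH_STATE"
# Values taken from the comments inside the property itself;
# 30 means that CE does not set the HealthState.
ALLOWED_VALUES = {0, 5, 10, 15, 20, 25, 30}

# Hypothetical stand-in for the file contents. The <config> root is an
# assumption made for this sketch, not the documented file layout.
SAMPLE = """<config>
<property name="CONFIG_KEY_DEFAULT_HEALTH_STATE">
<!-- Valid values: 0 (Unknown); 5 (OK); 10 (Warning); 15 (Minor Failure); 20 (Major Failure); 25 (CriticalFailure) -->
<value>5</value>
<displayName>Default node health state when problem encountered in metric or data collection</displayName>
</property>
</config>"""


def set_default_health_state(xml_text, new_value):
    """Return xml_text with the health state property set to new_value."""
    if new_value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported health state: {new_value}")

    # insert_comments=True (Python 3.8+) keeps the explanatory XML comments.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)

    for prop in root.iter("property"):
        if prop.get("name") == PROPERTY_NAME:
            value = prop.find("value")
            if value is None:
                raise LookupError("property has no <value> element")
            value.text = str(new_value)
            return ET.tostring(root, encoding="unicode")

    raise LookupError(f"property {PROPERTY_NAME} not found")


if __name__ == "__main__":
    # Switch the default state from OK (5) to Warning (10).
    print(set_default_health_state(SAMPLE, 10))
```

The function rejects any value that is not listed in the property comments, which guards against typing errors such as 1 instead of 10.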
Note: For a list of performance metrics and descriptions, see the Reference Guide.
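The difference between the exact and complement methods described earlier can be illustrated with a short Python sketch. This topic does not give the formula that CA Server Automation uses to aggregate metrics, so the utilization_contribution function, the normalization against a maximum, and the sample maximums (100 percent and 8192 MB) are all hypothetical. The sketch shows only the direction in which each method moves overall utilization.

```python
def utilization_contribution(value, maximum, method):
    """Map a metric value to a 0.0-1.0 utilization contribution.

    Hypothetical normalization, not the product's actual calculation:
    - "exact": a higher value means higher utilization.
    - "complement": a higher value means lower utilization.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")

    # Clamp the fraction so out-of-range samples stay within 0.0-1.0.
    fraction = min(max(value / maximum, 0.0), 1.0)

    if method == "exact":
        return fraction
    if method == "complement":
        return 1.0 - fraction
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    # Memory: Percentage Committed Bytes In Use is an exact metric.
    print(utilization_contribution(40, 100, "exact"))
    print(utilization_contribution(80, 100, "exact"))

    # Memory: Available MB is a complement metric (8192 MB is assumed).
    print(utilization_contribution(2048, 8192, "complement"))
    print(utilization_contribution(6144, 8192, "complement"))
```

In this sketch, raising the committed bytes percentage raises the contribution, while raising the available memory lowers it, which matches the examples given for the two methods.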