The system rating combines the Memory_rating and the CPU_rating. When a job is ready for execution, the manager finds the node with the highest rating and sends the job to that node.
The algorithm for calculating ratings follows:
Total Rating = Cpu_rating + Memory_rating

Cpu_rating = Cpu_calculation * CPU_weight * (1 + ((number of CPUs - 1) * .2)) / number of CPUs

Memory_rating = Memory_calculation * Memory_weight

Cpu_calculation = ( %Free CPU * (VUPS per CPU) * #CPUs * COM-FACTOR ) + (3 * (CPUs * VUPS per CPU))

COM-FACTOR = 0.5 if the ((number of COM+CUR+COMO processes on the system)-1) is more than the number of CPUs
COM-FACTOR = 1.0 if the ((number of COM+CUR+COMO processes on the system)-1) is equal to the number of CPUs
COM-FACTOR = 1.5 if the ((number of COM+CUR+COMO processes on the system)-1) is less than the number of CPUs

Memory_calculation = (FREE_PAGES/TOTAL_PAGES) * MEG * 10

If Current Jobs is greater than or equal to MAX_JOBS, then rating = 0
CPU_weight and Memory_weight have values 0 through 1.
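The rating algorithm above can be sketched in Python to make the arithmetic easier to follow. This is an illustration only, not the product's implementation. The function and parameter names are hypothetical. The sketch also assumes that %Free CPU is passed in as a plain number in whatever units the product uses and that MEG is the node's memory size; the source does not define either.

```python
def com_factor(computable_processes: int, num_cpus: int) -> float:
    """COM-FACTOR, based on (COM+CUR+COMO processes - 1) versus the CPU count."""
    waiting = computable_processes - 1
    if waiting > num_cpus:
        return 0.5
    if waiting == num_cpus:
        return 1.0
    return 1.5


def cpu_rating(free_cpu, vups_per_cpu, num_cpus, computable_processes, cpu_weight):
    """Cpu_rating as given in the algorithm; cpu_weight is between 0 and 1."""
    factor = com_factor(computable_processes, num_cpus)
    cpu_calculation = (free_cpu * vups_per_cpu * num_cpus * factor) + (
        3 * (num_cpus * vups_per_cpu)
    )
    return cpu_calculation * cpu_weight * (1 + ((num_cpus - 1) * 0.2)) / num_cpus


def memory_rating(free_pages, total_pages, meg, memory_weight):
    """Memory_rating as given in the algorithm; memory_weight is between 0 and 1."""
    memory_calculation = (free_pages / total_pages) * meg * 10
    return memory_calculation * memory_weight


def total_rating(free_cpu, vups_per_cpu, num_cpus, computable_processes,
                 cpu_weight, free_pages, total_pages, meg, memory_weight,
                 current_jobs, max_jobs):
    """Total Rating; a node that has reached MAX_JOBS is rated 0."""
    if current_jobs >= max_jobs:
        return 0.0
    return cpu_rating(
        free_cpu, vups_per_cpu, num_cpus, computable_processes, cpu_weight
    ) + memory_rating(free_pages, total_pages, meg, memory_weight)
```

Setting both weights to 0 in this sketch gives a total rating of 0, which matches the documented way of removing a node from load balancing.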
To view the "VUPS per CPU" rating of the system, use the following command:
$SCHEDULE SHOW CPU_RATING
This command displays the CPU (VUPS) rating of the system used in the algorithm’s calculation. For example, the command returns the following for a DS20E:
$SCHEDULE SHOW CPU_RATING
Machine Name <XYZ>
Hardware Name <COMPAQ AlphaServer DS20E 5o00 MH>
Hardware Model <1921>
Rating <1000>
If the system does not have a CPU rating in the Job Management Manager CPU database, Job Management Manager returns a message saying the CPU is not rated. If you see this message, contact CA Technical Support and provide them with the text of the message. The following is a sample message for an unrated machine:
$SCHEDULE SHOW CPU_RATING
Machine Name <XYZ>
Hardware Name <COMPAQ AlphaServer DS20E 5o00 MH>
Hardware Model <1921> is not rated
When jobs are loaded, the manager identifies the node with the highest rating and loads the first job on that node. Then the manager checks to see which node has the highest rating and loads the next job onto that node. Each time the manager determines the node with the highest rating, it checks the node’s MAX_JOBS value (the maximum number of jobs allowed) before loading the job. Jobs are loaded on each node until that node’s MAX_JOBS value is reached.
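The loading sequence described above can be sketched as a simple loop. This is a hedged illustration rather than the manager's actual code. The `load_jobs` function, the node dictionary layout, and the `rate` callback are all hypothetical. The sketch also assumes that placement stops when no node can accept a job, which the source does not specify.

```python
def load_jobs(jobs, nodes, rate):
    """Place each job on the node that currently has the highest rating.

    nodes: dict mapping node name -> {"current_jobs": int, "max_jobs": int}
    rate:  callable(name, node) -> float, standing in for the rating routine
    Returns a list of (job, node_name) placements.
    """
    placements = []
    for job in jobs:
        # Re-evaluate every node before each placement, skipping nodes
        # that have reached MAX_JOBS or whose rating is 0.
        candidates = {}
        for name, node in nodes.items():
            if node["current_jobs"] >= node["max_jobs"]:
                continue
            rating = rate(name, node)
            if rating > 0:
                candidates[name] = rating
        if not candidates:
            break  # assumption: remaining jobs wait until a node is free
        best = max(candidates, key=candidates.get)
        nodes[best]["current_jobs"] += 1
        placements.append((job, best))
    return placements


# Example with fixed, made-up ratings for two nodes.
example_nodes = {
    "NODEA": {"current_jobs": 0, "max_jobs": 2},
    "NODEB": {"current_jobs": 0, "max_jobs": 1},
}
fixed_ratings = {"NODEA": 100.0, "NODEB": 50.0}
example_placements = load_jobs(
    ["JOB1", "JOB2", "JOB3", "JOB4"],
    example_nodes,
    lambda name, node: fixed_ratings[name],
)
```

In this example NODEA receives jobs until it reaches its MAX_JOBS value of 2, NODEB then receives one job, and the fourth job is left unplaced.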
A node’s rating is updated:
You can adjust how ratings are compared in the following way. Define the logical name NSCHED$LBAL_INSENSITIVE to a non-zero integer percentage value. When this logical is greater than zero, the load balancing algorithm treats all systems whose ratings are within the given percentage of each other as having the same rating. These systems are then load balanced using round-robin scheduling, which distributes the load more equitably among similarly rated systems. We recommend that you define this logical name cluster wide, or in the system table of each node in the cluster.
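The effect of the insensitivity percentage can be illustrated with a short sketch. The source does not say exactly how "within the given percent of each other" is computed, so this example assumes that a node ties with the top-rated node when its rating is no more than the given percentage below the top rating. The names `tied_with_best` and `RoundRobinPicker` are hypothetical.

```python
def tied_with_best(ratings, insensitive_pct):
    """Return the nodes treated as equal to the top-rated node.

    ratings: dict mapping node name -> rating; nodes rated 0 are not used.
    insensitive_pct: integer percentage; 0 or less disables the grouping.
    """
    usable = {name: r for name, r in ratings.items() if r > 0}
    if not usable:
        return []
    top = max(usable.values())
    if insensitive_pct > 0:
        threshold = top * (1 - insensitive_pct / 100)
    else:
        threshold = top
    # Sort names so the round-robin order is stable between calls.
    return sorted(name for name, r in usable.items() if r >= threshold)


class RoundRobinPicker:
    """Rotate through the nodes that are tied with the best rating."""

    def __init__(self):
        self._turn = 0

    def pick(self, ratings, insensitive_pct):
        tied = tied_with_best(ratings, insensitive_pct)
        if not tied:
            return None
        node = tied[self._turn % len(tied)]
        self._turn += 1
        return node
```

With ratings of 100, 95, and 60 and a value of 10, the first two nodes are treated as equal and take turns receiving jobs, while the third is only used if the others drop out.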
You can adjust when a rating is updated in the following way. Define the logical name NSCHED$LBAL$UPDATE_RATING to ON. The manager will decrease a node’s rating when a job is sent to it, so a node is aware of the number of jobs pending and the rating routine will take these jobs into consideration. We recommend you define this logical name cluster wide, or in the system table of each node in the cluster.
Note: We do not recommend setting the NSCHED$LBAL$INTERVAL calculation interval below 5 seconds. At shorter intervals, the resources used to calculate the ratings become high, and the return on these resources is low.
The rating is calculated only on the node where the job state changes. If you want to use load balancing on a particular node, you will get an accurate rating for it after the job is started, or after its regular update.
There is no minimum rating a node must have in order to have a job started on it. To remove a node from load balancing, set the rating weights for that node to 0. If the rating is 0, the node is not used in load balancing. When a node has reached its MAX_JOBS value, its rating is zero.
Copyright © 2012 CA. All rights reserved.