Assume that you would like to be even more proactive in detecting potential bottlenecks on disks. You might want to reduce the global threshold on disk queue length (which is used as an initial condition in many disk rules) from 1.0 to 0.66, and see how many more rule firings you get as a result of this change.
The list of all Performance Manager thresholds is presented in the Performance Manager Thresholds table at the beginning of this chapter. Implementing a change like this is relatively straightforward: add the following line to your MYRULES.VPR file:
Threshold TD_DISK_QL_MAX = 0.66 EndThreshold
Assume that you would like to make this change, plus raise the threshold of free space remaining on a disk from 5 percent to 10 percent (for use in the rule that you just finished adding in the last section). To change more than one threshold, you might want to change your format within MYRULES.VPR to the following, for greater clarity:
Threshold
  TD_DISK_QL_MAX = 0.66
  TD_MIN_DSKSPC_PCT = 0.10
EndThreshold
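If you maintain several overrides, it can be convenient to generate the block from a small table instead of editing it by hand. The following is a minimal Python sketch and is not part of Performance Manager: the helper name `format_threshold_block` is hypothetical, and the one-assignment-per-line layout is an assumption based on the Threshold ... EndThreshold construct shown above.

```python
def format_threshold_block(overrides):
    """Return a Threshold ... EndThreshold block for MYRULES.VPR.

    `overrides` maps threshold names to their new values.
    Hypothetical helper; the layout is an assumption, not a documented format.
    """
    lines = ["Threshold"]
    for name, value in overrides.items():
        # Two decimal places matches the precision used in this chapter.
        lines.append(f"  {name} = {value:.2f}")
    lines.append("EndThreshold")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_threshold_block({
        "TD_DISK_QL_MAX": 0.66,
        "TD_MIN_DSKSPC_PCT": 0.10,
    }))
```

The printed text can then be pasted into MYRULES.VPR.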
There are other cases when a threshold might need to be changed. For example, if you own older RF31 disk drives, you should change the RF31 threshold values, as they now reflect the performance of the newer RF31T disk. (The internal model number used by OpenVMS to identify disk types is no longer unique: this is the first case of an ID number being re-used for a newer disk, but more recycling of IDs is expected in the future.)
Also, certain disks may be able to process many more I/Os per second than the given disk thresholds indicate, if they can make effective use of their (embedded or HSC-based) disk cache. Suppose you have many I/O rule firings but, upon investigation, find relatively low queue lengths on disks that are processing I/Os at a much higher rate than shown in the Performance Manager Thresholds table. In that case, you might want to increase your disk operations rate thresholds to account for this performance and eliminate these extraneous rule firings.
Suggested numbers for these scenarios are as follows:
TD_T43_RF30 = 31        TD_T48_RZ22 = 40
TD_T44_RF71 = 31        TD_T49_RZ23 = 41
TD_T56_RF31 = 96        TD_T50_RZ24 = 51
TD_T57_RF72 = 55        TD_T51_RZ55 = 54
TD_T75_RFH31 = 96       TD_T59_RZ25 = 65 (est.)
TD_T76_RFH72 = 55       TD_T60_RZ56 = 58
TD_T77_RF73 = 63        TD_T61_RZ57 = 62
TD_T78_RFH73 = 63       TD_T66_RZ23L = 47
TD_T81_RF35 = 87        TD_T68_RZ57I = 62
TD_T82_RFH35 = 87       TD_T70_RZ58 = 70 (est.)
TD_T83_RF31F = 56       TD_T84_RZ72 = 63 (est.)
For the older RF31:     TD_T85_RZ73 = 63
TD_T56_RF31 = 51        TD_T86_RZ35 = 87
TD_T75_RFH31 = 51       TD_T87_RZ24L = 55 (est.)
                        TD_T88_RZ25L = 62 (est.)
With HSC caching:       TD_T89_RZ55L = 59 (est.)
TD_T80_RA71 = 97        TD_T90_RZ56L = 60 (est.)
TD_T79_RA72 = 97        TD_T91_RZ57L = 62 (est.)
TD_T92_RA73 = 102       TD_T93_RZ26 = 87
                        TD_T94_RZ36 = 88 (est.)
                        TD_T95_RZ74 = 78 (est.)
                        TD_T99_RZ27 = 88 (est.)
You might want to change processor-specific thresholds (in the Performance Manager Thresholds table). To learn what default values are in effect for your system, you can produce a dump report for a single two-minute interval as follows:
$ ADVISE COLLECT REPORT DUMP_DATACELLS -
_$ /BEGINNING=hh:mm/ENDING=hh:mm+2/NODE_NAMES=node1
Take the actual hours and minutes from the /BEGINNING value, add two minutes, and enter the result in hh:mm format. Do not type +2 literally as part of the command.
Look up the values reported for the data cells CPU_VUP_RATING, COM_SCALING, SOFT_FAULT_SCALING, HARD_FAULT_SCALING, and IMG_ACT_RATE_SCALING. Because the DUMP_DATACELLS report produces voluminous output for each interval, and the processor-specific thresholds do not change, generate this report for only one two-minute interval to learn their fixed values.
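The two-minute arithmetic is easy to get wrong near the top of an hour. As an illustration only, the following Python sketch computes the ending time from a beginning time. The function name `ending_time` is hypothetical, and the sketch assumes both values are plain 24-hour hh:mm strings on the same day; an interval that crosses midnight would also need a date, which this sketch does not handle.

```python
from datetime import datetime, timedelta


def ending_time(beginning, minutes=2):
    """Return `beginning` (hh:mm, 24-hour) advanced by `minutes`, as hh:mm."""
    start = datetime.strptime(beginning, "%H:%M")
    return (start + timedelta(minutes=minutes)).strftime("%H:%M")


if __name__ == "__main__":
    begin = "09:59"
    # Prints the qualifier text to use on the ADVISE command line:
    # /BEGINNING=09:59/ENDING=10:01
    print(f"/BEGINNING={begin}/ENDING={ending_time(begin)}")
```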
These scaling factors are multiplied by the values shown in the following examples before being applied in rule condition-checking:
If you decide that you would like to refine the scaling factors to more precisely reflect typical activity on your system, then you need to know your system's hardware model ID number. To get this, enter the following at the DCL prompt:
$ n = f$getsyi("hw_model")
$ show symbol n
Assume that this returns the number 230. Then, to change the threshold for the number of jobs in the Compute queue from its default value of 1.30 to 1.45, add another line to MYRULES.VPR in the following format:
Threshold TD_COM_SCALING_230 = 1.45 EndThreshold
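The threshold name in this line is the scaling-factor name with the hardware model ID appended. The following Python sketch, with the hypothetical helper `scaling_override`, shows how the line is assembled from the model ID returned by f$getsyi. Only the TD_COM_SCALING name is taken from this chapter; passing any other prefix is an assumption to check against the Performance Manager Thresholds table.

```python
def scaling_override(hw_model, value, prefix="TD_COM_SCALING"):
    """Build a one-line threshold override for a specific hardware model.

    hw_model: the number returned by f$getsyi("hw_model"), for example 230.
    prefix:   threshold name without the model suffix; only TD_COM_SCALING
              is confirmed by the text, other prefixes are assumptions.
    """
    return f"Threshold {prefix}_{hw_model} = {value:.2f} EndThreshold"


if __name__ == "__main__":
    # Prints: Threshold TD_COM_SCALING_230 = 1.45 EndThreshold
    print(scaling_override(230, 1.45))
```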
Copyright © 2008 CA.
All rights reserved.