6. DATA SOURCES › 6.4 Query CP Parallelism Considerations › 6.4.2 Parallel Tasks Data Analysis › 6.4.2.2 Data Analysis Guidelines › 6.4.2.2.2 Monitoring Query CP Parallelism
6.4.2.2.2 Monitoring Query CP Parallelism
In addition to the user reporting guidelines provided in the
previous section, the analyst has to address the internal DB2
perspective of parallel processing. This includes several
tasks like identifying potentially good candidates for query
CP parallelism, monitoring balance, and overall performance
of parallel subtasks executions.
1 - Identify Potential Candidates for CP Parallelism
2 - Analyzing CP Parallelism Performance
3 - Parallelism Reduction and Degradation
6.4.2.2.2.1 Identify Potential Candidates for CP Parallelism
Knowing the profile of your DB2 sequential queries before
enabling them for CP parallelism is important because only
processor-intensive queries will benefit from concurrent
executions.
The type of DB2 queries that consume large amounts of
processor time include:
- Queries involving large volumes of data scans, but whose
selection criteria only return a few rows.
- Queries accessing long data rows with a very low rows per
page ratio.
- Queries containing aggregate column functions (like AVG,
SUM, etc.), which generally return a single result after
scanning multiple rows.
- Queries involving large data sorts.
The commonly used criteria to determine if a query is
CP-intensive is the ratio of the total TCB time to the total
elapsed time. The following computation, from the DB2 User
Activity file (DB2DSU) provides this ratio:
CP_RATIO = ( DSUTCBTM / DSUELPTM ) * 100
If CP_RATIO is less than 30%, the query will not take
advantage of CP parallelism.
If CP_RATIO is between 30% and 70%, the query is
well-balanced and enabling it for CP parallelism may
slightly improve its performance.
If CP_RATIO is greater than 70%, then the query is
processor-intensive and is a good candidate for CP
parallelism.
6.4.2.2.2.2 Analyzing CP Parallelism Performance
In this section, you are presented with a list of questions
and explanations that should be taken into account when
optimizing a DB2 query after enabling parallelism:
1. Did the overall performance of the query improve?
-------------------------------------------------
Element or Computation: DSUELPTM vs. DSUTELPT
Explanation: The difference between these two data
elements is that DSUELPTM represents the
elapsed time of only the parent task,
which is the actual query response time
while DSUTELPT is the accumulation of all
parallel tasks, the parent plus all child
subtasks. Thus, DSUTELPT is equivalent
to the total query elapsed time, when
executed sequentially.
If DSUELPTM is significantly lower than
DSUTELPT, the goal is achieved.
2. How many parallel tasks were generated for this query?
------------------------------------------------------
Element or Computation: DSUNTHRD + DSUPCNT
Explanation: When a query is enabled for CP
parallelism, instead of creating a unique
SMF type 101 record for the complete
thread, DB2 generates one record for the
originating parent task and either one
additional rollup record containing
summary information for all the parallel
child subtasks (when the DB2 parameter
PTASKROL is set to YES) or one record per
individual parallel subtask executed
(when the DB2 parameter PTASKROL is set
to NO). When CA MICS detects that a
record belongs to a parallel query, it
sets these two variables for the parent
record: DSUNTHRD is set to one and
DSUPCNT is set to the number of parallel
subtasks generated. For subtasks
record(s), they are both forced to zero.
For example, if an originating task
creates 8 parallel subtasks, the total
number of tasks executed for the query
will be 9 (1 in DSUNTHRD and 8 in
DSUPCNT). This is very important when
you want to derive average measurements
per task.
3. What is the average TCB time per parallel task?
-----------------------------------------------
Element or Computation: DSUTCBTM
------------------
DSUNTHRD + DSUPCNT
Explanation: DSUTCBTM contains the sum of all parallel
tasks TCB time. This includes the parent
task and all child subtasks.
4. What is the average elapsed time per parallel task?
---------------------------------------------------
Element or Computation: DSUTELPT
------------------
DSUNTHRD + DSUPCNT
Explanation: You must use DSUTELPT, which is the only
data element containing the accumulated
elapsed times of all parallel tasks,
including the parent task, and not
DSUELPTM, which only represents the
elapsed time of the parent task.
5. Is parallelism well balanced across all subtasks?
-------------------------------------------------
Element or Computation: DB2PTASK
Analyze in detail how each parallel subtask performs
compared to its counterpart. This task should be
carefully planned and should only be performed for a
short-term, as it may have severe implications on your
data center operations and CA MICS daily processes.
By default, the PTASKROL value of the DB2 system
parameter is YES, which means DB2 rolls up all child
tasks information into a single SMF type 101 record.
If you want task level information, you will have to
change the PTASKROL value to NO, which instructs DB2 to
write a SMF type 101 record for each individual
parallel subtask. This action may flood your
installation's SMF data sets and lead to severe
operational problems.
Once the SMF data is created, you will have to store it
in the CA MICS database. This can be done by activating
the DB2DSU file in the DETAIL timespan (by default, the
DB2DSU file is inactive). Again, this may lead to disk
space problems during your CA MICS daily processing.
However, you can use the TAPEDSU facility to create the
detail file on tape, rather than disk.
Data element DB2PTASK, Special Transaction Type
Identifier, allows you to determine if the observation
represents the activity of the parent task or of the
child subtask activity. For more information, see the
DB2PTASK data dictionary entry.
6.4.2.2.2.3 Parallelism Reduction and Degradation
In DB2, the optimal number of parallel operations or tasks
used to access data in order to achieve minimum elapsed time
is initially determined at bind time, and later adjusted when
the query is executed. If the query cannot be run at the
planned degree, as determined by DB2, parallelism will
execute at a reduced degree or will degrade to sequential
operations.
This section summarizes the reasons for parallelism reduction
and degradation, and provides you with some related CA MICS
data elements, when available, that you may find useful for
monitoring.
The easiest way to verify if parallelism ran at the planned
degree is to compare the values of the following data
elements:
o DSUTOGRP: Parallel Groups Executed
o DSUNOGRP: Parallel Groups Exec Planned Degree
A parallel group is a set of parallel operations. A single
query can have several parallel groups, but each group within
the query is identified by its own unique number.
If the number of parallel groups executed at the planned
degree is not equal to the total number of parallel groups
executed, then parallelism reduction or/and degradation
occurred at some point for one of the following reasons:
o Lack of Virtual Buffer Pool Space Availability
----------------------------------------------
Element(s) : DSUDGBUF, DSURDGRP, DSURPOP2, DSUSKIP
Explanation: Non-zero values for any of these data
elements is an indicator of a storage
shortage or contention on the buffer pool
that affected parallelism execution. You
may want to increase the current buffer
pool using the ALTER BUFFERPOOL
statement, or assign table spaces
accessed by this query to a different
buffer pool using the ALTER TABLESPACE
statement.
o Runtime Host Variable Values
-----------------------------
Element(s) : Not Available
Explanation: In an application program, a variable
that is referenced by embedded SQL
statements can determine the qualifying
partitions of a table for a given query.
In this case, determination of optimum
parallelism degree is deferred until
runtime, when the host variable value is
known.
o Updateable Cursor
----------------
Element(s) : DSUDGCUR
Explanation: A non-zero value in this data element
indicates that DB2 determined, at
runtime, that an ambiguous cursor was
updateable, and that the parallel groups
fell back to sequential mode. This
situation can occur when a query is bound
with the CURRENDATA(YES) option, and if
DB2 cannot tell if the cursor is
read-only. To ensure DB2 will consider
parallelism for ambiguous cursors, you
can choose the CURRENDATA(NO) bind
option. However, it is best to always
prevent such a situation, by indicating
when a cursor is read-only on the DECLARE
CURSOR statement.
o Lack of Proper Hardware and Software Support
--------------------------------------------
Element(s) : DSUDGESA, DSUEGENC
Explanation: If the hardware sort facility is not
available at runtime, and a sort merge
join is needed, DSUDGESA will have a non-
zero value. Similarly, if enclave
services are unavailable, DSUEGENC will
have a non-zero value. Enclave services
are required for distributed requests,
because the parallel tasks run under an
enclave SRB, unlike non-distributed
requests, for which parallel tasks run
under a client SRB.
o Changes in the Configuration of Online Processors
-------------------------------------------------
Element(s) : DSURPOP1
Explanation: DB2 can reduce the parallelism degree if
there are fewer online central processors
(or, for sysplex parallelism, DB2
members) available at runtime than at
bind time. DSURPOP1 has to be used as a
potential problem indicator only if
parallelism did not run at the planned
degree, because a non-zero value does not
necessarily mean that a reduction
occurred. If the change in configuration
results in an increase of the processing
power, DB2 can increase the parallel
degree to take best advantage of this
updated configuration.
o Disabled by the Resource Limit Facility (RLF)
---------------------------------------------
Element(s) : DSURLFDP
Explanation: This data element provides the number of
threads for which query parallelism was
disabled by the RLF.
o Execution Limited to a Single DB2 Subsystem
------------------------------------------
Element(s) : DSUORNO, DSUORR, DSUPGDTT
Explanation: These data elements indicate that
parallel groups executed on a single DB2
subsystem due to one of the following
reasons:
- For DSUORNO: Either the COORDINATOR
subsystem parameter was set to YES at
bind time, but the parameter is set to
NO at runtime; or the plan or package
was bound on a DB2 with the COORDINATOR
subsystem parameter set to YES, but the
program is being run on a different DB2
that has the COORDINATOR value set to
NO.
- For DSUORR: The plan or package was
bound with an isolation value of
REPEATABLE-READ or READ-STABILITY.
- For DSUPGDTT: Sysplex parallelism was
downgraded to CP parallelism for
parallel groups because they referenced
a user-defined function (UDF), and a
declared temporary table was detected
at runtime.