Previous Topic: 6.4.2.2.1 Reporting on User Activity

Next Topic: 6.5 ACCUMACC Roll-up Records

6.4.2.2.2 Monitoring Query CP Parallelism

In addition to the user reporting guidelines provided in the
previous section, the analyst has to address the internal DB2
perspective of parallel processing.  This includes several
tasks like identifying potentially good candidates for query
CP parallelism, monitoring balance, and overall performance
of parallel subtasks executions.

  1 - Identify Potential Candidates for CP Parallelism
  2 - Analyzing CP Parallelism Performance
  3 - Parallelism Reduction and Degradation
6.4.2.2.2.1 Identify Potential Candidates for CP Parallelism

Knowing the profile of your DB2 sequential queries before
enabling them for CP parallelism is important because only
processor-intensive queries will benefit from concurrent
executions.

The type of DB2 queries that consume large amounts of
processor time include:

  - Queries involving large volumes of data scans, but whose
    selection criteria only return a few rows.

  - Queries accessing long data rows with a very low rows per
    page ratio.

  - Queries containing aggregate column functions (like AVG,
    SUM, etc.), which generally return a single result after
    scanning multiple rows.

  - Queries involving large data sorts.

The commonly used criteria to determine if a query is
CP-intensive is the ratio of the total TCB time to the total
elapsed time.  The following computation, from the DB2 User
Activity file (DB2DSU) provides this ratio:

  CP_RATIO = ( DSUTCBTM / DSUELPTM ) * 100

  If CP_RATIO is less than 30%, the query will not take
  advantage of CP parallelism.

  If CP_RATIO is between 30% and 70%, the query is
  well-balanced and enabling it for CP parallelism may
  slightly improve its performance.

  If CP_RATIO is greater than 70%, then the query is
  processor-intensive and is a good candidate for CP
  parallelism.
6.4.2.2.2.2 Analyzing CP Parallelism Performance

In this section, you are presented with a list of questions
and explanations that should be taken into account when
optimizing a DB2 query after enabling parallelism:

  1.  Did the overall performance of the query improve?
      -------------------------------------------------

      Element or Computation:  DSUELPTM vs. DSUTELPT

      Explanation:  The difference between these two data
                    elements is that DSUELPTM represents the
                    elapsed time of only the parent task,
                    which is the actual query response time
                    while DSUTELPT is the accumulation of all
                    parallel tasks, the parent plus all child
                    subtasks.  Thus, DSUTELPT is equivalent
                    to the total query elapsed time, when
                    executed sequentially.

                    If DSUELPTM is significantly lower than
                    DSUTELPT, the goal is achieved.

  2.  How many parallel tasks were generated for this query?
      ------------------------------------------------------

      Element or Computation:  DSUNTHRD + DSUPCNT

      Explanation:  When a query is enabled for CP
                    parallelism, instead of creating a unique
                    SMF type 101 record for the complete
                    thread, DB2 generates one record for the
                    originating parent task and either one
                    additional rollup record containing
                    summary information for all the parallel
                    child subtasks (when the DB2 parameter
                    PTASKROL is set to YES) or one record per
                    individual parallel subtask executed
                    (when the DB2 parameter PTASKROL is set
                    to NO).  When CA MICS detects that a
                    record belongs to a parallel query, it
                    sets these two variables for the parent
                    record:  DSUNTHRD is set to one and
                    DSUPCNT is set to the number of parallel
                    subtasks generated.  For subtasks
                    record(s), they are both forced to zero.
                    For example, if an originating task
                    creates 8 parallel subtasks, the total
                    number of tasks executed for the query
                    will be 9 (1 in DSUNTHRD and 8 in
                    DSUPCNT).  This is very important when
                    you want to derive average measurements
                    per task.

  3.  What is the average TCB time per parallel task?
      -----------------------------------------------

      Element or Computation:      DSUTCBTM
                               ------------------
                               DSUNTHRD + DSUPCNT

      Explanation:  DSUTCBTM contains the sum of all parallel
                    tasks TCB time. This includes the parent
                    task and all child subtasks.

  4.  What is the average elapsed time per parallel task?
      ---------------------------------------------------

      Element or Computation:      DSUTELPT
                               ------------------
                               DSUNTHRD + DSUPCNT

      Explanation:  You must use DSUTELPT, which is the only
                    data element containing the accumulated
                    elapsed times of all parallel tasks,
                    including the parent task, and not
                    DSUELPTM, which only represents the
                    elapsed time of the parent task.

  5.  Is parallelism well balanced across all subtasks?
      -------------------------------------------------

      Element or Computation:      DB2PTASK

      Analyze in detail how each parallel subtask performs
      compared to its counterpart.  This task should be
      carefully planned and should only be performed for a
      short-term, as it may have severe implications on your
      data center operations and CA MICS daily processes.

      By default, the PTASKROL value of the DB2 system
      parameter is YES, which means DB2 rolls up all child
      tasks information into a single SMF type 101 record.
      If you want task level information, you will have to
      change the PTASKROL value to NO, which instructs DB2 to
      write a SMF type 101 record for each individual
      parallel subtask.  This action may flood your
      installation's SMF data sets and lead to severe
      operational problems.

      Once the SMF data is created, you will have to store it
      in the CA MICS database. This can be done by activating
      the DB2DSU file in the DETAIL timespan (by default, the
      DB2DSU file is inactive).  Again, this may lead to disk
      space problems during your CA MICS daily processing.
      However, you can use the TAPEDSU facility to create the
      detail file on tape, rather than disk.

      Data element DB2PTASK, Special Transaction Type
      Identifier, allows you to determine if the observation
      represents the activity of the parent task or of the
      child subtask activity.  For more information, see the
      DB2PTASK data dictionary entry.
6.4.2.2.2.3 Parallelism Reduction and Degradation

In DB2, the optimal number of parallel operations or tasks
used to access data in order to achieve minimum elapsed time
is initially determined at bind time, and later adjusted when
the query is executed.  If the query cannot be run at the
planned degree, as determined by DB2, parallelism will
execute at a reduced degree or will degrade to sequential
operations.

This section summarizes the reasons for parallelism reduction
and degradation, and provides you with some related CA MICS
data elements, when available, that you may find useful for
monitoring.

The easiest way to verify if parallelism ran at the planned
degree is to compare the values of the following data
elements:

o  DSUTOGRP:  Parallel Groups Executed

o  DSUNOGRP:  Parallel Groups Exec Planned Degree

A parallel group is a set of parallel operations.  A single
query can have several parallel groups, but each group within
the query is identified by its own unique number.

If the number of parallel groups executed at the planned
degree is not equal to the total number of parallel groups
executed, then parallelism reduction or/and degradation
occurred at some point for one of the following reasons:

   o  Lack of Virtual Buffer Pool Space Availability
      ----------------------------------------------

      Element(s) :  DSUDGBUF, DSURDGRP, DSURPOP2, DSUSKIP

      Explanation:  Non-zero values for any of these data
                    elements is an indicator of a storage
                    shortage or contention on the buffer pool
                    that affected parallelism execution.  You
                    may want to increase the current buffer
                    pool using the ALTER BUFFERPOOL
                    statement, or assign table spaces
                    accessed by this query to a different
                    buffer pool using the ALTER TABLESPACE
                    statement.

   o  Runtime Host Variable Values
      -----------------------------

      Element(s) :  Not Available

      Explanation:  In an application program, a variable
                    that is referenced by embedded SQL
                    statements can determine the qualifying
                    partitions of a table for a given query.
                    In this case, determination of optimum
                    parallelism degree is deferred until
                    runtime, when the host variable value is
                    known.

   o  Updateable Cursor
      ----------------

      Element(s) :  DSUDGCUR

      Explanation:  A non-zero value in this data element
                    indicates that DB2 determined, at
                    runtime, that an ambiguous cursor was
                    updateable, and that the parallel groups
                    fell back to sequential mode.  This
                    situation can occur when a query is bound
                    with the CURRENDATA(YES) option, and if
                    DB2 cannot tell if the cursor is
                    read-only.  To ensure DB2 will consider
                    parallelism for ambiguous cursors, you
                    can choose the CURRENDATA(NO) bind
                    option.  However, it is best to always
                    prevent such a situation, by indicating
                    when a cursor is read-only on the DECLARE
                    CURSOR statement.

   o  Lack of Proper Hardware and Software Support
      --------------------------------------------

      Element(s) :  DSUDGESA, DSUEGENC

      Explanation:  If the hardware sort facility is not
                    available at runtime, and a sort merge
                    join is needed, DSUDGESA will have a non-
                    zero value.  Similarly, if enclave
                    services are unavailable, DSUEGENC will
                    have a non-zero value.  Enclave services
                    are required for distributed requests,
                    because the parallel tasks run under an
                    enclave SRB, unlike non-distributed
                    requests, for which parallel tasks run
                    under a client SRB.

   o  Changes in the Configuration of Online Processors
      -------------------------------------------------

      Element(s) :  DSURPOP1

      Explanation:  DB2 can reduce the parallelism degree if
                    there are fewer online central processors
                    (or, for sysplex parallelism, DB2
                    members) available at runtime than at
                    bind time.  DSURPOP1 has to be used as a
                    potential problem indicator only if
                    parallelism did not run at the planned
                    degree, because a non-zero value does not
                    necessarily mean that a reduction
                    occurred.  If the change in configuration
                    results in an increase of the processing
                    power, DB2 can increase the parallel
                    degree to take best advantage of this
                    updated configuration.

   o  Disabled by the Resource Limit Facility (RLF)
      ---------------------------------------------

      Element(s) :  DSURLFDP

      Explanation:  This data element provides the number of
                    threads for which query parallelism was
                    disabled by the RLF.

   o  Execution Limited to a Single DB2 Subsystem
      ------------------------------------------

      Element(s) :  DSUORNO, DSUORR, DSUPGDTT

      Explanation:  These data elements indicate that
                    parallel groups executed on a single DB2
                    subsystem due to one of the following
                    reasons:

                    - For DSUORNO:  Either the COORDINATOR
                      subsystem parameter was set to YES at
                      bind time, but the parameter is set to
                      NO at runtime; or the plan or package
                      was bound on a DB2 with the COORDINATOR
                      subsystem parameter set to YES, but the
                      program is being run on a different DB2
                      that has the COORDINATOR value set to
                      NO.

                    - For DSUORR:  The plan or package was
                      bound with an isolation value of
                      REPEATABLE-READ or READ-STABILITY.

                    - For DSUPGDTT:  Sysplex parallelism was
                      downgraded to CP parallelism for
                      parallel groups because they referenced
                      a user-defined function (UDF), and a
                      declared temporary table was detected
                      at runtime.