DB2 uses a cost-based optimizer, and that optimizer needs accurate statistical information about your data. Collecting the proper statistics is necessary for good performance. With each new version of DB2, the optimizer better utilizes catalog statistics. This point means that with each new version, DB2 is more dependent on catalog statistics. You should have statistics on every column referenced in every WHERE clause. If you are using parameter markers and host variables, you need at least cardinality statistics. If you have skewed data or you are using literal values in your SQL statements, you need frequency distribution or histogram statistics. If you suspect columns are correlated, gather column correlation statistics.
To determine if columns are correlated, run these two queries:
SELECT COUNT (DISTINCT CITY) AS CITYCNT *
COUNT (DISTINCT STATE) AS STATECNT
FROM CUSTOMER
SELECT COUNT (*) FROM
(SELECT DISTINCT CITY, STATE
FROM CUSTOMER) AS FINLCNT
If the number from the second query is lower than the number from the first query, the columns are correlated.
You can also run GROUP BY queries against tables for columns used in predicates to count the occurrences of values in these columns. These counts indicate whether you need frequency distribution statistics or histogram statistics, and run-time reoptimization for skewed data distributions.
|
Copyright © 2013 CA Technologies.
All rights reserved.
|
|