Structured Data | There Is No Time Like '%NOW%' To Use Dynamic Sampling

I recently came across a query in which the Optimizer was making a poor cardinality estimate, which in turn caused inefficient join type, which in turn caused the query to run excessively long. This post is a reenactment of my troubleshooting.

The Suspect SQL

The original SQL was quite large and had a fairly complex plan so I simplified it down to this test case for the purpose of this blog post:

select [...]
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       al1.business_type in ('002', '003', '007', '009') and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'));

----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |     9 |
|   3 |    PARTITION HASH ALL                |                     |     9 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |     9 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%TEST%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

Looking at the plan I would surmise that the cardinality estimate for the fact table (line 4) must surely be greater than 9. To find out how far the Optimizer’s guess is off, I ran this query which contained all the filter predicates for FACT_TABLE:

select count(*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'));

  COUNT(*)
----------
 1,324,510

As you can see, the cardinality estimate in this case is way off. The Optimizer estimated 9 rows and in reality the query returns 1.3 million rows, only a difference of 6 orders of magnitude (10^6 or 1,000,000). How could this be? Let’s try and understand why and where the cardinality estimate went wrong.

Bite Size Chunks

I find the easiest way to debug these issues is to use start with one predicate then add one predicate at a time, noting the cardinality estimate and comparing it to the actual cardinality value.

One Predicate, Two Predicate, Red Predicate, Blue Predicate

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003'
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));

--------------------------------------------------------------
| Id  | Operation              | Name                | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |     1 |
|   1 |  SORT AGGREGATE        |                     |     1 |
|   2 |   PARTITION LIST SINGLE|                     |   141M|
|   3 |    PARTITION HASH ALL  |                     |   141M|
|   4 |     TABLE ACCESS FULL  | FACT_TABLE          |   141M|
--------------------------------------------------------------

   COUNT(*)
-----------
141,821,991

Looks good so far. REGION is the list partition key so we’d expect an accurate estimate.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010'
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));

------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  |
------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |     1 |
|   1 |  SORT AGGREGATE               |            |     1 |
|   2 |   PARTITION LIST SINGLE       |            |  1456K|
|   3 |    PARTITION HASH ALL         |            |  1456K|
|   4 |     BITMAP CONVERSION COUNT   |            |  1456K|
|*  5 |      BITMAP INDEX SINGLE VALUE| FACT_BX10  |       |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("AL1"."ORDER_TYPE"='Z010')

  COUNT(*)
----------
 1,324,642

No issues here. 1,456,000 statistically equivalent to 1,324,642.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));

----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     | 72803 |
|   3 |    PARTITION HASH ALL                |                     | 72803 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          | 72803 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

  COUNT(*)
----------
 1,324,642

With the addition of the NOT LIKE predicate we start to see a bit of a difference. This plan has a 20x reduction from the previous cardinality estimate (1,456,000/72,803 = 20). Let’s add one more NOT LIKE predicate and see what we get.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));

----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |  3640 |
|   3 |    PARTITION HASH ALL                |                     |  3640 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |  3640 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%')
   6 - access("AL1"."ORDER_TYPE"='Z010')


  COUNT(*)
----------
 1,324,642

With the addition of a second NOT LIKE predicate the cardinality estimate has dropped to 3,640 from 72,803, a 20x reduction.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));

----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |   182 |
|   3 |    PARTITION HASH ALL                |                     |   182 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |   182 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

   COUNT(*)
----------
 1,324,642

With the addition of the third NOT LIKE predicate the cardinality estimate has dropped from 3,640 to 182, another 20x reduction. Looks like we may have found the issue. Each NOT LIKE predicate appears to result in a 20x reduction (5% selectivity) from the previous estimate. The original query had all four NOT LIKE predicates on it and had a cardinality estimate of 9. If we work the math: 182 * 5% = 9.

Looking at the query we can see there are four NOT LIKE predicates each with a leading and trailing wild card (%). Since DBMS_STATS does not gather table column information on parts of strings, each of the NOT LIKE predicates will have a default selectivity guess of 5%. Given this query has four NOT LIKE predicates, the total reduction for those four predicates will be 5%^4 = 1/160,000 = 0.00000625 which is quite significant, and in this case not representative and the root cause of the original query’s suboptimal access and join type.

Dynamic Sampling To The Rescue

Dynamic Sampling was designed with cases like this in mind. That is, cases where the Optimizer has to resort to selectivity guesses and could very well guess poorly. Substring guessing is not simply done as the substring could appear anywhere in the string. Let’s see what the cardinality estimate is when I add a dynamic_sampling hint to the query.

explain plan for
select /*+ dynamic_sampling(al1 4) */
       count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE NOTE'));

--------------------------------------------------------------
| Id  | Operation              | Name                | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |     1 |
|   1 |  SORT AGGREGATE        |                     |     1 |
|   2 |   PARTITION LIST SINGLE|                     |  1606K|
|   3 |    PARTITION HASH ALL  |                     |  1606K|
|*  4 |     TABLE ACCESS FULL  | FACT_TABLE          |  1606K|
--------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."ORDER_TYPE"='Z010' AND
              "AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%TEST%')

Note
-----
   - dynamic sampling used for this statement

With a level 4 dynamic_sampling hint the Optimizer estimates 1.6 million rows, very close to the actual value of 1.3 million rows. This estimate is surely close enough to give us the optimal access and join type such that the original query should perform optimally.

Summary

There are case where the Optimizer guesses and sometimes can guess poorly. When this is the case, dynamic sampling can be used to give the Optimizer a better cardinality guess. Dynamic sampling is probably best suited for queries that run minutes or longer, such that the overhead of the dynamic sampling query is only a fraction of the total run time. In these cases, the minimal overhead of dynamic sampling can well out weigh the cost of a suboptimal plan.

All tests were performed on 11.1.0.6.0.