对越界值的查询可能影响ORACLE的执行计划
时间:2010-07-15 来源:TOMSYAN
遇到一个问题:对不存在的值的查询可能影响ORACLE的执行计划 。
问题重现如下:
SQL> conn ysp/ysp
Connected.
SQL> DROP TABLE TEST PURGE;
Table dropped.
SQL> CREATE TABLE TEST(ID CHAR(8),COLNO VARCHAR2(30),NAME VARCHAR2(8),CONTENTS CHAR(500));
Table created.
SQL> INSERT INTO TEST SELECT LPAD(TRUNC(DBMS_RANDOM.VALUE(1,11)),8,'0'),RPAD(REVERSE(OBJECT_NAME),30,'X'),SUBSTR(OBJECT_NAME,1,4),OBJECT_NAME||'***'||OBJECT_TYPE FROM ALL_OBJECTS ;
50587 rows created.
SQL> COMMIT;
Commit complete.
SQL> SELECT DISTINCT ID FROM TEST ORDER BY 1;
ID
----------------
00000001
00000002
00000003
00000004
00000005
00000006
00000007
00000008
00000009
00000010
10 rows selected.
SQL> CREATE INDEX TEST_IDX_01 ON TEST(ID,NAME);
Index created.
SQL> CREATE INDEX TEST_IDX_02 ON TEST(ID,COLNO);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'test',cascade=>true);
PL/SQL procedure successfully completed.
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TEST';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TEST_IDX_01 1 242 42762
TEST_IDX_02 2 472 47468
SQL> SELECT COLUMN_NAME,NUM_DISTINCT,DENSITY FROM USER_TAB_COLUMNS WHERE TABLE_NAME='TEST';
COLUMN_NAM NUM_DISTINCT DENSITY
---------- ------------ ----------
ID 10 9.8464E-06
COLNO 29769 .000033592
NAME 3303 .000575374
CONTENTS 49791 .000020084
如果执行下面这条SQL语句:
SELECT CONTENTS FROM TEST WHERE ID='00000010' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
SQL> SELECT CONTENTS FROM TEST WHERE ID='00000010' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 3375868528
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2065 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2065 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"='00000010' AND "COLNO"='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX')
SQL> SELECT CONTENTS FROM TEST WHERE ID='00000001'AND COLNO='2LOC_IXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 3375868528
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2065 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2065 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"='00000001' AND "COLNO"='2LOC_IXXXXXXXXXXXXXXXXXXXXXXXX')
可以看到,执行计划都是正确的,对于这条SQL语句,索引TEST_IDX_02无疑是最佳的。
如果我们输入的值不在00000001-00000010之间。可能会导致错误的执行计划。
如下所示:00000050不再00000001-00000010之间。
SQL> SELECT CONTENTS FROM TEST WHERE ID='00000050' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 2214617613
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2065 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2065 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_01 | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLNO"='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX')
2 - access("ID"='00000050')
可以看到ORACLE选择了索引TEST_IDX_01,很明显,对于这条SQL语句,这不是一个理想的索引。当然,对于当前的值00000050由于是不存在的,用哪个索引,效率都不会很低。但是如果,你采用了绑定变量的话,如果第一次执行的 时候用了不存在的值,那么生成的执行计划,对以后执行的SQL语句,可能是效率非常低的执行计划。本例就是如此,如果以后都用TEST_IDX_01,很明显是不合适的。
今天在做监控的时候,发现了一条SQL语句,明显有个很好的索引,但是就是没用,用了一个效率低的索引,导致大量的逻辑读,CPU利用率很高。和本例类似。
造成这种情况的原因,主要是因为对于超出边界的值,ORACLE在计算选择率的时候采用了直线下降的算法,本例中的选择度已经降低接近为0了。从而间接的导致了COST值的大大降低。
根据COST的计算公式:COST=BLEVEL+CEIL(索引的选择度*叶子块数)+CEIL(表的选择度*聚簇因子)
CEIL(索引的选择度*叶子块数)约等于1 ,CEIL(表的选择度*聚簇因子) 也约等于1 ,所以主要的COST就是BLEVEL了。
而由于TEST_IDX_01 索引的 BLEVEL比TEST_IDX_02 的 BLEVEL小1,而就这个小1,导致了ORACLE走了错误的执行计划。
采用TEST_IDX_02 在计算索引成本的时候是3。如下:
SQL> SELECT /*+INDEX(TEST TEST_IDX_02)*/ CONTENTS FROM TEST WHERE ID='00000050' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 3375868528
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2065 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2065 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"='00000050' AND "COLNO"='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX')
这种情况还是蛮少出现的,如果ID列比较少的话,可以对这个列不采用绑定变量,避免SQL的共享。
还可以删除这列上上的直方图,使ORACLE无法知道是否超出范围。
SQL> EXEC DBMS_STATS.DELETE_COLUMN_STATS(USER,'TEST','ID');
PL/SQL procedure successfully completed.
SQL> SELECT CONTENTS FROM TEST WHERE ID='00000050' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 3375868528
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2066 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2066 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"='00000050' AND "COLNO"='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX')
删除了列上的统计信息,可能会对其他SQL语句产生影响。所以要权衡一下利弊。
当然,如果索引TEST_IDX_01的BLEVEL>=TEST_IDX_02 也不会有问题了。
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TEST';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TEST_IDX_01 1 242 42762
TEST_IDX_02 2 472 47468
SQL> ALTER INDEX TEST_IDX_01 REBUILD PCTFREE 90;
Index altered.
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TEST';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TEST_IDX_01 2 2024 42762
TEST_IDX_02 2 472 47468
SQL> exec dbms_stats.gather_table_stats(user,'test',cascade=>true);
PL/SQL procedure successfully completed.
SQL> SET AUTOT TRACEONLY EXP
SQL> SELECT CONTENTS FROM TEST WHERE ID='00000050' AND COLNO='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX';
Execution Plan
----------------------------------------------------------
Plan hash value: 3375868528
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2065 | 4 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 2065 | 4 |
|* 2 | INDEX RANGE SCAN | TEST_IDX_02 | 1 | | 3 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"='00000050' AND "COLNO"='$NOCXXXXXXXXXXXXXXXXXXXXXXXXXX')