oracle 不使用索引的原因有哪些?
今天是2014-02-07,今天开始总结一下oracle不使用索引的原因有哪些。一边学习一边做笔记。
第一种:行数存在差异。
在视图user_tables存在一个num_rows字段,该字段是记录在统计信息收集后所对应对象的行数,在user_tab_columns视图中存在一个num_distinct字段,该字段记录每个字段内不同数值的个数。oracle认为当num_distinct越接近num_rows的时候索引的选择性越好,那么在执行查询的时候越容易使用索引。
第二种:聚簇因子:
什么是聚簇因子?
聚簇因子是衡量索引列数据顺序与表字段数据顺序相似性的一个值。我们都知道在创建的表中一般都是堆表,也就是数据在表中存储是无续的,那么为了更加快速的访问数据,我们通常使用索引进行数据访问,这时候没个索引都有一个聚簇因子,聚簇因子越接近对象的块数,那么选择性越好,越接近表的行数那么选择性越差。
之前听到有个朋友曾经提到这么一个问题“为什么我在测试环境查询一个数据很快和在生产环境查询数据怎么这么慢呢?表结构都一样的,数据也是一样的。”。那么不妨看看聚簇因子是多少。
聚簇因子的查看是从user_ind_statistics视图中: CLUSTERING_FACTOR 表示的。看一下官方介绍:
Indicates the amount of order of the rows in the table based on the values of the index.
- If the value is near the number of blocks, then the table is very well ordered. In this case, the index entries in a single leaf block tend to point to rows in the same data blocks.
- If the value is near the number of rows, then the table is very randomly ordered. In this case, it is unlikely that index entries in the same leaf block point to rows in the same data blocks.
往往聚簇因子的大小和数据获取的I/o存在一定的相似性。如果聚簇因子大,那么相对的物理或是逻辑(一般是)i/o开销很大,也就是块被频繁反复读取,一致数据获取很慢。
长查询的视图有dba_ind_statistics和dba_tab_statistics
第三种:使用不等条件:
当使用在进行查询数据的时候使用不等条件<>,那么oracle任务这个符号会需要读取大部分的数据块,那么就会跳过使用索引。eg:
SQL> select index_name,table_name,column_name from user_ind_columns where table_name='EMP';
INDEX_NAME TABLE_NAME COLUMN_NAME
------------------------------ ------------------------------ ----------------------------------------
EMP_IDX1 EMP DEPTNO
EMP_IDX1 EMP EMPNO
SQL> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7839 KING PRESIDENT 17-NOV-81 5000 10
7934 MILLER CLERK 7782 23-JAN-82 1300 10
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7900 JAMES CLERK 7698 03-DEC-81 950 30
14 rows selected.
SQL> set autotrace trace exp
SQL> select * from emp where empno<>7900;
Execution Plan
----------------------------------------------------------
Plan hash value: 822536733
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 494 | 14 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE ALL| | 13 | 494 | 14 (0)| 00:00:01 | 1 | 4 |
|* 2 | TABLE ACCESS FULL | EMP | 13 | 494 | 14 (0)| 00:00:01 | 1 | 4 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMPNO"<>7900)
SQL> select /*+index(emp EMP_IDX1)*/ * from emp where empno<>7900;
Execution Plan
----------------------------------------------------------
Plan hash value: 257372123
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 494 | 6 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE ALL | | 13 | 494 | 6 (0)| 00:00:01 | 1 | 4 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| EMP | 13 | 494 | 6 (0)| 00:00:01 | 1 | 4 |
|* 3 | INDEX FULL SCAN | EMP_IDX1 | 13 | | 3 (0)| 00:00:01 | 1 | 4 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("EMPNO"<>7900)
SQL>
可知,当使用<>就会跳过索引,但是我们可以使用hints(提示)让数据库强制使用索引,注意not in或是in 关键字类似<>依然会跳过索引,那么替代办法要么更改sql查询语句,要么使用case when条件,那么也需要在建立函数索引了。
第四种:统计信息过旧
在基于成本的optimizer更具数据进行估计cast,当数据已经被修改(如进行了大量的dml操作),那么统计信息肯定过旧,那么oracle在更加统计信息进行执行计划选择的时候往往可能出现选择错误的执行计划。
在10g开始oracle会有自动收集统计信息的任务在运行。
17:15:22 sys@REPDB>select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE 10.2.0.5.0 Production
TNS for HPUX: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
17:15:28 sys@REPDB>select JOB_NAME,LAST_START_DATE,comments from dba_scheduler_jobs;
JOB_NAME LAST_START_DATE COMMENTS
------------------------------ --------------------------------------------------------------------------- ------------------------------------------------------------
PURGE_LOG 07-2ÔÂ -14 03.00.01.146182 ÉÏÎç +08:00 purge log job
FGR$AUTOPURGE_JOB file group auto-purge job
GATHER_STATS_JOB Oracle defined automatic optimizer statistics collection job
AUTO_SPACE_ADVISOR_JOB auto space advisor maintenance job
MGMT_CONFIG_JOB 06-2ÔÂ -14 10.00.02.198895 ÏÂÎç +08:00 Configuration collection job.
MGMT_STATS_CONFIG_JOB 01-2ÔÂ -14 01.01.01.762793 ÉÏÎç +08:00 OCM Statistics collection job.
RLM$EVTCLEANUP 07-2ÔÂ -14 04.18.33.923785 ÏÂÎç +08:00
RLM$SCHDNEGACTION 07-2ÔÂ -14 05.02.43.177946 ÏÂÎç +08:00
ÒÑÑ¡Ôñ8ÐС£
但在11g该内容开始取消;
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
SQL> select JOB_NAME,LAST_START_DATE,comments from dba_scheduler_jobs;
JOB_NAME LAST_START_DATE COMMENTS
------------------------------ -------------------------------------------------- --------------------------------------------------
XMLDB_NFS_CLEANUP_JOB
SM$CLEAN_AUTO_SPLIT_MERGE 07-FEB-14 12.00.00.522780 AM PST8PDT auto clean job for auto split merge
RSE$CLEAN_RECOVERABLE_SCRIPT 07-FEB-14 12.00.00.242715 AM PST8PDT auto clean job for recoverable script
FGR$AUTOPURGE_JOB file group auto-purge job
BSLN_MAINTAIN_STATS_JOB 06-FEB-14 10.39.20.219977 PM -07:00 Oracle defined automatic moving window baseline st
atistics computation job
DRA_REEVALUATE_OPEN_FAILURES 31-JAN-14 06.00.02.807655 AM PST8PDT Reevaluate open failures for DRA
HM_CREATE_OFFLINE_DICTIONARY Create offline dictionary in ADR for DRA name tran
slation
ORA$AUTOTASK_CLEAN 06-FEB-14 09.39.19.975070 PM PST8PDT Delete obsolete AUTOTASK repository data
FILE_WATCHER File watcher job
PURGE_LOG 06-FEB-14 09.39.19.999313 PM PST8PDT purge log job
MGMT_STATS_CONFIG_JOB 06-FEB-14 10.39.19.607560 PM -07:00 OCM Statistics collection job.
MGMT_CONFIG_JOB 07-FEB-14 01.01.01.661585 AM -07:00 Configuration collection job.
RLM$SCHDNEGACTION 07-FEB-14 04.32.12.069752 PM +08:00
RLM$EVTCLEANUP 07-FEB-14 01.49.46.407904 AM -07:00
14 rows selected.
SQL>
那么当统计信息过久的时候通常使用dbms_stats包进行相应对象信息的收集。
第五种:使用通配符查询:
当谓词条件中存在%或是_通配符的时候,oracle会忽略索引,但是可以通过通配符位移解决,如下:
SQL> create index emp_idx2 on emp(ename);
Index created.
SQL> set autotrace trace exp
SQL> select * from emp where ename like '%OT%';
Execution Plan
----------------------------------------------------------
Plan hash value: 822536733
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 38 | 14 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE ALL| | 1 | 38 | 14 (0)| 00:00:01 | 1 | 4 |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 38 | 14 (0)| 00:00:01 | 1 | 4 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ENAME" LIKE '%OT%')
SQL> SELECT * FROM EMP WHERE ENAME LIKE 'SC%OT%';
Execution Plan
----------------------------------------------------------
Plan hash value: 547783664
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 38 | 2 (0)| 00:00:01 | | |
| 1 | TABLE ACCESS BY GLOBAL INDEX ROWID| EMP | 1 | 38 | 2 (0)| 00:00:01 | ROWID | ROWID |
|* 2 | INDEX RANGE SCAN | EMP_IDX2 | 1 | | 1 (0)| 00:00:01 | | |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ENAME" LIKE 'SC%OT%')
filter("ENAME" LIKE 'SC%OT%')
SQL>
第六种:在谓词条件中存在函数
无论在谓词中存在隐式函数转换或是显示函数那么oracle都不会使用索引,对于隐式函数转换例子如下:
第七种:谓词列存在null值。
当在谓词条件中的列存在null值,那么oracle将跳过索引,因为null值不存在索引段中,但是如果是复合索引,其中一列为null,另外一列不为null,那么数据库依然使用索引。
第八种:跳过前导列
这是最后一种不使用索引的原因,那就是如果是在默写列上创建了复合索引,但在谓词条件中没有涉及到前导列,那么将会是跳过索引。,
如上是oracle不使用索引情况的一点学习笔记。如果强制使用索引,那么可以尝试相关hints如下: