oracle 不使用索引的原因有哪些?

   今天是2014-02-07,今天开始总结一下oracle不使用索引的原因有哪些。一边学习一边做笔记。

第一种:行数存在差异。

   在视图user_tables存在一个num_rows字段,该字段是记录在统计信息收集后所对应对象的行数,在user_tab_columns视图中存在一个num_distinct字段,该字段记录每个字段内不同数值的个数。oracle认为当num_distinct越接近num_rows的时候索引的选择性越好,那么在执行查询的时候越容易使用索引。

第二种:聚簇因子:

什么是聚簇因子?

 聚簇因子是衡量索引列数据顺序与表字段数据顺序相似性的一个值。我们都知道在创建的表中一般都是堆表,也就是数据在表中存储是无续的,那么为了更加快速的访问数据,我们通常使用索引进行数据访问,这时候没个索引都有一个聚簇因子,聚簇因子越接近对象的块数,那么选择性越好,越接近表的行数那么选择性越差。

之前听到有个朋友曾经提到这么一个问题“为什么我在测试环境查询一个数据很快和在生产环境查询数据怎么这么慢呢?表结构都一样的,数据也是一样的。”。那么不妨看看聚簇因子是多少。

聚簇因子的查看是从user_ind_statistics视图中: CLUSTERING_FACTOR    表示的。看一下官方介绍:

Indicates the amount of order of the rows in the table based on the values of the index.

  • If the value is near the number of blocks, then the table is very well ordered. In this case, the index entries in a single leaf block tend to point to rows in the same data blocks.
  • If the value is near the number of rows, then the table is very randomly ordered. In this case, it is unlikely that index entries in the same leaf block point to rows in the same data blocks.

往往聚簇因子的大小和数据获取的I/o存在一定的相似性。如果聚簇因子大,那么相对的物理或是逻辑(一般是)i/o开销很大,也就是块被频繁反复读取,一致数据获取很慢。

长查询的视图有dba_ind_statistics和dba_tab_statistics

第三种:使用不等条件:

 当使用在进行查询数据的时候使用不等条件<>,那么oracle任务这个符号会需要读取大部分的数据块,那么就会跳过使用索引。eg:

SQL> select index_name,table_name,column_name from user_ind_columns where table_name='EMP';

INDEX_NAME                     TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ----------------------------------------
EMP_IDX1                       EMP                            DEPTNO
EMP_IDX1                       EMP                            EMPNO

SQL> select * from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30

14 rows selected.

SQL> set autotrace trace exp
SQL> select * from emp where empno<>7900;

Execution Plan
----------------------------------------------------------
Plan hash value: 822536733

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    13 |   494 |    14   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL|      |    13 |   494 |    14   (0)| 00:00:01 |     1 |     4 |
|*  2 |   TABLE ACCESS FULL | EMP  |    13 |   494 |    14   (0)| 00:00:01 |     1 |     4 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("EMPNO"<>7900)

SQL> select /*+index(emp EMP_IDX1)*/ * from emp where empno<>7900;

Execution Plan
----------------------------------------------------------
Plan hash value: 257372123

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |    13 |   494 |     6   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL               |          |    13 |   494 |     6   (0)| 00:00:01 |     1 |     4 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| EMP      |    13 |   494 |     6   (0)| 00:00:01 |     1 |     4 |
|*  3 |    INDEX FULL SCAN                 | EMP_IDX1 |    13 |       |     3   (0)| 00:00:01 |     1 |     4 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("EMPNO"<>7900)

SQL>

可知,当使用<>就会跳过索引,但是我们可以使用hints(提示)让数据库强制使用索引,注意not in或是in 关键字类似<>依然会跳过索引,那么替代办法要么更改sql查询语句,要么使用case when条件,那么也需要在建立函数索引了。

第四种:统计信息过旧

 在基于成本的optimizer更具数据进行估计cast,当数据已经被修改(如进行了大量的dml操作),那么统计信息肯定过旧,那么oracle在更加统计信息进行执行计划选择的时候往往可能出现选择错误的执行计划。

 在10g开始oracle会有自动收集统计信息的任务在运行。

17:15:22 sys@REPDB>select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for HPUX: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

17:15:28 sys@REPDB>select JOB_NAME,LAST_START_DATE,comments from dba_scheduler_jobs;

JOB_NAME                       LAST_START_DATE                                                             COMMENTS
------------------------------ --------------------------------------------------------------------------- ------------------------------------------------------------
PURGE_LOG                      07-2ÔÂ -14 03.00.01.146182 ÉÏÎç +08:00                                      purge log job
FGR$AUTOPURGE_JOB                                                                                          file group auto-purge job
GATHER_STATS_JOB                                                                                           Oracle defined automatic optimizer statistics collection job
AUTO_SPACE_ADVISOR_JOB                                                                                     auto space advisor maintenance job
MGMT_CONFIG_JOB                06-2ÔÂ -14 10.00.02.198895 ÏÂÎç +08:00                                      Configuration collection job.
MGMT_STATS_CONFIG_JOB          01-2ÔÂ -14 01.01.01.762793 ÉÏÎç +08:00                                      OCM Statistics collection job.
RLM$EVTCLEANUP                 07-2ÔÂ -14 04.18.33.923785 ÏÂÎç +08:00
RLM$SCHDNEGACTION              07-2ÔÂ -14 05.02.43.177946 ÏÂÎç +08:00

ÒÑÑ¡Ôñ8ÐС£

但在11g该内容开始取消;

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

SQL> select JOB_NAME,LAST_START_DATE,comments from dba_scheduler_jobs;

JOB_NAME                       LAST_START_DATE                                    COMMENTS
------------------------------ -------------------------------------------------- --------------------------------------------------
XMLDB_NFS_CLEANUP_JOB
SM$CLEAN_AUTO_SPLIT_MERGE      07-FEB-14 12.00.00.522780 AM PST8PDT               auto clean job for auto split merge
RSE$CLEAN_RECOVERABLE_SCRIPT   07-FEB-14 12.00.00.242715 AM PST8PDT               auto clean job for recoverable script
FGR$AUTOPURGE_JOB                                                                 file group auto-purge job
BSLN_MAINTAIN_STATS_JOB        06-FEB-14 10.39.20.219977 PM -07:00                Oracle defined automatic moving window baseline st
                                                                                  atistics computation job

DRA_REEVALUATE_OPEN_FAILURES   31-JAN-14 06.00.02.807655 AM PST8PDT               Reevaluate open failures for DRA
HM_CREATE_OFFLINE_DICTIONARY                                                      Create offline dictionary in ADR for DRA name tran
                                                                                  slation

ORA$AUTOTASK_CLEAN             06-FEB-14 09.39.19.975070 PM PST8PDT               Delete obsolete AUTOTASK repository data
FILE_WATCHER                                                                      File watcher job
PURGE_LOG                      06-FEB-14 09.39.19.999313 PM PST8PDT               purge log job
MGMT_STATS_CONFIG_JOB          06-FEB-14 10.39.19.607560 PM -07:00                OCM Statistics collection job.
MGMT_CONFIG_JOB                07-FEB-14 01.01.01.661585 AM -07:00                Configuration collection job.
RLM$SCHDNEGACTION              07-FEB-14 04.32.12.069752 PM +08:00
RLM$EVTCLEANUP                 07-FEB-14 01.49.46.407904 AM -07:00

14 rows selected.

SQL>

那么当统计信息过久的时候通常使用dbms_stats包进行相应对象信息的收集。

第五种:使用通配符查询:

当谓词条件中存在%或是_通配符的时候,oracle会忽略索引,但是可以通过通配符位移解决,如下:

SQL> create index emp_idx2 on emp(ename);

Index created.

SQL> set autotrace trace exp
SQL> select * from emp where ename like '%OT%';

Execution Plan
----------------------------------------------------------
Plan hash value: 822536733

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    38 |    14   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL|      |     1 |    38 |    14   (0)| 00:00:01 |     1 |     4 |
|*  2 |   TABLE ACCESS FULL | EMP  |     1 |    38 |    14   (0)| 00:00:01 |     1 |     4 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ENAME" LIKE '%OT%')

SQL> SELECT * FROM EMP WHERE ENAME LIKE 'SC%OT%';

Execution Plan
----------------------------------------------------------
Plan hash value: 547783664

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |     1 |    38 |     2   (0)| 00:00:01 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| EMP      |     1 |    38 |     2   (0)| 00:00:01 | ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                 | EMP_IDX2 |     1 |       |     1   (0)| 00:00:01 |       |       |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ENAME" LIKE 'SC%OT%')
       filter("ENAME" LIKE 'SC%OT%')

SQL>

第六种:在谓词条件中存在函数

无论在谓词中存在隐式函数转换或是显示函数那么oracle都不会使用索引,对于隐式函数转换例子如下:


第七种:谓词列存在null值。

当在谓词条件中的列存在null值,那么oracle将跳过索引,因为null值不存在索引段中,但是如果是复合索引,其中一列为null,另外一列不为null,那么数据库依然使用索引。

第八种:跳过前导列

 这是最后一种不使用索引的原因,那就是如果是在默写列上创建了复合索引,但在谓词条件中没有涉及到前导列,那么将会是跳过索引。,

 

如上是oracle不使用索引情况的一点学习笔记。如果强制使用索引,那么可以尝试相关hints如下: