文档课题:oracle统计信息不准导致性能问题—谓词越界
数据库:oracle 19.13 多租户
1、理论知识
谓词越界是SQL语句的查询条件超出了数据库统计信息所记录的范围,谓词越界会导致优化器选择错误的执行计划,导致性能问题。
2、相关测试
2.1、建测试表
leo@ORCLPDB 2024-04-04 10:27:40> create table test (col1 number);

Table created.

leo@ORCLPDB 2024-04-04 10:34:04> create index ind_test_col1 on test(col1);

Index created.

leo@ORCLPDB 2024-04-04 10:29:56> begin
  2  for i in 1..10000 loop
  3  insert into test values (i);
  4  end loop;
  5  commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

leo@ORCLPDB 2024-04-04 10:30:30> exec dbms_stats.gather_table_stats('LEO','TEST');

PL/SQL procedure successfully completed.

leo@ORCLPDB 2024-04-04 10:33:00> col low_value for a15
leo@ORCLPDB 2024-04-04 10:33:28> col high_value for a15
--查询表test的谓词情况
leo@ORCLPDB 2024-04-04 10:33:34> select low_value,high_value from dba_tab_col_statistics where table_name='TEST' and owner='LEO';

LOW_VALUE       HIGH_VALUE
--------------- ---------------
C102            C302

leo@ORCLPDB 2024-04-04 10:35:44> var x number;
leo@ORCLPDB 2024-04-04 10:36:40> exec dbms_stats.convert_raw_value('c102',:x);

PL/SQL procedure successfully completed.

leo@ORCLPDB 2024-04-04 10:37:29> select :x from dual;

        :X
----------
         1

leo@ORCLPDB 2024-04-04 10:37:36> exec dbms_stats.convert_raw_value('c302',:x);

PL/SQL procedure successfully completed.

leo@ORCLPDB 2024-04-04 10:38:03> select :x from dual;

        :X
----------
     10000

说明:如上使用系统包dbms_stats.convert_raw_value将统计信息里的上限裸数据转换成可读的数值.

2.2、谓词范围内查询
说明:以下情况为在谓词范围内的条件查询场景.
leo@ORCLPDB 2024-04-04 10:38:08> explain plan for select * from test where col1 between 1 and 10000;

Explained.

leo@ORCLPDB 2024-04-04 10:41:37> select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
Plan hash value: 749384458

--------------------------------------------------------------------------------------
| Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |               | 10000 | 40000 |     7   (0)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| IND_TEST_COL1 | 10000 | 40000 |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COL1">=1 AND "COL1"<=10000)

13 rows selected.

说明:此处条件包含test表内的所有数据,所以采用多块读,且不回表的执行计划为最优(table access full/index fast full scan).

2.3、更新表数据
说明:再次往test表写入数据.
leo@ORCLPDB 2024-04-04 10:41:51> begin
  2  for i in 10001..10000000 loop
  3  insert into test values (i);
  4  end loop;
  5  commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

2.4、不收集统计信息的情况
--在不重新收集统计信息的情况下,检查表的统计信息.
leo@ORCLPDB 2024-04-04 10:51:00> select low_value,high_value from dba_tab_col_statistics where table_name='TEST' and owner='LEO';

LOW_VALUE       HIGH_VALUE
--------------- ---------------
C102            C302

--接下来进行一次谓词越界查询,使用谓词条件col1 between 10001 and 10000000.逻辑上来说,此类选择表99.9%数据的语句应该使用多块读,且不回表的执行计划(table access full/index fast full scan),测试效果如下:

leo@ORCLPDB 2024-04-04 11:51:23> set timing on
leo@ORCLPDB 2024-04-04 11:51:30> set serveroutput off
leo@ORCLPDB 2024-04-04 11:51:40> select count(*) from test where col1 between 10001 and 10000000;

  COUNT(*)
----------
   9990000

Elapsed: 00:00:00.58
leo@ORCLPDB 2024-04-04 11:52:53> select * from table(dbms_xplan.display_cursor('','',''));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  95s99xp4spkdq, child number 0
-------------------------------------
select count(*) from test where col1 between 10001 and 10000000

Plan hash value: 1420195510

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE   |               |     1 |     4 |            |          |
|*  2 |   INDEX RANGE SCAN| IND_TEST_COL1 |     1 |     4 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL1">=10001 AND "COL1"<=10000000)


19 rows selected.

Elapsed: 00:00:00.23

说明:可以看到该条SQL执行580ms,执行计划选择的是单块读的index range scan,而不是期望的多块读,且不回表的两种执行计划(table access full/index fast full scan)之一,甚至返回的rows和bytes出现严重的预估错误.
2.5、收集统计信息的情况
--SYS用户搜集统计信息.
sys@ORCLCDB 2024-04-04 11:57:05> exec dbms_stats.gather_table_stats('LEO','TEST');

PL/SQL procedure successfully completed.

sys@ORCLCDB 2024-04-04 11:57:42> alter system flush shared_pool;

System altered.

sys@ORCLCDB 2024-04-04 11:58:26> alter system flush buffer_cache;

System altered.

leo@ORCLPDB 2024-04-04 11:59:10> set timing on
leo@ORCLPDB 2024-04-04 11:59:13> set serveroutput off
leo@ORCLPDB 2024-04-04 11:59:19> select count(*) from test where col1 between 10001 and 10000000;

  COUNT(*)
----------
   9990000

Elapsed: 00:00:00.25

leo@ORCLPDB 2024-04-04 11:59:42> select * from table(dbms_xplan.display_cursor('','',''));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
SQL_ID  95s99xp4spkdq, child number 0
-------------------------------------
select count(*) from test where col1 between 10001 and 10000000

Plan hash value: 1950795681

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |  4206 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     6 |            |          |
|*  2 |   TABLE ACCESS FULL| TEST |  9990K|    57M|  4206   (3)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("COL1">=10001 AND "COL1"<=10000000))


19 rows selected.

Elapsed: 00:00:00.19

说明:统计信息收集完成后,优化器选择更优的执行计划(table access full),rows和bytes的预估值也与实际值相同,执行时间250ms.

3、谓词越界总结
3.1、临时表
此处指业务上的临时表而不是Oracle数据库本身的temporary table。在某些系统中会根据业务条件创建前台表和后台表,数据先进入前台表,处理完毕后,存入后台表,并用delete语句清理前台表的数据,前台表起到一个临时表的作用。Oracle自动收集统计信息的默认时间窗口是工作日晚上22点到凌晨2点,周末是早上6点到第二天凌晨2点。在自动收集统计信息时间窗口内,当前台表基本上处于无数据或数据量很小的情况,产生的统计信息会和白天实际处理业务数据时存在较大偏差,此时便会发生谓词越界的情况。
3.2、巨大表
Oracle触发自动收集某个表的统计信息的条件是表中修改的数据量超过该表数据总量的10%,假设一张表每天新增1w条数据,一年后该表变成365w条数据,这就意味着该表在一年后需一个月才会触发一次自动收集统计信息的作业。那么在该表上的谓词查询,尤其是时间、序列等自增条件上的查询,就可能发生谓词越界的情况,影响优化器正确选择执行计划。

说明:如上内容来自以下网址,笔者仅是实际操作过并做好记录.
https://www.cnblogs.com/aegis1019/p/9059220.html
https://www.modb.pro/db/40284