在 Oracle 数据库中,访问路径指的就是通过哪种扫描方式获取数据,比如全表扫描、索引扫描或者直接通过 ROWID 获取数据。根据访问对象的不同,访问路径可以分为表访问路径、索引访问路径等。
常见访问路径
TABLE ACCESS FULL
- 什么是 TABLE ACCESS FULL
TABLE ACCESS FULL 表示全表扫描,一般情况下是多块读。
等待事件为 db file scattered read
,如果是并行全表扫描,等待事件为 direct path read
。
在 Oracle 11g 中有一个新特性,在对一个大表进行全表扫描的时候,会将表直接读入 PGA,绕过 buffer cache,这个时候全表扫描的等待事件也是 direct path read
。
在逻辑上,全表扫描会读取表中的所有行,然后检查每一行是否满足语句的限制条件;在物理上,全表扫描会批量读取高水位线(HWM)下的每个数据块。
- TABLE ACCESS FULL 扫描数据的方式
在 Oracle 数据库中,最小的存储单位是块(Block),物理上连续的块组成了区(Extend),区又组成了段(Segment)。
对于非分区表,如果表中没有 CLOB/BLOB 字段,那么一个表就是一个段。全表扫描,其实就是扫描表中所有格式化过的区。因为区里面的数据块在物理上是连续的,所以全表扫描可以多块读。
对于分区表,如果表中没有 CLOB/BLOB 字段,那么一个分区就是一个段,分区表扫描方式与非分区表扫描方式是一样的。
如果表中有 CLOB 字段,CLOB 字段会单独存放在一个段中,当全表扫描需要访问 CLOB 字段时,这时性能会严重下降,因此尽量避免在 Oracle 中使用 CLOB 类型。
一般的操作系统,一次 I/O 最多只支持读取或者写入 1MB 数据。当数据块为 8KB 的时候,一次 I/O 最多能读取 128 个数据块;当数据块为 16KB 的时候,一次 I/O 最多能读取 64 个数据块;当数据块为 32 KB 的时候,一次 I/O 最多能读取 32 个数据块。
SQL> show parameter db_block_size; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ db_block_size integer 8192 SQL> select 1024 * 1024 / 8192 from dual; 1024*1024/8192 -------------- 128
-
TABLE ACCESS FULL 常见的使用场景
-
大范围数据读取的情况:这里的大范围是一个相对的概念。一般来说,如果访问表中的大部分数据,则全表扫描的效率较高;如果访问表中的小部分数据,则索引访问的效率较高。常见的经验在 1% ~ 10% 之间,具体还是取决于当时的成本评估;
- 从小数据表中读取数据的情况:如果访问的数据规模较小,则优化器(CBO)倾向于通过全表扫描的方式访问整个表。因为全表扫描使用了多快读的机制,往往效率是很高的;
- 按照并行处理方式读取数据的情况:在并行处理的情况下,全表扫描的执行速度会在更大程度上得到提高;
- 使用 Hint 提示(FULL)的情况:这个提示告诉优化器,使用全表扫描访问表。
TABLE ACCESS BY USER ROWID
TABLE ACCESS BY USER ROWID 表示直接使用 ROWID 获取数据,单块读。
该访问路径在 Oracle 所有的访问路径中性能是最好的。
SQL> set autotrace on; SQL> select * from test where rowid = 'AAAVY7AAEAAAAH7AAA'; 执行计划 ---------------------------------------------------------- Plan hash value: 1358188196 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 219 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY USER ROWID| TEST | 1 | 219 | 1 (0)| 00:00:01 | -----------------------------------------------------------------------------------
在 WHERE 条件中直接使用 ROWID 获取数据时,就会使用该访问路径。
TABLE ACCESS BY ROWID RANGE
TABLE ACCESS BY ROWID RANGE 表示 ROWID 范围扫描,多块读。
因为同一个块里面的 ROWID 是连续的,同一个区(EXTENT)里面的 ROWID 也是连续的,所以可以多块读。
SQL> select * from test where rowid >= 'AAAVY7AAEAAAAH7AAA'; SQL_ID fz2fwyfuhwj67, child number 0 ------------------------------------- SELECT * FROM TEST WHERE ROWID >= 'AAAVY7AAEAAAAH7AAA' Plan hash value: 3472873366 ------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | | 343 (100)| | 86120 |00:00:00.02 | 2080 | |* 1 | TABLE ACCESS BY ROWID RANGE| TEST | 1 | 4306 | 412K| 343 (1)| 00:00:05 | 86120 |00:00:00.02 | 2080 | ------------------------------------------------------------------------------------------------------------------------------ Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$1 / TEST@SEL$1 Outline Data ------------- /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('11.2.0.4') DB_VERSION('11.2.0.4') OPT_PARAM('star_transformation_enabled' 'true') ALL_ROWS OUTLINE_LEAF(@"SEL$1") ROWID(@"SEL$1" "TEST"@"SEL$1") END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 1 - access(ROWID>='AAAVY7AAEAAAAH7AAA') Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - "TEST"."OWNER"[VARCHAR2,30], "TEST"."OBJECT_NAME"[VARCHAR2,128], "TEST"."SUBOBJECT_NAME"[VARCHAR2,30], "TEST"."OBJECT_ID"[NUMBER,22], "TEST"."DATA_OBJECT_ID"[NUMBER,22], "TEST"."OBJECT_TYPE"[VARCHAR2,19], "TEST"."CREATED"[DATE,7], "TEST"."LAST_DDL_TIME"[DATE,7], "TEST"."TIMESTAMP"[VARCHAR2,19], "TEST"."STATUS"[VARCHAR2,7], "TEST"."TEMPORARY"[VARCHAR2,1], "TEST"."GENERATED"[VARCHAR2,1], "TEST"."SECONDARY"[VARCHAR2,1], "TEST"."NAMESPACE"[NUMBER,22], "TEST"."EDITION_NAME"[VARCHAR2,30]
在 WHERE 条件中直接使用 ROWID 进行范围扫描获取数据时,就会使用该访问路径。
TABLE ACCESS BY INDEX ROWID
TABLE ACCESS BY INDEX ROWID 表示回表,单块读。
SQL> create table test as select * from dba_objects; SQL> create index idx_test on test(object_id); SQL> set autotrace on; SQL> select * from test where object_id < 10; 执行计划 ---------------------------------------------------------- Plan hash value: 2473784974 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 1656 | 3 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 8 | 1656 | 3 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST | 8 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<10) Note ----- - dynamic sampling used for this statement (level=2) 统计信息 ---------------------------------------------------------- 24 recursive calls 0 db block gets 114 consistent gets 4 physical reads 0 redo size 2236 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 8 rows processed
INDEX UNIQUE SCAN
INDEX UNIQUE SCAN 表示索引唯一扫描,单块读。
对唯一索引或者对主键列进行等值查询,就会走 INDEX UNIQUE SCAN。因为对唯一索引或者对主键列进行等值查询,CBO 能确保最多只返回一行数据,所以这时可以走索引唯一扫描。
SQL> select * from emp where empno = '7369'; 执行计划 ---------------------------------------------------------- Plan hash value: 2949544139 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMPNO"=7369) 统计信息 ---------------------------------------------------------- 12 recursive calls 0 db block gets 23 consistent gets 0 physical reads 0 redo size 890 bytes sent via SQL*Net to client 509 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed
INDEX UNIQUE SCAN 最多只返回一行数据,只会扫描 “索引高度” 个索引块,在所有的 Oracle 访问路径中,其性能仅次于 TABLE ACCESS BY USER ROWID。
INDEX RANGE SCAN
INDEX RANGE SCAN 表示索引范围扫描,单块读,且返回的数据是有序的(默认为升序)。
对唯一索引或者主键进行范围查询,对非唯一索引进行等值查询、范围查找,就会发生 INDEX RANGE SCAN。等待事件为 db file sequential read
。
- 对非唯一索引进行等值查询
SQL> create table test as select * from dba_objects; SQL> create index idx_test on test(object_id); SQL> set autotrace on; SQL> select * from test where object_id = 100; 执行计划 ---------------------------------------------------------- Plan hash value: 2473784974 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 207 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 207 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"=100) Note ----- - dynamic sampling used for this statement (level=2)
因为索引 IDX_TEST 是非唯一索引,对非唯一索引进行等值查询并不能确保只返回一行数据,有可能返回多行数据,所以执行计划会进行索引范围扫描。
- 对非唯一索引进行范围查询
SQL> select * from test where object_id < 100; 执行计划 ---------------------------------------------------------- Plan hash value: 2473784974 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98 | 20286 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 98 | 20286 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST | 98 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<100) Note ----- - dynamic sampling used for this statement (level=2)
- 索引范围扫描取消回表
SQL> create table test as select * from dba_objects; SQL> select object_name from test where object_id < 100;
上述 SQL 语句有过滤条件,根据过滤条件 where object_id < 100
过滤数据之后只返回少量数据,一般情况下我们直接在 object_id 列创建索引,让该 SQL 走 object_id 列的索引即可。
SQL> create index idx_test on test(object_id); SQL> select object_name from test where object_id < 100; 执行计划 ---------------------------------------------------------- Plan hash value: 2473784974 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98 | 7742 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 98 | 7742 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST | 98 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<100) Note ----- - dynamic sampling used for this statement (level=2)
因为该 SQL 只查询一个字段,所以我们可以将 select 列放到组合索引中,避免回表。
SQL> create index idx_test_2 on test(object_id, object_name, 0); SQL> select object_name from test where object_id < 100; 执行计划 ---------------------------------------------------------- Plan hash value: 2872074888 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 14 | 1106 | 2 (0)| 00:00:01 | |* 1 | INDEX RANGE SCAN| IDX_TEST_2 | 14 | 1106 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("OBJECT_ID"<100) Note ----- - dynamic sampling used for this statement (level=2)
假设一个索引叶子块能存储 100 行数据,通过索引返回 100 行以内的数据,只扫描 “索引高度” 个索引块;如果通过索引返回 200 行数据,就需要扫描两个叶子块。通过索引返回的行数越多,扫描的索引叶子块也就越多,随着扫描的叶子块个数的增加,索引范围扫描的性能开销也就越大。
如果索引范围扫描需要回表,同样假设一个索引叶子块能存储 100 行数据,通过索引返回 1000 行数据,只需要扫描 10 个索引叶子块(单块读),但是回表可能会需要访问几十个到几百个数据块(单块读)。
在检查执行计划的时候,我们要注意索引范围扫描返回多少行数据,如果返回少量数据,不会出现性能问题;如果返回大量数据,在没有回表的情况下也还好;如果返回大量数据同时还有回表,这时我们应该考虑通过创建组合索引消除回表或者使用全表扫描来代替索引范围扫描。
INDEX SKIP SCAN
INDEX SKIP SCAN 表示索引跳跃扫描,单块读,且返回的数据是有序的(默认为升序)。
当组合索引的前导列(第一个列)没有在 where 条件中,并且组合索引的前导列/前几个列的基数很低,where 过滤条件对组合索引中非前导列进行过滤的时候就会发生索引跳跃扫描,等待事件为 db file sequential read
。
SQL> create table test as select * from dba_objects; SQL> create index idx_test on test(owner, object_id); --执行目标 SQL 语句 SQL> select * from test where object_id < 100; --收集统计信息(对出现在 where 条件中的列自动判断是否收集直方图) begin dbms_stats.gather_table_stats(ownname => 'SCOTT', tabname => 'TEST', estimate_percent => 100, method_opt => 'for all columns size auto', no_invalidate => false, degree => 4, granularity => 'AUTO', cascade => true); end; SQL> alter session set statistics_level = all; SQL> select * from test where object_id < 100; SQL> select * from table(dbms_xplan.display_cursor(null, null, 'advanced allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 6z7p6rt14t4gb, child number 0 ------------------------------------- select * from test where object_id < 100 Plan hash value: 430566402 ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | | 29 (100)| | 98 |00:00:00.01 | 38 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 96 | 9408 | 29 (0)| 00:00:01 | 98 |00:00:00.01 | 38 | |* 2 | INDEX SKIP SCAN | IDX_TEST | 1 | 96 | | 26 (0)| 00:00:01 | 98 |00:00:00.01 | 29 | ---------------------------------------------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$1 / TEST@SEL$1 2 - SEL$1 / TEST@SEL$1 Outline Data ------------- /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('11.2.0.4') DB_VERSION('11.2.0.4') OPT_PARAM('star_transformation_enabled' 'true') ALL_ROWS OUTLINE_LEAF(@"SEL$1") INDEX_SS(@"SEL$1" "TEST"@"SEL$1" ("TEST"."OWNER" "TEST"."OBJECT_ID")) END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<100) filter("OBJECT_ID"<100) Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - "TEST"."OWNER"[VARCHAR2,30], "TEST"."OBJECT_NAME"[VARCHAR2,128], "TEST"."SUBOBJECT_NAME"[VARCHAR2,30], "OBJECT_ID"[NUMBER,22], "TEST"."DATA_OBJECT_ID"[NUMBER,22], "TEST"."OBJECT_TYPE"[VARCHAR2,19], "TEST"."CREATED"[DATE,7], "TEST"."LAST_DDL_TIME"[DATE,7], "TEST"."TIMESTAMP"[VARCHAR2,19], "TEST"."STATUS"[VARCHAR2,7], "TEST"."TEMPORARY"[VARCHAR2,1], "TEST"."GENERATED"[VARCHAR2,1], "TEST"."SECONDARY"[VARCHAR2,1], "TEST"."NAMESPACE"[NUMBER,22], "TEST"."EDITION_NAME"[VARCHAR2,30] 2 - "TEST".ROWID[ROWID,10], "TEST"."OWNER"[VARCHAR2,30], "OBJECT_ID"[NUMBER,22]
从上述执行计划中可以看到,该目标 SQL 走了索引跳跃扫描。而最理想的情况应该是直接走 where 条件的 object_id 列上的索引,并且走 INDEX RANGE SCAN。但是因为 where 条件的 object_id 列上面没有直接创建单键值索引,而是间接地被包含在组合索引中,为了避免全表扫描,CBO 就选择了索引跳跃扫描。
INDEX SKIP SCAN 中有个 SKIP 关键字,也就是说它是跳着扫描的。那么想要跳跃扫描,有如下必须满足的条件:
- 必须是组合索引
- 组合索引的前导列不能出现在 where 条件中
需要注意的是,如果前导列的基数很高(DISTINCT 值过多),导致索引跳跃扫描的次数过多,会导致性能的下降。
INDEX FULL SCAN
INDEX FULL SCAN 表示索引全扫描,单块读,且返回的数据是有序的(默认为升序)。
索引全扫描会扫描索引中所有的叶子块(从左往右扫描),如果索引很大,会产生严重性能问题(因为是单块读)。等待事件为 db file sequential read
。
索引全扫描通常发生在如下 3 种情况:
- 分页语句;
- 目标 SQL 语句有 order by 子句,order by 的列都包含在索引中,并且 order by 中列的顺序必须和索引列顺序一致。order by 的第一个列不能有过滤条件,如果有过滤条件就会走 INDEX RANGE SCAN。同时表的数据量不能太大(数据量太大会走 TABLE ACCESS FULL + SORT ORDER BY);
SQL> create table test as select * from dba_objects; SQL> select * from test order by object_id, owner; --创建索引(索引顺序必须与上述目标 SQL 中 order by 顺序一致,加 0 是为了让索引能够存储 NULL) SQL> create index idx_test on test(object_id, owner, 0); SQL> select * from test order by object_id, owner; 执行计划 ---------------------------------------------------------- Plan hash value: 237643989 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98543 | 19M| 1632 (1)| 00:00:20 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 98543 | 19M| 1632 (1)| 00:00:20 | | 2 | INDEX FULL SCAN | IDX_TEST | 98543 | | 302 (1)| 00:00:04 | ---------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 12999 consistent gets 0 physical reads 0 redo size 4465332 bytes sent via SQL*Net to client 63671 bytes received via SQL*Net from client 5743 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 86122 rows processed --全表扫描执行计划 SQL> select /*+ full(test) */ * from test order by object_id, owner; 执行计划 ---------------------------------------------------------- Plan hash value: 2007178810 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98543 | 19M| | 4777 (1)| 00:00:58 | | 1 | SORT ORDER BY | | 98543 | 19M| 23M| 4777 (1)| 00:00:58 | | 2 | TABLE ACCESS FULL| TEST | 98543 | 19M| | 343 (1)| 00:00:05 | ----------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 1235 consistent gets 0 physical reads 0 redo size 4465332 bytes sent via SQL*Net to client 63671 bytes received via SQL*Net from client 5743 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 86122 rows processed
- 在进行 SORT MERGE JOIN 的时候,如果表数据量比较小,让连接列走 INDEX FULL SCAN 可以避免排序。
SQL> set autotrace on; SQL> select /*+ use_merge(e, d) */ * from emp e, dept d where e.deptno = d.deptno; 执行计划 ---------------------------------------------------------- Plan hash value: 844388907 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 14 | 812 | 6 (17)| 00:00:01 | | 1 | MERGE JOIN | | 14 | 812 | 6 (17)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00:00:01 | |* 4 | SORT JOIN | | 14 | 532 | 4 (25)| 00:00:01 | | 5 | TABLE ACCESS FULL | EMP | 14 | 532 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("E"."DEPTNO"="D"."DEPTNO") filter("E"."DEPTNO"="D"."DEPTNO")
当看到执行计划中有 INDEX FULL SCAN 时,我们首先要检查 INDEX FULL SCAN 是否有回表。
如果 INDEX FULL SCAN 没有回表,我们要检查索引段大小,如果索引段太大(GB 级别),应该使用 INDEX FAST FULL SCAN 代替 INDEX FULL SCAN,因为 INDEX FAST FULL SCAN 是多块读,INDEX FULL SCAN 是单块读,即使使用了 INDEX FAST FULL SCAN 会产生额外的排序操作,也要用 INDEX FAST FULL SCAN 代替 INDEX FULL SCAN。
如果 INDEX FULL SCAN 有回表,大多数情况下,这种执行计划是错误的,因为 INDEX FULL SCAN 是单块读,回表也是单块读,这时应走全表扫描,因为全表扫描是多块读。
INDEX FAST FULL SCAN
INDEX FAST FULL SCAN 表示索引快速全扫描,多块读。
当需要从表中查询出大量数据但是只需要获取表中部分列的数据时,我们可以利用索引快速全扫描代替全表扫描来提升性能。索引快速全扫描的扫描方式与全表扫描的扫描方式是一样的,都是按区扫描,所以它可以多块读,而且可以并行扫描。
等待事件为 db file scattered read
,如果是并行扫描,等待事件为 direct path read
。
SQL> create table test as select * from dba_objects; SQL> select owner, object_name from test;
上述 SQL 语句没有过滤条件,默认情况下会走全表扫描。但是因为 Oracle 是行存储数据库,全表扫描的时候会扫描表中所有的列,而上述查询只访问表中两个列,全表扫描会扫描额外 13 个列(表 test 共有 15 个列),所以我们可以创建一个组合索引,使用索引快速全扫描代替全表扫描。
SQL> create index idx_test on test(owner, object_name, 0); SQL> select owner, object_name from test; 执行计划 ---------------------------------------------------------- Plan hash value: 285868359 --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98543 | 7987K| 149 (0)| 00:00:02 | | 1 | INDEX FAST FULL SCAN| IDX_TEST | 98543 | 7987K| 149 (0)| 00:00:02 | --------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2)
- 索引快速全扫描代替全表扫描
SQL> create table test as select * from dba_objects; SQL> select object_name from test where object_id > 100; 执行计划 ---------------------------------------------------------- Plan hash value: 1357081020 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98543 | 7602K| 343 (1)| 00:00:05 | |* 1 | TABLE ACCESS FULL| TEST | 98543 | 7602K| 343 (1)| 00:00:05 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("OBJECT_ID">100) Note ----- - dynamic sampling used for this statement (level=2)
上述 SQL 语句过滤条件是 where object_id > 100
,返回大量数据,应该走全表扫描,但是因为该 SQL 仅访问一个字段,所以我们可以走索引快速全扫描来代替全表扫描。
SQL> create index idx_test on test(object_id, object_name, 0); SQL> set autotrace traceonly; SQL> select object_name from test where object_id > 100; 执行计划 ---------------------------------------------------------- Plan hash value: 285868359 --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98543 | 7602K| 147 (1)| 00:00:02 | |* 1 | INDEX FAST FULL SCAN| IDX_TEST | 98543 | 7602K| 147 (1)| 00:00:02 | --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("OBJECT_ID">100) Note ----- - dynamic sampling used for this statement (level=2)
INDEX FULL SCAN(MIN/MAX)
INDEX FULL SCAN(MIN/MAX)表示索引最小/最大值扫描,单块读。该访问路径发生在 “SELECT MAX(column) FROM TABLE” 或者 “SELECT MIN(column) FROM TABLE” 等 SQL 语句中。
INDEX FULL SCAN(MIN/MAX)只会访问 “索引高度” 个索引块,其性能与 INDEX UNIQUE SCAN 一样,仅次于 TABLE ACCESS BY USER ROWID。
SQL> create table test as select * from dba_objects; SQL> create index idx_test on test(object_id); SQL> select max(object_id) from test; MAX(OBJECT_ID) -------------- 87672
上述 SQL 查询 object_id 的最大值,如果 object_id 列有索引(索引默认是升序排序的),这时候我们只需要扫描索引中 “最右边” 的叶子块就能得到 object_id 的最大值。
SQL> set autotrace on; SQL> select max(object_id) from test; 执行计划 ---------------------------------------------------------- Plan hash value: 469350181 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| IDX_TEST | 1 | 13 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2)
现有另外一个 SQL 语句,如下。
SQL> select max(object_id), min(object_id) from test;
该 SQL 要同时查看 object_id 的最大值和最小值,如果想直接从 object_id 列的索引获取数据,我们只需要扫描索引中 “最左边” 和 “最右边” 的叶子块就可以。在 BTree 索引中,索引叶子块是双向指向的,如果要一次性获取索引中 “最左边” 和 “最右边” 的叶子块,我们就需要连带的扫描 “最大值” 和 “最小值” 中间的叶子块,而上述 SQL 中,中间叶子块的数据并不是我们需要的。如果该 SQL 走索引,会走 INDEX FAST FULL SCAN,而不会走 INDEX FULL SCAN,因为 INDEX FAST FULL SCAN 可以多块读,而 INDEX FULL SCAN 是单块读,两者性能差距巨大。需要注意的是,该 SQL 没有排除 object_id 为 NULL,如果直接运行该 SQL,不会走索引。
SQL> select max(object_id), min(object_id) from test; 执行计划 ---------------------------------------------------------- Plan hash value: 1950795681 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 343 (1)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | TABLE ACCESS FULL| TEST | 98543 | 1251K| 343 (1)| 00:00:05 | --------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2)
我们排除 object_id 为 NULL,查看其执行计划。
SQL> select max(object_id), min(object_id) from test where object_id is not null; 执行计划 ---------------------------------------------------------- Plan hash value: 3508397080 ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 57 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | |* 2 | INDEX FAST FULL SCAN| IDX_TEST | 98543 | 1251K| 57 (0)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OBJECT_ID" IS NOT NULL) Note ----- - dynamic sampling used for this statement (level=2) 统计信息 ---------------------------------------------------------- 5 recursive calls 0 db block gets 275 consistent gets 187 physical reads 0 redo size 614 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
从上面的执行计划中我么可以看到该 SQL 走了 INDEX FAST FULL SCAN,INDEX FAST FULL SCAN 会扫描索引段中所有的块,理想的情况应该是只扫描索引中 “最左边” 和 “最右边” 的叶子块。现在我们将该 SQL 改写为如下 SQL。
SQL> select (select max(object_id) from test), (select min(object_id) from test) from dual; 执行计划 ---------------------------------------------------------- Plan hash value: 3827415409 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| IDX_TEST | 1 | 13 | 2 (0)| 00:00:01 | | 3 | SORT AGGREGATE | | 1 | 13 | | | | 4 | INDEX FULL SCAN (MIN/MAX)| IDX_TEST | 1 | 13 | 2 (0)| 00:00:01 | | 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) 统计信息 ---------------------------------------------------------- 7 recursive calls 0 db block gets 158 consistent gets 0 physical reads 0 redo size 646 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
原始 SQL “select max(object_id), min(object_id) from test where object_id is not null” 因为需要一次性从索引中取得最大值和最小值,所以导致走了 INDEX FAST FULL SCAN。我们将该 SQL 进行等价改写之后,访问了索引两次,一次取最大值,一次取最小值,从而避免扫描不需要的索引叶子块,大大提升了查询性能。
MAT_VIEW REWRITE ACCESS FULL
MAT_VIEW REWRITE ACCESS FULL 表示物化视图全表扫描,多块读。
因为物化视图本质上也是一个表,所以其扫描方式与全表扫描方式一样。如果我们开启了查询重写功能,而且 SQL 查询能够直接从物化视图中获得结果,就会走该访问路径。
- 创建物化视图
SQL> create materialized view mv_test build immediate enable query rewrite as select object_id, object_name from test;
- 执行 SQL 查询
SQL> alter session set statistics_level = all; SQL> select object_id, object_name from test;
- 查看执行计划
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'advanced allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 495ycvbfa4dvv, child number 1 ------------------------------------- select object_id, object_name from test Plan hash value: 591415988 ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | | 118 (100)| | 86122 |00:00:00.06 | 6129 | | 1 | MAT_VIEW REWRITE ACCESS FULL| MV_TEST | 1 | 80123 | 6181K| 118 (0)| 00:00:02 | 86122 |00:00:00.06 | 6129 | ---------------------------------------------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$E04E005D / MV_TEST@SEL$07DB0C4D Outline Data ------------- /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('11.2.0.4') DB_VERSION('11.2.0.4') OPT_PARAM('star_transformation_enabled' 'true') ALL_ROWS OUTLINE_LEAF(@"SEL$E04E005D") REWRITE(@"SEL$C6F8ED93" "MV_TEST") OUTLINE(@"SEL$C6F8ED93") REWRITE(@"SEL$1" "MV_TEST") OUTLINE(@"SEL$1") FULL(@"SEL$E04E005D" "MV_TEST"@"SEL$07DB0C4D") END_OUTLINE_DATA */ Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - "MV_TEST"."OBJECT_ID"[NUMBER,22], "MV_TEST"."OBJECT_NAME"[VARCHAR2,128] Note ----- - dynamic sampling used for this statement (level=2)
因为物化视图 MV_TEST 已经包含了查询需要的字段,所以该 SQL 会直接访问物化视图 MV_TEST。
访问路径总结
序号 | 访问路径 | 读取方式 |
1 | TABLE ACCESS FULL | 多块读 |
2 | TABLE ACCESS BY USER ROWID | 单块读 |
3 | TABLE ACCESS BY ROWID RANGE | 多块读 |
4 | TABLE ACCESS BY INDEX ROWID | 单块读 |
5 | INDEX UNIQUE SCAN | 单块读 |
6 | INDEX RANGE SCAN | 单块读 |
7 | INDEX SKIP SCAN | 单块读 |
8 | INDEX FULL SCAN | 单块读 |
9 | INDEX FAST FULL SCAN | 多块读 |
10 | INDEX FULL SCAN (MIN/MAX) | 单块读 |
11 | MAT_VIEW REWRITE ACCESS FULL | 多块读 |
单块读与多块读
单块读与多块读这两个概念对于掌握 SQL 优化非常重要,更准确地说是单块读的物理 I/O 次数和多块读的物理 I/O 次数对于掌握 SQL 优化非常重要。
从磁盘 1 次读取 1 个块到 buffer cache 就叫单块读,从磁盘 1 次读取多个块到 buffer cache 就叫多块读。如果数据块都已经缓存在 buffer cache 中,那就不需要物理 I/O 了,没有物理 I/O 也就不存在单块读与多块读。
绝大多数的平台,一次 I/O 最多只能读取或者写入 1MB 数据,Oracle 的块大小默认是 8k,那么一次 I/O 最多只能写入 128 个块到磁盘,最多只能读取 128 个块到 buffer cache。在判断哪个访问路径性能好的时候,通常是估算每个访问路径的 I/O 次数,谁的 I/O 次数少,谁的性能就好。在估算 I/O 次数的时候,我么只需要算个大概就可以了,没必要很精确。
参考资料
《SQL 优化核心思想 -- 罗炳森》
《数据库高效优化 -- 马立和》
原创文章,转载请注明出处:http://www.opcoder.cn/article/46/