文章详情页_Oracle 中自定义函数性能优化浅析

在 SQL 语句中，如果不合理的使用函数就会严重影响性能，其实这里想说的是 PL/SQL 中的自定义函数，反而对于一些内置函数而言，影响性能的可能性较小。

为什么函数影响性能

在 SELECT 语句中调用函数时，那么查询返回的结果集中的每一行都会调用该函数。如果该函数需要执行 1 秒，返回的结果集是 10 行，那么此时 SQL 语句就需要 10 秒，如果该函数执行时间需要 3 秒，返回的结果集是 10000 条记录，那么这个时间就是 30000 秒 ≈ 500 分钟。

生产环境中自定义函数有时候会出现复杂的业务逻辑，导致自定义函数性能开销较高，如果出现不合理调用，那么很容易就会出现性能问题。

创建表 TEST

SQL> create table test (id number);

Table created


declare
  row_index number;
begin
  row_index := 1;
  while row_index <= 8 loop  
    insert into test select row_index from dual;
    row_index := row_index + 1;
  end loop;
  commit;
end;
/

创建函数 F_SLOW，使用 DBMS_LOCK 休眠 2 秒，模拟这个函数较慢

SQL> conn / as sysdba

SQL> grant execute on dbms_lock to scott;


create or replace function f_slow(i_value in number)
return number as
begin
  dbms_lock.sleep(2);
  return i_value + 10;
end;
/

在 SELECT 中调用函数

SQL> set timing on;


SQL> select * from test;

        ID
----------
         1
         2
         3
         4
         5
         6
         7
         8

8 rows selected

Executed in 0.095 seconds


SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        13
        14
        15
        16
        17
        18

8 rows selected

Executed in 16.243 seconds

在 SELECT 中使用函数，由于有 8 条记录，而每次调用函数需要 Sleep 2 秒，所以总共耗费了 16 秒。

在 WHERE 中调用函数

SQL> select * from test where f_slow(id) > 15;

        ID
----------
         6
         7
         8

Executed in 16.168 seconds

在 WHERE 中使用函数，同样耗费了 16 秒。

什么情况下函数影响性能

其实自定义函数影响性能，主要在与函数调用的次数或函数本身的业务逻辑是否复杂，如果 SELECT 查询中调用次数很少，影响还是非常小的。

仅调用一次函数

SQL> select f_slow(id) from test where id = 2;

F_SLOW(ID)
----------
        12

Executed in 2.03 seconds

函数业务逻辑简单，性能开销很低

--改写函数逻辑
create or replace function f_slow(i_value in number)
return number as
begin
  return i_value + 10;
end;
/


--创建存储过程，测试循环次数对性能的影响
create or replace procedure p_test_slow(i_iter in number)
as
  v_res varchar2(100);
begin
  for i in 1 .. i_iter loop
    select f_slow(i) into v_res from dual;
  end loop;
end;
/


--函数循环 10 次耗费的时间
SQL> exec p_test_slow(10);

PL/SQL procedure successfully completed

Executed in 0.005 seconds


--函数循环 1000 次耗费的时间
SQL> exec p_test_slow(1000);

PL/SQL procedure successfully completed

Executed in 0.02 seconds


--函数循环 100000 次耗费的时间
SQL> exec p_test_slow(100000);

PL/SQL procedure successfully completed

Executed in 1.886 seconds

从上述结果可知，当函数业务逻辑简单，性能开销很低时，循环次数对性能的影响较小。10 次循环跟 100000 循环调用差别是 1.8 秒多。

如何优化解决问题

对 SQL 中调用的自定义函数，可以通过等价改写成多表关联语句。避免产生大量的递归调用，另外就是设法减少函数被调用的次数。SQL 中应尽量避免使用自定义函数，或者尽量避免自定义函数中实现复杂业务逻辑。

标量子查询缓存

标量子查询缓存会通过缓存结果减少 SQL 对函数的调用次数，Oracle 会在内存中构建一个哈希表来缓存标量子查询的结果。当然前提是有重复值的情况下，如果没有重复值，其实这种办法是没有任何效果的。

将函数 F_SLOW 修改为休眠 2 秒

create or replace function f_slow(i_value in number)
return number as
begin
  dbms_lock.sleep(2);
  return i_value + 10;
end;
/

直接在 SELECT 中调用函数，并查看其执行时间

SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        13
        14
        15
        16
        17
        18

8 rows selected

Executed in 16.254 seconds

使用标量子查询，并查看其执行时间

SQL> select (select f_slow(id) from dual) from test;

(SELECTF_SLOW(ID)FROMDUAL)
--------------------------
                        11
                        12
                        13
                        14
                        15
                        16
                        17
                        18

8 rows selected

Executed in 16.158 seconds

上述执行结果中，使用标量子查询和未使用标量子查询的执行时间几乎一样，这是因为表 TEST 中没有重复值导致。

删除表 TEST 数据，构造重复数据

SQL> truncate table test;

Table truncated


SQL> insert into test 
select 1 from dual union all
select 2 from dual union all
select 2 from dual union all
select 3 from dual union all
select 3 from dual union all
select 3 from dual;

SQL> COMMIT;

直接在 SELECT 中调用函数，并查看其执行时间

SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        12
        13
        13
        13

6 rows selected

Executed in 12.178 seconds

使用标量子查询，并查看其执行时间

SQL> select (select f_slow(id) from dual) from test;

(SELECTF_SLOW(ID)FROMDUAL)
--------------------------
                        11
                        12
                        12
                        13
                        13
                        13

6 rows selected

Executed in 6.187 seconds

上述执行结果中，使用标量子查询的执行时间为 6s，而未使用标量子查询的执行时间为 12s，这是因为表 TEST 中存在重复值导致。

当使用标量子查询的时候，Oracle 会将子查询结果缓存在哈希表中，如果后续的记录出现同样的值，优化器通过存在哈希表中的值，判断重复值不用重复调用函数，直接使用上次计算结果即可。从而减少函数调用次数以达到优化性能的效果。

DETERMINISTIC 关键字

Oracle 通过关键字 DETERMINISTIC 来声明一个函数是确定性的，确定性函数可以用于创建基于函数的索引。这个仅仅能在 Oracle 10g 以后的版本中使用，它会影响函数如何缓存在 SQL 中。

新建带有 DETERMINISTIC 关键字的函数

create or replace function f_slow_derm(i_value in number)
return number
deterministic
as 
begin
  dbms_lock.sleep(2);
  return i_value + 10;
end;
/

直接在 SELECT 中调用不带 DETERMINISTIC 关键字的函数，并查看其执行时间

SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        12
        13
        13
        13

6 rows selected

Executed in 12.101 seconds

直接在 SELECT 中调用带 DETERMINISTIC 关键字的函数，并查看其执行时间

SQL> select f_slow_derm(id) from test;

F_SLOW_DERM(ID)
---------------
             11
             12
             12
             13
             13
             13

6 rows selected

Executed in 6.074 seconds

从上述结果可知，将函数标记为确定性函数确实可以提高性能，但 DETERMINISTIC 缓存受限于每次从服务器 fetch 多少数据，缓存仅在当前 fetch 的生命周期内有效，而标量子查询是当前查询内有效。另外，DETERMINISTIC 还受到数组大小的影响，数组大小（arraysize）的差异会产生截然不同的性能。

SQL> show arraysize;
arraysize 15


SQL> select f_slow_derm(id) from test;

F_SLOW_DERM(ID)
---------------
             11
             12
             12
             13
             13
             13

6 rows selected

Executed in 6.107 seconds


SQL> set arraysize 1;


SQL> select f_slow_derm(id) from test;

F_SLOW_DERM(ID)
---------------
             11
             12
             12
             13
             13
             13

6 rows selected

Executed in 12.067 seconds

arraysize 定义了一次返回到客户端的行数，当扫描了 arraysize 后，停止扫描，返回数据，然后继续扫描。这个过程就是 SET AUTOTRACE ON 统计信息中的 SQL*Net roundtrips to/from client。因为 arraysize 默认是 15 行，那么就有一个问题，因为我们一个 block 中的记录数一般都会超过 15 行，所以如果按照 15 行扫描一次，那么每次扫描要么仅扫描一个数据块，要么一个数据块重复扫描多次。

Function Result Cache

Oracle 11g 中引入了两个新的缓存机制。

跨会话 PL/SQL 函数结果缓存

跨会话 PL/SQL 函数结果缓存（Cross-Session PL/SQL Function Result Cache）：用于缓存函数调用的结果。

查询结果缓存

查询结果缓存（Query Result Cache）：用于缓存查询产生的整个结果集。

Oracle 11g 提供的 PL/SQL 函数的缓存机制（对于不同的会话之间可以共用），下面我们使用第一种机制进行查询结果的缓存，如下所示，指定关键词 RESULT_CACHE 来启用 Cross-Session PL/SQL Function Result Cache。

创建函数，并指定关键词 RESULT_CACHE

create or replace function f_slow(i_value in number)
return number
result_cache
as
begin
  dbms_lock.sleep(2);
  return i_value + 10;
end;
/

重建表 TEST 的数据

SQL> truncate table test;

Table truncated


declare
  row_index number;
begin
  row_index := 1;
  while row_index <= 8 loop  
    insert into test select row_index from dual;
    row_index := row_index + 1;
  end loop;
  commit;
end;
/

新建一个 session，记作 session1，在 session1 中调用函数 f_slow()，测试 SQL 性能

SQL> select * from v$mystat where rownum = 1;

       SID STATISTIC#      VALUE
---------- ---------- ----------
       195          0          0


SQL> set timing on;


SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        13
        14
        15
        16
        17
        18

8 rows selected

Executed in 10.147 seconds

从上述结果可知，在 session1 中，其执行时间约为 10s。

新建一个 session，记作 session2，在 session2 中调用函数 f_slow()，测试 SQL 性能

SQL> select * from v$mystat where rownum = 1;

       SID STATISTIC#      VALUE
---------- ---------- ----------
        67          0          0


SQL> set timing on;


SQL> select f_slow(id) from test;

F_SLOW(ID)
----------
        11
        12
        13
        14
        15
        16
        17
        18

8 rows selected

Executed in 0.068 seconds

从上述结果可知，在 session2 中，其执行时间约为 0.06s。

从 session1 和 session2 的执行结果可知，使用上述方式的缓存信息可以被其他的会话使用，它依赖的对象是自动管理的。也就是说，数据库会帮我们自动缓存，从而当其他会话调用时，使用相关缓存结果来减少函数调用次数，从而达到提高性能效果。

参考资料

https://www.cnblogs.com/kerrycode/p/9099507.html

http://blog.chinaunix.net/uid-22948773-id-3354466.html?utm_source=jiancool

原创文章，转载请注明出处：http://www.opcoder.cn/article/47/