Collected posts: Oracle, SQL, PL/SQL, Performance, Security...(More than 300 blogs)

dimanche 1 novembre 2020

Missing rows under dba_hist_sqlstat : is it a BUG?

         During monitoring of PROD long running reporting… I identified a query taking more than 17 000 sec

The next day when I generated AWR covering its execution period, I found that this query  had taken 2600 sec. I have checked dba_hist_sqlstat , there is only one row / one snap. The delta elapsed time was 2600 sec but the total elapsed time was exact / AWR report is based on delta columns.

Under dba_hist_active_sess_history, I get the exact number of snap_id, in fact
, there are many missing rows under dba_hist_sqlstat .

        I have identified the root cause. During these snapshots , we don’t have a final execution plan ( it was a complex query with more than 14 tables), it  was executed with an adaptive execution plan.

I identified that this behavior happened with dozens of queries.  I cannot share with you the real issue, I succeed in reproducing it.

1
2
3
4
5
6
7
8
9
Tested under 12.1.0.2.0 and 19.8
optimizer_adaptive_features =false
optimizer_adaptive_plans =true

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0      Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production
*/
1
2
3
4
5
6
---------------------------------------------------------------------------
BEGIN
  DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(interval  => 10);         
-- Minutes. Current value retained if NULL.
END;
---------------------------------------------------------------------------

TEST CASE

with LiveSQL Test Case:

1
2
3
4
5
6
7
8
---------------------------------------------------------------------------
 
select a.ELAPSED_TIME,a.executions from gv$sqlstats a where
sql_id='6daxqtyukunhf';
 
ELAPSED_TIME   EXECUTIONS
1964510271      1
---------------------------------------------------------------------------
1
2
3
4
5
6
7
8
select distinct snap_id
from dba_hist_active_sess_history where sql_id='6daxqtyukunhf';
SNAP_ID
52546
52545
52544
52543
52542
1
2
3
select a.snap_id,round(a.ELAPSED_TIME_TOTAL/1e6) ELAPSED_TIME_TOTAL,
round(a.ELAPSED_TIME_DELTA /1e6) ELAPSED_TIME_DELTA
from dba_hist_sqlstat a where sql_id=’6daxqtyukunhf’

SNAP_IDELAPSED
_TIME_TOTAL
ELAPSED_
TIME_DELTA
525431138602
525441739600
525451965226
  1428

AWR

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     52542 01-Nov-20 14:40:27       204       2.0
  End Snap:     52546 01-Nov-20 15:10:29       199       2.0
   Elapsed:               30.03 (mins)
   DB Time:               47.84 (mins)
 
 
SQL ordered by Elapsed Time                               Snaps: 52542-52546
-> Captured SQL account for   64.3% of Total DB Time (s):           2,870
-> Captured PL/SQL account for    0.3% of Total DB Time (s):           2,870
 
        Elapsed                  Elapsed Time                                  
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id   
---------------- -------------- ------------- ------ ------ ------ -------------
         1,428.4              0           N/A   49.8   99.9     .0 6daxqtyukunhf
Module: PL/SQL Developer
SELECT /*+optimizer_adaptive_plans_True1 */ count(CHB_t1.pad), count(CHB_t2.pad)
FROM CHB_t1, CHB_t2 WHERE CHB_t1.id = CHB_t2.id AND CHB_t1.n = 666 AND burn
_cpu(CHB_t1.id/CHB_t1.id) = 1

I have worked on many cases where « wall clock time » >>  real execution time
of the query. ( file generation taking time par example) .

In that case , sum(DBA_HIST_SQLSTAT.ELAPSED_TIME_DELTA)
=MAX(DBA_HIST_SQLSTAT.ELAPSED_TIME_TOTAL
)  and MAX(DBA_HIST_SQLSTAT.ELAPSED_TIME_TOTAL) << « wall clock time »
I assumed that this query was executed one time.

In my case , the query is executed via PL/SQL function.

Another finding, the insertion into wrh$_sqlstat (DBA_HIST_SQLSTAT) is
based on V$SQLAREA_PLAN_HASH . When the query is under execution , there is
no row into V$SQLAREA_PLAN_HASH

1
2
3
4
5
6
7
insert into wrh$_sqlstat SELECT ...
     FROM X$KEWRSQLIDTAB sie, X$KGLCURSOR_CHILD_SQLIDPH sql
   WHERE (1 = 1)
     AND sie.sqlid_kewrsie = sql.kglobt03
     AND nlssort(sie.sqlid_kewrsie, 'nls_sort = binary') =
         nlssort(sql.kglobt03, 'nls_sort = binary')
     AND sie.pdbid_kewrsie = decode(sql.con_id, NULL, 1, 0, 1, sql.con_id)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Plan hash value: 1821829401
  
--------------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |               |       |       |    50 (100)|          |
|     1 |  SORT AGGREGATE                |               |     1 |   214 |            |          |
|  *  2 |   HASH JOIN                    |               |     1 |   214 |    50   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |               |     1 |   214 |    50   (0)| 00:00:01 |
|-    4 |     NESTED LOOPS               |               |     1 |   214 |    50   (0)| 00:00:01 |
|-    5 |      STATISTICS COLLECTOR      |               |       |       |            |          |
|  *  6 |       TABLE ACCESS FULL        | CHB_T1        |     1 |   109 |    49   (0)| 00:00:01 |
|- *  7 |      INDEX UNIQUE SCAN         | SYS_C00604946 |     1 |       |     0   (0)|          |
|-    8 |     TABLE ACCESS BY INDEX ROWID| CHB_T2        |     1 |   105 |     1   (0)| 00:00:01 |
|     9 |    TABLE ACCESS FULL           | CHB_T2        |     1 |   105 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
  
Predicate Information (identified by operation id):
---------------------------------------------------
  
   2 - access("CHB_T1"."ID"="CHB_T2"."ID")
   6 - filter(("CHB_T1"."N"=666 AND "BURN_CPU"("CHB_T1"."ID"/"CHB_T1"."ID")=1))
   7 - access("CHB_T1"."ID"="CHB_T2"."ID")
  
Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

This should be a bug. Disable optimizer_adaptive_plan fix this issue…

Let’s wait for oracle analyze…