前言:
我們都知道在Oracle 8i後導入了所謂的CBO(cost base optimizer)最佳化處裡器,來自動判斷USER提交的SQL在系統內所走的execution plan,隨著CBO的越來越優化,所以相對於8i之前的版本,statistics的正確值就顯的非常的重要,在Oracle中進行Object的statistics的計算共有兩個工具:Analyze跟DBMS_STATS package,這一篇文章將會介紹Analyze的原理,以及如何利用前述的工具來獲得最新的object statistics。
為甚麼Analyze很重要?
Analyze可以收集物件的statistics以及決定SQL走的execution plan,而從statistics中我們也可以了解物件中data的特性,進而了解database object設計有沒有問題,例如:index是否建的恰當?
SQL> SELECT COLUMN_NAME, NUM_DISTINCT, NUM_NULLS, NUM_BUCKETS, DENSITY
FROM DBA_TAB_COL_STATISTICS
WHERE TABLE_NAME ="PA_EXPENDITURE_ITEMS_ALL"
ORDER BY COLUMN_NAME;
This returns the following data:
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY
------------------------------ ------------ ---------- ----------- ----------
BURDEN_COST 4300 71957 1 .000232558
BURDEN_COST_RATE 675 7376401 1 .001481481
CONVERTED_FLAG 1 16793903 1 1
COST_BURDEN_DISTRIBUTED_FLAG 2 15796 1 .5
COST_DISTRIBUTED_FLAG 2 0 1 .5
COST_IND_COMPILED_SET_ID 87 6153143 1 .011494253
EXPENDITURE_ID 1171831 0 1 8.5337E-07
TASK_ID 8648 0 1 .000115634
TRANSFERRED_FROM_EXP_ITEM_ID 1233787 15568891 1 8.1051E-07
像上面這個範例中,我們查詢的是table 中column的statistics值,其中, CONVERTED_FLAG欄位就表示值都有一種, EXPENDITURE_ID欄位就很適合建b-tree的index, COST_DISTRIBUTED_FLAG欄位很適合建bitmap的index。
如果從沒做過Analyze,CBO如何決策?
當沒有沒有做過analyze時,當系統收到SQL查詢時,會依照下列的預設值進行cost的計算來選擇要走的execution plan。
Default Table Values When Statistics are Missing
| Default Value Used by Optimizer |
| num_of_blocks * (block_size - cache_layer) / avg_row_len |
| |
|
|
| |
| |
Default Index Values When Statistics are Missing
| Default Value Used by Optimizer |
| |
| |
| |
| |
| |
| |
所以,一個有經過analyze跟沒有經過analyze所影響的統計值是非常大的,因此建議放入系統排程工作中,定期執行objecst的analyze。
Analyze table蒐集的資訊:
- Number of rows (NUM_ROWS)
- * Number of data blocks below the high water mark (that is, the number of data blocks that have been formatted to receive data, regardless whether they currently contain data or are empty) (BLOCKS)
- * Number of data blocks allocated to the table that have never been used (EMPTY_BLOCKS)
透過上述兩個參數值可以了解HWM水位是否太高,需不需要進行調整。
- Average available free space in each data block in bytes (AVG_SPACE)
- Number of chained rows (CHAIN_COUNT)
- Average row length, including the row's overhead, in bytes (AVG_ROW_LEN)
以上的資訊均可透過USER_TABLES
, ALL_TABLES
, and DBA_TABLES Views來查出。
Analyze index蒐集的資訊:
- *Depth of the index from its root block to its leaf blocks (BLEVEL)
- Number of leaf blocks (LEAF_BLOCKS)
- Number of distinct index values (DISTINCT_KEYS)
- Average number of leaf blocks for each index value (AVG_LEAF_BLOCKS_PER_KEY)
- Average number of data blocks for each index value (for an index on a table) (AVG_DATA_BLOCKS_PER_KEY)
- Clustering factor (how well ordered the rows are about the indexed values) (CLUSTERING_FACTOR)
以上的資訊均可透過USER_INDEXES
, ALL_INDEXES
, and DBA_INDEXES Views來查出。
Analyze的語法:
The following statement computes statistics for the sample table oe.orders
:
ANALYZE TABLE orders COMPUTE STATISTICS;
收集的資訊最完整,執行時的效能也最慢。
The following statement computes only system statistics on the sample table oe.orders
:
ANALYZE TABLE orders COMPUTE SYSTEM STATISTICS;
The following statement estimates (% or row counts) statistics for the sample table oe.
orders and all of its indexes:
ANALYZE TABLE orders ESTIMATE STATISTICS;
The following statement deletes statistics about the sample table oe.
orders and all its indexes from the data dictionary:
ANALYZE TABLE orders DELETE STATISTICS;
You can also collect histograms for a single partition of a table. The following statement analyzes partition sales_q2_2000 of the sample table sh.sales
:
ANALYZE TABLE sales PARTITION (sales_q2_2000) COMPUTE STATISTICS;
The following statement validates the structure of the sample index oe.inv_product_ix
:
ANALYZE INDEX inv_product_ix VALIDATE STRUCTURE;
The following statement analyzes the sample table hr.employees
and all of its indexes:
ANALYZE TABLE employees VALIDATE STRUCTURE CASCADE;
For a table, the VALIDATE
REF
UPDATE
clause verifies the REF
s in the specified table, checks the rowid portion of each REF
, and then compares it with the true rowid. If the result is an incorrect rowid, then the REF
is updated so that the rowid portion is correct.
The following statement validates the REF
s in the sample table oe.customers
:
ANALYZE TABLE customers VALIDATE REF UPDATE;
Customer reference 相關的table均 validate.
The following statements validates the structure of the sample table oe.customers
while allowing simultaneous DML:
ANALYZE TABLE customers VALIDATE STRUCTURE ONLINE;
For在Index及table analyze的用法如下:
For table:針對table 做analyze.
For all columns:針對table中的所有columns進行analyze.
For all indexed columns:針對table中有indexed的欄位進行analyze.
For all indexes:針對table中所有的index進行analyze.
DBMS_STATS Package
9i中導入了新的DBMS_STATS Package工具包,提供了更多面向的analyze功能。
DBMS_STAT所提供的功能:
Statistics Gathering Procedures in the DBMS_STATS Package
Procedure | Collects |
| |
| Table, column, and index statistics |
| Statistics for all objects in a schema |
| Statistics for all objects in a database |
| CPU and I/O statistics for the system |
※詳細的subprograms請詳閱附錄。
如何實際使用DBMS_STAT包來蒐集系統、物件參數?
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS (
interval => 720, --搜集從現在起720分鐘內的系統統計值。
stattab => 'mystats', --收集到的參數存放在'mystats table'中。
statid => 'OLTP'); --只是個平自用來識別用。
END;
實例:
SQL> analyze table employee compute statistics;
表已分析。
SQL> set autotrace on
SQL> select count(*) from employee ;
COUNT(*)
----------
299999
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1)
1 0 SORT (AGGREGATE)
2 1 BITMAP CONVERSION (COUNT)
3 2 BITMAP INDEX (FAST FULL SCAN) OF 'IDX_BMP_EMP_SEX'
Statistics
----------------------------------------------------------
153 recursive calls
0 db block gets
96 consistent gets
11 physical reads
0 redo size
370 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> analyze table employee delete statistics;
表已分析。
SQL> select count(*) from employee;
COUNT(*)
----------
299999
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'EMPLOYEE'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5418 consistent gets
3144 physical reads
0 redo size
370 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> analyze table employee ESTIMATE statistics;
表已分析。
SQL> select count(*) from employee;
COUNT(*)
----------
299999
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1)
1 0 SORT (AGGREGATE)
2 1 BITMAP CONVERSION (COUNT)
3 2 BITMAP INDEX (FAST FULL SCAN) OF 'IDX_BMP_EMP_SEX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12 consistent gets
8 physical reads
0 redo size
370 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> analyze table employee delete statistics;
表已分析。
SQL> exec DBMS_STATS.GATHER_table_STATS(ownname => user,tabname => 'employee', method_opt => 'for all indexed columns size 1');
PL/SQL 過程已成功完成。
SQL> select count(*) from employee;
COUNT(*)
----------
299999
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN) OF 'EMPLOYEE_PK' (UNIQUE) (Cost=4
Card=299999)
Statistics
----------------------------------------------------------
14 recursive calls
0 db block gets
680 consistent gets
670 physical reads
0 redo size
370 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> spool off
以上我們知道,不良的analyze會造成execution plan的選擇錯誤。
結論:
1. 對於Partition table,建議使用DBMS-STATS,而不是使用Analyze語句。
a) 可以parallel進行,對多個Owner,多個Table。
b) 可以得到Multi Partition的資料和Single Partition分區的資料。
c) 可以在不同級別上Compute Statistics:單個分區,子分區,全表,所有分區。
d) 可以export統計資訊。
e) 可以用戶自動收集統計資訊。
2. DBMS_STATS的缺點:
a) 不能Validate Structure。
b) 不能收集CHAINED ROWS, 不能收集CLUSTER TABLE的資訊,這兩個仍舊需要使用Analyze語句。
c) DBMS_STATS 預設不對索引進行Analyze,因為預設Cascade是False,需要手動指定為True。
3. 對於oracle 9裏面的External Table(包括data dictionary tables、external table、temporary table),Analyze不能使用,只能使用DBMS_STATS來收集資訊。
對這份刊物有任何的意見或是想給作者打氣嗎?歡迎隨時聯絡submarine@ntuh.gov.tw
附錄
DBMS_STATS Subprograms的所有subprogram資訊
| |
PREPARE_COLUMN_VALUES Procedure | Converts user-specified minimum, maximum, and histogram endpoint datatype-specific values into Oracle's internal representation for future storage using SET_COLUMN_STATS. |
| Sets column-related information. |
| Sets index-related information. |
| |
| Sets table-related information. |
| Convert the internal representation of a minimum or maximum value into a datatype-specific value. |
| Gets all column-related information. |
| Gets all index-related information. |
| Gets system statistics from stattab, or from the dictionary if stattab is null. |
| Gets all table-related information. |
| Deletes column-related statistics. |
| Deletes index-related statistics. |
| |
| Deletes table-related statistics. |
| Deletes schema-related statistics. |
DELETE_DATABASE_STATS Procedure | Deletes statistics for the entire database. |
| Creates a table with name stattab in ownname's schema which is capable of holding statistics. |
| Drops a user stat table created by CREATE_STAT_TABLE. |
| Retrieves statistics for a particular column and stores them in the user stat table identified by stattab. |
| Retrieves statistics for a particular index and stores them in the user stat table identified by stattab. |
| Retrieves system statistics and stores them in the user stat table. |
| Retrieves statistics for a particular table and stores them in the user stat table. |
| Retrieves statistics for all objects in the schema identified by ownname and stores them in the user stat table identified by stattab. |
EXPORT_DATABASE_STATS Procedure | Retrieves statistics for all objects in the database and stores them in the user stat table identified by statown.stattab. |
| Retrieves statistics for a particular column from the user stat table identified by stattab and stores them in the dictionary. |
| Retrieves statistics for a particular index from the user stat table identified by stattab and stores them in the dictionary. |
| Retrieves system statistics from the user stat table and stores them in the dictionary |
| Retrieves statistics for a particular table from the user stat table identified by stattab and stores them in the dictionary. |
| Retrieves statistics for all objects in the schema identified by ownname from the user stat table and stores them in the dictionary. |
IMPORT_DATABASE_STATS Procedure | Retrieves statistics for all objects in the database from the user stat table and stores them in the dictionary. |
| |
| Gathers table and column (and index) statistics. |
| Gathers statistics for all objects in a schema. |
GATHER_DATABASE_STATS Procedure | Gathers statistics for all objects in the database. |
| |
| Generates object statistics from previously collected statistics of related objects. |
FLUSH_SCHEMA_MONITORING_INFO Procedure | Flushes in-memory monitoring information for the tables in the specified schema in the dictionary. |
FLUSH_DATABASE_MONITORING_INFO Procedure | Flushes in-memory monitoring information for all the tables to the dictionary. |
ALTER_SCHEMA_TABLE_MONITORING Procedure | Enables or disables the DML monitoring feature of all tables in the schema, except for snapshot logs and the tables, which monitoring does not support. |
ALTER_DATABASE_TABLE_MONITORING Procedure | Enables or disables the DML monitoring feature of all tables in the database, except for snapshot logs and the tables, which monitoring does not support. |