There are two ways to tell if statistics are stale in Oracle. One way is to allow Oracle to tell you if it thinks the statistics are stale. The other way is to compare the statistics of what Oracle thinks a table looks like, to what the table actually looks like.

Note that statistics don’t have to be perfect to be good. They merely have to be “approximately right” for the data in the table. A good indication that your table statistics might be stale is if your queries start running slowly due to Oracle picking a bad plan. BUT there are plenty of other reasons Oracle might pick a bad plan, too:

  • Maybe your data violates “uniformity” assumptions that Oracle makes about your data. This includes things like missing histograms, missing extended statistics, correlated or anti-correlated joins.
  • Your query might be written in such a way that it becomes very difficult for Oracle to figure out how much data to expect.
  • Maybe Oracle does not have an accurate representation of how long it takes to complete one or more operations. For example, the system statistics could be wrong.
  • Your data set might not be represented in the statistics that Oracle has. This could include things like unlucky “dynamic sampling.” This can also occur with a HUGE, widely-varied table where the data skew cannot be accurately represented by a histogram.
  • Maybe indexes that Oracle might otherwise use have become invalid

How To Have Oracle Tell You If Stats are Stale

This is the easiest strategy. But if you use this method, you can’t know how stale your stats are. You only know whether or not a table has changed sufficiently that Oracle thinks that the tables should have statistics re-gathered on them.

To see if Oracle thinks the statistics on your table are stale, you want to look at the STALE_STATS column in DBA_STATISTICS.

select stale_stats
from dba_tab_statistics
where owner = 'TABLE_OWNER_GOES_HERE'
  and table_name = 'TABLE_NAME_GOES_HERE'

If the column returns “YES” Oracle believes that it’s time to re-gather stats. However, if the column returns “NO” then Oracle thinks that the statistics are up-to-date. If the column is null, then there are either no statistics at all, or incomplete statistics on the table. Be cautious if the query returns no rows, as you may have misspelled the table name or the table owner.

How to Check The Stats Yourself

If you check the stats yourself, it’s a great way to determine “how stale” your stats are. By manually checking, you can gather stats on the “stalest” tables first so that you’re making the fewest changes to your database. Doing this might allow you to avoid gathering statistics where it can cause contention (such as on a busy OLTP system).

Your objective is to compare the values that Oracle has with the actual values in the table. Usually, having statistics within 10% of the actual values is OK.

If we ignore more complex cases involving partitioning and indexes, there are two different types of statistics that we need to check. We need to verify table level statistics and column level statistics.

Table Level Statistics

Table level statistics include things like:

  • The number of rows in the table
  • The quantity of blocks the table occupies
  • The number of empty blocks in the table

You can verify these values using queries:

-- Verify the number of rows in a table
-- You can verify this value against DBA_TAB_STATISTICS.NUM_ROWS
select count(*)

-- This query should return the number of "occupied blocks" in a table, so it should be very close to DBA_TAB_STATISTICS.BLOCKS - DBA_TAB_STATISTICS.EMPTY_BLOCKS
select count(distinct substr(rowid, 7, 9))

Column Level Statistics

Column Level statistics include things like:

  • The number of distinct values in a column
  • The high and low values of a column
  • The number of nulls in a column

Distinct values is useful for expressions like COLUMN = <SOMETHING>.

-- Below can be compared to DBA_TAB_COL_STATISTICS.NUM_DISTINCT

select count(distinct MY_COLUMN)

-- Or, if you have a large table and are on 11r2 or later, you might
-- use APPROX_COUNT_DISTINCT if COUNT(DISTINCT ...) takes too long

The high and low values of a column are useful for things like range-based predicates, such as COLUMN <= <SOMETHING> or COLUMN between <START> and <END>. You can check this value with the following query:

-- This query will show you statistics on a particular column

with cte (x)
   select /*+ inline */ MY_COLUMN
select (select approx_count_distinct(x) from cte) distinct_values
     , (select count(*) - count(x) from cte) num_nulls
     , (select min(x) from cte) low_value
     , (select max(x) from cte) high_value
from dual

-- The values displayed in the query above (ACTUAL STATISTICS) can be compared to the values in the query below (ORACLE'S PERCEIVED STATISTICS):

select column_id
     , column_name
     , data_type
     , num_distinct num_vals
     , num_nulls
     , density dnsty
     , case when data_type = 'NUMBER' then to_char(utl_raw.cast_to_number(low_value))
            when data_type = 'VARCHAR2' then to_char(utl_raw.cast_to_varchar2(low_value))
            when data_type = 'NVARCHAR2' then to_char(utl_raw.cast_to_nvarchar2(low_value))
            when data_type = 'BINARY_DOUBLE' then to_char(utl_raw.cast_to_binary_double(low_value))
            when data_type = 'BINARY_FLOAT' then to_char(utl_raw.cast_to_binary_float(low_value))
            when data_type = 'DATE' or data_type like 'TIMESTAMP%' then rtrim(ltrim(to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
+ (to_number(substr(low_value,3,2),'XX')-100),'0000'))||'-'||
ltrim(to_char(to_number(substr(low_value,7,2),'XX'),'00'))||' '||
           else utl_raw.cast_to_varchar2(low_value)
           end low_value
     , case when data_type = 'NUMBER' then to_char(utl_raw.cast_to_number(high_value))
            when data_type = 'VARCHAR2' then to_char(utl_raw.cast_to_varchar2(high_value))
            when data_type = 'NVARCHAR2' then to_char(utl_raw.cast_to_nvarchar2(high_value))
            when data_type = 'BINARY_DOUBLE' then to_char(utl_raw.cast_to_binary_double(high_value))
            when data_type = 'BINARY_FLOAT' then to_char(utl_raw.cast_to_binary_float(high_value))
            when data_type = 'DATE' or data_type like 'TIMESTAMP%' then rtrim(ltrim(to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
+ (to_number(substr(high_value,3,2),'XX')-100),'0000'))||'-'||
ltrim(to_char(to_number(substr(high_value,7,2),'XX'),'00'))||' '||
           else utl_raw.cast_to_varchar2(high_value)
           end high_value
from dba_tab_cols
where owner      = 'TABLE_OWNER'
  and table_name = 'TABLE_YOU_WANT_TO_CHECK'
  and column_name = 'MY_COLUMN'

Hopefully this has been useful!

Free Oracle SQL Tuning Guide

Check out my FREE guide! 7 SQL Tuning Secrets You Can Use Immediately, Even If You’ve Never Tuned a Query In Your Life!

Get it here:

7 SQL Tuning Secrets You can Use Immediately, Even If You've Never Tuned A Query In Your Life

Leave a Reply

Your email address will not be published. Required fields are marked *