Whenever you have a scalar subquery in Oracle, there are a few different options Oracle can use for executing it. Oracle can run the subquery once for each row returned by the FROM clause. Alternatively, in later versions, Oracle can unnest a scalar subquery and treat it like a table in the FROM clause. This means the scalar subquery won’t be called over and over again.

If a scalar subquery isn’t unnested (meaning it gets executed once for each row returned from the FROM clause), Oracle has some nice caching features. Tom Kyte wrote an article describing how they work.

How Scalar Subquery Caching Works

The article says that caching works like this:

  • Oracle keeps a fixed number of hash buckets: 255 for Oracle 10g and 11g.
  • When Oracle runs the subquery, it hashes the input parameters to one of those buckets.
  • If the input parameters hash to a bucket that is free (no value occupies it yet), Oracle caches the input parameters and the resulting value in that bucket.
  • If the input parameters hash to a bucket that is already occupied, Oracle does not cache the value in the lookup hash table.
  • Oracle also keeps track of the most recent execution: it remembers the last input parameters and the last result. So even if Oracle doesn’t store a value in the lookup hash table, we can still benefit from caching if we call the scalar subquery with the same input parameters over and over again consecutively.
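To make the lookup logic concrete, here is a toy Python model of the scheme described above. This is purely an illustration of the described behavior, not Oracle internals: the hash function and the bucket count are stand-ins, and `ScalarSubqueryCache` is a hypothetical name.

```python
class ScalarSubqueryCache:
    """Toy model of the caching scheme described above (not Oracle internals)."""

    def __init__(self, compute, num_buckets=255):
        self.compute = compute          # the "subquery" being cached
        self.num_buckets = num_buckets  # 255 in 10g/11g, per the article
        self.buckets = {}               # bucket index -> (inputs, result); never replaced
        self.last = None                # (inputs, result) of the most recent real call
        self.calls = 0                  # how many times compute actually ran

    def lookup(self, *inputs):
        b = hash(inputs) % self.num_buckets   # stand-in hash; Oracle's is undocumented
        # 1. Check the hash-bucket cache.
        if b in self.buckets and self.buckets[b][0] == inputs:
            return self.buckets[b][1]
        # 2. Check the "last call" memo.
        if self.last is not None and self.last[0] == inputs:
            return self.last[1]
        # 3. Cache miss: actually run the subquery.
        self.calls += 1
        result = self.compute(*inputs)
        self.last = (inputs, result)
        # Store only if the bucket is still free; occupied buckets are never replaced.
        if b not in self.buckets:
            self.buckets[b] = (inputs, result)
        return result
```

Calling `lookup` twice with the same inputs runs the underlying function only once, whether the repeat is served by a bucket or by the last-call memo.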

Great article, right? Tons of valuable information!!

The problem is this: 11g is now officially old, and many people are moving to newer versions of Oracle. So the question is: how many cache buckets does Oracle allow in more recent versions, say 19c? Is it still 255?

To figure this out, I put together a test.

Setting Up the Test

First, let’s make a table where we’ll keep track of how many times the scalar subquery will be called.

create table test_subq_call
(
     id number constraint pk_test_subq_call primary key
   , numb_funct_calls number default 0 not null
);

Next, write a function that will be called from our scalar subquery. It uses an autonomous transaction so its counter update can be committed independently of the query that calls it.

create or replace function test_subq_cache (p_n in number)
  return number
  authid definer
is
  pragma autonomous_transaction;
begin
 
  update test_subq_call
  set numb_funct_calls = numb_funct_calls + 1
  where id = test_subq_cache.p_n;
 
  if sql%rowcount != 1 then
    raise_application_error(-20001, 'Updated zero rows :(');
  end if;
 
  commit;
 
  return null;
 
end test_subq_cache;
/

Next, make a pipelined function that will generate a bunch of rows. I created this as a pipelined function (rather than a table) so I could control the exact ordering of the records: I want it to generate a list of numbers in a “2 steps forward, 1 step back” sort of way. So 1, 2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, etc.

create or replace type t_number is table of number;
/


create or replace function generate_rows
  return t_number
  pipelined
  authid definer
is
  pragma autonomous_transaction;
  CONST_MAX_NUMBER_OF_TESTS constant pls_integer := 1000000;
begin
 
  for v_rownum in 1 .. CONST_MAX_NUMBER_OF_TESTS + 1 loop
    insert into test_subq_call(id) values (v_rownum);
    commit;
   
    pipe row(v_rownum);
   
    if v_rownum > 1 then
      pipe row (v_rownum - 1);
    end if;
  end loop;
 
  return;
end generate_rows;
/
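As a point of reference, the same “2 steps forward, 1 step back” ordering can be sketched in Python. This is only an illustration of the sequence the pipelined function produces, not part of the test itself:

```python
def generate_rows(max_tests):
    """Yield ids in the same order as the PL/SQL pipelined function above:
    1, 2, 1, 3, 2, 4, 3, ... Each id after the first appears exactly twice,
    and never in two consecutive positions."""
    for i in range(1, max_tests + 2):   # mirrors CONST_MAX_NUMBER_OF_TESTS + 1
        yield i                         # 2 steps forward...
        if i > 1:
            yield i - 1                 # ...1 step back

print(list(generate_rows(8)))
# [1, 2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8]
```

Note that no id ever appears twice in a row, which matters later: it means the “last call” memo never helps in this test.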

Running the Test

From here, we can run the following query with a scalar subquery:

select a.column_value, (select test_subq_cache(a.column_value) from dual)
from table(generate_rows()) a;

Each time we call the scalar subquery, the function we created will go and update a record in the initial table we created.

Note that this takes a while to run. I could have made it more efficient…but why bother? One-time code, right?

This runs a million tests to see how many times the subquery actually gets called. We can take the “anticipated” number of subquery calls (two per id, since each id is requested twice) and subtract the actual number of calls. Because the same id never appears in consecutive rows, the last-call memo never fires; the only saved calls come from ids that landed in a free hash bucket on their first call and hit that bucket on their second. Each filled bucket saves exactly one call, so the difference should give us the total number of hash buckets.

select total_anticipated_func_calls - actual_func_calls number_of_caching_hash_buckets
from (select count(*) *2 total_anticipated_func_calls
           , sum(numb_funct_calls) actual_func_calls
      from test_subq_call
      where id <= 1000000);
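A quick Python simulation ties the whole argument together. It replays the row ordering against the caching model described in the article (a stand-in hash function and a chosen bucket count, not Oracle internals) and checks that “anticipated calls minus actual calls” really does come out equal to the number of buckets:

```python
def simulate(num_tests, num_buckets):
    """Replay the test against the described caching model.
    Returns anticipated calls minus actual calls, i.e. the measured bucket count."""
    buckets = {}            # bucket index -> cached id (occupied buckets never replaced)
    last = None             # id of the most recent actual call (the "last call" memo)
    calls = {}              # id -> number of times the "function" really ran

    def ids():              # same order as generate_rows: 1, 2, 1, 3, 2, 4, 3, ...
        for i in range(1, num_tests + 2):
            yield i
            if i > 1:
                yield i - 1

    for v in ids():
        b = v % num_buckets             # stand-in hash; Oracle's is undocumented
        if buckets.get(b) == v or last == v:
            continue                    # cache hit: the function is not called
        calls[v] = calls.get(v, 0) + 1  # cache miss: the function runs
        last = v
        if b not in buckets:            # only a free bucket caches the value
            buckets[b] = v

    anticipated = 2 * num_tests         # each id 1..num_tests is requested twice
    actual = sum(n for v, n in calls.items() if v <= num_tests)
    return anticipated - actual

print(simulate(num_tests=20000, num_buckets=1024))   # -> 1024
```

Under this model the measured difference equals the bucket count exactly, which is what makes the SQL above a plausible way to count Oracle’s buckets.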

For 19c, the answer looks like “there are 1024 hash buckets for scalar subquery caching”…at least, for the version I tested.
