How To Do It In Parallel


I just came across an interesting question on the Stack Exchange forum, where a user was asking how to execute a set of PL/SQL procedures in parallel with each other. There really isn’t a construct in PL/SQL to accomplish this. There are ways to break data into chunks and using dbms_parallel_execute to run the same procedure to process the chunks in parallel, as described here on AskTom.oracle.com, but no direct way to execute otherwise unrelated procedures in parallel.

That is not to say that it can’t be done, however. To execute independent procedures in parallel, I can use an Oracle Scheduler Job Chain. Oracle Scheduler is an amazing job scheduler, but it can also be used to construct elaborate workflows with conditional logic to branch execution paths and so much more.

I’ll start by creating a set of test procedures. Each procedure will sleep for a distinct length of time so that we can tell for sure when each is starting and stopping. Together these procedures represent a total of 18 seconds of wait time.

create or replace package test as
    procedure test1;
    procedure test2;
    procedure test3;
    procedure test4;
end test;
/

create or replace package body test as
    procedure test1 is
    begin
        sys.dbms_session.sleep(3);
    end test1;
    
    procedure test2 is
    begin
        sys.dbms_session.sleep(4);
    end test2;
    
    procedure test3 is
    begin
        sys.dbms_session.sleep(5);
    end test3;

    procedure test4 is
    begin
        sys.dbms_session.sleep(6);
    end test4;
end test;
/

Next I’ll create Scheduler Programs for each procedure. A scheduler program is a call out from the job chain to a specific stored procedure, an anonymous PL/SQL block, or an external script or executable. I will use these programs as building blocks for my test chain, but I could also use the same individual programs in multiple different chains or scheduled jobs.

BEGIN
    DBMS_SCHEDULER.create_program(
        program_name => 'TEST1_PROGRAM',
        program_action => 'TEST.TEST1',
        program_type => 'STORED_PROCEDURE',
        number_of_arguments => 0,
        comments => NULL,
        enabled => FALSE);

    DBMS_SCHEDULER.ENABLE(name=>'TEST1_PROGRAM');    

    DBMS_SCHEDULER.create_program(
        program_name => 'TEST2_PROGRAM',
        program_action => 'TEST.TEST2',
        program_type => 'STORED_PROCEDURE',
        number_of_arguments => 0,
        comments => NULL,
        enabled => FALSE);

    DBMS_SCHEDULER.ENABLE(name=>'TEST2_PROGRAM');    

    DBMS_SCHEDULER.create_program(
        program_name => 'TEST3_PROGRAM',
        program_action => 'TEST.TEST3',
        program_type => 'STORED_PROCEDURE',
        number_of_arguments => 0,
        comments => NULL,
        enabled => FALSE);

    DBMS_SCHEDULER.ENABLE(name=>'TEST3_PROGRAM');    

    DBMS_SCHEDULER.create_program(
        program_name => 'TEST4_PROGRAM',
        program_action => 'TEST.TEST4',
        program_type => 'STORED_PROCEDURE',
        number_of_arguments => 0,
        comments => NULL,
        enabled => FALSE);

    DBMS_SCHEDULER.ENABLE(name=>'TEST4_PROGRAM');    
END;
/

Now I’ll create the Scheduler Chain itself. The chain is comprised of Steps and Rules. The steps point to my scheduler programs, and the rules dictate the order and conditions in which the steps are executed. In this case, I’d like to run procedures 1, 2, and 3 in parallel and run procedure 4 only after the first three have completed:

BEGIN
  -- one step for each program
  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_STEP (
     chain_name   => 'TEST_CHAIN'
    ,step_name    => 'CHAIN_STEP1'
    ,program_name => 'TEST1_PROGRAM');

  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_STEP (
     chain_name   => 'TEST_CHAIN'
    ,step_name    => 'CHAIN_STEP2'
    ,program_name => 'TEST2_PROGRAM');

  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_STEP (
     chain_name   => 'TEST_CHAIN'
    ,step_name    => 'CHAIN_STEP3'
    ,program_name => 'TEST3_PROGRAM');

  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_STEP (
     chain_name   => 'TEST_CHAIN'
    ,step_name    => 'CHAIN_STEP4'
    ,program_name => 'TEST4_PROGRAM');

  -- one rule with condition "true" to start each 
  --step immediately
  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_RULE  (
      CHAIN_NAME  => 'TEST_CHAIN',
      rule_name   => 'TEST_RULE1',
      condition   => 'TRUE',
      action      => 'START "CHAIN_STEP1"');   

  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_RULE  (
      CHAIN_NAME  => 'TEST_CHAIN',
      rule_name   => 'TEST_RULE2',
      condition   => 'TRUE',
      action      => 'START "CHAIN_STEP2"');   

  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_RULE  (
      CHAIN_NAME  => 'TEST_CHAIN',
      rule_name   => 'TEST_RULE3',
      condition   => 'TRUE',
      action      => 'START "CHAIN_STEP3"');   

  -- one rule to execute step 4 after steps 1, 2, 
  -- and 3 all complete
  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_RULE (
     chain_name   => 'TEST_CHAIN', 
     rule_name    => 'TEST_RULE4',
     condition    => 'CHAIN_STEP1 Completed AND CHAIN_STEP2 Completed AND CHAIN_STEP3 Completed',
     action       => 'START "CHAIN_STEP4"');

  -- one rule to close out the chain after all steps 
  -- are completed    
  SYS.DBMS_SCHEDULER.DEFINE_CHAIN_RULE (
     chain_name          => 'TEST_CHAIN',
     rule_name           => 'TEST_RULE5',
     condition           => 'CHAIN_STEP4 Completed',
     action              => 'END 0');

END;
/

This chain should now represent about 11 seconds of execution time. That is the amount of time it takes to run procedure 3 (the longest of the procedures running in parallel) and procedure 4. The chain flow now looks like this (as depicted by SQL Developer):

All I need to do now is create a Scheduler Job to run the chain, and set the job to log all the details of its run so that I can see exactly what happens:

BEGIN
    DBMS_SCHEDULER.CREATE_JOB (
            job_name => 'TEST_JOB',
            job_type => 'CHAIN',
            job_action => 'TEST_CHAIN',
            number_of_arguments => 0,
            start_date => NULL,
            repeat_interval => NULL,
            end_date => NULL,
            enabled => FALSE,
            auto_drop => FALSE,
            comments => '');

    DBMS_SCHEDULER.SET_ATTRIBUTE( 
             name => 'TEST_JOB', 
             attribute => 'logging_level', 
             value => DBMS_SCHEDULER.LOGGING_RUNS);

END;
/

Now I run the job, which in turn runs the chain, which executes the steps which call the programs and stored procedures, according to the rules:

BEGIN
    DBMS_SCHEDULER.RUN_JOB (
        job_name => 'TEST_JOB', 
        USE_CURRENT_SESSION => FALSE);
END;
/

When the job completes, I can look at the job run details for the job:

select * from user_scheduler_job_run_details;
LOG_IDLOG_DATEJOB_NAMEJOB_SUBNAMESTATUSACTUAL_START_DATERUN_DURATION
156414-JUN-20 01.05.17.241251000 AM -04:00TEST_JOBCHAIN_STEP1SUCCEEDED14-JUN-20 01.05.14.181854000 AM AMERICA/NEW_YORK+00 00:00:03.000000
156214-JUN-20 01.05.18.199529000 AM -04:00TEST_JOBCHAIN_STEP2SUCCEEDED14-JUN-20 01.05.14.181536000 AM AMERICA/NEW_YORK+00 00:00:04.000000
156614-JUN-20 01.05.19.416132000 AM -04:00TEST_JOBCHAIN_STEP3SUCCEEDED14-JUN-20 01.05.14.199829000 AM AMERICA/NEW_YORK+00 00:00:05.000000
156814-JUN-20 01.05.25.558605000 AM -04:00TEST_JOBCHAIN_STEP4SUCCEEDED14-JUN-20 01.05.19.450690000 AM AMERICA/NEW_YORK+00 00:00:06.000000
157014-JUN-20 01.05.25.573513000 AM -04:00TEST_JOBSUCCEEDED14-JUN-20 01.05.14.067157000 AM AMERICA/NEW_YORK+00 00:00:12.000000

Looking at the results (with some columns removed here for brevity), I can make the following observations:

  • The job starts at “01.05.14.067157000” (ACTUAL_START_DATE, Line 1570).
  • Job steps 1 – 3 all start within a about a tenth of a second of the overall job start (as recorded in ACTUAL_START_DATE for each step in lines 1562, 1564, and 1566) and within about two hundredths of a second of each other, between “01.05.14.181536000” and “01.05.14.199829000”.
  • Running in parallel, Step 1 completes in 3 seconds, Step 2 in 4 seconds, and Step 3 in 5 seconds (as recorded in RUN_DURATION for each step), as expected. Note that the LOG_DATE column shows each step ending one second after the previous step.
  • Step 4 begins at “01.05.19.450690000”, just after Step 3 completes and satisfies all of the start requirements in the rule for Step 4. It takes 6 seconds to complete.
  • The overall job completes at “01.05.25.573513000” (LOG_DATE, Line 1570) with a total duration of 11.5 seconds (rounded to 12 seconds in the view) to complete all four steps. Note that rule processing adds a tiny bit of overhead to the run duration.

Using Scheduler Chains I can create complex workflows and automated processes – essentially structured programs – including parallel execution paths, very easily.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.