ABAP Cloud - Parallel processing
The CL_ABAP_PARALLEL class has been around for a while and is also available in ABAP Cloud. In this article you will learn how to use it and what effects it has.
In this article we will take a closer look at the two variants of CL_ABAP_PARALLEL and how you can use them. The documentation for the class is already quite extensive, but unfortunately it does not explain every detail, so we fill in some of the gaps here.
Introduction
In many situations, processing takes too long in a single process simply because there are too many records. It then usually helps to spread the work across several processes so that the mass of data is processed more quickly. In the past, you could outsource the processing to a function module, but you had to take care of the handling of the individual sessions yourself. The class CL_ABAP_PARALLEL now takes over this part and provides two scenarios.
Preparation
In this article, we'll use a few objects to offload most of the coding and limit ourselves to the essential implementations. We use the data model from our Core Data Service training to implement some of the logic. However, we would like to point out that the source code and accesses are not optimized and are only used to generate some runtime.
CLASS zcl_bs_demo_para_data DEFINITION
PUBLIC FINAL
CREATE PUBLIC.
PUBLIC SECTION.
TYPES td_packed TYPE p LENGTH 15 DECIMALS 3.
TYPES ts_partner TYPE zbs_dmo_partner.
TYPES tt_partner TYPE STANDARD TABLE OF ts_partner WITH EMPTY KEY.
TYPES: BEGIN OF ts_result,
partner TYPE zbs_dmo_partner-partner,
name TYPE zbs_dmo_partner-name,
headers TYPE i,
positions TYPE i,
pos_per_head TYPE td_packed,
start_time TYPE utclong,
end_time TYPE utclong,
END OF ts_result.
TYPES tt_result TYPE STANDARD TABLE OF ts_result WITH EMPTY KEY.
METHODS get_partners
RETURNING VALUE(rt_result) TYPE tt_partner.
METHODS get_result_from_partner
IMPORTING is_partner TYPE ts_partner
RETURNING VALUE(rs_result) TYPE ts_result.
PRIVATE SECTION.
METHODS get_number_of_headers
IMPORTING is_partner TYPE ts_partner
RETURNING VALUE(rd_result) TYPE i.
METHODS get_number_of_positions
IMPORTING is_partner TYPE ts_partner
RETURNING VALUE(rd_result) TYPE i.
METHODS get_positions_per_head
IMPORTING is_partner TYPE ts_partner
RETURNING VALUE(rd_result) TYPE zcl_bs_demo_para_data=>td_packed.
ENDCLASS.
CLASS zcl_bs_demo_para_data IMPLEMENTATION.
METHOD get_partners.
SELECT FROM zbs_dmo_partner
FIELDS *
WHERE partner BETWEEN '1000000000' AND '1000000006'
INTO TABLE @rt_result.
ENDMETHOD.
METHOD get_result_from_partner.
DATA ls_result TYPE ts_result.
ls_result-start_time = utclong_current( ).
ls_result-partner = is_partner-partner.
ls_result-name = is_partner-name.
ls_result-headers = get_number_of_headers( is_partner ).
ls_result-positions = get_number_of_positions( is_partner ).
ls_result-pos_per_head = get_positions_per_head( is_partner ).
ls_result-end_time = utclong_current( ).
RETURN ls_result.
ENDMETHOD.
METHOD get_number_of_headers.
SELECT FROM zbs_dmo_invoice
FIELDS *
WHERE partner = @is_partner-partner
INTO TABLE @DATA(lt_head).
LOOP AT lt_head INTO DATA(ls_head).
rd_result += 1.
ENDLOOP.
ENDMETHOD.
METHOD get_number_of_positions.
SELECT FROM zbs_dmo_invoice
FIELDS *
WHERE partner = @is_partner-partner
INTO TABLE @DATA(lt_head).
LOOP AT lt_head INTO DATA(ls_head).
SELECT FROM zbs_dmo_position
FIELDS *
WHERE document = @ls_head-document
INTO TABLE @DATA(lt_position).
LOOP AT lt_position INTO DATA(ls_position).
rd_result += 1.
ENDLOOP.
ENDLOOP.
ENDMETHOD.
METHOD get_positions_per_head.
DATA lt_count TYPE STANDARD TABLE OF i WITH EMPTY KEY.
SELECT FROM zbs_dmo_invoice
FIELDS *
WHERE partner = @is_partner-partner
INTO TABLE @DATA(lt_head).
LOOP AT lt_head INTO DATA(ls_head).
SELECT FROM zbs_dmo_position
FIELDS *
WHERE document = @ls_head-document
INTO TABLE @DATA(lt_position).
DATA(ld_count) = 0.
LOOP AT lt_position INTO DATA(ls_position).
ld_count += 1.
ENDLOOP.
INSERT ld_count INTO TABLE lt_count.
ENDLOOP.
DATA(ld_sum) = 0.
LOOP AT lt_count INTO ld_count.
ld_sum += ld_count.
ENDLOOP.
rd_result = ld_sum / lines( lt_count ).
ENDMETHOD.
ENDCLASS.
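There is also a class for measuring the runtime. In this case we use a simple TIMESTAMPL and calculate the difference. Keep in mind that this difference only corresponds to real seconds as long as both timestamps lie within the same minute, so treat the values as a rough indicator.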
CLASS zcl_bs_demo_runtime DEFINITION
PUBLIC FINAL
CREATE PUBLIC.
PUBLIC SECTION.
METHODS constructor.
METHODS get_diff
RETURNING VALUE(rd_result) TYPE timestampl.
PRIVATE SECTION.
DATA md_started TYPE timestampl.
METHODS get_timestampl
RETURNING VALUE(rd_result) TYPE timestampl.
ENDCLASS.
CLASS zcl_bs_demo_runtime IMPLEMENTATION.
METHOD constructor.
md_started = get_timestampl( ).
ENDMETHOD.
METHOD get_diff.
rd_result = get_timestampl( ) - md_started.
ENDMETHOD.
METHOD get_timestampl.
GET TIME STAMP FIELD rd_result.
ENDMETHOD.
ENDCLASS.
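A variant based on UTCLONG returns the elapsed seconds regardless of minute or hour boundaries. The following is a minimal sketch and not part of the original example: the class name ZCL_BS_DEMO_RUNTIME_UTC is hypothetical, and it assumes a release in which the built-in functions utclong_current and utclong_diff are available.

```abap
" Sketch only: ZCL_BS_DEMO_RUNTIME_UTC is a hypothetical name.
CLASS zcl_bs_demo_runtime_utc DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    METHODS constructor.

    " Returns the time elapsed since construction in seconds.
    METHODS get_diff
      RETURNING VALUE(rd_result) TYPE decfloat34.

  PRIVATE SECTION.
    DATA md_started TYPE utclong.
ENDCLASS.

CLASS zcl_bs_demo_runtime_utc IMPLEMENTATION.
  METHOD constructor.
    " Remember the start time as a UTC time stamp.
    md_started = utclong_current( ).
  ENDMETHOD.

  METHOD get_diff.
    " utclong_diff returns HIGH minus LOW in seconds.
    rd_result = utclong_diff( high = utclong_current( )
                              low  = md_started ).
  ENDMETHOD.
ENDCLASS.
```

Because the method keeps the name GET_DIFF, this class could replace ZCL_BS_DEMO_RUNTIME in the examples by changing only the NEW expression.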
Scenario 1 - Inheritance
In this scenario, our class inherits from CL_ABAP_PARALLEL and we implement the logic in the DO method. The disadvantage of this scenario is that we have to take care of packing and unpacking the payload ourselves, which means more steps are our responsibility.
Structure
The executing class or process looks like this. Our class inherits from CL_ABAP_PARALLEL and we redefine the DO method.
CLASS zcl_bs_demo_para_inheriting DEFINITION
PUBLIC
INHERITING FROM cl_abap_parallel FINAL
CREATE PUBLIC.
PUBLIC SECTION.
METHODS
do REDEFINITION.
ENDCLASS.
CLASS zcl_bs_demo_para_inheriting IMPLEMENTATION.
METHOD do.
DATA ls_partner TYPE zcl_bs_demo_para_data=>ts_partner.
DATA lt_result TYPE zcl_bs_demo_para_data=>tt_result.
CALL TRANSFORMATION id SOURCE XML p_in
RESULT in = ls_partner.
WAIT UP TO 1 SECONDS.
INSERT NEW zcl_bs_demo_para_data( )->get_result_from_partner( ls_partner ) INTO TABLE lt_result.
CALL TRANSFORMATION id SOURCE out = lt_result
RESULT XML p_out.
ENDMETHOD.
ENDCLASS.
During processing, we first have to convert the binary stream in P_IN back into our structure to get the package to be processed. The WAIT is implemented for testing purposes only. After the actual processing of the package, we convert the result back into a binary stream and return it via P_OUT.
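The packing and unpacking can be tried out in isolation. The following sketch sends a structure through the identity transformation and back. The class name ZCL_BS_DEMO_PARA_ROUNDTRIP and the local structure TS_DEMO are hypothetical stand-ins, chosen so that the example does not depend on the demo tables.

```abap
" Sketch only: class name and TS_DEMO are hypothetical.
CLASS zcl_bs_demo_para_roundtrip DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.
ENDCLASS.

CLASS zcl_bs_demo_para_roundtrip IMPLEMENTATION.
  METHOD if_oo_adt_classrun~main.
    TYPES: BEGIN OF ts_demo,
             partner TYPE c LENGTH 10,
             name    TYPE string,
           END OF ts_demo.

    DATA ls_packed   TYPE ts_demo.
    DATA ls_unpacked TYPE ts_demo.
    DATA ld_binary   TYPE xstring.

    ls_packed = VALUE #( partner = '1000000000'
                         name    = `Demo` ).

    " Pack: the structure is bound to the node name IN and serialized into an XSTRING.
    CALL TRANSFORMATION id SOURCE in = ls_packed
                           RESULT XML ld_binary.

    " Unpack: the same node name IN has to be used, otherwise the target is not filled.
    CALL TRANSFORMATION id SOURCE XML ld_binary
                           RESULT in = ls_unpacked.

    " Writes the flag that states whether both structures are equal.
    out->write( xsdbool( ls_unpacked = ls_packed ) ).
  ENDMETHOD.
ENDCLASS.
```

The node names are the contract between caller and DO method: the caller packs with IN and reads OUT, while DO reads IN and packs OUT.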
Execution
If we now want to carry out the parallelization, we need some variables and objects in the first step. We use the LT_IN table to set up the individual work packages that are then to be processed in parallel.
DATA ld_in TYPE xstring.
DATA lt_in TYPE cl_abap_parallel=>t_in_tab.
DATA lt_out TYPE zcl_bs_demo_para_data=>tt_result.
DATA lt_result TYPE zcl_bs_demo_para_data=>tt_result.
DATA(lo_timer) = NEW zcl_bs_demo_runtime( ).
DATA(lo_data) = NEW zcl_bs_demo_para_data( ).
DATA(lo_parallel) = NEW zcl_bs_demo_para_inheriting( p_num_tasks = 3 ).
For testing purposes, we create a timer object for measurement, our data object, which provides us with the worklist, and our parallelization object, which is then supposed to take over the tasks. In the next step, we create the work packages by converting the structure into an XML binary and attaching it to the LT_IN table.
LOOP AT lo_data->get_partners( ) INTO DATA(ls_partner).
CALL TRANSFORMATION id SOURCE in = ls_partner
RESULT XML ld_in.
INSERT ld_in INTO TABLE lt_in.
ENDLOOP.
Now we execute the RUN method and pass the individual data packages. Each row in the table is later processed as a separate task, and the number of processes configured in the constructor limits how many tasks run at the same time.
lo_parallel->run( EXPORTING p_in_tab = lt_in
IMPORTING p_out_tab = DATA(lt_out_tab) ).
Once the method is finished, all packets have been processed and we receive the result in the table LT_OUT_TAB. Finally, all we have to do is unpack the binary and merge the result.
LOOP AT lt_out_tab ASSIGNING FIELD-SYMBOL(<ld_out>).
CALL TRANSFORMATION id SOURCE XML <ld_out>-result
RESULT out = lt_out.
INSERT LINES OF lt_out INTO TABLE lt_result.
ENDLOOP.
Scenario 2 - Interface IF_ABAP_PARALLEL
The first scenario was already very complex because we had to take care of unpacking and packing the data ourselves each time. It's much easier with this scenario. As you can see, less explanation is necessary here.
Structure
The class for the process looks like this. This time we receive our data package via the constructor and manage the result within the class. To do this, we need to implement the IF_ABAP_PARALLEL interface and put the actual logic in the DO method.
CLASS zcl_bs_demo_para_task DEFINITION
PUBLIC FINAL
CREATE PUBLIC.
PUBLIC SECTION.
INTERFACES if_abap_parallel.
METHODS constructor
IMPORTING is_partner TYPE zcl_bs_demo_para_data=>ts_partner.
METHODS get_result
RETURNING VALUE(rt_result) TYPE zcl_bs_demo_para_data=>tt_result.
PRIVATE SECTION.
DATA ms_partner TYPE zcl_bs_demo_para_data=>ts_partner.
DATA mt_result TYPE zcl_bs_demo_para_data=>tt_result.
ENDCLASS.
CLASS zcl_bs_demo_para_task IMPLEMENTATION.
METHOD constructor.
ms_partner = is_partner.
ENDMETHOD.
METHOD if_abap_parallel~do.
WAIT UP TO 1 SECONDS.
INSERT NEW zcl_bs_demo_para_data( )->get_result_from_partner( ms_partner ) INTO TABLE mt_result.
ENDMETHOD.
METHOD get_result.
RETURN mt_result.
ENDMETHOD.
ENDCLASS.
Here we use the same logic as in the first scenario, but we no longer need to transform the data.
Execution
In this example we also need some variables, but not as many as in the first case.
DATA lt_processes TYPE cl_abap_parallel=>t_in_inst_tab.
DATA lt_result TYPE zcl_bs_demo_para_data=>tt_result.
DATA(lo_timer) = NEW zcl_bs_demo_runtime( ).
DATA(lo_data) = NEW zcl_bs_demo_para_data( ).
In this case, the data packages are the individual instances that we append to the LT_PROCESSES table. In our example, we create one instance for each partner read from the database.
LOOP AT lo_data->get_partners( ) INTO DATA(ls_partner).
INSERT NEW zcl_bs_demo_para_task( ls_partner ) INTO TABLE lt_processes.
ENDLOOP.
We start the parallelization using the RUN_INST method and, after completion, we get the result in the variable LT_FINISHED.
NEW cl_abap_parallel( p_num_tasks = 3 )->run_inst( EXPORTING p_in_tab = lt_processes
IMPORTING p_out_tab = DATA(lt_finished) ).
Now we just have to collect the results from the individual instances and get the result of the processing.
LOOP AT lt_finished INTO DATA(ls_finished).
INSERT LINES OF CAST zcl_bs_demo_para_task( ls_finished-inst )->get_result( ) INTO TABLE lt_result.
ENDLOOP.
Full example
The complete example of the executable class for the two scenarios can be found here.
CLASS zcl_bs_demo_para_start DEFINITION
PUBLIC FINAL
CREATE PUBLIC.
PUBLIC SECTION.
INTERFACES if_oo_adt_classrun.
PRIVATE SECTION.
METHODS start_scenario_1
IMPORTING io_out TYPE REF TO if_oo_adt_classrun_out.
METHODS start_scenario_2
IMPORTING io_out TYPE REF TO if_oo_adt_classrun_out.
ENDCLASS.
CLASS zcl_bs_demo_para_start IMPLEMENTATION.
METHOD if_oo_adt_classrun~main.
out->write( 'Scenario 1 - Inheritance' ).
start_scenario_1( out ).
out->write( '-' ).
out->write( 'Scenario 2 - Interface' ).
start_scenario_2( out ).
ENDMETHOD.
METHOD start_scenario_1.
DATA ld_in TYPE xstring.
DATA lt_in TYPE cl_abap_parallel=>t_in_tab.
DATA lt_out TYPE zcl_bs_demo_para_data=>tt_result.
DATA lt_result TYPE zcl_bs_demo_para_data=>tt_result.
DATA(lo_timer) = NEW zcl_bs_demo_runtime( ).
DATA(lo_data) = NEW zcl_bs_demo_para_data( ).
DATA(lo_parallel) = NEW zcl_bs_demo_para_inheriting( p_num_tasks = 3 ).
LOOP AT lo_data->get_partners( ) INTO DATA(ls_partner).
CALL TRANSFORMATION id SOURCE in = ls_partner
RESULT XML ld_in.
INSERT ld_in INTO TABLE lt_in.
ENDLOOP.
lo_parallel->run( EXPORTING p_in_tab = lt_in
IMPORTING p_out_tab = DATA(l_out_tab) ).
LOOP AT l_out_tab ASSIGNING FIELD-SYMBOL(<l_out>).
CALL TRANSFORMATION id SOURCE XML <l_out>-result
RESULT out = lt_out.
INSERT LINES OF lt_out INTO TABLE lt_result.
ENDLOOP.
io_out->write( lo_timer->get_diff( ) ).
io_out->write( lt_result ).
ENDMETHOD.
METHOD start_scenario_2.
DATA lt_processes TYPE cl_abap_parallel=>t_in_inst_tab.
DATA lt_result TYPE zcl_bs_demo_para_data=>tt_result.
DATA(lo_timer) = NEW zcl_bs_demo_runtime( ).
DATA(lo_data) = NEW zcl_bs_demo_para_data( ).
LOOP AT lo_data->get_partners( ) INTO DATA(ls_partner).
INSERT NEW zcl_bs_demo_para_task( ls_partner ) INTO TABLE lt_processes.
ENDLOOP.
NEW cl_abap_parallel( p_num_tasks = 3 )->run_inst( EXPORTING p_in_tab = lt_processes
IMPORTING p_out_tab = DATA(lt_finished) ).
LOOP AT lt_finished INTO DATA(ls_finished).
INSERT LINES OF CAST zcl_bs_demo_para_task( ls_finished-inst )->get_result( ) INTO TABLE lt_result.
ENDLOOP.
io_out->write( lo_timer->get_diff( ) ).
io_out->write( lt_result ).
ENDMETHOD.
ENDCLASS.
If you run the example, each scenario writes the measured runtime to the console, followed by the table of results.
Settings
What do the settings for the classes actually look like? The constructor of the class CL_ABAP_PARALLEL offers some setting options to control parallelization and system utilization. Via the ABAP Docs, opened in Eclipse using F2, we get a pretty good explanation of the settings, but unfortunately not how the parameters work together.
If P_PERCENTAGE is supplied, this value is used to calculate the number of processes to be used. Values from 0 to 100 are possible and state what percentage of the usable processes is used for parallelization. P_NUM_TASKS specifies a fixed number of processes that may be used for parallelization. If both values are specified, the percentage value is used.
Here is an example for the parameter P_NUM_TASKS, in which we make 1, 3 and 9 processes available for processing. The more processes we create, the faster the throughput time for determining the results will be. As you can see from the start times, the different packages are started at different times or, in the last case, all at the same time.
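The effect of P_NUM_TASKS on the demo can be estimated with a little arithmetic. The sketch below is a rough model only: it assumes that every task takes about one second (the WAIT in the DO method), that all seven partners from GET_PARTNERS exist, and that the configured processes are actually free. The class name ZCL_BS_DEMO_PARA_ESTIMATE and the method GET_ROUNDS are hypothetical.

```abap
" Sketch only: rough model, class and method names are hypothetical.
CLASS zcl_bs_demo_para_estimate DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.

  PRIVATE SECTION.
    " Number of rounds needed when ID_PROCESSES tasks can run at the same time.
    METHODS get_rounds
      IMPORTING id_tasks         TYPE i
                id_processes     TYPE i
      RETURNING VALUE(rd_result) TYPE i.
ENDCLASS.

CLASS zcl_bs_demo_para_estimate IMPLEMENTATION.
  METHOD if_oo_adt_classrun~main.
    " Seven tasks with 1, 3 and 9 processes.
    out->write( get_rounds( id_tasks = 7 id_processes = 1 ) ).
    out->write( get_rounds( id_tasks = 7 id_processes = 3 ) ).
    out->write( get_rounds( id_tasks = 7 id_processes = 9 ) ).
  ENDMETHOD.

  METHOD get_rounds.
    " Integer ceiling of ID_TASKS / ID_PROCESSES.
    rd_result = ( id_tasks + id_processes - 1 ) DIV id_processes.
  ENDMETHOD.
ENDCLASS.
```

Under these assumptions the model yields 7, 3 and 1 rounds of roughly one second each. The measured runtimes will be higher because the database accesses and the start of the processes take additional time.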
You can also set a debug flag (P_DEBUG) when calling the RUN and RUN_INST methods. If you do this, all tasks are processed one after the other and nothing is executed via RFC. This allows you to debug the processing and look for errors.
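Applied to scenario 2, the call could look like the following sketch. It is meant to be placed in a method such as START_SCENARIO_2 and assumes that P_DEBUG accepts ABAP_TRUE. Everything else is taken from the example above.

```abap
" Sketch: call from scenario 2 with the debug flag set.
" Assumption: P_DEBUG accepts ABAP_TRUE.
DATA lt_processes TYPE cl_abap_parallel=>t_in_inst_tab.

LOOP AT NEW zcl_bs_demo_para_data( )->get_partners( ) INTO DATA(ls_partner).
  INSERT NEW zcl_bs_demo_para_task( ls_partner ) INTO TABLE lt_processes.
ENDLOOP.

" With the flag set, the tasks run one after the other in the current session,
" so a breakpoint in IF_ABAP_PARALLEL~DO can be reached.
NEW cl_abap_parallel( p_num_tasks = 3 )->run_inst( EXPORTING p_in_tab  = lt_processes
                                                             p_debug   = abap_true
                                                   IMPORTING p_out_tab = DATA(lt_finished) ).
```

Since the tasks then run sequentially, the runtime in debug mode says nothing about the later parallel execution, and the flag should be removed again after troubleshooting.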
Conclusion
We recommend using scenario 2 as it is simpler and can be implemented with less code. We hope this article helps you to parallelize processes easily and thus increase performance in your next project. The class shown is also available in older releases, back to 7.50, but may differ in some details there.
Further information:
SAP Blog - CL_ABAP_PARALLEL
Blog from Sascha Wächter