Created by Software-Heroes

ABAP Cloud - Read XML

How can you read and process XML data relatively easily in ABAP Cloud? Let's look at an example and go through it step by step.

Introduction
Format
Parsing content

Reader
Contents
Attributes

Result
Complete example
Conclusion

In this article we want to read XML data and convert it into ABAP structures. To do this, we use classes released for ABAP Cloud and deliberately forgo transformations.

Introduction

Alongside JSON, XML is a standard format for interfaces. We should therefore at least have the tools at hand to convert such data.

Format

Let's first look at the basic components of the XML format, how it is structured and which elements are available to us. In the following example we want to discuss some elements of the XML file that are important later for the conversion.
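
For reference, the document embedded in the GET_XML method of the complete example has the following shape. It is formatted here for readability and taken unchanged from that method, so it is assumed to be the sample the format discussion refers to.

```xml
<?xml version="1.0" encoding="utf-8"?>
<swh:myroot xmlns:swh="http://software-heroes/swh" key="R1" description="Top node">
  <swh:title>This is a title</swh:title>
  <swh:description length="200" space="some"><![CDATA[My description is a bit longer]]></swh:description>
  <table>
    <item height="182cm">Bryan Jonnson</item>
    <item height="179cm">Linda Schwetzinger</item>
    <item height="162cm">Iana Petrova</item>
  </table>
  <swh:tags>People, Names, Data</swh:tags>
</swh:myroot>
```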

Let's now look at the different points:

Namespace - With the XML namespace (xmlns) we define the various namespaces used within our XML format. If we do not use a namespace, we do not need a definition of it.
Attribute - A tag can contain additional properties and attributes that transport further options and content.
CDATA - If you want to represent free text with unknown content in XML, wrap it in a CDATA section. This ensures, among other things, that special characters in the content (for example embedded HTML or other XML) do not corrupt the XML file.
Content - Between the opening and the closing tag you will find the content, which can be addressed under the name of the tag.
Tag - A tag always has a beginning and an end, and you will normally come across two variants: one with content, "<TAG>CONTENT</TAG>", and one without content, "<TAG/>". If the tag has a namespace, you will find its prefix in front of the tag name, separated by a colon.

Basically, there are still many small details to discover in an XML file or an XML stream, but this should be enough for us for processing.

Parsing content

If we now want to access the content, we could do this via a transformation in the system, for example. To be as flexible as possible, we instead use the released class CL_SXML_STRING_READER.

Reader

First of all, we need the content (XML) as a string or XString, which we can then use to instantiate the reader. We get the XML as a string from the GET_XML method, convert it to an XString using the XCO class, and then create our reader using the CREATE method.

DATA(xml_string) = get_xml( ).
DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
DATA(reader) = cl_sxml_string_reader=>create( binary ).

In the next step, we want to look at the content that the reader would output. To do this, we create a loop and process nodes (tags) one by one. Using the NEXT_NODE method, we load the next node in the XML and the corresponding attributes in the READER object are filled.

The logic for the output would now look like this. We want to stay in the loop until the last element is reached. We would prepare the attributes accordingly for the output.

WHILE reader->node_type <> if_sxml_node=>co_nt_final.
  reader->next_node( ).

  DATA(output) = |Namespace: { reader->nsuri }, Name: { reader->name }, XML Type: { reader->xml_type }, Node Type: { reader->node_type }, Prefix: { reader->prefix }|.
  output &&= | Value: { reader->value }, Value-Raw: { reader->value_raw }, Value-Type: { reader->value_type }, Offset: { reader->get_byte_offset( ) }|.

  out->write( output ).
ENDWHILE.

A tag with content is reported three times: once when it is opened, once for its value, and once when it is closed. We can tell which of these events we are looking at from the NODE_TYPE attribute and the corresponding constants in the IF_SXML_NODE interface. The first run therefore writes one output line per node visited.
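
To make these visits easier to tell apart, the numeric node type can be translated into a label. The following is only a minimal sketch: the class name zcl_demo_xml_node_types and the small inline document are assumptions made for illustration, while the reader calls and constants are the ones used elsewhere in this article.

```abap
CLASS zcl_demo_xml_node_types DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.
ENDCLASS.


CLASS zcl_demo_xml_node_types IMPLEMENTATION.
  METHOD if_oo_adt_classrun~main.
    " Hypothetical sample document, written without whitespace between the tags
    DATA(xml_string) = |<root><title>This is a title</title><empty/></root>|.
    DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
    DATA(reader) = cl_sxml_string_reader=>create( binary ).

    WHILE reader->node_type <> if_sxml_node=>co_nt_final.
      reader->next_node( ).

      " Translate the node type into a readable label; other types are ignored
      CASE reader->node_type.
        WHEN if_sxml_node=>co_nt_element_open.
          out->write( |Open : { reader->name }| ).
        WHEN if_sxml_node=>co_nt_value.
          out->write( |Value: { reader->value }| ).
        WHEN if_sxml_node=>co_nt_element_close.
          out->write( |Close: { reader->name }| ).
      ENDCASE.
    ENDWHILE.
  ENDMETHOD.
ENDCLASS.
```

A tag without content, such as the hypothetical "<empty/>", has no value to report, which is worth keeping in mind when the parsing logic relies on value nodes.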

So what information does the reader give us? The most important fields are:

NSURI - Namespace URI, i.e. the URL of the namespace from the header
NAME - Name of the current tag
NODE_TYPE - Type of the current node (constants from the interface IF_SXML_NODE)
VALUE - Content that is between the tags
PREFIX - If the tag has a namespace, you will find its prefix here (the part in front of the colon)

Attributes

But we are still missing one thing: so far we have no information about the contents of the attributes attached to the various tags. There are two methods for this. If we do not know which attributes are present, we can determine them generically using the NEXT_ATTRIBUTE method. NODE_TYPE is then set to the attribute type (IF_SXML_NODE=>CO_NT_ATTRIBUTE), which we can check in a loop.

reader->next_attribute( ).
WHILE reader->node_type = if_sxml_node=>co_nt_attribute.
  DATA(output) = |Namespace: { reader->nsuri }, Name: { reader->name }, XML Type: { reader->xml_type }, Node Type: { reader->node_type }, Prefix: { reader->prefix }|.
  output &&= | Value: { reader->value }, Value-Raw: { reader->value_raw }, Value-Type: { reader->value_type }, Offset: { reader->get_byte_offset( ) }|.

  out->write( output ).
  reader->next_attribute( ).
ENDWHILE.

Finally, we now have all the attributes together and can look at the result and the various fields in the ABAP console.

With the second method, we already know the attributes and the structure of the XML file. In this case we can read specific values using the GET_ATTRIBUTE_VALUE method and do not have to process data we do not need.
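
A targeted read could look roughly like the following sketch. It calls GET_ATTRIBUTE_VALUE on the node object returned by READ_NEXT_NODE, cast to IF_SXML_OPEN_ELEMENT, which is the same pattern the complete example uses. The class name zcl_demo_xml_attribute and the inline document are assumptions made for illustration. The IS BOUND check assumes that a missing attribute comes back as an unbound reference. The article does not show that behavior, so treat the check as a defensive measure rather than documented API behavior.

```abap
CLASS zcl_demo_xml_attribute DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.
ENDCLASS.


CLASS zcl_demo_xml_attribute IMPLEMENTATION.
  METHOD if_oo_adt_classrun~main.
    " Hypothetical sample document: only the root tag carries the attribute "key"
    DATA(xml_string) = |<myroot key="R1" description="Top node"><title>This is a title</title></myroot>|.
    DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
    DATA(reader) = cl_sxml_string_reader=>create( binary ).

    DO.
      DATA(node) = reader->read_next_node( ).
      IF reader->node_type = if_sxml_node=>co_nt_final.
        EXIT.
      ENDIF.

      " Attributes are only available on opening tags
      IF reader->node_type <> if_sxml_node=>co_nt_element_open.
        CONTINUE.
      ENDIF.

      DATA(open_element) = CAST if_sxml_open_element( node ).
      DATA(key) = open_element->get_attribute_value( name = 'key' ).

      " Assumption: an attribute that does not exist yields an unbound reference
      IF key IS BOUND.
        out->write( |{ reader->name }: key = { key->get_value( ) }| ).
      ELSE.
        out->write( |{ reader->name }: no attribute "key"| ).
      ENDIF.
    ENDDO.
  ENDMETHOD.
ENDCLASS.
```

Checking the reference before calling GET_VALUE keeps the chained call from failing on tags that do not carry the attribute.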

Result

Now that you know the theory and a few small examples, we can start actually parsing the XML shown above. To do this, we define an internal ABAP structure to map the XML file.

TYPES: BEGIN OF people,
         height TYPE c LENGTH 9,
         name   TYPE string,
       END OF people.
TYPES peoples TYPE STANDARD TABLE OF people WITH EMPTY KEY.

TYPES: BEGIN OF file,
         head_key         TYPE c LENGTH 10,
         head_description TYPE string,
         title            TYPE string,
         description      TYPE string,
         desc_length      TYPE i,
         desc_space       TYPE c LENGTH 20,
         tags             TYPE string,
         peoples          TYPE peoples,
       END OF file.

Unfortunately, the combined reading of tags and attributes does not work properly and the attributes in the READER object are not set properly, so we have to proceed a little differently here to get the right result. We should note the following points:

The content between the tags is at the level of NODE_TYPE = IF_SXML_NODE=>CO_NT_VALUE (value 4)
The attributes are at the level of NODE_TYPE = IF_SXML_NODE=>CO_NT_ELEMENT_OPEN (value 1)
On an opening tag, the content in the READER still comes from the previous tag of the run (see the node output of the first run above).

In this case we use a different method on the reader, READ_NEXT_NODE, which returns the node as its own object. Reading the various attributes on this object does not change the attributes of the reader object.

DATA(node) = reader->read_next_node( ).

Then we proceed as follows: We note the last tag opened in order to then determine all values for NODE_TYPE = IF_SXML_NODE=>CO_NT_VALUE (value 4). We receive the current content of the tag via the READER and the attributes via the open node, so we can fill our structure with the content.

DO.
  DATA(node) = reader->read_next_node( ).
  IF reader->node_type = if_sxml_node=>co_nt_final.
    EXIT.
  ENDIF.

  IF reader->node_type = if_sxml_node=>co_nt_element_open.
    last_open_tag = CAST if_sxml_open_element( node ).

    CASE to_upper( reader->name ).
      WHEN 'MYROOT'.
        result-head_key         = last_open_tag->get_attribute_value( name = 'key' )->get_value( ).
        result-head_description = last_open_tag->get_attribute_value( name = 'description' )->get_value( ).
    ENDCASE.
  ENDIF.

  IF reader->node_type <> if_sxml_node=>co_nt_value.
    CONTINUE.
  ENDIF.

  CASE to_upper( reader->name ).
    WHEN 'DESCRIPTION'.
      result-description = reader->value.
      result-desc_length = last_open_tag->get_attribute_value( name = 'length' )->get_value( ).
      result-desc_space  = last_open_tag->get_attribute_value( name = 'space' )->get_value( ).

    WHEN 'ITEM'.
      INSERT INITIAL LINE INTO TABLE result-peoples REFERENCE INTO person.
      person->height = last_open_tag->get_attribute_value( name = 'height' )->get_value( ).
      person->name   = reader->value.

    WHEN OTHERS.
      ASSIGN COMPONENT to_upper( reader->name ) OF STRUCTURE result TO FIELD-SYMBOL(<line>).
      IF sy-subrc = 0.
        <line> = reader->value.
      ENDIF.
  ENDCASE.
ENDDO.

Unfortunately, one case is not covered by this: tags that have no value as content, only other tags. In this example, those are MYROOT and TABLE. For these we must define a special case inside the IF statement that remembers the last opened tag. Then we can also read in these attributes and get the following result:

Complete example

Here you can find the complete class with the examples shown. The XML stream is in the class, so you can recreate the example directly.

CLASS zcl_bs_demo_xml_read DEFINITION
  PUBLIC FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES if_oo_adt_classrun.

  PRIVATE SECTION.
    TYPES: BEGIN OF people,
             height TYPE c LENGTH 9,
             name   TYPE string,
           END OF people.
    TYPES peoples TYPE STANDARD TABLE OF people WITH EMPTY KEY.

    TYPES: BEGIN OF file,
             head_key         TYPE c LENGTH 10,
             head_description TYPE string,
             title            TYPE string,
             description      TYPE string,
             desc_length      TYPE i,
             desc_space       TYPE c LENGTH 20,
             tags             TYPE string,
             peoples          TYPE peoples,
           END OF file.

    METHODS read_all_nodes_and_write
      IMPORTING !out TYPE REF TO if_oo_adt_classrun_out.

    METHODS get_xml
      RETURNING VALUE(result) TYPE string.

    METHODS read_attributes_from_tags
      IMPORTING !out TYPE REF TO if_oo_adt_classrun_out.

    METHODS parse_document
      IMPORTING !out          TYPE REF TO if_oo_adt_classrun_out
      RETURNING VALUE(result) TYPE file.
ENDCLASS.


CLASS zcl_bs_demo_xml_read IMPLEMENTATION.
  METHOD get_xml.
    RETURN |<?xml version="1.0" encoding="utf-8"?>| &
           |<swh:myroot xmlns:swh="http://software-heroes/swh" key="R1" description="Top node">| &
           |  <swh:title>This is a title</swh:title>| &
           |  <swh:description length="200" space="some"><![CDATA[My description is a bit longer]]></swh:description>| &
           |  <table>| &
           |    <item height="182cm">Bryan Jonnson</item>| &
           |    <item height="179cm">Linda Schwetzinger</item>| &
           |    <item height="162cm">Iana Petrova</item>| &
           |  </table>| &
           |  <swh:tags>People, Names, Data</swh:tags>| &
           |</swh:myroot>|.
  ENDMETHOD.


  METHOD if_oo_adt_classrun~main.
    read_all_nodes_and_write( out ).
    read_attributes_from_tags( out ).
    parse_document( out ).
  ENDMETHOD.


  METHOD read_all_nodes_and_write.
    DATA(xml_string) = get_xml( ).
    DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
    DATA(reader) = cl_sxml_string_reader=>create( binary ).

    WHILE reader->node_type <> if_sxml_node=>co_nt_final.
      reader->next_node( ).

      DATA(output) = |Namespace: { reader->nsuri }, Name: { reader->name }, XML Type: { reader->xml_type }, Node Type: { reader->node_type }, Prefix: { reader->prefix }|.
      output &&= | Value: { reader->value }, Value-Raw: { reader->value_raw }, Value-Type: { reader->value_type }, Offset: { reader->get_byte_offset( ) }|.

      out->write( output ).
    ENDWHILE.
  ENDMETHOD.


  METHOD read_attributes_from_tags.
    DATA(xml_string) = get_xml( ).
    DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
    DATA(reader) = cl_sxml_string_reader=>create( binary ).

    DATA(finished) = abap_false.
    WHILE finished = abap_false.
      reader->next_node( ).
      IF reader->node_type = if_sxml_node=>co_nt_final.
        finished = abap_true.
      ENDIF.

      IF reader->node_type <> if_sxml_node=>co_nt_element_open.
        CONTINUE.
      ENDIF.

      reader->next_attribute( ).
      WHILE reader->node_type = if_sxml_node=>co_nt_attribute.
        DATA(output) = |Namespace: { reader->nsuri }, Name: { reader->name }, XML Type: { reader->xml_type }, Node Type: { reader->node_type }, Prefix: { reader->prefix }|.
        output &&= | Value: { reader->value }, Value-Raw: { reader->value_raw }, Value-Type: { reader->value_type }, Offset: { reader->get_byte_offset( ) }|.

        out->write( output ).
        reader->next_attribute( ).
      ENDWHILE.

    ENDWHILE.
  ENDMETHOD.


  METHOD parse_document.
    DATA last_open_tag TYPE REF TO if_sxml_open_element.
    DATA person        TYPE REF TO zcl_bs_demo_xml_read=>people.

    DATA(xml_string) = get_xml( ).
    DATA(binary) = xco_cp=>string( xml_string )->as_xstring( xco_cp_character=>code_page->utf_8 )->value.
    DATA(reader) = cl_sxml_string_reader=>create( binary ).

    DO.
      DATA(node) = reader->read_next_node( ).
      IF reader->node_type = if_sxml_node=>co_nt_final.
        EXIT.
      ENDIF.

      IF reader->node_type = if_sxml_node=>co_nt_element_open.
        last_open_tag = CAST if_sxml_open_element( node ).

        CASE to_upper( reader->name ).
          WHEN 'MYROOT'.
            result-head_key         = last_open_tag->get_attribute_value( name = 'key' )->get_value( ).
            result-head_description = last_open_tag->get_attribute_value( name = 'description' )->get_value( ).
        ENDCASE.
      ENDIF.

      IF reader->node_type <> if_sxml_node=>co_nt_value.
        CONTINUE.
      ENDIF.

      CASE to_upper( reader->name ).
        WHEN 'DESCRIPTION'.
          result-description = reader->value.
          result-desc_length = last_open_tag->get_attribute_value( name = 'length' )->get_value( ).
          result-desc_space  = last_open_tag->get_attribute_value( name = 'space' )->get_value( ).

        WHEN 'ITEM'.
          INSERT INITIAL LINE INTO TABLE result-peoples REFERENCE INTO person.
          person->height = last_open_tag->get_attribute_value( name = 'height' )->get_value( ).
          person->name   = reader->value.

        WHEN OTHERS.
          ASSIGN COMPONENT to_upper( reader->name ) OF STRUCTURE result TO FIELD-SYMBOL(<line>).
          IF sy-subrc = 0.
            <line> = reader->value.
          ENDIF.
      ENDCASE.

    ENDDO.
  ENDMETHOD.
ENDCLASS.

Conclusion

Reading XML streams can be implemented relatively easily, but it becomes a little more difficult with mixed structures of attributes and tags. In your research, you will probably find other examples that work differently; choose the approach that works best for you.
