Cobol Data Processing


The purpose of this use case is to:

  1. Read and parse COBOL data files in EBCDIC and packed-decimal formats. COBOL copy-books, which describe the data layout, are used to parse the data files.
  2. Convert the COBOL data file (EBCDIC) to CSV format. This is useful for verifying that the conversion was done properly.
  3. Convert the data into XML/JAXB objects representing customer positions, transactions and securities.

The XML content and the JAXB Java objects are then passed to a downstream system (in this case WealthStation) for further processing.


COBOL (Common Business-Oriented Language) is one of the oldest programming languages. It's mainly used in business, finance, and administrative systems for companies and governments.

EBCDIC Char set encoding

Extended Binary Coded Decimal Interchange Code (EBCDIC) is an 8-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems.

EBCDIC descended from the code used with punched cards and the corresponding six-bit binary-coded decimal code used with most of IBM's computer peripherals of the late 1950s and early 1960s.

Cobol Data Files

The data files are generated by legacy applications (host systems) and contain customer positions, transactions and securities data. The data files have the following structure:

  1. Header: starts with 0 and includes the record length, interface ID and interface name.
  2. Records: each starts with 1 and includes the record details.
  3. Footer: starts with 2 and includes the interface ID, account date, interface creation date, interface creation time and the total number of records, which is used for validation.
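
The header/records/footer structure above can be checked with a few lines of Java. This is only a minimal sketch: it assumes the records have already been decoded to text lines, that the footer total counts only the detail records (type 1), and that the declared total has already been extracted from the footer. The class and method names are hypothetical and not part of the converter.

```java
import java.util.Arrays;
import java.util.List;

final class RecordCountSketch {

    /** Counts detail records, i.e. lines whose first character is '1'. */
    static long countDetailRecords(List<String> decodedLines) {
        return decodedLines.stream()
                .filter(line -> !line.isEmpty() && line.charAt(0) == '1')
                .count();
    }

    /**
     * Checks that the file starts with a header ('0'), ends with a footer ('2')
     * and that the number of detail records matches the declared total.
     * How the total is read from the footer depends on the copy-book, so it is
     * passed in as a parameter here (assumption).
     */
    static boolean isConsistent(List<String> decodedLines, long declaredTotal) {
        if (decodedLines.size() < 2) {
            return false;
        }
        String first = decodedLines.get(0);
        String last = decodedLines.get(decodedLines.size() - 1);
        boolean framed = first.startsWith("0") && last.startsWith("2");
        return framed && countDetailRecords(decodedLines) == declaredTotal;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("0HEADER", "1REC-A", "1REC-B", "2FOOTER");
        System.out.println(isConsistent(lines, 2)); // true
        System.out.println(isConsistent(lines, 3)); // false
    }
}
```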

These files are encoded in EBCDIC. The following example shows the result of a direct conversion to ASCII. Some data remains 'invisible' because it is stored in special COBOL storage formats such as COMP-3 (packed decimal) rather than as characters.

The copy-book files define the layout of the data files, including the special COMP-3 fields.
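
To illustrate why COMP-3 fields do not survive a plain character conversion, here is a small Java sketch that decodes a text field and a packed-decimal field separately. It is not the converter's implementation. The code page (IBM037) is an assumption, since the actual code page of the host files is not stated here, and the class and method names are hypothetical. In packed decimal each byte holds two digits, except the last byte, whose low nibble holds the sign.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.Charset;

final class EbcdicDecodeSketch {

    /** Decodes a character (PIC X) field. "IBM037" is an assumed code page. */
    static String decodeText(byte[] raw) {
        return new String(raw, Charset.forName("IBM037"));
    }

    /**
     * Unpacks a COMP-3 (packed decimal) field.
     * scale is the number of implied decimal digits taken from the copy-book.
     */
    static BigDecimal unpackComp3(byte[] raw, int scale) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            int high = (raw[i] >> 4) & 0x0F;
            int low = raw[i] & 0x0F;
            digits.append(high);
            if (i < raw.length - 1) {
                digits.append(low); // in the last byte the low nibble is the sign
            }
        }
        int sign = raw[raw.length - 1] & 0x0F;
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0D || sign == 0x0B) { // D and B mark negative values
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        byte[] text = {(byte) 0xC1, (byte) 0xC2, (byte) 0xC3};
        System.out.println(decodeText(text)); // ABC

        byte[] packed = {0x12, 0x34, 0x5D};
        System.out.println(unpackComp3(packed, 2)); // -123.45
    }
}
```

The sketch does not validate that every digit nibble is in the range 0 to 9; a real parser would reject such input.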

EBCDIC Converter Overview

The converter takes Cobol data files as input and generates the following output:

  1. CSV files representing the result of the conversion in human-readable format. These are also useful for testing and validation.
  2. Java objects representing Cobol Interface objects, used for further processing such as transformation to the JAXB models/XML required by WealthStation in this project.

The converter also uses the following files for parsing:

  1. Copy-book files
  2. Drools file

EBCDIC Converter Data Format

Spring File

The Spring file lets us configure the Cobol data format bean through two properties:

  1. cobolCopybookFolderURL: the URL of the folder containing the copy-book files. Copy-books can be added at runtime.
  2. cobolMarshallerRulesFileURL: the URL of the marshaller rules file.

<bean id="cobolDataFormat" class="org.eclipse.stardust.integration.cobol.marshaller.CobolDataFormat">
	<property name="cobolCopybookFolderURL" value="${cobol.copybook.folder.url}/inPutCobolFilesSpec" />
	<property name="cobolMarshallerRulesFileURL" value="MarshallerRules.drl" />
</bean>

Next, we configure our route to unmarshal/marshal the files using the cobolDataFormat bean.

Between the two parsing operations, we can send the cobolInterface object to another route to apply specific behavior.

	<camel:unmarshal>
		<camel:custom ref="cobolDataFormat" />
	</camel:unmarshal>

	<camel:marshal>
		<camel:custom ref="cobolDataFormat" />
	</camel:marshal>

Cobol Copy-book Files

A copy-book is a section of COBOL code that is often used to define the physical layout of program data.

The major reason for using copy-books is to ensure that everyone uses the same version of a data layout definition.

Cobol data files carry no structure or field information; they contain only encoded data separated by tabs, which is why we need the copy-book files.

The full structure specification can be found there:

  • Indentation (level) number: specifies the record structure.
  • Field name: specifies the name of the field and is used as a key in the CobolInterface object (must be unique within each copy-book).
  • Picture (PIC): specifies the field type (X for character and 9 for numeric).
  • Number in brackets: specifies the number of characters or the length of a numeric value; the digits after V give the scale ('S9(13)V99' = signed numeric value with 13 integer digits and 2 decimal digits after an implied decimal point).
  • Char set: specifies the encoding character set to use.
  • Lines that start with 88 are not used in the parsing process; they specify the values that the preceding field can take.
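
As an illustration of how a picture clause such as 'S9(13)V99' can be read, the following Java sketch expands the repetition factors and counts the digits on each side of the implied decimal point. It is a simplified sketch that only handles the X, 9, S and V symbols mentioned above; the class and method names are hypothetical and do not come from the converter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PicClauseSketch {

    // Matches a symbol followed by a repetition factor, e.g. 9(13) or X(10).
    private static final Pattern REPEAT = Pattern.compile("([X9])\\((\\d+)\\)");

    /** Expands repetition factors: "S9(13)V99" becomes "S9999999999999V99". */
    static String expand(String pic) {
        Matcher m = REPEAT.matcher(pic.toUpperCase());
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int count = Integer.parseInt(m.group(2));
            StringBuilder repeated = new StringBuilder();
            for (int i = 0; i < count; i++) {
                repeated.append(m.group(1));
            }
            m.appendReplacement(out, repeated.toString());
        }
        m.appendTail(out);
        return out.toString();
    }

    /** A leading S marks a signed numeric field. */
    static boolean isSigned(String pic) {
        return pic.toUpperCase().startsWith("S");
    }

    /** Number of digits before the implied decimal point (V). */
    static int integerDigits(String pic) {
        String expanded = expand(pic);
        int v = expanded.indexOf('V');
        return countNines(v < 0 ? expanded : expanded.substring(0, v));
    }

    /** Number of digits after the implied decimal point (the scale). */
    static int decimalDigits(String pic) {
        String expanded = expand(pic);
        int v = expanded.indexOf('V');
        return v < 0 ? 0 : countNines(expanded.substring(v + 1));
    }

    private static int countNines(String s) {
        int n = 0;
        for (char c : s.toCharArray()) {
            if (c == '9') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String pic = "S9(13)V99";
        System.out.println(isSigned(pic));      // true
        System.out.println(integerDigits(pic)); // 13
        System.out.println(decimalDigits(pic)); // 2
    }
}
```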

    Copy Book File

    Drools Rules File

    In some cases the copy-book files alone are not sufficient to unmarshal the Cobol data files, because the same copy-book can contain several record types.

    In the previous copy-book example screen-shot, the value of the field named 'PVBBAL-CO-ID' determines which substructure is used (the substructures are the lines that start with the value 15).

    With the Drools API, we can add rules to the parsing process that tell the unmarshaller to use, for example, the contract substructure instead of the deposit one.

    rule "if PVBBAL-CO-ID eq C then CONTRACT"
    		Field(fieldName == "PVBBAL-CO-ID")
    		Field(fieldData == "C")
    rule "if PVBBAL-CO-ID eq F then DEPOSIT"
    		Field(fieldName == "PVBBAL-CO-ID")
    		Field(fieldData == "F")
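
For readers less familiar with Drools, the decision expressed by the two rules above can be sketched in plain Java. This is only an illustration of the selection logic, not a replacement for the rules file. The class and method names are hypothetical; the field name and the values C and F are taken from the rule names above.

```java
import java.util.Optional;

final class SubstructureSelectorSketch {

    /**
     * Returns the substructure to use for a given field, or empty when the
     * field does not drive the choice. Mirrors the rule names:
     * PVBBAL-CO-ID eq C gives CONTRACT, PVBBAL-CO-ID eq F gives DEPOSIT.
     */
    static Optional<String> select(String fieldName, String fieldData) {
        if (!"PVBBAL-CO-ID".equals(fieldName) || fieldData == null) {
            return Optional.empty();
        }
        switch (fieldData) {
            case "C":
                return Optional.of("CONTRACT");
            case "F":
                return Optional.of("DEPOSIT");
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(select("PVBBAL-CO-ID", "C")); // Optional[CONTRACT]
        System.out.println(select("PVBBAL-CO-ID", "F")); // Optional[DEPOSIT]
        System.out.println(select("PVBBAL-CF-AC", "C")); // Optional.empty
    }
}
```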


    We need to add the following dependency to our pom.xml


    Cobol Data Transformer Project

    The Cobol Data Transformer Project is composed of two parts: first we use the EBCDIC converter to transform the Cobol data files into Java objects, then we map these objects to JAXB objects using the Dozer API.

    Dozer Mapping

    Dozer is a Java API for mapping data between two different beans, configured through an XML file.

    In this project, the Dozer API is used to map values from the parsed file to the specific Java JAXB objects generated by the Wealth Station Global team in the WealthStationGlobal.DataImport.Interface project.

    Here are the Java JAXB objects used in this project:

    1. PositionInfoValueCType: used as positions JAXB object.
    2. AccountInfoValueCType: used as positions JAXB object.
    3. TransactionInfoValueCType: used as transactions JAXB object.
    4. EquityLinkedDepositInfoValueCType: used as securities JAXB object.
    5. DualCurrencyInfoValueCType: used as securities JAXB object.

    <mapping map-id="BAL_COS_Mapping">
    	<class-b bean-factory="org.dozer.factory.JAXBBeanFactory" map-empty-string="false">
    	<field><a key="PVBBAL-CF-AC">this</a><b>accountNumber</b></field>
    	<field><a key="PVBBAL-CONTRACT-NO">this</a><b>securityIdentifierInfo.securityIdentifier</b></field>
    	<field><a key="SECURITY_IDENTIFIER_INFO_IDTYPE">this</a><b>securityIdentifierInfo.idType</b></field>
    	<field><a key="AVERAGE_COST">this</a><b>averageCost</b></field>
    	<field><a key="COST_BASIS">this</a><b>costBasis</b></field>
    	<field><a key="PVBBAL-CF-CUR">this</a><b>currency</b></field>
    	<field><a key="RECORD_NUMBER">this</a><b>externalPositionID</b></field>
    	<field><a key="EXTERNAL_SYSTEM_CODE">this</a><b>externalSystemCode</b></field>
    	<field custom-converter-id="StringToXMLGregorianCalendar"><a key="INTERFACE_CREATION_DATE" >this</a><b>externalUpdateTime</b></field>
    	<field custom-converter-id="StringToXMLGregorianCalendar"><a key="INTERFACE_CREATION_TIME" >this</a><b>externalUpdateTime</b></field>
    	<field><a key="PVBBAL-CF-BAL">this</a><b>marketValue</b></field>
    	<field><a key="PVBBAL-CF-BAL">this</a><b>units</b></field>

    Some custom converters are used to change the values of specific fields, for example:

    1. StringToXMLGregorianCalendar: converts String date and time values to an XMLGregorianCalendar in the format 'yyyy-MM-dd'T'HH:mm:ss'; it can also combine separate date and time values.
    2. ConcatDestinationAndSourceString: as its name suggests, concatenates the source value with the value previously mapped to the destination.
    3. TruncateSourceString: removes part of the String from the source value.
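
The date and time conversion can be sketched with the standard Java APIs as follows. This is not the project's StringToXMLGregorianCalendar converter. The input formats (yyyyMMdd for the date, HHmmss for the time) are assumptions chosen for illustration, and the class and method names are hypothetical. The output pattern uses lowercase ss for seconds, since uppercase S denotes fractions of a second in Java date patterns.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

final class XmlCalendarSketch {

    // Assumed input format of the time field, e.g. "091530".
    private static final DateTimeFormatter TIME_IN = DateTimeFormatter.ofPattern("HHmmss");

    // Lexical form expected for an XML dateTime value.
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    /** Combines a date string (assumed yyyyMMdd) and a time string (assumed HHmmss). */
    static XMLGregorianCalendar toXmlCalendar(String date, String time) {
        LocalDate d = LocalDate.parse(date, DateTimeFormatter.BASIC_ISO_DATE);
        LocalTime t = LocalTime.parse(time, TIME_IN);
        String lexical = LocalDateTime.of(d, t).format(OUT);
        try {
            return DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
    }

    public static void main(String[] args) {
        XMLGregorianCalendar cal = toXmlCalendar("20240131", "091530");
        System.out.println(cal.toXMLFormat()); // 2024-01-31T09:15:30
    }
}
```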

    Drools Rules File

    During development, we had some difficulty expressing conditional rules with the Dozer API. To overcome this problem, we used the Drools API. The following is an example:

    rule "PVBSTM-SMSTX02-RECTYPE rule for SMS"
    		$field: Field(fieldName == "PVBSTM-SMSTX02-RECTYPE")
    		if ($field.getFieldData().equalsIgnoreCase("03")||$field.getFieldData().equalsIgnoreCase("16"))
    		else if ($field.getFieldData().equalsIgnoreCase("04"))

    XML Result

    The example below shows a position XML result that is sent to WealthStation for further processing.

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>