Mainframes

What kind of a mainframe blog doesn't mention DB2's 25th birthday? I hang my head in shame. DB2 celebrated its birthday on the 7th July this year, and like a forgetful relative, I don't send a card until a 4 month later. But, finally, here is my take on 25 years of DB2.

DB2 - which stands for DataBase 2 - first saw the light of day as an MVS application in 1983. It is a relational database and uses SQL (Structured Query Language) - pronounced SQL by mainframers and "sequel" by non-mainframers, apparently. Where did it come from? In the 1970s, Ted Codd created a series of rules for relational databases, which DB2 was very loosely based on. However, originally, DB2 did break many of Codd's rules. There was also a VM product called SQL/DS, which was similar in nature.

As well as MVS (or z/OS as it's called in its current incarnation), DB2 is available on other platforms. During the 1990s versions were produced for OS/2 (if you remember that one) Linux, Unix, and Windows. There have been a variety of naming conventions over the years. Things like DB2/VSE and DB2/6000, which were replaced by DB2 for VSE and then DB2 UDB (Universal DataBase). This current naming convention can make it harder to work out whether a mainframe version of DB2 or a server version of DB2 is being discussed in any article on the topic. Interestingly, although mainframe and server versions of DB2 are currently very similar in functionality, they are written in different languages. The mainframe version is written in PL/S and the server version in C++.

In the early days, the big competition was Oracle and Informix. Well, IBM bought Informix in 2001, and happily runs Oracle on a zLinux partition. There is also a 31-bit version of Oracle available for z/OS.

Of course, DB2 isn't IBM's only database. It has its relational IMS DB. People interested in IMS DB will be interested in the Virtual IMS Connections user group at www.virtualims.com.

As they say, other mainframe databases are available, including: CA-Datacom, CA-IDMS, Cincom's SUPRA, Computer Corporation of America's Model 204, Select Business Solutions' NOMAD, and Software AG's Adabas.

DB2 is currently at Version 9, which you might remember was code-named Viper before its launch.

Happy belated birthday DB2.

To make the mainframe a part of your SOA you must service enable legacy systems. This does not necessarily mean your mainframe needs to support web services. You can use integration techniques to incorporate your mainframe into a distributed SOA strategy.

There are many ways to integrate the mainframe with distributed systems to participate in the SOA. The mainframe is an odd creature to integrate with because legacy applications were build on proprietary systems (like CICS) and protocols (like SNA). Open systems and of late web services make recently developed systems more interoperable.

To understand mainframe integration, you have to understand the underlying data structures. We then can discuss integration techniques and patterns. The data structures are represented in the existing programs written for the mainframe, which are mostly COBOL programs.

COBOL Metadata Primer

COBOL is a structured language with a data division that defines data structures including external files (file description) and internal program storage (working storage). The file description defines the file structure to the COBOL program so we can use this COBOL definition to understand the existing data for integration purposes. We essentially use the COBOL file definition as a map to the mainframe data that is external to the distributed systems
When working with mainframe data you will hear the term copybook used to describe file structures. The copybook is a reusable file description that the COBOL complier copies into the code at compile time - there is no late binding of file structures. Since files are used over and over the copybook was created as a means to allow the COBOL definition to be done once and copied into all the programs the access the file. If the file structure changes, all the programs using the copybook generally must be recompiled - some changes do not affect the data structure and might not affect other programs.

The copybook therefore becomes a source of metadata describing file structures on the mainframe. Since flat file structures do not have a data dictionary or system catalog description (like a database would), at times the copybook is the single source of metadata about the file structure. Mainframe databases may also have a data dictionary as a source of more metadata, but the copybook is still required for COBOL and can be used for integration purposes.

The copybook is hierarchical in structure. Data definitions can be elementary (with only one level) or grouped. Grouped structures are numbered with the super-group having a lower level number (such as the 01 and 05 levels below) and subgroups higher numbers.

01
ORDER-MASTER.

05 ORDER-MASTER-ID.

10 ORDER TYPE PIC X(2).

10
PART-NUMBER PIC 9(4).

05 CUSTOMER-NAME
PIC X(20).

The picture clause defines the data type as alphanumeric (PIC X), numeric (PIC 9) and alphabetic (PIC A). The number in parenthesis, as shown above, indicates the number of characters in the data element. You might also see FILLER data elements at the end of the copybook. This is padding in the file to allow for additional fields to be added to the file without changing the programs that do not need the new fields.

Mainframe integration software products can take COBOL copybooks as input and create a mapping for the transformation of mainframe data to a structure that can be used by distributed systems such as an ASCII XML schema - mainframe data is encoded in the EBCDIC character set so this too must be translated.

Two additional useful mainframe terms are batch and online processing. Many legacy mainframe systems do not directly update master files as transactions occur. They write the transactions to a flat file (sequential records) and use batch programs to read flat files and update the master files. This is typically done at night when the online systems are down. The batch processing technique predates On-line Transaction Processing (OLTP) systems, such as CICS, and in many cases is an artifact of these legacy beginnings.

Online processing uses an OLTP system such as CICS to process multiple users' requests as transactions and updates files or databases as the transactions occurs. The OLTP system manages the concurrent access to resources with support for resource locking and transactions.

Mainframe Integration Patterns

At a high level, you can integrate with the mainframe within processes or though the underlying data. Integrating at a process level involves a peer-to-peer communication between the distributed system and the mainframe. Data integration operates at the data level without direct interaction with the mainframe process, for example using FTP or a database connection via ODBC/JDBC.

Process Integration Techniques

Screen Scraping

A "screen scraper" uses a terminal emulator on the distributed system to appear to the mainframe as a terminal to existing applications. For example, a mainframe CICS program that displays and accepts user input via a 3270 terminal can be emulated programmatically to process the screen displays and automatically enter data. The advantage of this approach is mainframe programs do not have to change. The disadvantage is that the integration is brittle, changes to the mainframe screens break the integration, and error recovery is cumbersome.

Peer-to-peer

Mainframes also support a peer-to-peer communications scheme called APPC (Advanced Program-to-Program Communication, sometimes called LU 6.2). This technique is mainly used for IBM systems to communicate with each other using a distributed protocol, but can be used through an emulator similar to screen scrapping. If the existing mainframe applications do not already support APPC, then messaging is preferred since the distributed system would not have to emulate IBM APPC.

Messaging

A program that supports a mainframe online transaction, for example a CICS transaction, can be modified to interact with a distributed system in several ways. The most common technique is to use asynchronous messaging, for example MQ request-reply messages, to communicate with the mainframe. The MQ approach is attractive since MQ is supported on many platforms, is well understood by mainframe programmers and the distributed system can connect to MQ through a JMS client library - good support for JAVA.

CICS Adapter

Many vendors provide CICS adapters that make the distributed system look like another mainframe CICS system. CICS supports region-to-region communication using a technique IBM calls Multi-Region Operation (MRO). In the case of MRO, external data is presented to and from a CICS transaction over a queue called the Transient Data Queue (TDQ). Both the TDQ and MQ approach require modifications to existing terminal-based CICS transactions.
Adapters typically support data format transformation such as copybook to XML. The MQ approach is usually more attractive if the MQ software is already available on the mainframe but at times an adapter may be cheaper and faster to implement if there are no other requirements of mainframe messaging in general.

API Adapter

Many mainframe applications have an Application Program Interface. If the program has an API, an adapter can be written, or bought off the shelf for popular applications such as SAP, that create message events through the API. The advantage of this approach is real-time integration with the application, automated mapping of the applications native data formats to a message and ease of integration. The disadvantage is the often high cost of the adapter plus the messaging software.

Web Services

There are a host of companies that can service wrap applications to support WS-* standards. In general, these products use process integration as outlined above to integrate with the legacy systems and then provide standards based protocol on top of the integration, for example, a web services interface to MQ or a web services interface on top of a screen scrapping tool. Tools are emerging to natively create services on the mainframe such as tools to parse and create XML schemas documents directly from COBOL or to generate WSDLs from mainframe constructs such as the CICS COM area.

Data Integration Techniques

File Transfer

The most common data integration technique is a file transfer using FTP. FTP is a TCP/IP application and is widely supported on many platforms. The FTP technique is well suited for batch processing. For online processing a delta (changes only) file can be created on a timed interval (like every five minutes) and sent via FTP to distributed systems. There are also programs available for Managed FTP with features such as scheduling, security, and monitoring of FTPs. The advantages of FTPs are costs, wide availability and simplicity to implement. The disadvantages are poor support for real-time systems and management of all the scripted FTPs - the latter can be addressed with Managed FTP.

Data Base Connectivity

With the now ubiquitous support of ODBC/JDBC distributed systems can connect remotely to mainframe databases effectively sharing the mainframe database with distributed systems. The advantages of this approach are distributed systems can cheaply connect to the database and retrieve or insert data real-time. The disadvantage comes into play if process integration is required - access to a mainframe algorithm and not just the underlying data. The technique may also require a two-phase commit if both the mainframe and distributed system database are being updated within a single transaction.

Data Base Adapters

There are also database adapters available that send messages when tables are changed based on a database trigger. This approach message enables the database and can make transactions entered into a database appear to distributed systems as real-time messages. You can also send messages to the adapter to update the database, but this is typically done with a simple remote database connection. Similar to a database connection, this approach does not help if you require mainframe process integration.

File Adapters

Many integration vendors have mainframe file adapters that run natively on the mainframe to read and write flat files based on messages. The file adapters can be configured to send messages based on the creation of a file, when records are added to the file or periodically poll the file for new records. File adapters can also create and delete files based on a message from the distributed system. The advantage of this approach is strong native support of the mainframe file system and the ability to message enable batch systems. The disadvantage is the slow transfer speed since most file adapters send message one record at a time versus blocked transmissions with FTPs. File adapters should be used to send delta (changes only) files.

Mainframes

Saturday, November 22, 2008

25 years of DB2

Arcati Mainframe Yearbook ..IIs Back for 2009

Sunday, September 14, 2008

A Mainframe SOA Strategy

Sunday, September 7, 2008

Mainframe in Come Back Mode

'

'

"

Blog Archive

Ad's

About Me

Mainframe

"

Google Search Box

Ad's