Michael Coughlan

Beginning COBOL for Programmers, Apress (2014)


  PERFORM VARYING Countdown FROM StartValue BY -1 UNTIL Countdown = ZERO
      DISPLAY Countdown
  END-PERFORM

  DISPLAY "Your name is " UserName

  STOP RUN.


  Chapter 7

  Introduction to Sequential Files

  An important characteristic of a programming language designed for enterprise or business computing is that it should have an external, rather than an internal, focus. It should concentrate on processing data held externally in files and databases rather than on manipulating data in memory through linked lists, trees, stacks, and other sophisticated data structures. Whereas in most programming languages the focus is internal, in COBOL it is external. A glance at the table of contents of any programming book on Java, C, Pascal, or Ruby emphasizes the point. In most cases, only one chapter, if that, is devoted to files. Over a quarter of this book deals with files: it covers such topics as sequential files, relative files, indexed files, the SORT, the MERGE, the Report Writer, control breaks, and the file-update problem.

  COBOL supports three file organizations: sequential files, relative files, and indexed files. Relative and indexed are direct-access file organizations that are discussed later in the book. They may be compared to a music CD on which you select the track you desire. Sequential files are like a music cassette: to listen to a particular song, you must go through all the preceding songs.

  This chapter provides a gentle introduction to sequential files. I introduce some of the terminology used when referring to files and explain how sequential files are organized and processed. Every COBOL file organization requires entries in the INPUT-OUTPUT SECTION of the ENVIRONMENT DIVISION and the FILE SECTION of the DATA DIVISION, and these declarations are specified and explained. Because files require more sophisticated data definition than the elementary data items introduced in Chapter 3, this chapter also introduces hierarchically structured data definitions.

  What Is a File?

  A file is a repository for data that resides on backing storage (hard disk, magnetic tape, or CD-ROM). Nowadays, files are used to store a variety of different types of information such as programs, documents, spreadsheets, videos, sounds, pictures, and record-based data. In a record-based file, the data is organized into discrete packages of information. For instance, a customer record holds information about a customer such as their identifying number, name, address, date of birth, and gender. A customer file may contain thousands or even millions of instances of the customer record. In a picture file or music file, by way of contrast, the information is essentially an undifferentiated stream of bytes.

  COBOL is often used in systems where the volume of data to be processed is large, not because the data is inherently voluminous, as it is in video or sound files, but because the same items of information have been recorded about a great many instances of the same object. Although COBOL can be used to process other kinds of data files, it is generally used only to process record-based files.

  There are essentially two types of record-based file organization—serial files (COBOL calls these sequential files) and direct-access files:

  • In a serial file, the records are organized and accessed serially (one after another).

  • In a direct-access file, the records are organized in a manner that allows direct access to a particular record based on a key value. Unlike serial files, a record in a direct-access file can be accessed without having to read any of the preceding records.


  Terminology

  Before I discuss sequential files, I need to introduce some terminology:

  • Field: An item of information that you are recording about an object (StockNumber, SupplierCode, DateOfBirth, ValueOfSale)

  • Record: The collection of fields that record information about an object (for example, a CustomerRecord is a collection of fields recording information about a customer)

  • File: A collection of one or more occurrences (instances) of a record template (structure)

  Files, Records, and Fields

  It is important to distinguish between the record occurrence (the instance or values of a record) and the record template (the structure of the record). Every record in a file has a different value but the same structure. For instance, the record template illustrated in Figure 7-1 describes the structure of each record occurrence (instance).

  Figure 7-1. Record template/structure

  The occurrences of the employee records (Figure 7-2) are the actual values in the file. There is only one record template, but there are many record instances.

  Figure 7-2. Record occurrences/instances


  How Files Are Processed

  Before a computer can process a piece of data, the data must be loaded into the computer’s main memory (RAM).

  For instance, if you want to manipulate a picture in Photoshop or edit a file in Word, you have to load the data file into main memory (RAM), make the changes you want, and then save the file on to backing storage (disk).

  Programmers in other languages, who may not be used to processing record-based data, often seek to load the entire file into memory as if it were an undifferentiated stream of bytes. For record-based data, this is inefficient and consumes unnecessary computing resources.

  A record-based file may consist of millions, tens of millions, or even hundreds of millions of records and may require gigabytes of storage. For instance, suppose you want to keep some basic census information about all the people in the United States. Suppose that each record is about 1,000 characters/bytes (1KB) in size. If you estimate the population of the United States at 314 million, this gives you a size for the file of 1,000 × 314,000,000 = 314,000,000,000 bytes = 314GB. Most computers do not have 314GB of RAM available, and those that do are unlikely to be stand-alone machines running only your program. The likelihood is that your program is only one of many running on the machine at the same time. If your program is found to be using a substantial proportion of the available RAM, your manager is going to be less than gruntled.
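  The arithmetic above can be checked with a few lines of COBOL. This is only a sketch: the program name and data names are invented for illustration, the figures are the estimates used in the text, and a gigabyte is taken as 1,000,000,000 bytes to match the calculation above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CensusFileSize.
      *> Illustrative only: all names here are hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Assumed record size in bytes and estimated population.
       01 RecordSize       PIC 9(4)  VALUE 1000.
       01 Population       PIC 9(9)  VALUE 314000000.
      *> Twelve digits are enough to hold 314,000,000,000.
       01 FileSizeBytes    PIC 9(12).
       01 FileSizeGB       PIC 9(4).
       PROCEDURE DIVISION.
       Begin.
           COMPUTE FileSizeBytes = RecordSize * Population
           COMPUTE FileSizeGB = FileSizeBytes / 1000000000
           DISPLAY "File size in bytes: " FileSizeBytes
           DISPLAY "File size in GB:    " FileSizeGB
           STOP RUN.
```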

  ■ Note I once asked an M.Sc. student who was a proficient C++ programmer to write the C++ equivalent of a COBOL file processing program I had written. His first action was to load the entire file into memory. Doing this used an inordinate amount of memory and offered no benefit. He still had to read the file from disk, and the file size so overwhelmed the available RAM that the virtual memory manager had to keep paging to disk.

  The data in a record-based file consists of discrete packages of information (records). The correct way to process such a file is to load a record into RAM, process it, and then load the next record. To store the record in memory and allow access to its individual fields, you must declare the record structure (Figure 7-1) in your program. The computer uses your description of the record (the record template) to set aside sufficient memory to store one instance of the record.

  The memory allocated for storing a record is usually called a record buffer. To process a file, a program reads the records, one at a time, into the record buffer, as shown in Figure 7-3. The record buffer is the only connection between the program and the records in the file.


  Figure 7-3. Reading records into the record buffer

  Implications of Buffers

  If your program processes more than one file, you have to describe a record buffer for each file. To process all the records in an input file, each record instance must be copied (read) from the file into the record buffer when required.

  To create an output file, each record must be placed in the record buffer and then transferred (written) to the file.

  To transfer a record from an input file to an output file, your program will have to do the following:

  • Read the record into the input record buffer.

  • Transfer it to the output record buffer.

  • Write the data to the output file from the output record buffer.

  This type of data transfer between buffers is common in COBOL programs.
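  The three steps above can be sketched as a small program. Treat it as an illustration under stated assumptions rather than a definitive implementation: the file names input.dat and output.dat, the data names, and the 43-character record length (9 + 25 + 8 + 1, the employee record developed later in this chapter) are all assumptions, and the SELECT and FD entries it relies on are explained in the sections that follow. It also assumes a compiler that accepts the LINE SEQUENTIAL extension, such as Micro Focus COBOL or GnuCOBOL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CopyRecords.
      *> Illustrative only: file and data names are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT InFile ASSIGN TO "input.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OutFile ASSIGN TO "output.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
      *> One record buffer per file.
       FD InFile.
       01 InRecord         PIC X(43).
       FD OutFile.
       01 OutRecord        PIC X(43).
       WORKING-STORAGE SECTION.
       01 EndOfFileFlag    PIC X VALUE "N".
          88 EndOfFile     VALUE "Y".
       PROCEDURE DIVISION.
       Begin.
           OPEN INPUT InFile
           OPEN OUTPUT OutFile
      *> Step 1: read a record into the input record buffer.
           READ InFile
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
      *> Step 2: transfer it to the output record buffer.
               MOVE InRecord TO OutRecord
      *> Step 3: write it from the output buffer to the file.
               WRITE OutRecord
               READ InFile
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           CLOSE InFile OutFile
           STOP RUN.
```

  Only one record from each file is in memory at any time, so the memory used does not grow with the size of the file. The sketch expects input.dat to exist in the directory where the program runs.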

  File and Record Declarations

  Suppose you want to create a file to hold information about your employees. What kind of information do you need to store about each employee?

  One thing you need to store is the employee’s Name. Each employee is also assigned a unique Social Security Number (SSN), so you need to store that as well. You also need to store the employee’s date of birth and gender.

  These fields are summarized here:

  • Employee SSN

  • Employee Name

  • Employee DOB

  • Employee Gender


  ■ Note This is for demonstration only. In reality, you would need to include far more items than these.

  Creating a Record

  To create a record buffer large enough to store one instance of the employee record, you must decide on the type and size of each of the fields:

  • Employee SSN is nine digits in size, so the data item to hold it is declared as PIC 9(9).

  • To store Employee Name, you can assume that you require only 25 characters. So the data item can be declared as PIC X(25).

  • Employee Date of Birth requires eight digits, so you can declare it as PIC 9(8).

  • Employee Gender is represented by a one-letter character, where m is male and f is female, so it can be declared as PIC X.

  These fields are individual data items, but they are collected together into a record structure as shown in Example 7-1.

  Example 7-1. The EmployeeDetails Record Description/Template

  01 EmployeeDetails.
     02 EmpSSN          PIC 9(9).
     02 EmpName         PIC X(25).
     02 EmpDateOfBirth  PIC 9(8).
     02 EmpGender       PIC X.

  This record description reserves the correct amount of storage for the record buffer, but it does not allow access to all the individual parts of the record that might be of interest.

  For instance, the name is actually made up of the employee’s surname and forename. And the date consists of four digits for the year, two digits for the month, and two digits for the day. To be able to access these fields individually, you need to declare the record as shown in Example 7-2.

  Example 7-2. A More Granular Version of the EmployeeDetails Record

  01 EmployeeDetails.
     02 EmpSSN          PIC 9(9).
     02 EmpName.
        03 EmpSurname   PIC X(15).
        03 EmpForename  PIC X(10).
     02 EmpDateOfBirth.
        03 EmpYOB       PIC 9(4).
        03 EmpMOB       PIC 99.
        03 EmpDOB       PIC 99.
     02 EmpGender       PIC X.
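  To see what the more granular description buys you, the following sketch declares the same structure in the WORKING-STORAGE SECTION (rather than as a file record buffer, so it can run without a file), fills in the elementary items, and then displays both group items and individual fields. The program name and the sample values are made up for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ShowEmployeeFields.
      *> Illustrative only: the sample values are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 EmployeeDetails.
          02 EmpSSN            PIC 9(9).
          02 EmpName.
             03 EmpSurname     PIC X(15).
             03 EmpForename    PIC X(10).
          02 EmpDateOfBirth.
             03 EmpYOB         PIC 9(4).
             03 EmpMOB         PIC 99.
             03 EmpDOB         PIC 99.
          02 EmpGender         PIC X.
       PROCEDURE DIVISION.
       Begin.
      *> Fill in each elementary item separately.
           MOVE 123456789 TO EmpSSN
           MOVE "Power"   TO EmpSurname
           MOVE "Tom"     TO EmpForename
           MOVE 1980      TO EmpYOB
           MOVE 2         TO EmpMOB
           MOVE 29        TO EmpDOB
           MOVE "m"       TO EmpGender
      *> A group item refers to all its subordinate items at once.
           DISPLAY "Whole record:  " EmployeeDetails
           DISPLAY "Name group:    " EmpName
           DISPLAY "Date of birth: " EmpDateOfBirth
      *> The subordinate items remain individually accessible.
           DISPLAY "Year of birth: " EmpYOB
           DISPLAY "Surname:       " EmpSurname
           STOP RUN.
```

  Displaying the group item EmpDateOfBirth shows the three subordinate fields side by side (19800229 for the values above), while EmpYOB gives access to the year alone.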

  Declaring the Record Buffer in Your Program

  The record description in Example 7-2 sets aside sufficient storage to store one instance of the employee record. This area of storage is the record buffer; it’s the only connection between the program and the records in the file. To process the file, you must read the records from the file, one at a time, into the record buffer. The record buffer is connected to the file that resides on backing storage by declarations made in the FILE SECTION of the DATA DIVISION and the SELECT and ASSIGN clause of the ENVIRONMENT DIVISION.

  A record template (description/buffer) for every file used in a program must be described in the FILE SECTION by means of an FD (file description) entry. The FD entry consists of the letters FD and an internal name that you assign to the file. The full file description for the employee file might be as shown in Example 7-3.

  Example 7-3. The DATA DIVISION Declarations for the Employee File.

  DATA DIVISION.
  FILE SECTION.
  FD EmployeeFile.
  01 EmployeeDetails.
     02 EmpSSN          PIC 9(9).
     02 EmpName.
        03 EmpSurname   PIC X(15).
        03 EmpForename  PIC X(10).
     02 EmpDateOfBirth.
        03 EmpYOB       PIC 9(4).
        03 EmpMOB       PIC 99.
        03 EmpDOB       PIC 99.
     02 EmpGender       PIC X.

  In this example, the name EmployeeFile has been assigned as the internal name for the file. This name is then used in the program for file operations such as these:

  OPEN INPUT EmployeeFile

  READ EmployeeFile

  CLOSE EmployeeFile

  The SELECT and ASSIGN Clause

  Although you are going to refer to the employee file as EmployeeFile in the program, the actual name of the file on disk is Employee.dat. To connect the name used in the program to the file’s actual name on backing storage, you require entries in the SELECT and ASSIGN clause of the FILE-CONTROL paragraph, in the INPUT-OUTPUT SECTION of the ENVIRONMENT DIVISION. As shown in Example 7-4, the SELECT and ASSIGN clause allows you to specify that an internal file name is to be connected to an external data resource. It also lets you specify how the file is organized. In the case of a sequential file, you specify that the file organization is sequential. Sequential files are ordinary text files such as you might create with a text editor.


  Example 7-4. Using SELECT and ASSIGN

  SELECT and ASSIGN Syntax

  Here is the SELECT and ASSIGN syntax:

  SELECT InternalFileName
      ASSIGN TO ExternalFileSpecification
      ORGANIZATION IS SEQUENTIAL

  ■ Note The SELECT and ASSIGN clause has far more entries (even for sequential files) than those shown here. I deal with these entries in this book as you require them.

  As illustrated by the examples in Example 7-5, ExternalFileSpecification can be either an identifier or a literal. The identifier or literal can consist of a simple file name or a full or partial file specification. If you use a simple file name, the drive and directory where the program is running are assumed.

  When you use a literal, the file specification is hard-coded into the program; but if you want to specify the name of a file when you run the program, you can use an identifier. If an identifier is used, you must move the actual file specification into the identifier before the file is opened.


  Example 7-5. Some Example SELECT and ASSIGN Declarations

  SELECT EmployeeFile
      ASSIGN TO "D:\Cobol\ExampleProgs\Employee.Dat"
      ORGANIZATION IS SEQUENTIAL.

  SELECT EmployeeFile
      ASSIGN TO "Employee.Dat"
      ORGANIZATION IS SEQUENTIAL.

  SELECT EmployeeFile
      ASSIGN TO EmployeeFileName
      ORGANIZATION IS SEQUENTIAL.

  : : : : : : : : : : : :

  MOVE "C:\datafiles\Employee.dat" TO EmployeeFileName

  OPEN INPUT EmployeeFile
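  The identifier form of ASSIGN can be put together into a complete program along the following lines. This is a hedged sketch, not the book's own listing: the program name, the 60-character size of EmployeeFileName, the end-of-file flag, and the use of ACCEPT to obtain the file name are all assumptions. How ASSIGN TO followed by a data name is interpreted can depend on the compiler and its settings, so check your compiler's documentation. Because the file is declared as ORGANIZATION IS SEQUENTIAL, the sketch also assumes that the file holds fixed-length 43-character records with no line terminators.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ListEmployees.
      *> Illustrative only: names not in the text are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EmployeeFile
               ASSIGN TO EmployeeFileName
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD EmployeeFile.
       01 EmployeeDetails.
          02 EmpSSN            PIC 9(9).
          02 EmpName.
             03 EmpSurname     PIC X(15).
             03 EmpForename    PIC X(10).
          02 EmpDateOfBirth.
             03 EmpYOB         PIC 9(4).
             03 EmpMOB         PIC 99.
             03 EmpDOB         PIC 99.
          02 EmpGender         PIC X.
       WORKING-STORAGE SECTION.
      *> Holds the external file specification chosen at run time.
       01 EmployeeFileName     PIC X(60).
       01 EndOfFileFlag        PIC X VALUE "N".
          88 EndOfFile         VALUE "Y".
       PROCEDURE DIVISION.
       Begin.
           DISPLAY "Enter the employee file name: "
               WITH NO ADVANCING
      *> The identifier must be filled in before the OPEN.
           ACCEPT EmployeeFileName
           OPEN INPUT EmployeeFile
           READ EmployeeFile
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
               DISPLAY EmpSSN " " EmpSurname " " EmpForename
               READ EmployeeFile
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           CLOSE EmployeeFile
           STOP RUN.
```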

  Extended SELECT and ASSIGN

  I mentioned that sequential files are ordinary text files such as might be created with a text editor. This is not entirely true. A text editor appends the Carriage Return (CR) and Line Feed (LF) characters to each line of text.

  If you specify ORGANIZATION IS SEQUENTIAL and create your test data as lines of text in an ordinary text editor, these extra characters will be counted, and this will throw your records off by two characters each time you read a new record. For this reason, some vendors have extended SELECT and ASSIGN to allow these line-terminating characters to be either ignored or included. For instance, in Micro Focus COBOL, the metalanguage for the SELECT and ASSIGN is

  SELECT InternalFileName
      ASSIGN TO ExternalFileSpecification
      ORGANIZATION IS { LINE SEQUENTIAL | RECORD SEQUENTIAL }

  Here LINE SEQUENTIAL means the CR and LF characters are not considered part of the record, and RECORD SEQUENTIAL means they are (same as the standard SEQUENTIAL).

  Because it is very convenient to be able to use an ordinary text editor to create test data files, I use the Micro Focus LINE SEQUENTIAL extension in the example programs.
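  As a rough illustration of the LINE SEQUENTIAL extension, the sketch below writes two employee records to a text file and then reads them back. The program name, the file name EmpDemo.dat, and the sample values are invented for the example, and it assumes a compiler that supports LINE SEQUENTIAL (Micro Focus COBOL or GnuCOBOL, for instance).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LineSeqDemo.
      *> Illustrative only: file name and values are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EmployeeFile ASSIGN TO "EmpDemo.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD EmployeeFile.
       01 EmployeeDetails.
          02 EmpSSN            PIC 9(9).
          02 EmpName.
             03 EmpSurname     PIC X(15).
             03 EmpForename    PIC X(10).
          02 EmpDateOfBirth.
             03 EmpYOB         PIC 9(4).
             03 EmpMOB         PIC 99.
             03 EmpDOB         PIC 99.
          02 EmpGender         PIC X.
       WORKING-STORAGE SECTION.
       01 EndOfFileFlag        PIC X VALUE "N".
          88 EndOfFile         VALUE "Y".
       PROCEDURE DIVISION.
       Begin.
      *> Create the file: each WRITE produces one line of text.
           OPEN OUTPUT EmployeeFile
           MOVE 123456789 TO EmpSSN
           MOVE "Power"   TO EmpSurname
           MOVE "Tom"     TO EmpForename
           MOVE 1980      TO EmpYOB
           MOVE 2         TO EmpMOB
           MOVE 29        TO EmpDOB
           MOVE "m"       TO EmpGender
           WRITE EmployeeDetails
           MOVE 987654321 TO EmpSSN
           MOVE "Ryan"    TO EmpSurname
           MOVE "Ann"     TO EmpForename
           MOVE 1975      TO EmpYOB
           MOVE 11        TO EmpMOB
           MOVE 5         TO EmpDOB
           MOVE "f"       TO EmpGender
           WRITE EmployeeDetails
           CLOSE EmployeeFile
      *> Read the records back, one at a time.
           OPEN INPUT EmployeeFile
           READ EmployeeFile
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
               DISPLAY EmpForename EmpSurname EmpYOB
               READ EmployeeFile
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           CLOSE EmployeeFile
           STOP RUN.
```

  The resulting EmpDemo.dat can be opened in an ordinary text editor, with one record per line, which is the convenience described above.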

  Processing Sequential Files

  Unlike direct-access files, sequential files are uncomplicated both in organization and in processing. To write programs that process sequential files, you only need to know four new verbs: OPEN, CLOSE, READ, and WRITE.


  The OPEN Statement

  Before your program can access the data in an input file or place data in an output file, you must make the file available to the program by OPENing it. When you open a file, you have to indicate how you intend to use it (INPUT, OUTPUT, EXTEND) so the system can manage the file correctly:

 
