Friday, March 4, 2016

Hack 2.6 Processing Files with Varying Numbers of Records


SAS Programming Professionals,

Did you know that you can use a simple trick to determine how many records are in a flat file you have to process?

Flat files are among the trickiest files to read into SAS, and files with varying numbers of records are a special challenge.  Knowing how many records a flat file has can inform the way you process it.

Here is an example where we receive files from a client that are supposed to have either 22 or 28 records. Each file type is to be treated differently. 

filename datafile "c:\temp\rawdata.dat";

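 data _null_;
       infile datafile end=eof;
       input;          /* read a record without creating variables */
       records+1;      /* sum statement: running count of records  */
       /* convert the count to trimmed text to avoid an automatic
          numeric-to-character conversion and leading blanks */
       if eof then call symput("RECORDS",strip(put(records,best12.)));
 run;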

 %put &RECORDS;

 data mainfile;
       /* SYMGETN returns the macro variable RECORDS as a number */
       select(symgetn("RECORDS"));
       when(22) do;
             put "This file has 22 lines.";
             /* specific statements for when only 22 lines*/
       end;
       when(28) do;
             put "This file has 28 lines.";
             /* specific statements for when 28 lines*/
       end;
       otherwise do;
             put "Bad case - Unexpected number of lines!";
             /* specific statements for when not enough or
              too many lines*/
       end;
       end;
 run;

We start out by specifying the file to be processed in a FILENAME statement.  The DATA _NULL_ step reads every record in the flat file, keeping count of them in the RECORDS variable with the sum statement records+1.  When the step reaches the last record (flagged by END=EOF), CALL SYMPUT stores the count in a macro variable aptly named RECORDS.
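One edge case worth guarding against is an empty file: the step stops at the first INPUT, so the CALL routine never runs and RECORDS is never created. The sketch below is one possible variation on the counting step, not part of the original hack. It assumes a default of 0 is acceptable for an empty file, and it swaps in CALL SYMPUTX, which strips leading and trailing blanks from the stored value.

```sas
filename datafile "c:\temp\rawdata.dat";

/* Assumed default: RECORDS stays 0 if the file has no records */
%let RECORDS = 0;

data _null_;
    infile datafile end=eof;
    input;                 /* read a record without creating variables */
    records + 1;           /* running count of records read so far     */
    /* on the last record, store the trimmed count in RECORDS */
    if eof then call symputx("RECORDS", records);
run;

%put NOTE: RECORDS=&RECORDS;
```

With the default in place, an empty file falls through to the OTHERWISE branch of the SELECT group instead of leaving the macro variable undefined.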

The second DATA step uses a SELECT statement to determine which set of statements is used to process the flat file.  We have one set of statements for when there are 22 records and a different set for when there are 28.  The SELECT statement uses the SYMGETN function to return the value stored in the RECORDS macro variable as a number.  The subsequent WHEN statements execute when the number returned by SYMGETN is equal to 22 or 28, respectively; any other count falls through to the OTHERWISE statement.

With a couple of small changes, this example can be macrotized to handle a directory full of flat files that have varying numbers of lines.
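As a rough illustration of what that might look like, here is a minimal sketch that wraps the counting step in a macro. The macro name COUNT_RECORDS and its PATH parameter are hypothetical, and the %PUT messages stand in for whatever processing each case needs.

```sas
/* Hypothetical macro: count the records in one flat file and report the case */
%macro count_records(path);
    %local records;
    %let records = 0;      /* assumed default: stays 0 for an empty file */

    filename datafile "&path";

    data _null_;
        infile datafile end=eof;
        input;
        records + 1;
        /* "L" stores the value in the macro's local symbol table */
        if eof then call symputx("records", records, "L");
    run;

    %if &records = 22 %then %put NOTE: &path has 22 lines.;
    %else %if &records = 28 %then %put NOTE: &path has 28 lines.;
    %else %put WARNING: &path has &records lines, which is unexpected.;

    filename datafile clear;   /* release the fileref before the next call */
%mend count_records;

%count_records(c:\temp\rawdata.dat)
```

To cover a directory, the macro would be called once per file. How the list of file names is built is left open here.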

Best of luck in all of your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers


I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.
