Thursday, February 18, 2016

Hack 2.2 Creating a Subset of a SAS Data Set for Testing

SAS Programming Professionals,

Did you know that you can effortlessly create a small subset from a large SAS data set for testing purposes?

The FIRSTOBS and OBS SAS data set options can be used to slice-n-dice a group of observations from within a SAS data set so they are the only ones written to an output data set. The FIRSTOBS option specifies the number of the first observation that SAS is to process. The default is 1. The OBS option specifies the number of the last observation that SAS is to process (an observation number, not a count). The default is MAX, which means that processing continues through the last observation in the SAS data set.
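As a quick sketch of each option used on its own, the two steps below read from SASHELP.CARS. The output data set names FIRST10 and TAIL are hypothetical, chosen only for illustration.

```sas
/* OBS= alone: FIRSTOBS defaults to 1, so observations 1 through 10 are read */
data work.first10;
   set sashelp.cars(obs=10);
run;

/* FIRSTOBS= alone: OBS defaults to MAX, so observation 401 through */
/* the last observation in the data set is read                      */
data work.tail;
   set sashelp.cars(firstobs=401);
run;
```

Using OBS= by itself is often the quickest way to grab the top of a large data set, while FIRSTOBS= by itself skips past the observations you do not need.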

When you specify both options, the number of observations returned is (OBS - FIRSTOBS) + 1. For example, FIRSTOBS=151 with OBS=200 returns (200 - 151) + 1 = 50 observations.

Let’s look at an example:

data cars;
set sashelp.cars(firstobs=151 obs=200);
run;

In the example, we need an extract of the SASHELP.CARS data set, which contains 428 observations, to test our new program. So, we create the CARS data set in the SAS WORK library with 50 observations in it. We specify that all of the observations in SASHELP.CARS between 151 and 200, inclusive, be written to our new CARS data set. The log looks like this:

1    data cars;
2    set sashelp.cars(firstobs=151 obs=200);
3    run;

NOTE: There were 50 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.CARS has 50 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

So we now have a happy, healthy extract data set that we can use to develop our new program!
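If you expect to change the slice often, one possible variation is to hold the boundaries in macro variables and have SAS write the expected count to the log. This is only a sketch: the macro variable names FIRST and LAST and the output data set name CARS_SUBSET are assumptions made for illustration and do not come from the book.

```sas
/* Hypothetical macro variables holding the slice boundaries */
%let first = 151;
%let last  = 200;

/* Read only observations &first through &last, inclusive */
data work.cars_subset;
   set sashelp.cars(firstobs=&first obs=&last);
run;

/* Apply the formula (OBS - FIRSTOBS) + 1 and write the result to the log */
%put NOTE: Expected observations = %eval((&last - &first) + 1);
```

Comparing the value written by %PUT with the "observations read" note in the log is a simple sanity check that the subset is the size you intended.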

Best of luck in all of your SAS endeavors!

(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.