Monday, November 28, 2016

Hack 4.3 Computing Smallest and Largest Value for Variables in An Observation

SAS Programing Professionals,

Did you know that you can easily compute the smallest or the largest value for a group of variables within an observation in a DATA step?

The SMALLEST and LARGEST functions return the smallest and largest value, respectively, for the group of values that you specify.  You can also have these functions return the second from smallest (or largest) value, third from smallest (or largest) value, fourth from smallest (or largest) value, and so on.  The form of these functions is:

SMALLEST(k, value-1<,value-2,…>)

LARGEST(k, value-1<,value-2,…>)

…where k = is a numeric constant, variable, or expression that specifies which value to return.  And, value-1 (value-2, etc.) specifies the value of a numeric constant, variable, or expression to be processed.

Here is an example:

options nodate nonumber;

data class(keep=oldest youngest agerange);

array ages{19} ages1 - ages19;

do i = 1 to totobs;

set sashelp.class nobs=totobs;

ages{i} = age;

end;

label youngest = "Age of Youngest Participant"
      oldest   = "Age of Oldest Participant"
      agerange = "Range of Participant Ages"
      ;

youngest = smallest(1,of ages1-ages19);
oldest   = largest(1,of ages1-ages19);
agerange = oldest - youngest;

run;

proc print noobs data=class label;
run;

In this example, we create an array (AGES) to contain the ages for all participants in our study data set.  We use the DO loop to iterate through the CLASS data set and load the ages of the participants into the array.  Then, we use the SMALLEST and the LARGEST functions to return the smallest and largest age values stored in the array.  We also calculate the range of ages.

If we had wanted the next to smallest age value, we would have coded:

 youngest = smallest(2,of ages1-ages19);

Get the picture?

If you rolled your eyes when reviewing this contrived example, think how powerful these functions are when you have a flattened patient file of admission/discharge dates.  You can quickly and easily compute the earliest or latest admission dates as well as the earliest or latest discharge dates.  These functions are in the top compartment of my own SAS tips-and-techniques toolbox!

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers
http://www.amazon.com/Michael-A.-Raithel/e/B001K8GG90/ref=ntt_dp_epwbk_0

Monday, November 21, 2016

Hack 4.2 Calculating Geodetic Distances with the ZIPCITYDISTANCE Function

SAS Programming Professionals,

Did you know that you can calculate the geodetic distance, in miles, between any two zip codes with SAS?

The aptly named ZIPCITYDISTANCE function allows you to specify both of the zip codes and then compute the distance between them.  Here is an example:

data gointoseattle;

seattle_dist = put(zipcitydistance(20850,98101),comma.);

label seattle_dist = "My Travel Distance from Rockville to Seattle";

run;

proc print noobs data=gointoseattle label;

run;

The code, above, produces the following output:

My Travel Distance
  from Rockville
  to Seattle

    2,315

Distances between Rockville and destinations in countries that do not have a US Zip code can be calculated via the GEODIST function as long as you have the longitude and the latitude of both points.  Here is an example:

data gointoseattle;

seattle_dist = put(geodist(39.085920,-77.174389,47.611330,-122.333219,"M"),comma.);

label seattle_dist = "My Travel Distance From Rockville to Seattle";

run;

proc print noobs data=gointoseattle label;

run;

This example used the GEODIST function to calculate the distance between Rockville and Seattle, and produced the same output as the previous example. 

Of course, if you are working with foreign clients, you might choose to substitute “K” (kilometers) for “M” (miles) to keep the distance in their perspective.  Miles; kilometers; what’s a few klicks between friends?

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers
http://www.amazon.com/Michael-A.-Raithel/e/B001K8GG90/ref=ntt_dp_epwbk_0

Monday, November 14, 2016

Hack 4.1 Avoiding the Hidden “Problem” in the TRANWRD Function

SAS Programming Professionals,

Did you know that:  the TRANWRD function, which can replace or remove all occurrences of a given word, has a hidden gotcha

The gotcha is that, if you do not provide a length statement for a new variable that receives the output from the TRANWRD function, that variable defaults to a length of 200 characters.

Here is an example of the gotcha:

data _null_;

length newstatement1 $34;

statement = "I enjoy going to SUGI conferences.";

newstatement1 = tranwrd(statement,"SUGI", "SGF");

newstatement2 = tranwrd(statement,"SUGI", "SGF");

length_newstatement1 = lengthc(newstatement1);

length_newstatement2 = lengthc(newstatement2);

put "length_newstatement1 = " length_newstatement1;
put "length_newstatement2 = " length_newstatement2;

run;

If you submit the code, above, you will see that newstatement1 ends up with a length of 34, while newstatement2 ends up with a length of 200.  That is because newstatement1 had a length statement specifying its size before it was used with the TRANWRD function.  Conversely, newstatement2 was created fresh and new when used in the TRANWRD function and SAS--not knowing what to make of this infant variable--decided to make it 200 characters in length; the default for the TRANWRD function.

So, what are the ramifications of this gotcha?  The ramifications are that you could end up with larger variables than you intend to--needlessly inflating the size of a SAS data set--if you are not careful to declare the length of new SAS character variables created by using the TRANWRD function.  You have been warned!

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers
http://www.amazon.com/Michael-A.-Raithel/e/B001K8GG90/ref=ntt_dp_epwbk_0

Monday, November 7, 2016

Hack 3.25 Updating SAS Data Set Metadata with PROC DATASETS

SAS Programming Professionals,

Did you know that you do not have to re-create an entire SAS data set in order to specify formats, informats, labels, or to rename variables? 

Here is an example of what a novice SAS programmer might code to add formats to price and date, add an informat to date, add labels for product and date, and to rename the variable Holiday to existing SAS data set SGFLIB.SNACKS:

data sgflib.snacks;
set  sgflib.snacks;

      format price dollar6.2
             date  worddate.
             ;
      informat date mmddyy10.;
      label    product = "Snack Name"
               date    = "Sale Date"
               ;
      rename Holiday = Holiday_Sale;

run;

This DATA step causes SAS to read every observation in SGFLIB.SNACKS and make a completely new version of it.  The only difference between the original and the new SAS data set is the change to the metadata of the four specified variables.  This is a waste of computer resources; especially if the data set being re-created is very large.

A much more efficient way to accomplish these changes is to use the DATASETS Procedure.  PROC DATASETS reads the descriptor portion of the SAS data set—stored in the first physical page on disk—makes the changes to the metadata, and then writes the page back to the disk.  So, instead of hundreds or thousands or millions of observations being read, SAS simply reads a single data set page.

This PROC DATASETS code is equivalent to the DATA step, above:

proc datasets library=sgflib;
modify snacks;
      format price dollar6.2
             date  worddate.
             ;
      informat date mmddyy10.;
      label product = "Snack Name"
            date    = "Sale Date"
            ;
      rename Holiday = Holiday_Sale;

run;
quit;

The PROC DATASETS code is clean, efficient, and much more professional!

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.