Monday, October 17, 2016

Hack 3.23 Sorting Character Variables That Have Leading Numbers in Them

SAS Programming Professionals,

Did you know that you can sort character variables with leading numbers in them into proper numeric order?

You can do this in the SORT procedure by specifying the SORTSEQ=LINGUISTIC option with NUMERIC_COLLATION=ON as its collating rule.  Sound like a lot of technical mumbo-jumbo to you?  Well, a simple example will doubtless make this clearer.

The following code:

data test;
input address $12.;
datalines;
123 Main St.
05 Main St.
45 Main St.
8 Main St.
;

proc sort data=test;
 by address;
run;

proc print data=test noobs;
run;

…produces this output:

  address

05 Main St.
123 Main St.
45 Main St.
8 Main St.

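As you can see, the street addresses are not sorted into ascending order by house number.  A default character sort compares values one character at a time from the left, so "123" lands ahead of "45" because "1" comes before "4".  However, if we change the PROC SORT statement to this: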

proc sort data=test sortseq=linguistic(numeric_collation=on);
by address;
run;

…we get the following:

  address

05 Main St.
8 Main St.
45 Main St.
123 Main St.

…which is much better for our interviewers to use when planning their routes down Main Street, USA!
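
For comparison, here is a minimal sketch of the longer way to get the same order without linguistic sorting: derive a numeric key from the leading digits and sort on that instead.  The names test_keyed and house_no are made up for illustration, and the sketch assumes every address starts with a house number followed by a blank.

```sas
/* Hypothetical alternative: build a numeric sort key by hand */
data test_keyed;
 set test;
 /* Assumption: the first blank-delimited word is the house number */
 house_no = input(scan(address, 1, ' '), 8.);
run;

proc sort data=test_keyed;
 by house_no;
run;

proc print data=test_keyed noobs;
 var address;
run;
```

This takes an extra DATA step and an extra variable, which is exactly the work that NUMERIC_COLLATION=ON saves you.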

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.