Monday, October 24, 2016

Hack 3.24 Taking Advantage of the INHERITLIB Option with SAS/CONNECT

SAS Programming Professionals,

Did you know that when using SAS/CONNECT, you can have a program running on a server “reach back” and process data directly from a SAS data set on your client machine (workstation)? 

The aptly-named INHERITLIB option allows a remotely-running SAS program to process data from your workstation or network as if it were local to the remote server.  Using the INHERITLIB option obviates the need for you to use PROC UPLOAD to send your data to the remote server for processing.  This can be very handy when you do not want to have multiple copies of a data set spread across your various computing platforms.

The INHERITLIB option can be coded on either the RSUBMIT or the SIGNON statement.  Here is an example:

libname pclib "C:\PRODDATA";

rsubmit inheritlib=(pclib);

proc summary nway data=pclib.shoeaud(where=(region="Canada"));

      class region product subsidiary;

      var   sales  returns;

output out=sumshoes sum=;

run

endrsubmit;

run;

In this example, we are allocating the “C:\proddata” directory (on a Windows workstation) to the “PCLIB” libref in our local SAS session.  Thereafter, we RSUBMIT a simple PROC SUMMARY that will run on our Linux server.  The RSUBMIT statement specifies that the remote session is to “inherit” the PCLIB libref.  So, when the SUMMARY procedure executes on the remote Linux server, it “reaches back” to the workstation and drags the relevant observations across the network to the waiting SUMMARY procedure running on the Linux server.  Pretty neat, eh?

Of course, you need to weigh the additional network traffic associated with remote servers processing local data against any gains realized by using the INHERITLIB option.  But, you already knew that; didn’t you?

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.

Monday, October 17, 2016

Hack 3.23 Sorting Character Variables That Have Leading Numbers in Them

SAS Programming Professionals,

Did you know that you can sort character variables with leading numbers in them into proper numeric order?

You can do this in the SORT procedure by using the SORTSEQ=LINGUISTIC option and also specifying NUMERIC_COLLATION=ON as the collating rule.  Sound like a lot of technical mumbo-jumbo to you?  Well, a simple example will doubtlessly make this clearer.

The following code:

data test;
input address $12.;
datalines;
123 Main St.
05 Main St.
45 Main St.
8 Main St.
;

proc sort data=test;
 by address;
run;

proc print data=test noobs;
run;

… produces this output:

  address

05 Main St.
123 Main St.
45 Main St.
8 Main St.

As you can see, the street addresses are not really sorted into ascending order by house number.  However, if we change the sort statement to this:

proc sort data=test sortseq=linguistic(numeric_collation=on);
by address;
run;

…we get the following:

  address

05 Main St.
8 Main St.
45 Main St.
123 Main St.

…which is much better for our interviewers to use when planning their routes down Main street USA!

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.

Monday, October 10, 2016

Hack 3.22 Simplifying PROC DATASETS Output with the NOLIST Option

SAS Programming Professionals,

Did you know that you can stop PROC DATASETS from inflating your SAS log with those pesky long library contents listings?

By default, PROC DATASETS prints a list of the contents of the SAS library it is operating on to the SAS log.  If you have one or two SAS files in that library, then it may not be noteworthy.  But, if you have scores, dozens, hundreds, or more SAS files in the SAS data library, it can be burdensome to have such information plopped into the middle of the log—especially if you use the DATASETS procedure multiple times in the same program.

The NOLIST option on the PROC DATASETS statement prevents the procedure from printing a library directory in your SAS log.  Here is an example.

This code:

libname ctemp "c:\temp";

proc datasets library=ctemp;
       modify drugs;
            rename subject = patient;
      run;
quit;

Produces the following SAS log:

1    libname ctemp "c:\temp";
NOTE: Libref CTEMP was successfully assigned as follows:
      Engine:        V9
      Physical Name: c:\temp

2    proc datasets library=ctemp;
Directory

Libref         CTEMP
Engine         V9
Physical Name  c:\temp
Filename       c:\temp


Member
#  Name                 Type     File Size  Last Modified

1  BUY                  DATA          5120  19May11:09:53:04
2  CARS                 DATA         74752  14Jun11:16:17:29
3  CLASS                DATA          5120  09Mar12:16:50:44
4  CLASSCAT             CATALOG      13312  13Jan12:11:06:42
5  DRUGS                DATA         13312  09Mar12:16:54:29
6  FINAL                DATA          5120  18Oct10:15:01:04
7  FORMATS              CATALOG    2499584  17Sep07:10:25:50
8  HEART                DATA        902144  09Mar12:16:55:46
9  HEART_BACKUP         DATA        885760  05Jul11:16:08:07
10  MASSZIPS             DATA        377856  20May11:16:15:19
11  MYFMTS               CATALOG      17408  13Feb12:14:34:11
12  ORSALES              DATA        111616  05Jun09:08:30:45
13  PRDSAL2              DATA       2790400  24Jun11:11:26:33
14  SALES                DATA          5120  12Nov09:09:46:31
15  SAMPLE_RESPITE       DATA          9216  19Jul11:11:54:20
16  SAMPLE_RESPITE1      DATA          5120  19Jul11:13:14:25
17  SASMACR              CATALOG      33792  07Nov11:10:06:08
18  SHOEAUD              DATA        132096  12Nov09:10:41:52
19  SHOES                DATA         17408  12Nov09:10:28:26
    SHOES                AUDIT       140288  12Nov09:10:28:26
20  SUMCLASS             DATA          5120  03Dec09:16:14:27
21  ZIPCODE_11Q2_UNIQUE  DATA      33915904  21May11:09:40:21
    ZIPCODE_11Q2_UNIQUE  INDEX       762880  21May11:09:40:20
22  ZIPMIL_11Q2          DATA        492544  21May11:09:40:22
    ZIPMIL_11Q2          INDEX        21504  21May11:09:40:22
23  ZIPMISC_11Q2         DATA         33792  21May11:09:40:22
    ZIPMISC_11Q2         INDEX         9216  21May11:09:40:22
3
4        modify drugs;
5            rename subject = patient;
NOTE: Renaming variable subject to patient.
6        run;

NOTE: MODIFY was successful for CTEMP.DRUGS.DATA.
7    quit;

Interesting enough, but definitely TMI!  The NOTE’s tell the tale that we were successful in renaming SUBJECT to PATIENT and that our modify of the DRUGS data set was successful.  So, we do not really need any additional information about the SAS data library at this time.

We decided that we want to rename PATIENT back to SUBJECT, and this time use the NOLIST option:

proc datasets library=ctemp nolist;

      modify drugs;
            rename patient = subject ;
      run;
quit;

… resulting in this log:

10   proc datasets library=ctemp nolist;
11
12       modify drugs;
13           rename patient = subject ;
NOTE: Renaming variable patient to subject.
14       run;

NOTE: MODIFY was successful for CTEMP.DRUGS.DATA.
15   quit;

Now, that’s exactly what we want: a leaner, cleaner log!

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.

Monday, October 3, 2016

Hack 3.21 Saving the Output of PROC CONTENTS to a SAS Data Set

SAS Programming Professionals,

Did you know that you can save the output of PROC CONTENTS to a SAS data set?

When you do so, it creates a SAS data set with one observation for each variable found in the original data set.  Each observation has specific information about one particular variable, as well as some redundant information about the original SAS data library.  For example:

proc contents data=sashelp.prdsale out=work.prdsale_contents;
run;

Creates a new SAS data set with one ten rows; one each for variables ACTUAL, COUNTRY, DIVISION, MONTH, PREDICT, PRODTYPE, PRODUCT, QUARTER, REGION, and YEAR.  A PROC CONTENTS of the new data set (work.prdsale_contents) yields, in part:

Alphabetic List of Variables and Attributes

     #    Variable    Type    Len    Format         Label

    32    CHARSET     Char      8                   Host Character Set
    33    COLLATE     Char      8                   Collating Sequence
    28    COMPRESS    Char      8                   Compression Routine
    20    CRDATE      Num       8    DATETIME16.    Create Date
    22    DELOBS      Num       8                   Deleted Observations in Data Set
    36    ENCRYPT     Char      8                   Encryption Routine
    19    ENGINE      Char      8                   Engine Name
    27    FLAGS       Char      3                   Update Flags (Protect Contribute Add)
    10    FORMAT      Char     32                   Variable Format
    12    FORMATD     Num       8                   Number of Format Decimals
    11    FORMATL     Num       8                   Format Length
    38    GENMAX      Num       8                   Maximum Number of Generations
    40    GENNEXT     Num       8                   Next Generation Number
    39    GENNUM      Num       8                   Generation Number
    25    IDXCOUNT    Num       8                   Number of Indexes for Data Set
    23    IDXUSAGE    Char      9                   Use of Variable in Indexes
    13    INFORMAT    Char     32                   Variable Informat
    15    INFORMD     Num       8                   Number of Informat Decimals
    14    INFORML     Num       8                   Informat Length
    16    JUST        Num       8                   Justification
     7    LENGTH      Num       8                   Variable Length
     1    LIBNAME     Char      8                   Library Name
     3    MEMLABEL    Char    256                   Data Set Label
     2    MEMNAME     Char     32                   Library Member Name
    24    MEMTYPE     Char      8                   Library Member Type
    21    MODATE      Num       8    DATETIME16.    Last Modified Date
     5   NAME       Char    32                Variable Name
    18    NOBS        Num       8                   Observations in Data Set
    34    NODUPKEY    Char      3                   Sort Option: No Duplicate Keys
    35    NODUPREC    Char      3                   Sort Option: No Duplicate Records
    17    NPOS        Num       8                   Position in Buffer
    37    POINTOBS    Char      3                   Point to Observations
    26    PROTECT     Char      3                   Password Protection (Read Write Alter)
    29    REUSE       Char      3                   Reuse Space
    30    SORTED      Num       8                   Sorted and/or Validated
    31    SORTEDBY    Num       8                   Position of Variable in Sortedby Clause
     6    TYPE        Num       8                   Variable Type
     4    TYPEMEM     Char      8                   Special Data Set Type (From TYPE=)
     8    VARNUM      Num       8                   Variable Number

I color coded the rows in the output above to better differentiate them.  Those colored blue specify information about the original SAS data set.  Those colored red specify information about the particular variable (ACTUAL, COUNTRY, DIVISION, etc.).  Variable names are stored in the NAME variable of this new data set.  So, in observation 1, NAME would have the value of “Actual”, in observation #2 NAME would have the value “Predict”, and so on.

There are dozens of uses for a PROC CONTENTS SAS data set, including documenting the specifics of the original SAS data set, programmatically querying a variable’s characteristics (Format, Informat, Length, whether it is used in an index, etc.), and programmatically querying the original data set’s attributes (whether it is compressed or indexed, the data set’s label, etc.).  So, add being able to create a PROC CONTENTS data set to your already burgeoning SAS programming techniques toolkit.

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)

Excerpt from the book:  Did You Know That?  Essential Hacks for Clever SAS Programmers

I plan to post each and every one of the hacks in the book to social media on a weekly basis.  Please pass them along to colleagues who you know would benefit.