Monday, April 13, 2015

Keep Your Hands Off of My SAS Data Sets!

SAS Programming Professionals,

I am all about sharing.  Knock on my door and I will gladly lend you a stick of margarine, a cup of sugar, an egg or two, some flour, a corkscrew, or a beer.  Not a problem to borrow a tie, one of my extra belts, a white shirt (if it fits), a scarf, or a pair of gloves.  Sure, I will gladly lend you one of the books in my home library, a couple of my music CDs, or one of my movie DVDs.  I am reasonably sure that most of these items will either be returned to me in due time, or reciprocated, or paid forward.  But, keep your hands off of my SAS data sets!

I would bet that final sentiment regarding SAS data sets is pretty prevalent in organizations where SAS programmers work with shared storage resources.  Whether you are storing your SAS data sets in server directories or in network directories, you do not want to have them deleted, re-sorted, updated, or otherwise overwritten by other programmers on your team.  At least, not without your permission.

Fortunately, most organizations create security permissions that largely safeguard against unauthorized access to data.  They implement security packages--either native to the OS or purchased from vendors--to ensure that data is accessed only by those with the need to know.  Groups of programmers are given access to specific directories, and other staff, with the exception of systems administrators, cannot get into them or see them at all.  Despite all of this protection, data integrity issues can come from programmers within your own group; programmers who have access to the same directories that you do; programmers who have the same access rights that you do.

Unfortunately, there is no foolproof way for you to keep your teammates from crunching your SAS data sets.  But, here are a few protection measures that you can put into place to help safeguard your SAS data sets:

  • SAS Data Set Passwords.  SAS allows you to specify ALTER, READ, and WRITE passwords.  Users must know the password in order to process password-protected data sets with SAS.  So, you can specify passwords for your do-not-disturb SAS data sets, wait for your co-workers to complain, and determine if they really do need access to those data sets.
  • ACCESS=READONLY LIBNAME Option.  This option specifies that no data sets in the library can be updated and no new files can be written to the library through that libref.  This option can be effectively deployed in a shared group AUTOEXEC.sas file.  If your colleagues grumble, tell them to copy the data sets in question to one of their libraries and process them there.  Caution them to be mindful of version control issues.
  • SAS Views.  Create views of your SAS data sets via the DATA step or PROC SQL and allow your colleagues to use the views instead of the actual data sets.  The beauty of this approach is that you can have your valued data sets in one directory and the views in another.  Point your coworkers at the views directory and do not tell them the whereabouts of the permanent data sets.
  • Generation Groups.  You can tell SAS to keep several generations of your SAS data sets.  Consequently, when one of them is replaced, you will still have the older versions available.
  • SAS Audit Trails.  SAS audit trails can be used to determine, after the fact, who updated your SAS data sets.  They don't stop your colleagues from actually updating SAS data sets, but you can use them to determine the who, what, and when so that you can storm into the right office for an explanation.
  • LOCK Statement.  You could use the LOCK statement to lock your SAS data sets so that no other SAS program can read or write to them.  This is a bit extreme and would require that your SAS program with the LOCK statements keep running for as long as you want to safeguard your SAS data sets.
  • Zip Files.  You could simply zip your SAS data sets up into zip files, delete the permanent SAS data sets, and restore the data sets from the zip files when you need them.  This is another extreme measure, but if your data sets get clobbered on a regular basis, you may find it more appealing than having them constantly restored.
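
For those who like to see syntax, here is a minimal sketch of several of the measures listed above.  Treat it as an illustration under assumptions, not a recipe: the directory paths, the librefs (MINE, SHARED, VIEWS, SRC), the data set names, and the passwords are all hypothetical, and SASHELP.CLASS simply stands in for your real data.

```sas
/* All paths, librefs, data set names, and passwords are hypothetical. */
libname mine '/path/to/my/data';

/* Data set passwords: WRITE= guards updates; ALTER= guards replacing */
/* or deleting the data set.  Reading remains unrestricted here.      */
data mine.sales (write=wpass1 alter=apass1);
   set sashelp.class;
run;

/* ACCESS=READONLY: a libref suitable for a shared autoexec.  Nothing */
/* can be updated or created through this libref.                     */
libname shared '/path/to/my/data' access=readonly;

/* A PROC SQL view kept in a separate directory.  The USING clause    */
/* embeds the source libref in the view definition.                   */
libname views '/path/to/views';
proc sql;
   create view views.sales_v as
      select *
      from src.sales
      using libname src '/path/to/my/data' access=readonly;
quit;

/* Generation groups: keep up to three versions of BUDGET. */
data mine.budget (genmax=3);
   set sashelp.class;
run;

/* Once BUDGET has been replaced at least once, the previous version  */
/* can be read with a relative generation number.                     */
/* proc print data=mine.budget (gennum=-1); run;                      */

/* Audit trail: record who changed LEDGER, what, and when.  Audit     */
/* trails cannot be initiated on generation data sets, so this uses   */
/* a separate data set.                                               */
data mine.ledger;
   set sashelp.class;
run;

proc datasets library=mine nolist;
   audit ledger;
   initiate;
quit;

/* Review the audit trail later. */
proc print data=mine.ledger (type=audit);
run;

/* LOCK: hold an exclusive lock while this session is running. */
lock mine.ledger;
/* ... work that must not be disturbed ... */
lock mine.ledger clear;
```

Keep in mind that a view built with an embedded LIBNAME carries the source path inside its definition, so a curious colleague could still find the underlying data sets; it discourages casual access rather than preventing it.
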
Obviously, the best solution for keeping your important SAS data sets from being updated by your teammates is good communication among all involved, and a shared set of best practices for accessing and modifying SAS data sets.  But, when that is not available, you may have to reach for some of the ideas I have posted here.

Oh, and about that book that you borrowed from my SAS bookshelf library a couple of months ago... can I get it back?

Best of luck in all your SAS endeavors!

----MMMMIIIIKKKKEEEE
aka Michael A. Raithel

Amazon Author's Page:  
http://www.amazon.com/Michael-A.-Raithel/e/B001K8GG90/ref=ntt_dp_epwbk_0