The little SAS program’s
official name was Extract_Transform_Load_ 0314.sas. But, that name
was much too formal, way too long, and did not roll off of the tongue very
easily at all. So, everybody simply called her: ETL Pi.
ETL Pi was conceived in a
2-hour project strategy meeting in conference room 427C on a Monday afternoon
in early May. She was born three days
later in Susan G.’s cubicle in the west wing of the sixth floor. Susan was a mid-level programmer/analyst with
seven years of SAS programming experience.
She first crafted ETL Pi to perform only the basic functions of extracting
data from the Oracle database and loading it into an Excel spreadsheet for use by the
analysts. Susan wanted to make sure that
she had the basic data transportation mechanisms—the Extract and Load parts of
the ETL process—correctly programmed before moving on. So, ETL Pi began her existence as a compact 18-line
SAS program that performed the bare minimum functions on test data.
After Susan was confident
that she had the Extract and Load mechanisms correct, she added the lines of
code that did the Transform part of
the ETL process by creating constructed variables from those extracted from the Oracle database. She sent the resulting Excel files
to the analysts for review and approval.
The analysts had a lot of comments, ideas, proposals, recommendations, suggestions,
and counter-suggestions; so Susan modified, changed, revised, amended,
adjusted, reformed and altered ETL Pi accordingly until the analysts were finally
satisfied. This resulted in ETL Pi
growing to modest 57 lines of lean, efficient SAS code.
Susan productionized ETL Pi
by first creating a directory for the Excel file that would be created each
night and granting permissions to that directory to the analysts. Then, she modified ETL Pi so that it would be
run on the UNIX server where the production Oracle database was housed. Susan did this by adding five SAS/Connect
statements to ETL Pi specifying how to connect to the UNIX server and
specifying which portions of the SAS code would be sent to execute on the
server. The additions of the OPTIONS, SIGNON, RSUBMIT, ENDRSUBMIT, and SIGNOFF
statements brought ETL Pi to 62 lines of SAS code.
Finally, Susan scheduled ETL
Pi to run at 10:00pm every night from her PC workstation. To do this, she first created a .bat file
that held the command to execute SAS and which also pointed at the full path to
the ETL Pi SAS program. Next, she
invoked the Windows Task Scheduler and went through the several manual point-and-click
steps to schedule ETL Pi to run at 10:00pm every night. Satisfied with her preparations, Susan went home
that evening anticipating a good first run and a clean SAS log.
ETL Pi was nervous as the clock slowly ticked away towards
10:00pm. She was used to working with
small amounts of data and jostling for network bandwidth during the day. But, she didn’t know quite what to expect
when running on the UNIX server where the big, important corporate
programs ran. Tonight promised to be quite
an adventure.
At 10:00pm sharp, the Windows Task Scheduler executed the .bat
file. That invoked SAS in batch mode and
started-up ETL Pi. Once initiated, ETL
Pi executed an OPTIONS statement that contained two SAS/Connect-oriented
options. COMMAMID=TCP specified for SAS to use the TCP communications
access method, and REMOTE=UNIX427 specified that ETL Pi would be working on the
UNIX427 server. Next, ETL Pi executed a
SIGNON statement that pointed to a SAS/Connect script that actually initiated
the connection between Susan’s PC and the UNIX server. The script executed and instantiated a SAS session on the UNIX server that ETL Pi could directly communicate with.
The next block of ETL Pi’s SAS code was sandwiched between RSUBMIT and ENDRSUBMIT SAS/Connect statements.
So, that block of code would be submitted directly to the UNIX server and
be processed there.
Oh, goodness; here I
go! ETL Pi thought. Then,
Wheeeeeeeeeee!
...she exclaimed as she was whizzed at dizzying speeds across the network, through half a mile of fiber-optic cable, and into the SAS session executing on the UNIX427 server sitting in the data center.
Wheeeeeeeeeee!
...she exclaimed as she was whizzed at dizzying speeds across the network, through half a mile of fiber-optic cable, and into the SAS session executing on the UNIX427 server sitting in the data center.
ETL Pi emerged into a very crowded UNIX server. There were system backups, database updates,
database backups, five java programs, and dozens of other SAS programs all jostling
for computer resources. Not intimidated,
ETL Pi bravely elbowed her way into the mix of programs, trying to get as much
memory, disk space, and processor cores as she possibly could. I have
a job to do, she thought. And I am going to do my best to get it done.
ETL Pi executed a LIBNAME statement that had the proper
connection string information to access a production Oracle database using
SAS/Access Interface to Oracle. Next she
executed the SQLGENERATION=DBMS option in an OPTIONS statement. That option allows SAS to use in-database
procedures which executes select SAS procedures within the Oracle database itself in order to take
advantage of the database’s native processing speed and facilities.
Next, ETL Pi submitted a PROC SUMMARY to summarize specific
data in one of the large tables in the Oracle database. Because of the SQLGENERATION option, that
summary was efficiently done within the Oracle database and the small result
set was returned to the SAS work library within ETL Pi’s UNIX SAS session.
That was all that ETL Pi needed to do on the UNIX
server. She finished up by using PROC
DOWNLOAD to download the summarized SAS data set in the work library on UNIX to
the SAS work library on the PC.
Okay, I’m going home
now, she thought as the ENDRSUBMIT statement executed.
Wheeeeeeeeeee!
...she shouted in exhilaration as control was passed back to the SAS session running back on the PC.
Wheeeeeeeeeee!
...she shouted in exhilaration as control was passed back to the SAS session running back on the PC.
Once she was back home, ETL Pi executed two ODS statements,
a DATA step, and the Report procedure that created the Excel spreadsheet
for Susan’s analyst clients. Then, with a
heavy sigh, she executed the SIGNOFF statement which terminated the UNIX SAS
session and closed the connection with the server.
Just before ending, ETL Pi took a quick look at the SAS
log. Not
a WARNING or ERROR anywhere in the log, she beamed proudly. Susan
is going to be very happy when she reviews it in the morning. Now that was fun, and I can’t wait to do it again tomorrow
night!
And with that, the little SAS program's big night out came to an end.
---MMMMIIIIKKKKEEEE
(aka Michael A. Raithel)
Author of the new cult classic for computer programmers: It Only Hurts When I Hit <ENTER>
Print edition: http://tinyurl.com/z8bzx2e
Kindle edition: http://tinyurl.com/zypgqa7
Print edition: http://tinyurl.com/z8bzx2e
Kindle edition: http://tinyurl.com/zypgqa7
SAS/Connect:
Report Procedure: http://support.sas.com/documentation/cdl/en/proc/70377/HTML/default/viewer.htm#p10d8v5dnafqb9n1p35e7kp9q67e.htm
SAS/Access Interface to Oracle: http://support.sas.com/documentation/cdl/en/acreldb/69580/HTML/default/viewer.htm#p1ujrhdoe1p743n12awcf7mwyg81.htm
SQLGENERATION system option:
http://support.sas.com/documentation/cdl/en/acreldb/69580/HTML/default/viewer.htm#p0ltbj6gvz6w1sn1oqg2cq15ff6u.htm
Output Delivery System (ODS):
http://support.sas.com/documentation/cdl/en/odsug/69832/PDF/default/odsug.pdf