BASE SAS Programming – Formatted Column Input




The previous post in the “BASE SAS Tutorial” series started the discussion of fixed-column input. Using a sample SAS program, we saw how to import data in fixed columns by providing the column range and the data type in the INPUT statement. In this post, we discuss how to import the same data using formatted input. Consider the sample SAS program shown below:

data test;
    /* @n moves the pointer to column n; the informat after each name sets how the field is read */
    input @1  Subj    $3.
          @4  DOB     mmddyy10.
          @14 Gender  $1.
          @15 Balance 7.;
    datalines;
00110/21/1955M 1145
00211/18/2001F 18722
00305/07/1944M 123.45
00407/25/1945F -12345
;
run;

In this program, just as in the earlier post, we are importing the data lines. The key difference is in how the INPUT statement is written for the same data. First, the “@” character, followed by a column number and placed before the field name, tells SAS where the field starts in the data line. For instance, “@4 DOB” means the field “DOB” starts at position 4 in every data line. Second, the width of the field, along with an indication of its data type, is given after the field name rather than as a column range: “$3.” reads three columns as text, while “7.” reads seven columns as a number.

Lastly, and most importantly, the INPUT statement uses informats. We will discuss informats in more detail later; for now, it is enough to think of an informat as telling SAS how the raw value is laid out in the input, so that it can be converted and stored correctly. For instance, the field “DOB” holds dates in the form “MM/DD/YYYY”. It could be read as a character value, but then date-specific functions could not be used on it directly. Using the appropriate informat (here, mmddyy10.) is therefore critical to importing the values correctly. SAS has many built-in standard informats, which we will discuss in later posts.
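To make the character-versus-date point concrete, here is a minimal sketch that reads the same ten columns twice, once as text and once with the mmddyy10. informat. The data set name dob_compare and the variables DOB_char, DOB_num and BirthYear are made up for this illustration and are not part of the original program.

```sas
/* Illustrative sketch only: dob_compare, DOB_char, DOB_num and
   BirthYear are hypothetical names, not from the original post. */
data dob_compare;
    input @1 Subj     $3.
          @4 DOB_char $10.        /* kept as text, e.g. 10/21/1955      */
          @4 DOB_num  mmddyy10.;  /* same columns, read as a SAS date   */
    BirthYear = year(DOB_num);    /* date functions work on the numeric value */
    datalines;
00110/21/1955M 1145
00211/18/2001F 18722
;
run;

proc print data=dob_compare;
run;
```

Because “@4” moves the pointer back to column 4, both variables are read from the same part of the line. Only DOB_num can be passed directly to a date function such as YEAR.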

Let us consider the results from the above sample SAS program.

Log:

Output Tab:

Look closely at the Output tab for the above sample SAS program. We used informats to read the data, yet the DOB values appear as plain numbers rather than in the form we expect dates to take. This is because SAS stores dates as numbers. The reference date in SAS is 1 Jan 1960, which is stored as 0, and every other date is stored as the number of days from that date. The negative values in the DOB column of the Output tab therefore represent the number of days before 1 Jan 1960. To display these values as dates, a date format has to be applied to the variable.
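As a hedged sketch of how the stored numbers can be shown as dates, the program below repeats the same INPUT statement and adds a FORMAT statement for DOB. The data set name test_fmt and the variable ref are assumptions made for this example. The FORMAT statement changes only how the value is displayed, not what is stored.

```sas
/* Illustrative sketch only: test_fmt and ref are hypothetical names. */
data test_fmt;
    input @1 Subj $3. @4 DOB mmddyy10. @14 Gender $1. @15 Balance 7.;
    format DOB mmddyy10.;   /* display DOB as MM/DD/YYYY; stored value is unchanged */
    datalines;
00110/21/1955M 1145
00211/18/2001F 18722
00305/07/1944M 123.45
00407/25/1945F -12345
;
run;

proc print data=test_fmt;
run;

/* Check the reference date: the date literal for 1 Jan 1960 is stored as 0 */
data _null_;
    ref = '01JAN1960'd;
    put ref=;               /* writes ref=0 to the log */
run;
```

Other date formats, such as date9., could be used in place of mmddyy10. if a different display style is preferred.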

At this point, we hope that you, the reader, are comfortable with the basic data import methods in BASE SAS. These are the core concepts on which we will build in future posts. They are also important when preparing for the BASE SAS certification, so keep practicing.

You can find relevant reference books in the sidebar of this post if you want to purchase them for further study. The BASE SAS Certification guide and the Little SAS book are extremely useful for preparation and also serve as good references for SAS concepts.
