BASE SAS Programming – Formatted Column Input




The previous post in the “BASE SAS Tutorial” series, started the discussion about the fixed column inputs. We had discussed, via a sample SAS program, how to import data in fixed columns by providing the range and the data type in the input statement. In this post, we would be discussion how to import the same data, as a formatted input.  Let us consider the sample SAS program as shown below:

data test;
input  @1 Subj $3.   @4 DOB mmddyy10.   @14 Gender $1.   @15 Balance 7.;

datalines;
00110/21/1955M 1145
00211/18/2001F 18722
00305/07/1944M 123.45
00407/25/1945F -12345
;
run;

In this program, just like in the earlier post, we are looking to import the datalines. The key difference here, can be seen in the way the Input statement is written for the same data set. The input statement uses the “@” character (used before the field name) to indicate the position of the field in the data line. For instance, the field “DOB” is starting at the position number 4 in all the data lines. The next noteworthy difference in the Input statement is the way the length of the field (along with the indication of the data type) is specified after the field name is listed. Lastly, the major change here is the use of informats in the Input statement. We would discuss the informats in more detail later on, but for now, it would be enough to consider them as providing an indication to SAS as to what format the data type is provided in the input and to store it accordingly. For instance, the field “DOB” has dates in the format of “MM/DD/YYYY”. This can be read in as a character value but it would not be useful, if we wanted to use any date specific functions on this field. Therefore, it is critical to use the appropriate informat to import the data values correctly. SAS has built standard informats which we will be discussing in later posts.

Let us consider, the results from the above sample SAS program.

Log

Output Tab:

Please pay attention to the output tab from the above sample SAS program. We have used informats to read in the data. Note how the date values are stored as numbers instead of displaying as dates are expected to be displayed. This is because SAS stores dates as numbers. The starting date in the SAS system is taken to be at 1 Jan 1960. All dates are stored in reference to this date value, which is considered as 0. The negative values in the DOB column in the output tab represents the number of days prior to the SAS system date.

At this point we hope that you, the readers are comfortable with the basic data import methods in BASE SAS. These the core concepts on which, we will be building on in future posts. It is also important to understand these from the point of view of preparing for the BASE SAS certification. So keep practicing.

You can find relevant reference books in the sidebar of this post, if you should want to purchase the same for further studying. BASE SAS Certification guide and the Little SAS book are extremely useful for preparation and also act as a wonderful reference for SAS concepts.

Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




Categories SAS

BASE SAS Programming – Fixed Column Input




Previously we discussed how to import data using the list input method. In this post  in the ‘BASE SAS tutorial” series, we would be discussing how to import data using the fixed columns method. There are two approaches within the fixed column method. The first is the Column Input. Under this approach, we provide the range of the fields as present in the raw data. Consider the below SAS program as an example.

data test;
input Subj        $ 1-3
DOB       $ 4-13
Gender  $ 14
Balance  15-21;

datalines;
00110/21/1955M 1145
00211/18/2001F 18722
00305/07/1944M 123.45
00407/25/1945F -12345
;
run;

We can the see the datalines are formatted such that all individual fields are within fixed lenghts without any overlap. In such cases, the input can written as in the SAS program above, where the range of the fields is specified along with the indication of whether or not the field is a character value. So in our example, we can see that the field DOB is being read as a character field ranging from the 4 character (from the left) to a width of 9 characters up to the 13th. All remaining fields are also defined in a similar fashion.

See below for the resulting logs and outputs from the above SAS program in the SAS University Edition interface:

LOG:

Output Tab:

 

It should be noted, that the raw data may not always be present in fixed width columns without overlap. These cases may be quite rare and may only be applicable to smaller projects. It is an important concept, nonetheless. It shows of the versatility of the Input Statement.

Please find relevant reference books in the sidebar of this post, if you should want to purchase the same for further studying. BASE SAS Certification guide and the Little SAS book are extremely useful for preparation and also act as a wonderful reference for SAS concepts.

Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




Categories SAS

BASE SAS Programming – Using DLM and DSD Options Together




Hi fellow data analysts. We are having a lot of fun putting together these posts for “BASE SAS Tutorial” series. We certainly hope you are enjoying and learning something from them as well. We have covered some very important ground so far but we have a long way to go. In this post, we will be discussing a possible scenario where are demonstrate the use of the DLM and DSD option together.

We have seen features of the DSD option. It is important to note that they all work independent of each other. For instance, if we were to set the delimiter as “/” using the DLM option, an accompanying DSD option would still identify missing values between consecutive delimiters without resetting the default delimiter to comma. Let us look at the sample data we would be using for this demo.

Sample Data

The below provided sample SAS program is used to read in the above sample data.

data sample1;
infile ‘/folders/myshortcuts/myFolder/Sample data – CSV.txt’ dsd dlm = “/”;
input gender $ Age Height Weight;
run;

Note: We are using the DSD and DLM options simultaneously in order to read in data from a delimited file with the delimiter “/”. The DSD option is useful to identify the missing values in the data. The outputs generated from this code are:

Logs:

Output:

As can be seen in the above screenshots, the data is read in appropriately. This combination is used so often in a day to day scenario that it is a must to learn, understand and deploy as if it were your second nature. Often, folks get confused, but with practice it is fairly easy to use and implement. Keep practicing by creating other sample data sets and see how the codes responds to changes.

Please find relevant reference books in the sidebar of this post, if you should want to purchase the same for further studying. BASE SAS Certification guide and the Little SAS book are extremely useful for preparation and also act as a wonderful reference for SAS concepts.

Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.



BASE SAS Programming – Working with CSV Files – DSD Option




Hello people. Welcome to another post in the “BASE SAS Tutorial” series. We are making some quick progress in the last few posts. So far we have discussed how SAS reads text files as list inputs in the form of delimited files ( space delimited, or otherwise). In this post, we will look more closely at the a very useful option that is used with the infile statement. This is the DSD option.

Let us quickly first lay down the uses of the DSD option before we see its practical application.

  1. It can change any default delimiter from the space to a comma. This can be useful because very often the data files are received in the .csv format. So instead of using the DLM option, we can simply use the DSD as it also provides other benefits.
  2. If there are two consecutive delimiters, it considers the same as a missing value. The data doesn’t need to have specified missing values. This is very useful as we can imagine, how often we get data files where there are missing values. The DSD option helps to resolve this problem quite easily.
  3. Lastly, any double quotes around the character fields are stripped.

These are very useful capabilities for only three words.

Let us consider the data as shown below:

Please note the the missing value on the third record and the double quote qualifiers for the Gender column. See below the sample SAS Program to import the above sample data using the DSD option.

data sample1;
infile ‘/folders/myshortcuts/myFolder/Sample data – CSV.txt’ dsd;
input gender $
Age
Height
Weight;
run;

See below the results as they appear in the log and the output folder:

Log Tab:

Output Data Tab:

Play around with the DSD option and see how it responds to changes in the raw data file with each import. Please remember to review the log after running every sample script even if the results are all good. Any good SAS programmer must be in the habit of being able to read the SAS log. It is quite important.

You can find relevant reference books in the sidebar, if you should want to purchase the same for further studying. Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




Categories SAS

BASE SAS Programming – Import Delimited Files




Hello all. Nice to have you going strong with our “BASE SAS Tutorial’ series. In the previous post, we looked at how the SAS tool reads text files with space delimiter. Spaces being the default delimiter for the tool. However, in the real world scenario, data is never provided from any source with space as the delimiter value. Luckily, SAS allows a great deal of flexibility to read delimited files with a variety of options of different elements. Let us look at the sample program below where we try to import the same data as in the previous post, but with the “/” character as the delimiter instead of the the default space.

Data sample1;
infile ‘/folders/myshortcuts/myFolder/Sample data – delimiter.txt’ dlm = “/”;
input gender $ Age Height Weight;
run;

Snapshot of the data contained in the file;

The output of the code as it would appear in the log and the output tab:

Log:

Output Tab:

This is the most common method of importing data into any SAS tool that is available for end users. It is important to understand clearly how the simple code shared in this post is working so that it can be scaled for data files with more number of columns and complications. Please try it out and share your experience in the comments below. Please find relevant reference books in the sidebar, if you should want to purchase the same for further studying. Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




BASE SAS Programming – Data Import




Welcome to another post in the ‘BASE SAS Tutorial’ series. SAS is a very robust tool, capable of reading data from a variety of sources. In the next few posts, we will be discussing the various ways that data files are taken as an input into SAS. Let us start with text files or more specifically delimited text files.

So firstly, SAS considers spaces as the default value for the delimiter in a data file. Let us consider the piece of code below:

data sample;
infile ‘/folders/myshortcuts/myFolder/Sample data – blank delimiter.txt’;
input gender $ Age Height Weight;
run;

Screenshot of the data file being imported in the program:

The code provided above should work as it is in your SAS University software as long as there is a data file in the ‘myfolder’ location created during set up for SAS University Edition.

Let us quickly review the above information. The code is attempting to read a sample text file, which is space delimited. The file has 4 columns namely, Gender, Age, Height & Weight. Please note how the character variables are indicated in the input statement. Also please note, the infile statement. This statement is used to read in data into SAS programs. Users can provide options with this statement in order to change the delimiter depending on the file being imported.

Take time to review this the concepts shared in the post. You can create a sample data file and the use the program to test out your observations. Please try it out and share your experience in the comments below. Please find relevant reference books in the sidebar, if you should want to purchase the same for further studying. Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




Categories SAS

BASE SAS Programming – SAS Libraries




Many concepts seem intertwined when we study SAS. It is important to be patient and keep pushing through. Fundamental concepts are the key to understanding more advanced concepts. Welcome back fellow readers. We hope you have been able to follow along up to this point without much trouble.

In this post we will be looking into SAS Libraries. There are two types of libraries in SAS. There are permanent and temporary libraries. Temporary libraries are store data set only for the current session. Permanent libraries store data sets to a permanent location on the local system. Consider the screenshot below:

All the default libraries in SAS University Edition are listed in the left pane under the libraries section. Notice the ‘Work’ library. This is the default library created for any temporary data sets created in the current session. As seen in the SAS program written in the right pane, the temporary data set ‘sample’ is created in the ‘work’ library. These data sets won’t exist after the session is closed.

To understand how SAS creates the permanent libraries in SAS, let us consider the screenshots below:

In the above screenshot we can see the Libname statement being used to create a permanent library. Simultaneously consider the the screenshot below where the log indicates the successful creation of the permanent library ‘Test1’. Also notice, the changes in the left pane of SAS University interface. There is a permanent library ‘Test1’ with a data set ‘sample’ saved. As you can see the log indicates a successful run of the program, so the expected changes are now updated.

We hope this post provides some clarity on SAS libraries. You can now try to now start trying to create your own permanent libraries. Please note, if you are using any other  software to run SAS programs, similar interface elements should be present where you may locate the relevant local paths to libraries and based on how the project folders are created. For instance, the layout for BASE SAS tool would be somewhat similar but the path for the libraries would be more kin to an actual system path instead of the one shown in the sample SAS program.

Keep practicing and leave comments in case you face any challenges. Please find relevant reference books in the sidebar, if you should want to purchase the same for further studying. Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.




Categories SAS

BASE SAS Programming – SAS Data Types

Welcome readers. We hope the “BASE SAS Tutorial’ series is working out for you all. So far we have tried to distill all the important elements that would be needed for first timers to get started with BASE SAS programs. If there is any thought, please let us know. From this post onward we will be discussing more direct topics related to the SAS programming so you can start practicing your own programs.

In this post let us consider the most obvious data types encountered in SAS. These are character, numeric and date formats. Just like Audit Command Language, these are the most commonly used data types in SAS. These three data types essentially capture the three major questions to be addressed in any data source i.e. what, when and how many. After all, any data source is a record of any event or commodity at a given point in time. Any analysis is based on answering these questions when data is analyzed.

Let us consider the sample SAS program in the screenshot below:

Consider the scope of the above program. The above SAS program is attempting to read a data source (data lines in this case) of two lines with the information (name, age, and DOB). The three main data types are clearly illustrated here. Take a look at the input statement in the SAS program and observe that the three different variables are input in different ways. The character variable ‘Name’ is indicated as so by the ‘$’ character. the numeric field age does not require any character. The date value has a small twist. SAS stores data only as numbers or characters.  As a matter of fact, all dates are stored as numeric values with the reference to the 1st January 1960 as the starting date. However, the date values are imported with the ‘informat’ (which would be discussed in more detail in future posts) indicating to SAS, that this field is a date. Further, in order to display the date appropriately in the dataset, the format statement (to discuss in more detail in future posts) is used. The output would be as shown in the screenshot below:

A look at the ‘Output Data’ tab shows how the data is stored in the dataset. Look at the section on the left pane where the data types are listed. The different symbols demonstrate the different datasets.

A few points to remember when importing a data set and deciding on the data types for the data values in the raw data sets. First of all, one must always aim to import in all the data fields as a character so that all values are captured so that a proper analysis may be one of the data quality overall. Secondly, there are functions for each data type (will be discussed in future posts).

Please find relevant reference books in the sidebar, if you should want to purchase the same for further studying. Please sign up for our newsletter, so that we may keep you posted on the latest activity on our website and Youtube channel.

Categories SAS