Audit Command Language – Data Issues

Welcome to the ‘Audit Command Language Tutorial for Beginners’. So far, we have discussed how to import delimited files and Excel files into ACL projects to perform analytics. The next step is to check that the data has been imported correctly. Two kinds of issues are most commonly seen when importing data into any tool.

  1. Data Spill: This type of data issue causes the data to spill over into the next columns. It generally happens because a field contains extra delimiters (this is most common in description fields). In the screenshot below, the third line has an extra delimiter.
  2. Data Split: The second most common data issue is the splitting of data lines. This issue causes a data line to be split across two or more lines in the raw file. The same is reflected in the imported table in ACL. The screenshot below illustrates such an issue. The sixth line in the file has a new line character, which causes the rest of the data line to shift to the next line.
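
Both issues tend to show up as lines whose field count differs from the header line, so one way to spot them in the raw file before importing is to count fields per line. The following is a minimal Python sketch of that idea, not an ACL feature. The function name `field_count_report` and the sample data are hypothetical, and it assumes the header line is correct and that fields are not quoted.

```python
def field_count_report(text, delimiter=",", expected=None):
    """Return (expected_count, problems).

    problems is a list of (line_number, field_count) for every raw line
    whose field count differs from the expected count.
    """
    lines = text.splitlines()
    counts = [len(line.split(delimiter)) for line in lines]
    if expected is None:
        # Assumption: the first (header) line has the correct number of fields.
        expected = counts[0]
    problems = [(i, n) for i, n in enumerate(counts, start=1) if n != expected]
    return expected, problems


# Hypothetical raw file content, for illustration only.
sample = (
    "ID,Description,Amount\n"
    "1,Office chairs,250\n"
    "2,Desk, large,400\n"   # extra delimiter in the description: spill
    "3,Printer\n"           # new line character inside the record: split
    "paper,35\n"
)

expected, problems = field_count_report(sample)
for line_no, n in problems:
    kind = "possible spill" if n > expected else "possible split"
    print(f"line {line_no}: {n} fields, expected {expected} ({kind})")
```

Too many fields hints at a spill, and too few hints at a split. A file with quoted fields would need a proper CSV parser rather than a plain `split`.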

Simply knowing about such issues is not enough if they can’t be identified. After all, it is impractical to locate such issues by eye in large files when working on a live project. The fastest ways to identify these issues after import are as follows:

  1. Classify Command: In most kinds of data files, certain data columns have a fixed set of unique values or similar looking values. For example, a data column like ‘Payment_Mode’ would only have a few possible values like ‘Card’, ‘Cash’ etc. Running the classify command on such a column should return only those values. If there are any other values, go to the data lines with these values and verify them in the raw data file.
  2. Summarize Command: The summarize command works like the classify command to provide the unique values in a column. However, along with the unique values, the summarize command also provides the total line count for each unique value and even subtotals for any numeric fields in the data. This information can be used to locate any problem lines.
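
To make the idea behind these two checks concrete, here is a small Python sketch that mimics them on a handful of rows. It is not ACL syntax. The helper names `classify` and `summarize`, the ‘Amount’ column and the sample rows are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical imported rows; the last one carries a value spilled from another column.
rows = [
    {"Payment_Mode": "Card", "Amount": 120.0},
    {"Payment_Mode": "Cash", "Amount": 45.5},
    {"Payment_Mode": "Card", "Amount": 80.0},
    {"Payment_Mode": "2021-03-14", "Amount": 0.0},
]


def classify(rows, key):
    """Count how many rows carry each unique value of `key`."""
    return Counter(row[key] for row in rows)


def summarize(rows, key, numeric_field):
    """Per unique value of `key`: row count and subtotal of `numeric_field`."""
    result = defaultdict(lambda: {"count": 0, "subtotal": 0.0})
    for row in rows:
        bucket = result[row[key]]
        bucket["count"] += 1
        bucket["subtotal"] += row[numeric_field]
    return dict(result)


allowed = {"Card", "Cash"}  # the values we expect in this column
counts = classify(rows, "Payment_Mode")
unexpected = sorted(value for value in counts if value not in allowed)
summary = summarize(rows, "Payment_Mode", "Amount")

print(counts)
print("Values to verify in the raw file:", unexpected)
print(summary)
```

Any value outside the expected set is a candidate for a spilled or split line and is worth tracing back to the raw data file.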

You can refer to the ACL Audit Command Language Help for more details on the syntax of these commands. There will be dedicated posts and videos detailing the use of the above commands.

There are some other techniques that you may leverage to identify such issues. There are some helpful tips in the videos below:

Data Spills

Data Split

There are a number of issues that can occur when importing data files. Now that you are aware of the most common issues and how to locate them, you can check your data imports for accuracy. This is probably the most tedious exercise, but also the most important, as incorrectly imported data would affect the outcome of the analytics performed.

Please keep practicing and feel free to reach out to us with your valuable feedback and comments. Please go to the website to review ACL script examples and ACL script commands, and sign up for our newsletter so that we may keep you posted on the latest activity on our website and YouTube channel.

 

Audit Command Language – Print Image Files

Hello again, and welcome back to the ‘Audit Command Language Tutorial for Beginners’ series. Readers who have been following the series should currently be working on importing data files. So far we have discussed how to import delimited text files and Excel files. As common as these input files are, there is another approach that is extremely useful to learn, because life has a way of throwing a curve ball when you least expect it. This approach is useful for ‘Print Image’ files. These files are typically in ‘.pdf’ or ‘.txt’ format, where the data is laid out rather unconventionally yet in a consistent pattern, which means it can be imported into ACL.

A print image file would look somewhat like this:

import-prn-step-1

The key difference in such a file is that there are no delimiters to separate the columns. Additionally, there is information, like the date in the top right corner, that has to be captured along with each of the detail lines in the data.

Let’s consider an example:

import-prn-step-2

In the above example, there are more data points that belong to the set of values to be repeated for each of the detail lines. These data points appear to be ‘Contact’, ‘Account Number’, ‘Customer’, ‘Order Number’ & ‘Ship Date’.

Now that we can identify the relevant parts of a print image file, let’s list the steps to identify the structure of the table that would be created from such a file:

  1. Identifying the Detail Lines – The detail lines are the unique data lines. These hold the columns labelled ‘Media’, ‘QTY’, ‘Description’, ‘Label/No.’ etc. All of these values, with the exception of the ‘Media’ column, need not be repeated to fill in a table. The value ‘CD’ in the ‘Media’ column, however, needs to be repeated for each of the lines below it for order number 536118.
  2. Identifying the Header Lines – These are the lines that form the header section of each block of data. In our example, the values for ‘Contact’, ‘Account Number’, ‘Customer’, ‘Order Number’ & ‘Ship Date’ are the header lines. These values are meant to be repeated across all the individual detail lines of each block. Try to imagine four data lines, each with ‘Order Number’ as ‘536118’, ‘Contact’ as ‘Marvin Mabry’, ‘Account Number’ as ‘17959’ etc.
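
The header and detail logic above can be sketched in a few lines of Python. This only illustrates the concept and is not how ACL implements the import. The layout in `SAMPLE`, the column positions, the quantities and the placeholder descriptions are all assumptions, since the real file is only shown as an image. Only the order number, contact, account number and the ‘CD’ value come from the example above.

```python
# Hypothetical print image layout: labelled header lines followed by
# fixed-width detail lines (Media in columns 0-7, QTY in 8-11, Description from 12).
SAMPLE = """\
Order Number: 536118
Contact: Marvin Mabry
Account Number: 17959
CD      2   Placeholder title A
        1   Placeholder title B
        3   Placeholder title C
        1   Placeholder title D
"""

HEADER_LABELS = ("Order Number", "Contact", "Account Number")


def parse_print_image(text):
    """Combine header values and detail lines into one flat record per detail line."""
    header = {}
    last_media = ""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        label, sep, value = line.partition(":")
        if sep and label.strip() in HEADER_LABELS:
            # Header line: remember the value for all detail lines that follow.
            header[label.strip()] = value.strip()
            last_media = ""  # a new header block starts with no Media value
            continue
        # Detail line: slice the fixed-width columns.
        media = line[0:8].strip()
        qty = line[8:12].strip()
        description = line[12:].strip()
        if media:
            last_media = media  # fill blank Media cells down from the line above
        records.append(
            {**header, "Media": last_media, "QTY": int(qty), "Description": description}
        )
    return records


records = parse_print_image(SAMPLE)
for record in records:
    print(record)
```

Each printed record carries the repeated header values plus its own detail values, which is the shape of the final imported table described above.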

Combining the header and the detail lines creates the complete individual data lines for the final imported table in ACL Audit Command Language. The procedure to import such files is a bit more complicated than for simple delimited files. The method is covered in the two videos below:

Import Print Image files – Part 1

Import Print Image files – Part 2

It is understandable if you face some issues when attempting this exercise. Please leave comments on the videos above on the channel or here on the website. This approach is not restricted to data import in ACL Audit Command Language. Understanding the concept will allow you to work across different tools. It is imperative that you practice with sample data files. If you need sample files, we can certainly share some. Another good resource for sample files would be a Monarch tutorial. It should be available for free online.
