Thursday, February 17, 2011

Debugging SAS Programs using Various Methods Part 1


At the request of many followers, I am starting a discussion of debugging methods in SAS software. I am dividing the topic into seven parts to make the debugging techniques easier to understand.
We have all experienced the frustration of being under a tight deadline, writing DATA step code frantically in an attempt to obtain our SAS data set. When we push the little running man on the SAS toolbar, we want to see the SAS log and SAS output window pop up error-free. Our blood pressure rises when red text appears in the SAS log. An error is identified and a correction is made. The program is resubmitted. Another red message appears, this time presenting an extensive collection of variables and their values and a message that the INPUT statement could not be completed. Why is SAS being so difficult? There are systematic ways of debugging the code that can reduce time and frustration. By reading the notes and warnings in red, green, and black text we can gain a stronger understanding of how the input data set is evaluated.
Types of Errors:
SAS detects some errors for you. For example, SAS finds misspelled keywords and invalid options and writes messages to the SAS log describing the problems it discovers. SAS finds these errors during compilation and prevents the program from executing.
Syntax errors:
These errors occur when a program statement does not conform to the rules of the SAS language (includes misspellings of keywords or omission of semicolons).
When SAS detects a syntax error, it first tries to correct the error by deleting, inserting, or replacing tokens in the SAS statement. If this action succeeds, SAS notifies you about the action it took by underlining the code where it detected the syntax error and by writing a message to the SAS log about the error and the action taken. SAS continues processing the step.
When SAS cannot correct a syntax error it finds, it stops processing the step. Processing resumes with the next step in the program. The results of the subsequent steps, however, may be incorrect if information in these steps was needed from the step containing the error.
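
As a minimal sketch of both outcomes, the two steps below each contain one deliberate syntax error. The data set name work.demo and the variable x are hypothetical, chosen only for illustration, and the exact log wording may vary by SAS version.

```sas
/* Case 1: misspelled keyword. SAS typically repairs this itself:    */
/* it treats DAT as DATA, writes a warning to the log saying so,     */
/* and continues processing the step.                                */
dat work.demo;
    x = 1;
run;

/* Case 2: missing semicolon after the PROC PRINT statement.         */
/* SAS reads RUN as part of the PROC PRINT statement, generally      */
/* cannot repair it, reports a syntax error, and stops this step.    */
proc print data=work.demo
run;
```

In the first case, look for the underlined token and the accompanying warning in the log. In the second, the error message usually points at the token after the place where the semicolon is missing, so check the statement just before the flagged one.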
Semantic errors:
These errors occur when a program statement is syntactically correct, but the structure of the statement is incorrect (for example, an array reference that is not specified correctly).
The compiler detects semantic errors. SAS does not detect semantic errors during tokenization because nothing is wrong with the tokens. The problem is that SAS does not know how to interpret your code. Typical semantic errors include misspellings of variable names and incorrect specifications of arrays.
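
The two typical cases mentioned above can be sketched as follows. The data set work.semantic and its variables are hypothetical, and the second step assumes the sample data set sashelp.class is available, as it is in most installations.

```sas
/* Incorrect array specification: the array is referenced without    */
/* a subscript. Every token is valid, but the compiler cannot        */
/* interpret the statement and reports an illegal array reference.   */
data work.semantic;
    array scores{3} s1-s3;
    total = scores;          /* should be e.g. scores{1} or sum(of scores{*}) */
run;

/* Misspelled variable name: HEIGHT is typed as HIEGHT. The          */
/* statement is well formed, but no such variable exists in the      */
/* input data set, so the procedure reports that it was not found.   */
proc print data=sashelp.class;
    var hieght;
run;
```

Note that inside a DATA step a misspelled variable name usually does not raise an error at all: SAS creates a new variable and writes a note that it is uninitialized, so that note is worth watching for.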
Execution-time errors:
These errors occur when compiled statements are applied to data values (e.g., division by zero).

SAS detects execution-time errors when it applies compiled programming statements to data values. Typical execution-time errors include

·         INPUT statements that do not match the data lines
·         illegal mathematical operations such as division by zero
·         observations not sorted in the order specified in the BY statement when doing BY-group processing
·         reference to a nonexistent member of an array
·         illegal arguments to functions
·         insufficient resources to complete a task specified in the program.

When SAS encounters execution-time errors, it usually produces warning messages and continues to process the program. The information that SAS writes to the SAS log includes the following:

·         an error message
·         the values stored in the input buffer if SAS is reading data values from a source other than a data set (e.g., an external file)
·         the contents of the program data vector at the time the error occurred
·         a message explaining the error.

Errors that occur prior to the PROC step may be the source of an error in a procedure step. If the procedure depends on data generated in previous steps, the procedure may not execute and SAS may generate an error. Depending on the procedure, incorrect data may not stop the processing of the procedure, but the results instead will be in error. Note that the information written to the SAS log varies depending on how you specify certain SAS options. Since execution-time errors result from applying your compiled SAS statements to your data values, you must understand the data you are processing in order to correct the errors.
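
A small sketch of an execution-time problem, using division by zero as in the list above. The data set work.ratio and the variables num, den, and ratio are made up for this example. The step compiles cleanly, and the problem appears only when the second data line is processed.

```sas
/* The program compiles, but the second record has den = 0.          */
/* SAS sets ratio to missing for that observation, flags the         */
/* division by zero in the log, and dumps the input buffer and       */
/* the program data vector (including _ERROR_ and _N_).              */
data work.ratio;
    input num den;
    ratio = num / den;
    datalines;
10 2
5 0
;
run;

/* One possible guard: only divide when the denominator is nonzero.  */
data work.ratio_safe;
    input num den;
    if den ne 0 then ratio = num / den;
    else ratio = .;
    datalines;
10 2
5 0
;
run;
```

The _N_ value in the program data vector listing tells you which iteration of the DATA step hit the problem, which is often the quickest way to locate the offending record.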

Data errors:
Data errors occur when the statements are correct, but the data is invalid (e.g., taking the logarithm of zero).

A data error occurs when a data value is not appropriate for the SAS statements you coded. SAS detects these errors when the program executes, but these errors are different from execution-time errors. With execution-time errors, something in the program statements is wrong. With data errors, the data is wrong. Data errors reflect problems with the creation of the input data source.

Remedies for data errors include correcting the data entry process or changing the DATA step to reflect the chance that errors may occur. When SAS detects a data error, it writes a message to the SAS log, lists the values of the input buffer when it reads raw data, and lists the program data vector for the observation where the error occurred.
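
Here is a minimal illustration of a data error and one way to change the DATA step to allow for it. The data set names and values are invented for the example. The program is correct, but the second record contains text where a number is expected.

```sas
/* AGE is read as numeric, but the second record holds "n/a".        */
/* SAS sets age to missing, writes an invalid-data note, and lists   */
/* the input line and the program data vector with _ERROR_=1.        */
data work.ages;
    input name $ age;
    datalines;
Alice 34
Bob n/a
;
run;

/* If bad values are expected and acceptable, the ?? modifier        */
/* suppresses the invalid-data messages and keeps _ERROR_ at 0.      */
/* The value is still stored as missing.                             */
data work.ages_quiet;
    input name $ age ??;
    datalines;
Alice 34
Bob n/a
;
run;
```

Suppressing the messages hides the symptom rather than fixing the source, so it is best used only after you have confirmed that the bad values are understood and that a missing value is the right result.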

Logic errors:
Sometimes SAS cannot detect errors for you. According to SAS, your program is syntactically correct and executes without error. Your SAS log does not contain messages describing syntax errors, semantic errors, execution-time errors, or data errors. Yet, when you review the results of your SAS program, you find them incorrect. This happens when your instructions to SAS did not correctly convey the actions you wanted SAS to take. Your program executes anyway because SAS does not know how to detect when your logic is faulty.
How can you find the logic errors that you introduce into your programs? The answer is by fully understanding your data, closely reading the messages in the SAS log, carefully reviewing the results, and using the tools that SAS provides for detecting logic errors.
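
As one hedged example of a logic error, the step below runs with a clean log but classifies a missing score as a failure, because SAS treats a missing numeric value as smaller than any number. The data set, the variables, and the TRACE prefix are hypothetical. PUTLOG is used here as a simple tracing tool.

```sas
/* No errors or warnings are produced, yet id 2 is labeled "Fail"    */
/* even though its score is missing. PUTLOG writes each              */
/* observation's values to the log so the faulty branch is visible.  */
data work.flags;
    input id score;
    length result $4;
    if score < 50 then result = 'Fail';
    else result = 'Pass';
    putlog 'TRACE: ' id= score= result=;
    datalines;
1 72
2 .
3 41
;
run;

/* A corrected version handles missing values explicitly.            */
data work.flags_fixed;
    input id score;
    length result $4;
    if missing(score) then result = ' ';
    else if score < 50 then result = 'Fail';
    else result = 'Pass';
    datalines;
1 72
2 .
3 41
;
run;
```

Tracing with PUTLOG is only one option. The point is to compare what each statement actually did with what you intended it to do.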

NOTE: I am drawing on SAS books and SUGI papers, collating the information, and posting it on this blog to cover all the possible techniques. If I have missed anything, please let me know. I really appreciate your help.

2 comments:

  1. I remember way back that SAS has a professional programmer type
    debugging tool but I don't remember how to invoke it

  2. I wrote a macro to check for issues in the log. It can be run from a keyboard shortcut. I think most of the issues you noted would be caught by the macro. If you would like, try it out and let me know what you think. I would appreciate the feedback.

    http://sas.cswenson.com/macros

    http://sas.cswenson.com/macros/CheckLog.sas?attredirects=0&d=1
