The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2011. SAS®
Certification Prep Guide: Base Programming for SAS®9, Third Edition. Cary, NC: SAS
Institute Inc.
SAS® Certification Prep Guide: Base Programming for SAS®9, Third Edition.
Copyright © 2011, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-60764-924-3
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or
otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms
established by the vendor at the time you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and
related documentation by the U.S. government is subject to the Agreement with SAS Institute
and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted
Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st printing, July 2011
SAS® Publishing provides a complete selection of books and electronic products to help
customers use SAS software to its fullest potential. For more information about our e-books,
e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at
support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective
companies.
SAS® Certification Prep Guide Base Programming for
SAS®9, Third Edition
Table of Contents
About This Book
Chapter 1 Base Programming
Chapter 2 Referencing Files and Setting Options
Chapter 3 Editing and Debugging SAS Programs
Chapter 4 Creating List Reports
Chapter 5 Creating SAS Data Sets from External Files
Chapter 6 Understanding DATA Step Processing
Chapter 7 Creating and Applying User-Defined Formats
Chapter 8 Producing Descriptive Statistics
Chapter 9 Producing HTML Output
Chapter 10 Creating and Managing Variables
Chapter 11 Reading SAS Data Sets
Chapter 12 Combining SAS Data Sets
Chapter 13 Transforming Data with SAS Functions
Chapter 14 Generating Data with DO Loops
Chapter 15 Processing Variables with Arrays
Chapter 16 Reading Raw Data in Fixed Fields
Chapter 17 Reading Free-Format Data
Chapter 18 Reading Date and Time Values
Chapter 19 Creating a Single Observation from Multiple Records
Chapter 20 Creating Multiple Observations from a Single Record
Chapter 21 Reading Hierarchical Files
Appendix 1 Quiz Answer Keys
About This Book
Audience
The SAS Certification Prep Guide: Base Programming for SAS®9 is for new or experienced SAS
programmers who want to prepare for the SAS Base Programming for SAS®9 exam.
Requirements and Details
Purpose
The SAS Certification Prep Guide: Base Programming for SAS®9 helps prepare you to take the SAS Base
Programming for SAS®9 exam. The book covers the objectives tested on the exam, including basic
concepts, producing reports, creating and modifying SAS data sets, and reading various types of raw data.
Before attempting the exam you should also have experience programming in the SAS®9 environment.
The book includes quizzes that enable you to test your understanding of material in each chapter.
Additionally, solutions to all quizzes are included at the back of the book.
Note: Exam objectives are subject to change. Please view the current exam objectives at
support.sas.com/certify.
Programming Environments
This book assumes you are running Base SAS or SAS Enterprise Guide software in the windowing
environment. You will learn how to write and manage your SAS programs in either the SAS windowing
environment workspace or in the SAS Enterprise Guide workspace.
If you are not sure which programming workspace you are using, select Help > About from the SAS
software main menu. If you are using SAS Enterprise Guide, the About window displays the name
“Enterprise Guide.” If you are using the SAS windowing environment, the About window displays the
name “SAS for Windows.”
Because the two programming workspaces differ, you will occasionally see notes in this book that provide
information specific to either SAS Enterprise Guide or to the SAS windowing environment.
How to Create Practice Data
If you are using the SAS 9.3 windowing environment, you can practice what you learn in this book by
using sample data that you create from within the SAS®9 environment. To set up this practice data, select
Help > Learning SAS Programming from the main SAS menu. When the SAS Online Training Sample
Data window appears, click OK to create a permanent SAS library named sasuser, which contains the
sample data.
You can access additional sample data by visiting the SAS Certification page on the SAS Training and
Bookstore Web site at support.sas.com/basepractice. There you will find links to practice data as well as
any updates to the guide.
Setting Result Formats in the SAS Windowing Environment
In the SAS windowing environment, you can use the Preferences window to specify whether you want
your output in HTML or LISTING format, or both. Your preferences are saved until you modify them, and
they apply to all output that you create. SAS Certification Prep Guide: Base Programming for SAS®9
generally shows output in HTML format, but some sample programs in this book specify features that
appear only in LISTING output. To create both HTML and LISTING output, do the following:
Start SAS and select Tools > Options > Preferences. Then click the Results tab and select the Create
listing and Create HTML check boxes. If you want to store your HTML output in a folder other than the
one shown, de-select the Use WORK folder check box and browse to the desired folder. HTML files are
named sashtml.htm. Click OK to close the Preferences window.
Note: In SAS 9.3, HTML output in the SAS windowing environment is the default for Windows and
UNIX, but not for other operating systems and not in batch mode. When you run SAS in batch mode or on
other operating systems, the LISTING destination is open and is the default. Your actual defaults might be
different because of your registry or configuration file settings.
SAS Certification Practice Exam: Base Programming for SAS®9
The SAS Certification Practice Exam: Base Programming for SAS®9 was designed to help you prepare for
the SAS Base Programming for SAS®9 exam. This practice exam was constructed to give you a view of
the type of questions on the official certification exam. You can get more information about this exam at
support.sas.com/basepractice.
SAS Base Programming for SAS®9
For information about how to register for the official SAS Base Programming for SAS®9 exam, see the
SAS Global Certification Web site at http://support.sas.com/certify.
Additional Resources
Other resources might be helpful when you are learning SAS programming. You can refer to them as
needed to enhance your understanding of the material covered in this book. You can access SAS Help,
documentation, and other resources from your SAS software or on the Web.
From SAS Software
Help
For SAS®9, select Help > SAS Help and Documentation.
For SAS Enterprise Guide, select Help > SAS Enterprise Guide Help.
Documentation
For SAS®9, select Help > SAS Help and Documentation.
For SAS Enterprise Guide, access online documentation on the Web. See On the Web below.
On the Web
Bookstore http://support.sas.com/publishing/
Training http://support.sas.com/training/
Certification http://support.sas.com/certify/
SAS Learning Edition http://support.sas.com/learn/le/
SAS Global Academic Program http://support.sas.com/learn/ap/
SAS OnDemand http://support.sas.com/ondemand/
Knowledge Base http://support.sas.com/resources/
Support http://support.sas.com/techsup/
Learning Center http://support.sas.com/learn/
Community http://support.sas.com/community/
Syntax Conventions
The following example shows the general form of SAS code as it is shown in this book:
DATA output-SAS-data-set
(DROP=variable(s) | KEEP=variable(s));
SET SAS-data-set <options>;
BY variable(s);
RUN;
In the general form above:
DATA, DROP=, KEEP=, SET, BY, and RUN are in uppercase bold because they must be spelled as shown.
output-SAS-data-set, variable(s), SAS-data-set, and options are in italics because each represents a value that you
supply.
<options> is enclosed in angle brackets because it is optional syntax.
DROP= and KEEP= are separated by a vertical bar ( | ) to indicate that they are mutually exclusive.
The general forms of SAS statements and commands that are shown in this book include only the syntax
that you need to know to prepare for the certification exam. For complete syntax, see the appropriate SAS
reference guide.
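As a concrete instance of the general form above, the following sketch fills in each placeholder. The data set names work.admit_sorted and work.admit_slim and the choice of dropped variables are illustrative assumptions; sasuser.admit is the practice data set used in Chapter 1. Because BY processing expects the input to be sorted by the BY variable, a PROC SORT step comes first.

```sas
/* Sort a copy of the practice data so that BY processing is valid. */
proc sort data=sasuser.admit out=work.admit_sorted;
   by sex;
run;

/* General form with every placeholder filled in:            */
/* DATA output-SAS-data-set (DROP=variable(s));               */
/* SET SAS-data-set; BY variable(s); RUN;                     */
data work.admit_slim (drop=height weight);
   set work.admit_sorted;
   by sex;
run;
```

Only DROP= is used here; because DROP= and KEEP= are mutually exclusive in the general form, you would list the variables to retain with KEEP= instead, not in addition.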
Chapter 1 Base Programming
Overview
SAS Programs
SAS Libraries
Referencing SAS Files
SAS Data Sets
Using the Programming Workspace
Chapter Summary
Chapter Quiz
Overview
Introduction
To program effectively using SAS, you need to understand basic concepts about SAS programs and the
SAS files that they process. In particular, you need to be familiar with SAS data sets.
In this chapter, you'll examine a simple SAS program and see how it works. You'll learn details about SAS
data sets (which are files that contain data that is logically arranged in a form that SAS can understand).
You'll see how SAS data sets are stored temporarily or permanently in SAS libraries. Finally, you'll learn
how to use SAS windows to manage your SAS session and to process SAS programs.
SAS Library with SAS Data Sets and Data Files
Objectives
In this chapter, you learn about
the structure and components of SAS programs
the steps involved in processing SAS programs
SAS libraries and the types of SAS files that they contain
temporary and permanent SAS libraries
the structure and components of SAS data sets
the SAS windowing environment.
SAS Programs
You can use SAS programs to access, manage, analyze, or present your data. Let's begin by looking at a
simple SAS program.
A Simple SAS Program
This program uses an existing SAS data set to create a new SAS data set containing a subset of the original
data set. It then prints a listing of the new data set using PROC PRINT. A SAS data set is a data file that is
formatted in a way that SAS can understand.
data sasuser.admit2;
set sasuser.admit;
where age>39;
run;
proc print data=sasuser.admit2;
run;
Let's see how this program works.
Components of SAS Programs
The sample SAS program contains two steps: a DATA step and a PROC step.
data sasuser.admit2;
set sasuser.admit;
where age>39;
run;
proc print data=sasuser.admit2;
run;
These two types of steps, alone or combined, form most SAS programs.
A SAS program can consist of a DATA step or a PROC step or any combination of DATA and PROC
steps.
Components of a SAS Program
DATA steps typically create or modify SAS data sets. They can also be used to produce custom-designed
reports. For example, you can use DATA steps to
put your data into a SAS data set
compute values
check for and correct errors in your data
produce new SAS data sets by subsetting, supersetting, merging, and updating existing data sets.
In the previous example, the DATA step produced a new SAS data set containing a subset of the original
data set. The new data set contains only those observations with an age value greater than 39.
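A DATA step can also compute values as it reads each observation. The sketch below is illustrative only: the data set name work.admit_groups and the variable AgeGroup are hypothetical and do not appear in the sample data.

```sas
data work.admit_groups;
   set sasuser.admit;
   /* Declare the new character variable before assigning to it. */
   length AgeGroup $ 12;
   /* Compute a value for each observation from the existing Age variable. */
   /* A missing Age compares as lower than any number, so it would also    */
   /* fall into the 'Under 40' group.                                      */
   if age > 39 then AgeGroup = '40 and over';
   else AgeGroup = 'Under 40';
run;
```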
PROC (procedure) steps invoke or call pre-written routines that enable you to analyze and process the data
in a SAS data set. PROC steps typically present the data in the form of a report. They sometimes create
new SAS data sets that contain the results of the procedure. PROC steps can list, sort, and summarize data.
For example, you can use PROC steps to
create a report that lists the data
produce descriptive statistics
create a summary report
produce plots and charts.
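Two of the tasks listed above can be sketched as follows, assuming the sample DATA step has already created sasuser.admit2 and that the data set contains the numeric variables Age, Height, and Weight (variables that other programs in this chapter reference).

```sas
/* Create a report that lists the data. */
proc print data=sasuser.admit2;
run;

/* Produce descriptive statistics for selected numeric variables. */
proc means data=sasuser.admit2;
   var age height weight;
run;
```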
Characteristics of SAS Programs
Next let's look at the individual statements in our sample program. SAS programs consist of SAS
statements. A SAS statement has two important characteristics:
It usually begins with a SAS keyword.
It always ends with a semicolon.
As you've seen, a DATA step begins with a DATA statement, which begins with the keyword DATA. A
PROC step begins with a PROC statement, which begins with the keyword PROC. Our sample program
contains the following statements:
SAS Program Statements
Statements                          Sample Program Code
a DATA statement                    data sasuser.admit2;
a SET statement                     set sasuser.admit;
additional programming statements   where age>39;
a RUN statement                     run;
a PROC PRINT statement              proc print data=sasuser.admit2;
another RUN statement               run;
Layout for SAS Programs
SAS statements are free-format. This means that
they can begin and end anywhere on a line
one statement can continue over several lines
several statements can be on the same line.
Blanks or special characters separate words in a SAS statement.
You can specify SAS statements in uppercase or lowercase. In most situations, text that is enclosed in
quotation marks is case sensitive.
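To illustrate the free-format rules, the sketch below writes the sample program in three layouts. SAS treats them alike; the layouts are shown only for comparison, and the first is generally the easiest to read.

```sas
/* One statement per line, indented. */
data sasuser.admit2;
   set sasuser.admit;
   where age>39;
run;

/* Several statements on the same line, keywords in uppercase. */
DATA sasuser.admit2; SET sasuser.admit; WHERE age>39; RUN;

/* One statement continued over several lines. */
proc print
   data=sasuser.admit2;
run;
```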
You've examined the general structure of our sample program. But what happens when you run the
program?
Processing SAS Programs
When you submit a SAS program, SAS begins reading the statements and checking them for errors.
DATA and PROC statements signal the beginning of a new step. The RUN statement (for most procedures
and the DATA step) and the QUIT statement (for some procedures) mark step boundaries. The beginning
of a new step (DATA or PROC) also implies the end of the previous step. At a step boundary, SAS
executes any statements that have not previously executed and ends the step. In our sample program, each
step ends with a RUN statement.
data sasuser.admit2;
set sasuser.admit;
where age>39;
run;
proc print data=sasuser.admit2;
run;
Note: Though the RUN statement is not always required between steps in a SAS program, using it can
make the SAS program easier to read and debug, and it makes the SAS log easier to read.
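The step-boundary rule described above can be seen in this variation of the sample program, offered as a sketch rather than a recommended style. The DATA step has no RUN statement, so the PROC statement that follows acts as its step boundary.

```sas
data sasuser.admit2;
   set sasuser.admit;
   where age>39;
/* No RUN statement here: the PROC statement below implies the end of the DATA step. */
proc print data=sasuser.admit2;
run;   /* In an interactive session, the PROC step waits for this step boundary. */
```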
Log Messages
Each time a step is executed, SAS generates a log of the processing activities and the results of the
processing. The SAS log collects messages about the processing of SAS programs and about any errors
that occur.
When SAS processes our sample program, you see the log messages shown below. Notice that you get
separate sets of messages for each step in the program.
Log Messages
Results of Processing
DATA step Output
Suppose you submit the sample program below:
data sasuser.admit2;
set sasuser.admit;
where age>39;
run;
When the program is processed, it creates a new SAS data set (sasuser.admit2) containing only those observations with
age values greater than 39. The DATA step creates a new data set and produces messages in the SAS log, but it does not
create a report or other output.
Procedure Output
If you add a PROC PRINT statement to this same example, the program produces the same new data set as before, but it
also creates the following report, which is displayed in HTML:
data sasuser.admit2;
set sasuser.admit;
where age>39;
run;
proc print data=sasuser.admit2;
run;
PRINT Procedure Output
Note: Throughout this book, procedure output is shown in HTML in the style shown above unless
otherwise noted.
You've seen the results of submitting our sample program. For other SAS programs, the results of
processing might vary:
Other Types of Procedural Output
o Some SAS programs open an interactive window (a window that you can use to directly modify data),
such as the REPORT window.
proc report data=sasuser.admit;
columns id name sex age actlevel;
run;
Interactive Report Window
o SAS programs often invoke procedures that create output in the form of a report, as is the case with the
TABULATE procedure:
proc tabulate data=sasuser.admit;
class sex;
var height weight;
table sex*(height weight),mean;
run;
TABULATE Procedure Output
o Other SAS programs perform tasks such as sorting and managing data, which have no visible results
except for messages in the log. (All SAS programs produce log messages, but some SAS programs produce only log
messages.)
proc copy in=sasuser out=work;
select admit;
run;
Log Output
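Sorting is another task that typically produces only log messages. In this minimal sketch, the output data set name work.admit_by_age is a hypothetical name chosen for illustration.

```sas
/* Sort the practice data by Age and write the result to the Work library. */
/* No report is created; the log notes describe what the step did.         */
proc sort data=sasuser.admit out=work.admit_by_age;
   by age;
run;
```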
SAS Libraries
So far you've learned about SAS programs. Now let's look at SAS libraries to see how SAS data sets and
other SAS files are organized and stored.
How SAS Files Are Stored
Every SAS file is stored in a SAS library, which is a collection of SAS files. A SAS data library is the
highest level of organization for information within SAS.
For example, in the Windows and UNIX environments, a library is typically a group of SAS files in the
same folder or directory.
SAS Data Library
The table below summarizes the implementation of SAS libraries in various operating environments.
Environments and SAS Libraries
In this environment... / A SAS library is...
Windows, UNIX, OpenVMS, OS/2 (directory-based systems): a group of SAS files that are stored in the
same directory. Other files can be stored in the directory, but only the files that have SAS file
extensions are recognized as part of the SAS library. For more information, see the SAS
documentation for your operating environment.
z/OS (OS/390): a specially formatted host data set in which only SAS files are stored.
Storing Files Temporarily or Permanently
Depending on the library name that you use when you create a file, you can store SAS files temporarily or
permanently.
Temporary SAS Data Library
Permanent SAS Data Library
Temporary and Permanent SAS Libraries
Temporary SAS libraries last only for the current SAS session.
Storing files temporarily: If you don't specify a library name when you create a file (or if you
specify the library name Work), the file is stored in the temporary SAS data library. When you end
the session, the temporary library and all of its files are deleted.
Permanent SAS libraries are available to you during subsequent SAS sessions.
Storing files permanently: To store files permanently in a SAS data library, you specify a library
name other than the default library name Work. For example, by specifying the library name sasuser
when you create a file, you specify that the file is to be stored in a permanent SAS data library
until you delete it.
You can learn how to set up permanent SAS libraries in Referencing Files and Setting Options.
Referencing SAS Files
Two-Level Names
To reference a permanent SAS data set in your SAS programs, you use a two-level name consisting of the
library name and the filename, or data set name:
libref.filename
In the two-level name, libref is the name of the SAS data library that contains the file, and filename is the name of the file,
or data set. A period separates the libref and filename.
Two-Level SAS Name
For example, suppose we want to create a new permanent SAS library named Clinic. In our sample program,
Clinic.Admit is the two-level name for the SAS data set Admit, which is stored in the library named Clinic.
Notice that the LIBNAME statement is used to define the libref, Clinic, and to give SAS the physical
location of the data files.
libname clinic 'c:\Users\Name\sasuser';
data clinic.admit2;
set clinic.admit;
weight =round(weight);
run;
Two-Level Name Clinic.Admit
Referencing Temporary SAS Files
To reference temporary SAS files, you can specify the default libref Work, a period, and the filename. For
example, the two-level name Work.Test references the SAS data set named Test that is stored in the
temporary SAS library Work.
Two-Level Temporary SAS Library Work.Test
Alternatively, you can use a one-level name (the filename only) to reference a file in a temporary SAS
library. When you specify a one-level name, the default libref Work is assumed. For example, the
one-level name Test also references the SAS data set named Test that is stored in the temporary SAS
library Work.
One-Level Temporary SAS Library Test
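The equivalence of one-level and two-level names for temporary files can be sketched as follows. The data set name Test comes from the text above; copying sasuser.admit into it is an assumption made only so that the example has data to print.

```sas
/* One-level name: the default libref Work is assumed, so this creates Work.Test. */
data test;
   set sasuser.admit;
run;

/* Two-level name: refers to the same temporary data set. */
proc print data=work.test;
run;
```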
Referencing Permanent SAS Files
You can see that Clinic.Admit and Clinic.Admit2 are permanent SAS data sets because the library name is
Clinic, not Work.
Referencing Permanent SAS Files
So, referencing a SAS file in any library except Work indicates that the SAS file is stored permanently. For
example, when our sample program creates Clinic.Admit2, it stores the new Admit2 data set permanently
in the SAS library Clinic.
Rules for SAS Names
The rules below apply to the filename portion of a SAS data set name and to variable names. A libref
follows the same rules, except that it can be no more than eight characters long.
SAS data set names and variable names
can be 1 to 32 characters long
must begin with a letter (A-Z, either uppercase or lowercase) or an underscore (_)
can continue with any combination of numbers, letters, or underscores.
These are examples of valid data set names and variable names:
Payroll
LABDATA1995_1997
_EstimatedTaxPayments3
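The valid names above can be used directly in a program. In this sketch the assigned values are placeholders invented for illustration; only the names matter.

```sas
/* Each name begins with a letter or underscore, continues with letters,  */
/* numbers, or underscores, and is no longer than 32 characters.          */
data work.LABDATA1995_1997;
   Payroll = 1250;
   _EstimatedTaxPayments3 = 0;
run;

/* By contrast, a name such as 1995LabData (begins with a digit) or       */
/* Lab-Data (contains a hyphen) does not follow these rules.              */
```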
SAS Data Sets
So far, you've seen the components and characteristics of SAS programs, including how they reference
SAS data sets. Data sets are one type of SAS file. There are other types of SAS files (such as catalogs), but
this chapter focuses on SAS data sets. For most procedures, data must be in the form of a SAS data set to
be processed. Now let's take a closer look at SAS data sets.
Overview of Data Sets
As you saw in our sample program, for many of the data processing tasks that you perform with SAS, you
access data in the form of a SAS data set
analyze, manage, or present the data.
Conceptually, a SAS data set is a file that consists of two parts: a descriptor portion and a data portion.
Sometimes a SAS data set also points to one or more indexes, which enable SAS to locate rows in the data
set more efficiently. (The data sets that you work with in this chapter do not contain indexes.)
Parts of a SAS Data Set
Descriptor Portion
The descriptor portion of a SAS data set contains information about the data set, including
the name of the data set
the date and time that the data set was created
the number of observations
the number of variables.
Let's look at another SAS data set. The table below lists part of the descriptor portion of the data set
sasuser.insure, which contains insurance information for patients who are admitted to a wellness clinic.
(It's a good idea to give your data set a name that is descriptive of the contents.)
Descriptor Portion of Attributes in a SAS Data Set sasuser.insure
Data Set Name:        sasuser.INSURE
Member Type:          DATA
Engine:               V9
Created:              10:05 Tuesday, February 16, 2011
Observations:         21
Variables:            7
Indexes:              0
Observation Length:   64
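One common way to display the descriptor portion of a data set is the CONTENTS procedure. The sketch below assumes that the practice data set sasuser.insure exists in your session.

```sas
/* Report the descriptor portion: data set name, creation date,            */
/* number of observations and variables, and each variable's attributes.  */
proc contents data=sasuser.insure;
run;
```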
Data Portion
The data portion of a SAS data set is a collection of data values that are arranged in a rectangular table. In
the example below, the name Murray is a data value, Policy 32668 is a data value, and so on.
Parts of a SAS Data Set: Data Portion
Observations (Rows)
Rows (called observations) in the data set are collections of data values that usually relate to a single object.
The values 2458, Murray, 32668, Mutuality, 100, 98.64, and 0.00 make up a single observation in the
data set shown below.
Parts of a SAS Data Set: Observations
This data set has seven observations, each containing information about an individual. A SAS data set can
store any number of observations.
Variables (Columns)
Columns (called variables) in the data set are collections of values that describe a particular characteristic.
The values 2458, 2462, 2501, and 2523 make up the variable ID in the data set shown below.
Parts of a SAS Data Set: Variable
This data set contains seven variables: ID, Name, Policy, Company, PctInsured, Total, and BalanceDue. A
SAS data set can store thousands of variables.
Missing Values
Every variable and observation in a SAS data set must have a value. If a data value is unknown for a
particular observation, a missing value is recorded in the SAS data set.
Missing Data Values
Variable Attributes
In addition to general information about the data set, the descriptor portion contains information about the
properties of each variable in the data set. The properties information includes the variable's name, type,
length, format, informat, and label.
When you write SAS programs, it's important to understand the attributes of the variables that you use. For
example, you might need to combine SAS data sets that contain same-named variables. In this case, the
variables must be the same type (character or numeric).
The following is a partial listing of the attribute information in the descriptor portion of the SAS data set
insure.policy. First, let's look at the name, type, and length variable attributes.
Variable Attributes in the Descriptor Portion of a SAS Data Set insure.policy
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
Name
Each variable has a name that conforms to SAS naming conventions. Variable names follow exactly the
same rules as SAS data set names. Like data set names, variable names
can be 1 to 32 characters long
must begin with a letter (A-Z, either uppercase or lowercase) or an underscore (_)
can continue with any combination of numbers, letters, or underscores.
Variable Name Attributes
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
Your site may choose to restrict variable names to those valid in SAS 6, to uppercase variable names
automatically, or to remove all restrictions on variable names.
Type
A variable's type is either character or numeric.
Character variables, such as Name (shown below), can contain any values.
Numeric variables, such as Total (shown below), can contain only numeric values (the numerals 0 through 9, +,
-, ., and E for scientific notation).
Type Attribute
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
A variable's type determines how missing values for a variable are displayed. In the following data set,
Name and Sex are character variables, and Age and Weight are numeric variables.
For character variables such as Name, a blank represents a missing value.
For numeric variables such as Age, a period represents a missing value.
Missing Values Represented Based on Variable Type
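The two kinds of missing values can be produced and displayed with a short program. This is a sketch with made-up values: the data set name work.missing_demo and the second observation are hypothetical.

```sas
data work.missing_demo;
   length Name $ 10;
   /* First observation: both values are missing. */
   Name = ' ';      /* character missing value is a blank  */
   Age = .;         /* numeric missing value is a period   */
   output;
   /* Second observation: both values are present. */
   Name = 'Murray';
   Age = 27;
   output;
run;

/* In the listing, the missing Name appears blank and the missing Age as a period. */
proc print data=work.missing_demo;
run;
```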
Length
A variable's length (the number of bytes used to store it) is related to its type.
Character variables can be up to 32,767 bytes long. In the example below, Name has a length of 20 characters and
uses 20 bytes of storage.
All numeric variables have a default length of 8 bytes. Numeric values (no matter how many digits they contain)
are stored as floating-point numbers in 8 bytes of storage.
Length Attribute
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
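A LENGTH statement is one way to set a character variable's length explicitly. The following sketch uses a hypothetical data set, work.length_demo, with values borrowed from the examples in this chapter.

```sas
data work.length_demo;
   length Name $ 20;    /* character variable stored in 20 bytes        */
   Name = 'Murray';
   Total = 98.64;       /* numeric variable: default length of 8 bytes  */
run;

/* PROC CONTENTS lists the type and length of each variable. */
proc contents data=work.length_demo;
run;
```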
You've seen that each SAS variable has a name, type, and length. In addition, you can optionally define
format, informat, and label attributes for variables. Let's look briefly at these optional attributes—you'll
learn more about them in later chapters as you need to use them.
Format
Formats are variable attributes that affect the way data values are written. SAS software offers a variety of
character, numeric, and date and time formats. You can also create and store your own formats. To write
values out using a particular form, you select the appropriate format.
Formats
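For example, to display the value 1234 as $1,234.00 in a report, you need a DOLLAR format that is wide
enough for all nine characters, such as DOLLAR9.2. The DOLLAR8.2 format shown for Total below allows a
width of only eight characters.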
Format Attribute
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
Usually you have to specify the maximum width (w) of the value to be written. Depending on the
particular format, you may also need to specify the number of decimal places (d) to be written. For
example, to display the value 5678 as 5,678.00 in a report, you can use the COMMA8.2 format, which
specifies a width of 8 including 2 decimal places.
You can permanently assign a format to a variable in a SAS data set, or you can temporarily specify a
format in a PROC step to determine the way the data values appear in output.
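The difference between a temporary and a permanent format assignment can be sketched with the FORMAT statement. The example assumes that sasuser.insure contains the variables Name and Total, as the figures in this chapter suggest; the widths chosen and the data set name work.insure_fmt are illustrative.

```sas
/* Temporary: the format applies only to the output of this PROC step. */
proc print data=sasuser.insure;
   var Name Total;
   format Total dollar9.2;
run;

/* Permanent: the format is stored in the descriptor portion of the new data set. */
data work.insure_fmt;
   set sasuser.insure;
   format Total comma8.2;
run;
```

In both cases the stored data values are unchanged; a format affects only how the values are written.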
Informat
Whereas formats write values out using some particular form, informats read data values in certain forms
into standard SAS values. Informats determine how data values are read into a SAS data set. You must use
informats to read numeric values that contain letters or other special characters.
Informats
For example, the numeric value $12,345.00 contains two special characters, a dollar sign ($) and a comma
(,). You can use an informat to read the value while removing the dollar sign and comma, and then store
the resulting value as a standard numeric value. For Total below, the COMMA10. informat is specified.
Informat Attribute
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
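As a rough sketch of how the COMMA10. informat described above might be used, the DATA step below reads values that contain dollar signs and commas. The data set name work.totals and the raw values are assumptions made for illustration.

```sas
/* Hypothetical example: the data set name and raw values are made up. */
data work.totals;
   /* COMMA10. reads 10 columns and removes the dollar sign and commas
      before storing a standard numeric value. */
   input Total comma10.;
   datalines;
$12,345.00
$1,234.50
;
run;

proc print data=work.totals;
run;
```

The values are stored as the ordinary numbers 12345 and 1234.5; how they are displayed depends on the format, not on the informat.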
Label
A variable can have a label, which consists of descriptive text up to 256 characters long. By default, many
reports identify variables by their names. You may want to display more descriptive information about the
variable by assigning a label to the variable.
For example, you can label Policy as Policy Number, Total as Total Balance, and Name as Patient Name
to display these labels in reports.
Label Attribute
Variable  Type  Length  Format     Informat  Label
Policy    Char  8                            Policy Number
Total     Num   8       DOLLAR8.2  COMMA10.  Total Balance
Name      Char  20                           Patient Name
You may even want to use labels to shorten long variable names in your reports!
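The labels in the table could be assigned with a LABEL statement, roughly as sketched below. The data set name work.policies and the data values are hypothetical; only the variable names and labels come from the table.

```sas
/* Hypothetical data set and values; variable names and labels follow
   the table above. */
data work.policies;
   length Policy $ 8 Name $ 20;
   input Policy $ Total Name $;
   label Policy='Policy Number'
         Total='Total Balance'
         Name='Patient Name';
   datalines;
A1001 1234 Murray
A1002 5678 Almers
;
run;

/* The LABEL option tells PROC PRINT to use the labels, rather than the
   variable names, as column headings. */
proc print data=work.policies label;
run;
```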
Using the Programming Workspace
Using the Main SAS Windows
When you start SAS, by default several primary windows are available to you. These include the Explorer,
Log, Output, Results, and code editing window(s). The window you use to edit your SAS programs may
vary, depending on your operating system and needs.
You use these SAS windows to explore and manage your files, to enter and submit SAS programs, to view
messages, and to view and manage your output.
We'll tour each of these windows shortly.
Using the Main SAS Windows
Your operating environment, and any options that you use when you start SAS, determine
which of the main SAS windows are displayed by default
their general appearance
their position.
Features of SAS Windows
SAS windows have many features that help you get your work done. For example, you can
maximize, minimize, and restore windows
use pull-down menus, pop-up menus, and toolbars
get more help.
This chapter and later chapters show SAS®9 windows in the Windows operating environment.
Features of SAS Windows
Minimizing and Restoring Windows
In the Windows environment, you can click the Minimize button to send a window that you aren't using to
the SAS window bar. To restore the window to its former position, click the corresponding button on the
SAS window bar.
In other operating environments, minimizing the window shrinks it to an icon.
Docking and Undocking Windows
In the Windows and OS/2 environments, the Explorer and Results windows are docked by default, so they
can be resized but not minimized. If you prefer, you can select Window > Docked to undock the active
window, or you can turn docking off completely in the Preferences dialog box.
Issuing Commands
In SAS, you can issue commands by
making selections from a menu bar
typing commands in a command box (or ToolBox) or on a command line.
Using the SAS Command Box and Menu Bar to Issue Commands
In most operating environments, SAS displays a menu bar by default. In the Windows environment, the
menu bar selections correspond to the active window. To display a menu bar if it is not displayed, you can
type pmenus in the command box (or ToolBox) or on the command line.
In all operating environments, SAS displays a command box (or ToolBox) or command line by default.
You can display a command line in a particular window by activating that window and then using the
Tools menu as indicated below:
In the Windows environment, select Tools > Options > Preferences, and then select the View tab and select the
Command line checkbox.
In the UNIX and z/OS environments, select Tools > Options > Turn Command Line On.
In the z/OS environment, you can display both a command line and a menu bar simultaneously in a window by
selecting Tools > Options > Command....
In the Windows and UNIX operating environments, you can also display a command line in a window by
activating the window, typing command in the command box (or ToolBox), and pressing Enter.
See the online help for a complete list of command-line commands.
Using Pop-Up Menus
Pop-up menus are context sensitive; they list actions that pertain only to a particular window. Generally,
you display pop-up menus by clicking the right mouse button. If you like, you can specify a function key
to open pop-up menus. Simply select Tools > Options > Keys and type wpopup as a function key setting.
To open a pop-up menu in the z/OS operating environment, type ? in the selection field beside the item.
Getting Help
Help is available for all windows in SAS. From the Help menu, you can access comprehensive online help
and documentation for SAS, or you can access task-oriented help for the active window. The Help menu is
discussed in more detail later in this chapter.
Customizing Your SAS Environment
You can customize many features of the SAS workspace such as toolbars, pop-up menus, icons, and so on.
Select the Tools menu to explore some of the customization options that are available.
You'll learn how to use features of SAS windows throughout this chapter. Now let's look at each of the
main SAS windows individually.
The Explorer Window
In the Explorer window, you can view and manage your SAS files, which are stored in SAS data libraries.
The library name is a logical name for the physical location of the files (such as a directory). You can think
of the library name as a temporary nickname or shortcut.
You use the Explorer window to
create new libraries and SAS files
open any SAS file
perform most file management tasks such as moving, copying, and deleting files
create shortcuts to files that were not created with SAS.
Notice that the Explorer window displays a tree view of its contents.
Tree View of the SAS Explorer Window
You can display the Explorer window by selecting View > Explorer. In the Windows and z/OS operating
environments, if the Explorer window is docked, you can click the Explorer tab to display the window.
Navigating the Explorer Window
You can find your way around the Explorer window by double-clicking folders to open them and see their
contents. You can also use pop-up menus to perform actions on a file (such as viewing its properties, or
copying it). Pop-up menus contain different options for different file types.
To open a pop-up menu in the z/OS operating environment, type ? in the selection field beside the item in
the Explorer window. To simulate a double-click, type S in the selection field beside the item.
Code Editing Windows
You can use the following editors to write and edit SAS programs:
the Enhanced Editor window
the Program Editor window
the host editor of your choice.
This training focuses on the two SAS code editing windows: the Enhanced Editor and the Program Editor
windows. The Enhanced Editor is available only in the Windows operating environment.
The features of both editors are described below. In the remainder of this chapter and in later chapters, the
general term code editing window will be used to refer to your preferred SAS code editing window.
Enhanced Editor Window
In the Windows operating environment, an Enhanced Editor window opens by default. You can use the
Enhanced Editor window to enter, edit, and submit SAS programs. The initial window title is Editor -
Untitledn (where n is a number) until you open a file or save the contents of the editor to a file. Then the window title changes to
reflect that filename. When the contents of the editor are modified, an asterisk is added to the title.
Enhanced Editor Window
You can redisplay or open additional Enhanced Editor windows by selecting View > Enhanced Editor.
Enhanced Editor Features
In the Enhanced Editor, you can perform standard editing tasks such as
opening SAS programs in various ways, including drag and drop
entering, editing, and submitting SAS programs
using the command line or menus
saving SAS programs
clearing contents.
In addition, the Enhanced Editor provides useful editing features, including
color coding and syntax checking of the SAS programming language
expandable and collapsible sections
recordable macros
support for keyboard shortcuts (Alt or Shift plus keystroke)
multi-level undo and redo.
For more information about the Enhanced Editor, open or activate the Enhanced Editor window, and then
select Help > Using This Window.
Clearing the Editor
In the Enhanced Editor, the code does not disappear when you submit it.
To clear any of these windows, you can activate the window and select Edit > Clear All.
Cleared Editor Window
The Program Editor Window
As in the Enhanced Editor window, in the Program Editor window you enter, edit, and submit SAS
programs. You can also open existing SAS programs. You can display the Program Editor window by
selecting View > Program Editor.
Program Editor Window
Features
As in the Enhanced Editor window, in the Program Editor window you can perform standard editing tasks
such as
opening SAS programs in various ways, including drag and drop
entering, editing, and submitting SAS programs
using the command line or menus
saving SAS programs
clearing contents
recalling submitted statements.
However, the Program Editor does not provide some features of the Enhanced Editor, such as syntax
checking (this feature is available in SAS 9.2), expandable and collapsible sections, and recordable
macros.
For more information about these features, open or activate the Program Editor window, and then select
Help > Using This Window.
Clearing the Editor
At any time you can clear program code from the Program Editor window by activating the window and
selecting Edit > Clear All.
When you submit SAS programs in the Program Editor window, the code in the window is automatically
cleared.
The Log Window
The Log window displays messages about your SAS session and about any SAS programs that you submit.
You can display the Log window by selecting View > Log.
Log Window
The Output Window
In the Output window, you browse LISTING output from SAS programs that you submit. (You can use a
browser to view HTML output.)
By default, the Output window is positioned behind the code editing and Log windows. When you create
output, the Output window automatically moves to the front of your display.
You can display the Output window at any time by selecting View > Output.
Output Window
Not all SAS programs create output in the Output window. Some open interactive windows. Others
produce only messages in the Log window.
Note: In mainframe operating environments, when you create multiple pages of output, a message in the
Output window border indicates that the procedure is suspended. In the example below, PROC PRINT
output is suspended.
Mainframe Window Showing Suspended Procedure
To remove the message and view the remaining output, simply scroll to the bottom of the output.
The Results Window
The Results window helps you navigate and manage output from SAS programs that you submit. You can
view, save, and print individual items of output. The Results window uses a tree structure to list various
types of output that might be available after you run SAS.
On most operating systems, the Results window is positioned behind the Explorer window and is empty
until you submit a SAS program that creates output. Then the Results window moves to the front of your
display. The Results window displays separate icons for LISTING output and HTML output. In the
example below, the first Print folder contains both types of output.
Results Window
Viewing Output in the Results Window
You can display the Results window at any time by selecting View > Results. HTML is the default output
type in the SAS windowing environment for UNIX and Windows. In these environments, when you
submit a SAS program, the HTML output is automatically displayed in the Results Viewer and the file is
listed in the Results window. In all other environments, LISTING is the default output.
The left pane of the following display shows the Results window, and the right pane shows the Results
Viewer where the default HTML output is displayed. The Results window lists the files that were created
when the SAS program executed.
Results Window and Results Viewer
Creating SAS Libraries
Earlier in this chapter, you saw that SAS files are stored in libraries. By default, SAS defines several
libraries for you (including Sashelp, Sasuser, and Work). You can also define additional libraries.
Sashelp
a permanent library that contains sample data and other files that control how SAS works at your site. This is a read-only
library.
Sasuser
a permanent library that contains SAS files in the Profile catalog that store your personal settings. This is also a
convenient place to store your own files.
Work
a temporary library for files that do not need to be saved from session to session.
Active SAS Libraries
You can also define additional libraries. When you define a library, you indicate the location of your SAS
files to SAS. Once you define a library, you can manage SAS files within it.
When you delete a SAS library, the pointer to the library is deleted, and SAS no longer has access to the
library. However, the contents of the library still exist in your operating environment.
Defining Libraries
To define a library, you assign a library name to it and specify a path, such as a directory path. (In some
operating environments you must create the directory or other storage location before defining the library.)
You can also specify an engine, which is a set of internal instructions that SAS uses for writing to and
reading from files in a library.
Defining Libraries
In this chapter, you learn about SAS libraries. You can define SAS libraries using programming statements.
Specifying Engines shows you how to write LIBNAME statements to define SAS libraries.
Depending on your operating environment and the SAS/ACCESS products that you license, you can create
libraries with various engines. Each engine enables you to read a different file format, including file
formats from other software vendors.
Creating and Using File Shortcuts
You've seen that the Explorer window gives you access to your SAS files. You can also create a file
shortcut to an external file.
An external file is a file that is created and maintained in your host operating environment. External files
contain data or text, such as
SAS programming statements
records of raw data
procedure output.
SAS can use external files, but they are not managed by SAS.
A file shortcut (or fileref) is an optional name that is used to identify an external file to SAS. File shortcuts
are stored in the File Shortcuts folder in the Explorer window. You can use a file shortcut to open, browse,
and submit a file.
When you delete a file shortcut, the pointer to the file is deleted, and SAS no longer has access to the file.
However, the file still exists in your operating environment.
Creating and Using File Shortcuts
If you have used SAS before, a file shortcut is the same as a file reference or fileref.
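A fileref can also be assigned in a program with a FILENAME statement. The sketch below is illustrative only: the fileref rawdata, the path, and the variable names are hypothetical and would need to match a real file on your system.

```sas
/* Hypothetical fileref and path: replace with an external file that
   exists in your operating environment. */
filename rawdata 'c:\users\name\data\sales.dat';

/* The fileref identifies the external file to the DATA step. The
   variable names are assumptions about the file's layout. */
data work.sales;
   infile rawdata;
   input Region $ Amount;
run;
```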
Using SAS Solutions and Tools
Along with windows for working with your SAS files and SAS programs, SAS provides a set of
ready-to-use solutions, applications, and tools. You can access many of these tools by using the Solutions
and Tools menus.
Using SAS Solutions and Tools
For example, you can use the table editor in the Tools menu to enter, browse, or edit data in a SAS data
set.
Getting Help
You've learned to use SAS windows to perform common SAS tasks. As you begin working in SAS, be
sure to take advantage of the different types of online help that are available from the Help menu.
SAS Help
Using This Window is task-oriented help for the active window.
SAS Help and Documentation is a complete guide to syntax, examples, procedures, concepts, and what's new.
Getting Started tutorials are listed under Help for products where they are available.
Selecting Learning SAS programming enables you to create data that is used in online training courses, and
displays SAS OnlineTutor if you have a site license.
If you have Internet access, SAS on the Web provides links to information including Technical Support and
Frequently Asked Questions.
To access SAS online help, documentation, and other resources from your SAS software, select
Help > SAS Documentation from the SAS toolbar. You can also access SAS documentation in the SAS
Knowledge Base at support.sas.com.
Chapter Summary
Text Summary
Components of SAS Programs
SAS programs consist of two types of steps: DATA steps and PROC (procedure) steps. These two steps,
alone or combined, form most SAS programs. A SAS program can consist of a DATA step, a PROC step,
or any combination of DATA and PROC steps. DATA steps typically create or modify SAS data sets, but
they can also be used to produce custom-designed reports. PROC steps are pre-written routines that enable
you to analyze and process the data in a SAS data set and to present the data in the form of a report. They
sometimes create new SAS data sets that typically contain the results of the procedure.
Characteristics of SAS Programs
SAS programs consist of SAS statements. A SAS statement usually begins with a SAS keyword and
always ends with a semicolon. A DATA step begins with the keyword DATA. A PROC step begins with
the keyword PROC. SAS statements are free-format, so they can begin and end anywhere on a line. One
statement can continue over several lines, and several statements can be on a line. Blanks or special
characters separate “words” in a SAS statement.
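To illustrate the free-format rule, the two PROC PRINT steps below are equivalent; only the layout differs. The example assumes the Sashelp.Class sample data set, which is normally supplied in the Sashelp library.

```sas
/* Conventional layout: one statement per line, indented for
   readability. */
proc print data=sashelp.class;
   var Name Age;
run;

/* The same three statements on a single line. SAS processes this step
   exactly as it does the one above. */
proc print data=sashelp.class; var Name Age; run;
```

The first layout is generally easier to read and debug, even though SAS does not require it.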
Processing SAS Programs
When you submit a SAS program, SAS reads SAS statements and checks them for errors. When it
encounters a subsequent DATA, PROC, or RUN statement, SAS executes the previous step in the
program.
Each time a step is executed, SAS generates a log of the processing activities and the results of the
processing. The SAS log collects messages about the processing of SAS programs and about any errors
that occur.
The results of processing can vary. Some SAS programs open an interactive window or invoke procedures
that create output in the form of a report. Other SAS programs perform tasks such as sorting and managing
data, which have no visible results other than messages in the log.
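The step boundaries described above can be sketched with a short program. The data set name work.teens is hypothetical, and the program assumes the Sashelp.Class sample data set is available.

```sas
/* DATA step: creates a temporary data set. */
data work.teens;
   set sashelp.class;
   where Age >= 13;
run;   /* step boundary: the DATA step executes here */

/* PROC step with no report output: the only visible result is a
   message in the log. */
proc sort data=work.teens;
   by Name;
run;

/* PROC step that creates output in the form of a report. */
proc print data=work.teens;
run;
```

After each step executes, the log shows messages for that step before SAS moves on to the next one.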
SAS Libraries
Every SAS file is stored in a SAS library, which is a collection of SAS files such as SAS data sets and
catalogs. In the Windows and UNIX environments, a SAS library is typically a group of SAS files in the
same folder or directory.
Depending on the libref you use, you can store SAS files in a temporary SAS library or in permanent SAS
libraries.
Temporary SAS files that are created during the session are held in a special work space that is assigned the
default libref Work. If you don't specify a libref when you create a file (or if you specify Work), the file is stored in the
temporary SAS library. When you end the session, the temporary library is deleted.
To store files permanently in a SAS library, you assign it a libref other than the default Work. For example, by
assigning the libref sasuser to a SAS library, you specify that files within the library are to be stored until you delete them.
Referencing SAS Files
To reference a SAS file, you use a two-level name, libref.filename. In the two-level name, libref is the
name for the SAS library that contains the file, and filename is the name of the file itself. A period
separates the libref and filename.
To reference temporary SAS files, you specify the default libref Work, a period, and the filename.
Alternatively, you can simply use a one-level name (the filename only) to reference a file in a temporary
SAS library. Referencing a SAS file in any library except Work indicates that the SAS file is stored
permanently.
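The naming rules above can be sketched as follows. The data set name Forecast is used only for illustration, and the last step assumes that your Sasuser library is writable at your site.

```sas
/* One-level name: the data set is stored in the temporary Work
   library. */
data forecast;
   set sashelp.class;
run;

/* Two-level name with the libref Work: refers to the same temporary
   data set as the one-level name above. */
proc print data=work.forecast;
run;

/* Two-level name with a libref other than Work: the data set is stored
   permanently (assumes Sasuser is writable). */
data sasuser.forecast;
   set work.forecast;
run;
```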
SAS data set names can be 1 to 32 characters long, must begin with a letter (A-Z, either uppercase or
lowercase) or an underscore (_), and can continue with any combination of numerals, letters, or
underscores.
SAS Data Sets
For many of the data processing tasks that you perform with SAS, you access data in the form of a SAS
data set and use SAS programs to analyze, manage, or present the data. Conceptually, a SAS data set is a
file that consists of two parts: a descriptor portion and a data portion. Some SAS data sets also contain one
or more indexes, which enable SAS to locate records in the data set more efficiently.
The descriptor portion of a SAS data set contains property information about the data set.
The data portion of a SAS data set is a collection of data values that are arranged in a rectangular table.
Observations in the data set correspond to rows or data lines. Variables in the data set correspond to
columns. If a data value is unknown for a particular observation, a missing value is recorded in the SAS
data set.
Variable Attributes
In addition to general information about the data set, the descriptor portion contains property information
for each variable in the data set. The property information includes the variable's name, type, and length. A
variable's type determines how missing values for a variable are displayed by SAS. For character variables,
a blank represents a missing value. For numeric variables, a period represents a missing value. You can
also specify format, informat, and label properties for variables.
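One way to look at the descriptor portion and at missing values is sketched below. The data set name work.attrs and its values are made up for illustration.

```sas
/* Hypothetical data set. In the second record, a period in list input
   is read as a missing value for both variables. */
data work.attrs;
   input Name $ Total;
   datalines;
Murray 1234
. .
;
run;

/* PROC CONTENTS reports the descriptor portion, including each
   variable's name, type, and length. */
proc contents data=work.attrs;
run;

/* PROC PRINT shows the data portion: the missing character value is
   displayed as a blank and the missing numeric value as a period. */
proc print data=work.attrs;
run;
```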
Using the Main SAS Windows
You use the following windows to explore and manage your files, to enter and submit SAS programs, to
view messages, and to view and manage your output.
Windows and How They Are Used
Use this window ...                     To ...
Explorer                                view your SAS files
                                        create new libraries and SAS files
                                        perform most file management tasks such as moving, copying, and deleting files
                                        create shortcuts to files that were not created with SAS
Enhanced Editor (code editing window)   enter, edit, and submit SAS programs
                                        Note: The Enhanced Editor window is available only in the Windows operating environment.
Program Editor (code editing window)    enter, edit, and submit SAS programs
Log                                     view messages about your SAS session and about any SAS programs that you submit
Output                                  browse output from SAS programs
Results                                 navigate and manage output from SAS programs
                                        view, save, and print individual items of output
Points to Remember
Before referencing SAS files, you must assign a name (libref, or library reference) to the library in which the files
are stored (or specify that SAS is to assign the name automatically).
You can store SAS files either temporarily or permanently.
Variable names follow the same rules as SAS data set names. However, your site may choose to restrict variable
names to those valid in Version 6 SAS, to uppercase variable names automatically, or to remove all restrictions on variable
names.
Chapter Quiz
Select the best answer for each question. After completing the quiz, you can check your answers using the
answer key in the appendix.
1. How many observations and variables does the data set below contain?
1. 3 observations, 4 variables
2. 3 observations, 3 variables
3. 4 observations, 3 variables
4. can't tell because some values are missing
2. How many program steps are executed when the program below is processed?
data user.tables;
infile jobs;
input date yymmdd8. name $ job $;
run;
proc sort data=user.tables;
by name;
run;
proc print data=user.tables;
run;
1. three
2. four
3. five
4. six
3. What type of variable is the variable AcctNum in the data set below?
1. numeric
2. character
3. can be either character or numeric
4. can't tell from the data shown
4. What type of variable is the variable Wear in the data set below?
1. numeric
2. character
3. can be either character or numeric
4. can't tell from the data shown
5. Which of the following variable names is valid?
1. 4BirthDate
2. $Cost
3. _Items_
4. Tax-Rate
6. Which of the following files is a permanent SAS file?
1. Sashelp.PrdSale
2. Sasuser.MySales
3. Profits.Quarter1
4. all of the above
7. In a DATA step, how can you reference a temporary SAS data set named Forecast?
1. Forecast
2. Work.Forecast
3. Sales.Forecast (after assigning the libref Sales)
4. only 1 and 2 above
8. What is the default length for the numeric variable Balance?
1. 5
2. 6
3. 7
4. 8
9. How many statements does the following SAS program contain?
proc print data=new.prodsale
label double;
var state day price1 price2; where state='NC';
label state='Name of State'; run;
1. three
2. four
3. five
4. six
10. What is a SAS library?
1. collection of SAS files, such as SAS data sets and catalogs
2. in some operating environments, a physical collection of SAS files
3. a group of SAS files in the same folder or directory
4. all of the above
Chapter 2 Referencing Files and Setting Options
Overview
Viewing SAS Libraries
Specifying Results Formats
Setting System Options
Additional Features
Chapter Summary
Chapter Quiz
Overview
Introduction
When you begin a SAS session, it's often convenient to set up your environment first. For example, you
may want to
define libraries that contain the SAS data sets that you intend to use
specify whether your procedure output is created as HTML (Hyper Text Markup Language) output, LISTING
output, or another type of output.
set features of your LISTING output, if you are creating any, such as whether the date and time appear
specify how two-digit year values should be interpreted.
This chapter shows you how to define libraries, reference SAS files, and specify options for your SAS
session. You also learn how to specify the form(s) of output to produce.
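As a preview, session setup of this kind is often done with a few global statements at the top of a program. The sketch below is illustrative only: the option values are arbitrary choices, not recommendations, and the libref and path are the ones used as an example later in this chapter.

```sas
/* Define a library (example libref and path; adjust for your site). */
libname clinic 'd:\users\qtr\reports';

/* Illustrative system options: suppress the date and time in LISTING
   output and set the first year of the 100-year span that is used to
   interpret two-digit year values. */
options nodate yearcutoff=1950;
```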
SAS System Options Window
Objectives
In this chapter, you learn to
define new libraries by using programming statements
reference SAS files to be used during your SAS session
set results options to determine the type or types of output produced (HTML, LISTING output, or other) in
desktop operating environments
set system options to determine how date values are read and to control the appearance of any LISTING output
that is created during your SAS session.
Defining Libraries
Often the first step in setting up your SAS session is to define the libraries. You can also use programming
statements to assign library names.
Remember that to reference a permanent SAS file, you
1. assign a name (libref) to the SAS library in which the file is stored
2. use the libref as the first part of the file's two-level name (libref.filename) to reference the file within the library.
Defining Libraries
Assigning Librefs
To define libraries, you can use a LIBNAME statement. You can store the LIBNAME statement with any
SAS program so that the SAS data library is assigned each time the program is submitted.
General form, basic LIBNAME statement:
LIBNAME libref 'SAS-data-library';
where
libref is 1 to 8 characters long, begins with a letter or underscore, and contains only letters, numbers, or
underscores.
SAS-data-library is the name of a SAS data library in which SAS data files are stored. The specification of the
physical name of the library differs by operating environment.
The LIBNAME statement below assigns the libref Clinic to the SAS data library D:\Users\Qtr\Reports in
the Windows environment.
libname clinic 'd:\users\qtr\reports';
Many of the examples in this book use the libref sasuser. The following LIBNAME statement assigns the
libref sasuser to the c:\Users\name\sasuser folder in a Windows operating environment:
libname sasuser 'c:\Users\name\sasuser';
The table below gives some examples of physical names for SAS data libraries in various operating
environments.
Physical Names for SAS Data Libraries
Environment    Sample Physical Name
Windows        c:\fitness\data
UNIX           /users/april/fitness/sasdata
z/OS (OS/390)  april.fitness.sasdata
The code examples in this chapter are shown in the Windows operating environment. If you are running
SAS within another operating environment, then the platform-specific names and locations will look
different. Otherwise, SAS programming code will be the same across operating environments.
You can use multiple LIBNAME statements to assign as many librefs as needed.
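For instance, a program might begin with several LIBNAME statements, one per library. The librefs and Windows paths below reuse examples from this section; the libref fitness is an assumption added for illustration.

```sas
/* One LIBNAME statement per library. The paths must already exist in
   your operating environment. */
libname clinic 'd:\users\qtr\reports';
libname fitness 'c:\fitness\data';   /* hypothetical libref */
```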
Verifying Librefs
After assigning a libref, it is a good idea to check the Log window to verify that the libref has been
assigned successfully.
Log Output for Clinic libref
How Long Librefs Remain in Effect
The LIBNAME statement is global, which means that the librefs remain in effect until you modify them,
cancel them, or end your SAS session.
Therefore, the LIBNAME statement assigns the libref for the current SAS session only. Each time you
begin a SAS session, you must assign a libref to each permanent SAS data library that contains files that
you want to access in that session. (Remember that Work is the default libref for a temporary SAS data
library.)
How Long Librefs Remain in Effect
When you end your SAS session or delete a libref, SAS no longer has access to the files in the library.
However, the contents of the library still exist on your operating system.
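The life cycle of a libref can be sketched with three forms of the LIBNAME statement. The example assumes the Clinic library path shown earlier in this section.

```sas
/* Assign the libref for the current session. */
libname clinic 'd:\users\qtr\reports';

/* Write the attributes of the library to the log, which is one way to
   confirm the assignment. */
libname clinic list;

/* Cancel the libref. The files in the folder are not deleted; SAS
   simply no longer has a name for the library. */
libname clinic clear;
```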
Remember that you can also assign a library from the SAS Explorer using the New Library window.
Libraries that are created with the New Library window can be automatically assigned at startup by
selecting Enable at Startup.
Specifying Two-Level Names
After you assign a libref, you specify it as the first element in the two-level name for a SAS file.
Specifying Two-Level Names
For example, in order for the PRINT procedure to read Clinic.Admit, you specify the two-level name of
the file as follows:
proc print data=clinic.admit;
run;
Referencing Files in Other Formats
You can use the LIBNAME statement to reference not only SAS files but also files that were created with
other software products, such as database management systems.
SAS can read or write these files by using the appropriate engine for that file type. For some file types, you
need to tell SAS which engine to use. For others, SAS automatically chooses the appropriate engine.
A SAS engine is a set of internal instructions that SAS uses for writing to and reading from files in a SAS
library.
Referencing Files in Other Formats
Specifying Engines
To indicate which engine to use, you specify the engine name in the LIBNAME statement, as shown
below.
General form, LIBNAME statement for files in other formats:
LIBNAME libref engine 'SAS-data-library';
where
libref is 1 to 8 characters long, begins with a letter or underscore, and contains only letters, numbers, or
underscores.
engine is the name of a library engine that is supported in your operating environment.
SAS-data-library is the name of a SAS library in which SAS data files are stored. The specification of the
physical name of the library differs by operating environment.
Interface Library Engines
Interface library engines support read-only access to BMDP, OSIRIS, and SPSS files. With these engines,
the physical filename that is associated with a libref is an actual filename, not a SAS library. This is an
exception to the rules for librefs.
Engines and Their Descriptions
Engine  Description
BMDP    allows read-only access to BMDP files
OSIRIS  allows read-only access to OSIRIS files
SPSS    allows read-only access to SPSS files
For example, the LIBNAME statement below specifies the libref Rptdata and the engine SPSS for the file
G:\Myspss.spss in the Windows operating environment.
libname rptdata spss 'g:\myspss.spss';
For more information about interface library engines, see the SAS documentation for your operating
environment.
SAS/ACCESS Engines
If your site licenses SAS/ACCESS software, you can use the LIBNAME statement to access data that is
stored in a DBMS file. The types of data you can access depend on your operating environment and on
which SAS/ACCESS products you have licensed.
Relational Databases and Their Associated Files
Relational Databases   Nonrelational Files   PC Files
ORACLE                 ADABAS                Excel (.xls)
SYBASE                 IMS/DL-I              Lotus (.wkn)
Informix               CA-IDMS               dBase
DB2 for z/OS           SYSTEM 2000           DIF
DB2 for UNIX and PC                          Access
Oracle Rdb                                   SPSS
Teradata                                     Stata
ODBC                                         Paradox
MySQL
CA-OpenIngres
Netezza
OLE DB
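As a hedged illustration of a SAS/ACCESS LIBNAME statement, the sketch below assumes that SAS/ACCESS Interface to Oracle is licensed at your site. The libref, user name, password, path, schema, and table name are all placeholders, not real values.

```sas
/* Hypothetical connection values: every name below is a placeholder
   and must be replaced with values supplied by your database
   administrator. */
libname orasales oracle user=myuser password=mypassword
        path=mydbpath schema=sales;

/* Once the libref is assigned, a DBMS table is referenced with a
   two-level name, like any SAS data set (hypothetical table name). */
proc print data=orasales.customers (obs=10);
run;
```

The connection options differ from one SAS/ACCESS engine to another, so check the documentation for the product that you license.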
Viewing SAS Libraries
Viewing the Contents of a SAS Library
You've seen that you can assign librefs in order to access different types of data. The section "Using the
Programming Workspace" explained that after you have assigned a libref, you can view
details about the library that the libref references
the library's contents
contents and properties of files in the library.
The libraries that are currently defined for your SAS session are listed under Libraries in the Explorer
window. To view details about a library, double-click Libraries (or select Libraries > Open from the
pop-up menu). Then select View > Details.
Information for each library (name, engine, type, host pathname, and date modified) is listed under Active
Libraries.
Viewing Active Libraries
To view the contents of a library, double-click the library name (or select the library name, and then select
Open from the pop-up menu). A list of the files contained in the library is displayed. If you have the
details feature turned on, then information about each file (name, size, type, description, and date modified)
is also listed.
Viewing the Contents of a SAS Library
Viewing a File's Contents
To display a file's contents in a windowing environment, you can double-click the filename (or select the
filename, and then select Open from the pop-up menu). If you select a SAS data set (type Table or View),
its contents are displayed in the VIEWTABLE window.
Viewing a File's Contents
To display a file's properties, you can select the filename, and then select Properties from the pop-up
menu.
Note: If you are working in the z/OS operating environment, you can type ? in the selection field next to
a filename in the Explorer window to display a pop-up menu with a list of options for working with that
file.
If you have installed SAS/FSP software, you can type B or L in the selection field next to a data set name
to browse the data set observation by observation or to list the contents of the data set, respectively. For
more information, see the documentation for SAS/FSP.
If SAS/FSP is not installed, you can view the contents of a SAS data set by using the PRINT procedure
(PROC PRINT). You can learn how to use PROC PRINT in Creating List Reports.
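A minimal sketch of that approach, assuming that the libref clinic is already assigned and that the library contains the data set admit, as in the other examples in this chapter. The OBS= data set option is used here only as a convenience to limit the listing to the first five observations.

```sas
/* List the first five observations of clinic.admit */
proc print data=clinic.admit(obs=5);
run;
```

Omitting (obs=5) prints every observation in the data set.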
Using PROC CONTENTS to View the Contents of a SAS Library
You've learned how to use SAS windows to view the contents of a SAS library or of a SAS file.
Alternatively, you can use the CONTENTS procedure to create SAS output that describes either of the
following:
the contents of a library
the descriptor information for an individual SAS data set.
General form, basic PROC CONTENTS step:
PROC CONTENTS DATA=SAS-file-specification NODS;
RUN;
where
SAS-file-specification specifies an entire library or a specific SAS data set within a library. SAS-file-specification
can take one of the following forms:
<libref.>SAS-data-set names one SAS data set to process.
<libref.>_ALL_ requests a listing of all files in the library. (Use a period (.) to append _ALL_ to the
libref.)
NODS suppresses the printing of detailed information about each file when you specify _ALL_. (You can specify
NODS only when you specify _ALL_.)
Examples
To view the contents of the entire clinic library, you can submit the following PROC CONTENTS step:
proc contents data=clinic._all_ nods;
run;
The output from this step lists only the names, types, sizes, and modification dates for the SAS files in the
Clinic library.
Using PROC CONTENTS to View the Contents of a Library
Output from PROC CONTENTS on SAS Library Clinic
To view the descriptor information for only the clinic.Admit data set, you can submit the following PROC
CONTENTS step:
proc contents data=clinic.admit;
run;
The output from this step lists information for clinic.admit, including an alphabetic list of the variables in
the data set.
PROC CONTENTS Output Showing the Descriptor Information for a Single Data Set in a Library
PROC CONTENTS Output Showing the Engine/Host Dependent information for a Single Data Set in a
Library
PROC CONTENTS Output Showing an Alphabetic List of the Variables in the Data Set
Using PROC DATASETS
In addition to PROC CONTENTS, you can use PROC DATASETS to view the contents of a SAS library
or a SAS data set. PROC DATASETS also enables you to perform a number of management tasks such as
copying, deleting, or modifying SAS files.
PROC CONTENTS and PROC DATASETS overlap in terms of functionality. Generally, these two
function the same:
the CONTENTS procedure
the CONTENTS statement in the DATASETS procedure.
PROC CONTENTS<options>;
RUN;
PROC DATASETS<options>;
CONTENTS<options>;
QUIT;
The major difference between the CONTENTS procedure and the CONTENTS statement in PROC
DATASETS is the default for libref in the DATA= option. For PROC CONTENTS, the default is either
Work or User. For the CONTENTS statement, the default is the libref of the procedure input library.
Notice also that PROC DATASETS supports RUN-group processing. It uses a QUIT statement to end the
procedure. The QUIT statement and the RUN statement are not required.
However, the options for the PROC CONTENTS statement and the CONTENTS statement in the
DATASETS procedure are the same. For example, the following PROC steps produce essentially the same
output (with minor formatting differences):
proc datasets;
contents data=clinic._all_ nods;
proc contents data=clinic._all_ nods;
run;
In addition to the CONTENTS statement, PROC DATASETS also uses several other statements. These
statements enable you to perform tasks that PROC CONTENTS does not perform. For more information
about PROC DATASETS, see the SAS documentation.
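The difference in default librefs described above can be sketched as follows. This example assumes that the libref clinic is assigned and contains the data set admit. LIBRARY= and NOLIST are PROC DATASETS options that this section does not cover: LIBRARY= names the procedure input library, and NOLIST suppresses the directory listing in the log.

```sas
/* The CONTENTS statement takes its default libref from the      */
/* procedure input library, which LIBRARY= sets to Clinic here.  */
proc datasets library=clinic nolist;
   contents data=admit;
run;
quit;

/* PROC CONTENTS has no procedure input library, so a one-level  */
/* name would refer to Work (or User). Use the two-level name.   */
proc contents data=clinic.admit;
run;
```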
Viewing Descriptor Information for a SAS Data Set Using VARNUM
As with PROC CONTENTS, you can also use PROC DATASETS to display the descriptor information
for a specific SAS data set.
By default, PROC CONTENTS and PROC DATASETS list variables alphabetically. To list variable
names in the order of their logical position (or creation order) in the data set, you can specify the
VARNUM option in PROC CONTENTS or in the CONTENTS statement in PROC DATASETS.
For example, either of these programs creates output that includes the list of variables shown below:
proc datasets;
contents data=clinic.admit varnum;
proc contents data=clinic.admit varnum;
run;
Note: If you are using the sample data in the sasuser library, you may want to specify sasuser as the
libref (instead of clinic):
proc datasets;
contents data=sasuser.admit varnum;
proc contents data=sasuser.admit varnum;
run;
Viewing Descriptor Information for a SAS Data Set Using VARNUM
Specifying Results Formats
Next, let's consider the appearance and format of your SAS output.
HTML and Listing Formats
In SAS 9.3 and later versions, when running SAS in windowing mode in the Windows and UNIX
operating environments, HTML output is created by default. On other platforms, you can create HTML
output using programming statements. When running SAS in batch mode, the default format is LISTING.
Let's look at these two result formats. The following PROC PRINT output is a listing of part of the SAS
data set Clinic.Therapy:
LISTING Output
This is HTML output from the same program:
HTML Output
If you aren't running SAS in a desktop operating environment, you might want to skip this topic and go on
to Setting System Options. For details on creating HTML output using programming statements on any
SAS platform, see Producing HTML Output.
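As a brief preview of the programming-statement approach, the sketch below wraps a PROC PRINT step in ODS HTML statements. The output path is a placeholder, not a location used in this guide, and the step assumes that the libref clinic is assigned.

```sas
/* The path is a placeholder; point it at a folder that you can write to. */
ods html body='c:\temp\therapy.html';

proc print data=clinic.therapy;
run;

/* Close the HTML destination so that the file is complete */
ods html close;
```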
The Results Tab of the Preferences Window
You use the Preferences window to set the result format(s) that you prefer. Your preferences are saved
until you modify them, and they apply to all output that you create.
To open this window in desktop operating environments, select Tools > Options > Preferences. Then
click the Results tab. You can remember this sequence using the mnemonic TOPR (pronounced “topper”).
You can choose Listing, HTML, or both. You can also specify options for displaying and storing your
results.
Results tab options may differ somewhat, depending on your operating environment. The example below
is from the Windows operating environment.
The following display shows the SAS Results tab with the new default settings specified.
SAS Results Tab with the New Default Settings
The default settings in the Results tab are as follows:
The Create listing check box is not selected, so LISTING output is not created.
The Create HTML check box is selected, so HTML output is created.
The Use WORK folder check box is selected, so both HTML and graph image files are saved in the WORK
folder (and not your current directory).
The default style, HTMLBlue, is selected from the Style drop-down list.
The Use ODS Graphics check box is selected, so ODS Graphics is enabled.
Internal browser is selected from the View results using: drop-down list, so results are viewed in an internal
SAS browser.
If you create HTML files, they are stored in the folder that you specify and are by default incrementally
named sashtml.htm, sashtml1.htm, sashtml2.htm, and so on, throughout your SAS session.
Now look at the two choices for viewing HTML results: Internal browser and Preferred web browser.
(These options appear in the Results tab only in the Windows operating environment.)
Internal Browser vs. Preferred Web Browser
To view HTML output, you can choose between these two options in the Results tab of the Preferences
window in the Windows operating environment:
the Internal browser, called the Results Viewer window. SAS provides this browser as part of your SAS
installation.
the Preferred web browser. If you select this option, SAS uses the browser that is specified in the Web tab of the
Preferences window. By default, this is the default browser for your PC.
Internal Browser
The Results Viewer is displayed as a SAS window, as shown below.
Internal Browser Showing HTML Output
Preferred Web Browser
If you select the preferred web browser, your HTML output is displayed in a separate browser that is
independent of SAS. For example, the HTML output below is displayed in Internet Explorer.
Preferred Web Browser Showing HTML Output
Setting System Options
Overview
If you create your procedure output as LISTING output, you can also control the appearance of your
output by setting system options such as
line size (the maximum width of the log and output)
page size (the number of lines per printed page of output)
the display of page numbers
the display of date and time.
The above options do not affect the appearance of HTML output.
All SAS system options have default settings that are used unless you specify otherwise. For example,
page numbers are automatically displayed in LISTING output (unless your site modifies this default).
LISTING Output Showing Default Settings
Changing System Options
To modify system options in your LISTING output, you submit an OPTIONS statement. You can place an
OPTIONS statement anywhere in a SAS program to change the settings from that point onward. However,
it is good programming practice to place OPTIONS statements outside of DATA or PROC steps so that
your programs are easier to read and debug.
Because the OPTIONS statement is global, the settings remain in effect until you modify them, or until
you end your SAS session.
General form, OPTIONS statement:
OPTIONS options;
where options specifies one or more system options to be changed. The available system options depend
on your host operating system.
Example: DATE | NODATE and NUMBER | NONUMBER Options
By default, page numbers and dates appear with LISTING output. The following OPTIONS statement
suppresses the printing of both page numbers and the date and time in LISTING output.
options nonumber nodate;
In the following example, the NONUMBER and NODATE system options suppress the display of page
numbers and the date in the PROC PRINT output. Page numbers are not displayed in the PROC FREQ
output, either, but the date does appear at the top of the PROC FREQ output since the DATE option was
specified.
ods listing;
options nonumber nodate;
proc print data=sasuser.admit;
var id sex age height weight;
where age>=30;
run;
options date;
proc freq data=sasuser.diabetes;
where fastgluc>=300;
tables sex;
run;
proc print data=sasuser.diabetes;
run;
ods listing close;
PROC PRINT LISTING Output for sasuser.diabetes
PROC FREQ LISTING Output for sasuser.diabetes
PROC PRINT LISTING Output for sasuser.diabetes
PROC FREQ LISTING Output with Date and Still No Page Number
Example: PAGENO= Option
If you print page numbers, you can specify the beginning page number for your LISTING report by using
the PAGENO= option. If you don't specify PAGENO=, output is numbered sequentially throughout your
SAS session, starting with page 1.
In the following example, the output pages are numbered sequentially throughout the SAS session,
beginning with number 3.
ods listing;
options nodate number pageno=3;
proc print data=hrd.funddrive;
run;
ods listing close;
Because SAS 9.3 does not create LISTING output by default, the ODS LISTING statement was added to
generate LISTING output.
LISTING Output with the PAGENO= Option Set
Example: PAGESIZE= Option
The PAGESIZE= option (alias PS=) specifies how many lines each page of output contains. In the
following example, each page of the output that the PRINT procedure produces contains 15 lines
(including those used by the title, date, and so on).
options number date pagesize=15;
proc print data=sasuser.admit;
run;
LISTING Output with PAGESIZE= 15
Example: LINESIZE= Option
The LINESIZE= option (alias LS=) specifies the width of the print line for your procedure output and log.
Observations that do not fit within the line size continue on a different line.
In the following example, the observations are longer than 64 characters, so the observations continue on a
subsequent page.
ODS listing;
options number linesize=64;
proc print data=flights.europe;
run;
ODS listing close;
LISTING Output with the LINESIZE= Option Set, Page 1
LISTING Output with the LINESIZE= Option Set, Page 2
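The aliases mentioned in the last two examples can be combined in a single OPTIONS statement. The sketch below also uses PROC OPTIONS, a procedure that this section does not describe, to write the current setting of each option to the SAS log so that you can confirm the change.

```sas
/* LS= and PS= are the aliases for LINESIZE= and PAGESIZE= */
options ls=64 ps=15;

/* Write the current settings to the SAS log */
proc options option=linesize;
run;
proc options option=pagesize;
run;
```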
Handling Two-Digit Years: Year 2000 Compliance
If you use two-digit year values in your data lines, external files, or programming statements, you should
consider another important system option, the YEARCUTOFF= option. This option specifies which
100-year span is used to interpret two-digit year values.
The Default 100 Year Span in SAS
All versions of SAS represent dates correctly from 1582 A.D. to 20,000 A.D. (Leap years, century, and
fourth-century adjustments are made automatically. Leap seconds are ignored, and SAS does not adjust for
Daylight Savings Time.) However, you should be aware of the YEARCUTOFF= value to ensure that you
are properly interpreting two-digit years in data lines.
As with other system options, you specify the YEARCUTOFF= option in the OPTIONS statement:
options yearcutoff=1925;
How the YEARCUTOFF= Option Works
When a two-digit year value is read, SAS interprets it based on a 100-year span which starts with the
YEARCUTOFF= value. The default value of YEARCUTOFF= is 1920.
Default YEARCUTOFF= Date (1920)
Date Expressions and How They Are Interpreted
Date Expression    Interpreted As
12/07/41           12/07/1941
18Dec15            18Dec2015
04/15/30           04/15/1930
15Apr95            15Apr1995
However, you can override the default and change the value of YEARCUTOFF= to the first year of
another 100-year span. For example, if you specify YEARCUTOFF=1950, then the 100-year span will be
from 1950 to 2049.
options yearcutoff=1950;
Using YEARCUTOFF=1950, dates are interpreted as shown below:
Interpreting Dates When YEARCUTOFF=1950
Date Expressions and How They Are Interpreted
Date Expression    Interpreted As
12/07/41           12/07/2041
18Dec15            18Dec2015
04/15/30           04/15/2030
15Apr95            15Apr1995
How Four-Digit Years Are Handled
Remember, the value of the YEARCUTOFF= system option affects only two-digit year values. A date
value that contains a four-digit year value will be interpreted correctly even if it does not fall within the
100-year span set by the YEARCUTOFF= system option.
You can learn more about reading date values in Reading Date and Time Values.
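The behavior described in this section can be checked with a short DATA step. The sketch below is illustrative only: the variable names d1 through d3 are arbitrary, the INPUT function and informats are covered later in this guide, and PROC OPTIONS is used simply to show the current YEARCUTOFF= value in the log.

```sas
/* Show the current YEARCUTOFF= value in the SAS log */
proc options option=yearcutoff;
run;

options yearcutoff=1950;   /* 100-year span is 1950 to 2049 */

data _null_;
   d1 = input('12/07/41', mmddyy8.);     /* two-digit year: 41 */
   d2 = input('15Apr95', date7.);        /* two-digit year: 95 */
   d3 = input('12/07/1941', mmddyy10.);  /* four-digit year    */
   put d1= date9. d2= date9. d3= date9.;
   /* log shows: d1=07DEC2041 d2=15APR1995 d3=07DEC1941 */
run;

options yearcutoff=1920;   /* restore the default described in this section */
```

The third value shows that a four-digit year is read as written, even though 1941 falls outside the 1950 to 2049 span.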
Using System Options to Specify Observations
You've seen how to use SAS system options to change the appearance of output and interpret two-digit
year values. You can also use the FIRSTOBS= and OBS= system options to specify the observations to
process from SAS data sets.
You can specify either or both of these options as needed. That is, you can use
FIRSTOBS= to start processing at a specific observation
OBS= to stop processing after a specific observation
FIRSTOBS= and OBS= together to process a specific group of observations.
General form, FIRSTOBS= and OBS= options in an OPTIONS statement:
FIRSTOBS=n
OBS=n
where n is a positive integer. For FIRSTOBS=, n specifies the number of the first observation to process.
For OBS=, n specifies the number of the last observation to process. By default, FIRSTOBS=1. The
default value for OBS= is MAX, which is the largest signed, eight-byte integer that is representable in your
operating environment. The number can vary depending on your operating system.
Each of these options applies to every input data set that is used in a program or a SAS process.
Examples: FIRSTOBS= and OBS= Options
The data set sasuser.heart contains 20 observations. If you specify FIRSTOBS=10, SAS reads the 10th
observation of the data set first and reads through the last observation (for a total of 11 observations).
options firstobs=10;
proc print data=sasuser.heart;
run;
The PROC PRINT step produces the following output:
PROC PRINT Output with FIRSTOBS=10
If you specify OBS=10 instead, SAS reads through the 10th observation, in this case for a total of 10
observations. (Notice that FIRSTOBS= has been reset to the default value.)
options firstobs=1 obs=10;
proc print data=sasuser.heart;
run;
Now the PROC PRINT step produces this output:
PROC PRINT Output with FIRSTOBS=1 and Obs=10
Combining FIRSTOBS= and OBS= processes observations in the middle of the data set. For example, the
following program processes only observations 10 through 15, for a total of 6 observations:
options firstobs=10 obs=15;
proc print data=sasuser.heart;
run;
Here is the output:
PROC PRINT Output with FIRSTOBS=10 and Obs=15
To reset the number of the last observation to process, you can specify OBS=MAX in the OPTIONS
statement.
options obs=max;
This instructs any subsequent SAS programs in the SAS session to process through the last observation in
the data set being read.
Using FIRSTOBS= and OBS= for Specific Data Sets
As you saw above, using the FIRSTOBS= or OBS= system options determines the first or last observation,
respectively, that is read for all steps for the duration of your current SAS session or until you change the
setting. However, you may want to
override these options for a given data set
apply these options to a specific data set only.
To affect any single file, you can use FIRSTOBS= or OBS= as data set options instead of as system
options. You specify the data set option in parentheses immediately following the input data set name.
A FIRSTOBS= or OBS= specification from a data set option overrides the corresponding FIRSTOBS= or
OBS= system option.
Example: FIRSTOBS= and OBS= as Data Set Options
As shown in the last example above, this program processes only observations 10 through 15, for a total of
6 observations:
options firstobs=10 obs=15;
proc print data=sasuser.heart;
run;
You can create the same output by specifying FIRSTOBS= and OBS= as data set options, as follows. The
data set options override the system options for this instance only.
options firstobs=10 obs=15;
proc print data=sasuser.heart(firstobs=10 obs=15);
run;
To specify FIRSTOBS= or OBS= for this program only, you could omit the OPTIONS statement
altogether and simply use the data set options.
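A sketch of that last approach is shown here. The OPTIONS statement is included only to make sure that no earlier system-level setting from the session is still in effect; it is not required if the defaults are already active.

```sas
/* Make sure no session-wide restriction is in effect */
options firstobs=1 obs=max;

/* Only this reference to the data set is limited to observations 10-15 */
proc print data=sasuser.heart(firstobs=10 obs=15);
run;

/* This step reads every observation, because the data set options */
/* above applied to that one reference only.                       */
proc print data=sasuser.heart;
run;
```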
The SAS System Options Window
You can also set system options by using the SAS System Options window. The changed options are reset
to the defaults at the end of your SAS session.
To view the SAS System Options window, select Tools > Options > System.
The SAS System Options Window
Changing Options
To change an option:
1. Expand the groups and subgroups under SAS Options Environment until you find the option that you want to
change. (Options in subgroups are listed in the right pane of the window.)
2. Click the name of the option that you want to change, and display its pop-up menu. Then select one of the
choices:
o Modify Value opens a window in which you type or select a new value for the option.
o Set to Default immediately resets the option to its default value.
For example, the SAS System Options window above shows options for the SAS log and procedure
output subgroup under the group Log and procedure output control.
Finding Options Quickly
To locate an option in the SAS System Options window:
1. Place your cursor over the name of any option group or subgroup, and display its pop-up menu.
2. Click Find Option. The Find Option dialog box appears.
3. Type the name of the option that you want to locate, and click OK.
The SAS System Options window expands to the appropriate option subgroup. All subgroup options also
appear, and the option that you located is highlighted.
Additional Features
When you set up your SAS session, you can set SAS system options that affect LISTING output,
information written to the SAS log, and much more. Here are some additional system options that you are
likely to use with SAS:
Selected System Options and Their Descriptions
FORMCHAR='formatting-characters' specifies the formatting characters for your output device. Formatting
characters are used to construct the outlines of tables as well as dividers for various procedures, such as
the FREQ and TABULATE procedures. If you do not specify formatting characters as an option in the
procedure, then the default specifications given in the FORMCHAR= system option are used.
FORMDLIM='delimit-character' specifies a character that is used to delimit page breaks in SAS System
output. Normally, the delimit character is null. When the delimit character is null, a new physical page
starts whenever a page break occurs.
LABEL|NOLABEL permits SAS procedures to temporarily replace variable names with descriptive labels.
The LABEL system option must be in effect before the LABEL option of any procedure can be used. If
NOLABEL is specified, then the LABEL option of a procedure is ignored. The default setting is LABEL.
REPLACE|NOREPLACE specifies whether permanently stored SAS data sets are replaced. If you specify
NOREPLACE, a permanently stored SAS data set cannot be replaced with one that has the same name.
This prevents you from inadvertently replacing existing SAS data sets. The default setting is REPLACE.
SOURCE|NOSOURCE controls whether SAS source statements are written to the SAS log. NOSOURCE
specifies not to write SAS source statements to the SAS log. The default setting is SOURCE.
You can also use programming statements to control the result format of each item of procedure output
individually. For more information, see Producing HTML Output.
Chapter Summary
Text Summary
Referencing SAS Files in SAS Libraries
To reference a SAS file, you assign a libref (library reference) to the SAS library in which the file is stored.
Then you use the libref as the first part of the two-level name (libref.filename) for the file. To assign a
libref, you can submit a LIBNAME statement. You can store the LIBNAME statement with any SAS
program to assign the libref automatically when you submit the program. The LIBNAME statement
assigns the libref for the current SAS session only. You must assign a libref each time you begin a SAS
session in order to access SAS files that are stored in a permanent SAS library other than Sasuser. (Work is
the default libref for a temporary SAS library.)
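The points above can be sketched in a few statements. The path is a placeholder for the folder that holds your SAS files, and the step assumes that the folder contains a data set named admit. The CLEAR option of the LIBNAME statement disassociates the libref without deleting any files.

```sas
/* The path is a placeholder; substitute the location of your SAS library */
libname clinic 'c:\users\name\sasuser';

/* Two-level name: libref.filename */
proc print data=clinic.admit;
run;

/* Disassociate the libref; the files remain on your operating system */
libname clinic clear;
```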
You can also use the LIBNAME statement to reference data in files that were created with other software
products, such as database management systems. SAS can write to or read from the files by using the
appropriate engine for that file type. For some file types, you need to tell SAS which engine to use. For
others, SAS automatically chooses the appropriate engine.
Viewing Librefs
The librefs that are in effect for your SAS session are listed under Libraries in the Explorer window. To
view details about a library, double-click Libraries (or select Libraries > Open from the pop-up menu).
Then select View > Details. The library's name, engine, host pathname, and date are listed under Active
Libraries.
Viewing the Contents of a Library
To view the contents of a library, double-click the library name in the Explorer window (or select the
library name and then select Open from the pop-up menu). Files contained in the library are listed under
Contents.
Viewing the Contents of a File
If you are working in a windowing environment, you can display the contents of a file by double-clicking
the filename (or selecting the filename and then selecting Open from the pop-up menu) under Contents in
the Explorer window. If you select a SAS data set, its contents are displayed in the VIEWTABLE window.
If you are working in the z/OS operating environment, you can type ? in the selection field next to a
filename in the Explorer window to display a pop-up menu with a list of options for working with that file.
Listing the Contents of a Library
To list the contents of a library, use the CONTENTS procedure. Append a period and the _ALL_ option to
the libref to get a listing of all files in the library. Add the NODS option to suppress detailed information
about the files. As an alternative to PROC CONTENTS, you can use PROC DATASETS.
Specifying Result Formats
In desktop operating environments, you can choose to create your SAS procedure output as an HTML
document, a listing (traditional SAS output), or both. You choose the results format(s) that you prefer in
the Preferences window. Your preferences are saved until you modify them, and they apply to all output
that is created during your SAS session. To open this window, select Tools > Options > Preferences.
Then click the Results tab. Choose Create listing, Create HTML, or both.
If you choose Create HTML, then each HTML file is displayed in the browser that you specify (in the
Windows operating environment, the internal browser is the Results Viewer window). HTML files are
stored in the location that you specify and are by default incrementally named sashtml.htm, sashtml1.htm,
sashtml2.htm, and so on throughout your SAS session. To specify where HTML files are stored, type a
path in the Folder box (or click Browse to locate a pathname). If you prefer to store your HTML files
temporarily and to delete them at the end of your SAS session, click Use WORK folder instead of
specifying a folder. To specify the presentation style for HTML output, you can select an item in the Style
box.
Setting System Options
For your LISTING output, you can also control the appearance of your output by setting system options
such as line size, page size, the display of page numbers, and the display of the date and time. (These
options do not affect the appearance of HTML output.)
All SAS system options have default settings that are used unless you specify otherwise. For example,
page numbers are automatically displayed (unless your site modifies this default). To modify system
options, you submit an OPTIONS statement. You can place an OPTIONS statement anywhere in a SAS
program to change the current settings. Because the OPTIONS statement is global, the settings remain in
effect until you modify them or until you end your SAS session.
If you use two-digit year values in your SAS data lines, you must be aware of the YEARCUTOFF= option
to ensure that you are properly interpreting two-digit years in your SAS program. This option specifies
which 100-year span is used to interpret two-digit year values.
To specify the observations to process from SAS data sets, you can use the FIRSTOBS= and OBS=
options.
You can also use the SAS System Options window to set system options.
Additional Features
You can set a number of additional SAS system options that are commonly used.
Syntax
LIBNAME libref 'SAS-data-library';
LIBNAME libref engine 'SAS-data-library';
OPTIONS options;
PROC CONTENTS DATA=libref._ALL_ NODS;
PROC DATASETS;
CONTENTS DATA=libref._ALL_ NODS;
QUIT;
Points to Remember
LIBNAME and OPTIONS statements remain in effect for the current SAS session only.
When you work with date values, check the default value of the YEARCUTOFF= system option and change it if
necessary.
Chapter Quiz
Select the best answer for each question. After completing the quiz, check your answers using the answer
key in the appendix.
1. If you submit the following program, how does the output look?
options pagesize=55 nonumber;
proc tabulate data=clinic.admit;
class actlevel;
var age height weight;
table actlevel,(age height weight)*mean;
run;
options linesize=80;
proc means data=clinic.heart min max maxdec=1;
var arterial heart cardiac urinary;
class survive sex;
run;
1. The PROC MEANS output has a print line width of 80 characters, but the PROC TABULATE output has
no print line width.
2. The PROC TABULATE output has no page numbers, but the PROC MEANS output has page numbers.
3. Each page of output from both PROC steps is 55 lines long and has no page numbers, and the PROC
MEANS output has a print line width of 80 characters.
4. The date does not appear on output from either PROC step.
2. How can you create SAS output in HTML format on any SAS platform?
1. by specifying system options
2. by using programming statements
3. by using SAS windows to specify the result format
4. you can't create HTML output on all SAS platforms
3. In order for the date values 05May1955 and 04Mar2046 to be read correctly, what value must the
YEARCUTOFF= option have?
1. a value between 1947 and 1954, inclusive
2. 1955 or higher
3. 1946 or higher
4. any value
4. When you specify an engine for a library, you are always specifying
1. the file format for files that are stored in the library.
2. the version of SAS that you are using.
3. access to other software vendors' files.
4. instructions for creating temporary SAS files.
5. Which statement prints a summary of all the files stored in the library named Area51?
1.
proc contents data=area51._all_ nods;
2.
proc contents data=area51 _all_ nods;
3.
proc contents data=area51 _all_ noobs;
4.
proc contents data=area51 _all_.nods;
6. The following PROC PRINT output was created immediately after PROC TABULATE output. Which system
options were specified when the report was created?
1. OBS=, DATE, and NONUMBER
2. NUMBER, PAGENO=1, and DATE
3. NUMBER and DATE only
4. none of the above
7. Which of the following programs correctly references a SAS data set named SalesAnalysis that is stored in a
permanent SAS library?
1.
data saleslibrary.salesanalysis;
set mydata.quarter1sales;
if sales>100000;
run;
2.
data mysales.totals;
set sales_99.salesanalysis;
if totalsales>50000;
run;
3.
proc print data=salesanalysis.quarter1;
var sales salesrep month;
run;
4.
proc freq data=1999data.salesanalysis;
tables quarter*sales;
run;
8. Which time span is used to interpret two-digit year values if the YEARCUTOFF= option is set to 1950?
1. 1950-2049
2. 1950-2050
3. 1949-2050
4. 1950-2000
9. Assuming you are using SAS code and not special SAS windows, which one of the following statements is false?
1. LIBNAME statements can be stored with a SAS program to reference the SAS library automatically
when you submit the program.
2. When you delete a libref, SAS no longer has access to the files in the library. However, the contents of
the library still exist on your operating system.
3. Librefs can last from one SAS session to another.
4. You can access files that were created with other vendors' software by submitting a LIBNAME
statement.
10. What does the following statement do?
libname osiris spss 'c:\myfiles\sasdata\data.spss';
1. defines a library called Spss using the OSIRIS engine
2. defines a library called Osiris using the SPSS engine
3. defines two libraries called Osiris and Spss using the default engine
4. defines the default library using the OSIRIS and SPSS engines
Chapter 3 Editing and Debugging SAS Programs
Overview
Opening a Stored SAS Program
Editing SAS Programs
Clearing SAS Programming Windows
Interpreting Error Messages
Correcting Errors
Resolving Common Problems
Additional Features
Chapter Summary
Chapter Quiz
Overview
Introduction
Now that you're familiar with the basics, you can learn how to use the SAS programming windows to edit
and debug programs effectively.
SAS Programming Windows
Objectives
In this chapter you learn to
open a stored SAS program
edit SAS programs
clear SAS programming windows
interpret error messages in the SAS log
correct errors
resolve common problems.
Opening a Stored SAS Program
Overview
You can open a stored program in the code editing window, where you can edit the program and submit it
again.
You can open a program using
file shortcuts
My Favorite Folders
the Open window
the INCLUDE command.
In operating environments that provide drag-and-drop functionality, you can also open a SAS file by
dragging and dropping it into the code editing window.
Code Editor Window
Using File Shortcuts
File shortcuts are available in the SAS Explorer window. To open a program using file shortcuts:
1. At the top level of the SAS Explorer window, double-click File Shortcuts.
2. Double-click the file shortcut that you want to open, or select Open from the pop-up menu for the file.
File Shortcuts
The file opens in a new code editing window, with the filename shown as the window title.
You can select Submit from the pop-up menu to execute the file directly from File Shortcuts.
Remember that you open File Shortcuts with a single-click in the z/OS or CMS operating
environments. To open a file, you type
? beside the item to display pop-up menus
S beside the item to simulate a double-click.
Using My Favorite Folders
To view and manage any files in your operating environment, you can use the My Favorite Folders
window.
To open a file that is stored in My Favorite Folders:
1. Select View > My Favorite Folders.
2. Double-click the file, or select Open from the pop-up menu for the file.
The file opens in a new code editing window, with the filename shown as the window title.
You can select Submit from the pop-up menu to execute the file directly from My Favorite Folders.
Using the Open Window
To use the Open window:
1. With the code editing window active, select Open (or File > Open Program).
2. In the Open window, click the file that you want to open (or type the path for the file).
3. Click Open or OK.
The Open Window
The file opens in a new code editing window, with the filename shown as the window title.
In the Windows environment, you can submit your file directly from the Open window by clicking the
Submit check box before clicking Open.
The appearance of the Open window varies by operating environment. In mainframe operating
environments, files are not listed, so you must type the path for the file.
If you're not sure of the name of the file in which your program is stored, you can do either of the
following:
use My Favorite Folders to locate the file in your operating environment
issue the X command to temporarily suspend your SAS session.
The X command enables you to use operating system facilities without ending your SAS session. When you issue the X
command, you temporarily exit SAS and enter the host system.
To resume your SAS session after issuing the X command, issue the appropriate host system command, as
shown below.
Resuming a SAS Session
Operating Environment   Host command to resume SAS session
z/OS                    RETURN or END
UNIX                    exit
Windows                 EXIT
Issuing an INCLUDE Command
You can also open a program by issuing an INCLUDE command.
General form, basic INCLUDE command:
INCLUDE 'file-specification'
where file-specification is the physical name by which the host system recognizes the file.
Example
Suppose you want to include the program D:\Programs\Sas\Myprog1.sas in the Windows operating
environment. To do so, you can issue the following INCLUDE command:
include 'd:\programs\sas\myprog1.sas'
Because this is a command (not a SAS statement), no semicolon is used at the end.
Editing SAS Programs
Now that you know how to open a SAS program, let's review the characteristics of SAS statements and
look at enhancing the readability of your SAS programs.
SAS Program Structure
Remember that SAS programs consist of SAS statements. Consider the SAS program that is shown in a
code editing window below.
An Example SAS Program
Although you can write SAS statements in almost any format, a consistent layout enhances readability and
helps you understand the program's purpose. It's a good idea to
begin DATA and PROC steps in column one
indent statements within a step
begin RUN statements in column one
include a RUN statement after every DATA step or PROC step.
data sashelp.prdsale;
   infile jobs;
   input date name $ job $;
run;
proc print data=sashelp.prdsale;
run;
Now let's look at the Enhanced Editor window features that you can use to edit your SAS programs.
Note: If you are running SAS in an operating environment other than Windows, you can skip the
Enhanced Editor window section and see Program Editor Features instead.
Using the Enhanced Editor
When you edit SAS programs in the Enhanced Editor, you can take advantage of a number of useful
features. The following section shows you how to
use color-coding to identify code elements
automatically indent the next line when you press the Enter key
collapse and expand sections of SAS procedures, DATA steps, and macros
bookmark lines of code for easy access to different sections of your program
open multiple views of a file.
Enhanced Editor Window
You can also
save color-coding settings in a color scheme
create macros that record and play back program editing commands
create shortcuts for typing in text using abbreviations
customize keyboard shortcuts for most Enhanced Editor commands
create and format your own keywords
automatically reload modified disk files
access Help for SAS procedures by placing the insertion point within the procedure name and pressing F1.
Entering and Viewing Text
The Enhanced Editor uses visual aids such as color coding and code sections to help you write and debug
your SAS programs.
You can use the margin on the left side of the Editor window to
select one or more lines of text
expand and collapse code sections
display bookmarks.
Finding and Replacing Text
To find text in the Editor window, select Edit > Find or press Ctrl+F. You can set commonly used find
options, including specifying that the text string is a regular expression. You can also specify whether to
search in the code only, in the comments only, or in both the code and comments.
The Find Window
Using Abbreviations
You can define a character string so that when you type it and then press the Tab key or the Enter key, the
string expands to a longer character string.
For example, you could define the abbreviation myv7sasfiles, which would expand to
'c:\winnt\profiles\myid\personal\mysasfiles\v7';. Abbreviations are actually macros that insert one or
more lines of text.
To create an abbreviation,
1. Select Tools > Add Abbreviation.
2. For Abbreviation, type the name of the abbreviation.
3. For Text to insert for abbreviation, type the text that the abbreviation will expand into.
4. Click OK.
To use an abbreviation, type the abbreviation. When an abbreviation is recognized, a tooltip displays the
expanded text. Press the Tab key or Enter key to accept the abbreviation.
You can modify or delete abbreviations that you create.
Opening Multiple Views of a File
When you use the Enhanced Editor, you can see different parts of the same file simultaneously by opening
multiple views of the same file. While you are working with multiple views, you are working with only
one file, not multiple copies of the same file.
To open multiple views of the same file,
1. Make the file the active window.
2. Select Window > New Window.
The filename in the title bar is appended with a colon and a view number. Changes that you make to a file
in one view, such as changing text or bookmarking a line, occur in all views simultaneously. Actions such
as scroll bar movement, text selection, and expanding or contracting a section of code occur only in the
active window.
Setting Enhanced Editor Options
You can customize the Enhanced Editor by setting Enhanced Editor options. To open the Enhanced Editor
Options window, activate an Editor window and select Tools > Options > Enhanced Editor.
Click the tabs that are located along the top of the window to navigate to the settings that you want to
change, and then select the options that you want.
For example, Show line numbers specifies whether to display line numbers in the margin. When line
numbers are displayed, the current line number is red. Insert spaces for tabs specifies whether to insert
the space character or the tab character when you press the Tab key. If it is selected, the space character is
used. If it is not selected, the tab character is used.
When you are finished, click OK.
Enhanced Editor Options Window
For more information about setting options in the Enhanced Editor, select Help > Using This Window
from an Enhanced Editor window.
Program Editor Features
You use the Program Editor window to submit SAS programs. You can also enter and edit SAS programs
in this window, or you can open existing SAS programs that you created in another text editor.
To edit SAS programs in the Program Editor window, you can display line numbers. Then you can use text
editor line commands and block text editor line commands within the line numbers to insert, delete, move,
and copy lines within the Program Editor window.
In all operating environments, you can use line numbers, text editor line commands, and block text editor
line commands. Depending on your operating environment, you can also use features such as dragging and
dropping text, inserting lines with the Enter key, or marking and deleting text. For information about these
features, see the SAS documentation for your operating environment.
Line Numbers
In some operating environments, line numbers appear in the Program Editor window by default. In other
systems, the use of line numbers is optional. Activating line numbers can make it easier for you to edit
your program regardless of your operating environment.
Line Numbers
To turn on line numbers, use the NUMS command. Type nums on the command line or in the command
box and press Enter. To turn line numbers off, enter the NUMS command again. Line numbers activated
using the NUMS command will remain in effect for the duration of your SAS session.
To permanently display line numbers, select Tools > Options > Program Editor. Select the Editing tab.
Select the Display line numbers option. Click OK.
Text Editor Line Commands
Text editor line commands enable you to delete, insert, move, copy, and repeat text. You enter these
commands in the line number area in the Program Editor window.
The table below summarizes the basic text editing line commands.
Text Editor Line Commands
Command   Action
Cn        copies n lines (where n = a number up to 9999)
Dn        deletes n lines
In        inserts n blank lines
Mn        moves n lines
Rn        repeats current line n times
A         after (used with C, I, and M)
B         before (used with C, I, and M)
You can use text editor line commands to perform actions like these:
Results from Using Text Editor Line Commands
Command   Action
00001
i3002     inserts 3 lines after line 00002
00003

0ib01
00002     inserts 1 line before line 00001
00003

0ib41
00002     inserts 4 lines before line 00001
00003

000c2
00002     copies 2 lines (00001 and 00002) after line 00003
0a 003

00001
0d302     deletes 3 lines (00002, 00003, and 00004)
00003

00b01
00002     moves 1 line (00003) before line 00001
00m 03
Example 1
In the example below, a PROC PRINT statement and a RUN statement need to be inserted after line
00004.
Incomplete Program
To insert the PROC PRINT and RUN statements in the program,
1. Type i2 anywhere in the line number area for line 00004 in the Program Editor window. Two blank lines are
inserted, and the cursor is positioned on the first new line.
2. Type the PROC PRINT statement on line 00005.
3. Type a RUN statement on line 00006.
Complete Program
Block Text Editor Line Commands
Block text editor line commands enable you to delete, repeat, copy, and move multiple lines in the
Program Editor window.
The block text editor line commands include the following:
Block Text Editor Line Commands
Command   Action
DD        deletes a block of lines
CC        copies a block of lines
MM        moves a block of lines
RR        repeats multiple lines
A         after (used with CC and MM commands)
B         before (used with CC and MM commands)
To use a block command, specify the command on the first line affected and on the final line affected, and
then press Enter.
Example 1
In the program below, the PROC TABULATE step needs to be deleted.
Original Program
You can use the DD block command to remove the step. To use the DD command:
1. Type DD anywhere in the line number area for line 00007 to mark the first line affected.
2. Type DD anywhere in the line number area for line 00010 to mark the final line affected.
3. Press Enter to delete lines 00007-00010.
Edited Program
Example 2
In the example below, the PROC TABULATE step needs to be moved to the line after the PROC PRINT
step.
Original Program
You can use the MM block command with the A or B command to move a block of code to a specific
location. MM marks the block to be moved, and A or B marks where you want the block to go. Use A
to move the block after the line, and use B to move the block before the line.
To use the MM command:
1. Type MM anywhere in the line number area for line 00005 to mark the first line affected.
2. Type MM anywhere in the line number area for line 00008 to mark the last line affected.
3. Type A anywhere in the line number area for line 00010 to move the block after line 00010.
4. Press Enter to move lines 00005-00008 to line 00010.
Edited Program
Recalling SAS Programs
SAS statements disappear from the Program Editor window when you submit them. However, you can
recall a program to the Program Editor by selecting Run > Recall Last Submit.
The area in which submitted SAS code is stored is known as the recall buffer. Program statements
accumulate in the recall buffer each time you submit a program. The recall buffer is last-in/first-out (the
most recently submitted statements are recalled first).
For example, if you submit two programs, you will need to select Run > Recall Last Submit two times to
recall the first program to the Program Editor window. You can recall statements any time during your
SAS session.
Recalled Programs
Saving SAS Programs
To save your SAS program to an external file, first activate the code editing window and select
File > Save As. Then specify the name of your program in the Save As window. It's good practice to
assign the extension .SAS to your SAS programs to distinguish them from other types of SAS files
(such as .LOG for log files and .DAT for raw data files).
Note: To save a SAS program as an external file in the z/OS operating environment, select
File > Save As > Write to File, and then specify the name of your program in the Save As dialog box.
You can specify SAS as the last level of the filename.
Save As Window
You can also save a SAS program by issuing a FILE command.
General form, basic FILE command:
FILE 'file-specification'
where file-specification is the name of the file to be saved.
Example
Suppose you want to save a program as D:\Programs\Sas\Newprog.Sas in the Windows operating
environment. To do so, you can issue the following FILE command:
file 'd:\programs\sas\newprog.sas'
Clearing SAS Programming Windows
When you run SAS programs, text accumulates in the Output window and in the Log window.
You might find it helpful to clear the contents of your SAS programming windows. To clear the Output
window, Editor window, Program Editor window, or Log window, activate each window individually and
select Edit > Clear All.
Cleared Log Window
You can also clear the contents of a window by activating the window and then issuing the CLEAR
command.
Text that has been cleared from windows cannot be recalled. However, in the Editor and Program Editor
windows, you can select Edit > Undo to redisplay the text.
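If you clear the same windows repeatedly, the CLEAR command can also be issued from program code. The sketch below assumes an interactive windowing session and uses the DM statement, which this chapter does not cover, to pass window commands to SAS. Treat it as one possible approach, not as the chapter's own method.

```sas
* Activate the Log window and clear it, then repeat for the Output window;
dm 'log; clear; output; clear;';
```

As with Edit > Clear All, text that is cleared this way cannot be recalled, so submit the statement only when you no longer need the earlier messages.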
Interpreting Error Messages
Error Types
So far, the programs that are shown in this chapter have been error-free, but programming errors do occur.
SAS can detect several types of errors. The most common are
syntax errors that occur when program statements do not conform to the rules of the SAS language
data errors that occur when some data values are not appropriate for the SAS statements that are specified in a
program.
This chapter focuses on identifying and correcting syntax errors.
Syntax Errors
When you submit a program, SAS scans each statement for syntax errors, and then executes the step (if no
syntax errors are found). SAS then goes to the next step and repeats the process. Syntax errors, such as
misspelled keywords, generally prevent SAS from executing the step in which the error occurred.
You already know that notes are written to the Log window at the conclusion of execution of the program.
When a program that contains an error is submitted, messages regarding the problem also appear in the
Log window. When a syntax error is detected, the Log window
displays the word ERROR
identifies the possible location of the error
gives an explanation of the error.
Example
The program below contains a syntax error. The DATA step copies the SAS data set Clinic.Admit into a
new data set named Clinic.Admitfee. The PROC step should print the values for the variables ID, Name,
Actlevel, and Fee in the new data set. However, print is misspelled in the PROC PRINT statement.
data clinic.admitfee;
   set clinic.admit;
run;
proc prin data=clinic.admitfee;
   var id name actlevel fee;
run;
When the program is submitted, messages in the Log window indicate that the procedure PRIN was not
found and that SAS stopped processing the PRINT step due to errors. No output is produced by the PRINT
procedure, because the second step fails to execute.
Log Window Displaying Error Message
Problems with your statements or data might not be evident when you look at results in the Output window.
Therefore, it is important to review the messages in the Log window each time you submit a SAS program.
Correcting Errors
Overview
To modify programs that contain errors, you can edit them in the code editing window. You can correct
simple errors, such as the spelling error in the following program, by typing over the incorrect text,
deleting text, or inserting text.
data clinic.admitfee;
   set clinic.admit;
run;
proc prin data=clinic.admitfee;
   var id name actlevel fee;
run;
If you use the Program Editor window, you usually need to recall the submitted statements from the recall
buffer to the Program Editor window, where you can correct the problems. Remember that you can recall
submitted statements by issuing the RECALL command or by selecting Run > Recall Last Submit.
In the program below, the missing t has been inserted into the PRINT keyword that is specified in the
PROC PRINT statement.
Corrected Program
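With the correction applied, the program reads as follows. This listing simply transcribes the fix described in the text; it assumes that the Clinic library is already assigned in your session.

```sas
data clinic.admitfee;
   set clinic.admit;
run;

* PRINT is now spelled correctly in the PROC statement;
proc print data=clinic.admitfee;
   var id name actlevel fee;
run;
```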
Some problems are relatively easy to diagnose and correct. But sometimes you might not know right away
how to correct errors. The online Help provides information about individual procedures as well as help
that is specific to your operating environment. From the Help menu, you can also select SAS on the Web
for links to Technical Support and Frequently Asked Questions, if you have Internet access.
Resubmitting a Revised Program
After correcting your program, you can submit it again. However, before doing so, it is a good idea to clear
the messages from the Log window so that you don't confuse the old error messages with the new
messages. Then you can resubmit the program and view any resulting output.
Correct PROC PRINT Output
Remember to check the Log window again to verify that your program ran correctly.
Log Message with No Errors
Because submitted steps remain in the recall buffer, resubmitting error-free steps wastes system resources.
You can place SAS comment symbols before and after your error-free code until you debug the rest of
your program. SAS ignores text between comment symbols during processing. When your entire program
is error-free, remove the comment symbols and your entire SAS program is intact. See Additional Features
for instructions about creating a SAS comment statement.
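As a minimal sketch of this technique, suppose the DATA step from the earlier example has already run cleanly and only the PROC step still needs work. The block comment around the DATA step is an illustration, not part of the original program, and its comment symbols are indented one column so that they do not start in columns 1 and 2.

```sas
 /* Temporarily commented out: this step already ran without errors.
data clinic.admitfee;
   set clinic.admit;
run;
 */

* Only the step that is still being debugged is submitted;
proc print data=clinic.admitfee;
   var id name actlevel fee;
run;
```

When the PROC step is also error-free, delete the two comment symbol lines to restore the full program.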
To resubmit a section of code in the Windows operating environment, highlight the selected code in the
code editing window. Then select Run > Submit.
Note: This and other chapters show you the Enhanced Editor window only. If you are not using the
Enhanced Editor as a code editing window, be sure to adapt the directions for the Program Editor window.
For example, you might need to recall programs.
Resolving Common Problems
Overview
In addition to correcting spelling mistakes, you might need to resolve several other types of common
syntax errors. These errors include
omitting semicolons
leaving quotation marks unbalanced
specifying invalid options.
Another common problem is omitting a RUN statement at the end of a program. Although this is not
technically an error, it can produce unexpected results. For the sake of convenience, we'll consider it
together with syntax errors.
The table below lists these problems and their symptoms.
Identifying Problems in a SAS Program
Problem                      Symptom
missing RUN statement        “PROC (or DATA) step running” at top of active window
missing semicolon            log message indicating an error in a statement that seems to be valid
unbalanced quotation marks   log message indicating that a quoted string has become too long or
                             that a statement is ambiguous
invalid option               log message indicating that an option is invalid or not recognized
Missing RUN Statement
Each step in a SAS program is compiled and executed independently from every other step. As a step is
compiled, SAS recognizes the end of the current step when it encounters
a DATA or PROC statement, which indicates the beginning of a new step
a RUN or QUIT statement, which indicates the end of the current step.
When the program below is submitted, the DATA step executes, but the PROC step does not. The PROC
step does not execute because there is no following DATA or PROC step to indicate the beginning of a
new step, nor is there a following RUN statement to indicate the end of the step.
data clinic.admitfee;
   set clinic.admit;
run;
proc print data=clinic.admitfee;
   var id name actlevel fee;
Because there is nothing to indicate the end of the PROC step, the PRINT procedure waits before
executing, and a “PROC PRINT running” message appears at the top of the active window.
Program Window with Message
Resolving the Problem
To correct the error, submit a RUN statement to complete the PROC step.
run;
If you are using the Program Editor window, you do not need to recall the program to the Program Editor
window.
Missing Semicolon
One of the most common errors is the omission of a semicolon at the end of a statement. The program
below is missing a semicolon at the end of the PROC PRINT statement.
data clinic.admitfee;
   set clinic.admit;
run;
proc print data=clinic.admitfee
   var id name actlevel fee;
run;
When you omit a semicolon, SAS reads the statement that lacks the semicolon, plus the following
statement, as one long statement. The SAS log then lists errors that relate to the combined statement, not
the actual mistake (the missing semicolon).
Log Window with Error Message
Resolving the Problem
To correct the error, do the following:
1. Find the statement that lacks a semicolon. You can usually locate the statement that lacks the semicolon by
looking at the underscored keywords in the error message and working backwards.
2. Add a semicolon in the appropriate location.
3. Resubmit the corrected program.
4. Check the Log window again to make sure there are no other errors.
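Applied to the program above, the steps lead to the version below. It is a sketch of the corrected program in which the only change is the restored semicolon.

```sas
data clinic.admitfee;
   set clinic.admit;
run;

* The semicolon that ends the PROC PRINT statement has been restored;
proc print data=clinic.admitfee;
   var id name actlevel fee;
run;
```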
Unbalanced Quotation Marks
Some syntax errors, such as the missing quotation mark after HIGH in the program below, cause SAS to
misinterpret the statements in your program.
data clinic.admitfee;
   set clinic.admit;
   where actlevel='HIGH;
run;
proc print data=clinic.admitfee;
   var id name actlevel fee;
run;
When the program is submitted, SAS is unable to resolve the DATA step, and a “DATA STEP running”
message appears at the top of the active window.
Program Editor with Message
Sometimes a warning appears in the SAS log which indicates that
a quoted string has become too long
a statement that contains quotation marks (such as a TITLE or FOOTNOTE statement) is ambiguous due to
invalid options or unquoted text.
Log with Warning Message
When you have unbalanced quotation marks, SAS is often unable to detect the end of the statement in
which it occurs. Simply adding a quotation mark and resubmitting your program usually does not solve the
problem. SAS still considers the quotation marks to be unbalanced.
Therefore, you need to resolve the unbalanced quotation mark by canceling the submitted statements (in
the Windows and UNIX operating environments) or by submitting a line of SAS code (in the z/OS
operating environment) before you recall, correct, and resubmit the program.
If you do not resolve this problem when it occurs, it is likely that any subsequent programs that you submit
in the current SAS session will generate errors.
Resolving the Error in the Windows Operating Environment
To resolve the error in the Windows operating environment:
1. Press the Ctrl and Break keys or click the Break Icon on the toolbar.
2. Select 1. Cancel Submitted Statements, and then click OK.
Tasking Manager Window
3. Select Y to cancel submitted statements, and then click OK.
Break Window
4. Correct the error and resubmit the program.
Resolving the Error in the UNIX Operating Environment
To resolve the error in the UNIX operating environment:
1. Open the Session Management window and click Interrupt.
Session Management Window
2. Select 1. Cancel Submitted Statements, and then click Y.
Tasking Management Window
3. Correct the error and resubmit the program.
Resolving the Error in the z/OS Operating Environment
To resolve the error in the z/OS operating environment:
1. Submit an asterisk followed by a quotation mark, a semicolon, and a RUN statement.
*'; run;
2. Delete the line that contains the asterisk followed by the quotation mark, the semicolon, and the RUN statement.
3. Insert the missing quotation mark in the appropriate place.
4. Submit the corrected program.
Note: You can also use the above method in the Windows and UNIX operating environments.
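Once the unbalanced quotation mark has been resolved in your operating environment, the program that you resubmit would look like the sketch below. The only change from the original example is the closing quotation mark after HIGH.

```sas
data clinic.admitfee;
   set clinic.admit;
   * The closing quotation mark after HIGH has been added;
   where actlevel='HIGH';
run;

proc print data=clinic.admitfee;
   var id name actlevel fee;
run;
```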
Invalid Option
An invalid option error occurs when you specify an option that is not valid in a particular statement. In the
program below, the KEYLABEL option is not valid when used with the PROC PRINT statement.
data clinic.admitfee;
   set clinic.admit;
   where weight>180 and (actlevel="MOD" or actlevel="LOW");
run;
proc print data=clinic.admitfee keylabel;
   label actlevel='Activity Level';
run;
When a SAS statement that contains an invalid option is submitted, a message appears in the Log window
indicating that the option is not valid or not recognized.
Log Window with Syntax Error Message
Resolving the Problem
To correct the error:
1. Remove or replace the invalid option, and check your statement syntax as needed.
2. Resubmit the corrected program.
3. Check the Log window again to make sure there are no other errors.
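One way to correct the PROC step is sketched below. It assumes that the intent was to display the label assigned in the LABEL statement, so KEYLABEL is replaced with the LABEL option, which PROC PRINT accepts. If labels are not wanted, you could simply remove KEYLABEL instead.

```sas
* KEYLABEL is replaced with the LABEL option, which PROC PRINT accepts
  and which prints variable labels as column headings;
proc print data=clinic.admitfee label;
   label actlevel='Activity Level';
run;
```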
Additional Features
Comments in SAS Programs
You can insert comments into a SAS program to document the purpose of the program, to explain
segments of the program, or to describe the steps in a complex program or calculation. A comment
statement begins and ends with a comment symbol. There are two forms of comment statements:
*text;
or
/*text*/
SAS ignores text between comment symbols during processing.
The following program shows some of the ways comments can be used to describe a SAS program.
/* Read national sales data for vans */
/* from an external raw data file */
data perm.vansales;
   infile vandata;
   input @1 Region $9.
         @13 Quarter 1. /* Values are 1, 2, 3, or 4 */
         @16 TotalSales comma11.;
run;
/* Print the entire data set */
proc print data=perm.vansales;
run;
CAUTION:
Avoid placing the /* comment symbols in columns 1 and 2. On some host operating systems, SAS might
interpret a /* in columns 1 and 2 as a request to end the SAS job or session. For more information, see the
SAS documentation for your operating environment.
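The program above uses only the /*text*/ form. The same documentation can be written with the *text; form, as in the sketch below. Because this form is itself a statement that ends at the first semicolon, it cannot contain a semicolon and cannot be placed inside another statement, so the note about Quarter values is written as a separate comment before the INPUT statement.

```sas
* Read national sales data for vans from an external raw data file;
data perm.vansales;
   infile vandata;
   * Quarter values are 1, 2, 3, or 4;
   input @1 Region $9.
         @13 Quarter 1.
         @16 TotalSales comma11.;
run;

* Print the entire data set;
proc print data=perm.vansales;
run;
```

This form also avoids starting a line with /* in columns 1 and 2.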
System Options
SAS includes several system options that enable you to control error handling and Log window messages.
The table shown below contains brief descriptions of some of these options. You can use either the
OPTIONS statement or the SAS System Options window to specify these options.
Options for Controlling Error Handling and Log Window Messages
Option               Description
ERRORS=n             Specifies the maximum number of observations for which complete data error
                     messages are printed.
FMTERR | NOFMTERR    Controls whether SAS generates an error message when a format of a variable
                     cannot be found. NOFMTERR results in a warning instead of an error. FMTERR is
                     the default.
SOURCE | NOSOURCE    Controls whether SAS writes source statements to the SAS log. SOURCE is the
                     default.
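These options can be set with an OPTIONS statement, as in the minimal sketch below. The value 5 is an arbitrary illustration, not a recommended setting.

```sas
* Print complete data error messages for at most 5 observations
  and do not treat a missing format as an error;
options errors=5 nofmterr;

* Stop writing source statements to the log, then restore the default;
options nosource;
options source;
```

Settings made with the OPTIONS statement stay in effect for the rest of the SAS session unless you change them again.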
Error Checking In the Enhanced Editor
If you are using the Enhanced Editor, you can use its color-coding for program elements, quoted strings,
and comments to help you find coding errors.
You can also search for
ending brackets or parentheses by pressing Ctrl+].
matching DO-END pairs by pressing Alt+[. (You can learn about DO-END pairs in .)
See the following table for suggestions about finding syntax errors using the Enhanced Editor.
Finding Syntax Errors Using the Enhanced Editor
To find...   Do this...
Undefined or misspelled keywords
   In the Appearance tab of the Enhanced Editor Options dialog box, set the file
   elements Defined keyword, User defined keyword, and the Undefined keyword
   to unique color combinations.
   When SAS recognizes a keyword, the keyword changes to the defined colors.
   You'll be able to easily spot undefined keywords by looking for the color that you
   selected for undefined keywords.
Unmatched quoted strings
   Look for one or more lines of the program that are the same color.
   Text following a quotation mark remains the same color until the string is closed
   with a matching quotation mark.
Unmatched comments
   Look for one or more lines of the program that are the same color.
   Text that follows an open comment symbol ( /* ) remains the same color until the
   comment is closed with a closing comment symbol ( */ ).
Matching DO-END pairs
   Place the cursor within a DO-END block and press Alt+[.
   The cursor moves first to the DO keyword. If one of the keywords is not found, the
   cursor remains as positioned.
   When both of the keywords exist, pressing Alt+[ moves the cursor between the
   DO-END keywords.
Matching parentheses or brackets
   Place the cursor on either side of the parenthesis or bracket. Press Ctrl+].
   The cursor moves to the matching parenthesis or bracket. If one is not found, the
   cursor remains as positioned.
Missing semicolons ( ; )
   Look for keywords that appear in normal text.
Debugging with the DATA Step Debugger
Unlike most syntax errors, logic errors do not stop a program from running. Instead, they cause the
program to produce unexpected results.
You can debug logic errors in DATA steps by using the DATA step debugger. This tool enables you to
issue commands to execute DATA step statements one by one, and then to pause to display the resulting
variables' values in a window. By observing the results that are displayed, you can determine where the
logic error lies.
The debugger can be used only in interactive mode. Because the debugger is interactive, you can repeat the
process of issuing commands and observing results as many times as needed in a single debugging session.
To invoke the debugger, add the DEBUG option to the DATA statement, and submit the program.
data perm.publish / debug;
infile pubdata;
input BookID $ Publisher & $22. Year;
run;
proc print data=perm.publish;
run;
Debugger Log Window
For detailed information about how to use the debugger, see the SAS documentation.
Chapter Summary
Text Summary
Opening a Stored SAS Program
A SAS program that is stored in an external file can be opened in the code editing window using
file shortcuts
My Favorite Folders
the Open window
the INCLUDE command.
Editing SAS Programs
SAS programs consist of SAS statements. Although you can write SAS statements in almost any format, a
consistent layout enhances readability and enables you to understand the program's purpose.
In the Windows operating environment, the Enhanced Editor enables you to enter and view text and
select one or more lines of text
use color-coding to identify code elements
automatically indent the next line when you press the Enter key
collapse and expand sections of SAS procedures, DATA steps, and macros
bookmark lines of code for easy access to different sections of your program.
Using the Editor window, you can also find and replace text, use abbreviations, open multiple views of a
file, and set Enhanced Editor options.
The Program Editor window enables you to edit your programs just as you would with a word processing
program. You can also use text editor commands and block text editor commands to edit SAS programs.
Activating line numbers can make it easier for you to edit your program regardless of your operating
environment.
Remember that SAS statements disappear from the Program Editor window when they are submitted.
However, you can recall a program to the Program Editor window.
To save your SAS program to an external file, first activate the Program Editor window and select
File > Save As. Then specify the name of your program in the Save As window.
Clearing SAS Programming Windows
Text and output accumulate in the Editor, Program Editor, Log, and Output windows throughout your SAS
session. You can clear a window by selecting Edit > Clear All.
Interpreting Error Messages
When a SAS program that contains errors is submitted, error messages appear in the Log window. SAS
can detect three types of errors: syntax, execution-time, and data. This chapter focuses on identifying and
resolving common syntax errors.
Correcting Errors
To modify programs that contain errors, you can correct the errors in the code editing window. In the
Program Editor window, you need to recall the submitted statements before editing them.
Before resubmitting a revised program, it is a good idea to clear the messages from the Log window so that
you don't confuse old messages with the new. You can delete any error-free steps from a revised program
before resubmitting it.
Resolving Common Problems
You might need to resolve several types of common problems: missing RUN statements, missing
semicolons, unbalanced quotation marks, and invalid options.
Points to Remember
It's a good idea to begin DATA steps, PROC steps, and RUN statements on the left and to indent statements
within a step.
End each step with a RUN statement.
Review the messages in the Log window each time you submit a SAS program.
You can delete any error-free steps from a revised program before resubmitting it, or you can submit only the
revised steps in a program.
Chapter Quiz
Select the best answer for each question. After completing the quiz, check your answers using the answer
key in the appendix.
1. As you write and edit SAS programs, it's a good idea to
1. begin DATA and PROC steps in column one.
2. indent statements within a step.
3. begin RUN statements in column one.
4. do all of the above.
2. Suppose you have submitted a SAS program that contains spelling errors. Which set of steps should you perform,
in the order shown, to revise and resubmit the program?
1. Correct the errors. Clear the Log window. Resubmit the program. Check the Log window.
2. Correct the errors. Resubmit the program. Check the Output window. Check the Log window.
3. Correct the errors. Clear the Log window. Resubmit the program. Check the Output window.
4. Correct the errors. Clear the Output window. Resubmit the program. Check the Output window.
3. What happens if you submit the following program?
proc sort data=clinic.stress out=maxrates;
by maxhr;
run;
proc print data=maxrates label double noobs;
label rechr='Recovery Heart Rate;
var resthr maxhr rechr date;
where toler='I' and resthr>90;
sum fee;
run;
1. Log messages indicate that the program ran successfully.
2. A “PROC SORT running” message appears at the top of the active window, and a log message might
indicate an error in a statement that seems to be valid.
3. A log message indicates that an option is not valid or not recognized.
4. A “PROC PRINT running” message appears at the top of the active window, and a log message might
indicate that a quoted string has become too long or that the statement is ambiguous.
4. What generally happens when a syntax error is detected?
1. SAS continues processing the step.
2. SAS continues to process the step, and the Log window displays messages about the error.
3. SAS stops processing the step in which the error occurred, and the Log window displays messages about
the error.
4. SAS stops processing the step in which the error occurred, and the Output window displays messages
about the error.
5. A syntax error occurs when
1. some data values are not appropriate for the SAS statements that are specified in a program.
2. the form of the elements in a SAS statement is correct, but the elements are not valid for that usage.
3. program statements do not conform to the rules of the SAS language.
4. none of the above
6. How can you tell whether you have specified an invalid option in a SAS program?
1. A log message indicates an error in a statement that seems to be valid.
2. A log message indicates that an option is not valid or not recognized.
3. The message “PROC running” or “DATA step running” appears at the top of the active window.
4. You can't tell until you view the output from the program.
7. Which of the following programs contains a syntax error?
1.
proc sort data=sasuser.mysales;
by region;
run;
2.
dat sasuser.mysales;
set mydata.sales99;
where sales<5000;
run;
3.
proc print data=sasuser.mysales label;
label region='Sales Region';
run;
4. none of the above
8. What should you do after submitting the following program in the Windows or UNIX operating environment?
proc print data=mysales;
where state='NC;
run;
1. Submit a RUN statement to complete the PROC step.
2. Recall the program. Then add a quotation mark and resubmit the corrected program.
3. Cancel the submitted statements. Then recall the program, add a quotation mark, and resubmit the
corrected program.
4. Recall the program. Then replace the invalid option and resubmit the corrected program.
9. Which of the following commands opens a file in the code editing window?
1.
file 'd:\programs\sas\newprog.sas'
2.
include 'd:\programs\sas\newprog.sas'
3.
open 'd:\programs\sas\newprog.sas'
4. all of the above
10. Suppose you submit a short, simple DATA step. If the active window displays the message “DATA step running”
for a long time, what probably happened?
1. You misspelled a keyword.
2. You forgot to end the DATA step with a RUN statement.
3. You specified an invalid data set option.
4. Some data values weren't appropriate for the SAS statements that you specified.
Chapter 4 Creating List Reports
Overview
Creating a Basic Report
Selecting Variables
Identifying Observations
Sorting Data
Generating Column Totals
Double Spacing LISTING Output
Specifying Titles and Footnotes
Assigning Descriptive Labels
Formatting Data Values
Using Permanently Assigned Labels and Formats
Additional Features
Chapter Summary
Chapter Quiz
Overview
Introduction
To list the information in a data set, you can create a report with a PROC PRINT step. Then you can
enhance the report with additional statements and options to create reports like those shown below.
PROC PRINT
Objectives
In this chapter, you learn to
specify SAS data sets to print
select variables and observations to print
sort data by the values of one or more variables
specify column totals for numeric variables
double space LISTING output
add titles and footnotes to procedure output
assign descriptive labels to variables
apply formats to the values of variables.
Creating a Basic Report
To produce a simple list report, you first reference the library in which your SAS data set is stored. If you
want, you can also set system options to control the appearance of your reports. Then you submit a basic
PROC PRINT step.
General form, basic PROC PRINT step:
PROC PRINT DATA=SAS-data-set;
RUN;
where SAS-data-set is the name of the SAS data set to be printed.
In the program below, the PROC PRINT statement invokes the PRINT procedure and specifies the data set
Therapy in the SAS data library to which the libref Patients has been assigned.
libname patients 'c:\records\patients';
proc print data=patients.therapy;
run;
Notice the layout of the resulting report below. By default,
all observations and variables in the data set are printed
a column for observation numbers appears on the far left
variables and observations appear in the order in which they occur in the data set.
Patients.Therapy Data Set
Be sure to specify the equal sign in the DATA= option in SAS procedures. If you omit the equal sign, your
program produces an error similar to the following in the SAS log:
Error Message
Selecting Variables
Overview
By default, a PROC PRINT step lists all the variables in a data set. You can select variables and control the
order in which they appear by using a VAR statement in your PROC PRINT step.
General form, VAR statement:
VAR variable(s);
where variable(s) is one or more variable names, separated by blanks.
For example, the following VAR statement specifies that only the variables Age, Height, Weight, and Fee
be printed, in that order:
proc print data=clinic.admit;
var age height weight fee;
run;
The procedure output from the PROC PRINT step with the VAR statement lists only the values for the
variables Age, Height, Weight, and Fee.
Procedure Output
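As a minimal, self-contained sketch of how the VAR statement controls both which columns appear and their order, the step below builds a small temporary data set and prints two of its variables in reverse of their stored order. The data set name work.admit_demo and its values are hypothetical and are not part of the Clinic.Admit data used above.

```sas
/* Hypothetical sample data; variable names mirror the example above. */
data work.admit_demo;
   input ID $ Age Height Weight Fee;
   datalines;
2458 27 72 168 85.20
2462 34 66 152 124.80
2501 31 61 123 149.75
;
run;

/* Only Fee and Age are printed, and Fee comes first, */
/* even though Age precedes Fee in the data set.      */
proc print data=work.admit_demo;
   var fee age;
run;
```

Variables that are not named in the VAR statement (here ID, Height, and Weight) remain in the data set; they are simply left out of the report.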
In addition to selecting variables, you can control the default Obs column that PROC PRINT displays to
list observation numbers. If you prefer, you can choose not to display observation numbers.
Printing Observations
Removing the OBS Column
To remove the Obs column, specify the NOOBS option in the PROC PRINT statement.
proc print data=work.example noobs;
var age height weight fee;
run;
Identifying Observations
You've learned how to remove the Obs column altogether. As another alternative, you can use one or more
variables to replace the Obs column in the output.
Using the ID Statement
To specify which variables should replace the Obs column, use the ID statement. This technique is
particularly useful when observations are too long to print on one line.
General form, ID statement:
ID variable(s);
where variable(s) specifies one or more variables to print instead of the observation number at the
beginning of each row of the report.
Example
To replace the Obs column and identify observations based on an employee's ID number and last name,
you can submit the following program.
proc print data=sales.reps;
id idnum lastname;
run;
This is HTML output from the program:
HTML Output
In LISTING output, the IDnum and LastName columns are repeated for each observation that is printed on
more than one line.
LISTING Output
If a variable in the ID statement also appears in the VAR statement, the output contains two columns for
that variable. In the example below, the variable IDnum appears twice.
proc print data=sales.reps;
id idnum lastname;
var idnum sex jobcode salary;
run;
IDNUM Output
Selecting Observations
By default, a PROC PRINT step lists all the observations in a data set. You can control which observations
are printed by adding a WHERE statement to your PROC PRINT step. There should be only one WHERE
statement in a step. If multiple WHERE statements are issued, only the last statement is processed.
General form, WHERE statement:
WHERE where-expression;
where where-expression specifies a condition for selecting observations. The where-expression can be any
valid SAS expression.
For example, the following WHERE statement selects only observations for which the value of Age is
greater than 30:
proc print data=clinic.admit;
var age height weight fee;
where age>30;
run;
Here is the procedure output from the PROC PRINT step with the WHERE statement:
PROC PRINT Output with WHERE Statement
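The rule that only the last WHERE statement in a step is processed can be seen in a small sketch. The data set work.where_demo and its values are hypothetical, chosen so that the two approaches select different rows.

```sas
/* Hypothetical sample data. */
data work.where_demo;
   input Age Fee;
   datalines;
25 200
35 200
45 50
;
run;

/* Two WHERE statements: only the last one (fee>150) is applied, */
/* so the observation with Age=25 is still printed.              */
proc print data=work.where_demo;
   where age>30;
   where fee>150;
run;

/* To require both conditions, combine them in one WHERE statement. */
/* Only the observation with Age=35 is printed.                     */
proc print data=work.where_demo;
   where age>30 and fee>150;
run;
```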
Specifying WHERE Expressions
In the WHERE statement you can specify any variable in the SAS data set, not just the variables that are
specified in the VAR statement. The WHERE statement works for both character and numeric variables.
To specify a condition based on the value of a character variable, you must
enclose the value in quotation marks
write the value with lowercase, uppercase, or mixed case letters exactly as it appears in the data set.
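Because character comparisons are case sensitive, a value that differs only in case is not selected. The sketch below illustrates this with a hypothetical data set, work.levels_demo. The second step uses the UPCASE function, which is not discussed in this chapter, as one possible way to ignore case.

```sas
/* Hypothetical sample data with mixed-case values. */
data work.levels_demo;
   input Name $ ActLevel $;
   datalines;
Murray HIGH
Almers High
Jones LOW
;
run;

/* Selects only Murray: 'High' does not match 'HIGH'. */
proc print data=work.levels_demo;
   where actlevel='HIGH';
run;

/* Selects Murray and Almers by comparing an uppercased copy of the value. */
proc print data=work.levels_demo;
   where upcase(actlevel)='HIGH';
run;
```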
You use the following comparison operators to express a condition in the WHERE statement:
Comparison Operators in a WHERE Statement
| Symbol | Meaning | Example |
| --- | --- | --- |
| = or eq | equal to | where name='Jones, C.'; |
| ^= or ne | not equal to | where temp ne 212; |
| > or gt | greater than | where income>20000; |
| < or lt | less than | where partno lt "BG05"; |
| >= or ge | greater than or equal to | where id>='1543'; |
| <= or le | less than or equal to | where pulse le 85; |
You can learn more about valid SAS expressions in Creating SAS Data Sets from External Files.
Using the CONTAINS Operator
The CONTAINS operator selects observations that include the specified substring. The symbol for the
CONTAINS operator is ?. You can use either the CONTAINS keyword or the symbol in your code, as
shown below.
where firstname CONTAINS 'Jon';
where firstname ? 'Jon';
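As a brief sketch, the step below shows that CONTAINS matches the substring anywhere in the value and that the match is case sensitive. The data set work.names_demo and its values are hypothetical.

```sas
/* Hypothetical sample data. */
data work.names_demo;
   input FirstName $;
   datalines;
Jon
Jonathan
DeJon
jon
;
run;

/* Selects Jon, Jonathan, and DeJon. The lowercase value jon */
/* is not selected because 'Jon' does not occur in it.       */
proc print data=work.names_demo;
   where firstname contains 'Jon';
run;
```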
Specifying Compound WHERE Expressions
You can also use WHERE statements to select observations that meet multiple conditions. To link a
sequence of expressions into compound expressions, you use logical operators, including the following:
Compound WHERE Expression Operators
| Operator | Meaning |
| --- | --- |
| AND or & | and, both. If both expressions are true, then the compound expression is true. |
| OR or \| | or, either. If either expression is true, then the compound expression is true. |
Examples of WHERE Statements
You can use compound expressions like these in your WHERE statements:
where age<=55 and pulse>75;
where area='A' or region='S';
where ID>'1050' and state='NC';
When you test for multiple values of the same variable, you specify the variable name in each expression:
where actlevel='LOW' or actlevel='MOD';
where fee=124.80 or fee=178.20;
You can use the IN operator as a convenient alternative:
where actlevel in ('LOW','MOD');
where fee in (124.80,178.20);
To control the way compound expressions are evaluated, you can use parentheses (expressions in parentheses are
evaluated first):
where (age<=55 and pulse>75) or area='A';
where age<=55 and (pulse>75 or area='A');
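To see how the placement of parentheses changes the result, the sketch below runs both of the preceding WHERE statements against a small hypothetical data set, work.vitals_demo. The values are chosen so that the two expressions select different observations.

```sas
/* Hypothetical sample data. */
data work.vitals_demo;
   input Age Pulse Area $;
   datalines;
40 80 B
60 70 A
60 80 A
60 80 B
;
run;

/* Selects observations 1, 2, and 3: observation 1 meets the */
/* first condition, and observations 2 and 3 have Area='A'.  */
proc print data=work.vitals_demo;
   where (age<=55 and pulse>75) or area='A';
run;

/* Selects observation 1 only: Age must be 55 or less in every case. */
proc print data=work.vitals_demo;
   where age<=55 and (pulse>75 or area='A');
run;
```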
Sorting Data
Overview
By default, PROC PRINT lists observations in the order in which they appear in your data set. To sort your
report based on values of a variable, you must use PROC SORT to sort your data before using the PRINT
procedure to create reports from the data.
The SORT procedure
rearranges the observations in a SAS data set
can create a new SAS data set that contains the rearranged observations (when you specify the OUT= option)
replaces the original SAS data set by default (when you omit the OUT= option)
can sort on multiple variables
can sort in ascending or descending order
does not generate printed output
treats missing values as the smallest possible values.
General form, simple PROC SORT step:
PROC SORT DATA=SAS-data-set <OUT=SAS-data-set>;
BY <DESCENDING> BY-variable(s);
RUN;
where
the DATA= option specifies the data set to be read
the OUT= option creates an output data set that contains the data in sorted order
BY-variable(s) in the required BY statement specifies one or more variables whose values are used to sort the data
the DESCENDING option in the BY statement sorts observations in descending order. If you have more than one
variable in the BY statement, DESCENDING applies only to the variable that immediately follows it.
CAUTION:
If you don't use the OUT= option, PROC SORT overwrites the data set specified in the DATA= option.
Example
In the following program, the PROC SORT step sorts the permanent SAS data set Clinic.Admit by the
values of the variable Age within the values of the variable Weight and creates the temporary SAS data set
Wgtadmit. Then the PROC PRINT step prints a subset of the Wgtadmit data set.
proc sort data=clinic.admit out=work.wgtadmit;
by weight age;
run;
proc print data=work.wgtadmit;
var weight age height fee;
where age>30;
run;
The report displays observations in ascending order of age within weight.
Observations Displayed in Ascending Order of Age Within Weight
Adding the DESCENDING option to the BY statement sorts observations in ascending order of age within
descending order of weight. Notice that DESCENDING applies only to the variable Weight.
proc sort data=clinic.admit out=work.wgtadmit;
by descending weight age;
run;
proc print data=work.wgtadmit;
var weight age height fee;
where age>30;
run;
Observations Displayed in Descending Order
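Because DESCENDING applies only to the variable that immediately follows it, sorting every BY variable in descending order means repeating the keyword. The following is a minimal sketch that uses hypothetical data set names (work.sort_demo and work.sorted_demo).

```sas
/* Hypothetical sample data. */
data work.sort_demo;
   input Weight Age;
   datalines;
150 30
180 25
150 45
180 40
;
run;

/* DESCENDING is repeated so that it applies to both variables. */
proc sort data=work.sort_demo out=work.sorted_demo;
   by descending weight descending age;
run;

/* Rows print in the order 180/40, 180/25, 150/45, 150/30. */
proc print data=work.sorted_demo;
run;
```

The OUT= option keeps work.sort_demo in its original order, which makes it easy to compare the two data sets.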
Generating Column Totals
Overview
To produce column totals for numeric variables, you can list the variables to be summed in a SUM
statement in your PROC PRINT step.
General form, SUM statement:
SUM variable(s);
where variable(s) is one or more numeric variable names, separated by blanks.
The SUM statement in the following PROC PRINT step requests column totals for the variable
BalanceDue:
proc print data=clinic.insure;
var name policy balancedue;
where pctinsured < 100;
sum balancedue;
run;
Column totals appear at the end of the report in the same format as the values of the variables.
Column Totals
Requesting Subtotals
You might also want to subtotal numeric variables. To produce subtotals, add both a SUM statement and a
BY statement to your PROC PRINT step.
General form, BY statement in the PRINT procedure:
BY <DESCENDING> BY-variable-1
<...<DESCENDING><BY-variable-n>>
<NOTSORTED>;
where
BY-variable specifies a variable that the procedure uses to form BY groups. You can specify more than one
variable, separated by blanks.
the DESCENDING option specifies that the data set is sorted in descending order by the variable that
immediately follows.
the NOTSORTED option specifies that observations are not necessarily sorted in alphabetic or numeric order. If
observations that have the same values for the BY variables are not contiguous, the procedure treats each contiguous set as
a separate BY group.
CAUTION:
If you do not use the NOTSORTED option in the BY statement, the observations in the data set must
either be sorted by all the variables that you specify, or they must be indexed appropriately.
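The following sketch illustrates the NOTSORTED option with a hypothetical data set, work.visits_demo, whose observations are grouped but not sorted. It is an illustration under these assumptions rather than a recommended layout.

```sas
/* Hypothetical sample data: HIGH appears in two separate runs. */
data work.visits_demo;
   input ActLevel $ Fee;
   datalines;
HIGH 100
HIGH 150
LOW 80
HIGH 120
;
run;

/* NOTSORTED lets PROC PRINT use the data as is. Each contiguous run */
/* of the same value forms its own BY group, so three BY groups are  */
/* formed here: HIGH, LOW, and HIGH again.                           */
proc print data=work.visits_demo;
   by actlevel notsorted;
   sum fee;
run;
```

If the NOTSORTED option were removed, the step would be expected to fail with a message that the data set is not sorted, because LOW is followed by HIGH.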
Example
The SUM statement in the following PROC PRINT step requests column totals for the variable Fee, and
the BY statement produces a subtotal for each value of ActLevel.
proc sort data=clinic.admit out=work.activity;
by actlevel;
run;
proc print data=work.activity;
var age height weight fee;
where age>30;
sum fee;
by actlevel;
run;
In the output, the BY variable name and value appear before each BY group. The BY variable name and
the subtotal appear at the end of each BY group.
BY Group Output: High
BY Group Output: Low
BY Group Output: Mod
Creating a Customized Layout with BY Groups and ID Variables
In the previous example, you may have noticed the redundant information for the BY variable. For
example, in the partial PROC PRINT output below, the BY variable ActLevel is identified both before the
BY group and for the subtotal.
Creating a Customized Layout with BY Groups and ID Variables
To show the BY variable heading only once, you can use an ID statement and a BY statement together
with the SUM statement. When an ID statement specifies the same variable as the BY statement,
the Obs column is suppressed
the ID/BY variable is printed in the left-most column
each ID/BY value is printed only at the start of each BY group and on the line that contains that group's subtotal.
Example
The ID, BY, and SUM statements work together to produce the output shown below. The ID variable is
listed only once for each BY group and once for each sum. The BY lines are suppressed. Instead, the value
of the ID variable, ActLevel, identifies each BY group.
proc sort data=clinic.admit out=work.activity;
by actlevel;
run;
proc print data=work.activity;
var age height weight fee;
where age>30;
sum fee;
by actlevel;
id actlevel;
run;
Creating Custom Output Example Output
Requesting Subtotals on Separate Pages
As another enhancement to your PROC PRINT report, you can request that each BY group be printed on a
separate page by using the PAGEBY statement.
General form, PAGEBY statement:
PAGEBY BY-variable;
where BY-variable identifies a variable that appears in the BY statement in the PROC PRINT step. PROC
PRINT begins printing a new page if the value of the BY variable changes, or if the value of any BY
variable that precedes it in the BY statement changes.
CAUTION:
The variable specified in the PAGEBY statement must also be specified in the BY statement in the PROC
PRINT step.
Example
The PAGEBY statement in the program below prints BY groups for the variable ActLevel separately. The
BY groups appear separated by horizontal lines in the HTML output.
proc sort data=clinic.admit out=work.activity;
by actlevel;
run;
proc print data=work.activity;
var age height weight fee;
where age>30;
sum fee;
by actlevel;
id actlevel;
pageby actlevel;
run;
PAGEBY Example: High
PAGEBY Example: Low
PAGEBY Example: Mod
Double Spacing LISTING Output
If you are generating SAS LISTING output, one way to control the layout is to double space it. To do so,
specify the DOUBLE option in the PROC PRINT statement. For example,
proc print data=clinic.stress double;
var resthr maxhr rechr;
where tolerance='I';
run;
Double spacing does not apply to HTML output.
To generate SAS LISTING output, you must select Tools > Options > Preferences. Select the Results tab.
Select the Create listing option.
Double-Spaced LISTING Output
Specifying Titles and Footnotes
Overview
Now you've learned how to structure your PRINT procedure output. However, you might also want to
make your reports easy to interpret by
adding titles and footnotes
replacing variable names with descriptive labels
formatting variable values.
Although this chapter focuses on PROC PRINT, you can apply these enhancements to most SAS
procedure output.
TITLE and FOOTNOTE Statements
To make your report more meaningful and self-explanatory, you can associate up to 10 titles with
procedure output by using TITLE statements before the PROC step. Likewise, you can specify up to 10
footnotes by using FOOTNOTE statements before the PROC step.
Because TITLE and FOOTNOTE statements are global statements, you can place them anywhere within or before
the PRINT procedure. Titles and footnotes are assigned as soon as TITLE or FOOTNOTE statements are
read; they apply to all subsequent output.
General form, TITLE and FOOTNOTE statements:
TITLE<n> 'text';
FOOTNOTE<n> 'text';
where n is a number from 1 to 10 that specifies the title or footnote line, and 'text' is the actual title or
footnote to be displayed. The maximum title or footnote length depends on your operating environment
and on the value of the LINESIZE= option.
The keyword title is equivalent to title1. Likewise, footnote is equivalent to footnote1. If you don't specify
a title, the default title is The SAS System. No footnote is printed unless you specify one.
Note: Be sure to match quotation marks that enclose the title or footnote text.
Using the TITLES and FOOTNOTES Windows
You can also specify titles in the TITLES window and footnotes in the FOOTNOTES window. Titles and
footnotes that you specify in these windows are not stored with your program, and they remain in effect
only during your SAS session.
To open the TITLES window, issue the TITLES command. To open the FOOTNOTES window, issue the
FOOTNOTES command.
To specify a title or footnote, type in the text you want next to the number of the line where the text should
appear. To cancel a title or footnote, erase the existing text. Notice that you do not enclose text in quotation
marks in these windows.
Titles Window
Example: Titles
The two TITLE statements below, specified for lines 1 and 3, define titles for the PROC PRINT output.
title1 'Heart Rates for Patients with';
title3 'Increased Stress Tolerance Levels';
proc print data=clinic.stress;
var resthr maxhr rechr;
where tolerance='I';
run;
Title lines for HTML output appear differently depending on the version of SAS that you use. In SAS
Version 8, title lines simply appear consecutively, without extra spacing to indicate skipped title numbers.
In SAS®9 HTML output, title line 2 is blank.
HTML Output with Titles: SAS®8
HTML Output with Titles: SAS®9
In SAS LISTING output for all versions of SAS, title line 2 is blank, as shown below. Titles are centered
by default.
LISTING Output with Titles: All Versions
Example: Footnotes
The two FOOTNOTE statements below, specified for lines 1 and 3, define footnotes for the PROC PRINT
output. Since there is no footnote2, a blank line is inserted between footnotes 1 and 3 in the output.
footnote1 'Data from Treadmill Tests';
footnote3 '1st Quarter Admissions';
proc print data=clinic.stress;
var resthr maxhr rechr;
where tolerance='I';
run;
Footnotes appear at the bottom of each page of procedure output. Notice that footnote lines are pushed up
from the bottom. The FOOTNOTE statement that has the largest number appears on the bottom line.
HTML Output with Footnotes
In SAS LISTING output, footnote line 2 is blank, as shown below. Footnotes are centered by default.
LISTING Output with Footnotes
Modifying and Canceling Titles and Footnotes
TITLE and FOOTNOTE statements are global statements. That is, after you define a title or footnote, it
remains in effect until you modify it, cancel it, or end your SAS session.
For example, the footnotes that are assigned in the PROC PRINT step below also appear in the output
from the PROC TABULATE step.
footnote1 'Data from Treadmill Tests';
footnote3 '1st Quarter Admissions';
proc print data=clinic.stress;
var resthr maxhr rechr;
where tolerance='I';
run;
proc tabulate data=clinic.stress;
where tolerance='I';
var resthr maxhr;
table mean*(resthr maxhr);
run;
Redefining a title or footnote line cancels any higher-numbered title or footnote lines, respectively. In the
example below, defining a title for line 2 in the second report automatically cancels title line 3.
title3 'Participation in Exercise Therapy';
proc print data=clinic.therapy;
var swim walkjogrun aerclass;
run;
title2 'Report for March';
proc print data=clinic.therapy;
run;
To cancel all previous titles or footnotes, specify a null TITLE or FOOTNOTE statement (a TITLE or
FOOTNOTE statement with no number or text) or a TITLE1 or FOOTNOTE1 statement with no text. This
will also cancel the default title The SAS System.
For example, in the program below, the null TITLE1 statement cancels all titles that are in effect before
either PROC step executes. The null FOOTNOTE statement cancels all footnotes that are in effect after the
PROC PRINT step executes. The PROC TABULATE output appears without a title or a footnote.
title1;
footnote1 'Data from Treadmill Tests';
footnote3 '1st Quarter Admissions';
proc print data=clinic.stress;
var resthr maxhr rechr;
where tolerance='I';
run;
footnote;
proc tabulate data=clinic.stress;
var timemin timesec;
table max*(timemin timesec);
run;
Assigning Descriptive Labels
Temporarily Assigning Labels to Variables
You can also enhance your PROC PRINT report by labeling columns with more descriptive text. To label
columns, you use
the LABEL statement to assign a descriptive label to a variable
the LABEL option in the PROC PRINT statement to specify that the labels be displayed.
General form, LABEL statement:
LABEL variable1='label1'
variable2='label2'
... ;
Labels can be up to 256 characters long. Enclose the label in quotation marks.
The LABEL statement applies only to the PROC step in which it appears.
Example
In the PROC PRINT step below, the variable name WalkJogRun is displayed with the label Walk/Jog/Run.
Note the LABEL option in the PROC PRINT statement. Without the LABEL option in the PROC PRINT
statement, PROC PRINT would use the variable name walkjogrun as the column heading even though you
specified a label for the variable.
proc print data=clinic.therapy label;
label walkjogrun='Walk/Jog/Run';
run;
Output Created Without the LABEL Option
Using Single or Multiple LABEL Statements
You can assign labels in separate LABEL statements . . .
proc print data=clinic.admit label;
var age height;
label age='Age of Patient';
label height='Height in Inches';
run;
. . . or you can assign any number of labels in a single LABEL statement.
proc print data=clinic.admit label;
var actlevel height weight;
label actlevel='Activity Level'
height='Height in Inches'
weight='Weight in Pounds';
run;
Formatting Data Values
Temporarily Assigning Formats to Variables
In your SAS reports, formats control how the data values are displayed. To make data values more
understandable when they are displayed in your procedure output, you can use the FORMAT statement,
which associates formats with variables.
Formats affect only how the data values appear in output, not the actual data values as they are stored in
the SAS data set.
General form, FORMAT statement:
FORMAT variable(s) format-name;
where
variable(s) is the name of one or more variables whose values are to be written according to a particular pattern
format-name specifies a SAS format or a user-defined format that is used to write out the values.
The FORMAT statement applies only to the PROC step in which it appears.
You can use a separate FORMAT statement for each variable, or you can format several variables (using
either the same format or different formats) in a single FORMAT statement.
Formats That Are Used to Format Data
| This FORMAT statement ... | Associates ... | To display values as ... |
| --- | --- | --- |
| format date mmddyy8.; | the format MMDDYY8. with the variable Date | 06/05/03 |
| format net comma5.0 gross comma8.2; | the format COMMA5.0 with the variable Net and the format COMMA8.2 with the variable Gross | 1,234 and 5,678.90 |
| format net gross dollar9.2; | the format DOLLAR9.2 with both variables, Net and Gross | $1,234.00 and $5,678.90 |
For example, the FORMAT statement below writes values of the variable Fee using dollar signs, commas,
and no decimal places.
proc print data=clinic.admit;
var actlevel fee;
where actlevel='HIGH';
format fee dollar4.;
run;
FORMAT Statement Example
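As a sketch of formatting several variables with different formats in a single FORMAT statement, the program below builds a one-row data set and prints it twice. The data set name work.sales_demo is hypothetical; the values follow the rows of the table of FORMAT statements above.

```sas
/* Hypothetical sample data: one date and two amounts. */
data work.sales_demo;
   input Date :mmddyy8. Net Gross;
   datalines;
06/05/03 1234 5678.9
;
run;

/* One FORMAT statement assigns a different format to each variable. */
/* Date prints as 06/05/03, Net as 1,234, and Gross as $5,678.90.    */
proc print data=work.sales_demo;
   format date mmddyy8. net comma5.0 gross dollar9.2;
run;

/* The formats applied only to the step above. Here the stored values */
/* print unformatted, and Date appears as a plain number.             */
proc print data=work.sales_demo;
run;
```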
Specifying SAS Formats
The table below describes some SAS formats that are commonly used in reports.
Commonly Used SAS Formats
| Format | Specifies values ... | Example |
| --- | --- | --- |
| COMMAw.d | that contain commas and decimal places | comma8.2 |
| DOLLARw.d | that contain dollar signs, commas, and decimal places | dollar6.2 |
| MMDDYYw. | as date values of the form 09/12/97 (MMDDYY8.) or 09/12/1997 (MMDDYY10.) | mmddyy10. |
| w. | rounded to the nearest integer in w spaces | 7. |
| w.d | rounded to d decimal places in w spaces | 8.2 |
| $w. | as character values in w spaces | $12. |
| DATEw. | as date values of the form 16OCT99 (DATE7.) or 16OCT1999 (DATE9.) | date9. |
Field Widths
All SAS formats specify the total field width (w) that is used for displaying the values in the output. For
example, suppose the longest value for the variable Net is a four-digit number, such as 5400. To specify
the COMMAw.d format for Net, you specify a field width of 5 or more. You must count the comma,
because it occupies a position in the output.
Note: When you use a SAS format, be sure to specify a field width (w) that is wide enough for the
largest possible value. Otherwise, values might not be displayed properly.
Specifying a Field Width (w) with the FORMAT Statement
Decimal Places
For numeric variables you can also specify the number of decimal places (d), if any, to be displayed in the
output. Numbers are rounded to the specified number of decimal places. In the example above, no decimal
places are displayed.
Writing the whole number 2030 as 2,030.00 requires eight print positions, including two decimal places
and the decimal point.
Whole Number Decimal Places
Formatting 15374 with a dollar sign, commas, and two decimal places requires ten print positions.
Specifying 10 Decimal Places
Examples
This table shows you how data values are displayed when different format, field width, and decimal place
specifications are used.
Displaying Data Values with Formats
Stored Value   Format        Displayed Value
38245.3975     COMMA12.2     38,245.40
38245.3975     12.2          38245.40
38245.3975     DOLLAR12.2    $38,245.40
38245.3975     DOLLAR9.2     $38245.40
38245.3975     DOLLAR8.2     38245.40
0              MMDDYY8.      01/01/60
0              MMDDYY10.     01/01/1960
0              DATE7.        01JAN60
0              DATE9.        01JAN1960
If a format is too small, the following message is written to the SAS log: “NOTE: At least one W.D format
was too small for the number to be printed. The decimal may be shifted by the 'BEST' format.”
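One way to experiment with field widths and decimal places is to write a value to the SAS log with several formats. The sketch below is only an illustration: the variable names Amount and SasDate are made up for this example, and the values are the stored values used in the table above.

```sas
/* Write the same stored values to the SAS log with different formats. */
data _null_;
   amount  = 38245.3975;    /* hypothetical variable holding the stored value */
   sasdate = 0;             /* SAS date value 0 is January 1, 1960            */
   put amount comma12.2;    /* commas and two decimal places                  */
   put amount 12.2;         /* two decimal places, no commas                  */
   put amount dollar12.2;   /* dollar sign, commas, two decimal places        */
   put amount dollar8.2;    /* width too narrow for the dollar sign and comma */
   put sasdate mmddyy10.;   /* date with a four-digit year                    */
   put sasdate date9.;      /* date in the form ddMMMyyyy                     */
run;
```

Comparing the log lines with the table shows how a narrower width changes what is displayed while the stored value stays the same.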
Using Permanently Assigned Labels and Formats
You have seen how to temporarily assign labels and formats to variables. When you use a LABEL or
FORMAT statement within a PROC step, the label or format applies only to the output from that step.
However, in your PROC steps, you can also take advantage of permanently assigned labels or formats.
Permanent labels and formats can be assigned in the DATA step. These labels and formats are saved with
the data set, and they can later be used by procedures that reference the data set.
For example, the DATA step below creates Sasuser.Paris and defines a format and label for the variable
Date. Because the LABEL and FORMAT statements are inside the DATA step, they are written to the
Sasuser.Paris data set and are available to the subsequent PRINT procedure.
data sasuser.paris;
set sasuser.laguardia;
where dest="PAR" and (boarded=155 or boarded=146);
label date='Departure Date';
format date date9.;
run;
proc print data=sasuser.paris label;
var date dest boarded;
run;
Using Permanent Labels and Formats
Notice that the PROC PRINT statement still requires the LABEL option in order to display the permanent
labels. Other SAS procedures display permanently assigned labels and formats without additional
statements or options.
You can learn about permanently assigning labels and formats in Creating and Managing Variables.
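If you want to confirm that a label and format were stored with a data set, one option is to look at the descriptor portion with PROC CONTENTS. The sketch below is a minimal illustration, assuming a made-up data set Work.FlightDemo with a single date variable; it is not one of the data sets used in this chapter.

```sas
/* Hypothetical data set: one date variable with a permanent label and format. */
data work.flightdemo;
   date = '05MAR2011'd;            /* SAS date constant */
   label date='Departure Date';
   format date date9.;
run;

/* The variable list in the PROC CONTENTS output shows the stored format and label. */
proc contents data=work.flightdemo;
run;

/* PROC PRINT uses the stored format automatically; LABEL is still needed for the label. */
proc print data=work.flightdemo label;
run;
```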
Additional Features
When you create list reports, you can use several other features to enhance your procedure output. For
example, you can
control where text strings split in labels by using the SPLIT= option.
proc print data=reps split='*';
var salesrep type unitsold net commission;
label salesrep='Sales*Representative';
run;
create your own formats, which are particularly useful for formatting character values.
proc format;
value $repfmt
'TFB'='Bynum'
'MDC'='Crowley'
'WKK'='King';
run;
proc print data=vcrsales;
var salesrep type unitsold;
format salesrep $repfmt.;
run;
You can learn more about user-defined formats in Creating and Applying User-Defined Formats.
Chapter Summary
Text Summary
Creating a Basic Report
To list the information in a SAS data set, you can use PROC PRINT. You use the PROC PRINT statement
to invoke the PRINT procedure and to specify the data set that you are listing. Include the DATA= option
to specify the data set that you are using. By default, PROC PRINT displays all observations and variables
in the data set, includes a column for observation numbers on the far left, and displays observations and
variables in the order in which they occur in the data set. If you use a LABEL statement with PROC
PRINT, you must specify the LABEL option in the PROC PRINT statement.
To refine a basic report, you can
select which variables and observations are processed
sort the data
generate column totals for numeric variables.
Selecting Variables
You can select variables and control the order in which they appear by using a VAR statement in your
PROC PRINT step. To remove the Obs column, you can specify the NOOBS option in the PROC PRINT
statement. As an alternative, you can replace the Obs column with one or more variables by using the ID
statement.
Selecting Observations
The WHERE statement enables you to select observations that meet a particular condition in the SAS data
set. You use comparison operators to express a condition in the WHERE statement. You can also use the
CONTAINS operator to express a condition in the WHERE statement. To specify a condition based on the
value of a character variable, you must enclose the value in quotation marks, and you must write the value
with lower and uppercase letters exactly as it appears in the data set. You can also use the WHERE
statement to select a subset of observations based on multiple conditions. To link a sequence of
expressions into compound expressions, you use logical operators. When you test for multiple values of
the same variable, you specify the variable name in each expression. You can use the IN operator as a
convenient alternative. To control how compound expressions are evaluated, you can use parentheses.
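The two steps below sketch these WHERE techniques. They assume that the Sashelp.Class sample data set supplied with SAS is available in your session; it is not one of the data sets used in this chapter.

```sas
/* Compound condition: the parentheses make SAS evaluate the OR before the AND. */
proc print data=sashelp.class;
   /* The character value is quoted and matches the case stored in the data set. */
   where sex='F' and (age=13 or age=14);
run;

/* The IN operator tests Age against a list without repeating the variable name. */
proc print data=sashelp.class;
   where sex='F' and age in (13,14);
run;
```

Both steps select the same observations; the second form is usually easier to extend when more values are added.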
Sorting Data
To display your data in sorted order, you use PROC SORT to sort your data before using PROC PRINT to
create reports. By default, PROC SORT sorts the data set specified in the DATA= option permanently. If
you do not want your data to be sorted permanently, you must create an output data set that contains the
data in sorted order. The OUT= option in the PROC SORT statement specifies an output data set. If you
need sorted data to produce output for only one SAS session, you should specify a temporary SAS data set
as the output data set. The BY statement, which is required with PROC SORT, specifies the variable(s)
whose values are used to sort the data.
Generating Column Totals
To total the values of numeric variables, use the SUM statement in the PROC PRINT step. You do not
need to specify the variables in a VAR statement if you specify them in the SUM statement. Column totals
appear at the end of the report in the same format as the values of the variables. To produce subtotals, add
both the SUM statement and the BY statement to your PROC PRINT step. To show BY variable headings
only once, use an ID and BY statement together with the SUM statement. As another enhancement to your
report, you can request that each BY group be printed on a separate page by using the PAGEBY statement.
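A minimal sketch of totals and subtotals follows. It assumes the Sashelp.Class sample data set is available, and the output data set name Work.ClassSorted is made up for this example.

```sas
/* BY-group processing in PROC PRINT expects the data to be sorted by the BY variable. */
proc sort data=sashelp.class out=work.classsorted;
   by sex;
run;

proc print data=work.classsorted;
   by sex;               /* one group, with subtotals, per value of Sex       */
   id sex;               /* same variable in ID and BY: heading shown once    */
   pageby sex;           /* start each BY group on a new page                 */
   var name;             /* Height and Weight are added by the SUM statement  */
   sum height weight;    /* subtotals for each group and a grand total        */
run;
```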
Double Spacing Output
To double space your SAS LISTING output, you can specify the DOUBLE option in the PROC PRINT
statement.
Specifying Titles
To make your report more meaningful and self-explanatory, you can associate up to 10 titles with
procedure output by using TITLE statements anywhere within or preceding the PROC step. After you
define a title, it remains in effect until you modify it, cancel it, or end your SAS session. Redefining a title
line cancels any higher-numbered title lines. To cancel all previous titles, specify a null TITLE statement
(a TITLE statement with no number or text).
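The following sketch illustrates how titles persist and how they are canceled. The title text is invented, and the steps assume that the Sashelp.Class sample data set is available.

```sas
title1 'Class Roster';
title2 'First Five Students';
proc print data=sashelp.class(obs=5);   /* both title lines appear */
run;

/* Redefining TITLE1 also cancels TITLE2, the higher-numbered line. */
title1 'Class Roster, Revised';
proc print data=sashelp.class(obs=5);   /* only the new TITLE1 appears */
run;

/* A null TITLE statement cancels all titles. */
title;
```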
Specifying Footnotes
To add footnotes to your output, you can use the FOOTNOTE statement. Like TITLE statements,
FOOTNOTE statements are global. Footnotes appear at the bottom of each page of procedure output, and
footnote lines are pushed up from the bottom. The FOOTNOTE statement that has the largest number
appears on the bottom line. After you define a footnote, it remains in effect until you modify it, cancel it,
or end your SAS session. Redefining a footnote line cancels any higher-numbered footnote lines. To
cancel all previous footnotes, specify a null FOOTNOTE statement (a FOOTNOTE statement with no
number or text).
Assigning Descriptive Labels
To label the columns in your report with more descriptive text, you use the LABEL statement, which
assigns a descriptive label to a variable. To display the labels that were assigned in a LABEL statement,
you must specify the LABEL option in the PROC PRINT statement.
Formatting Data Values
To make data values more understandable when they are displayed in your procedure output, you can use
the FORMAT statement, which associates formats with variables. The FORMAT statement remains in
effect only for the PROC step in which it appears. Formats affect only how the data values appear in
output, not the actual data values as they are stored in the SAS data set. All SAS formats can specify the
total field width (w) that is used for displaying the values in the output. For numeric variables you can also
specify the number of decimal places (d), if any, to be displayed in the output.
Using Permanently Assigned Labels and Formats
You can take advantage of permanently assigned labels or formats without adding LABEL or FORMAT
statements to your PROC step.
Syntax
LIBNAME libref 'SAS-data-library';
OPTIONS options;
PROC SORT DATA=SAS-data-set OUT=SAS-data-set;
BY variable(s);
RUN;
TITLE<n> 'text';
FOOTNOTE<n> 'text';
PROC PRINT DATA=SAS-data-set NOOBS LABEL DOUBLE;
BY <DESCENDING> BY-variable-1 <...<DESCENDING> <BY-variable-n>> <NOTSORTED>;
PAGEBY BY-variable;
ID variable(s);
VAR variable(s);
WHERE where-expression;
SUM variable(s);
LABEL variable1='label1' variable2='label2' ...;
FORMAT variable(s) format-name;
RUN;
Sample Program
libname clinic 'c:\stress\labdata';
options nodate number pageno=15;
proc sort data=clinic.stress out=work.maxrates;
by maxhr;
where tolerance='I' and resthr>60;
run;
title 'August Admission Fees';
footnote 'For High Activity Patients';
proc print data=work.maxrates label double noobs;
id name;
var resthr maxhr rechr;
label rechr='Recovery HR';
run;
proc print data=clinic.admit label;
var actlevel fee;
where actlevel='HIGH';
label fee='Admission Fee';
sum fee;
format fee dollar4.;
run;
Points to Remember
VAR, WHERE, SUM, FORMAT and LABEL statements remain in effect only for the PROC step in which they
appear.
If you don't use the OUT= option, PROC SORT permanently sorts the data set specified in the DATA= option.
TITLE and FOOTNOTE statements remain in effect until you modify them, cancel them, or end your SAS
session.
Be sure to match the quotation marks that enclose the text in TITLE, FOOTNOTE, and LABEL statements.
To display labels in PRINT procedure output, remember to add the LABEL option to the PROC PRINT
statement.
To permanently assign labels or formats to data set variables, place the LABEL or FORMAT statement inside the
DATA step.
Chapter Quiz
Select the best answer for each question. After completing the quiz, you can check your answers using the
answer key in the appendix.
1. Which PROC PRINT step below creates the following output?
1.
proc print data=flights.laguardia noobs;
var on changed flight;
where on>=160;
run;
2.
proc print data=flights.laguardia;
var date on changed flight;
where changed>3;
run;
3.
proc print data=flights.laguardia label;
id date;
var boarded transferred flight;
label boarded='On' transferred='Changed';
where flight='219';
run;
4.
proc print flights.laguardia noobs;
id date;
var date on changed flight;
where flight='219';
run;
2. Which of the following PROC PRINT steps is correct if labels are not stored with the data set?
1.
proc print data=allsales.totals label;
label region8='Region 8 Yearly Totals';
run;
2.
proc print data=allsales.totals;
label region8='Region 8 Yearly Totals';
run;
3.
proc print data allsales.totals label noobs;
run;
4.
proc print allsales.totals label;
run;
3. Which of the following statements selects from a data set only those observations for which the value of the
variable Style is RANCH, SPLIT, or TWOSTORY?
1.
where style='RANCH' or 'SPLIT' or 'TWOSTORY';
2.
where style in 'RANCH' or 'SPLIT' or 'TWOSTORY';
3.
where style in (RANCH, SPLIT, TWOSTORY);
4.
where style in ('RANCH','SPLIT','TWOSTORY');
4. If you want to sort your data and create a temporary data set named Calc to store the sorted data, which of the
following steps should you submit?
1.
proc sort data=work.calc out=finance.dividend;
run;
2.
proc sort dividend out=calc;
by account;
run;
3.
proc sort data=finance.dividend out=work.calc;
by account;
run;
4.
proc sort from finance.dividend to calc;
by account;
run;
5. Which options are used to create the following PROC PRINT output?
1. the DATE system option and the LABEL option in PROC PRINT
2. the DATE and NONUMBER system options and the DOUBLE and NOOBS options in PROC PRINT
3. the DATE and NONUMBER system options and the DOUBLE option in PROC PRINT
4. the DATE and NONUMBER system options and the NOOBS option in PROC PRINT
6. Which of the following statements can you use in a PROC PRINT step to create this output?
1.
var month instructors;
sum instructors aerclass walkjogrun swim;
2.
var month;
sum instructors aerclass walkjogrun swim;
3.
var month instructors aerclass;
sum instructors aerclass walkjogrun swim;
4. all of the above
7. What happens if you submit the following program?
proc sort data=clinic.diabetes;
run;
proc print data=clinic.diabetes;
var age height weight pulse;
where sex='F';
run;
1. The PROC PRINT step runs successfully, printing observations in their sorted order.
2. The PROC SORT step permanently sorts the input data set.
3. The PROC SORT step generates errors and stops processing, but the PROC PRINT step runs
successfully, printing observations in their original (unsorted) order.
4. The PROC SORT step runs successfully, but the PROC PRINT step generates errors and stops
processing.
8. If you submit the following program, which output does it create?
proc sort data=finance.loans out=work.loans;
by months amount;
run;
proc print data=work.loans noobs;
var months;
sum amount payment;
where months<360;
run;
1.
2.
3.
4.
9. Choose the statement below that selects rows in which
1. the amount is less than or equal to $5000
2. the account is 101-1092 or the rate equals 0.095.
1.
where amount <= 5000 and
account='101-1092' or rate = 0.095;
2.
where (amount le 5000 and account='101-1092')
or rate = 0.095;
3.
where amount <= 5000 and
(account='101-1092' or rate eq 0.095);
4.
where amount <= 5000 or account='101-1092'
and rate = 0.095;
10. What does PROC PRINT display by default?
1. PROC PRINT does not create a default report; you must specify the rows and columns to be displayed.
2. PROC PRINT displays all observations and variables in the data set. If you want an additional column
for observation numbers, you can request it.
3. PROC PRINT displays columns in the following order: a column for observation numbers, all character
variables, and all numeric variables.
4. PROC PRINT displays all observations and variables in the data set, a column for observation numbers
on the far left, and variables in the order in which they occur in the data set.
Chapter 5 Creating SAS Data Sets from External Files
Overview
Raw Data Files
Steps to Create a SAS Data Set from a Raw Data File
Referencing a SAS Library
Referencing a Raw Data File
Writing a DATA Step Program
Submitting the DATA Step Program
Creating and Modifying Variables
Subsetting Data
Reading Instream Data
Creating a Raw Data File
Additional Features
Reading Microsoft Excel Data
LIBNAME Statement Options
Creating Excel Worksheets
The IMPORT Wizard
Chapter Summary
Chapter Quiz
Overview
Introduction
In order to create reports with SAS procedures, your data must be in the form of a SAS data set. If your
data is not stored in the form of a SAS data set, you need to create a SAS data set by entering data, by
reading raw data, or by accessing files that were created by other software.
This chapter shows you how to design and write DATA step programs to create SAS data sets from raw
data that is stored in an external file and from data stored in Microsoft Excel worksheets. It also shows you
how to read data from a SAS data set and write observations out to these destinations.
Regardless of the input data source — raw files or Excel worksheets — you use the DATA step to read in
the data and create the SAS data set.
Using the DATA Step to Create SAS Data Sets
Objectives
In this chapter, you learn to
reference a SAS library
reference a raw data file
name a SAS data set to be created
specify a raw data file to be read
read standard character and numeric values in fixed fields
create new variables and assign values
select observations based on conditions
read instream data
submit and verify a DATA step program
read a SAS data set and write the observations out to a raw data file
use the DATA step to create a SAS data set from an Excel worksheet
use the SAS/ACCESS LIBNAME statement to read from an Excel worksheet
create an Excel worksheet from a SAS data set
use the IMPORT procedure to read external files.
Raw Data Files
A raw data file is an external text file whose records contain data values that are organized in fields. Raw
data files are non-proprietary and can be read by a variety of software programs. The sample raw data files
in this chapter are shown with a ruler to help you identify where individual fields begin and end. The ruler
is not part of the raw data file.
Raw Data File
The table below describes the record layout for a raw data file that contains readings from exercise stress
tests that have been performed on patients at a health clinic. Exercise physiologists in the clinic use the test
results to prescribe various exercise therapies. The file contains fixed fields. That is, values for each
variable are in the same location in all records.
Record Layout for Raw Data
Field Name   Starting Column   Ending Column   Description of Field
ID           1                 4               patient ID number
Name         6                 25              patient name
RestHR       27                29              resting heart rate
MaxHR        31                33              maximum heart rate during test
RecHR        35                37              recovery heart rate after test
TimeMin      39                40              time, complete minutes
TimeSec      42                43              time, seconds
Tolerance    45                45              comparison of stress test tolerance between this test
                                               and the last test (I=increased, D=decreased, S=same,
                                               N=no previous test)
Steps to Create a SAS Data Set from a Raw Data File
Let's look at the steps for creating a SAS data set from a raw data file. In the first part of this chapter, you
will follow these steps to create a SAS data set from a raw data file that contains fixed fields.
Before reading raw data from a file, you might need to reference the SAS library in which you will store
the data set. Then you can write a DATA step program to read the raw data file and create a SAS data set.
To read the raw data file, the DATA step must provide the following instructions to SAS:
the location or name of the external text file
a name for the new SAS data set
a reference that identifies the external file
a description of the data values to be read.
After using the DATA step to read the raw data, you can use a PROC PRINT step to produce a report that
displays the data values that are in the new data set.
The table below outlines the basic statements that are used in a program that reads raw data in fixed fields.
Throughout this chapter, you'll see similar tables that show sample SAS statements.
Basic Statements for Reading Raw Data
To do this...                 Use this SAS statement...
Reference SAS data library    LIBNAME statement
Reference external file       FILENAME statement
Name SAS data set             DATA statement
Identify external file        INFILE statement
Describe data                 INPUT statement
Execute DATA step             RUN statement
Print the data set            PROC PRINT statement
Execute final program step    RUN statement
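Put together, the statements in the table might look like the sketch below. It reuses names that appear elsewhere in this guide (the libref Clinic, the fileref Tests, and the stress-test record layout). The library path comes from an earlier sample program and both paths are assumptions about your environment, so adjust them before you run the program.

```sas
libname clinic 'c:\stress\labdata';      /* reference the SAS library    */
filename tests 'c:\users\tmill.dat';     /* reference the external file  */

data clinic.stress;                      /* name the SAS data set        */
   infile tests;                         /* identify the external file   */
   input ID $ 1-4 Name $ 6-25            /* describe the data            */
         RestHR 27-29 MaxHR 31-33
         RecHR 35-37 TimeMin 39-40
         TimeSec 42-43 Tolerance $ 45;
run;                                     /* execute the DATA step        */

proc print data=clinic.stress;           /* print the data set           */
run;                                     /* execute the final step       */
```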
You can also use additional SAS statements to perform tasks that customize your data for your needs. For
example, you may want to create new variables from the values of existing variables.
Referencing a SAS Library
Using a LIBNAME Statement
As you begin to write the program, remember that you might need to use a LIBNAME statement to
reference the permanent SAS library in which the data set will be stored.
To do this...              Use this SAS statement...   Example
Reference a SAS library    LIBNAME statement           libname taxes 'c:\users\name\sasuser';
For example, the LIBNAME statement below assigns the libref Taxes to the SAS library in the folder
C:\Acct\Qtr1\Report in the Windows environment.
libname taxes 'c:\acct\qtr1\report';
You do not need to use a LIBNAME statement in all situations. For example, you can omit it if you are
storing the data in a temporary SAS data set or if SAS has automatically assigned the libref for the
permanent library that you are using.
Many of the examples in this chapter use the libref Sasuser, which SAS automatically assigns.
Referencing a Raw Data File
Using a FILENAME Statement
When reading raw data, you can use the FILENAME statement to point to the location of the external file
that contains the data. Just as you assign a libref by using a LIBNAME statement, you assign a fileref by
using a FILENAME statement.
Referencing a Raw Data File
To do this...                Use this SAS statement...   Example
Reference a SAS library      LIBNAME statement           libname libref 'SAS-data-library';
Reference an external file   FILENAME statement          filename tests 'c:\users\tmill.dat';
Filerefs perform the same function as librefs: they temporarily point to a storage location for data.
However, librefs reference SAS data libraries, whereas filerefs reference external files.
Filerefs and Librefs
General form, FILENAME statement:
FILENAME fileref 'filename';
where
fileref is a name that you associate with an external file. The name must be 1 to 8 characters long, begin with a
letter or underscore, and contain only letters, numerals, or underscores.
'filename' is the fully qualified name or location of the file.
Defining a Fully Qualified Filename
The following FILENAME statement temporarily associates the fileref Tests with the external file that
contains the data from the exercise stress tests. The complete filename is specified as C:\Users\Tmill.dat
in the Windows environment.
filename tests 'c:\users\tmill.dat';
File Location
Defining an Aggregate Storage Location
You can also use a FILENAME statement to associate a fileref with an aggregate storage location, such as
a directory that contains multiple external files.
Aggregate Storage Location
This FILENAME statement temporarily associates the fileref Finance with the aggregate storage directory
C:\Users\Personal\Finances:
filename finance 'c:\users\personal\finances';
Viewing Active Filerefs
Like librefs, the filerefs currently defined for your SAS session are listed in the SAS Explorer window.
To view details about a referenced file, double-click File Shortcuts (or select File Shortcuts and then
Open from the pop-up menu). Then select Details from the View menu. Information for each file (name,
size, type, and host path name) is listed.
Both the LIBNAME and FILENAME statements are global. In other words, they remain in effect until you
change them, cancel them, or end your SAS session.
Referencing a Fully Qualified Filename
When you associate a fileref with an individual external file, you specify the fileref in subsequent SAS
statements and commands.
Referencing a Fully Qualified Filename
Referencing a File in an Aggregate Storage Location
To reference an external file with a fileref that points to an aggregate storage location, you specify the
fileref followed by the individual filename in parentheses:
Referencing a File in an Aggregate Storage Location
In the Windows environment, you can omit the filename extension but you will need to add quotation
marks when referencing an external file, as in
infile tax('refund');
For details on referencing external files stored in aggregate storage locations, see the SAS documentation
for your operating environment.
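As a rough sketch of the aggregate-location syntax, the step below reuses the Finance fileref defined earlier. The member file refund.dat, the output data set name, and the column layout are all hypothetical and serve only to show where the parenthesized filename goes.

```sas
/* Fileref for a directory (an aggregate storage location). */
filename finance 'c:\users\personal\finances';

data work.refund;
   /* refund.dat is a hypothetical file inside that directory. */
   infile finance('refund.dat');
   /* Hypothetical fixed-field layout for the file. */
   input Account $ 1-8 Amount 10-17;
run;
```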
Writing a DATA Step Program
Naming the Data Set
The DATA statement indicates the beginning of the DATA step and names the SAS data set to be created.
Naming the Data Set
To do this...                Use this SAS statement...   Example
Reference a SAS library      LIBNAME statement           libname libref 'SAS-data-library';
Reference an external file   FILENAME statement          filename tests 'c:\users\tmill.dat';
Name a SAS data set          DATA statement              data clinic.stress;
General form, basic DATA statement:
DATA SAS-data-set-1 <...SAS-data-set-n>;
where SAS-data-set names (in the format libref.filename) the data set or data sets to be created.
Remember that the SAS data set name is a two-level name. For example, the two-level name Clinic.Admit
specifies that the data set Admit is stored in the permanent SAS library to which the libref Clinic has been
assigned.
Two-Level Names
Specifying the Raw Data File
When reading raw data, use the INFILE statement to indicate which file the data is in.
Specifying the Raw Data File
To do this...                Use this SAS statement...   Example
Reference a SAS library      LIBNAME statement           libname libref 'SAS-data-library';
Reference an external file   FILENAME statement          filename tests 'c:\users\tmill.dat';
Name a SAS data set          DATA statement              data clinic.stress;
Identify an external file    INFILE statement            infile tests;
General form, INFILE statement:
INFILE file-specification <options>;
where
file-specification can take the form fileref to name a previously defined file reference or 'filename' to point to the
actual name and location of the file
options describe the input file's characteristics and specify how it is to be read with the INFILE statement.
To read the raw data file to which the fileref Tests has been assigned, you write the following INFILE
statement:
infile tests;
Instead of using a FILENAME statement, you can choose to identify the raw data file by specifying the
entire filename and location in the INFILE statement. For example, the following statement points directly
to the C:\Irs\Personal\Refund.dat file:
infile 'c:\irs\personal\refund.dat';
Column Input
Column input specifies actual column locations for values. However, column input is appropriate only in
certain situations. When you use column input, your data must be
standard character or numeric values
in fixed fields.
Standard and Nonstandard Numeric Data
Standard numeric data values can contain only
numerals
decimal points
numbers in scientific or E-notation (2.3E4, for example)
plus or minus signs.
Nonstandard numeric data includes
values that contain special characters, such as percent signs (%), dollar signs ($), and commas (,)
date and time values
data in fraction, integer binary, real binary, and hexadecimal forms.
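To see standard numeric values read with column input, you can try a small instream example such as the sketch below. The data set name, the variable name Reading, and the values are made up for illustration.

```sas
/* Hypothetical values in columns 1-6; each one is standard numeric data. */
data work.standarddemo;
   input Reading 1-6;
   datalines;
2.3E4
-15
+7.5
120
;
run;

/* The E-notation value 2.3E4 is stored as 23000. */
proc print data=work.standarddemo;
run;
```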
The external file that is referenced by the fileref Staff contains the personnel information for a technical
writing department of a small computer manufacturer. The fields contain values for each employee's last
name, first name, job code, and annual salary.
Notice that the values for Salary contain commas. So, the values for Salary are considered to be
nonstandard numeric values. You cannot use column input to read these values.
Raw Data File
Fixed-Field Data
Raw data can be organized in several different ways.
The following external delimited file contains data that is free-format, meaning that it is not arranged in
columns. Notice that the values for a particular field do not begin and end in the same columns.
You cannot use column input to read this file.
Fixed Field Data
The following external file contains data that is arranged in columns or fixed fields. You can specify a
beginning and ending column for each field. Let's look at how column input can be used to read this data.
External File with Columns
If you are not familiar with the content and structure of your raw data files, you can use PROC FSLIST to
view them.
Describing the Data
The INPUT statement describes the fields of raw data to be read and placed into the SAS data set.
Describing the Data
To do this...                Use this SAS statement...   Example
Reference a SAS library      LIBNAME statement           libname libref 'SAS-data-library';
Reference an external file   FILENAME statement          filename tests 'c:\users\tmill.dat';
Name a SAS data set          DATA statement              data clinic.stress;
Identify an external file    INFILE statement            infile tests;
Describe data                INPUT statement             input ID $ 1-4 Name $ 6-25 ...;
Execute the DATA step        RUN statement               run;
General form, INPUT statement using column input:
INPUT variable <$> startcol-endcol . . .;
where
variable is the SAS name that you assign to the field
the dollar sign ($) identifies the variable type as character (if the variable is numeric, then nothing appears here)
startcol represents the starting column for this variable
endcol represents the ending column for this variable.
Look at the small data file shown below. For each field of raw data that you want to read into your SAS
data set, you must specify the following information in the INPUT statement:
a valid SAS variable name
a type (character or numeric)
a range (starting column and ending column).
Raw Data File
The INPUT statement below assigns the character variable ID to the data in columns 1-4, the numeric
variable Age to the data in columns 6-7, the character variable ActLevel to the data in columns 9-12, and
the character variable Sex to the data in column 14.
filename exer 'c:\users\exer.dat';
data exercise;
infile exer;
input ID $ 1-4 Age 6-7 ActLevel $ 9-12 Sex $ 14;
run;
Assigning Column Ranges to Variables
When you use column input, you can
read any or all fields from the raw data file
read the fields in any order
specify only the starting column for values that occupy only one column.
input ActLevel $ 9-12 Sex $ 14 Age 6-7;
Remember that when you name a new variable, you must specify the name in the exact case that you want
it stored, for example NewBalance. Thereafter, you can specify the name in uppercase, lowercase, or
mixed case.
Specifying Variable Names
Each variable has a name that conforms to SAS naming conventions. Variable names
must be 1 to 32 characters in length
must begin with a letter (A-Z) or an underscore (_)
can continue with any combination of numerals, letters, or underscores.
Let's look at an INPUT statement that uses column input to read the three data fields in the raw data file
below.
Raw Data File
The values for the variable Age are located in columns 1-2. Because Age is a numeric variable, you do not
specify a dollar sign ($) after the variable name.
input Age 1-2
The values for the variable ActLevel are located in columns 3-6. You specify a $ to indicate that ActLevel
is a character variable.
input Age 1-2 ActLevel $ 3-6
The values for the character variable Sex are located in column 7. Notice that you specify only a single
column.
input Age 1-2 ActLevel $ 3-6 Sex $ 7;
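To try this INPUT statement without an external file, you could supply a few records instream, as in the sketch below. The records and the data set name are invented for illustration; only the column positions follow the layout described above.

```sas
/* Hypothetical records: Age in columns 1-2, ActLevel in 3-6, Sex in 7. */
data work.exercisedemo;
   input Age 1-2 ActLevel $ 3-6 Sex $ 7;
   datalines;
27MOD F
34HIGHM
31LOW M
;
run;

proc print data=work.exercisedemo;
run;
```

The data lines start in column 1 so that each value falls in the columns named in the INPUT statement.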
Submitting the DATA Step Program
Verifying the Data
To verify your data, it is a good idea to use the OBS= option in the INFILE statement. Adding OBS=n to
the INFILE statement enables you to process only records 1 through n, so you can verify that the correct
fields are read before reading the entire data file.
The program below reads the first ten records in the raw data file referenced by the fileref Tests. The data
is stored in a permanent SAS data set, named Sasuser.Stress. Don't forget a RUN statement, which tells
SAS to execute the previous SAS statements.
data sasuser.stress;
infile tests obs=10;
input ID $ 1-4 Name $ 6-25
RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40
TimeSec 42-43 Tolerance $ 45;
run;
SAS Data Set sasuser.stress
Checking DATA Step Processing
If you submit the DATA step below, it runs successfully.
data sasuser.stress;
infile tests obs=10;
input ID $ 1-4 Name $ 6-25
RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40
TimeSec 42-43 Tolerance $ 45;
run;
Messages in the log verify that the raw data file was read correctly. The notes in the log indicate that
10 records were read from the raw data file
the SAS data set Sasuser.Stress was created with 10 observations and 8 variables.
SAS Log
Printing the Data Set
The messages in the log seem to indicate that the DATA step program correctly accessed the raw data file.
But it is a good idea to look at the ten observations in the new data set before reading the entire raw data
file. You can submit a PROC PRINT step to view the data.
Printing the Data Set
To do this...                   Use this SAS statement...   Example
Reference a SAS library         LIBNAME statement           libname libref 'SAS-data-library';
Reference an external file      FILENAME statement          filename tests 'c:\users\tmill.dat';
Name a SAS data set             DATA statement              data clinic.stress;
Identify an external file       INFILE statement            infile tests obs=10;
Describe data                   INPUT statement             input ID $ 1-4 Name $ 6-25 ...;
Execute the DATA step           RUN statement               run;
Print the data set              PROC PRINT statement        proc print data=clinic.stress;
Execute the final program step  RUN statement               run;
The following PROC PRINT step prints the Sasuser.Stress data set.
proc print data=sasuser.stress;
run;
The PROC PRINT output indicates that the variables in the Sasuser.Stress data set were read correctly for
the first ten records.
PROC PRINT Output
Reading the Entire Raw Data File
Now that you've checked the log and verified your data, you can modify the DATA step to read the entire
raw data file. To do so, remove the OBS= option from the INFILE statement and re-submit the program.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25
RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40
TimeSec 42-43 Tolerance $ 45;
run;
Invalid Data
When you submit the revised DATA step and check the log, you see a note indicating that invalid data
appears for the variable RecHR in line 14 of the raw data file, columns 35-37.
This note is followed by a column ruler and the actual data line that contains the invalid value for RecHR.
SAS Log
The value Q13 is a data-entry error. It was entered incorrectly for the variable RecHR.
RecHR is a numeric variable, but Q13 is not a valid number. So RecHR is assigned a missing value, as
indicated in the log. Because RecHR is numeric, the missing value is represented with a period.
Notice, though, that the DATA step does not fail as a result of the invalid data but continues to execute.
Unlike syntax errors, invalid data errors do not cause SAS to stop processing a program.
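The behavior described above can be reproduced on a small scale. The sketch below assumes a hypothetical data set name (work.invalid_demo) and three made-up records, one of which carries the same kind of non-numeric value (Q13) in a numeric field.

```sas
/* Hypothetical demo: the data set name and records are illustrative only. */
data work.invalid_demo;
   input ID $ 1-4 RecHR 6-8;   /* RecHR is numeric, read from columns 6-8 */
   datalines;
2458 128
2462 Q13
2501 139
;

proc print data=work.invalid_demo;
run;
```

The step should still create three observations. The log notes invalid data for RecHR on the second record, and PROC PRINT shows a period for that value.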
Assuming that you have a way to edit the file and can justify a correction, you can correct the invalid value
and rerun the DATA step. If you did this, the log would then show that the data set Sasuser.Stress was
created with 21 observations, 8 variables, and no messages about invalid data.
SAS Log
After correcting the raw data file, you can print the data again to verify that it is correct.
proc print data=sasuser.stress;
run;
PROC PRINT Output
Whenever you use the DATA step to read raw data, remember the steps that you followed in this chapter,
which help ensure that you don't waste resources when accessing data:
write the DATA step using the OBS= option in the INFILE statement
submit the DATA step
check the log for messages
view the resulting data set
remove the OBS= option and resubmit the DATA step
check the log again
view the resulting data set again.
Creating and Modifying Variables
Overview
So far in this chapter, you've read existing data. But sometimes existing data doesn't provide the
information you need. To modify existing values or to create new variables, you can use an assignment
statement in any DATA step.
General form, assignment statement:
variable=expression;
where
variable names a new or existing variable
expression is any valid SAS expression
The assignment statement is one of the few SAS statements that doesn't begin with a keyword.
For example, here is an assignment statement that assigns the character value Toby Witherspoon to the
variable Name:
Name='Toby Witherspoon';
SAS Expressions
You use SAS expressions in assignment statements and many other SAS programming statements to
transform variables
create new variables
conditionally process variables
calculate new values
assign new values.
An expression is a sequence of operands and operators that form a set of instructions. The instructions are
performed to produce a new value:
Operands are variable names or constants. They can be numeric, character, or both.
Operators are special-character operators, grouping parentheses, or functions. You can learn about functions in
Transforming Data with SAS Functions.
Using Operators in SAS Expressions
To perform a calculation, you use arithmetic operators. The table below lists arithmetic operators.
Using Operators in SAS Expressions
Operator  Action           Example        Priority
-         negative prefix  negative=-x;   I
**        exponentiation   raise=x**y;    I
*         multiplication   mult=x*y;      II
/         division         divide=x/y;    II
+         addition         sum=x+y;       III
-         subtraction      diff=x-y;      III
When you use more than one arithmetic operator in an expression,
operations of priority I are performed before operations of priority II, and so on
consecutive operations that have the same priority are performed
o from right to left within priority I
o from left to right within priority II and III
you can use parentheses to control the order of operations.
Note: When a value that is used with an arithmetic operator is missing, the result of the expression is
missing. The assignment statement assigns a missing value to a variable if the result of the expression is
missing.
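The priority rules and the missing-value note can be checked with a short DATA _NULL_ step that writes its results to the log. This is only a sketch; the variable names and constants are arbitrary.

```sas
/* Illustrative values only; results are written to the SAS log. */
data _null_;
   x=2; y=3; m=.;
   a = x + y * 4;       /* priority II before III: 2 + 12 = 14      */
   b = (x + y) * 4;     /* parentheses change the order: 5 * 4 = 20 */
   c = 2 ** 3 ** 2;     /* priority I, right to left: 2 ** 9 = 512  */
   d = m + 1;           /* a missing operand gives a missing result */
   put a= b= c= d=;
run;
```

The PUT statement writes a=14 b=20 c=512 d=. to the log, and SAS adds a note that missing values were generated.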
You use the following comparison operators to express a condition.
Comparison Operators
Operator   Meaning                   Example
= or eq    equal to                  name='Jones, C.'
^= or ne   not equal to              temp ne 212
> or gt    greater than              income>20000
< or lt    less than                 partno lt "BG05"
>= or ge   greater than or equal to  id>='1543'
<= or le   less than or equal to     pulse le 85
To link a sequence of expressions into compound expressions, you use logical operators, including the
following:
Logical Operators
Operator   Meaning
AND or &   and, both. If both expressions are true, then the compound expression is true.
OR or |    or, either. If either expression is true, then the compound expression is true.
More Examples of Assignment Statements
The assignment statement in the DATA step below creates a new variable, TotalTime, by multiplying the
values of TimeMin by 60 and then adding the values of TimeSec.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
TotalTime=(timemin*60)+timesec;
run;
Assignment Statement Output
The expression can also contain the variable name that is on the left side of the equal sign, as the following
assignment statement shows. This statement re-defines the values of the variable RestHR as 10 percent
higher.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
resthr=resthr+(resthr*.10);
run;
When a variable name appears on both sides of the equal sign, the original value on the right side is used to
evaluate the expression. The result is assigned to the variable on the left side of the equal sign.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
resthr=resthr+(resthr*.10);   /* left of the equal sign: result; right of it: original value */
run;
Date Constants
You can assign date values to variables in assignment statements by using date constants. To represent a
constant in SAS date form, specify the date as 'ddmmmyy' or 'ddmmmyyyy', immediately followed by a D.
General form, date constant:
'ddmmmyy'd
or
"ddmmmyy"d
where
dd is a one- or two-digit value for the day
mmm is a three-letter abbreviation for the month (JAN, FEB, and so on)
yy or yyyy is a two- or four-digit value for the year, respectively.
Be sure to enclose the date in quotation marks.
Example
In the following program, the second assignment statement assigns a date value to the variable TestDate.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
TotalTime=(timemin*60)+timesec;
TestDate='01jan2000'd;
run;
You can also use SAS time constants and SAS datetime constants in assignment statements.
Time='9:25't;
DateTime='18jan2005:9:27:05'dt;
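A quick way to see what these constants store is to write them to the log. The sketch below relies on the standard SAS conventions that a date value is the number of days since January 1, 1960, and a time value is the number of seconds since midnight. The variable names are arbitrary.

```sas
/* Illustrative check of date and time constants; output goes to the log. */
data _null_;
   TestDate='01jan2000'd;
   Time='9:25't;
   put TestDate=;          /* unformatted: 14610 days since 01JAN1960  */
   put TestDate= date9.;   /* formatted:   01JAN2000                   */
   put Time=;              /* unformatted: 33900 seconds past midnight */
run;
```

Because the stored values are plain numbers, they can be used in arithmetic expressions like any other numeric variable.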
Subsetting Data
As you read your data, you can subset it by processing only those observations that meet a specified
condition. To do this, you can use a subsetting IF statement in any DATA step.
Using a Subsetting IF Statement
The subsetting IF statement causes the DATA step to continue processing only those observations that
meet the condition of the expression specified in the IF statement. The resulting SAS data set or data sets
contain a subset of the original external file or SAS data set.
General form, subsetting IF statement:
IF expression;
where expression is any valid SAS expression.
If the expression is true, the DATA step continues to process that observation.
If the expression is false, no further statements are processed for that observation, and control returns to the top of
the DATA step.
Example
The subsetting IF statement below selects only observations whose values for Tolerance are D. It is
positioned in the DATA step so that other statements do not need to process unwanted observations.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
run;
Because Tolerance is a character variable, the value D must be enclosed in quotation marks, and it must be
the same case as in the data set.
See the SAS documentation for a comparison of the WHERE and subsetting IF statements when they are
used in the DATA step.
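A subsetting IF statement can also use the comparison and logical operators listed earlier to form a compound condition. The sketch below is an assumption-laden variation on the example above: the output data set name (work.stress_long) and the 750-second threshold are hypothetical, and the fileref Tests must already be assigned.

```sas
/* Hypothetical variation: work.stress_long and the 750-second cutoff are illustrative. */
data work.stress_long;
   infile tests;
   input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
         RecHR 35-37 TimeMin 39-40 TimeSec 42-43
         Tolerance $ 45;
   TotalTime=(timemin*60)+timesec;
   /* keep an observation only when both conditions are true */
   if tolerance='D' and TotalTime ge 750;
run;
```

Here the assignment statement comes before the subsetting IF because the condition refers to TotalTime, which must be computed before it can be tested.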
Reading Instream Data
Overview
Throughout this chapter, our program has contained an INFILE statement that identifies an external file to
read.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
run;
However, you can also read instream data lines that you enter directly in your SAS program, rather than
data that is stored in an external file. Reading instream data is extremely helpful if you want to create data
and test your programming statements on a few observations that you can specify according to your needs.
To read instream data, you use
a DATALINES statement as the last statement in the DATA step and immediately preceding the data lines
a null statement (a single semicolon) to indicate the end of the input data.
data sasuser.stress;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
datalines;
.
.
.
data lines go here
.
.
.
;
General form, DATALINES statement:
DATALINES;
You can use only one DATALINES statement in a DATA step. Use separate DATA steps to enter multiple sets of
data.
You can also use LINES; or CARDS; as the last statement in a DATA step and immediately preceding the data
lines. Both LINES and CARDS are aliases for the DATALINES statement.
If your data contains semicolons, use the DATALINES4 statement plus a null statement that consists of four
semicolons (;;;;).
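As a sketch of the DATALINES4 case, the step below reads comment text that itself contains semicolons. The data set name work.notes, the variable Comment, and the two records are hypothetical.

```sas
/* Hypothetical data: the semicolons inside the records are part of the values. */
data work.notes;
   input ID $ 1-4 Comment $ 6-30;
   datalines4;
2458 Stopped early; dizzy
2462 Completed; no issues
;;;;

proc print data=work.notes;
run;
```

With a plain DATALINES statement, SAS would treat the first embedded semicolon as the end of the data, so the four-semicolon terminator is what keeps these records intact.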
Example
To read the data for the treadmill stress tests as instream data, you can submit the following program:
data sasuser.stress;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
datalines;
2458 Murray, W             72 185 128 12 38 D
2462 Almers, C             68 171 133 10  5 I
2501 Bonaventure, T        78 177 139 11 13 I
2523 Johnson, R            69 162 114  9 42 S
2539 LaMance, K            75 168 141 11 46 D
2544 Jones, M              79 187 136 12 26 N
2552 Reberson, P           69 158 139 15 41 D
2555 King, E               70 167 122 13 13 I
2563 Pitts, D              71 159 116 10 22 S
2568 Eberhardt, S          72 182 122 16 49 N
2571 Nunnelly, A           65 181 141 15  2 I
2572 Oberon, M             74 177 138 12 11 D
2574 Peterson, V           80 164 137 14  9 D
2575 Quigley, M            74 152 113 11 26 I
2578 Cameron, L            75 158 108 14 27 I
2579 Underwood, K          72 165 127 13 19 S
2584 Takahashi, Y          76 163 135 16  7 D
2586 Derber, B             68 176 119 17 35 N
2588 Ivan, H               70 182 126 15 41 N
2589 Wilcox, E             78 189 138 14 57 I
2595 Warren, C             77 170 136 12 10 S
;
Notice that you do not need a RUN statement following the null statement (the semicolon after the data
lines). The DATALINES statement functions as a step boundary, so the DATA step is executed as soon as
SAS encounters it.
Creating a Raw Data File
Overview
Look at the SAS program and SAS data set that you created earlier in this chapter.
data sasuser.stress;
infile tests;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
run;
SAS Data Set sasuser.stress Output
As you can see, the data set has been modified with SAS statements. If you wanted to write the new
observations to a raw data file, you could reverse the process that you've been following and write out the
observations from a SAS data set as records or lines to a new raw data file.
Using the _NULL_ Keyword
Because the goal of your SAS program is to create a raw data file and not a SAS data set, it is inefficient to
name a data set in the DATA statement. Instead, use the keyword _NULL_, which enables you to use
the DATA step without actually creating a SAS data set. A SET statement specifies the SAS data set that
you want to read from.
data _null_;
set sasuser.stress;
The next step is to specify the output file.
Specifying the Raw Data File
You use the FILE and PUT statements to write the observations from a SAS data set to a raw data file, just
as you used the INFILE and INPUT statements to create a SAS data set. These two sets of statements work
almost identically.
When writing observations to a raw data file, use the FILE statement to specify the output file.
General form, FILE statement:
FILE file-specification <options> <operating-environment-options>;
where
file-specification can take the form fileref to name a previously defined file reference or 'filename' to point to the
actual name and location of the file
options names options that are used in creating the output file
operating-environment-options names options that are specific to an operating environment (for more
information, see the SAS documentation for your operating environment).
For example, if you want to read the Sasuser.Stress data set and write it to a raw data file that is referenced
by the fileref Newdat, you would begin your program with the following SAS statements.
data _null_;
set sasuser.stress;
file newdat;
Instead of identifying the raw data file with a SAS fileref, you can choose to specify the entire filename
and location in the FILE statement. For example, the following FILE statement points directly to the
C:\Clinic\Patients\Stress.dat file. Note that the path specifying the filename and location must be
enclosed in quotation marks.
data _null_;
set sasuser.stress;
file 'c:\clinic\patients\stress.dat';
Describing the Data
Whereas the FILE statement specifies the output raw data file, the PUT statement describes the lines to
write to the raw data file.
General form, PUT statement using column output:
PUT variable startcol-endcol...;
where
variable is the name of the variable whose value is written
startcol indicates where in the line to begin writing the value
endcol indicates where in the line to end the value.
In general, the PUT statement mirrors the capabilities of the INPUT statement. In this case you are
working with column output. Therefore, you need to specify the variable name, starting column, and
ending column for each field that you want to create. Because you are creating raw data, you don't need to
follow character variable names with a dollar sign ($).
data _null_;
set sasuser.stress;
file 'c:\clinic\patients\stress.dat';
put id 1-4 name 6-25 resthr 27-29 maxhr 31-33
rechr 35-37 timemin 39-40 timesec 42-43
tolerance 45 totaltime 47-49;
run;
SAS Data Set sasuser.stress Output with PUT Statement
The resulting raw data file would look like this.
Creating a Raw Data File
In later chapters you'll learn how to use INPUT and PUT statements to read and write raw data in other
forms and record types.
Note: If you do not execute a FILE statement before a PUT statement in the current iteration of the
DATA step, SAS writes the lines to the SAS log. If you specify the PRINT fileref in the FILE statement,
before the PUT statement, SAS writes the lines to the procedure output file.
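The two destinations mentioned in this note can be compared with a pair of short steps. This is a sketch that assumes the Sasuser.Stress data set created earlier in the chapter exists; only a few of its variables are written.

```sas
/* No FILE statement: the PUT lines are written to the SAS log. */
data _null_;
   set sasuser.stress;
   put id 1-4 name 6-25 tolerance 45;
run;

/* FILE PRINT: the same lines are written to the procedure output file. */
data _null_;
   set sasuser.stress;
   file print;
   put id 1-4 name 6-25 tolerance 45;
run;
```

Writing to the log first is a convenient way to check the column layout before directing the output to an external file.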
Additional Features
In this section, you learned to read raw data by using an INPUT statement that uses column input. You
also learned how to write to a raw data file by using the FILE and PUT statements with column output. However,
column input is appropriate only in certain situations. When you use column input, your data must be
standard character and numeric values. If the raw data file contains nonstandard values, then you need to use
formatted input, another style of input. To learn about formatted input, see Reading SAS Data Sets.
in fixed fields. That is, values for a particular variable must be in the same location in all records. If your raw data
file contains values that are not in fixed fields, you need to use list input. To learn about list input, see Reading
Free-Format Data.
Other forms of the INPUT statement enable you to read
nonstandard data values such as hexadecimal, packed decimal, dates, and monetary values that contain dollar
signs and commas
free-format data (data that is not in fixed fields)
implied decimal points
variable-length data values
variable-length records
different record types.
Reading Microsoft Excel Data
Overview
In addition to reading raw data files, SAS can also read Microsoft Excel data. Whether the input data
source is a SAS data set, a raw data file, or a file from another application, you use the DATA step to
create a SAS data set. The difference between reading these various types of input is in how you reference
the data. To read in Excel data you use one of the following methods:
SAS/ACCESS LIBNAME statement
Import Wizard
Remember, the Base SAS LIBNAME statement associates a SAS name (libref) with a SAS data library
by pointing to its physical location. The SAS/ACCESS LIBNAME statement, by contrast, associates a SAS name
with an Excel workbook file by pointing to its location.
In doing so, the Excel workbook becomes a new library in SAS, and the worksheets in the workbook
become the individual SAS data sets in that library.
The figure below illustrates the difference between how the two LIBNAME statements treat the data.
Comparing Libname Statements
The next figure shows how the DATA step is used with three types of input data.
Using the DATA Step with Different Types of Input
Notice how the INFILE and INPUT statements are used in the DATA step for reading raw data, but the
SET statement is used in the DATA step for reading in the Excel worksheets.
Running SAS with Microsoft Excel
You must have licensed SAS/ACCESS Interface to PC Files to use a SAS/ACCESS LIBNAME statement that
references an Excel workbook.
If you are running SAS version 9.1 or earlier and want to read in Microsoft Excel data, you must use Microsoft
Excel 2003 or earlier.
To read Microsoft Excel 2007 data you must be running SAS version 9.2 or later.
The examples in this section are based on SAS version 9.2 running with Microsoft Excel 2007.
Steps for Reading Excel Data
Let's look at the steps for reading in an Excel workbook file.
To read the Excel workbook file, the DATA step must provide the following instructions to SAS:
a libref to reference the Excel workbook to be read
the name and location (using another libref) of the new SAS data set
the name of the Excel worksheet that is to be read
The table below outlines the basic statements that are used in a program that reads Excel data and creates a
SAS data set from an Excel worksheet. The PROC CONTENTS and PROC PRINT statements are not
requirements for reading in Excel data and creating a SAS data set. However, these statements are useful
for confirming that your Excel data has successfully been read into SAS.
Basic Steps for Reading Excel Data into a SAS Data Set
To do this...                                                            Use this SAS statement...      Example
Reference an Excel workbook file                                         SAS/ACCESS LIBNAME statement   libname results 'c:\users\exercise.xlsx';
Output the contents of the SAS Library                                   PROC CONTENTS                  proc contents data=results._all_;
Execute the PROC CONTENTS statement                                      RUN statement                  run;
Name and create a new SAS data set                                       DATA statement                 data work.stress;
Read in an Excel worksheet (as the input data for the new SAS data set)  SET statement                  set results.'ActLevel$'n;
Execute the DATA step                                                    RUN statement                  run;
View the contents of a particular data set                               PROC PRINT                     proc print data=stress;
Execute the PROC PRINT statement                                         RUN statement                  run;
The SAS/ACCESS LIBNAME Statement
The general form of the SAS/ACCESS LIBNAME statement is as follows:
General form, SAS/ACCESS LIBNAME statement:
LIBNAME libref 'location-of-Excel-workbook' <options>;
where
libref is a name that you associate with an Excel workbook.
'location-of-Excel-workbook' is the physical location of the Excel workbook.
Example:
libname results 'c:\users\exercise.xlsx';
Referencing an Excel Workbook
Overview
This example uses data similar to the scenario used for the raw data in the previous section. The data
shows the readings from exercise stress tests that have been performed on patients at a health clinic.
The stress test data is located in an Excel workbook named exercise.xlsx (shown below), which is stored in
the location c:\users.
Excel Workbook
Notice in the sample worksheet above that the date column is defined in Excel as dates. That is, if you
right-click on the cells and select Format Cells (in Excel), the cells have a category of Date. SAS reads
this data just as it is stored in Excel. If the date had been stored as text in Excel, then SAS would have read
it as a character string.
To read in this workbook, you must first create a libref to point to the workbook's location:
libname results 'c:\users\exercise.xlsx';
The LIBNAME statement creates the libref results, which points to the Excel workbook exercise.xlsx. The
workbook contains two worksheets, tests and ActLevel, which are now available in the new SAS library
(results) as data sets.
After submitting the LIBNAME statement, you can look in the SAS Explorer window to see how SAS
handles your Excel workbook. The Explorer window enables you to manage your files in the SAS
windowing environment.
SAS Explorer Window
Name Literals
In the figure above, notice how the LIBNAME statement created a permanent library, results, which is the
SAS name (libref) we gave to the workbook file and its location. The new library contains two SAS data
sets, which access the data from the Excel worksheets. From this window you can browse the list of SAS
libraries or display the descriptor portion of a SAS data set.
Notice that the Excel worksheet names have the special character ($) at the end. All Excel worksheets are
designated this way. But remember, special characters such as these are not allowed in SAS data set names
by default. So, in order for SAS to allow this character to be included in the data set name, you must assign
a name literal to the data set name. A SAS name literal is a name token that is expressed as a string within
quotation marks, followed by the uppercase or lowercase letter n. The name literal tells SAS to allow the
special character ($) in the data set name.
Name Literal
Named Ranges
A named range is a range of cells within a worksheet that you define in Excel and assign a name to. In the
example below, the worksheet contains a named range, tests_week_1, which SAS recognizes as a data set.
The named range, tests_week_1, and its parent worksheet, tests, will appear in the SAS Explorer window
as separate data sets, except that the data set created from the named range will have no dollar sign ($)
appended to its name.
For more information on named ranges, see your Microsoft Excel documentation.
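Because the data set created from a named range carries no dollar sign, it can be referenced without a name literal. The sketch below assumes the exercise.xlsx workbook and the tests_week_1 named range described above; the output data set name work.week1 is hypothetical.

```sas
/* Assumes c:\users\exercise.xlsx contains the named range tests_week_1. */
libname results 'c:\users\exercise.xlsx';

data work.week1;               /* hypothetical output data set name */
   set results.tests_week_1;   /* named range: no $ and no name literal */
run;

libname results clear;         /* release the workbook when finished */
```

The parent worksheet, by contrast, would still be referenced with a name literal, as in results.'tests$'n.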
Named Range
Using PROC CONTENTS
In addition to using the SAS Explorer window to view library data, you can also use the CONTENTS
procedure with the _ALL_ keyword to produce information about a data library and its contents. In the
example below, PROC CONTENTS outputs summary information for the SAS data set tests, including
data set name, variables, data types, and other summary information. This statement is useful for making
sure that SAS successfully read in your Excel data before moving on to the DATA step.
proc contents data=results._all_;
run;
CONTENTS Procedure Output
About the sample output above:
The variables in the data set are pulled from the Excel column headings. SAS uses underscores to replace the
spaces.
The Excel dates in the Format column are converted to SAS dates with the default DATE9. format.
This is partial output. The ACTLEVEL data set would also be included in the PROC CONTENTS report.
Creating the DATA Step
You use the DATA statement to indicate the beginning of the DATA step and name the SAS data set to be
created. Remember that the SAS data set name is a two-level name. For example, the two-level name
results.Admit specifies that the data set Admit is stored in the permanent SAS library to which the libref
results has been assigned.
When reading Excel data, use the SET statement to indicate which worksheet in the Excel file you
want to read. To read in the Excel worksheet, you write the DATA and SET statements as follows:
data work.stress;
set results.'ActLevel$'n;
run;
In this example, the DATA statement tells SAS to name the new data set, stress, and store it in the
temporary library WORK. The SET statement in the DATA step specifies the libref (reference to the Excel
file) and the worksheet name as the input data.
You can use several statements in the DATA step to subset your data as needed. Here, the WHERE
statement is used with a variable to include only those participants whose activity level is HIGH.
data work.stress;
set results.'ActLevel$'n;
where ActLevel='HIGH';
run;
The figure below shows the partial output for this DATA step in table format.
DATA Step Output
Using PROC PRINT
After using the DATA step to read in the Excel data and create the SAS data set, you can use PROC
PRINT to produce a report that displays the data set values.
You can also use the PRINT procedure to refer to a specific worksheet. Remember to use the name literal
when referring to a specific Excel worksheet. In the example below, the first PRINT statement displays the
data values for the new data set that was created in the DATA step. The second PRINT statement displays
the contents of the Excel worksheet that was referenced by the LIBNAME statement.
proc print data=work.stress;
run;
proc print data=results.'ActLevel$'n;
run;
Disassociating a Libref
If SAS has a libref assigned to an Excel workbook, the workbook cannot be opened in Excel. To
disassociate a libref, use a LIBNAME statement, specifying the libref and the CLEAR option.
libname results 'c:\users\exercise.xlsx';
proc print data=results.'tests$'n;
run;
libname results clear;
SAS disconnects from the data source and closes any resources that are associated with that libref’s
connection.
LIBNAME Statement Options
There are several options that you can use with the LIBNAME statement to control how SAS interacts
with the Excel data. The general form of the SAS/ACCESS LIBNAME statement (with options) is as
follows:
libname libref 'location-of-Excel-workbook' <options>;
Example:
libname doctors 'c:\clinicNotes\addresses.xlsx' mixed=yes;
DBMAX_TEXT=n
   indicates the length of the longest character string, where n is any integer between 256 and 32,767 inclusive.
   Any character string with a length greater than this value is truncated. The default is 1024.
GETNAMES=YES|NO
   determines whether SAS will use the first row of data in an Excel worksheet or range as column names.
      YES specifies to use the first row of data in an Excel worksheet or range as column names.
      NO specifies not to use the first row of data in an Excel worksheet or range as column names. SAS generates
      and uses the variable names F1, F2, F3, and so on.
   The default is YES.
MIXED=YES|NO
   specifies whether to import data with both character and numeric values and convert all data to character.
      YES specifies that all data values will be converted to character.
      NO specifies that numeric data will be missing when a character type is assigned. Character data will be
      missing when a numeric data type is assigned.
   The default is NO.
SCANTEXT=YES|NO
   specifies whether to read the entire data column and use the length of the longest string found as the SAS
   column width.
      YES scans the entire data column and uses the longest string value to determine the SAS column width.
      NO does not scan the column and defaults to a width of 255.
   The default is YES.
SCANTIME=YES|NO
   specifies whether to scan all row values in a date/time column and automatically determine the TIME. format
   if only time values exist.
      YES specifies that a column with only time values be assigned the TIME8. format.
      NO specifies that a column with only time values be assigned the DATE9. format.
   The default is NO.
USEDATE=YES|NO
   specifies whether to use the DATE9. format for date/time values in Excel workbooks.
      YES specifies that date/time values be assigned the DATE9. format.
      NO specifies that date/time values be assigned the DATETIME. format.
   The default is YES.
Creating Excel Worksheets
In addition to being able to read Excel data, SAS can also create Excel worksheets from SAS data sets.
To do this, you use the SAS/ACCESS LIBNAME statement. For example, to create a new worksheet
named admit from the temporary SAS data set work.admit and save this worksheet in the
Excel file admitxl.xlsx, you would submit the following LIBNAME statement and DATA step:
libname clinic 'c:\Users\mylaptop\admitxl.xlsx' mixed=yes;
data clinic.admit;
set work.admit;
run;
The IMPORT Wizard
Importing Data
As an alternative to using programming statements, you can use the Import Wizard to guide you through
the process of creating a SAS data set from both raw data and from Excel worksheets. The Import Wizard
enables you to create a SAS data set from different types of external files, such as
dBase files (*.dbf)
Excel 2007 (or earlier version) workbooks (*.xls, *.xlsx, *.xlsb, or *.xlsm)
Microsoft Access tables (*.mdb, *.accdb)
Delimited files (*.*)
Comma-separated values (*.csv).
The data sources that are available to you depend on which SAS/ACCESS products you have licensed. If
you do not have any SAS/ACCESS products licensed, the only type of data source files available to you
are CSV files, TXT files, and delimited files.
To access the Import Wizard, select File > Import Data from the menu bar. The Import Wizard opens
with the Select import type screen.
Import Wizard
Follow the instructions on each screen of the Import Wizard to read in your data. If you need additional
information, select the Help button at the bottom of each screen in the wizard.
Just as you can create a SAS data set from raw data by using the Import Wizard, you can use the Export
Wizard to read data from a SAS data set and to write the data to an external data source. To access the
Export Wizard, select File → Export Data from the menu bar.
Chapter Summary
Text Summary
Raw Data Files
A raw data file is an external file whose records contain data values that are organized in fields. The raw
data files in this chapter contain fixed fields.
Steps to Create a SAS Data Set
You need to follow several steps to create a SAS data set using raw data. You need to
reference the raw data file to be read
name the SAS data set
identify the location of the raw data
describe the data values to be read.
Referencing a SAS Library
To begin your program, you might need to use a LIBNAME statement to reference the SAS library in
which your data set will be stored.
Writing a DATA Step Program
The DATA statement indicates the beginning of the DATA step and names the SAS data set(s) to be
created.
Next, you specify the raw data file by using the INFILE statement. The OBS= option in the INFILE
statement enables you to process a specified number of observations.
This chapter teaches column input, the simplest input style. Column input specifies actual column
locations for data values. The INPUT statement describes the raw data to be read and placed into the SAS
data set.
Submitting the Program
When you submit the program, you can use the OBS= option with the INFILE statement to verify that the
correct data is being read before reading the entire data file.
After you submit the program, view the log to check the DATA step processing. You can then print the
data set by using the PRINT procedure.
Once you've checked the log and verified your data, you can modify the DATA step to read the entire raw
data file by removing the OBS= option from the INFILE statement.
If you are working with a raw data file that contains invalid data, the DATA step continues to execute.
Unlike syntax errors, invalid data errors do not cause SAS to stop processing a program. If you have a way
to edit the invalid data, it's best to correct the problem and rerun the DATA step.
Creating and Modifying Variables
To modify existing values or to create new variables, you can use an assignment statement in any DATA
step. Within assignment statements, you can specify any SAS expression.
You can use date constants to assign dates in assignment statements. You can also use SAS time constants
and SAS datetime constants in assignment statements.
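The three kinds of constants can be sketched as follows. The data set name work.constants, the variable names, and the values are illustrative only and do not come from the chapter's data.

```sas
data work.constants;
   /* Date constant: 'ddmmmyyyy'd */
   TestDate='01jan2000'd;
   /* Time constant: 'hh:mm:ss't */
   TestTime='9:25:19't;
   /* Datetime constant: 'ddmmmyyyy:hh:mm:ss'dt */
   TestStamp='18jan2005:9:27:05'dt;
   /* Formats make the stored numeric values readable when printed */
   format TestDate date9. TestTime time8. TestStamp datetime18.;
run;

proc print data=work.constants;
run;
```

Without the FORMAT statement, PROC PRINT would show the underlying numeric values, because SAS stores dates, times, and datetimes as numbers.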
Subsetting Data
To process only observations that meet a specified condition, use a subsetting IF statement in the DATA
step.
Reading Instream Data
To read instream data lines instead of an external file, use a DATALINES statement, a CARDS statement,
or a LINES statement and enter data directly in your SAS program. A RUN statement is not needed at the
end of the DATA step, because the DATALINES statement acts as a step boundary.
Creating a Raw Data File
When the goal of your SAS program is to create a raw data file and not a SAS data set, it is inefficient to
list a data set name in the DATA statement. Instead use the keyword _NULL_, which allows the power of
the DATA step without actually creating a SAS data set. A SET statement specifies the SAS data set that
you want to read from.
You can use the FILE and PUT statements to write out the observations from a SAS data set to a raw data
file just as you used the INFILE and INPUT statements to create a SAS data set. These two sets of
statements work almost identically.
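The parallel between INFILE/INPUT and FILE/PUT can be sketched as shown below. The small data set work.patients is built inline with invented values purely for illustration, and the output path is a placeholder that you would replace with a location that exists on your system.

```sas
/* Illustrative data set; the values are invented for this sketch */
data work.patients;
   input ID $ 1-4 Sex $ 6 Age 8-9;
   datalines;
2458 M 27
2462 F 34
;

/* _NULL_ executes the DATA step without creating a SAS data set */
data _null_;
   set work.patients;
   /* Placeholder output path; FILE names the raw data file to write */
   file 'c:\clinic\patients\referals.dat';
   /* PUT writes each value to the stated columns, mirroring INPUT */
   put ID $ 1-4 Sex $ 6 Age 8-9;
run;
```

The PUT statement uses the same column specifications as the INPUT statement that read the data, which is why the two pairs of statements are described as nearly identical.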
Microsoft Excel Files
You can read Excel worksheets by using the SAS/ACCESS LIBNAME statement.
Steps to Create a SAS Data Set from Excel Data
You need to follow several steps to create a SAS data set using Excel. You need to
provide a name for the new SAS data set
provide the location or name of the libref and Excel worksheet
Referencing an Excel Workbook
To begin your program, you need to use a LIBNAME statement to reference the Excel workbook.
Writing a DATA Step Program
The DATA statement indicates the beginning of the DATA step and names the SAS data set(s) to be
created.
Next, you specify the Excel worksheet to be read by using the SET statement. You must use a SAS name
literal since SAS uses the special character ($) to name Excel worksheets.
Submitting the Program
When you submit the program, you can use the CONTENTS procedure to explore the new library and
contents.
After you submit the program, view the log to check the DATA step processing. You can then print the
data sets created from the Excel worksheets by using the PRINT procedure.
Once you've checked the log and verified your data, you can modify the DATA step along with the
WHERE statement to subset parts of the data as needed.
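A minimal sketch of these steps might look like the following. The libref xlbook and the output data set work.stress are hypothetical names, and the WHERE condition assumes that the worksheet contains a character column named Tolerance. The code also assumes that the SAS/ACCESS Interface to PC Files is licensed.

```sas
/* Hypothetical libref pointing at the workbook */
libname xlbook 'c:\users\admit.xlsx' mixed=yes;

/* Explore the worksheets that the libref exposes */
proc contents data=xlbook._all_;
run;

data work.stress;
   /* The name literal is required because of the $ in the worksheet name */
   set xlbook.'worksheet1$'n;
   /* Assumes a Tolerance column exists in the worksheet */
   where Tolerance='D';
run;

proc print data=work.stress;
run;

/* Release the workbook when finished */
libname xlbook clear;
```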
Syntax
Reading Data from a Raw File or Reading Instream Data
LIBNAME libref 'SAS-data-library';
FILENAME fileref 'filename';
DATA SAS-data-set;
INFILE file-specification <OBS=n>;
INPUT variable <$> startcol-endcol...;
IF expression;
variable=expression;
DATALINES;
instream data goes here if used
;
RUN; /* not used with the DATALINES statement */
PROC PRINT DATA=SAS-data-set;
RUN;
Creating a Raw Data File
LIBNAME libref 'SAS-data-library';
DATA _NULL_;
SET SAS-data-set;
FILE file-specification;
PUT variable startcol-endcol...;
RUN;
Reading Data from an Excel Workbook
LIBNAME libref '<location-of-Excel-workbook>';
PROC CONTENTS DATA=libref._ALL_;
DATA SAS-data-set;
SET libref.'worksheet_name$'n;
WHERE where-expression;
RUN;
PROC PRINT DATA=SAS-data-set;
RUN;
Sample Programs
Reading Data from an External File
libname clinic 'c:\bethesda\patients\admit';
filename admit 'c:\clinic\patients\admit.dat';
data clinic.admittan;
infile admit obs=5;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
run;
proc print data=clinic.admittan;
run;
Reading Instream Data
libname clinic 'c:\bethesda\patients\admit';
data clinic.group1;
input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
RecHR 35-37 TimeMin 39-40 TimeSec 42-43
Tolerance $ 45;
if tolerance='D';
TotalTime=(timemin*60)+timesec;
datalines;
2458 Murray, W             72 185 128 12 38 D
2462 Almers, C             68 171 133 10  5 I
2501 Bonaventure, T        78 177 139 11 13 I
2523 Johnson, R            69 162 114  9 42 S
2539 LaMance, K            75 168 141 11 46 D
2544 Jones, M              79 187 136 12 26 N
2595 Warren, C             77 170 136 12 10 S
;
proc print data=clinic.group1;
run;
Reading Excel Data
libname sasuser 'c:\users\admit.xlsx' mixed=yes;
proc contents data=sasuser._all_;
run;
proc print data=sasuser.'worksheet1$'n;
run;
Creating an Excel Worksheet
libname clinic 'c:\Users\mylaptop\admitxl.xlsx' mixed=yes;
data clinic.admit;
set work.admit;
run;
Points to Remember
LIBNAME and FILENAME statements are global. Librefs and filerefs remain in effect until you change them,
cancel them, or end your SAS session.
For each field of raw data that you read into your SAS data set, you must specify the following in the INPUT
statement: a valid SAS variable name, a type (character or numeric), a starting column, and if necessary, an ending
column.
When you use column input, you can read any or all fields from the raw data file, read the fields in any order, and
specify only the starting column for variables whose values occupy only one column.
Column input is appropriate only in some situations. When you use column input, your data must be standard
character and numeric values, and these values must be in fixed fields. That is, values for a particular variable must be in
the same location in all records.
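The flexibility of column input described in these points can be sketched with two records laid out in the chapter's fixed-field format. The data set name work.partial is hypothetical. The INPUT statement reads only three of the eight fields, reads them out of order, and gives only a starting column for the one-column Tolerance field.

```sas
data work.partial;
   /* Fields are read out of order; Tolerance occupies column 45 only */
   input Tolerance $ 45 Name $ 6-25 ID $ 1-4;
   datalines;
2458 Murray, W             72 185 128 12 38 D
2539 LaMance, K            75 168 141 11 46 D
;

proc print data=work.partial;
run;
```

Because every value sits in the same columns in each record, SAS can locate each field by position alone, which is the condition that makes column input appropriate.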
Chapter Quiz
Select the best answer for each question. After completing the quiz, you can check your answers using the
answer key in the appendix.
1. Which SAS statement associates the fileref Crime with the raw data file C:\States\Data\crime.dat?
1.
filename crime 'c:\states\data\crime.dat';
2.
filename crime c:\states\data\crime.dat;
3.
fileref crime 'c:\states\data\crime.dat';
4.
filename 'c:\states\data\crime' crime.dat;
2. Filerefs remain in effect until . . .
1. you change them.
2. you cancel them.
3. you end your SAS session.
4. all of the above
3. Which statement identifies the name of a raw data file to be read with the fileref Products and specifies that the
DATA step read only records 1-15?
1.
infile products obs 15;
2.
infile products obs=15;
3.
input products obs=15;
4.
input products 1-15;
4. Which of the following programs correctly writes the observations from the data set below to a raw data file?
Data Set work.patients
1.
data _null_;
set work.patients;
infile 'c:\clinic\patients\referals.dat';
input id $ 1-4 sex 6 $ age 8-9 height 11-12
weight 14-16 pulse 18-20;
run;
2.
data referals.dat;
set work.patients;
input id $ 1-4 sex $ 6 age 8-9 height 11-12
weight 14-16 pulse 18-20;
run;
3.
data _null_;
set work.patients;
file c:\clinic\patients\referals.dat;
put id $ 1-4 sex 6 $ age 8-9 height 11-12