Mark G. Sobell Practical Guide To Linux Commands, Editors And Shell Programming (2013, Prentic

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 1169

DownloadMark G. Sobell - Practical Guide To Linux Commands, Editors And Shell Programming (2013, Prentic
Open PDF In BrowserView PDF
A Practical Guide to Linux Commands,
Editors and Shell Programming
THIRD EDITION

Mark G. Sobell

Visit us on the Web: informit.com/ph
Library of Congress Cataloging-in-Publication Data
Sobell, Mark G.
A practical guide to Linux commands, editors, and shell programming / Mark G. Sobell.—3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-13-308504-4 (pbk. : alk. paper) 1. Linux. 2. Operating systems (Computers) I. Title.
QA76.76.O63S59483 2012
005.4'32—dc23
2012028015
Copyright © 2013 Mark G. Sobell
ISBN-13: 978-0-13-308504-4
ISBN-10:
0-13-308504-X
Text printed in the United States at Edwards Brothers Malloy in Ann Arbor, Michigan.
Fourth printing, September 2014

Brief Contents
Contents xvii
Preface xxxvii
1

Welcome to Linux and Mac OS X

PART I
2
3
4
5

157

The Shells

273

The Bourne Again Shell (bash)
The TC Shell (tcsh) 369

PART IV
10
11

The Editors

The vim Editor 159
The emacs Editor 213

PART III
8
9

The Linux and Mac OS X Operating Systems

Getting Started 23
The Utilities 49
The Filesystem 81
The Shell 125

PART II
6
7

1

275

Programming Tools

415

Programming the Bourne Again Shell (bash) 417
The Perl Scripting Language 517

21

12
13
14
15

The Python Programming Language 563
The MySQL Database Management System 595
The AWK Pattern Processing Language 619
The sed Editor 653

PART V
16
17

Secure Network Utilities

The rsync Secure Copy Utility 673
The OpenSSH Secure Communication Utilities

PART VI

Command Reference

Command Reference

PART VII

713

Appendixes

1009

A Regular Expressions 1011
B Help 1021
C Keeping the System Up-to-Date
D Mac OS X Notes 1041

Glossary 1053
File Tree Index 1105
Utility Index 1107
Main Index 1111

1027

711

671
687

Contents
Contents

Preface xxxvii
Chapter 1: Welcome to Linux and Mac OS X

1

The History of UNIX and GNU–Linux 3
The Heritage of Linux: UNIX 3
Fade to 1983 3
Next Scene, 1991 4
The Code Is Free 5
Have Fun! 6
What Is So Good About Linux? 6
Why Linux Is Popular with Hardware Companies and Developers 9
Linux Is Portable 10
The C Programming Language 10
Overview of Linux 11
Linux Has a Kernel Programming Interface 12
Linux Can Support Many Users 12
Linux Can Run Many Tasks 12
Linux Provides a Secure Hierarchical Filesystem 12
The Shell: Command Interpreter and Programming Language 14
A Large Collection of Useful Utilities 15
Interprocess Communication 16
System Administration 16
Additional Features of Linux 16
GUIs: Graphical User Interfaces 16
(Inter)Networking Utilities 17
Software Development 18
Chapter Summary 18
Exercises 18

PART I The Linux and Mac OS X
Operating Systems 21
Chapter 2: Getting Started 23
Conventions Used in This Book 24
Logging In from a Terminal (Emulator) 26
Working from the Command Line 28
Which Shell Are You Running? 29
Correcting Mistakes 29
Repeating/Editing Command Lines 31
su/sudo: Curbing Your Power (root Privileges) 32
Where to Find Documentation 33
man: Displays the System Manual 33
apropos: Searches for a Keyword 35
info: Displays Information About Utilities 36
The ––help Option 38
The bash help Command 39
Getting Help 39
More About Logging In and Passwords 42
What to Do If You Cannot Log In 42
Logging In Remotely: Terminal Emulators, ssh, and Dial-Up Connections
Using Virtual Consoles 43
Logging Out 43
Changing Your Password 44
Chapter Summary 46
Exercises 46
Advanced Exercises 47

Chapter 3: The Utilities

49

Special Characters 50
Basic Utilities 51
ls: Lists the Names of Files 52
cat: Displays a Text File 52
rm: Deletes a File 52
less Is more: Display a Text File One Screen at a Time
hostname: Displays the System Name 53
Working with Files 53
cp: Copies a File 53
mv: Changes the Name of a File 54
lpr: Prints a File 55
grep: Searches for a String 56

53

43

head: Displays the Beginning of a File 56
tail: Displays the End of a File 57
sort: Displays a File in Order 58
uniq: Removes Duplicate Lines from a File 58
diff: Compares Two Files 58
file: Identifies the Contents of a File 60

| (Pipeline): Communicates Between Processes 60
Four More Utilities 61
echo: Displays Text 61
date: Displays the Time and Date 62
script: Records a Shell Session 62
unix2dos: Converts Linux Files to Windows and Macintosh OS X Format
Compressing and Archiving Files 63
bzip2: Compresses a File 64
bzcat and bunzip2: Decompress a File 65
gzip: Compresses a File 65
tar: Packs and Unpacks Archives 66
Locating Utilities 68
which and whereis: Locate a Utility 68
locate: Searches for a File 70
Displaying User and System Information 70
who: Lists Users on the System 70
finger: Lists Users on the System 71
uptime: Displays System Load and Duration Information 72
w: Lists Users on the System 73
free: Displays Memory Usage Information 74
Communicating with Other Users 74
write: Sends a Message 74
mesg: Denies or Accepts Messages 75
Email 76
Chapter Summary 76
Exercises 79
Advanced Exercises 80

Chapter 4: The Filesystem 81
The Hierarchical Filesystem 82
Directory Files and Ordinary Files
Filenames 84
The Working Directory 86
Your Home Directory 87
Pathnames 87
Absolute Pathnames 88
Relative Pathnames 89

83

63

Working with Directories 90
mkdir: Creates a Directory 90
cd: Changes to Another Working Directory 92
rmdir: Deletes a Directory 93
Using Pathnames 94
mv, cp: Move or Copy Files 94
mv: Moves a Directory 95
Important Standard Directories and Files 95
Access Permissions 98
ls –l: Displays Permissions 98
chmod: Changes Access Permissions 99
Setuid and Setgid Permissions 101
Directory Access Permissions 103
ACLs: Access Control Lists 104
Enabling ACLs 105
Working with Access Rules 105
Setting Default Rules for a Directory 108
Links 109
Hard Links 110
Symbolic Links 112
rm: Removes a Link 115
Dereferencing Symbolic Links 115
Chapter Summary 119
Exercises 120
Advanced Exercises 122

Chapter 5: The Shell

125

The Command Line 126
A Simple Command 126
Syntax 126
Simple Commands 130
Processing the Command Line 130
Executing a Command 132
Editing the Command Line 133
Standard Input and Standard Output 133
The Screen as a File 134
The Keyboard and Screen as Standard Input and Standard Output
Redirection 135
Pipelines 141
Lists 145
Running a Command in the Background 146
Filename Generation/Pathname Expansion 148
The ? Special Character 148
The
Special Character 149
The [ ] Special Characters 151

*

135

Builtins 153
Chapter Summary 153
Utilities and Builtins Introduced in This Chapter
Exercises 155
Advanced Exercises 156

PART II

The Editors

Chapter 6: The vim Editor

154

157
159

History 160
Tutorial: Using vim to Create and Edit a File 161
Starting vim 161
Command and Input Modes 163
Entering Text 164
Getting Help 165
Ending the Editing Session 168
The compatible Parameter 168
Introduction to vim Features 168
Online Help 168
Terminology 169
Modes of Operation 169
The Display 170
Correcting Text as You Insert It 170
Work Buffer 171
Line Length and File Size 171
Windows 171
File Locks 171
Abnormal Termination of an Editing Session 172
Recovering Text After a Crash 173
Command Mode: Moving the Cursor 174
Moving the Cursor by Characters 175
Moving the Cursor to a Specific Character 175
Moving the Cursor by Words 176
Moving the Cursor by Lines 176
Moving the Cursor by Sentences and Paragraphs 177
Moving the Cursor Within the Screen 177
Viewing Different Parts of the Work Buffer 177
Input Mode 178
Inserting Text 178
Appending Text 178
Opening a Line for Text 178
Replacing Text 179
Quoting Special Characters in Input Mode 179

Command Mode: Deleting and Changing Text 179
Undoing Changes 179
Deleting Characters 180
Deleting Text 180
Changing Text 181
Replacing Text 182
Changing Case 183
Searching and Substituting 183
Searching for a Character 183
Searching for a String 184
Substituting One String for Another 186
Miscellaneous Commands 190
Join 190
Status 190
. (Period) 190
Copying, Moving, and Deleting Text 190
The General-Purpose Buffer 191
Named Buffers 192
Numbered Buffers 192
Reading and Writing Files 193
Reading Files 193
Writing Files 193
Identifying the Current File 194
Setting Parameters 194
Setting Parameters from Within vim 194
Setting Parameters in a Startup File 195
The .vimrc Startup File 195
Parameters 195
Advanced Editing Techniques 199
Using Markers 199
Editing Other Files 200
Macros and Shortcuts 200
Executing Shell Commands from Within vim 201
Units of Measure 203
Character 203
Word 203
Blank-Delimited Word 204
Line 204
Sentence 204
Paragraph 205
Screen (Window) 206
Repeat Factor 206
Chapter Summary 206
Exercises 211
Advanced Exercises 212

Chapter 7: The emacs Editor

213

History 214
Evolution 214
emacs Versus vim 215
Tutorial: Getting Started with emacs 216
Starting emacs 216
Exiting 218
Inserting Text 218
Deleting Characters 218
Moving the Cursor 219
Editing at the Cursor Position 221
Saving and Retrieving the Buffer 222
Basic Editing Commands 223
Keys: Notation and Use 223
Key Sequences and Commands 224
META-x: Running a Command Without a Key Binding
Numeric Arguments 225
Point and the Cursor 225
Scrolling Through a Buffer 225
Erasing Text 226
Searching for Text 226
Using the Menubar from the Keyboard 228
Online Help 229
Advanced Editing 232
Undoing Changes 232
Point, Mark, and Region 233
Cut and Paste: Yanking Killed Text 235
Inserting Special Characters 236
Global Buffer Commands 237
Visiting and Saving Files 239
Buffers 242
Windows 243
Foreground Shell Commands 245
Background Shell Commands 246
Major Modes: Language-Sensitive Editing 246
Selecting a Major Mode 247
Human-Language Modes 247
C Mode 250
Customizing Indention 253
Comments 253
Special-Purpose Modes 254
Customizing emacs 256
The .emacs Startup File 257
Remapping Keys 258
A Sample .emacs File 259

224

More Information 260
Access to emacs 261
Chapter Summary 261
Exercises 269
Advanced Exercises 270

PART III

The Shells 273

Chapter 8: The Bourne Again Shell (bash)

275

Background 276
Startup Files 278
Login Shells 278
Interactive Nonlogin Shells 279
Noninteractive Shells 279
Setting Up Startup Files 279
. (Dot) or source: Runs a Startup File in the Current Shell 280
Commands That Are Symbols 281
Redirecting Standard Error 282
Writing and Executing a Simple Shell Script 284
chmod: Makes a File Executable 285
#! Specifies a Shell 287
# Begins a Comment 288
Executing a Shell Script 288
Control Operators: Separate and Group Commands 289
; and NEWLINE Separate Commands 290
| and & Separate Commands and Do Something Else 290
&& and || Boolean Control Operators 291
( ) Groups Commands 292
\ Continues a Command 293
Job Control 294
jobs: Lists Jobs 295
fg: Brings a Job to the Foreground 295
Suspending a Job 296
bg: Sends a Job to the Background 296
Manipulating the Directory Stack 297
dirs: Displays the Stack 297
pushd: Pushes a Directory on the Stack 298
popd: Pops a Directory Off the Stack 299
Parameters and Variables 300
User-Created Variables 302
Variable Attributes 305
Keyword Variables 307
Special Characters 315
Locale 316
Time 320

Processes 323
Process Structure 323
Process Identification 324
Executing a Command 325
History 326
Variables That Control History 326
Reexecuting and Editing Commands 328
The Readline Library 335
Aliases 342
Single Versus Double Quotation Marks in Aliases
Examples of Aliases 344
Functions 346
Controlling bash: Features and Options 349
bash Command-Line Options 349
Shell Features 350
Processing the Command Line 354
History Expansion 354
Alias Substitution 354
Parsing and Scanning the Command Line 354
Command-Line Expansion 354
Chapter Summary 364
Exercises 366
Advanced Exercises 368

Chapter 9: The TC Shell (tcsh)

343

369

Shell Scripts 370
Entering and Leaving the TC Shell 371
Startup Files 372
Features Common to the Bourne Again and TC Shells 373
Command-Line Expansion (Substitution) 374
Job Control 378
Filename Substitution 378
Manipulating the Directory Stack 379
Command Substitution 379
Redirecting Standard Error 379
Working with the Command Line 380
Word Completion 380
Editing the Command Line 383
Correcting Spelling 384
Variables 385
Variable Substitution 386
String Variables 386
Arrays of String Variables 387
Numeric Variables 388
Braces 390
Special Variable Forms 391
tcsh Variables 391

Control Structures 398
if 398
goto 401
Interrupt Handling 401
if...then...else 402
foreach 403
while 405
break and continue 405
switch 406
Builtins 407
Chapter Summary 411
Exercises 412
Advanced Exercises 414

PART IV

Programming Tools 415

Chapter 10: Programming the Bourne
Again Shell (bash) 417
Control Structures 418
if...then 419
if...then...else 423
if...then...elif 425
for...in 431
for 433
while 435
until 439
break and continue 441
case 442
select 448
Here Document 450
File Descriptors 452
Opening a File Descriptor 453
Duplicating a File Descriptor 453
File Descriptor Examples 453
Determining Whether a File Descriptor Is Associated with the Terminal 456
Parameters 458
Positional Parameters 458
Special Parameters 463
Variables 467
Shell Variables 467
Environment, Environment Variables, and Inheritance 468
Expanding Null and Unset Variables 472
Array Variables 474
Variables in Functions 475

Builtin Commands 476
type: Displays Information About a Command 477
read: Accepts User Input 477
exec: Executes a Command or Redirects File Descriptors 480
trap: Catches a Signal 483
kill: Aborts a Process 486
eval: Scans, Evaluates, and Executes a Command Line 487
getopts: Parses Options 488
A Partial List of Builtins 491
Expressions 492
Arithmetic Evaluation 492
Logical Evaluation (Conditional Expressions) 493
String Pattern Matching 494
Arithmetic Operators 495
Implicit Command-Line Continuation 499
Shell Programs 500
A Recursive Shell Script 501
The quiz Shell Script 504
Chapter Summary 510
Exercises 512
Advanced Exercises 514

Chapter 11: The Perl Scripting Language
Introduction to Perl 518
More Information 519
Help 519
perldoc 519
Terminology 521
Running a Perl Program 522
Syntax 524
Variables 526
Scalar Variables 528
Array Variables 529
Hash Variables 532
Control Structures 533
if/unless 534
if...else 536
if...elsif...else 537
foreach/for 537
last and next 539
while/until 540
Working with Files 542
Sort 546
Subroutines 547
Regular Expressions 550
Syntax and the =~ Operator 550

517

CPAN Modules 555
Examples 558
Chapter Summary 561
Exercises 562
Advanced Exercises 562

Chapter 12: The Python Programming Language
Introduction 564
Invoking Python 564
More Information 566
Writing to Standard Output and Reading from Standard Input
Functions and Methods 567
Scalar Variables, Lists, and Dictionaries 568
Scalar Variables 568
Lists 569
Dictionaries 573
Control Structures 574
if 575
if...else 575
if...elif...else 576
while 577
for 577
Reading from and Writing to Files 579
File Input and Output 579
Exception Handling 580
Pickle 582
Regular Expressions 583
Defining a Function 584
Using Libraries 585
Standard Library 585
Nonstandard Libraries 585
SciPy and NumPy Libraries 586
Namespace 586
Importing a Module 587
Example of Importing a Function 588
Lambda Functions 589
List Comprehensions 590
Chapter Summary 591
Exercises 592
Advanced Exercises 592

567

563

Chapter 13: The MySQL Database Management
System 595
Notes 596
Syntax and Conventions 597
More Information 599
Installing a MySQL Server and Client 599
Fedora/RHEL (Red Hat Enterprise Linux) 599
Debian/Ubuntu/Mint 600
openSUSE 600
OS X 600
Client Options 600
Setting Up MySQL 601
Assigning a Password to the MySQL User Named root
Removing Anonymous Users 602
Running the Secure Installation Script 602
~/.my.cnf: Configures a MySQL Client 602
~/.mysql_history: Stores Your MySQL History 603
Creating a Database 603
Adding a User 604
Examples 605
Logging In 605
Creating a Table 605
Adding Data 607
Retrieving Data 608
Backing Up a Database 610
Modifying Data 611
Creating a Second Table 612
Joins 613
Chapter Summary 617
Exercises 617

601

Chapter 14: The AWK Pattern Processing
Language 619
Syntax 620
Arguments 620
Options 621
Notes 622

Language Basics 622
Patterns 622
Actions 623
Comments 623
Variables 623
Functions 624
Arithmetic Operators 625
Associative Arrays 626
printf 626
Control Structures 627
Examples 629
Advanced gawk Programming 646
getline: Controlling Input 646
Coprocess: Two-Way I/O 648
Getting Input from a Network 650
Chapter Summary 651
Exercises 651
Advanced Exercises 652

Chapter 15: The sed Editor

653

Syntax 654
Arguments 654
Options 654
Editor Basics 655
Addresses 655
Instructions 656
Control Structures 657
The Hold Space 658
Examples 658
Chapter Summary 669
Exercises 669

PART V

Secure Network Utilities

Chapter 16: The rsync Secure Copy Utility
Syntax 674
Arguments 674
Options 675
Notes 676
More Information

677

671
673

Examples 677
Using a Trailing Slash ( /) on source-file 678
Removing Files 678
Copying Files to and from a Remote System 680
Mirroring a Directory 681
Making Backups 681
Restoring a File 684
Chapter Summary 684
Exercises 685

Chapter 17: The OpenSSH Secure Communication
Utilities 687
Introduction to OpenSSH 688
How OpenSSH Works 689
Files 689
More Information 691
Running the ssh, scp , and sftp OpenSSH Clients 691
Tutorial: Using ssh and scp to Connect to an OpenSSH Server 691
Configuring OpenSSH Clients 692
ssh: Logs In or Executes Commands on a Remote System 694
scp: Copies Files to and from a Remote System 698
sftp: A Secure FTP Client 700
~/.ssh/config and /etc/ssh/ssh_config Configuration Files 700
Authorized Keys: Automatic Login 702
ssh-agent: Holds Your Private Keys 705
Tunneling/Port Forwarding 706
Forwarding X11 706
Port Forwarding 707
Chapter Summary 708
Exercises 709
Advanced Exercises 709

PART VI

Command Reference

Utilities That Display and Manipulate Files 713
Network Utilities 714
Utilities That Display and Alter Status 715
Utilities That Are Programming Tools 716
Miscellaneous Utilities 716
Standard Multiplicative Suffixes 717
Common Options 718
The sample Utility 718

711

sample
aspell
at
busybox
bzip2
cal
cat
cd
chgrp
chmod
chown
cmp
comm
configure
cp
cpio
crontab
cut
date
dd
df
diff
diskutil
ditto
dmesg
dscl
du

Brief description of what the utility does 719
Checks a file for spelling errors 721
Executes commands at a specified time 725
Implements many standard utilities 729
Compresses or decompresses files 732
Displays a calendar 734
Joins and displays files 735
Changes to another working directory 737
Changes the group associated with a file 739
Changes the access mode (permissions) of a file 741
Changes the owner of a file and/or the group the file is
associated with 746
Compares two files 748
Compares sorted files 750
Configures source code automatically 752
Copies files 754
Creates an archive, restores files from an archive, or copies a
directory hierarchy 758
Maintains crontab files 763
Selects characters or fields from input lines 766
Displays or sets the system time and date 769
Converts and copies a file 772
Displays disk space usage 775
Displays the differences between two text files 777
Checks, modifies, and repairs local volumes O 782
Copies files and creates and unpacks archives O 785
Displays kernel messages 787
Displays and manages Directory Service information O 788
Displays information on disk usage by directory hierarchy
and/or file 791
Displays a message 794

echo
expand/unexpand
expr
file
find
finger
fmt
fsck
ftp
gawk

Converts TABs to SPACEs and SPACEs to TABs 796
Evaluates an expression 798
Displays the classification of a file 802
Finds files based on criteria 804
Displays information about users 810
Formats text very simply 812
Checks and repairs a filesystem 814
Transfers files over a network 819
Searches for and processes patterns in a file 825

gcc
GetFileInfo
grep
gzip
head
join
kill
killall
launchctl
less
ln
lpr
ls
make
man
mc
mkdir
mkfs
mv
nice
nl
nohup
od
open
otool
paste
pax
plutil
pr
printf
ps
renice
rm
rmdir
rsync
scp
screen

Compiles C and C++ programs 826
Displays file attributes O 831
Searches for a pattern in files 833
Compresses or decompresses files 838
Displays the beginning of a file 841
Joins lines from two files based on a common field 843
Terminates a process by PID 846
Terminates a process by name 848
Controls the launchd daemon O 850
Displays text files, one screen at a time 852
Makes a link to a file 856
Sends files to printers 858
Displays information about one or more files 861
Keeps a set of programs current 869
Displays documentation for utilities 875
Manages files in a textual environment (aka Midnight
Commander) 879
Creates a directory 886
Creates a filesystem on a device 887
Renames or moves a file 890
Changes the priority of a command 892
Numbers lines from a file 894
Runs a command that keeps running after you log out 896
Dumps the content of a file 897
Opens files, directories, and URLs O 901
Displays object, library, and executable files O 903
Joins corresponding lines from files 905
Creates an archive, restores files from an archive, or copies a
directory hierarchy 907
Manipulates property list files O 913
Paginates files for printing 915
Formats string and numeric data 917
Displays process status 921
Changes the priority of a process 925
Removes a file (deletes a link) 926
Removes directories 928
Copies files and directory hierarchies securely over a
network 929
Securely copies one or more files to or from a remote
system 930
Manages several textual windows 931

sed Edits a file noninteractively 937
SetFile Sets file attributes O 938
sleep Creates a process that sleeps for a specified interval 940
sort Sorts and/or merges files 942
split Divides a file into sections 951
ssh Securely executes commands on a remote system 953
sshfs/curlftpfs

stat
strings
stty
sysctl
tail
tar
tee
telnet
test
top
touch
tr
tty
tune2fs
umask
uniq
w
wc
which
who
xargs

Mounts a directory on an OpenSSH or FTP server as a local
directory 954
Displays information about files 957
Displays strings of printable characters from files 959
Displays or sets terminal parameters 960
Displays and alters kernel variables at runtime 964
Displays the last part (tail) of a file 965
Stores or retrieves files to/from an archive file 968
Copies standard input to standard output and one or more
files 973
Connects to a remote computer over a network 974
Evaluates an expression 978
Dynamically displays process status 981
Creates a file or changes a file’s access and/or modification
time 985
Replaces specified characters 987
Displays the terminal pathname 990
Changes parameters on an ext2, ext3, or ext4 filesystem 991
Specifies the file-creation permissions mask 994
Displays unique lines from a file 996
Displays information about local system users 998
Displays the number of lines, words, and bytes in one or
more files 1000
Shows where in PATH a utility is located 1001
Displays information about logged-in users 1003
Converts standard input to command lines 1005

PART VII

Appendixes 1009

Appendix A: Regular Expressions
Characters 1012
Delimiters 1012
Simple Strings 1012

1011

Special Characters 1012
Periods 1013
Brackets 1013
Asterisks 1014
Carets and Dollar Signs 1014
Quoting Special Characters 1015
Rules 1015
Longest Match Possible 1015
Empty Regular Expressions 1016
Bracketing Expressions 1016
The Replacement String 1016
Ampersand 1017
Quoted Digit 1017
Extended Regular Expressions 1017
Appendix Summary 1019

Appendix B: Help

1021

Solving a Problem 1022
Finding Linux and OS X Related Information 1023
Linux and OS X Newsgroups 1023
Mailing Lists 1024
Specifying a Terminal 1024

Appendix C: Keeping the System Up-to-Date

1027

Using yum 1028
Using yum to Install, Remove, and Update Packages 1028
Other yum Commands 1030
yum Groups 1031
Downloading rpm Package Files Using yumdownloader 1032
Configuring yum 1032
Using apt-get 1034
Using apt-get to Install, Remove, and Update Packages 1035
Using apt-get to Upgrade the System 1036
Other apt-get Commands 1037
Repositories 1037
sources.list: Specifies Repositories for apt-get to Search 1038
BitTorrent 1038

Appendix D: Mac OS X Notes
Open Directory 1042
Filesystems 1043
Nondisk Filesystems 1043
Case Sensitivity 1043
/Volumes 1044

1041

Extended Attributes 1044
File Forks 1044
File Attributes 1046
ACLs 1047
Activating the Terminal META Key 1049
Startup Files 1050
Remote Logins 1050
Many Utilities Do Not Respect Apple Human Interface Guidelines
Installing Xcode and MacPorts 1050
Mac OS X Implementation of Linux Features 1051

Glossary 1053
File Tree Index 1105
Utility Index 1107
Main Index 1111

1050

M

Preface

Preface
Linux

A Practical Guide to Linux® Commands, Editors, and Shell Programming, Third
Edition, explains how to work with the Linux operating system from the command
line. The first few chapters of this book build a foundation for learning about Linux.
The rest of the book covers more advanced topics and goes into more detail. This book
does not describe a particular release or distribution of Linux but rather pertains to
all recent versions of Linux.

Mac OS X

This book also explains how to work with the UNIX/Linux foundation of Mac
OS X. It looks “under the hood,” past the traditional GUI (graphical user interface)
that most people associate with the Macintosh, and explains how to use the powerful
command-line interface (CLI) that connects you directly to OS X. Where this book
refers to Linux, it implicitly refers to Mac OS X as well and makes note of differences
between the two operating systems.

Command-line
interface (CLI)

In the beginning there was the command-line (textual) interface, which enabled a user
to give Linux commands from the command line. There was no mouse to point with
or icons to drag and drop. Some programs, such as emacs, implemented rudimentary
windows using the very minimal graphics available in the ASCII character set.
Reverse video helped separate areas of the screen.

Preface

Linux was born and raised in this environment, so naturally all the original Linux
tools were invoked from the command line. The real power of Linux still lies in this
environment, which explains why many Linux professionals work exclusively from
the command line. Using clear descriptions and many examples, this book shows you
how to get the most out of your Linux system using the command-line interface.
Linux distributions

A Linux distribution comprises the Linux kernel, utilities, and application programs.
Many distributions are available, including Ubuntu, Fedora, openSUSE, Red Hat,
Debian, Mageia, Arch, CentOS, and Mint. Although the distributions differ from one

another in various ways, all of them rely on the Linux kernel, utilities, and applications.
This book is based on the code that is common to most distributions. As a consequence
you can use it regardless of which distribution you are running.
New in this edition

This edition includes a wealth of new and updated material:
• A new chapter on the Python programming language (Chapter 12).
• A new chapter on the MySQL database management system (Chapter 13).
• A new chapter on the OpenSSH secure communication utilities including
ssh (Chapter 17).
• Rewritten chapters on bash (Chapters 5, 8, and 10) that include new
information on
◆

The hash table and the hash builtin;

◆

Using brace expansion with a sequence expression to generate step
values;

◆

Determining whether a file descriptor is associated with the terminal;

◆

Environment, environment versus local variables, inheritance, and
process locality;

◆

Using eval to scan, evaluate, and execute a command line;

◆

Control operators and implicit command-line continuation;

◆

Localization and internationalization including discussions of the locale
utility and the LC_ variables; and

◆

Setting the time zone, the TZ environment variable, and the tzselect
utility.

• Many new command-line utilities in Part VI, including

Overlap

◆

screen, a terminal session manager/multiplexer;

◆

busybox, a single binary collection of utilities;

◆

The Midnight Commander (mc) file manager;

◆

sshfs/curlftpfs, which allows you to mount a directory on an OpenSSH
or FTP server as a local directory;

◆

expand/unexpand, which converts TABs to SPACEs and vice versa;

◆

join, which joins lines from two files based on a common field;

◆

nl, which numbers lines from a file; and

◆

printf, which formats string and numeric data.

If you read one of Mark Sobell’s other books, A Practical Guide to Fedora™ and Red
Hat ® Enterprise Linux, Sixth Edition, or A Practical Guide to Ubuntu Linux®, Third
Edition, or a subsequent edition of either book, you will notice some overlap between

those books and the one you are reading now. The introduction, the appendix on regular expressions, and the chapters on the utilities (Chapter 3 of this book—not Part
VI), the filesystem, the Bourne Again Shell (bash), and Perl are very similar in the
books. Chapters that appear in this book but not in the other two books include those
covering the vim and emacs editors, the TC Shell (tcsh), the AWK and sed languages,
the Python programming language, the MySQL RDBMS, the rsync utility, and Part VI,
which describes 98 of the most useful Linux and Mac OS X utility programs in detail.
Audience

This book is designed for a wide range of readers. It does not require programming
experience, although some experience using a computer is helpful. It is appropriate
for the following readers:
• Students taking a class in which they use Linux or Mac OS X
• Power users who want to explore the power of Linux or Mac OS X from
the command line
• Professionals who use Linux or Mac OS X at work
• Beginning Macintosh users who want to know what UNIX/Linux is, why
everyone keeps saying it is important, and how to take advantage of it
• Experienced Macintosh users who want to know how to take advantage of
the power of UNIX/Linux that underlies Mac OS X
• UNIX users who want to adapt their UNIX skills to the Linux or Mac OS X
environment
• System administrators who need a deeper understanding of Linux or Mac
OS X and the tools that are available to them, including the bash, Perl, and
Python scripting languages
• Web developers who need to understand Linux inside and out including Perl
and Python
• Computer science students who are studying the Linux or Mac OS X
operating system
• Programmers who need to understand the Linux or Mac OS X programming
environment
• Technical executives who want to get a grounding in Linux or Mac OS X

Benefits

A Practical Guide to Linux® Commands, Editors, and Shell Programming, Third
Edition, gives you a broad understanding of how to use Linux and Mac OS X from the
command line. Regardless of your background, it offers the knowledge you need to get
on with your work: You will come away from this book with an understanding of how
to use Linux/OS X, and this text will remain a valuable reference for years to come.
A large amount of free software has always been available for Macintosh systems. In
addition, the Macintosh shareware community is very active. By introducing the
UNIX/Linux aspects of Mac OS X, this book throws open to Macintosh users the vast
store of free and low-cost software available for Linux and other UNIX-like systems.

In this book, Linux refers to Linux and Mac OS X
tip The UNIX operating system is the common ancestor of Linux and Mac OS X. Although the GUIs
(graphical user interfaces) of these two operating systems differ significantly, the command-line
interfaces (CLIs) are very similar and in many cases identical. This book describes the CLIs of both
Linux and Mac OS X. To make the content more readable, this book uses the term Linux to refer
to both Linux and Mac OS X. It makes explicit note of where the two operating systems differ.

Features of This Book
This book is organized for ease of use in different situations. For example, you can read
it from cover to cover to learn command-line Linux from the ground up. Alternatively,
once you are comfortable using Linux, you can use this book as a reference: Look up
a topic of interest in the table of contents or index and read about it. Or, refer to one
of the utilities covered in Part VI, “Command Reference.” You can also think of this
book as a catalog of Linux topics: Flip through the pages until a topic catches your eye.
The book also includes many pointers to Web sites where you can obtain additional
information: Consider the Internet to be an extension of this book.
A Practical Guide to Linux® Commands, Editors, and Shell Programming, Third
Edition, offers the following features:
• Optional sections allow you to read the book at different levels, returning
to more difficult material when you are ready to tackle it.
• Caution boxes highlight procedures that can easily go wrong, giving you
guidance before you run into trouble.
• Tip boxes highlight places in the text where you can save time by doing
something differently or when it might be useful or just interesting to have
additional information.
• Security boxes point out ways you can make a system more secure.
• Each chapter starts with a list of chapter objectives—a list of important
tasks you should be able to perform after reading the chapter.
• Concepts are illustrated by practical examples found throughout the book.
• The many useful URLs (Internet addresses) identify sites where you can
obtain software and information.
• Main, File Tree, and Utility indexes help you find what you are looking for
quickly; for easy access, the Utility index is reproduced on the insides of the
front and back covers.
• Chapter summaries review the important points covered in each chapter.
• Review exercises are included at the end of each chapter for readers who
want to hone their skills. Answers to even-numbered exercises are posted
at www.sobell.com.

• Important GNU tools, including gcc, GNU Configure and Build System,
make, gzip, and many others, are described in detail.
• Pointers throughout the book provide help in obtaining online documentation
from many sources, including the local system and the Internet.
• Important command-line utilities that were developed by Apple specifically
for Mac OS X are covered in detail, including diskutil, ditto, dscl, GetFileInfo,
launchctl, otool, plutil, and SetFile.
• Descriptions of Mac OS X extended attributes include file forks, file
attributes, attribute flags, and Access Control Lists (ACLs).
• Appendix D, “Mac OS X Notes,” lists some differences between Mac OS X
and Linux.

Contents
This section describes the information that each chapter covers and explains how that
information can help you take advantage of the power of Linux. You might want to
review the table of contents for more detail.
• Chapter 1—Welcome to Linux and Mac OS X
Presents background information on Linux and OS X. This chapter covers the
history of Linux, profiles the OS X Mach kernel, explains how the GNU Project helped Linux get started, and discusses some of the important features of
Linux that distinguish it from other operating systems.

Part I: The Linux and Mac OS X Operating Systems
Experienced users might want to skim Part I
tip If you have used a UNIX/Linux system before, you might want to skim or skip some or all of the chapters
in Part I. All readers should take a look at “Conventions Used in This Book” (page 24), which explains
the typographic conventions that this book uses, and “Where to Find Documentation” (page 33), which
points you toward both local and remote sources of Linux documentation.

Part I introduces Linux and gets you started using it.
• Chapter 2—Getting Started
Explains the typographic conventions this book uses to make explanations
clearer and easier to read. This chapter provides basic information and
explains how to log in, change your password, give Linux commands using
the shell, and find system documentation.
• Chapter 3—The Utilities
Explains the command-line interface (CLI) and briefly introduces more than
30 command-line utilities. Working through this chapter gives you a feel for

Linux and introduces some of the tools you will use day in, day out. Deeper
discussion of utilities is reserved for Part VI. The utilities covered in this
chapter include
◆

grep, which searches through files for strings of characters;

◆

unix2dos, which converts Linux text files to Windows format;

◆

tar, which creates archive files that can hold many other files;

◆

bzip2 and gzip, which compress files so that they take up less space on

disk and allow you to transfer them over a network more quickly; and
◆

diff, which displays the differences between two text files.

• Chapter 4—The Filesystem
Discusses the Linux hierarchical filesystem, covering files, filenames,
pathnames, working with directories, access permissions, and hard and
symbolic links. Understanding the filesystem allows you to organize your
data so that you can find information quickly. It also enables you to share
some of your files with other users while keeping other files private.
• Chapter 5—The Shell
Explains how to use shell features to make your work faster and easier. All
of the features covered in this chapter work with both bash and tcsh. This
chapter discusses
◆

Using command-line options to modify the way a command works;

◆

Making minor changes in a command line to redirect input to a command
so that it comes from a file instead of the keyboard;

◆

Redirecting output from a command to go to a file instead of the screen;

◆

Using pipelines to send the output of one utility directly to another utility
so you can solve problems right on the command line;

◆

Running programs in the background so you can work on one task
while Linux is working on a different one; and

◆

Using the shell to generate filenames to save time spent on typing and
help you when you do not remember the exact name of a file.

Part II: The Editors
Part II covers two classic, powerful Linux command-line text editors. Most Linux
distributions include the vim text editor, an “improved” version of the widely used vi
editor, as well as the popular GNU emacs editor. Text editors enable you to create
and modify text files that can hold programs, shell scripts, memos, and input to text
formatting programs. Because Linux system administration involves editing textbased configuration files, skilled Linux administrators are adept at using text editors.

• Chapter 6—The vim Editor
Starts with a tutorial on vim and then explains how to use many of the
advanced features of vim, including special characters in search strings, the
General-Purpose and Named buffers, parameters, markers, and execution
of commands from within vim. The chapter concludes with a summary of
vim commands.
• Chapter 7—The emacs Editor
Opens with a tutorial and then explains many of the features of the emacs
editor as well as how to use the META, ALT, and ESCAPE keys. In addition, this
chapter covers key bindings, buffers, and incremental and complete
searching for both character strings and regular expressions. It details the
relationship between Point, the cursor, Mark, and Region. It also explains
how to take advantage of the extensive online help facilities available from
emacs. Other topics covered include cutting and pasting, using multiple
windows and frames, and working with emacs modes—specifically C
mode, which aids programmers in writing and debugging C code. Chapter 7
concludes with a summary of emacs commands.

Part III: The Shells
Part III goes into more detail about bash and introduces the TC Shell (tcsh).
• Chapter 8—The Bourne Again Shell (bash)
Picks up where Chapter 5 left off, covering more advanced aspects of
working with a shell. For examples it uses the Bourne Again Shell—bash,
the shell used almost exclusively for system shell scripts. Chapter 8 describes
how to
◆

Use shell startup files, shell options, and shell features to customize the
shell;

◆

Use job control to stop jobs and move jobs from the foreground to the
background, and vice versa;

◆

Modify and reexecute commands using the shell history list;

◆

Create aliases to customize commands;

◆

Work with user-created and keyword variables in shell scripts;

◆

Implement localization including discussions of the locale utility, the
LC_ variables, and internationalization;

◆

Set up functions, which are similar to shell scripts but are executed
more quickly;

◆

Write and execute simple shell scripts; and

◆

Redirect error messages so they go to a file instead of the screen.

• Chapter 9—The TC Shell (tcsh)
Describes tcsh and covers features common to and different between bash
and tcsh. This chapter explains how to
◆

Run tcsh and change your default shell to tcsh;

◆

Redirect error messages so they go to files instead of the screen;

◆

Use control structures to alter the flow of control within shell scripts;

◆

Work with tcsh array and numeric variables; and

◆

Use shell builtin commands.

Part IV: Programming Tools
Part IV covers important programming tools that are used extensively in Linux and
Mac OS X system administration and general-purpose programming.
• Chapter 10—Programming the Bourne Again Shell (bash)
Continues where Chapter 8 left off, going into greater depth about
advanced shell programming using bash, with the discussion enhanced by
extensive examples. This chapter discusses
◆

Control structures including if...then...else and case;

◆

Variables, with discussions of attributes, expanding null and unset
variables, array variables, and variables in functions;

◆

Environment, including environment versus local variables, inheritance,
and process locality;

◆

Arithmetic and logical (Boolean) expressions; and

◆

Some of the most useful shell builtin commands, including exec, trap,
and getopts.

Once you have mastered the basics of Linux, you can use your knowledge
to build more complex and specialized programs, using the shell as a programming language.
Chapter 10 poses two complete shell programming problems and then
shows you how to solve them step by step. The first problem uses recursion
to create a hierarchy of directories. The second problem develops a quiz
program, shows you how to set up a shell script that interacts with a user,
and explains how the script processes data. (The examples in Part VI also
demonstrate many features of the utilities you can use in shell scripts.)
• Chapter 11—The Perl Scripting Language
Introduces the popular, feature-rich Perl programming language. This
chapter covers
◆

Perl help tools including perldoc;

◆

Perl variables and control structures;

◆

File handling;

◆

Regular expressions; and

◆

Installation and use of CPAN modules.

Many Linux administration scripts are written in Perl. After reading
Chapter 11 you will be able to better understand these scripts and start
writing your own. This chapter includes many examples of Perl scripts.
• Chapter 12—The Python Programming Language
Introduces the flexible and friendly Python programming language. This
chapter covers
◆

Python lists and dictionaries;

◆

Python functions and methods you can use to write to and read from files;

◆

Using pickle to store an object on disk;

◆

Importing and using libraries;

◆

Defining and using functions, including regular and Lambda functions;

◆

Regular expressions; and

◆

Using list comprehensions.

Many Linux tools are written in Python. Chapter 12 introduces Python,
including some basic object-oriented concepts, so you can read and
understand Python programs and write your own. This chapter includes
many examples of Python programs.
• Chapter 13—The MySQL Database Management System
Introduces the widely used MySQL relational database management system
(RDBMS). This chapter covers
◆

Relational database terminology;

◆

Installing the MySQL client and server;

◆

Creating a database;

◆

Adding a user;

◆

Creating and modifying tables;

◆

Adding data to a database; and

◆

Backing up and restoring a database.

• Chapter 14—The AWK Pattern Processing Language
Explains how to use the powerful AWK language to write programs that
filter data, write reports, and retrieve data from the Internet. The advanced
programming section describes how to set up two-way communication
with another program using a coprocess and how to obtain input over a
network instead of from a local file.

• Chapter 15—The sed Editor
Describes sed, the noninteractive stream editor that finds many applications
as a filter within shell scripts. This chapter discusses how to use sed’s buffers
to write simple yet powerful programs and includes many examples.

Part V: Secure Network Utilities
Part V describes two utilities you can use to work on a remote system and copy files
across a network securely.
• Chapter 16—The rsync Secure Copy Utility
Covers rsync, a secure utility that copies an ordinary file or directory hierarchy
locally or between the local system and a remote system. As you write programs, you can use this utility to back them up to another system.
• Chapter 17—The OpenSSH Secure Communication Utilities
Explains how to use the ssh, scp, and sftp utilities to communicate securely
over the Internet. This chapter covers the use of authorized keys that allow
you to log in on a remote system securely without a password, ssh-agent
that can hold your private keys while you are working, and forwarding X11
so you can run graphical programs remotely.

Part VI: Command Reference
Linux includes hundreds of utilities. Chapters 14, 15, 16, and 17 as well as Part VI
provide extensive examples of the use of over 100 of the most important utilities with
which you can solve problems without resorting to programming in C. If you are
already familiar with UNIX/Linux, this part of the book will be a valuable, easy-touse reference. If you are not an experienced user, it will serve as a useful supplement
while you are mastering the earlier sections of the book.
Although the descriptions of the utilities in Chapters 14, 15, 16, and 17 and Part VI are
presented in a format similar to that used by the Linux manual (man) pages, they are
much easier to read and understand. These utilities are included because you will work
with them day in, day out (for example, ls and cp), because they are powerful tools that
are especially useful in shell scripts (sort, paste, and test), because they help you work
with a Linux system (ps, kill, and fsck), or because they enable you to communicate with
other systems (ssh, scp, and ftp). Each utility description includes complete explanations
of its most useful options, differentiating between options supported under Mac OS X
and those supported under Linux. The “Discussion” and “Notes” sections present tips
and tricks for taking full advantage of the utility’s power. The “Examples” sections
demonstrate how to use these utilities in real life, alone and together with other utilities,
to generate reports, summarize data, and extract information. Take a look at the
“Examples” sections for find (page 808), ftp (page 822), and sort (page 944) to see how
extensive these sections are. Some utilities, such as Midnight Commander ( mc ;
page 879) and screen (page 931), include extensive discussion sections and tutorials.

Part VII: Appendixes
Part VII includes the appendixes, the glossary, and three indexes.
• Appendix A—Regular Expressions
Explains how to use regular expressions to take advantage of the hidden
power of Linux. Many utilities, including grep, sed, vim, AWK, Perl, and
Python, accept regular expressions in place of simple strings of characters.
A single regular expression can match many simple strings.
• Appendix B—Help
Details the steps typically used to solve the problems you might encounter
when using a Linux system.
• Appendix C—Keeping the System Up-to-Date
Describes how to use tools to download software and keep a system current.
This appendix includes information on
◆

yum—Downloads software from the Internet, keeping a system up-to-date
and resolving dependencies as it goes.

◆

apt-get—An alternative to yum for keeping a system current.

◆

BitTorrent—Good for distributing large amounts of data such as Linux
installation CDs and DVDs.

• Appendix D—Mac OS X Notes
A brief guide to Mac OS X features and quirks that might be unfamiliar to
users who have been using Linux or other UNIX-like systems.
• Glossary
Defines more than 500 terms that pertain to the use of Linux and Mac
OS X.
• Indexes
Three indexes that make it easier to find what you are looking for quickly.
These indexes indicate where you can locate tables (page numbers followed
by the letter t) and definitions (italic page numbers). They also differentiate
between light and comprehensive coverage (page numbers in light and
standard fonts, respectively).
◆

File Tree Index—Lists, in hierarchical fashion, most files mentioned in
this book. These files are also listed in the Main index.

◆

Utility Index—Locates all utilities mentioned in this book. A page
number in a light font indicates a brief mention of the utility; use of
the regular font indicates more substantial coverage. The Utility index
is reproduced on the insides of the front and back covers.

◆

Main Index—Helps you find the information you want quickly.

Supplements
The author’s home page (www.sobell.com) contains downloadable listings of the longer
programs from this book as well as pointers to many interesting and useful Linux- and
OS X-related sites on the World Wide Web, a list of corrections to the book, answers to
even-numbered exercises, and a solicitation for corrections, comments, and suggestions.
PowerPoint slides of the figures in this book and answers to end-of-chapter exercises
are provided to qualified instructors for classroom use. If you want to obtain these
ancillaries, contact your Pearson representative.

1
Welcome to Linux
and Mac OS X

Chapter1
1

In This Chapter

Objectives

The History of UNIX and
GNU–Linux . . . . . . . . . . . . . . . . . . . 3

After reading this chapter you should be able to:

The Heritage of Linux: UNIX . . . . . . . 3
What Is So Good About Linux?. . . . . 6
Overview of Linux . . . . . . . . . . . . . . 11
Additional Features of Linux. . . . . . 16

 Discuss the history of UNIX, Linux, and the GNU
project
 Explain what is meant by “free software” and list
characteristics of the GNU General Public License
 List characteristics of Linux and reasons the Linux
operating system is so popular
 Discuss three benefits of virtual machines over single
physical machines

2 Chapter 1 Welcome to Linux and Mac OS X

An operating system is the low-level software that schedules tasks, allocates storage,
and handles the interfaces to peripheral hardware, such as printers, disk drives, the
screen, keyboard, and mouse. An operating system has two main parts: the kernel
and the system programs. The kernel allocates machine resources—including memory, disk space, and CPU (page 1063) cycles—to all other programs that run on the
computer. The system programs include device drivers, libraries, utility programs,
shells (command interpreters), configuration scripts and files, application programs,
servers, and documentation. They perform higher-level housekeeping tasks, often
acting as servers in a client/server relationship. Many of the libraries, servers, and
utility programs were written by the GNU Project, which is discussed shortly.
Linux kernel

The Linux kernel was developed by Finnish undergraduate student Linus Torvalds,
who used the Internet to make the source code immediately available to others for
free. Torvalds released Linux version 0.01 in September 1991.
The new operating system came together through a lot of hard work. Programmers
around the world were quick to extend the kernel and develop other tools, adding
functionality to match that already found in both BSD UNIX and System V UNIX
(SVR4) as well as new functionality. The name Linux is a combination of Linus
and UNIX.
The Linux operating system, which was developed through the cooperation of
numerous people around the world, is a product of the Internet and is a free (open
source; page 1084) operating system. In other words, all the source code is free. You
are free to study it, redistribute it, and modify it. As a result, the code is available free
of cost—no charge for the software, source, documentation, or support (via newsgroups, mailing lists, and other Internet resources). As the GNU Free Software
Definition (www.gnu.org/philosophy/free-sw.html) puts it:

Free beer

Mach kernel

“Free software” is a matter of liberty, not price. To understand the
concept, you should think of “free” as in “free speech,” not as in
“free beer.”
OS X runs the Mach kernel, which was developed at Carnegie Mellon University (CMU) and is free software. CMU concluded its work on the project in
1994, although other groups have continued this line of research. Much of the
Mac OS X software is open source: the Mac OS X kernel is based on Mach and
FreeBSD code, utilities come from BSD and the GNU project, and system programs come mostly from BSD code, although Apple has developed a number of
new programs.

Linux, OS X, and UNIX
tip Linux and OS X are closely related to the UNIX operating system. This book describes Linux and
OS X. To make reading easier, this book talks about Linux when it means OS X and Linux, and
points out where OS X behaves differently from Linux. For the same reason, this chapter frequently uses the term Linux to describe both Linux and OS X features.

The History of UNIX and GNU–Linux 3

The History of UNIX and GNU–Linux
This section presents some background on the relationships between UNIX and
Linux and between GNU and Linux. Visit www.levenez.com/unix for an extensive
history of UNIX.

The Heritage of Linux: UNIX
The UNIX system was developed by researchers who needed a set of modern computing tools to help them with their projects. The system allowed a group of people
working together on a project to share selected data and programs while keeping
other information private.
Universities and colleges played a major role in furthering the popularity of the UNIX
operating system through the “four-year effect.” When the UNIX operating system
became widely available in 1975, Bell Labs offered it to educational institutions at
nominal cost. The schools, in turn, used it in their computer science programs, ensuring that computer science students became familiar with it. Because UNIX was such
an advanced development system, the students became acclimated to a sophisticated
programming environment. As these students graduated and went into industry, they
expected to work in a similarly advanced environment. As more of them worked their
way up the ladder in the commercial world, the UNIX operating system found its way
into industry.
Berkeley UNIX
(BSD)

In addition to introducing students to the UNIX operating system, the Computer Systems Research Group (CSRG) at the University of California at Berkeley made
significant additions and changes to it. In fact, it made so many popular changes that
one version of the system is called the Berkeley Software Distribution (BSD) of the
UNIX system, or just Berkeley UNIX. The other major version is UNIX System V
(SVR4), which descended from versions developed and maintained by AT&T and
UNIX System Laboratories. Mac OS X inherits much more strongly from the BSD
branch of the tree.

Fade to 1983
Richard Stallman (www.stallman.org) announced1 the GNU Project for creating an
operating system, both kernel and system programs, and presented the GNU Manifesto,2 which begins as follows:
GNU, which stands for Gnu’s Not UNIX, is the name for the complete UNIX-compatible software system which I am writing so that
I can give it away free to everyone who can use it.
1. www.gnu.org/gnu/initial-announcement.html
2. www.gnu.org/gnu/manifesto.html

4 Chapter 1 Welcome to Linux and Mac OS X

Some years later, Stallman added a footnote to the preceding sentence when he realized that it was creating confusion:
The wording here was careless. The intention was that nobody
would have to pay for *permission* to use the GNU system. But
the words don’t make this clear, and people often interpret them as
saying that copies of GNU should always be distributed at little or
no charge. That was never the intent; later on, the manifesto mentions the possibility of companies providing the service of distribution for a profit. Subsequently I have learned to distinguish
carefully between “free” in the sense of freedom and “free” in the
sense of price. Free software is software that users have the freedom to distribute and change. Some users may obtain copies at no
charge, while others pay to obtain copies—and if the funds help
support improving the software, so much the better. The important
thing is that everyone who has a copy has the freedom to cooperate
with others in using it.
In the manifesto, after explaining a little about the project and what has been accomplished so far, Stallman continues:
Why I Must Write GNU
I consider that the golden rule requires that if I like a program I must
share it with other people who like it. Software sellers want to divide
the users and conquer them, making each user agree not to share
with others. I refuse to break solidarity with other users in this way.
I cannot in good conscience sign a nondisclosure agreement or a
software license agreement. For years I worked within the Artificial
Intelligence Lab to resist such tendencies and other inhospitalities,
but eventually they had gone too far: I could not remain in an institution where such things are done for me against my will.
So that I can continue to use computers without dishonor, I have
decided to put together a sufficient body of free software so that I
will be able to get along without any software that is not free. I have
resigned from the AI Lab to deny MIT any legal excuse to prevent
me from giving GNU away.

Next Scene, 1991
The GNU Project has moved well along toward its goal. Much of the GNU operating
system, except for the kernel, is complete. Richard Stallman later writes:
By the early ’90s we had put together the whole system aside from
the kernel (and we were also working on a kernel, the GNU Hurd,3

3. www.gnu.org/software/hurd/hurd.html

The History of UNIX and GNU–Linux 5

which runs on top of Mach4). Developing this kernel has been a lot
harder than we expected, and we are still working on finishing it.5
...[M]any believe that once Linus Torvalds finished writing the kernel, his friends looked around for other free software, and for no
particular reason most everything necessary to make a UNIX-like
system was already available.
What they found was no accident—it was the GNU system. The
available free software6 added up to a complete system because the
GNU Project had been working since 1984 to make one. The GNU
Manifesto had set forth the goal of developing a free UNIX-like system, called GNU. The Initial Announcement of the GNU Project
also outlines some of the original plans for the GNU system. By the
time Linux was written, the [GNU] system was almost finished.7
Today the GNU “operating system” runs on top of the FreeBSD (www.freebsd.org)
and NetBSD (www.netbsd.org) kernels with complete Linux binary compatibility
and on top of Hurd pre-releases and Darwin (developer.apple.com/opensource) without this compatibility.

The Code Is Free
The tradition of free software dates back to the days when UNIX was released to universities at nominal cost, which contributed to its portability and success. This
tradition eventually died as UNIX was commercialized and manufacturers came to
regard the source code as proprietary, making it effectively unavailable. Another
problem with the commercial versions of UNIX related to their complexity. As each
manufacturer tuned UNIX for a specific architecture, the operating system became
less portable and too unwieldy for teaching and experimentation.
MINIX

Two professors created their own stripped-down UNIX look-alikes for educational
purposes: Doug Comer created XINU, and Andrew Tanenbaum created MINIX.
Linus Torvalds created Linux to counteract the shortcomings in MINIX. Every time
there was a choice between code simplicity and efficiency/features, Tanenbaum chose
simplicity (to make it easy to teach with MINIX), which meant this system lacked
many features people wanted. Linux went in the opposite direction.
You can obtain Linux at no cost over the Internet. You can also obtain the GNU code
via the U.S. mail at a modest cost for materials and shipping. You can support the
Free Software Foundation (www.fsf.org) by buying the same (GNU) code in higherpriced packages, and you can buy commercial packaged releases of Linux (called

4. www.gnu.org/software/hurd/gnumach.html
5. www.gnu.org/software/hurd/hurd-and-linux.html
6. See www.gnu.org/philosophy/free-sw.html.
7. www.gnu.org/gnu/linux-and-gnu.html

6 Chapter 1 Welcome to Linux and Mac OS X

distributions), such as Fedora/Red Hat Enterprise Linux, openSUSE, Debian, and
Ubuntu, that include installation instructions, software, and support.
GPL

Linux and GNU software are distributed under the terms of the GNU General Public
License (GPL, www.gnu.org/licenses/licenses.html). The GPL says you have the right
to copy, modify, and redistribute the code covered by the agreement. When you redistribute the code, however, you must also distribute the same license with the code,
thereby making the code and the license inseparable. If you download source code
from the Internet for an accounting program that is under the GPL and then modify
that code and redistribute an executable version of the program, you must also distribute the modified source code and the GPL agreement with it. Because this
arrangement is the reverse of the way a normal copyright works (it gives rights
instead of limiting them), it has been termed a copyleft. (This paragraph is not a legal
interpretation of the GPL; it is intended merely to give you an idea of how it works.
Refer to the GPL itself when you want to make use of it.)

Have Fun!
Two key words for Linux are “Have Fun!” These words pop up in prompts and documentation. The UNIX—now Linux—culture is steeped in humor that can be seen
throughout the system. For example, less is more—GNU has replaced the UNIX paging utility named more with an improved utility named less. The utility to view
PostScript documents is named ghostscript, and one of several replacements for the vi
editor is named elvis. While machines with Intel processors have “Intel Inside” logos
on their outside, some Linux machines sport “Linux Inside” logos. And Torvalds
himself has been seen wearing a T-shirt bearing a “Linus Inside” logo.

What Is So Good About Linux?
In recent years Linux has emerged as a powerful and innovative UNIX work-alike.
Its popularity has surpassed that of its UNIX predecessors. Although it mimics UNIX
in many ways, the Linux operating system departs from UNIX in several significant
ways: The Linux kernel is implemented independently of both BSD and System V, the
continuing development of Linux is taking place through the combined efforts of
many capable individuals throughout the world, and Linux puts the power of UNIX
within easy reach of both business and personal computer users. Using the Internet,
today’s skilled programmers submit additions and improvements to the operating
system to Linus Torvalds, GNU, or one of the other authors of Linux.
Standards

In 1985, individuals from companies throughout the computer industry joined
together to develop the POSIX (Portable Operating System Interface for Computer
Environments) standard, which is based largely on the UNIX System V Interface
Definition (SVID) and other earlier standardization efforts. These efforts were
spurred by the U.S. government, which needed a standard computing environment
to minimize its training and procurement costs. Released in 1988, POSIX is a group

What Is So Good About Linux? 7

of IEEE standards that define the API (application programming interface), shell, and
utility interfaces for an operating system. Although aimed at UNIX-like systems, the
standards can apply to any compatible operating system. Now that these standards
have gained acceptance, software developers are able to develop applications that run
on all conforming versions of UNIX, Linux, and other operating systems.
Applications

A rich selection of applications is available for Linux—both free and commercial—as
well as a wide variety of tools: graphical, word processing, networking, security,
administration, Web server, and many others. Large software companies have
recently seen the benefit in supporting Linux and now have on-staff programmers
whose job it is to design and code the Linux kernel, GNU, KDE, or other software
that runs on Linux. For example, IBM (www.ibm.com/linux) is a major Linux supporter. Linux conforms increasingly more closely to POSIX standards, and some
distributions and parts of others meet this standard. These developments indicate
that Linux is becoming mainstream and is respected as an attractive alternative to
other popular operating systems.

Peripherals

Another aspect of Linux that appeals to users is the amazing range of peripherals that is
supported and the speed with which support for new peripherals emerges. Linux often
supports a peripheral or interface card before any company does. Unfortunately
some types of peripherals—particularly proprietary graphics cards—lag in their support because the manufacturers do not release specifications or source code for
drivers in a timely manner, if at all.

Software

Also important to users is the amount of software that is available—not just source
code (which needs to be compiled) but also prebuilt binaries that are easy to install
and ready to run. These programs include more than free software. Netscape, for
example, was available for Linux from the start and included Java support before it
was available from many commercial vendors. Its sibling Mozilla/Thunderbird/
Firefox is now a viable browser, mail client, and newsreader, performing many other
functions as well.

Platforms

Linux is not just for Intel-based platforms (which now include Apple computers): It
has been ported to and runs on the Power PC—including older Apple computers
(ppclinux), Compaq’s (née Digital Equipment Corporation) Alpha-based machines,
MIPS-based machines, Motorola’s 68K-based machines, various 64-bit systems, and
IBM’s S/390. Nor is Linux just for single-processor machines: As of version 2.0, it
runs on multiple-processor machines (SMPs; page 1093). It also includes an O(1)
scheduler, which dramatically increases scalability on SMP systems.

Emulators

Linux supports programs, called emulators, that run code intended for other operating systems. By using emulators you can run some DOS, Windows, and Macintosh
programs under Linux. For example, Wine (www.winehq.com) is an open-source
implementation of the Windows API that runs on top of the X Window System and
UNIX/Linux.

Virtual machines

A virtual machine (VM or guest) appears to the user and to the software running on it
as a complete physical machine. It is, however, one of potentially many such VMs running on a single physical machine (the host). The software that provides the

8 Chapter 1 Welcome to Linux and Mac OS X

virtualization is called a virtual machine monitor (VMM) or hypervisor. Each VM can
run a different operating system from the other VMs. For example, on a single host you
could have VMs running Windows 7, Ubuntu 12.10, Ubuntu 13.04, and Fedora 17.
A multitasking operating system allows you to run many programs on a single physical system. Similarly, a hypervisor allows you to run many operating systems (VMs)
on a single physical system.
VMs provide many advantages over single, dedicated machines:
• Isolation—Each VM is isolated from the other VMs running on the same
host: Thus, if one VM crashes or is compromised, the others are not
affected.
• Security—When a single server system running several servers is compromised, all servers are compromised. If each server is running on its own VM,
only the compromised server is affected; other servers remain secure.
• Power consumption—Using VMs, a single powerful machine can replace
many less powerful machines, thereby cutting power consumption.
• Development and support—Multiple VMs, each running a different version
of an operating system and/or different operating systems, can facilitate
development and support of software designed to run in many environments. With this organization you can easily test a product in different
environments before releasing it. Similarly, when a user submits a bug, you
can reproduce the bug in the same environment it occurred in.
• Servers—In some cases, different servers require different versions of system
libraries. In this instance, you can run each server on its own VM, all on a
single piece of hardware.
• Testing—Using VMs, you can experiment with cutting-edge releases of
operating systems and applications without concern for the base (stable)
system, all on a single machine.
• Networks—You can set up and test networks of systems on a single
machine.
• Sandboxes—A VM presents a sandbox—an area (system) that you can work
in without regard for the results of your work or for the need to clean up.
• Snapshots—You can take snapshots of a VM and return the VM to the state
it was in when you took the snapshot simply by reloading the VM from the
snapshot.
Xen

Xen, which was created at the University of Cambridge and is now being developed
in the open-source community, is an open-source virtual machine monitor (VMM).
A VMM enables several virtual machines (VMs), each running an instance of a separate operating system, to run on a single computer. Xen introduces minimal
performance overhead when compared with running each of the operating systems

What Is So Good About Linux? 9

natively. For more information on Xen, refer to the Xen home page at
www.cl.cam.ac.uk/research/srg/netos/xen and wiki.xen.org.
VMware

VMware, Inc. (www.vmware.com) offers VMware Server, a free, downloadable, proprietary product you can install and run as an application under Linux. VMware
Server enables you to install several VMs, each running a different operating system,
including Windows and Linux. VMware also offers a free VMware player that
enables you to run VMs you create using VMware Server.

KVM

The Kernel-based Virtual Machine (KVM; www.linux-kvm.org and libvirt.org) is an
open-source VM and runs as part of the Linux kernel.

Qemu

Qemu (wiki.qemu.org), written by Fabrice Bellard, is an open-source VMM that runs
as a user application with no CPU requirements. It can run code written for a different CPU than that of the host machine.

VirtualBox

VirtualBox (www.virtualbox.org) is a VM developed by Sun Microsystems. If you
want to run a virtual instance of Windows, you might want to investigate VirtualBox.

Why Linux Is Popular with Hardware Companies
and Developers
Two trends in the computer industry set the stage for the growing popularity of
UNIX and Linux. First, advances in hardware technology created the need for an
operating system that could take advantage of available hardware power. In the mid1970s, minicomputers began challenging the large mainframe computers because, in
many applications, minicomputers could perform the same functions less expensively. More recently, powerful 64-bit processor chips, plentiful and inexpensive
memory, and lower-priced hard disk storage have allowed hardware companies to
install multiuser operating systems on desktop computers.
Proprietary
operating systems

Second, with the cost of hardware continually dropping, hardware manufacturers
could no longer afford to develop and support proprietary operating systems. A proprietary operating system is one that is written and owned by the manufacturer of
the hardware (for example, DEC/Compaq owns VMS). Today’s manufacturers need
a generic operating system they can easily adapt to their machines.

Generic operating
systems

A generic operating system is written outside of the company manufacturing the
hardware and is sold (UNIX, OS X, Windows) or given (Linux) to the manufacturer.
Linux is a generic operating system because it runs on different types of hardware
produced by different manufacturers. Of course, if manufacturers can pay only for
development and avoid per-unit costs (which they have to pay to Microsoft for each
copy of Windows they sell), they are much better off. In turn, software developers
need to keep the prices of their products down; they cannot afford to create new versions of their products to run under many different proprietary operating systems.
Like hardware manufacturers, software developers need a generic operating system.
Although the UNIX system once met the needs of hardware companies and researchers for a generic operating system, over time it has become more proprietary as

10 Chapter 1 Welcome to Linux and Mac OS X

manufacturers added support for their own specialized features and introduced new
software libraries and utilities. Linux emerged to serve both needs: It is a generic
operating system that takes advantage of available hardware power.

Linux Is Portable
A portable operating system is one that can run on many different machines. More
than 95 percent of the Linux operating system is written in the C programming language, and C is portable because it is written in a higher-level, machine-independent
language. (The C compiler is written in C.)
Because Linux is portable, it can be adapted (ported) to different machines and can
meet special requirements. For example, Linux is used in embedded computers, such
as the ones found in cellphones, PDAs, and the cable boxes on top of many TVs. The
file structure takes full advantage of large, fast hard disks. Equally important, Linux
was originally designed as a multiuser operating system—it was not modified to serve
several users as an afterthought. Sharing the computer’s power among many users
and giving them the ability to share data and programs are central features of the
system.
Because it is adaptable and takes advantage of available hardware, Linux runs on
many different microprocessor-based systems as well as mainframes. The popularity
of the microprocessor-based hardware drives Linux; these microcomputers are getting faster all the time at about the same price point. This widespread acceptance
benefits both users, who do not like having to learn a new operating system for each
vendor’s hardware, and system administrators, who like having a consistent software
environment.
The advent of a standard operating system has given a boost to the development of
the software industry. Now software manufacturers can afford to make one version
of a product available on machines from different manufacturers.

The C Programming Language
Ken Thompson wrote the UNIX operating system in 1969 in PDP-7 assembly language. Assembly language is machine-dependent: Programs written in assembly
language work on only one machine or, at best, on one family of machines. For this
reason, the original UNIX operating system could not easily be transported to run
on other machines: It was not portable.
To make UNIX portable, Thompson developed the B programming language, a
machine-independent language, from the BCPL language. Dennis Ritchie developed
the C programming language by modifying B and, with Thompson, rewrote UNIX
in C in 1973. Originally, C was touted as a “portable assembler.” The revised operating system could be transported more easily to run on other machines.
That development marked the start of C. Its roots reveal some of the reasons why it
is such a powerful tool. C can be used to write machine-independent programs. A programmer who designs a program to be portable can easily move it to any computer

Overview of Linux 11

Compilers

Database
Management
Systems

Word
Processors

Mail and
Message
Facilities

Shells

Linux Kernel

Hardware
Figure 1-1

A layered view of the Linux operating system

that has a C compiler. C is also designed to compile into very efficient code. With the
advent of C, a programmer no longer had to resort to assembly language to produce
code that would run well (that is, quickly—although an assembler will always generate more efficient code than a high-level language).
C is a good systems language. You can write a compiler or an operating system in C.
It is a highly structured but not necessarily a high-level language. C allows a programmer to manipulate bits and bytes, as is necessary when writing an operating system.
At the same time, it has high-level constructs that allow for efficient, modular
programming.
In the late 1980s the American National Standards Institute (ANSI) defined a standard version of the C language, commonly referred to as ANSI C or C89 (for the year
the standard was published). Ten years later the C99 standard was published; it is
mostly supported by the GNU Project’s C compiler (named gcc). The original version
of the language is often referred to as Kernighan & Ritchie (or K&R) C, named for
the authors of the book that first described the C language.
Another researcher at Bell Labs, Bjarne Stroustrup, created an object-oriented programming language named C++, which is built on the foundation of C. Because
object-oriented programming is desired by many employers today, C++ is preferred
over C in many environments. Another language of choice is Objective-C, which was
used to write the first Web browser. The GNU Project’s C compiler supports C, C++,
and Objective-C.

Overview of Linux
The Linux operating system has many unique and powerful features. Like other
operating systems, it is a control program for computers. But like UNIX, it is also
a well-thought-out family of utility programs (Figure 1-1) and a set of tools that
allow users to connect and use these utilities to build systems and applications.

12 Chapter 1 Welcome to Linux and Mac OS X

Linux Has a Kernel Programming Interface
The Linux kernel—the heart of the Linux operating system—is responsible for allocating the computer’s resources and scheduling user jobs so each one gets its fair share
of system resources, including access to the CPU; peripheral devices, such as hard
disk, DVD, and tape storage; and printers. Programs interact with the kernel through
system calls, special functions with well-known names. A programmer can use a single system call to interact with many kinds of devices. For example, there is one
write() system call, rather than many device-specific ones. When a program issues a
write() request, the kernel interprets the context and passes the request to the appropriate device. This flexibility allows old utilities to work with devices that did not
exist when the utilities were written. It also makes it possible to move programs to
new versions of the operating system without rewriting them (provided the new version recognizes the same system calls).

Linux Can Support Many Users
Depending on the hardware and the types of tasks the computer performs, a Linux
system can support from 1 to more than 1,000 users, each concurrently running a different set of programs. The per-user cost of a computer that can be used by many
people at the same time is less than that of a computer that can be used by only a single person at a time. It is less because one person cannot generally take advantage of
all the resources a computer has to offer. That is, no one can keep all the printers
going constantly, keep all the system memory in use, keep all the disks busy reading
and writing, keep the Internet connection in use, and keep all the terminals busy at
the same time. By contrast, a multiuser operating system allows many people to use
all of the system resources almost simultaneously. The use of costly resources can be
maximized, and the cost per user can be minimized—the primary objectives of a
multiuser operating system.

Linux Can Run Many Tasks
Linux is a fully protected multitasking operating system, allowing each user to run
more than one job at a time. Processes can communicate with one another but remain
fully protected from one another, just as the kernel remains protected from all processes. You can run several jobs in the background while giving all your attention to
the job being displayed on the screen, and you can switch back and forth between
jobs. If you are running the X Window System (page 16), you can run different programs in different windows on the same screen and watch all of them. This capability
helps users be more productive.

Linux Provides a Secure Hierarchical Filesystem
A file is a collection of information, such as text for a memo or report, an accumulation of sales figures, an image, a song, or an executable program. Each file is
stored under a unique identifier on a storage device, such as a hard disk. The Linux

Overview of Linux 13

/

home

tmp

etc

max

sam

hls

bin

report

Figure 1-2

notes

log

The Linux filesystem structure

filesystem provides a structure whereby files are arranged under directories, which
are like folders or boxes. Each directory has a name and can hold other files and
directories. Directories, in turn, are arranged under other directories and so forth
in a treelike organization. This structure helps users keep track of large numbers
of files by grouping related files in directories. Each user has one primary directory
and as many subdirectories as required (Figure 1-2).
Standards

With the idea of making life easier for system administrators and software developers, a group got together over the Internet and developed the Linux Filesystem
Standard (FSSTND), which has since evolved into the Linux Filesystem Hierarchy
Standard (FHS). Before this standard was adopted, key programs were located in different places in different Linux distributions. Today you can sit down at a Linux
system and expect to find any given standard program at a consistent location
(page 95).

Links

A link allows a given file to be accessed by means of two or more names. The alternative names can be located in the same directory as the original file or in another
directory. Links can make the same file appear in several users’ directories, enabling
those users to share the file easily. Windows uses the term shortcut in place of link to
describe this capability. Macintosh users will be more familiar with the term alias.
Under Linux, an alias is different from a link; it is a command macro feature provided
by the shell (page 342).

Security

Like most multiuser operating systems, Linux allows users to protect their data from
access by other users. It also allows users to share selected data and programs with
certain other users by means of a simple but effective protection scheme. This level
of security is provided by file access permissions, which limit the users who can read
from, write to, or execute a file. Linux also implements ACLs (Access Control Lists),
which give users and administrators finer-grained control over file access
permissions.

14 Chapter 1 Welcome to Linux and Mac OS X

The Shell: Command Interpreter and Programming Language
In a textual environment, the shell—the command interpreter—acts as an interface
between you and the operating system. When you enter a command on the screen,
the shell interprets the command and calls the program you want. A number of shells
are available for Linux. The four most popular shells are
• The Bourne Again Shell (bash), an enhanced version of the original Bourne
Shell (an original UNIX shell).
• The Debian Almquist Shell (dash; page 277), a smaller version of bash with
fewer features. Many startup shell scripts call dash in place of bash to speed
the boot process.
• The TC Shell (tcsh; Chapter 9), an enhanced version of the C Shell, developed as part of BSD UNIX.
• The Z Shell (zsh), which incorporates features from a number of shells,
including the Korn Shell.
Because different users might prefer different shells, multiuser systems can have several different shells in use at any given time. The choice of shells demonstrates one
of the advantages of the Linux operating system: the ability to provide a customized
interface for each user.
Shell scripts

Besides performing its function of interpreting commands from a keyboard and sending those commands to the operating system, the shell is a high-level programming
language. Shell commands can be arranged in a file for later execution (Linux calls
these files shell scripts; Windows calls them batch files). This flexibility allows users
to perform complex operations with relative ease, often by issuing short commands,
or to build with surprisingly little effort elaborate programs that perform highly complex operations.

Filename Generation
Wildcards and
ambiguous file
references

When you type commands to be processed by the shell, you can construct patterns
using characters that have special meanings to the shell. These characters are called
wildcard characters. The patterns, which are called ambiguous file references, are a
kind of shorthand: Rather than typing in complete filenames, you can type patterns;
the shell expands these patterns into matching filenames. An ambiguous file reference
can save you the effort of typing in a long filename or a long series of similar filenames. For example, the shell might expand the pattern mak* to make-3.80.tar.gz.
Patterns can also be useful when you know only part of a filename or cannot remember the exact spelling of a filename.

Completion
In conjunction with the Readline library, the shell performs command, filename,
pathname, and variable completion: You type a prefix and press TAB, and the shell lists
the items that begin with that prefix or completes the item if the prefix specifies a
unique item.

Overview of Linux 15

Device-Independent Input and Output
Redirection

Devices (such as a printer or a terminal) and disk files appear as files to Linux programs. When you give a command to the Linux operating system, you can instruct
it to send the output to any one of several devices or files. This diversion is called
output redirection.

Device
independence

In a similar manner, a program’s input, which normally comes from a keyboard, can
be redirected so that it comes from a disk file instead. Input and output are device
independent; that is, they can be redirected to or from any appropriate device.
As an example, the cat utility normally displays the contents of a file on the screen.
When you run a cat command, you can easily cause its output to go to a disk file
instead of the screen.

Shell Functions
One of the most important features of the shell is that users can use it as a programming language. Because the shell is an interpreter, it does not compile programs
written for it but rather interprets programs each time they are loaded from the disk.
Loading and interpreting programs can be time-consuming.
Many shells, including the Bourne Again Shell, support shell functions that the shell
holds in memory so it does not have to read them from the disk each time you execute
them. The shell also keeps functions in an internal format so it does not have to spend
as much time interpreting them.

Job Control
Job control is a shell feature that allows users to work on several jobs at once, switching back and forth between them as desired. When you start a job, it is frequently
run in the foreground so it is connected to the terminal. Using job control, you can
move the job you are working with to the background and continue running it there
while working on or observing another job in the foreground. If a background job
then needs your attention, you can move it to the foreground so it is once again
attached to the terminal. The concept of job control originated with BSD UNIX,
where it appeared in the C Shell.

A Large Collection of Useful Utilities
Linux includes a family of several hundred utility programs, often referred to as
commands. These utilities perform functions that are universally required by users.
The sort utility, for example, puts lists (or groups of lists) in alphabetical or numerical order and can be used to sort lists by part number, last name, city, ZIP code,
telephone number, age, size, cost, and so forth. The sort utility is an important programming tool that is part of the standard Linux system. Other utilities allow users
to create, display, print, copy, search, and delete files as well as to edit, format, and
typeset text. The man (for manual) and info utilities provide online documentation
for Linux.

16 Chapter 1 Welcome to Linux and Mac OS X

Interprocess Communication
Pipelines and filters

Linux enables users to establish both pipelines and filters on the command line. A
pipeline passes the output of one program to another program as input. A filter is a
special kind of pipeline that processes a stream of input data to yield a stream of output data. A filter processes another program’s output, altering it as a result. The
filter’s output then becomes input to another program.
Pipelines and filters frequently join utilities to perform a specific task. For example,
you can use a pipeline to send the output of the sort utility to head (a filter that lists
the first ten lines of its input); you can then use another pipeline to send the output
of head to a third utility, lpr, that sends the data to a printer. Thus, in one command
line, you can use three utilities together to sort and print part of a file.

System Administration
On a Linux system the system administrator is frequently the owner and only user
of the system. This person has many responsibilities. The first responsibility might be
to set up the system, install the software, and possibly edit configuration files. Once
the system is up and running, the system administrator is responsible for downloading and installing software (including upgrading the operating system), backing up
and restoring files, and managing such system facilities as printers, terminals, servers,
and a local network. The system administrator is also responsible for setting up
accounts for new users on a multiuser system, bringing the system up and down as
needed, monitoring the system, and taking care of any problems that arise.

Additional Features of Linux
The developers of Linux included features from BSD, System V, and Sun Microsystems’ Solaris, as well as new features, in their operating system. Although most of the
tools found on UNIX exist for Linux, in some cases these tools have been replaced
by more modern counterparts. This section describes some of the popular tools and
features available under Linux.

GUIs: Graphical User Interfaces
The X Window System (also called X or X11) was developed in part by researchers
at MIT (Massachusetts Institute of Technology) and provides the foundation for the
GUIs available with Linux. Given a terminal or workstation screen that supports X,
a user can interact with the computer through multiple windows on the screen, display graphical information, or use special-purpose applications to draw pictures,
monitor processes, or preview formatted output. X is an across-the-network protocol
that allows a user to open a window on a workstation or computer system that is
remote from the CPU generating the window.

Additional Features of Linux

17

Aqua

Most Macintosh users are familiar with Aqua, the standard Mac OS X graphical
interface. Aqua is based on a rendering technology named Quartz and has a standard look and feel for applications. By default, X11 is not installed on a Macintosh;
you can use XQuartz in its place (xquartz.macosforge.org/trac/wiki).

Desktop manager

Usually two layers run on top of X: a desktop manager and a window manager. A
desktop manager is a picture-oriented user interface that enables you to interact with
system programs by manipulating icons instead of typing the corresponding commands to a shell. Many Linux distributions run the GNOME desktop manager
(www.gnome.org) by default, but X can also run KDE (www.kde.org) and a number
of other desktop managers. Mac OS X handles the desktop in Aqua, not in X11, so
there is no desktop manager under X11.

Window manager

A window manager is a program that runs under the desktop manager and allows
you to open and close windows, run programs, and set up a mouse so it has different
effects depending on how and where you click it. The window manager also gives the
screen its personality. Whereas Microsoft Windows allows you to change the color
of key elements in a window, a window manager under X allows you to customize
the overall look and feel of the screen: You can change the way a window looks and
works (by giving it different borders, buttons, and scrollbars), set up virtual desktops,
create menus, and more. When you are working from the command line, you can
approximate a window manager by using Midnight Commander (mc; page 879).
Several popular window managers run under X and Linux. Many Linux distributions
provide both Metacity (the default under GNOME 2) and kwin (the default under
KDE). In addition to KDE, Fedora provides Mutter (the default under GNOME 3).
Mutter is short for Metacity Clutter (the graphics library is named Clutter). Other
window managers, such as Sawfish and WindowMaker, are also available.
Under Mac OS X, most windows are managed by a Quartz layer, which applies the
Apple Aqua look and feel. For X11 applications only, this task is performed by
quartz-wm, which mimics the Apple Aqua look and feel so X11 applications on the
Mac desktop have the same appearance as native Mac OS X applications.

(Inter)Networking Utilities
Linux network support includes many utilities that enable you to access remote systems over a variety of networks. In addition to sending email to users on other
systems, you can access files on disks mounted on other computers as if they were
located on the local system, make your files available to other systems in a similar
manner, copy files back and forth, run programs on remote systems while displaying
the results on the local system, and perform many other operations across local area
networks (LANs) and wide area networks (WANs), including the Internet.
Layered on top of this network access is a wide range of application programs that
extend the computer’s resources around the globe. You can carry on conversations
with people throughout the world, gather information on a wide variety of subjects,
and download new software over the Internet quickly and reliably.

18 Chapter 1 Welcome to Linux and Mac OS X

Software Development
One of Linux’s most impressive strengths is its rich software development environment. Linux supports compilers and interpreters for many computer languages.
Besides C and C++, languages available for Linux include Ada, Fortran, Java, Lisp,
Pascal, Perl, and Python. The bison utility generates parsing code that makes it easier
to write programs to build compilers (tools that parse files containing structured
information). The flex utility generates scanners (code that recognizes lexical patterns
in text). The make utility and the GNU Configure and Build System make it easier to
manage complex development projects. Source code management systems, such as
CVS, simplify version control. Several debuggers, including ups and gdb, can help you
track down and repair software defects. The GNU C compiler (gcc) works with the
gprof profiling utility to help programmers identify potential bottlenecks in a program’s performance. The C compiler includes options to perform extensive checking
of C code, thereby making the code more portable and reducing debugging time.
Under OS X, Apple’s Xcode development environment provides a unified graphical
front end to most of these tools as well as other options and features.

Chapter Summary
The Linux operating system grew out of the UNIX heritage to become a popular alternative to traditional systems (that is, Windows) available for microcomputer (PC)
hardware. UNIX users will find a familiar environment in Linux. Distributions of
Linux contain the expected complement of UNIX utilities, contributed by programmers around the world, including the set of tools developed as part of the GNU Project.
The Linux community is committed to the continued development of this system. Support for new microcomputer devices and features is added soon after the hardware
becomes available, and the tools available on Linux continue to be refined. Given the
many commercial software packages available to run on Linux platforms and the many
hardware manufacturers offering Linux on their systems, it is clear that the system has
evolved well beyond its origin as an undergraduate project to become an operating system of choice for academic, commercial, professional, and personal use.

Exercises
1. What is free software? List three characteristics of free software.
2. Why is Linux popular? Why is it popular in academia?
3. What are multiuser systems? Why are they successful?

Exercises 19

4. What is the Free Software Foundation/GNU? What is Linux? Which parts
of the Linux operating system did each provide? Who else has helped build
and refine this operating system?
5. In which language is Linux written? What does the language have to do
with the success of Linux?
6. What is a utility program?
7. What is a shell? How does it work with the kernel? With the user?
8. How can you use utility programs and a shell to create your own
applications?
9. Why is the Linux filesystem referred to as hierarchical?
10. What is the difference between a multiuser and a multitasking system?
11. Give an example of when you would want to use a multitasking system.
12. Approximately how many people wrote Linux? Why is this project unique?
13. What are the key terms of the GNU General Public License?

PART I
The Linux and Mac OS X
Operating Systems
CHAPTER 2
Getting Started
CHAPTER 3
The Utilities

49

CHAPTER 4
The Filesystem
CHAPTER 5
The Shell

23

125

81

2
Getting Started

2

In This Chapter

Objectives

Conventions Used in This Book . . . 24

After reading this chapter you should be able to:

Logging In from a Terminal
(Emulator) . . . . . . . . . . . . . . . . . . 26
Working from the Command Line . . 28
su/sudo: Curbing Your Power (root
Privileges). . . . . . . . . . . . . . . . . . . . 32

 Log in on a Linux system using the textual interface
 Describe the advantages of the textual interface
 Correct typing mistakes on the command line
 Use kill to abort program execution using the termination
signal

man: Displays the System
Manual . . . . . . . . . . . . . . . . . . . . . 33

 Repeat and edit previous command lines

info: Displays Information About
Utilities. . . . . . . . . . . . . . . . . . . . . 36

 Understand the need to be careful when working with
root privileges

The ––help Option . . . . . . . . . . . . . 38

 Use man and info to display information about utilities

HOWTOs . . . . . . . . . . . . . . . . . . . . . 41
What to Do If You Cannot Log In. . . 42

 Use the ––help option to display information about a
utility

Changing Your Password . . . . . . . . 44

 Change your password from the command line

24 Chapter 2 Getting Started

One way or another you are sitting in front of a screen that is connected to a computer
running Linux. You might be working with a graphical user interface (GUI) or a textual
interface. This book is about the textual interface, also called the command-line interface
(CLI). If you are working with a GUI, you will need to use a terminal emulator such as
xterm, Konsole, GNOME Terminal, Terminal (under Mac OS X), or a virtual console
(page 43) to follow the examples in this book.
This chapter starts with a discussion of the typographical conventions used in this
book, followed by a section about logging in on the system. The next section introduces
the shell and explains how to fix mistakes on the command line and repeat previous
command lines. Next come a brief reminder about the powers of working with root
privileges and suggestions about how to avoid making mistakes that will make your
system inoperable or hard to work with. The chapter continues with a discussion about
where to find more information about Linux. It concludes with additional information
on logging in, including how to change your password.
Be sure to read the warning on page 32 about the dangers of misusing the powers of
working with root privileges. While heeding that warning, feel free to experiment with
the system: Give commands, create files, follow the examples in this book, and have fun.

Conventions Used in This Book
This book uses conventions to make its explanations shorter and clearer. The following
paragraphs describe these conventions.
Mac OS X versions

References to Mac OS X refer to version 10.8 (Mountain Lion). Because the book
focuses on the underlying operating system, which changes little from one release of
OS X to the next, the text will remain relevant through several future releases. The
author’s Web site (www.sobell.com) provides corrections and updates as appropriate.

Text and examples

The text is set in this type, whereas examples are shown in a
called a fixed-width font):

monospaced font

(also

$ cat practice
This is a small file I created
with a text editor.
Items you enter

Everything you enter at the keyboard is shown in a bold typeface. Within the text, this
bold typeface is used; within examples and screens, this one is used. In the previous
example, the dollar sign ($) on the first line is a prompt that Linux displays, so it is
not bold; the remainder of the first line is entered by a user, so it is bold.

Utility names

Names of utilities are printed in this sans serif typeface. This book references the
emacs text editor and the ls utility or ls command (or just ls) but instructs you to enter
ls –a on the command line. In this way the text distinguishes between utilities, which
are programs, and the instructions you enter on the command line to invoke the
utilities.

Conventions Used in This Book 25
Filenames

Filenames appear in a bold typeface. Examples are memo5, letter.1283, and reports.
Filenames might include uppercase and lowercase letters; however, Linux is case sensitive (page 1060), so memo5, MEMO5, and Memo5 name three different files.
The default Mac OS X filesystem, HFS+, is not case sensitive; under OS X, memo5,
MEMO5, and Memo5 refer to the same file. For more information refer to “Case
Sensitivity” on page 1043.

Character strings

Keys and characters

Within the text, characters and character strings are marked by putting them in a
bold typeface. This convention avoids the need for quotation marks or other delimiters before and after a string. An example is the following string, which is displayed
by the passwd utility: Sorry, passwords do not match.
This book uses SMALL CAPS for three kinds of items:
• Keyboard keys, such as the SPACE bar and the RETURN,1 ESCAPE, and TAB keys.
• The characters that keys generate, such as the SPACEs generated by the SPACE
bar.
• Keyboard keys that you press simultaneously with the CONTROL key, such as
CONTROL-D. (Even though D is shown as an uppercase letter, you do not have
to press the SHIFT key; enter CONTROL-D by holding the CONTROL key down and
pressing d.)

Prompts and RETURNs

Most examples include the shell prompt—the signal that Linux is waiting for a
command—as a dollar sign ($), a hashmark (#), or sometimes a percent sign (%).
The prompt does not appear in a bold typeface in this book because you do not
enter it. Do not type the prompt on the keyboard when you are experimenting with
examples from this book. If you do, the examples will not work.
Examples omit the RETURN keystroke that you must use to execute them. An example
of a command line is
$ vim memo.1204

To use this example as a model for running the vim text editor, enter the command
vim memo.1204 (some systems use vim.tiny in place of vim) and press the RETURN key.
(Press ESCAPE ZZ to exit from vim; see page 161 for a vim tutorial.) This method of displaying commands makes the examples in the book correspond to what appears on
the screen.
Definitions

All glossary entries marked with FOLDOC are courtesy of Denis Howe, editor of the Free
Online Dictionary of Computing (foldoc.org), and are used with permission. This site
is an ongoing work containing definitions, anecdotes, and trivia.
1. Different keyboards use different keys to move the cursor (page 1064) to the beginning of the next line.
This book always refers to the key that ends a line as the RETURN key. The keyboard you are using might
have a RET, NEWLINE, ENTER, RETURN , or some other key. Use the corresponding key on your keyboard each time
this book asks you to press RETURN.

26 Chapter 2 Getting Started

optional

Optional Information
Passages marked as optional appear in a gray box. This material is not central to the
ideas presented in the chapter and often involves more challenging concepts. A good
strategy when reading a chapter is to skip the optional sections and then return to
them when you are comfortable with the main ideas presented in the chapter. This is
an optional paragraph.

URLs (Web
addresses)

Web addresses, or URLs, have an implicit http:// prefix, unless ftp:// or https:// is
shown. You do not normally need to specify a prefix when the prefix is http://, but
you must use a prefix in a browser when you specify an FTP or secure HTTP site.
Thus you can specify a URL in a browser exactly as it is shown in this book.

ls output

This book uses the output of ls –l commands as produced by including the option
––time-style=iso. This option produces shorter lines, making the examples more
readable.

Tip, caution, and
security boxes

The following boxes highlight information that might be helpful while you are using
or administrating a Linux system.

This is a tip box
tip A tip box might help you avoid repeating a common mistake or might point toward additional
information.

This box warns you about something
caution A caution box warns you about a potential pitfall.
This box marks a security note
security A security box highlights a potential security issue. These notes are usually intended for system
administrators, but some apply to all users.

Logging In from a Terminal (Emulator)
Above the login prompt on a terminal, terminal emulator, or other textual device,
many systems display a message called issue (stored in the /etc/issue file). This message usually identifies the version of Linux running on the system, the name of the
system, and the device you are logging in on. A sample issue message follows:
Fedora release 16 (Verne)
Kernel 3.3.2-6.fc16.i686 on an i686 (tty4)

The issue message is followed by a prompt to log in. Enter your username and
password in response to the system prompts. Make sure you enter your username
and password as they were specified when your account was set up; the routine
that verifies the username and password is case sensitive. Like most systems,
Linux does not display your password when you enter it. By default Mac OS X
does not allow remote logins (page 1050).

Logging In from a Terminal (Emulator) 27

The following example shows Max logging in on the system named tiny:
tiny login: max
Password:
Last login: Wed Mar 13 19:50:38 from plum
[max@tiny max]$

If you are using a terminal (page 1097) and the screen does not display the login:
prompt, check whether the terminal is plugged in and turned on, and then press the
RETURN key a few times. If login: still does not appear, try pressing CONTROL-Q (Xon).

Did you log in last?
security As you are logging in to a textual environment, after you enter your username and password, the
system displays information about the last login on this account, showing when it took place and
where it originated. You can use this information to determine whether anyone has accessed the
account since you last used it. If someone has, perhaps an unauthorized user has learned your
password and logged in as you. In the interest of maintaining security, advise the system administrator of any circumstances that make you suspicious and change your password.

If you are using a Mac, PC, another Linux system, or a workstation (page 1102),
open the program that runs ssh (secure; page 953), telnet (not secure; page 974), or
whichever communications/emulation software you use to log in on the system, and
give it the name or IP address (page 1075) of the system you want to log in on.

telnet is not secure
security One of the reasons telnet is not secure is that it sends your username and password over the network
in cleartext (page 1061) when you log in, allowing someone to capture your login information and log
in on your account. The ssh utility encrypts all information it sends over the network and, if available,
is a better choice than telnet. The ssh program has been implemented on many operating systems,
not just Linux. Many user interfaces to ssh include a terminal emulator.

Following is an example of logging in using ssh from a Linux system:
$ ssh max@tiny
max@tiny's password:
Permission denied, please try again.
max@tiny's password:
Last login: Wed Mar 13 21:21:49 2005 from plum
[max@tiny max]$

In the example Max mistyped his password, received an error message and another
prompt, and then retyped the password correctly. If your username is the same on
the system you are logging in from and the system you are logging in on, you can omit
your username and the following at sign (@). In the example, Max could have given
the command ssh tiny.
After you log in, the shell prompt (or just prompt) appears, indicating you have successfully logged in; it shows the system is ready for you to give a command. The first
shell prompt might be preceded by a short message called the message of the day, or
motd, which is stored in the /etc/motd file.

28 Chapter 2 Getting Started

The usual prompt is a dollar sign ($). Do not be concerned if you have a different
prompt; the examples in this book will work regardless of which prompt the system
displays. In the previous example, the $ prompt (last line) is preceded by the username
(max), an at sign (@), the system name (tiny), and the name of the directory Max is
working in (max). For information on how to change the prompt, refer to page 309
(bash) or page 393 (tcsh).

Make sure TERM is set correctly
tip The TERM shell variable establishes the pseudographical characteristics of a character-based
terminal or terminal emulator. Typically TERM is set for you—you do not have to set it manually.
If things on the screen do not look right, refer to “Specifying a Terminal” on page 1024.

Working from the Command Line
Before the introduction of the graphical user interface, UNIX and then Linux provided only a textual (command-line) interface. Today, a textual interface is available
when you log in from a terminal, a terminal emulator, or a textual virtual console,
or when you use ssh or telnet to log in on a system.
Advantages
of the textual
interface

Although the concept might seem antiquated, the textual interface has a place in
modern computing. In some cases an administrator might use a command-line tool
either because a graphical equivalent does not exist or because the graphical tool is
not as powerful or flexible as the textual one. For example, chmod (pages 99 and 741)
is more powerful and flexible than its GUI counterpart. Frequently, on a server system, a graphical interface might not even be installed. The first reason for this
omission is that a GUI consumes a lot of system resources; on a server, those resources
are better dedicated to the main task of the server. Additionally, security considerations mandate that a server system run as few tasks as possible because each
additional task can make the system more vulnerable to attack.
You can also write scripts using the textual interface. Using scripts, you can easily
reproduce tasks on multiple systems, enabling you to scale the tasks to larger environments. When you are the administrator of only a single system, using a GUI is often
the easiest way to configure the system. When you act as administrator for many systems, all of which need the same configuration installed or updated, a script can make
the task go more quickly. Writing a script using command-line tools is frequently easy,
whereas the same task can be difficult to impossible using graphical tools.

Pseudographical
interface

Before the introduction of GUIs, resourceful programmers created textual interfaces
that included graphical elements such as boxes, borders outlining rudimentary windows, highlights, and, more recently, color. These textual interfaces, called
pseudographical interfaces, bridge the gap between textual and graphical interfaces.
The Midnight Commander file management utility (mc; page 879) is a good example
of a utility with a well-designed pseudographical interface.

Working from the Command Line 29

Which Shell Are You Running?
This book discusses both the Bourne Again Shell (bash) and the TC Shell (tcsh). You
are probably running bash, but you might be running tcsh or another shell such as
the Z Shell (zsh). When you enter echo $0 and press RETURN in response to a shell
prompt (usually $ or %), the shell displays the name of the shell you are working
with. This command works because the shell expands $0 to the name of the program
you are running (page 458). This command might display output like this:
$ echo $0
-bash

Or the local system might display output like this:
$ echo $0
/bin/bash

Either way, this output shows you are running bash. If you are running a different
shell, the shell will display appropriate output.

Correcting Mistakes
This section explains how to correct typographical and other errors you might make
while you are logged in on a textual display. Because the shell and most other utilities
do not interpret the command line or other text you enter until you press RETURN, you
can readily correct a typing mistake before you press RETURN.
You can correct such mistakes in several ways: Erase one character at a time, back
up a word at a time, or back up to the beginning of the line in one step. After you
press RETURN, it is too late to correct a mistake: At that point, you must either wait for
the command to run to completion or abort execution of the program (next page).

Erasing a Character
While entering characters from the keyboard, you can back up and erase a mistake
by pressing the erase key once for each character you want to delete. The erase key
backs over as many characters as you wish. It does not, in general, back up past the
beginning of the line.
The default erase key is BACKSPACE. If this key does not work, try pressing DEL or CONTROL-H.
If these keys do not work, give the following stty2 command to set the erase and line
kill (see “Deleting a Line”) keys to their default values:
$ stty ek

2. The command stty is an abbreviation for set teletypewriter, the first terminal UNIX ran on. Today stty
is commonly thought of as meaning set terminal.

30 Chapter 2 Getting Started

Alternatively, you can give the next command to reset most terminal parameters to a sane
value. If the RETURN key does not move the cursor to the next line, press CONTROL-J instead.
$ stty sane

See page 962 for more examples of using stty.

Deleting a Word
You can delete a word you entered by pressing CONTROL-W. A word is any sequence of
characters that does not contain a SPACE or TAB. When you press CONTROL-W, the cursor
moves left to the beginning of the current word (as you are entering a word) or the
previous word (when you have just entered a SPACE or TAB), removing the word.

CONTROL-Z suspends a program
tip Although it is not a way of correcting a mistake, you might press the suspend key (typically
CONTROL-Z) by mistake and wonder what happened. If you see a message containing the word

Stopped, you have just stopped your job using job control (page 147). If you give the command
fg to continue your job in the foreground, you should return to where you were before you pressed
the suspend key. For more information refer to “bg: Sends a Job to the Background” on page 296.

Deleting a Line
Any time before you press RETURN, you can delete the line you are entering by pressing the
(line) kill key. When you press this key, the cursor moves to the left, erasing characters
as it goes, back to the beginning of the line. The default line kill key is CONTROL-U. If this
key does not work, try CONTROL-X. If these keys do not work, give the stty command
described under “Erasing a Character.”

Aborting Execution
Sometimes you might want to terminate a running program. For example, you
might want to stop a program that is performing a lengthy task such as displaying
the contents of a file that is several hundred pages long or copying a large file that
is not the one you meant to copy.
To terminate a program from a textual display, press the interrupt key (CONTROL-C or
sometimes DELETE or DEL). When you press this key, the Linux operating system sends
a TERM (termination) signal to the program you are running and to the shell. Exactly
what effect this signal has depends on the program. Some programs stop execution
immediately, some ignore the signal, and some take other actions. When the shell
receives a TERM signal, it displays a prompt and waits for another command.
If these methods do not terminate the program, try sending the program a QUIT signal (CONTROL-\). If all else fails, try pressing the suspend key (typically CONTROL-Z), giving
a jobs command to verify the number of the job running the program, and using kill
to abort the job. The job number is the number within the brackets at the left end of
the line displayed by jobs ([1]). In the next example, the kill command (pages 147
and 846) uses –TERM to send a TERM signal to the job specified by the job number,
which is preceded by a percent sign (%1). You can omit –TERM from the command,
as kill sends a TERM signal by default. Table 10-5 on page 483 lists some signals.

Working from the Command Line 31

Use the KILL signal as a last resort
caution When the termination signal does not work, use the KILL signal (specify –KILL in place of –TERM
in the example). A running program cannot ignore a KILL signal; it is sure to abort the program.
Because a program receiving a KILL signal has no chance to clean up its open files before being
terminated, using KILL can corrupt application data. Use the KILL signal as a last resort. Before
using KILL, give a termination (TERM) or quit (QUIT) signal a full ten seconds to take effect.
$ bigjob
^Z

[1]+ Stopped
$ jobs
[1]+ Stopped
$ kill -TERM %1
[1]+ Killed

bigjob
bigjob
bigjob

The kill command returns a prompt; you might need to press RETURN again to see the
confirmation message. For more information refer to “Running a Command in the
Background” on page 146.

Repeating/Editing Command Lines
To repeat a previous command, press the UP ARROW key. Each time you press this key, the shell
displays an earlier command line. Press the DOWN ARROW key to browse through the command
lines in the other direction. To reexecute the displayed command line, press RETURN.
The RIGHT ARROW and LEFT ARROW keys move the cursor back and forth along the displayed
command line. At any point along the command line, you can add characters by typing them. Use the erase key (page 29) to remove characters from the command line.
Press RETURN to execute the modified command.
You can also repeat the previous command using !!. This technique is useful if you
forgot to use su (next page) to prefix a command. In this case, if you type su –c "!!",
the shell will run the previous command with root privileges. Or, if the local system
is set up to use sudo (next page), you can type sudo !! and the shell will run the previous command with root privileges.
The command ^old^new^ reruns the previous command, substituting the first occurrence of the string old with new. Also, on a command line, the shell replaces the
characters !$ with the last token (word) on the previous command line. The following
example shows the user correcting the filename meno to memo using ^n^m^ and
then printing the file named memo by giving the command lpr !$. The shell replaces
!$ with memo, the last token on the previous command line.
$ cat meno
cat: meno: No such file or directory
$ ^n^m^
cat memo
This is the memo file.
$ lpr !$
lpr memo

For information about more complex command-line editing, see page 328.

32 Chapter 2 Getting Started

su/sudo: Curbing Your Power (root Privileges)
UNIX and Linux systems have always had a privileged user named root. When you
are working as the root user (“working with root privileges”), you have extraordinary systemwide powers. A user working with root privileges is sometimes
referred to as Superuser or administrator. When working with root privileges, you
can read from or write to almost any file on the system, execute programs that
ordinary users cannot, and more. On a multiuser system you might not be permitted to gain root privileges and so might not be able to run certain programs.
Nevertheless, someone—the system administrator—can, and that person maintains
the system.

Do not experiment while you are working with root privileges
caution Feel free to experiment when you are not working with root privileges. When you are working with
root privileges, do only what you have to do and make sure you know exactly what you are doing.
After you have completed the task at hand, revert to working as yourself. When working with root
privileges, you can damage the system to such an extent that you will need to reinstall Linux to
get it working again.

With a conventional setup, you can gain root privileges in one of two ways. First you
can log in as the user named root; when you do so you are working with root privileges until you log out. Alternatively, while you are working as yourself, you can use
the su (substitute user) utility to execute a single command with root privileges or to
gain root privileges temporarily so you can execute several commands. Logging in as
root and running su to gain root privileges require you to enter the root password.
The following example shows how to use su to execute a single command.
$ ls -l /lost+found
ls: cannot open directory /lost+found: Permission denied
$ su -c 'ls -l /lost+found'
Enter the root password
Password:
total 0
$

The first command in the preceding example shows that a user who is not working
with root privileges is not permitted to list the files in the /lost+found directory: ls
displays an error message. The second command uses su with the –c (command)
option to execute the same command with root privileges. Single quotation marks
enclose the command to ensure the shell interprets the command properly. When the
command finishes executing (ls shows there are no files in the directory), the user no
longer has root privileges.
Without any arguments, su spawns a new shell running with root privileges. Typically the shell displays a hashmark (#) prompt when you are working with root
privileges. Give an exit command to return to the normal prompt and nonroot
privileges.

Where to Find Documentation 33
$ su
Password:
# ls -l /lost+found
total 0
# exit
exit
$

Enter the root password

Some distributions (e.g., Ubuntu) ship with the root account locked—there is no
root password—and rely on the sudo (www.sudo.ws) utility to allow users to work
with root privileges. The sudo utility requires you to enter your password (not the
root password) to gain root privileges. The following example allows the user to
gain root privileges to view the contents of the /lost+found directory.
$ sudo ls -l /lost+found
[sudo] password for sam:
total 0
$

Enter your password

With an argument of –s, sudo spawns a new shell running with root privileges. Typically the shell displays a hashmark (#) prompt when you are working with root
privileges. Give an exit command to return to the normal prompt and nonroot
privileges.
$ sudo -s
[sudo] password for sam:
# ls -l /lost+found
total 0
# exit
logout
$

Enter your password

Where to Find Documentation
Distributions of Linux typically do not come with hardcopy reference manuals. However, its online documentation has always been one of Linux’s strengths. The man (or
manual) and info pages have been available via the man and info utilities since early
releases of the operating system. Not surprisingly, with the ongoing growth of Linux
and the Internet, the sources of documentation have expanded as well. This section discusses some of the places you can look for information on Linux. See also Appendix B.

man: Displays the System Manual
The textual man (manual) utility displays (man) pages from the system documentation. This documentation is helpful when you know which utility you want to use but
have forgotten exactly how to use it. You can also refer to the man pages to get more
information about specific topics or to determine which features are available with
Linux. Because the descriptions in the system documentation are often terse, they are
most helpful if you already understand the basic functions of a utility.

34 Chapter 2 Getting Started

Figure 2-1

The man utility displaying information about itself

To find out more about a utility, give the command man, followed by the name of
the utility. Figure 2-1 shows man displaying information about itself; the user entered
a man man command.
less (pager)

The man utility sends its output through a pager—usually less (page 53), which
displays one screen of information at a time. When you display a manual page
using man, less displays a prompt [e.g., Manual page man(1) line 1] at the bottom
of the screen after it displays each screen of text and waits for you to take one of
the following steps:
• Press the SPACE bar to display another screen of text.
• Press PAGE UP, PAGE DOWN, UP ARROW, or DOWN ARROW to navigate the text.
• Press h (help) to display a list of less commands.
• Press q (quit) to stop less and cause the shell to display a prompt.
You can search for topics covered by man pages using the apropos utility (next page).

Manual sections

Based on the FHS (Filesystem Hierarchy Standard; page 95), the Linux system manual and the man pages are divided into ten sections, where each section describes
related tools:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.

User Commands
System Calls
Subroutines
Devices
File Formats
Games
Miscellaneous
System Administration
Kernel
New

Where to Find Documentation 35

This layout closely mimics the way the set of UNIX manuals has always been divided.
Unless you specify a manual section, man displays the earliest occurrence in the manual of the word you specify on the command line. Most users find the information
they need in sections 1, 6, and 7; programmers and system administrators frequently
need to consult the other sections.
In some cases the manual contains entries for different tools with the same name. For
example, the following command displays the man page for the passwd utility from
section 1 of the system manual:
$ man passwd

To see the man page for the passwd file from section 5, enter this command:
$ man 5 passwd

The preceding command instructs man to look only in section 5 for the man page. In
documentation you might see this man page referred to as passwd(5). Use the –a
option (see the adjacent tip) to view all man pages for a given subject (press qRETURN
to display each subsequent man page). For example, give the command man –a
passwd to view all man pages for passwd.

Options
tip An option modifies the way a utility works. Options are usually specified as one or more letters
that are preceded by one or two hyphens. An option typically appears following the name of the
utility you are calling and a SPACE. Other arguments (page 1055) to the command follow the option
and a SPACE. For more information refer to “Options” on page 127.

apropos: Searches for a Keyword
When you do not know the name of the command required to carry out a particular
task, you can use apropos with a keyword to search for it. This utility searches for
the keyword in the short description line of all man pages and displays those that contain a match. The man utility, when called with the –k (keyword) option, provides the
same output as apropos.
The database apropos uses, named mandb or makewhatis, is not available when a system is first installed, but is built automatically by cron or crond (see crontab on
page 763 for a discussion of cron/crond).
The following example shows the output of apropos when you call it with the who
keyword. The output includes the name of each command, the section of the manual
that contains it, and the short description from the man page. This list includes the
utility you need (who) and identifies other, related tools you might find useful.
$ apropos who
at.allow (5)
jwhois (1)
w (1)
who (1)
who (1p)
whoami (1)
whois (1)
whois.jwhois (1)

-

determine who can submit jobs via at or batch
client for the whois service
Show who is logged on and what they are doing.
show who is logged on
display who is on the system
print effective userid
client for the whois service
client for the whois service

36 Chapter 2 Getting Started

Figure 2-2
whatis

The initial screen displayed by the command info coreutils

The whatis utility is similar to apropos but finds only complete word matches for the
name of the utility:
$ whatis who
who (1p)
who (1)

- display who is on the system
- show who is logged on

info: Displays Information About Utilities
The textual info utility (www.gnu.org/software/texinfo/manual/info) is a menu-based
hypertext system developed by the GNU project (page 3) and distributed with Linux.
It includes a tutorial on itself (give the command info info) and documentation on
many Linux shells, utilities, and programs developed by the GNU project. Figure 2-2
shows the screen that info displays when you give the command info coreutils (the
coreutils software package holds the Linux core utilities).

man and info display different information
tip The info utility displays more complete and up-to-date information on GNU utilities than does man.
When a man page displays abbreviated information on a utility that is covered by info, the man page
refers to info. The man utility frequently displays the only information available on non-GNU utilities.
When info displays information on non-GNU utilities, it is frequently a copy of the man page.

Because the information on this screen is drawn from an editable file, your display
might differ from the screens shown in this section. You can press any of the following
keys while the initial info screen is displayed:
• h or ? to list info commands
•

SPACE

to scroll through the display

• m followed by the name of the menu you want to display or a SPACE to display
a list of menus
• q or CONTROL-C to quit

Where to Find Documentation 37

Figure 2-3

The screen displayed by the command info
coreutils after you type /sleepRETURN twice

The notation info uses to describe keyboard keys is the same notation emacs uses and
might not be familiar to you. For example, the notation C-h is the same as CONTROL-H.
Similarly, M-x means hold down the META or ALT key and press x. (On some systems
you need to press ESCAPE and then x to duplicate the function of META-X.) For more information refer to “Keys: Notation and Use” on page 223.
After giving the command info coreutils, press the SPACE bar a few times to scroll
through the display. Type /sleepRETURN to search for the string sleep. When you type
/, the cursor moves to the bottom line of the window and displays Regexp search
[string]:, where string is the last string you searched for. Press RETURN to search for
string or enter the string you want to search for. Typing sleep displays sleep on that
line, and pressing RETURN displays the next occurrence of sleep.

You might find pinfo easier to use than info
tip The pinfo utility is similar to info but is more intuitive if you are not familiar with emacs editor
commands. This utility runs in a textual environment, as does info. When it is available, pinfo uses
color to make its interface easier to use. If pinfo is not installed on the system, install the pinfo
package as explained in Appendix C.

Now type /RETURN (or /sleepRETURN) to search for the next occurrence of sleep as shown
in Figure 2-3. The asterisk at the left end of the line indicates that this entry is a menu
item. Following the asterisk is the name of the menu item and a description of the item.
Each menu item is a link to the info page that describes the item. To jump to that page,
search for or use the ARROW keys to move the cursor to the line containing the menu
item and press RETURN. With the cursor positioned as it is in Figure 2-3, press RETURN to
display information on sleep. Alternatively, you can type the name of the menu item
in a menu command to view the information: To display information on sleep, for
example, you can give the command m sleep, followed by RETURN. When you type m
(for menu), the cursor moves to the bottom line of the window (as it did when you

38 Chapter 2 Getting Started

Figure 2-4

The info page on the sleep utility

typed /) and displays Menu item:. Typing sleep displays sleep on that line, and pressing RETURN displays information about the menu item you specified.
Figure 2-4 shows the top node of information on sleep. A node groups a set of information you can scroll through by pressing the SPACE bar. To display the next node,
press n. Press p to display the previous node.
As you read through this book and learn about new utilities, you can use man or info
to find out more about those utilities. If the local system can print PostScript documents, you can print a manual page by using the man utility with the –t option. For
example, man –t cat | lpr prints information about the cat utility. You can also use a
Web browser to display the documentation at one of the sites listed in Appendix B
and then print the desired information from the browser.

The ––help Option
Another tool you can use in a textual environment is the ––help option. Most GNU
utilities provide a ––help option that displays information about the utility. A nonGNU utility might use a –h or –help option to display information about itself.
$ cat --help
Usage: cat [OPTION] [FILE]...
Concatenate FILE(s), or standard input, to standard output.
-A, --show-all
-b, --number-nonblank
-e
-E, --show-ends
...

equivalent to -vET
number nonempty output lines, overrides -n
equivalent to -vE
display $ at end of each line

If the information that ––help displays runs off the screen, send the output through
the less pager (page 34) using a pipeline (page 60):
$ ls --help | less

Where to Find Documentation

39

The bash help Command
The bash help command displays information about bash commands, control structures, and other features. From the bash prompt, give the command help followed by
the keyword you are interested in. Following are some examples.
$ help help
help: help [-dms] [pattern ...]
Display information about builtin commands.
Displays brief summaries of builtin commands. If PATTERN is
specified, gives detailed help on all commands matching PATTERN,
otherwise the list of help topics is printed.
...
$ help echo
echo: echo [-neE] [arg ...]
Write arguments to the standard output.
Display the ARGs on the standard output followed by a newline.
Options:
-n

do not append a newline

...
$ help while
while: while COMMANDS; do COMMANDS; done
Execute commands as long as a test succeeds.
...

Getting Help
This section describes several methods you can use to get help with a Linux system
and lists some helpful Web sites. See also Appendix B.

Finding Help Locally
/usr/share/doc

The /usr/src/linux/Documentation (present only if you install the kernel source code)
and /usr/share/doc directories often contain more detailed and different information
about a utility than either man or info provides. Frequently this information is meant
for people who will be compiling and modifying the utility, not just using it. These
directories hold thousands of files, each containing information on a separate topic.
As the following example shows, the names of most directories in /usr/share/doc end
in version numbers:
$ ls /usr/share/doc
abrt-2.0.7
accountsservice-0.6.15
acl-2.2.51
aic94xx-firmware-30
aisleriot-3.2.1
alsa-firmware-1.0.25
alsa-lib-1.0.25

iwl100-firmware-39.31.5.1
iwl3945-firmware-15.32.2.9
iwl4965-firmware-228.61.2.24
iwl5000-firmware-8.83.5.1_1
iwl5150-firmware-8.24.2.2
iwl6000-firmware-9.221.4.1
iwl6000g2a-firmware-17.168.5.3

openldap-2.4.26
openobex-1.5
openssh-5.8p2
openssl-1.0.0g
openvpn-2.2.1
orc-0.4.16
orca-3.2.1

40 Chapter 2 Getting Started

Figure 2-5 Google reporting on an error message

Most of these directories hold a README file, which is a good place to start reading about the utility or file the directory describes. Use an asterisk (*; page 149)
in place of the version number to make it easier to type the filename. The following README file for bzip2 (pages 64 and 732) explains how to compile the
source code.
$ cat /usr/share/doc/bzip2*/README
This is the README for bzip2/libzip2.
This version is fully compatible with the previous public releases.
...
Complete documentation is available in Postscript form (manual.ps),
PDF (manual.pdf) or html (manual.html). A plain-text version of the
manual page is available as bzip2.txt.
HOW TO BUILD -- UNIX
Type 'make'. This builds the library libbz2.a and then the programs
bzip2 and bzip2recover. Six self-tests are run. If the self-tests
complete ok, carry on to installation:
To install in /usr/local/bin, /usr/local/lib, /usr/local/man and
/usr/local/include, type
make install
...

Where to Find Documentation 41

Figure 2-6 The Linux Documentation Project home page

Using the Internet to Get Help
The Internet provides many helpful sites related to Linux and OS X. Aside from sites
that offer various forms of documentation, you can enter an error message from a
program you are having a problem with in a search engine such as Google
(www.google.com). The search will likely yield a post concerning your problem and
suggestions about how to solve it. See Figure 2-5.
GNU

GNU manuals are available at www.gnu.org/manual. In addition, you can visit the
GNU home page (www.gnu.org) to obtain other documentation and GNU resources.
Many of the GNU pages and resources are available in a variety of languages.

The Linux
Documentation
Project

The Linux Documentation Project (www.tldp.org; Figure 2-6), which has been around
for almost as long as Linux, houses a complete collection of guides, HOWTOs, FAQs,
man pages, and Linux magazines. The home page is available in English, Portuguese
(Brazilian), Spanish, Italian, Korean, and French. It is easy to use and supports local
text searches. This site also provides a complete set of links you can use to find almost
anything you want related to Linux (click Links in the Search box or go to
www.tldp.org/links). The links page includes sections on general information, events,
getting started, user groups, mailing lists, and newsgroups, with each section containing many subsections.

HOWTOs

A HOWTO document explains in detail how to do something related to
Linux—from setting up a specialized piece of hardware to performing a system

42 Chapter 2 Getting Started

administration task to setting up specific networking software. Mini-HOWTOs offer
shorter explanations.
The Linux Documentation Project site houses most HOWTO and mini-HOWTO
documents. Use a Web browser to visit www.tldp.org, click HOWTOs, and pick the
index you want to use to find a HOWTO or mini-HOWTO. You can also use the LDP
search feature on its home page to find HOWTOs and other documents.

More About Logging In and Passwords
Refer to “Logging In from a Terminal (Emulator)” on page 26 for information about
logging in. This section covers solutions to common login problems, logging in
remotely, virtual consoles, and changing your password.

Always use a password
security Unless you are the only user of a system; the system is not connected to any other systems, the
Internet, or a modem; and you are the only one with physical access to the system, it is poor practice to maintain a user account without a password.

What to Do If You Cannot Log In
If you enter either your username or your password incorrectly, the system displays
an error message after you enter both your username and your password. This message indicates that you have entered either the username or the password incorrectly
or that they are not valid. It does not differentiate between an unacceptable username
and an unacceptable password—a strategy meant to discourage unauthorized people
from guessing names and passwords to gain access to the system.
Following are some common reasons why logins fail:
• The username and password are case sensitive. Make sure the CAPS LOCK key
is off and enter your username and password exactly as specified or as you
set them up.
• You are not logging in on the right machine. The login/password combination might not be valid if you are trying to log in on the wrong machine.
On a larger, networked system, you might have to specify the machine you
want to connect to before you can log in.
• Your username is not valid. The login/password combination might not be
valid if you have not been set up as a user.
• A filesystem is full. When a filesystem critical to the login process is full, it
might appear as though you have logged in successfully, but after a moment
the login prompt reappears. In this situation you must boot the system in
rescue/recovery mode and delete some files.

More About Logging In and Passwords

43

• The account is disabled. On some systems, the root account is disabled by
default. An administrator might disable other accounts. Often the root
account is not allowed to log in over a network: In this case, log in as yourself
and then gain root privileges using su/sudo.
Refer to “Changing Your Password” on page 44 if you want to change your
password.

Logging In Remotely: Terminal Emulators, ssh, and
Dial-Up Connections
When you are not using a console, terminal, or other device connected directly to the
Linux system you are logging in on, you are probably connected to the Linux system
using terminal emulation software on another system. Running on the local system,
this software connects to the remote Linux system via a network (Ethernet, asynchronous phone line, PPP, or other type) and allows you to log in.

Make sure TERM is set correctly
tip No matter how you connect, make sure the TERM variable is set to the type of terminal your
emulator is emulating. For more information refer to “Specifying a Terminal” on page 1024.

When you log in via a dial-up line, the connection is straightforward: You instruct
the local emulator program to contact the remote Linux system, it dials the phone,
and the remote system displays a login prompt. When you log in via a directly connected network, you use either ssh (secure; page 687) or telnet (not secure; page 974)
to connect to the remote system. The ssh program has been implemented on many
operating systems, not just Linux. Many user interfaces to ssh include a terminal
emulator. From an Apple, Windows, or UNIX machine, open the program that runs
ssh and give it the name or IP address of the system you want to log in on.

Using Virtual Consoles
When running Linux on a personal computer, you will frequently work with the display
and keyboard attached to the computer. Using this physical console, you can access as
many as 63 virtual consoles (also called virtual terminals). Some are set up to allow logins; others act as graphical displays. To switch between virtual consoles, hold the CONTROL
and ALT keys down and press the function key that corresponds to the console you want
to view. For example, CONTROL-ALT-F5 displays the fifth virtual console.
By default, five or six virtual consoles are active and have textual login sessions running. When you want to use both textual and graphical interfaces, you can set up a
textual session on one virtual console and a graphical session on another.

Logging Out
To log out from a character-based interface, press CONTROL-D in response to the shell
prompt. This action sends the shell an EOF (end of file). Alternatively, you can give

44 Chapter 2 Getting Started

the command exit. Exiting from a shell does not end a graphical session; it just exits
from the shell you are working with. For example, exiting from the shell that
GNOME terminal provides closes the GNOME terminal window.

Changing Your Password
If someone else assigned you a password, it is a good idea to give yourself a new one.
For security reasons, passwords you enter are not displayed by any utility.

Protect your password
security Do not allow someone to find out your password: Do not put your password in a file that is not
encrypted, allow someone to watch you type your password, or give your password to someone
you do not know (a system administrator never needs to know your password). You can always
write your password down and keep it in a safe, private place.

Choose a password that is difficult to guess
security Do not use phone numbers, names of pets or kids, birthdays, words from a dictionary (not even
a foreign language), and so forth. Do not use permutations of these items or a l33t-speak variation
of a word: Modern dictionary crackers might also try these permutations.

Include nonalphanumeric characters in your password
security Automated password cracking tools first try using alphabetic and numeric characters when they
try to guess your password. Including at least one character such as @ or # in a password
increases the amount of time it takes for one of these tools to crack your password.

Differentiate between important and less important passwords
security It is a good idea to differentiate between important and less important passwords. For example,
Web site passwords for blogs or download access are not very important; it is acceptable to use
the same password for these types of sites. However, your login, mail server, and bank account
Web site passwords are critical: Never use these passwords for an unimportant Web site and use
a different password for each of these accounts.
Secure passwords

To be relatively secure, a password should contain a combination of numbers, uppercase and lowercase letters, and punctuation characters. It should also meet the
following criteria:
• Must be at least six characters long (or longer if the system administrator
sets it up that way). Seven or eight characters is a good compromise between
being easy to remember and being secure.
• Should not be a word in a dictionary of any language, no matter how
seemingly obscure.
• Should not be the name of a person, place, pet, or other thing that might be
discovered easily.
• Should contain at least two letters and one digit or punctuation character.

More About Logging In and Passwords

45

• Should not be your username, the reverse of your username, or your
username shifted by one or more characters.
Only the first item is mandatory. Avoid using control characters (such as CONTROL-H)
because they might have a special meaning to the system, making it impossible for
you to log in. If you are changing your password, the new password should differ
from the old one by at least three characters. Changing the case of a character does
not make it count as a different character.

pwgen helps you pick a password
security The pwgen utility (install the pwgen package as explained in Appendix C) generates a list of
almost random passwords. With a little imagination, you can pronounce, and therefore remember,
some of these passwords.

To change your password, give the command passwd. The first item passwd asks for
is your current (old) password. This password is verified to ensure that an unauthorized user is not trying to alter your password. Then the system requests a new
password.
$ passwd
Changing password for user sam.
Changing password for sam.
(current) UNIX password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

After you enter your new password, the system asks you to retype it to make sure you
did not make a mistake when you entered it the first time. If the new password is the
same both times you enter it, your password is changed. If the passwords differ, it
means that you made an error in one of them, and the system displays this error
message:
Sorry, passwords do not match

If your password is not long enough, the system displays the following message:
BAD PASSWORD: it is too short

When it is too simple, the system displays this message:
BAD PASSWORD: it is too simplistic/systematic

After several failures, the system displays an error message and displays a prompt.
At this point you need to run passwd again.

OS X: passwd does not change your Keychain password
tip Under OS X, the passwd utility changes your login password, but does not change your Keychain
password. The Keychain password is used by various graphical applications. You can change the
Keychain password using the Keychain Access application.

46 Chapter 2 Getting Started

When you successfully change your password, you change the way you log in. If you
forget your password, someone working with root privileges can run passwd to
change it and tell you your new password.
Working with root privileges (use su/sudo [page 32]) you can assign a new password
to any user on the system without knowing the user’s old password. Use this technique when a user forgets his password.
# passwd sam
Changing password for user sam.
New password:
...

Chapter Summary
As with many operating systems, your access to a Linux system is authorized when
you log in. You enter your username in response to the login: prompt, followed by a
password. You can use passwd to change your password while you are logged in.
Choose a password that is difficult to guess and that conforms to the criteria imposed
by the system administrator.
The system administrator is responsible for maintaining the system. On a single-user
system, you are the system administrator. On a small, multiuser system, you or
another user will act as the system administrator, or this job might be shared. On a
large, multiuser system or network of systems, there is frequently a full-time system
administrator. When extra privileges are required to perform certain system tasks, the
system administrator gains root privileges by logging in as root or by running su or
sudo. On a multiuser system, several trusted users might be allowed to gain root
privileges.
Do not work with root privileges as a matter of course. When you have to do something that requires root privileges, work with root privileges for only as long as you
need to; revert to working as yourself as soon as possible.
The man utility provides online documentation on system utilities. This utility is helpful
both to new Linux users and to experienced users who must often delve into the system
documentation for information on the fine points of a utility’s behavior. The apropos
utility can help you search for utilities. The info utility helps the beginner and the expert
alike. It includes documentation on many Linux utilities. Some utilities, when called
with the ––help option, provide brief documentation on themselves.

Exercises
1. The following message is displayed when you attempt to log in with an
incorrect username or an incorrect password:

Advanced Exercises 47
Login incorrect

This message does not indicate whether your username, your password, or
both are invalid. Why does it not tell you this information?
2. Give three examples of poor password choices. What is wrong with each?
Include one that is too short. Give the error message the system displays.
3. Is fido an acceptable password? Give several reasons why or why not.
4. What would you do if you could not log in?
5. Try to change your password to dog. What happens? Now change it to a
more secure password. What makes that password relatively secure?
6. How would you display a list of utilities that compress files?
7. How would you repeat the second preceding command line, edit it, and
then execute it?
8. Briefly, what information does the ––help option display for the tar utility?
How would you display this information one screen at a time?

Advanced Exercises
9. How would you display the man page for shadow in section 5 of the system
manual?
10. How would you change your login shell to tcsh without using root
privileges?
11. How many man pages are in the Devices subsection of the system manual?
(Hint: Devices is a subsection of Special Files.)
12. The example on page 35 shows that man pages for passwd appear in sections 1 and 5 of the system manual. Explain how you can use man to
determine which sections of the system manual contain a manual page with
a given name.
13. How would you find out which Linux utilities create and work with archive
files?

3
The Utilities

Chapter3
3

In This Chapter

Objectives

Special Characters . . . . . . . . . . . . . 50

After reading this chapter you should be able to:

Basic Utilities . . . . . . . . . . . . . . . . . 51
less Is more: Display a Text File
One Screen at a Time . . . . . . . . . . 53

 List special characters and methods of preventing the
shell from interpreting these characters
 Use basic utilities to list files and display text files

Working with Files. . . . . . . . . . . . . . 53

 Copy, move, and remove files

lpr: Prints a File . . . . . . . . . . . . . . . . 55

 Search, sort, print, and compare text files

| (Pipeline): Communicates
Between Processes . . . . . . . . . . . . 60

 String commands together using a pipeline

Compressing and Archiving Files . . . 63

 Compress, decompress, and archive files

Displaying User and System
Information . . . . . . . . . . . . . . . . . 70

 Locate utilities on the system
 Display information about users
 Communicate with other users

50 Chapter 3 The Utilities

When Linus Torvalds introduced Linux and for a long time thereafter, Linux did not
have a graphical user interface (GUI): It ran on character-based terminals only, using
a command-line interface (CLI), also referred to as a textual interface. All the tools
ran from a command line. Today the Linux GUI is important, but many people—especially system administrators—run many command-line utilities.
Command-line utilities are often faster, more powerful, or more complete than their
GUI counterparts. Sometimes there is no GUI counterpart to a textual utility; some
people just prefer the hands-on feeling of the command line.
When you work with a command-line interface, you are working with a shell
(Chapters 5, 8, and 10). Before you start working with a shell, it is important that
you understand something about the characters that are special to the shell, so this
chapter starts with a discussion of special characters. The chapter then describes
five basic utilities: ls, cat, rm, less, and hostname. It continues by describing several
other file manipulation utilities as well as utilities that compress and decompress
files, pack and unpack archive files, locate utilities, display system information,
communicate with other users, and print files.

Special Characters
Special characters, which have a special meaning to the shell, are discussed in “Filename
Generation/Pathname Expansion” on page 148. These characters are mentioned here so
you can avoid accidentally using them as regular characters until you understand how the
shell interprets them. Avoid using any of the following characters in a filename (even
though emacs and some other programs do) because they make the file harder to reference on the command line:
& ; |

*

?

'

"

‘

[ ] ( ) $ < > { } # / \ ! ~

Whitespace

Although not considered special characters, RETURN, SPACE, and TAB have special meanings
to the shell. RETURN usually ends a command line and initiates execution of a command.
The SPACE and TAB characters separate tokens (elements) on the command line and are
collectively known as whitespace or blanks.

Quoting special
characters

If you need to use a character that has a special meaning to the shell as a regular character, you can quote (or escape) it. When you quote a special character, you prevent
the shell from giving it special meaning. The shell treats a quoted special character
as a regular character. However, a slash (/) is always a separator in a pathname, even
when you quote it.

Backslash

To quote a character, precede it with a backslash (\). When two or more special characters appear together, you must precede each with a backslash (for example, you
would enter ** as \*\*). You can quote a backslash just as you would quote any
other special character—by preceding it with a backslash ( \\).

Single quotation
marks

Another way of quoting special characters is to enclose them between single quotation marks: '**'. You can quote many special and regular characters between a pair

Basic Utilities 51

of single quotation marks: 'This is a special character: >'. The regular characters are
interpreted as usual, and the shell also interprets the special characters as regular
characters.
The only way to quote the erase character (CONTROL-H), the line kill character (CONTROL-U),
and other control characters (try CONTROL-M) is by preceding each with a CONTROL-V. Single
quotation marks and backslashes do not work. Try the following:
$ echo 'xxxxxxCONTROL-U'
$ echo xxxxxxCONTROL-V CONTROL-U

optional Although you cannot see the CONTROL-U displayed by the second of the preceding pair
of commands, it is there. The following command sends the output of echo (page 61)
through a pipeline (page 60) to od (octal display; page 897) to display CONTROL-U as
octal 25 (025):
$ echo xxxxxxCONTROL-V CONTROL-U | od -c
0000000
x
x
x
x
x
x 025
0000010

\n

The \n is the NEWLINE character that echo sends at the end of its output.

Basic Utilities
One of the advantages of Linux is that it comes with thousands of utilities that perform myriad functions. You will use utilities whenever you work with Linux, whether
you use them directly by name from the command line or indirectly from a menu or
icon. The following sections discuss some of the most basic and important utilities;
these utilities are available from a CLI. Some of the more important utilities are also
available from a GUI; others are available only from a GUI.

Run these utilities from a command line
tip This chapter describes command-line, or textual, utilities. You can experiment with these utilities
from a terminal, a terminal emulator within a GUI, or a virtual console.
Folder/directory

The term directory is used extensively in the next sections. A directory is a resource
that can hold files. On other operating systems, including Windows and Macintosh
OS X, and frequently when speaking about a Linux GUI, a directory is referred to
as a folder. That is a good analogy: A traditional manila folder holds files just as a
directory does.

In this chapter you work in your home directory
tip When you log in on the system, you are working in your home directory. In this chapter that is the
only directory you use: All the files you create in this chapter are in your home directory. Chapter 4
goes into more detail about directories.

52 Chapter 3 The Utilities

$ ls
practice
$ cat practice
This is a small file that I created
with a text editor.
$ rm practice
$ ls
$ cat practice
cat: practice: No such file or directory
$

Figure 3-1

Using ls, cat, and rm on the file named practice

ls: Lists the Names of Files
Using the editor of your choice, create a small file named practice. (A tutorial on the
vim editor appears on page 161 and a tutorial on emacs appears on page 216.) After
exiting from the editor, you can use the ls (list) utility to display a list of the names
of the files in your home directory. In the first command in Figure 3-1, ls lists the
name of the practice file. (You might also see files that the system or a program created automatically.) Subsequent commands in Figure 3-1 display the contents of the
file and remove the file. These commands are described next.

cat: Displays a Text File
The cat utility displays the contents of a text file. The name of the command is derived
from catenate, which means to join together, one after the other. (Figure 5-8 on
page 137 shows how to use cat to string together the contents of three files.)
A convenient way to display the contents of a file on the screen is by giving the command cat, followed by a SPACE and the name of the file. Figure 3-1 shows cat displaying
the contents of practice. This figure shows the difference between the ls and cat utilities:
The ls utility displays the name of a file, whereas cat displays the contents of a file.

rm: Deletes a File
The rm (remove) utility deletes a file. Figure 3-1 shows rm deleting the file named practice. After rm deletes the file, ls and cat show that practice is no longer in the directory.

A safer way of removing files
tip You can use the interactive form of rm to make sure you delete only the file(s) you intend to delete.
When you follow rm with the –i option (see page 35 for a tip on options) and the name of the file
you want to delete, rm prompts you with the name of the file and waits for you to respond with y
(yes) before it deletes the file. It does not delete the file if you respond with a string that begins with
a character other than y. Under some distributions, the –i option is set up by default for the root user:
$ rm -i toollist
rm: remove regular file 'toollist'? y

Optional: You can create an alias (page 342) for rm –i and put it in your startup file (page 87) so
rm always runs in interactive mode.

Working with Files

53

The ls utility does not list its filename, and cat says that no such file exists. Use rm carefully. Refer to page 926 or give the command info coreutils 'rm invocation' for more
information. If you are running OS X, see “Many Utilities Do Not Respect Apple
Human Interface Guidelines” on page 1050.

less Is more: Display a Text File One Screen at a Time
Pagers

When you want to view a file that is longer than one screen, you can use either the less
utility or the more utility. Each of these utilities pauses after displaying a screen of text;
press the SPACE bar to display the next screen of text. Because these utilities show one
page at a time, they are called pagers. Although less and more are very similar, they
have subtle differences. At the end of the file, for example, less displays an END message and waits for you to press q before returning you to the shell. In contrast, more
returns you directly to the shell. While using both utilities you can press h to display a
Help screen that lists commands you can use while paging through a file. Give the commands less practice and more practice in place of the cat command in Figure 3-1 to
see how these commands work. Use the command less /etc/services instead if you want
to experiment with a longer file. Refer to page 852 for more information on less.

hostname: Displays the System Name
The hostname utility displays the name of the system you are working on. Use this
utility if you are not sure that you are logged in on the correct machine.
$ hostname
guava

Working with Files
This section describes utilities that copy, move, print, search through, display, sort,
compare, and identify files. If you are running Mac OS X, see “Resource forks” on
page 1044.

Filename completion
tip After you enter one or more letters of a filename (following a command) on a command line, press
TAB, and the shell will complete as much of the filename as it can. When only one filename starts with
the characters you entered, the shell completes the filename and places a SPACE after it. You can keep
typing or you can press RETURN to execute the command at this point. When the characters you
entered do not uniquely identify a filename, the shell completes what it can and waits for more input.
If pressing TAB does not change the display, press TAB again (bash; page 338) or CONTROL-D (tcsh;
“Word Completion” on page 380) to display a list of possible completions.

cp: Copies a File
The cp (copy) utility (Figure 3-2, next page) makes a copy of a file. This utility can
copy any file, including text and executable program (binary) files. You can use cp to
make a backup copy of a file or a copy to experiment with.

54 Chapter 3 The Utilities

$ ls
memo
$ cp memo memo.copy
$ ls
memo memo.copy

Figure 3-2

cp copies a file

The cp command line uses the following syntax to specify source and destination
files:
cp source-file destination-file
The source-file is the name of the file that cp will copy. The destination-file is the
name cp assigns to the resulting (new) copy of the file.
The cp command line in Figure 3-2 copies the file named memo to memo.copy. The
period is part of the filename—just another character. The initial ls command shows
that memo is the only file in the directory. After the cp command, a second ls shows
two files in the directory, memo and memo.copy.
Sometimes it is useful to incorporate the date into the name of a copy of a file. The
following example includes the date January 30 (0130) in the copied file:
$ cp memo memo.0130

Although it has no significance to Linux, including the date in this way can help you
find a version of a file you created on a certain date. Including the date can also help
you avoid overwriting existing files by providing a unique filename each day. For
more information refer to “Filenames” on page 84.
Use scp (page 691) or ftp (page 819) when you need to copy a file from one system
to another on a network.

cp can destroy a file
caution If the destination-file exists before you give a cp command, cp overwrites it. Because cp overwrites (and destroys the contents of) an existing destination-file without warning, you must take
care not to cause cp to overwrite a file that you need. The cp –i (interactive) option prompts you
before it overwrites a file. See page 35 for a tip on options.
The following example assumes the file named orange.2 exists before you give the cp command.
The user answers y to overwrite the file.
$ cp –i orange orange.2
cp: overwrite 'orange.2'? y

mv: Changes the Name of a File
The mv (move) utility can rename a file without making a copy of it. The mv command
line specifies an existing file and a new filename using the same syntax as cp:
mv existing-filename new-filename

Working with Files

55

$ ls
memo
$ mv memo memo.0130
$ ls
memo.0130

Figure 3-3

mv renames a file

The command line in Figure 3-3 changes the name of the file memo to memo.0130.
The initial ls command shows that memo is the only file in the directory. After you
give the mv command, memo.0130 is the only file in the directory. Compare this
result to that of the cp example in Figure 3-2.
The mv utility can be used for more than changing the name of a file; refer to “mv,
cp: Move or Copy Files” on page 94 and to the mv info page.

mv can destroy a file
caution Just as cp can destroy a file, so can mv. Also like cp, mv has a –i (interactive) option. See the
caution box labeled “cp can destroy a file.”

lpr: Prints a File
The lpr (line printer) utility places one or more files in a print queue for printing.
Linux provides print queues so only one job is printed on a given printer at a time.
A queue allows several people or jobs to send output simultaneously to a single
printer with the expected results. For systems that have access to more than one
printer, you can use lpstat –p to display a list of available printers. Use the –P option
to instruct lpr to place the file in the queue for a specific printer—even one that is connected to another system on the network. The following command prints the file
named report:
$ lpr report

Because this command does not specify a printer, the output goes to the default
printer, which is the printer when you have only one printer.
The next command line prints the same file on the printer named mailroom:
$ lpr -P mailroom report

You can send more than one file to the printer with a single command. The following
command line prints three files on the printer named laser1:
$ lpr -P laser1 05.txt 108.txt 12.txt
lpq

You can see which jobs are in the print queue by giving an lpstat –o command or by
using the lpq utility:
$ lpq
lp is ready and printing
Rank Owner
Job Files
active max
86 (standard input)

Total Size
954061 bytes

56 Chapter 3 The Utilities

$ cat memo
Helen:
In our meeting on June 6 we
discussed the issue of credit.
Have you had any further thoughts
about it?
Max
$ grep 'credit' memo
discussed the issue of credit.

Figure 3-4
lprm

grep searches for a string

In this example, Max has one job that is being printed; no other jobs are in the queue.
You can use the job number (86 in this case) with the lprm utility to remove the job
from the print queue and stop it from printing:
$ lprm 86

grep: Searches for a String
The grep1 utility searches through one or more files to see whether any contain a
specified string of characters. This utility does not change the file it searches but
simply displays each line that contains the string.
The grep command in Figure 3-4 searches through the memo file for lines that contain
the string credit and displays the single line that meets this criterion. If memo contained
such words as discredit, creditor, or accreditation, grep would have displayed those
lines as well because they contain the string it was searching for. The –w (words) option
causes grep to match only whole words. Although you do not need to enclose the string
you are searching for in single quotation marks, doing so allows you to put SPACEs and
special characters in the search string.
The grep utility can do much more than search for a simple string in a single file. Refer
to page 833 and Appendix A for more information.

head: Displays the Beginning of a File
By default the head utility displays the first ten lines of a file. You can use head to help
you remember what a particular file contains. For example, if you have a file named

1. Originally the name grep was a play on an ed—an original UNIX editor, available on most distributions—
command: g/re/p. In this command g stands for global, re is a regular expression delimited by slashes, and
p means print.

Working with Files

57

$ head months
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
$ tail -5 months
Aug
Sep
Oct
Nov
Dec

Figure 3-5

head displays the first ten lines of a file

months that lists the 12 months of the year in calendar order, one to a line, then head
displays Jan through Oct (Figure 3-5).
This utility can display any number of lines, so you can use it to look at only the first
line of a file, at a full screen, or even more. To specify the number of lines displayed,
include a hyphen followed by the number of lines you want head to display. For
example, the following command displays only the first line of months:
$ head -1 months
Jan

The head utility can also display parts of a file based on a count of blocks or characters
rather than lines. Refer to page 841 for more information on head.

tail: Displays the End of a File
The tail utility is similar to head but by default displays the last ten lines of a file.
Depending on how you invoke it, this utility can display fewer or more than ten lines.
Alternatively, you can use a count of blocks or characters rather than lines to display
parts of a file. The tail command in Figure 3-5 displays the last five lines (Aug through
Dec) of the months file.
You can monitor lines as they are added to the end of the growing file named logfile
by using the following command:
$ tail -f logfile

Press the interrupt key (usually CONTROL-C) to stop tail and display the shell prompt.
Refer to page 965 for more information on tail.

58 Chapter 3 The Utilities

$ cat days
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
$ sort days
Friday
Monday
Saturday
Sunday
Thursday
Tuesday
Wednesday

Figure 3-6

sort displays the lines of a file in order

sort: Displays a File in Order
The sort utility displays the contents of a file in order by lines; it does not change the
original file.
Figure 3-6 shows cat displaying the file named days, which contains the name of each
day of the week on a separate line in calendar order. The sort utility then displays the
file in alphabetical order.
The sort utility is useful for putting lists in order. The –u option generates a sorted
list in which each line is unique (no duplicates). The –n option puts a list of numbers
in numerical order. Refer to page 942 for more information on sort.

uniq: Removes Duplicate Lines from a File
The uniq (unique) utility displays a file, skipping adjacent duplicate lines; it does not
change the original file. If a file contains a list of names and has two successive entries
for the same person, uniq skips the extra line (Figure 3-7).
If a file is sorted before it is processed by uniq, this utility ensures that no two lines
in the file are the same. (Of course, sort can do that all by itself with the –u option.)
Refer to page 996 for more information on uniq.

diff: Compares Two Files
The diff (difference) utility compares two files and displays a list of the differences
between them. This utility does not change either file; it is useful when you want to compare two versions of a letter or a report or two versions of the source code for a program.
The diff utility with the –u (unified output format) option first displays two lines
indicating which of the files you are comparing will be denoted by a plus sign (+)

Working with Files

59

$ cat dups
Cathy
Fred
Joe
John
Mary
Mary
Paula
$ uniq dups
Cathy
Fred
Joe
John
Mary
Paula

Figure 3-7

uniq removes duplicate lines

and which by a minus sign (–). In Figure 3-8, a minus sign indicates the colors.1 file;
a plus sign indicates the colors.2 file.
The diff –u command breaks long, multiline text into hunks. Each hunk is preceded
by a line starting and ending with two at signs (@@). This hunk identifier indicates the
starting line number and the number of lines from each file for this hunk. In Figure 3-8,
the hunk covers the section of the colors.1 file (indicated by a minus sign) from the first
line through the sixth line. The +1,5 then indicates the hunk covers colors.2 from the
first line through the fifth line.
Following these header lines, diff –u displays each line of text with a leading minus
sign, a leading plus sign, or a SPACE. A leading minus sign indicates the line occurs only
in the file denoted by the minus sign. A leading plus sign indicates the line occurs only
in the file denoted by the plus sign. A line that begins with a SPACE (neither a plus sign
nor a minus sign) occurs in both files in the same location. Refer to page 777 for more
information on diff.
$ diff -u colors.1 colors.2
--- colors.1
2012-04-05 10:12:12.322528610 -0700
+++ colors.2
2012-04-05 10:12:18.420531033 -0700
@@ -1,6 +1,5 @@
red
+blue
green
yellow
-pink
-purple
orange

Figure 3-8

diff displaying the unified output format

60 Chapter 3 The Utilities

file: Identifies the Contents of a File
You can use the file utility to learn about the contents of any file on a Linux system
without having to open and examine the file yourself. In the following example, file
reports that letter_e.bz2 contains data that was compressed using the bzip2 utility
(page 64):
$ file letter_e.bz2
letter_e.bz2: bzip2 compressed data, block size = 900k

Next file reports on two more files:
$ file memo zach.jpg
memo:
ASCII text
zach.jpg: JPEG image data, ... resolution (DPI), 72 x 72

Refer to page 802 for more information on file.

| (Pipeline): Communicates Between Processes
Because pipelines are integral to the functioning of a Linux system, this chapter
introduces them for use in examples. Pipelines are covered in detail beginning on
page 141. Pipelines do not work with OS X resource forks; they work with data
forks only.
A pipeline (denoted by a pipe symbol that is written as a vertical bar [|] on the command line and appears as a solid or broken vertical line on a keyboard) takes the
output of one utility and sends that output as input to another utility. More accurately, a pipeline takes standard output of one process and redirects it to become
standard input of another process. See page 133 for more information on standard
output and standard input.
Some utilities, such as head, can accept input from a file named on the command line
or, via a pipeline, from standard input. In the following command line, sort processes
the months file (Figure 3-5, page 57); using a pipeline, the shell sends the output from
sort to the input of head, which displays the first four months of the sorted list:
$ sort months | head -4
Apr
Aug
Dec
Feb
wc

The next command line displays the number of files in a directory. The wc (word count)
utility with the –w (words) option displays the number of words in its standard input
or in a file you specify on the command line:
$ ls | wc -w
14

Four More Utilities 61

$ ls
memo memo.0714 practice
$ echo Hi
Hi
$ echo This is a sentence.
This is a sentence.
$ echo star: *
star: memo memo.0714 practice
$

Figure 3-9

echo copies the command line (but not the word echo) to the screen

You can also use a pipeline to send output of a program to the printer:
$ tail months | lpr

Four More Utilities
The echo and date utilities are two of the most frequently used members of the large
collection of Linux utilities. The script utility records part of a session in a file, and
unix2dos makes a copy of a Linux text file that can be read on a machine running
either Windows or Macintosh OS X.

echo: Displays Text
The echo utility copies the characters you type on the command line following echo
to the screen. Figure 3-9 shows some echo commands. The last command shows what
the shell does with an unquoted asterisk (*) on the command line: It expands the
asterisk into a list of filenames in the directory.
The echo utility is a good tool for learning about the shell and other Linux utilities.
Some examples on page 149 use echo to illustrate how special characters, such as the
asterisk, work. Throughout Chapters 5, 8, and 10, echo helps explain how shell variables work and how you can send messages from shell scripts to the screen. Refer to
page 794 for more information on echo.

optional You can use echo to create a simple file by redirecting its output to a file:
$ echo 'My new file.' > myfile
$ cat myfile
My new file.

The greater than (>) sign tells the shell to redirect the output of echo to the file named
myfile instead of to the screen. For more information refer to “Redirecting Standard
Output” on page 136.

62 Chapter 3 The Utilities

date: Displays the Time and Date
The date utility displays the current date and time:
$ date
Tue Apr

3 10:14:41 PDT 2012

The following example shows how you can specify the format and contents of the
output of date:
$ date +"%A %B %d"
Tuesday April 03

Refer to page 769 for more information on date.

script: Records a Shell Session
The script utility records all or part of a login session, including your input and the
system’s responses. This utility is useful only from character-based devices, such as a
terminal or a terminal emulator. It does capture a session with vim; however, because
vim uses control characters to position the cursor and display different typefaces, such
as bold, the output will be difficult to read and might not be useful. When you cat a
file that has captured a vim session, the session quickly passes before your eyes.
By default script captures the session in a file named typescript. To specify a different
filename, follow the script command with a SPACE and the filename. To append to a
file, use the –a option after script but before the filename; otherwise script overwrites
an existing file. Following is a session being recorded by script:
$ script
Script started, file is typescript
$ ls -l /bin | head -5
-rwxr-xr-x. 1 root root
123 02-07
-rwxr-xr-x. 1 root root
25948 02-08
lrwxrwxrwx. 1 root root
4 02-25
-rwxr-xr-x. 1 root root
25088 02-08
$ exit
exit
Script done, file is typescript

17:32
03:46
16:52
03:46

alsaunmute
arch
awk -> gawk
basename

Use the exit command to terminate a script session. You can then view the file you created
using cat, less, more, or an editor. Following is the file created by the preceding script
command:
$ cat typescript
Script started on Tue 03 Apr 2012 10:16:36 AM PDT
$ ls -l /bin | head -5
-rwxr-xr-x. 1 root root
123 02-07 17:32 alsaunmute
-rwxr-xr-x. 1 root root
25948 02-08 03:46 arch
lrwxrwxrwx. 1 root root
4 02-25 16:52 awk -> gawk
-rwxr-xr-x. 1 root root
25088 02-08 03:46 basename
$ exit

Compressing and Archiving Files

63

exit
Script done on Tue 03 Apr 2012 10:16:50 AM PDT

If you will be editing the file, you can use dos2unix (next) to eliminate from the
typescript file the ^M characters that appear at the ends of the lines. Refer to the
script man page for more information.

unix2dos: Converts Linux Files to Windows and Macintosh
OS X Format
unix2dos,
unix2mac

If you want to share a text file you created on a Linux system with someone on a Windows or Macintosh OS X system, you need to convert the file for the person on the
other system to read it easily. The unix2dos utility converts a Linux text file so it can
be read on a Windows machine; use unix2mac to convert a Linux file so it can be read
on a Macintosh system. This utility is part of the dos2unix software package. Some
distributions use todos in place of unix2dos; todos is part of the tofrodos software
package and has no Macintosh-specific conversion utility. If you are using unix2dos,
enter the following command to convert a file named memo.txt (created with a text
editor) to a DOS-format file (use unix2mac to convert to a Macintosh-format file):
$ unix2dos memo.txt

You can now email the file as an attachment to someone on a Windows or OS X
system. This utility overwrites the original file.
dos2unix,
mac2unix

The dos2unix (or fromdos) utility converts Windows files so they can be read on a
Linux system (use mac2unix to convert from a Macintosh system):
$ dos2unix memo.txt

See the dos2unix man page for more information.
tr

You can also use tr (translate; page 987) to change a Windows or OS X text file into
a Linux text file. In the following example, the –d (delete) option causes tr to remove
RETURNs (represented by \r) as it makes a copy of the file:
$ cat memo | tr -d '\r' > memo.txt

The greater than (>) symbol redirects the standard output of tr to the file named
memo.txt. For more information refer to “Redirecting Standard Output” on
page 136. Converting a file the other way without using unix2dos is not as easy.

Compressing and Archiving Files
Large files use more disk space and take longer to transfer over a network than
smaller files. To reduce these factors you can compress a file without losing any of
the information it holds. Similarly, a single archive of several files packed into a larger

64 Chapter 3 The Utilities

file is easier to manipulate, upload, download, and email than multiple files. You
might frequently download compressed, archived files from the Internet. The utilities
described in this section compress and decompress files and pack and unpack
archives.

bzip2: Compresses a File
The bzip2 utility compresses a file by analyzing it and recoding it more efficiently. The
new version of the file looks completely different. In fact, because the new file contains many nonprinting characters, you cannot view it directly. The bzip2 utility
works particularly well on files that contain a lot of repeated information, such as
text and image data, although most image data is already in a compressed format.
The following example shows a boring file. Each of the 8,000 lines of the letter_e file
contains 72 e’s and a NEWLINE character that marks the end of the line. The file occupies
more than half a megabyte of disk storage.
$ ls -l
-rw-rw-r--. 1 sam pubs 584000 03-01 22:31 letter_e

The –l (long) option causes ls to display more information about a file. Here it shows
that letter_e is 584,000 bytes long. The –v (verbose) option causes bzip2 to report
how much it was able to reduce the size of the file. In this case it shrank the file by
99.99 percent:
$ bzip2 -v letter_e
letter_e: 11680.00:1, 0.001 bits/byte, 99.99% saved, 584000 in, 50 out.
$ ls -l
-rw-rw-r--. 1 sam pubs 50 03-01 22:31 letter_e.bz2
.bz2 filename
extension

Now the file is only 50 bytes long. The bzip2 utility also renamed the file, appending
.bz2 to its name. This naming convention reminds you that the file is compressed; you
would not want to display or print it, for example, without first decompressing it.
The bzip2 utility does not change the modification date associated with the file, even
though it completely changes the file’s contents.

Keep the original file by using the –k option
tip The bzip2 utility and its counterpart, bunzip2, remove the original file when they compress or
decompress a file. Use the –k (keep) option to keep the original file.

In the following, more realistic example, the file zach.jpg contains a computer graphics
image:
$ ls -l
-rw-r--r--. 1 sam pubs 33287 03-01 22:40 zach.jpg

The bzip2 utility can reduce the size of the file by only 28 percent because the image
is already in a compressed format:
$ bzip2 -v zach.jpg
zach.jpg: 1.391:1, 5.749 bits/byte, 28.13% saved, 33287 in, 23922 out.

Compressing and Archiving Files

65

$ ls -l
-rw-r--r--. 1 sam pubs 23922 03-01 22:40 zach.jpg.bz2

Refer to page 732, www.bzip.org, and the Bzip2 mini-HOWTO (see page 41 for
instructions on obtaining this document) for more information.

bzcat and bunzip2: Decompress a File
bzcat

The bzcat utility displays a file that has been compressed with bzip2. The equivalent
of cat for .bz2 files, bzcat decompresses the compressed data and displays the decompressed data. Like cat, bzcat does not change the source file. The pipe symbol in the
following example redirects the output of bzcat so that instead of being displayed on
the screen it becomes the input to head, which displays the first two lines of the file:
$ bzcat letter_e.bz2 | head -2
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

After bzcat is run, the contents of letter_e.bz2 is unchanged; the file is still stored on
the disk in compressed form.
bunzip

The bunzip2 utility restores a file that has been compressed with bzip2:
$ bunzip2 letter_e.bz2
$ ls -l
-rw-rw-r--. 1 sam pubs 584000 03-01 22:31 letter_e
$ bunzip2 zach.jpg.bz2
$ ls -l
-rw-r--r--. 1 sam pubs 33287 03-01 22:40 zach.jpg

bzip2recover

The bzip2recover utility supports limited data recovery from media errors. Give the
command bzip2recover followed by the name of the compressed, corrupted file from
which you want to try to recover data.

gzip: Compresses a File
gunzip and zcat

The gzip (GNU zip) utility is older and less efficient than bzip2. Its flags and operation
are very similar to those of bzip2. A file compressed by gzip is marked with a .gz filename extension. Linux stores manual pages in gzip format to save disk space;
likewise, files you download from the Internet are frequently in gzip format. Use gzip,
gunzip, and zcat just as you would use bzip2, bunzip2, and bzcat, respectively. Refer to
page 838 for more information on gzip.

compress

The compress utility can also compress files, albeit not as well as gzip. This utility
marks a file it has compressed by adding .Z to its name.

gzip versus zip
tip Do not confuse gzip and gunzip with the zip and unzip utilities. These last two are used to pack
and unpack zip archives containing several files compressed into a single file that has been
imported from or is being exported to a Windows system. The zip utility constructs a zip archive,
whereas unzip unpacks zip archives. The zip and unzip utilities are compatible with PKZIP, a
Windows program that compresses and archives files.

66 Chapter 3 The Utilities

tar: Packs and Unpacks Archives
The tar utility performs many functions. Its name is short for tape archive, as its original function was to create and read archive and backup tapes. Today it is used to
create a single file (called a tar file, archive, or tarball) from multiple files or directory
hierarchies and to extract files from a tar file. The cpio (page 758) and pax (page 907)
utilities perform a similar function.
In the following example, the first ls shows the sizes of the files g, b, and d. Next tar
uses the –c (create), –v (verbose), and –f (write to or read from a file) options to create
an archive named all.tar from these files. Each line of output displays the name of the
file tar is appending to the archive it is creating.
The tar utility adds overhead when it creates an archive. The next command shows
that the archive file all.tar occupies more than 9,700 bytes, whereas the sum of the
sizes of the three files is about 6,000 bytes. This overhead is more appreciable on
smaller files, such as the ones in this example:
$ ls -l g b
-rw-r--r--.
-rw-r--r--.
-rw-r--r--.

d
1 zach other 1178 08-20 14:16 b
1 zach zach 3783 08-20 14:17 d
1 zach zach 1302 08-20 14:16 g

$ tar -cvf all.tar g b d
g
b
d
$ ls -l all.tar
-rw-r--r--. 1 zach zach 9728 08-20 14:17 all.tar
$ tar -tvf
-rw-r--r--rw-r--r--rw-r--r--

all.tar
zach /zach
zach /other
zach /zach

1302 2012-08-20 14:16 g
1178 2012-08-20 14:16 b
3783 2012-08-20 14:17 d

The final command in the preceding example uses the –t option to display a table of
contents for the archive. Use –x in place of –t to extract files from a tar archive. Omit
the –v option if you want tar to do its work silently.2
You can use bzip2, compress, or gzip to compress tar files, making them easier to store
and handle. Many files you download from the Internet will already be in one of these
formats. Files that have been processed by tar and compressed by bzip2 frequently
have a filename extension of .tar.bz2 or .tbz. Those processed by tar and gzip have an
extension of .tar.gz or .tgz, whereas files processed by tar and compress use .tar.Z as
the extension.
2. Although the original UNIX tar did not use a leading hyphen to indicate an option on the command
line, the GNU/Linux version accepts hyphens but works as well without them. This book precedes tar
options with a hyphen for consistency with most other utilities.

Compressing and Archiving Files

67

You can unpack a tarred and gzipped file in two steps. (Follow the same procedure if
the file was compressed by bzip2, but use bunzip2 instead of gunzip.) The next example
shows how to unpack the GNU make utility after it has been downloaded
(ftp.gnu.org/pub/gnu/make/make-3.82.tar.gz):
$ ls -l mak*
-rw-r--r--. 1 sam pubs 1712747 04-05 10:43 make-3.82.tar.gz
$ gunzip mak*
$ ls -l mak*
-rw-r--r--. 1 sam pubs 6338560 04-05 10:43 make-3.82.tar
$ tar -xvf mak*
make-3.82/
make-3.82/vmsfunctions.c
make-3.82/getopt.h
make-3.82/make.1
...
make-3.82/README.OS2
make-3.82/remote-cstms.c

The first command lists the downloaded tarred and gzipped file: make-3.82.tar.gz
(about 1.7 megabytes). The asterisk (*) in the filename matches any characters
in any filenames (page 149), so ls displays a list of files whose names begin with
mak; in this case there is only one. Using an asterisk saves typing and can improve
accuracy with long filenames. The gunzip command decompresses the file and
yields make-3.82.tar (no .gz extension), which is about 6.3 megabytes. The tar command creates the make-3.82 directory in the working directory and unpacks the files
into it.
$ ls -ld mak*
drwxr-xr-x. 8 sam pubs
4096 2010-07-27 make-3.82
-rw-r--r--. 1 sam pubs 6338560 04-05 10:43 make-3.82.tar
$ ls -l make-3.82
-rw-r--r--. 1 sam
-rw-r--r--. 1 sam
-rw-r--r--. 1 sam
-rw-r--r--. 1 sam
...
-rw-r--r--. 1 sam
-rw-r--r--. 1 sam
drwxr-xr-x. 6 sam

pubs
pubs
pubs
pubs

53838
4783
36990
14231

2010-07-27
2010-07-12
2010-07-27
2002-10-14

pubs
pubs
pubs

18391 2010-07-12
17905 2010-07-19
4096 2010-07-27

ABOUT-NLS
acinclude.m4
aclocal.m4
alloca.c
vmsjobs.c
vpath.c
w32

After tar extracts the files from the archive, the working directory contains two files
whose names start with mak: make-3.82.tar and make-3.82. The –d (directory) option
causes ls to display only file and directory names, not the contents of directories as it
normally does. The final ls command shows the files and directories in the make-3.82
directory. Refer to page 968 for more information on tar.

68 Chapter 3 The Utilities

tar: the –x option might extract a lot of files
caution Some tar archives contain many files. To list the files in the archive without unpacking them, run
tar with the –tf options followed by the name of the tar file. In some cases you might want to create
a new directory (mkdir [page 90]), move the tar file into that directory, and expand it there. That
way the unpacked files will not mingle with existing files, and no confusion will occur. This strategy
also makes it easier to delete the extracted files. Depending on how they were created, some tar
files automatically create a new directory and put the files into it; the –t option indicates where tar
will place the files you extract.

tar: the –x option can overwrite files
caution The –x option to tar overwrites a file that has the same filename as a file you are extracting. Follow
the suggestion in the preceding caution box to avoid overwriting files.

optional You can combine the gunzip and tar commands on one command line using a pipe
symbol (|), which redirects the output of gunzip so it becomes the input to tar:
$ gunzip -c make-3.82.tar.gz | tar -xvf -

The –c option causes gunzip to send its output through the pipeline instead of creating
a file. The final hyphen (–) causes tar to read from standard input. Refer to “Pipelines”
(page 141), gzip (pages 65 and 968), and tar (page 968) for more information about
how this command line works.
A simpler solution is to use the –z option to tar. This option causes tar to call gunzip (or
gzip when you are creating an archive) directly and simplifies the preceding command
line to
$ tar -xvzf make-3.82.tar.gz

In a similar manner, the –j option calls bzip2 or bunzip2.

Locating Utilities
The whereis and locate utilities can help you find a command whose name you have
forgotten or whose location you do not know. When multiple copies of a utility or
program are present, which tells you which copy you will run. The locate utility
searches for files on the local system.

which and whereis: Locate a Utility
When you give Linux a command, the shell searches a list of directories for a program
with that name. This list of directories is called a search path. For information on
how to change the search path, refer to “PATH: Where the Shell Looks for Programs”
on page 307. If you do not change the search path, the shell searches only a standard
set of directories and then stops searching. However, other directories on the system
might also contain useful utilities.
which

The which utility locates utilities by displaying the full pathname of the file for the utility. (Chapter 4 contains more information on pathnames and the structure of the

Locating Utilities 69

Linux filesystem.) The local system might include several utilities that have the same
name. When you type the name of a utility, the shell searches for the utility in your
search path and runs the first one it finds. You can find out which copy of the utility
the shell will run by using which. In the following example, which reports the location
of the tar utility:
$ which tar
/bin/tar

The which utility can be helpful when a utility seems to be working in unexpected ways.
By running which, you might discover that you are running a nonstandard version of a
tool or a different one from the one you expected. (“Important Standard Directories
and Files” on page 95 provides a list of standard locations for executable files.) For
example, if tar is not working properly and you find that you are running
/usr/local/bin/tar instead of /bin/tar, you might suspect the local version is broken.
whereis

The whereis utility searches for files related to a utility by looking in standard locations
instead of using your search path. For example, you can find the locations for files
related to tar:
$ whereis tar
tar: /bin/tar /usr/share/man/man1/tar.1.gz

In this example whereis finds two references to tar: the tar utility file and the (compressed)
tar man page.

which versus whereis
tip Given the name of a utility, which looks through the directories in your search path (page 307) in
order and locates the utility. If your search path includes more than one utility with the specified
name, which displays the name of only the first one (the one you would run).
The whereis utility looks through a list of standard directories and works independently of your
search path. Use whereis to locate a binary (executable) file, any manual pages, and source code
for a program you specify; whereis displays all the files it finds.

which, whereis, and builtin commands
caution Both the which and whereis utilities report only the names for utilities as they are found on the
disk; they do not report shell builtins (utilities that are built into a shell; page 153). When you use
whereis to try to find where the echo command (which exists as both a utility program and a
shell builtin) is kept, it displays the following information:
$ whereis echo
echo: /bin/echo /usr/share/man/man1/echo.1.gz

The whereis utility does not display the echo builtin. Even the which utility reports the wrong
information:
$ which echo
/bin/echo

Under bash you can use the type builtin (page 477) to determine whether a command is a builtin:
$ type echo
echo is a shell builtin

70 Chapter 3 The Utilities

locate: Searches for a File
The locate utility (locate package; some distributions use mlocate) searches for files
on the local system:
$ locate init
/boot/initramfs-2.6.38-0.rc5.git1.1.fc15.i686.img
/boot/initrd-plymouth.img
/etc/gdbinit
/etc/gdbinit.d
/etc/init
/etc/init.d
...

Before you can use locate (mlocate), the updatedb utility must build or update the locate
(mlocate) database. Typically the database is updated once a day by a cron script
(page 763).

Displaying User and System Information
This section covers utilities that provide information about who is using the system,
what those users are doing, and how the system is running.
To find out who is using the local system, you can employ one of several utilities that
vary in the details they provide and the options they support. The oldest utility, who,
produces a list of users who are logged in on the local system, the device each person
is using, and the time each person logged in.
The w and finger utilities show more detail, such as each user’s full name and the command line each user is running. The finger utility can retrieve information about users
on remote systems. Table 3-1 on page 73 summarizes the output of these utilities.

who: Lists Users on the System
The who utility displays a list of users who are logged in on the local system. In
Figure 3-10 the first column who displays shows that Sam, Max, and Zach are
logged in. (Max is logged in from two locations.) The second column shows the
name of the device that each user’s terminal, workstation, or terminal emulator is
connected to. The third column shows the date and time the user logged in. An
optional fourth column shows (in parentheses) the name of the system a remote
user logged in from.
The information who displays is useful when you want to communicate with a user
on the local system. When the user is logged in, you can use write (page 74) to establish communication immediately. If who does not list the user or if you do not need
to communicate immediately, you can send email to that person (page 76).
If the output of who scrolls off the screen, you can redirect the output using a pipe
symbol (|; page 60) so it becomes the input to less, which displays the output one

Displaying User and System Information 71

$ who
sam
max
zach
max

Figure 3-10

tty4
tty2
tty1
pts/4

2012-07-25
2012-07-25
2012-07-25
2012-07-25

17:18
16:42
16:39
17:27 (guava)

who lists who is logged in

screen at a time. You can also use a pipe symbol to redirect the output through grep
to look for a specific name.
If you need to find out which terminal you are using or what time you logged in, you
can use the command who am i:
$ who am i
max
pts/4

2012-07-25 17:27 (guava)

finger: Lists Users on the System
The finger utility displays a list of users who are logged in on the local system and, in
some cases, information about remote systems and users. In addition to usernames, finger supplies each user’s full name along with information about which device the user’s
terminal is connected to, how recently the user typed something on the keyboard, when
the user logged in, and contact information. If the user has logged in over the network,
the name of the remote system is shown as the user’s office. For example, in Figure 3-11
Max is logged in from the remote system named guava. The asterisks (*) in front of
the device names in the Tty column indicate the user has blocked messages sent directly
to his terminal (refer to “mesg: Denies or Accepts Messages” on page 75).

finger can be a security risk
security On systems where security is a concern, the system administrator might disable finger because
it can reveal information that can help a malicious user break into a system. Mac OS X disables
remote finger support by default.

You can also use finger to learn more about an individual by specifying a username
on the command line. In Figure 3-12 on the next page, finger displays detailed information about Max: He is logged in and actively using one of his terminals (tty2), and
he has not typed at his other terminal (pts/4) for 3 minutes and 7 seconds. You also
learn from finger that if you want to set up a meeting with Max, you should contact
Sam at extension 1693.
$ finger
Login
max
max
sam
zach

Figure 3-11

Name
Max Wild
Max Wild
Sam the Great
Zach Brill

Tty
*tty2
pts/4
*tty4
*tty1

finger I: lists who is logged in

Idle
3
29
1:07

Login Time
Office ...
Jul 25 16:42
Jul 25 17:27 (guava)
Jul 25 17:18
Jul 25 16:39

72 Chapter 3 The Utilities

$ finger max
Login: max
Name: Max Wild
Directory: /home/max
Shell: /bin/tcsh
On since Wed Jul 25 16:42 (PDT) on tty2 (messages off)
On since Wed Jul 25 17:27 (PDT) on pts/4 from guava
3 minutes 7 seconds idle
New mail received Tue Jul 25 17:16 2013 (PDT)
Unread since Tue Jul 25 16:44 2013 (PDT)
Plan:
I will be at a conference in Hawaii next week.
If you need to see me, contact Sam, x1693.

Figure 3-12
.plan and .project

finger II: lists details about one user

Most of the information in Figure 3-12 was collected by finger from system files. The
information shown after the heading Plan:, however, was supplied by Max. The finger
utility searched for a file named .plan in Max’s home directory and displayed its contents. (Filenames that begin with a period, such as .plan, are not normally listed by
ls and are called hidden filenames [page 86].)
You might find it helpful to create a .plan file for yourself; it can contain any information you choose, such as your schedule, interests, phone number, or address. In a similar
manner, finger displays the contents of the .project and .pgpkey files in your home
directory. If Max had not been logged in, finger would have reported only his user information, the last time he logged in, the last time he read his email, and his plan.
You can also use finger to display a user’s username. For example, on a system with
a user named Helen Simpson, you might know that Helen’s last name is Simpson but
might not guess her username is hls. The finger utility, which is not case sensitive, can
search for information on Helen using her first or last name. The following commands find the information you seek as well as information on other users whose
names are Helen or Simpson:
$ finger HELEN
Login: hls
...
$ finger simpson
Login: hls
...

Name: Helen Simpson.

Name: Helen Simpson.

uptime: Displays System Load and Duration Information
The uptime utility displays a single line that includes the time of day, the period of time
the computer has been running (in days, hours, and minutes), the number of users
logged in, and the load average (how busy the system is). The three load average numbers represent the number of jobs waiting to run, averaged over the past 1, 5, and 15
minutes.
$ uptime
09:49:14 up 2 days, 23:13,

3 users,

load average: 0.00, 0.01, 0.05

Displaying User and System Information 73

$ w
17:47:35 up 1 day, 8:10,
USER
TTY
FROM
sam
tty4
max
tty2
zach
tty1
max
pts/4
guava

Figure 3-13

6 users, load average: 0.34, 0.23, 0.26
LOGIN@
IDLE
JCPU
PCPU WHAT
17:18
29:14m 0.20s 0.00s vi memo
16:42
0.00s 0.20s 0.07s w
16:39
1:07
0.05s 0.00s run_bdgt
17:27
3:10m 0.24s 0.24s -bash

The w utility

w: Lists Users on the System
The first line the w utility displays is the same as the output of uptime (above). Following
that line, w displays a list of the users who are logged in. As discussed in the section on
who, the information that w displays is useful when you want to communicate with
someone at your installation.
The first column in Figure 3-13 shows that Max, Zach, and Sam are logged in. The
second column shows the name of the device file each user’s terminal is connected to.
The third column shows the system that a remote user is logged in from. The fourth
column shows the time each user logged in. The fifth column indicates how long each
user has been idle (how much time has elapsed since the user pressed a key on the
keyboard). The next two columns identify how much computer processor time each
user has used during this login session and on the task that user is running. The last
column shows the command each user is running. Table 3-1 compares the w, who, and
finger utilities.

Table 3-1

Comparison of w, who, and finger

Information displayed

w

who

finger

Username

x

x

x

Terminal-line identification (tty)

x

x

x

Login time (and day for old logins)

x
x

x

Login date and time
Idle time

x

Program the user is executing

x

Location the user logged in from
CPU time used

x

x
x

Full name (or other information from /etc/passwd)

x

User-supplied vanity information

x

System uptime and load average

x

74 Chapter 3 The Utilities

$ write max
Hi Max, are you there? o

Figure 3-14

The write utility I

free: Displays Memory Usage Information
The free utility displays the amount of physical (RAM) and swap (swap space on the
disk; page 1096) memory in the local system. It displays columns for total, used, and
free memory as well as for kernel buffers. The column labeled shared is obsolete. This
utility is not available under OS X; vm_stat performs a similar function.
In the following example, the –m option causes free to display memory sizes in megabytes and the –t option adds the line labeled Total to the end of the output. You can
cause free to display memory sizes in gigabytes (–g), megabytes (–m), kilobytes (–k;
the default), or bytes (–b). See the free man page for additional options.
$ free -mt
total
Mem:
2013
-/+ buffers/cache:
Swap:
2044
Total:
4058

used
748
254
0
748

free
1264
1759
2044
3309

shared
0

buffers
110

cached
383

One of the ways Linux takes advantage of free memory is to allocate memory it is
not otherwise using to buffers (page 1059) and cache (page 1059). Thus the value on
the Mem line in the free column will be small and is not representative of the total
available memory when the kernel is working properly. As the kernel needs more
memory, it reallocates memory it had allocated to buffers and cache.
The –/+ buffers/cache line gives values assuming memory used for buffers and cache
is free memory. The value in the used column on this line assumes buffers and cache
(110 + 383 = 493 on the Mem line) are freed; thus the value in the used column is
254 (~748 – 493) while the value in the free column value increases to 1759
(~1,264 + 493). Unlike the value in the free column on the Mem line, as the value
in the free column on the –/+ buffers/cache line approaches zero, the system is truly
running out of memory.
The Swap line displays the total, used, and free amounts of swap space.

Communicating with Other Users
The utilities discussed in this section enable you to exchange messages and files with
other users either interactively or through email.

write: Sends a Message
The write utility sends a message to another user who is logged in. When you and
another user use write to send messages to each other, you establish two-way

Communicating with Other Users 75

$ write max
Hi Max, are you there? o
Message from max@guava on pts/4 at 18:23 ...
Yes Zach, I'm here. o

Figure 3-15

The write utility II

communication. Initially a write command (Figure 3-14) displays a banner on the
other user’s terminal, saying that you are about to send a message.
The syntax of a write command line is
write username [terminal]
The username is the username of the user you want to communicate with. The terminal
is an optional device name that is useful if the user is logged in more than once. You can
display the usernames and device names of all users who are logged in on the local system
by using who, w, or finger.
To establish two-way communication with another user, you and the other user must
each execute write, specifying the other’s username as the username. The write utility
then copies text, line by line, from one keyboard/display to the other (Figure 3-15).
Sometimes it helps to establish a convention, such as typing o (for “over”) when you
are ready for the other person to type and typing oo (for “over and out”) when you
are ready to end the conversation. When you want to stop communicating with the
other user, press CONTROL-D at the beginning of a line. Pressing CONTROL-D tells write to quit,
displays EOF (end of file) on the other user’s terminal, and returns you to the shell.
The other user must do the same.
If the Message from banner appears on your screen and obscures something you are
working on, press CONTROL- L or CONTROL- R to refresh the screen and remove the banner.
Then you can clean up, exit from your work, and respond to the person who is writing to you. You have to remember who is writing to you, however, because the banner
will no longer appear on the screen.

mesg: Denies or Accepts Messages
If messages to your screen are blocked, give the following mesg command to allow
other users to send you messages:
$ mesg y

If Max had not given this command before Zach tried to send him a message, Zach
might have seen the following message:
$ write max
write: max has messages disabled

You can block messages by entering mesg n. Give the command mesg by itself to display
is y (for “yes, messages are allowed”) or is n (for “no, messages are not allowed”).

76 Chapter 3 The Utilities

If you have messages blocked and you write to another user, write displays the following
message because even if you are allowed to write to another user, the user will not be
able to respond to you:
$ write max
write: you have write permission turned off.

Email
Email enables you to communicate with users on the local system as well as those on
the network. If you are connected to the Internet, you can communicate electronically
with users around the world.
Email utilities differ from write in that they can send a message when the recipient is
not logged in. In this case the email is stored until the recipient reads it. These utilities
can also send the same message to more than one user at a time.
Many email programs are available for Linux, including the original character-based
mail program, Mozilla/Thunderbird, pine, mail through emacs, KMail, and evolution.
Another popular graphical email program is sylpheed (sylpheed.sraoss.jp/en).

Two programs are available that can make any email program easier to use and more
secure. The procmail program (www.procmail.org) creates and maintains email servers
and mailing lists; preprocesses email by sorting it into appropriate files and directories;
starts various programs depending on the characteristics of incoming email; forwards
email; and so on. The GNU Privacy Guard (GPG or GNUpg) encrypts and decrypts
email making it almost impossible for an unauthorized person to read.
Network addresses

If the local system is part of a LAN, you can generally send email to and receive email
from users on other systems on the LAN by using their usernames. Someone sending
Max email on the Internet would need to specify his domain name (page 1066) along
with his username. Use this address to send email to the author of this book:
mgs@sobell.com.

Chapter Summary
The utilities introduced in this chapter are a small but powerful subset of the many
utilities available on a typical system. Because you will use them frequently and
because they are integral to the following chapters, it is important that you become
comfortable using them.
The utilities listed in Table 3-2 manipulate, display, compare, and print files.

Table 3-2

File utilities

Utility

Function

cp

Copies one or more files (page 53)

Chapter Summary

Table 3-2

77

File utilities (continued)

Utility

Function

diff

Displays the differences between two files (page 58)

file

Displays information about the contents of a file (page 60)

grep

Searches file(s) for a string (page 56)

head

Displays the lines at the beginning of a file (page 56)

lpq

Displays a list of jobs in the print queue (page 55)

lpr

Places file(s) in the print queue (page 55)

lprm

Removes a job from the print queue (page 56)

mv

Renames a file or moves file(s) to another directory (page 54)

sort

Puts a file in order by lines (page 58)

tail

Displays the lines at the end of a file (page 57)

uniq

Displays the contents of a file, skipping adjacent duplicate lines (page 58)

To reduce the amount of disk space a file occupies, you can compress it using
the bzip2 utility. Compression works especially well on files that contain patterns, as do most text files, but reduces the size of almost all files. The inverse
of bzip2—bunzip2—restores a file to its original, decompressed form. Table 3-3
lists utilities that compress and decompress files. The bzip2 utility is the most efficient of these.

Table 3-3

(De)compression utilities

Utility

Function

bunzip2

Returns a file compressed with bzip2 to its original size and format (page 65)

bzcat

Displays a file compressed with bzip2 (page 65)

bzip2

Compresses a file (page 64)

compress

Compresses a file (not as well as bzip2 or gzip; page 65)

gunzip

Returns a file compressed with gzip or compress to its original size and
format (page 65)

gzip

Compresses a file (not as well as bzip2; page 65)

unzip

Unpacks zip archives, which are compatible with Windows PKZIP

zcat

Displays a file compressed with gzip (page 65)

zip

Constructs zip archives, which are compatible with Windows PKZIP

78 Chapter 3 The Utilities

An archive is a file, frequently compressed, that contains a group of files. The tar
utility (Table 3-4) packs and unpacks archives. The filename extensions .tar.bz2,
.tar.gz, and .tgz identify compressed tar archive files and are often seen on software
packages obtained over the Internet.

Table 3-4

Archive utility

Utility

Function

tar

Creates or extracts files from an archive file (page 66)

The utilities listed in Table 3-5 determine the location of a utility on the local system.
For example, they can display the pathname of a utility or a list of C++ compilers
available on the local system.

Table 3-5
Utility

Location utilities
Function

locate/mlocate Searches for files on the local system (page 70)
whereis

Displays the full pathnames of a utility, source code, or man page
(page 68)

which

Displays the full pathname of a command you can run (page 68)

Table 3-6 lists utilities that display information about the local system and other users.
You can easily learn a user’s full name, login status, login shell, and other information
maintained by the system.

Table 3-6

User and system information utilities

Utility

Function

finger

Displays detailed information about users, including their full names (page 71)

free

Displays memory usage information (page 74)

hostname

Displays the name of the local system (page 53)

uptime

Displays system load and duration information (page 72)

w

Displays detailed information about users who are logged in on the local
system (page 73)

who

Displays information about users who are logged in on the local system
(page 70)

The utilities shown in Table 3-7 can help you stay in touch with other users on the
local network.

Exercises 79

Table 3-7

User communication utilities

Utility

Function

mesg

Permits or denies messages sent by write (page 75)

write

Sends a message to another user who is logged in (page 74)

Table 3-8 lists miscellaneous utilities.

Table 3-8

Miscellaneous utilities

Utility

Function

date

Displays the current date and time (page 62)

echo

Copies its arguments (page 1055) to the screen (page 61)

Exercises
1. Which commands can you use to determine who is logged in on a specific
terminal?
2. How can you keep other users from using write to communicate with you?
Why would you want to?
3. What happens when you give the following commands if the file named
done already exists?
$ cp to_do done
$ mv to_do done

4. How can you find out which utilities are available on your system for editing files? Which utilities are available for editing on your system?
5. How can you find the phone number for Ace Electronics in a file named
phone that contains a list of names and phone numbers? Which command
can you use to display the entire file in alphabetical order? How can you display the file without any adjacent duplicate lines? How can you display the
file without any duplicate lines?
6. What happens when you use diff to compare two binary files that are not
identical? (You can use gzip to create the binary files.) Explain why the diff
output for binary files is different from the diff output for ASCII files.
7. Create a .plan file in your home directory. Does finger display the contents
of your .plan file?

80 Chapter 3 The Utilities

8. What is the result of giving the which utility the name of a command that
resides in a directory that is not in your search path?
9. Are any of the utilities discussed in this chapter located in more than one
directory on the local system? If so, which ones?
10. Experiment by calling the file utility with the names of files in /usr/bin. How
many different types of files are there?
11. Which command can you use to look at the first few lines of a file named
status.report? Which command can you use to look at the end of the file?

Advanced Exercises
12. Re-create the colors.1 and colors.2 files used in Figure 3-8 on page 59. Test
your files by running diff –u on them. Does diff display the same results as
in the figure?
13. Try giving these two commands:
$ echo cat
$ cat echo

Explain the differences between the output of each command.
14. Repeat exercise 5 using the file phone.gz, a compressed version of the list
of names and phone numbers. Consider more than one approach to answer
each question and explain how you made your choices.
15. Find or create files that
a. gzip compresses by more than 80 percent.
b. gzip compresses by less than 10 percent.
c. Get larger when compressed with gzip.
d. Use ls –l to determine the sizes of the files in question. Can you characterize the files in a, b, and c?
16. Older email programs were not able to handle binary files. Suppose you are
emailing a file that has been compressed with gzip, which produces a binary
file, and the recipient is using an old email program. Refer to the man page
on uuencode, which converts a binary file to ASCII. Learn about the utility
and how to use it.
a. Convert a compressed file to ASCII using uuencode. Is the encoded file
larger or smaller than the compressed file? Explain. (If uuencode is not on
the local system, you can install it using one of the tools described in
Appendix C; it is part of the sharutils package.)
b. Would it ever make sense to use uuencode on a file before compressing
it? Explain.

4
The Filesystem

Chapter4
4

In This Chapter

Objectives

The Hierarchical Filesystem . . . . . . 82

After reading this chapter you should be able to:

Directory Files and Ordinary Files . . 83
The Working Directory. . . . . . . . . . . 86
Your Home Directory . . . . . . . . . . . . 87

 Define hierarchical filesystem, ordinary file, directory
file, home directory, working directory, and parent
directory

Pathnames . . . . . . . . . . . . . . . . . . . 87

 List best practices for filenames

Relative Pathnames . . . . . . . . . . . . 89

 Determine the name of the working directory

Working with Directories . . . . . . . . 90
Access Permissions . . . . . . . . . . . . 98
ACLs: Access Control Lists . . . . . . 104
Hard Links . . . . . . . . . . . . . . . . . . . 110
Symbolic Links . . . . . . . . . . . . . . . 112
Dereferencing Symbolic Links . . . 115

 Explain the difference between absolute and relative
pathnames
 Create and remove directories
 List files in a directory, remove files from a directory,
and copy and move files between directories
 List and describe the uses of standard Linux directories
and files
 Display and interpret file and directory ownership and
permissions
 Modify file and directory permissions
 Expand access control using ACLs
 Describe the uses, differences, and methods of creating
hard links and symbolic links

82 Chapter 4 The Filesystem

Grandparent

Aunt

Mom

Uncle

Sister

Brother

Self

Daughter 1

Grandchild 1

Daughter 2

Grandchild 2

Figure 4-1 A family tree

A filesystem is a set of data structures (page 1064) that usually resides on part of a
disk and holds directories of files. Filesystems store user and system data that are the
basis of users’ work on the system and the system’s existence. This chapter discusses
the organization and terminology of the Linux filesystem, defines ordinary and directory files, and explains the rules for naming them. It also shows how to create and
delete directories, move through the filesystem, and use absolute and relative pathnames to access files in various directories. It includes a discussion of important files
and directories as well as file access permissions and ACLs (Access Control Lists),
which allow you to share selected files with specified users. It concludes with a discussion of hard and symbolic links, which can make a single file appear in more than
one directory.
In addition to reading this chapter, you can refer to the df, fsck, mkfs, and tune2fs utilities in Part VII for more information on filesystems. If you are running OS X, see
“Filesystems” on page 1043.

The Hierarchical Filesystem
Family tree

A hierarchical (page 1072) structure frequently takes the shape of a pyramid. One
example of this type of structure is found by tracing a family’s lineage: A couple has
a child, who might in turn have several children, each of whom might have more children. This hierarchical structure is called a family tree (Figure 4-1).

Directory tree

Like the family tree it resembles, the Linux filesystem is called a tree. It consists of a
set of connected files. This structure allows you to organize files so you can easily find
any particular one. On a standard Linux system, each user starts with one directory,

Directory Files and Ordinary Files 83

correspond

personal

memos

business

milk_co

letter_1

cheese_co

letter_2

Figure 4-2 A secretary’s directories

to which the user can add subdirectories to any desired level. By creating multiple levels of subdirectories, a user can expand the structure as needed.
Subdirectories

Typically each subdirectory is dedicated to a single subject, such as a person, project,
or event. The subject dictates whether a subdirectory should be subdivided further.
For example, Figure 4-2 shows a secretary’s subdirectory named correspond. This
directory contains three subdirectories: business, memos, and personal. The business
directory contains files that store each letter the secretary types. If you expect many
letters to go to one client, as is the case with milk_co, you can dedicate a subdirectory
to that client.
One major strength of the Linux filesystem is its ability to adapt to users’ needs. You
can take advantage of this strength by strategically organizing your files so they are
most convenient and useful for you.

Directory Files and Ordinary Files
Like a family tree, the tree representing the filesystem is usually pictured upside down
with its root at the top. Figures 4-2 and 4-3 (on the next page) show that the tree
“grows” downward from the root with paths connecting the root to each of the other
files. At the end of each path is either an ordinary file or a directory file. Special files,
which can also appear at the ends of paths, provide access to operating system features.
Ordinary files, or simply files, appear at the ends of paths that cannot support other
paths. Directory files, also referred to as directories or folders, are the points that other
paths can branch off from. (Figures 4-2 and 4-3 show some empty directories.) When
you refer to the tree, up is toward the root and down is away from the root. Directories
directly connected by a path are called parents (closer to the root) and children (farther
from the root). A pathname is a series of names that trace a path along branches from
one file to another. See page 87 for more information about pathnames.

84 Chapter 4 The Filesystem

Directory

Directory

Directory

Directory

Ordinary File

Ordinary File

Directory

Directory

Ordinary File

Ordinary File

Ordinary File

Figure 4-3 Directories and ordinary files

Filenames
Every file has a filename. The maximum length of a filename varies with the type of
filesystem; Linux supports several types of filesystems. Most modern filesystems
allow files with names up to 255 characters long, however, some filesystems restrict
filenames to fewer characters. Although you can use almost any character in a filename, you will avoid confusion if you choose characters from the following list:
• Uppercase letters (A–Z)
• Lowercase letters (a–z)
• Numbers (0–9)
• Underscore (_)
• Period (.)
• Comma (,)
Like the children of one parent, no two files in the same directory can have the same
name. (Parents give their children different names because it makes good sense, but
Linux requires it.) Files in different directories, like the children of different parents,
can have the same name.
The filenames you choose should mean something. Too often a directory is filled with
important files with such unhelpful names as hold1, wombat, and junk, not to mention foo and foobar. Such names are poor choices because they do not help you recall
what you stored in a file. The following filenames conform to the suggested syntax
and convey information about the contents of the file:
• correspond
• january
• davis

Directory Files and Ordinary Files 85

• reports
• 2001
• acct_payable
Filename length

When you share your files with users on other systems, you might need to make long
filenames differ within the first few characters. Systems running DOS or older versions of Windows have an 8-character filename body length limit and a 3-character
filename extension length limit. Some UNIX systems have a 14-character limit, and
older Macintosh systems have a 31-character limit. If you keep filenames short, they
are easy to type; later you can add extensions to them without exceeding the shorter
limits imposed by some filesystems. The disadvantage of short filenames is that they
are typically less descriptive than long filenames.
Long filenames enable you to assign descriptive names to files. To help you select
among files without typing entire filenames, shells support filename completion. For
more information about this feature, see the “Filename completion” tip on page 53.

Case sensitivity

You can use uppercase and/or lowercase letters within filenames, but be careful:
Many filesystems are case sensitive. For example, the popular ext family of filesystems and the UFS filesystem are case sensitive, so files named JANUARY, January,
and january refer to three distinct files. The FAT family of filesystems (used mostly
for removable media) is not case sensitive, so those three filenames represent the same
file. The HFS+ filesystem, which is the default OS X filesystem, is case preserving but
not case sensitive; refer to “Case Sensitivity” on page 1043 for more information.

Do not use SPACEs within filenames
caution Although Linux allows you to use SPACEs within filenames, it is a poor idea. Because a SPACE is a
special character, you must quote it on a command line. Quoting a character on a command line
can be difficult for a novice user and cumbersome for an experienced user. Use periods or underscores instead of SPACEs: joe.05.04.26, new_stuff.
If you are working with a filename that includes a SPACE, such as a file from another operating system, you must quote the SPACE on the command line by preceding it with a backslash or by placing
quotation marks on either side of the filename. The two following commands send the file named
my file to the printer.
$ lpr my\ file
$ lpr "my file"

Filename Extensions
A filename extension is the part of the filename that follows an embedded period. In
the filenames listed in Table 4-1 on the next page, filename extensions help describe
the contents of the file. Some programs, such as the C programming language compiler, default to specific filename extensions; in most cases, however, filename
extensions are optional. Use extensions freely to make filenames easy to understand.
If you like, you can use several periods within the same filename—for example,
notes.4.10.54 or files.tar.gz. Under Mac OS X, some applications use filename extensions to identify files, but many use type codes and creator codes (page 1046).

86 Chapter 4 The Filesystem

Table 4-1

Filename extensions

Filename with extension

Meaning of extension

compute.c

A C programming language source file

compute.o

The object code file for compute.c

compute

The executable file for compute.c

memo.0410.txt

A text file

memo.pdf

A PDF file; view with xpdf or kpdf under a GUI

memo.ps

A PostScript file; view with ghostscript or kpdf under a GUI

memo.Z

A file compressed with compress (page 65); use
uncompress or gunzip (page 65) to decompress

memo.gz

A file compressed with gzip (page 65); view with zcat or
decompress with gunzip (both on page 65)

memo.tgz or memo.tar.gz

A tar (page 66) archive of files compressed with gzip (page 65)

memo.bz2

A file compressed with bzip2 (page 64); view with bzcat or
decompress with bunzip2 (both on page 65)

memo.html

A file meant to be viewed using a Web browser, such as Firefox

photo.gif, photo.jpg,
photo.jpeg, photo.bmp,
photo.tif, or photo.tiff

A file containing graphical information, such as a picture

Hidden Filenames
A filename that begins with a period is called a hidden filename (or a hidden file or
sometimes an invisible file) because ls does not normally display it. The command
ls –a displays all filenames, including hidden ones. Names of startup files (next page)
usually begin with a period so they are hidden and do not clutter a directory listing.
Two special hidden entries—single and double periods (. and ..)—appear in every
directory (page 92).

The Working Directory
pwd

While you are logged in on a character-based interface to a Linux system, you are
always associated with a directory. The directory you are associated with is called the
working directory or current directory. Sometimes this association is referred to in a
physical sense: “You are in (or working in) the zach directory.” The pwd (print working directory) builtin displays the pathname of the working directory.

Pathnames 87

login: max
Password:
Last login: Wed Oct 20 11:14:21 from 172.16.192.150
$ pwd
/home/max

Figure 4-4

Logging in and displaying the pathname of your home directory

Your Home Directory
When you first log in on a system or start a terminal emulator window, the working
directory is your home directory. To display the pathname of your home directory,
use pwd just after you log in (Figure 4-4). Linux home directories are typically located
in /home while Mac OS X home directories are located in /Users.
When called without arguments, the ls utility displays a list of the files in the working
directory. Because your home directory has been the only working directory you have
used so far, ls has always displayed a list of files in your home directory. (All the files
you have created up to this point were created in your home directory.)

Startup Files
Startup files, which appear in your home directory, give the shell and other programs
information about you and your preferences. Under Mac OS X these files are called
configuration files or preference files (page 1050). Frequently one of these files tells
the shell what kind of terminal you are using (page 1024) and executes the stty (set
terminal) utility to establish the erase (page 29) and line kill (page 30) keys.
Either you or the system administrator can put a shell startup file containing shell
commands in your home directory. The shell executes the commands in this file each
time you log in. Because the startup files have hidden filenames (filenames that begin
with a period; page 86), you must use the ls –a command to see whether one is in your
home directory. See page 278 (bash) and page 372 (tcsh) for more information about
startup files.

Pathnames
Every file has a pathname, which is a trail from a directory through part of the directory hierarchy to an ordinary file or a directory. Within a pathname, a slash (/)
following (to the right of) a filename indicates that the file is a directory file. The
file following (to the right of) the slash can be an ordinary file or a directory file.
The simplest pathname is a simple filename, which points to a file in the working
directory. This section discusses absolute and relative pathnames and explains how
to use each.

88 Chapter 4 The Filesystem

/
/home

/etc
home

tmp

etc
/home/hls

max

zach

hls
/home/hls/notes

/home/zach
bin

notes
/home/hls/bin/log

report

Figure 4-5

log

Absolute pathnames

Absolute Pathnames
/ (root)

The root directory of the filesystem hierarchy does not have a name; it is referred to
as the root directory and is represented by a slash (/) standing alone or at the left end
of a pathname.
An absolute pathname starts with a slash (/), which represents the root directory. The
slash is followed by the name of a file located in the root directory. An absolute pathname can continue, tracing a path through all intermediate directories, to the file
identified by the pathname. String all the filenames in the path together, following
each directory with a slash (/). This string of filenames is called an absolute pathname
because it locates a file absolutely by tracing a path from the root directory to the file.
Typically the absolute pathname of a directory does not include the trailing slash,
although that format can be used to emphasize that the pathname specifies a directory (e.g., /home/zach/). The part of a pathname following the final slash is called a
simple filename, filename, or basename. Figure 4-5 shows the absolute pathnames of
directories and ordinary files in part of a filesystem hierarchy.
Using an absolute pathname, you can list or otherwise work with any file on the local
system, assuming you have permission to do so, regardless of the working directory
at the time you give the command. For example, Sam can give the following command while working in his home directory to list the files in the /etc/ssh directory:
$ pwd
/home/sam
$ ls /etc/ssh
moduli
ssh_host_dsa_key
ssh_config
ssh_host_dsa_key.pub
sshd_config ssh_host_key

ssh_host_key.pub
ssh_host_rsa_key
ssh_host_rsa_key.pub

Pathnames 89

~ (Tilde) in Pathnames
In another form of absolute pathname, the shell expands the characters ~/ (a tilde followed by a slash) at the start of a pathname into the pathname of your home
directory. Using this shortcut, you can display your .bashrc startup file (page 279) by
using the following command no matter which directory is the working directory:
$ less ~/.bashrc

A tilde quickly references paths that start with your or someone else’s home directory.
The shell expands a tilde followed by a username at the beginning of a pathname into
the pathname of that user’s home directory. For example, assuming he has permission
to do so, Max can examine Sam’s .bashrc file by using the following command:
$ less ~sam/.bashrc

Refer to “Tilde Expansion” on page 358 for more information.

Relative Pathnames
A relative pathname traces a path from the working directory to a file. The pathname
is relative to the working directory. Any pathname that does not begin with the root
directory (represented by /) or a tilde (~) is a relative pathname. Like absolute pathnames, relative pathnames can trace a path through many directories. The simplest
relative pathname is a simple filename, which identifies a file in the working directory. The examples in the next sections use absolute and relative pathnames.

Significance of the Working Directory
To access any file in the working directory, you need only a simple filename. To access
a file in another directory, you must use a pathname. Typing a long pathname is
tedious and increases the chance of making a mistake. This possibility is less likely
under a GUI, where you click filenames or icons. You can choose a working directory
for any particular task to reduce the need for long pathnames. Your choice of a working directory does not allow you to do anything you could not do otherwise— it just
makes some operations easier.

When using a relative pathname, know which directory is the working directory
caution The location of the file you are accessing with a relative pathname is dependent on (is relative to)
the working directory. Always make sure you know which directory is the working directory before
you use a relative pathname. Use pwd to verify the name of the directory. If you are creating a file
using vim and you are not where you think you are in the file hierarchy, the new file will end up in
an unexpected location.
It does not matter which directory is the working directory when you use an absolute pathname.
Thus, the following command always edits a file named goals in your home directory:
$ vim ~/goals

90 Chapter 4 The Filesystem
.. (parent of the
working directory)

/

home

tmp

etc
. (working directory)

max

zach

hls
notes

../zach
bin

notes
bin/log

report

log

Figure 4-6 Relative pathnames

Refer to Figure 4-6 as you read this paragraph. Files that are children of the working
directory can be referenced by simple filenames. Grandchildren of the working directory can be referenced by short relative pathnames: two filenames separated by a
slash. When you manipulate files in a large directory structure, using short relative
pathnames can save you time and aggravation. If you choose a working directory that
contains the files used most often for a particular task, you need use fewer long, cumbersome pathnames.

Working with Directories
This section discusses how to create directories (mkdir), switch between directories
(cd), remove directories (rmdir), use pathnames to make your work easier, and move
and copy files and directories between directories. It concludes with brief descriptions
of important standard directories and files in the Linux filesystem.

mkdir: Creates a Directory
The mkdir utility creates a directory. The argument (page 1055) to mkdir is the pathname of the new directory. The following examples develop the directory structure
shown in Figure 4-7. In the figure, the directories that are added appear lighter than
the others and are connected by dashes.
In Figure 4-8, pwd shows that Max is working in his home directory (/home/max), and
ls shows the names of the files in his home directory: demo, names, and temp. Using mkdir,
Max creates a directory named literature as a child of his home directory. He uses a relative pathname (a simple filename) because he wants the literature directory to be a child
of the working directory. Max could have used an absolute pathname to create the same
directory: mkdir /home/max/literature, mkdir ~max/literature, or mkdir ~/literature.

Working with Directories

91

/

home

max

names

temp

literature

demo

promo

Figure 4-7

The file structure developed in the examples

The second ls in Figure 4-8 verifies the presence of the new directory. The –F
option to ls displays a slash after the name of each directory and an asterisk after
each executable file (shell script, utility, or application). When you call it with an
argument that is the name of a directory, ls lists the contents of that directory. The
final ls displays nothing because there are no files in the literature directory.
The following commands show two ways for Max to create the promo directory as
a child of the newly created literature directory. The first way checks that /home/max
is the working directory and uses a relative pathname:
$ pwd
/home/max
$ mkdir literature/promo

The second way uses an absolute pathname:
$ mkdir /home/max/literature/promo

$ pwd
/home/max
$ ls
demo names temp
$ mkdir literature
$ ls
demo literature names temp
$ ls -F
demo literature/ names temp
$ ls literature
$

Figure 4-8

The mkdir utility

92 Chapter 4 The Filesystem

$ cd /home/max/literature
$ pwd
/home/max/literature
$ cd
$ pwd
/home/max
$ cd literature
$ pwd
/home/max/literature

Figure 4-9

cd changes the working directory

Use the –p (parents) option to mkdir to create both the literature and promo directories
using one command:
$ pwd
/home/max
$ ls
demo names temp
$ mkdir -p literature/promo

or
$ mkdir -p /home/max/literature/promo

cd: Changes to Another Working Directory
Using cd (change directory) makes another directory the working directory; it does
not change the contents of the working directory. Figure 4-9 shows two ways to make
the /home/max/literature directory the working directory, as verified by pwd. First
Max uses cd with an absolute pathname to make literature his working directory—it
does not matter which is the working directory when you give a command with an
absolute pathname.
A pwd command confirms the change Max made. When used without an argument,
cd makes your home directory the working directory, as it was when you logged in.
The second cd command in Figure 4-9 does not have an argument, so it makes Max’s
home directory the working directory. Finally, knowing that he is working in his home
directory, Max uses a simple filename to make the literature directory his working
directory (cd literature) and confirms the change using pwd.

The . and .. Directory Entries
The mkdir utility automatically puts two entries in each directory it creates: a single
period (.) and a double period (..). The . is synonymous with the pathname of the
working directory and can be used in its place; the .. is synonymous with the pathname of the parent of the working directory. These entries are hidden because their
filenames begin with a period.
With the literature directory as the working directory, the following example uses ..
three times: first to list the contents of the parent directory (/home/max), second to

Working with Directories

93

copy the memoA file to the parent directory, and third to list the contents of the parent directory again.
$ pwd
/home/max/literature
$ ls ..
demo literature names
$ cp memoA ..
$ ls ..
demo literature memoA

temp

names

temp

After using cd to make promo (a subdirectory of literature) his working directory,
Max can use a relative pathname to call vim to edit a file in his home directory.
$ cd promo
$ vim ../../names

You can use an absolute or relative pathname or a simple filename virtually anywhere
a utility or program requires a filename or pathname. This usage holds true for ls,
vim, mkdir, rm, and most other utilities.

The working directory versus your home directory
tip The working directory is not the same as your home directory. Your home directory remains the
same for the duration of your session and usually from session to session. Immediately after you
log in, you are always working in the same directory: your home directory.
Unlike your home directory, the working directory can change as often as you like. You have no
set working directory, which explains why some people refer to it as the current directory. When
you log in and until you change directories using cd, your home directory is the working directory.
If you were to change directories to Sam’s home directory, then Sam’s home directory would be
the working directory.

rmdir: Deletes a Directory
The rmdir (remove directory) utility deletes a directory. You cannot delete the
working directory or a directory that contains files other than the . and .. entries.
If you need to delete a directory that has files in it, first use rm to delete the files
and then delete the directory. You do not have to (nor can you) delete the . and ..
entries; rmdir removes them automatically. The following command deletes the
promo directory:
$ rmdir /home/max/literature/promo

The rm utility has a –r option (rm –r filename) that recursively deletes files, including
directories, within a directory and also deletes the directory itself.

Use rm –r carefully, if at all
caution Although rm –r is a handy command, you must use it carefully. Do not use it with an ambiguous
file reference such as *. It is frighteningly easy to wipe out the contents of your entire home directory with a single short command.

94 Chapter 4 The Filesystem

Using Pathnames
touch

Use a text editor to create a file named letter if you want to experiment with the
examples that follow. Alternatively you can use touch (page 985) to create an empty
file:
$ cd
$ pwd
/home/max
$ touch letter

With /home/max as the working directory, the following example uses cp with a relative pathname to copy the file named letter to the /home/max/literature/promo
directory. (You will need to create promo again if you deleted it earlier.) The copy of
the file has the simple filename letter.0210:
$ cp letter literature/promo/letter.0210

If Max does not change to another directory, he can use vim as shown to edit the copy
of the file he just made:
$ vim literature/promo/letter.0210

If Max does not want to use a long pathname to specify the file, he can use cd to make
promo the working directory before using vim:
$ cd literature/promo
$ pwd
/home/max/literature/promo
$ vim letter.0210

To make the parent of the working directory the new working directory, Max can
give the following command, which takes advantage of the .. directory entry:
$ cd ..
$ pwd
/home/max/literature

mv, cp: Move or Copy Files
Chapter 3 discussed the use of mv to rename files. However, mv works even more generally: You can use this utility to move files from one directory to another (change
the pathname of a file) as well as to change a simple filename. When used to move
one or more files to a new directory, the mv command has this syntax:
mv existing-file-list directory
If the working directory is /home/max, Max can use the following command to move
the files names and temp from the working directory to the literature directory:
$ mv names temp literature

This command changes the absolute pathnames of the names and temp files from
/home/max/names and /home/max/temp to /home/max/literature/names and

Working with Directories

95

/

home

zach

max

sam

names

temp

literature

names

Figure 4-10

temp

Using mv to move names and temp

/home/max/literature/temp, respectively (Figure 4-10). Like most utilities, mv
accepts either absolute or relative pathnames.
As you create more files, you will need to create new directories using mkdir to keep
the files organized. The mv utility is a useful tool for moving files from one directory
to another as you extend your directory hierarchy.
The cp utility works in the same way mv does, except it makes copies of the existingfile-list in the specified directory.

mv: Moves a Directory
Just as it moves ordinary files from one directory to another, so mv can move directories. The syntax is similar except you specify one or more directories, not ordinary
files, to move:
mv existing-directory-list new-directory
If new-directory does not exist, the existing-directory-list must contain just one directory name, which mv changes to new-directory (mv renames the directory). Although
you can rename directories using mv, you cannot copy their contents with cp unless
you use the –r (recursive) option. Refer to the explanations of cpio (page 758), pax
(page 907), and tar (page 968) for other ways to copy and move directories.

Important Standard Directories and Files
Originally files on a Linux system were not located in standard places within the directory hierarchy. The scattered files made it difficult to document and maintain a Linux
system and just about impossible for someone to release a software package that
would compile and run on all Linux systems. The first standard for the Linux filesystem, the FSSTND (Linux Filesystem Standard), was released early in 1994. In early
1995 work was started on a broader standard covering many UNIX-like systems: FHS

96 Chapter 4 The Filesystem

/
bin

sbin

mail

Figure 4-11

var

spool

dev

usr

bin

etc

sbin

tmp

max

home

zach

root

hls

A typical FHS-based Linux filesystem structure

(Linux Filesystem Hierarchy Standard; proton.pathname.com/fhs). More recently
FHS has been incorporated in LSB (Linux Standard Base; www.linuxfoundation.org/collaborate/workgroups/lsb), a workgroup of FSG (Free Standards Group).
Finally, FSG combined with Open Source Development Labs (OSDL) to form the
Linux Foundation (www.linuxfoundation.org). Figure 4-11 shows the locations of
some important directories and files as specified by FHS. The significance of many of
these directories will become clear as you continue reading.
The following list describes the directories shown in Figure 4-11, some of the directories specified by FHS, and some other directories. Most distributions do not use all
the directories specified by FHS. Be aware that you cannot always determine the
function of a directory by its name. For example, although /opt stores add-on software, /etc/opt stores configuration files for the software in /opt.
/ Root The root directory, present in all Linux filesystem structures, is the ancestor of

all files in the filesystem. It does not have a name and is represented by a slash (/)
standing alone or at the left end of a pathname.
/bin Essential command binaries Holds the files needed to bring the system up and run it

when it first comes up in single-user/recovery mode.
/boot Static files of the boot loader Contains all the files needed to boot the system.
/dev Device files Contains all the files that represent peripheral devices, such as disk drives,

terminals, and printers. Previously this directory was filled with all possible devices.
The udev utility provides a dynamic device directory that enables /dev to contain only
devices that are present on the system.
/etc Machine-local system configuration files Holds administrative, configuration, and

other system files. Mac OS X uses Open Directory (page 1042) in place of /etc/passwd.
/etc/opt Configuration files for add-on software packages kept in /opt
/etc/X11 Machine-local configuration files for the X Window System
/home User home directories Each user’s home directory is typically one of many sub-

directories of the /home directory. As an example, assuming that users’ directories are
under /home, the absolute pathname of Zach’s home directory is /home/zach. Under
Mac OS X, user home directories are typically located in /Users.
/lib Shared libraries
/lib/modules Loadable kernel modules
/mnt Mount point for temporarily mounting filesystems
/opt Add-on (optional) software packages

Working with Directories

97

/proc Kernel and process information virtual filesystem
/root Home directory for the root account
/run Runtime data A tmpfs filesystem (mounted, but stored in RAM) that holds startup

files previously hidden in /dev and other directories. For more information see
lists.fedoraproject.org/pipermail/devel/2011-March/150031.html.
/sbin Essential system binaries Utilities used for system administration are stored in /sbin

and /usr/sbin. The /sbin directory includes utilities needed during the booting process, and /usr/sbin holds utilities used after the system is up and running.
/sys Device pseudofilesystem
/tmp Temporary files
/Users User home directories Under Mac OS X, each user’s home directory is typically one

of many subdirectories of the /Users directory. Linux typically stores home directories
in /home.
/usr Second major hierarchy Traditionally includes subdirectories that contain informa-

tion used by the system. Files in /usr subdirectories do not change often and can be
shared by several systems.
/usr/bin Most user commands Contains the standard Linux utility programs—that is, binaries

that are not needed in single-user/recovery mode.
/usr/games Games and educational programs
/usr/include Header files included by C programs
/usr/lib Libraries
/usr/local Local hierarchy Holds locally important files and directories that are added to the

system. Subdirectories can include bin, games, include, lib, sbin, share, and src.
/usr/sbin Nonvital system administration binaries See /sbin.
/usr/share Architecture-independent data Subdirectories can include dict, doc, games, info,

locale, man, misc, terminfo, and zoneinfo.
/usr/share/doc Documentation
/usr/share/info GNU info system’s primary directory
/usr/share/man Online manuals
/usr/src Source code
/var Variable data Files with contents that vary as the system runs are kept in sub-

directories under /var. The most common examples are temporary files, system log
files, spooled files, and user mailbox files. Subdirectories can include cache, lib, lock,
log, mail, opt, run, spool, tmp, and yp.
/var/log Log files Contains lastlog (a record of the last login by each user), messages (system

messages from syslogd), and wtmp (a record of all logins/logouts), among other
log files.
/var/spool Spooled application data Contains anacron, at, cron, lpd, mail, mqueue, samba, and

other directories. The file /var/mail is typically a link to /var/spool/mail.

-rwxr-xr-x+
Figure 4-12

1 sam pubs

e
am
en
Fi l

Da
t
of e (an
mo d
d i f ti m
i c a e)
ti o
n

S iz
e

r

p
Gr
ou

Ow
ne

Ty
p

eo

f fi
le
Fi l
ea
pe c c
r m es
iss s
i on
s
AC
Lf
la g
Lin
ks

98 Chapter 4 The Filesystem

2048 06-10 10:44 memo

The columns displayed by the ls –l command

Access Permissions
Most distributions support two methods of controlling who can access a file and how
they can access it: traditional access permissions and ACLs (Access Control Lists).
This section describes traditional access permissions. See page 104 for a discussion
of ACLs, which provide finer-grained control of access permissions than do traditional access permissions.
Three types of users can access a file: the owner of the file (owner), a member of a
group that the file is associated with (group), and everyone else (other). A user can
attempt to access an ordinary file in three ways: by trying to read from, write to, or
execute it.

ls –l: Displays Permissions
When you call ls with the –l option and the name of one or more ordinary files, ls
displays a line of information about the file(s). See “ls output” on page 26 for information about the format of the display this book uses. The following example
displays information for two files. The file letter.0210 contains the text of a letter, and
check_spell contains a shell script, a program written in a high-level shell programming language:
$ ls -l check_spell letter.0210
-rwxr-xr-x. 1 sam pubs 766 03-21 14:02 check_spell
-rw-r--r--. 1 sam pubs 6193 02-10 14:22 letter.0210

From left to right, the lines that an ls –l command displays contain the following
information (refer to Figure 4-12):
• The type of file (first character)
• The file’s access permissions (the next nine characters)
• The ACL flag (present if the file has an ACL, page 104)
• The number of links to the file (page 109)
• The name of the owner of the file (usually the person who created the file)

Access Permissions

99

• The name of the group the file is associated with
• The size of the file in characters (bytes)
• The date and time the file was created or last modified
• The name of the file
The type of file (first column) for letter.0210 is a hyphen (–) because it is an ordinary
file (directory files have a d in this column; see Table VI-21 on page 864).
The next three characters specify the access permissions for the owner of the file: r
indicates read permission, w indicates write permission, and x indicates execute permission. A – in a column indicates that the owner does not have the permission that
could have appeared in that position.
In a similar manner the next three characters represent permissions for the group, and
the final three characters represent permissions for other (everyone else). In the preceding example, Sam, the owner of letter.0210, can read from and write to the file,
whereas the group and others can only read from the file, and no one is allowed to
execute it. Although execute permission can be allowed for any file, it does not make
sense to assign execute permission to a file that contains a document such as a letter.
The check_spell file is an executable shell script, so execute permission is appropriate
for it. (The owner, group, and others have execute permission.) For more information
refer to “Discussion” on page 864.

chmod: Changes Access Permissions
The Linux file access permission scheme lets you give other users access to the files
you want to share yet keep your private files confidential. You can allow other users
to read from and write to a file (handy if you are one of several people working on
a joint project). You can allow others only to read from a file (perhaps a project specification you are proposing). Or you can allow others only to write to a file (similar
to an inbox or mailbox, where you want others to be able to send you mail but do
not want them to read your mail). Similarly you can protect entire directories from
being scanned (covered shortly).

A user with root privileges can access any file on the system
security There is an exception to the access permissions described in this section. Anyone who can gain root
privileges has full access to all files, regardless of the file’s owner or access permissions. Of course,
if the file is encrypted, read access does not mean the person reading the file can understand what
is in the file.

The owner of a file controls which users have permission to access the file and how
those users can access it. When you own a file, you can use the chmod (change mode)
utility to change access permissions for that file. You can specify symbolic (relative)
or numeric (absolute) arguments to chmod.

100 Chapter 4 The Filesystem

Symbolic Arguments to chmod
The following example, which uses symbolic arguments to chmod, adds (+) read and
write permissions (rw) for all (a) users:
$ ls -l letter.0210
-rw-r-----. 1 sam pubs 6193 02-10 14:22 letter.0210
$ chmod a+rw letter.0210
$ ls -l letter.0210
-rw-rw-rw-. 1 sam pubs 6193 02-10 14:22 letter.0210

You must have read permission to execute a shell script
tip Because a shell needs to read a shell script (a text file containing shell commands) before it can
execute the commands within that script, you must have read permission for the file containing
the script to execute it. You also need execute permission to execute a shell script directly from
the command line. In contrast, binary (program) files do not need to be read; they are executed
directly. You need only execute permission to run a binary program.

Using symbolic arguments with chmod modifies existing permissions; the change a
given argument makes depends on (is relative to) the existing permissions. In the next
example, chmod removes (–) read (r) and execute (x) permissions for other (o) users.
The owner and group permissions are not affected.
$ ls -l check_spell
-rwxr-xr-x. 1 sam pubs 766 03-21 14:02 check_spell
$ chmod o-rx check_spell
$ ls -l check_spell
-rwxr-x---. 1 sam pubs 766 03-21 14:02 check_spell

In addition to a (all) and o (other), you can use g (group) and u (user, although user
refers to the owner of the file who might or might not be the user of the file at any
given time) in the argument to chmod. For example, chmod a+x adds execute permission for all users (other, group, and owner), and chmod go–rwx removes all
permissions for all but the owner of the file.

chmod: o for other, u for owner
tip When using chmod, many people assume that the o stands for owner; it does not. The o stands
for other, whereas u stands for owner (user ). The acronym UGO (user-group-other) might help
you remember how permissions are named.

Numeric Arguments to chmod
You can also use numeric arguments to specify permissions with chmod. In place of
the letters and symbols specifying permissions used in the previous examples,
numeric arguments comprise three octal digits. (A fourth, leading digit controls
setuid and setgid permissions and is discussed next.) The first digit specifies permissions for the owner, the second for the group, and the third for other users. A 1 gives

Access Permissions 101

the specified user(s) execute permission, a 2 gives write permission, and a 4 gives read
permission. Construct the digit representing the permissions for the owner, group, or
others by ORing (adding) the appropriate values as shown in the following examples.
Using numeric arguments sets file permissions absolutely; it does not modify existing
permissions as symbolic arguments do.
In the following example, chmod changes permissions so only the owner of the file
can read from and write to the file, regardless of how permissions were previously
set. The 6 in the first position gives the owner read (4) and write (2) permissions. The
0s remove all permissions for the group and other users.
$ chmod 600 letter.0210
$ ls -l letter.0210
-rw-------. 1 sam pubs 6193 02-10 14:22 letter.0210

Next, 7 (4 + 2 + 1) gives the owner read, write, and execute permissions. The 5 (4 + 1)
gives the group and other users read and execute permissions:
$ chmod 755 check_spell
$ ls -l check_spell
-rwxr-xr-x. 1 sam pubs 766 03-21 14:02 check_spell

Refer to Table 4-2 for more examples of numeric permissions.

Table 4-2

Examples of numeric permission specifications

Mode

Meaning

777

Owner, group, and others can read, write, and execute file

755

Owner can read, write, and execute file; group and others can read and
execute file

711

Owner can read, write, and execute file; group and others can execute file

644

Owner can read and write file; group and others can read file

640

Owner can read and write file; group can read file; others cannot access file

Refer to page 285 for more information on using chmod to make a file executable and
to page 741 for more information on absolute arguments and chmod in general.

Setuid and Setgid Permissions
When you execute a file that has setuid (set user ID) permission, the process executing the file takes on the privileges of the file’s owner. For example, if you run a
setuid program that removes all files in a directory, you can remove files in any of
the file owner’s directories, even if you do not normally have permission to do so.
In a similar manner, setgid (set group ID) permission gives the process executing the
file the privileges of the group the file is associated with.

102 Chapter 4 The Filesystem

Minimize use of setuid and setgid programs owned by root
security Executable files that are setuid and owned by root have root privileges when they run, even if they
are not run by root. This type of program is very powerful because it can do anything that root can
do (and that the program is designed to do). Similarly, executable files that are setgid and belong
to the group root have extensive privileges.
Because of the power they hold and their potential for destruction, it is wise to avoid indiscriminately creating and using setuid programs owned by root and setgid programs belonging to the
group root. Because of their inherent dangers, many sites minimize the use of these programs on
their systems. One necessary setuid program is passwd.

The following example shows a user working with root privileges and using symbolic
arguments to chmod to give one program setuid privileges and another program setgid privileges. The ls –l output (page 98) shows setuid permission by displaying an s
in the owner’s executable position and setgid permission by displaying an s in the
group’s executable position:
# ls -l myprog*
-rwxr-xr-x. 1 root pubs 362804 03-21 15:38 myprog1
-rwxr-xr-x. 1 root pubs 189960 03-21 15:38 myprog2
# chmod u+s myprog1
# chmod g+s myprog2
# ls -l myprog*
-rwsr-xr-x. 1 root pubs 362804 03-21 15:38 myprog1
-rwxr-sr-x. 1 root pubs 189960 03-21 15:38 myprog2

The next example uses numeric arguments to chmod to make the same changes. When
you use four digits to specify permissions, setting the first digit to 1 sets the sticky bit
(page 1095), setting it to 2 specifies setgid permissions, and setting it to 4 specifies
setuid permissions:
# ls -l myprog*
-rwxr-xr-x. 1 root pubs 362804 03-21 15:38 myprog1
-rwxr-xr-x. 1 root pubs 189960 03-21 15:38 myprog2
# chmod 4755 myprog1
# chmod 2755 myprog2
# ls -l myprog*
-rwsr-xr-x. 1 root pubs 362804 03-21 15:38 myprog1
-rwxr-sr-x. 1 root pubs 189960 03-21 15:38 myprog2

Do not give shell scripts setuid/setgid permission
security Never give shell scripts setuid or setgid permission. Several techniques for subverting files with
these permissions are well known.

Access Permissions 103

Directory Access Permissions
Access permissions have slightly different meanings when they are applied to directories. Although the three types of users can read from or write to a directory, the
directory cannot be executed. Execute permission is redefined for a directory: It means
that you can cd into the directory and/or examine files that you have permission to
read from in the directory. It has nothing to do with executing a file.
When you have only execute permission for a directory, you can use ls to list a file
in the directory if you know its name. You cannot use ls to list the contents of the
directory. In the following exchange, Zach first verifies that he is logged in as himself. He then checks the permissions on Max’s info directory. You can view the access
permissions associated with a directory by running ls with the –d (directory) and –l
(long) options:
$ who am i
zach
pts/7
Aug 21 10:02
$ ls -ld /home/max/info
drwx-----x. 2 max pubs 4096 08-21 09:31 /home/max/info
$ ls -l /home/max/info
ls: /home/max/info: Permission denied

The d at the left end of the line that ls displays indicates /home/max/info is a directory. Max has read, write, and execute permissions; members of the pubs group have
no access permissions; and other users have execute permission only, indicated by the
x at the right end of the permissions. Because Zach does not have read permission
for the directory, the ls –l command returns an error.
When Zach specifies the names of the files he wants information about, he is not reading new directory information but rather searching for specific information, which he
is allowed to do with execute access to the directory. He has read permission for notes
so he has no problem using cat to display the file. He cannot display financial because
he does not have read permission for it:
$ ls -l /home/max/info/financial /home/max/info/notes
-rw-------. 1 max pubs 34 08-21 09:31 /home/max/info/financial
-rw-r--r--. 1 max pubs 30 08-21 09:32 /home/max/info/notes
$ cat /home/max/info/notes
This is the file named notes.
$ cat /home/max/info/financial
cat: /home/max/info/financial: Permission denied

Next Max gives others read access to his info directory:
$ chmod o+r /home/max/info

When Zach checks his access permissions on info, he finds he has both read and execute
access to the directory. Now ls –l displays the contents of the info directory, but he still
cannot read financial. (This restriction is an issue of file permissions, not directory permissions.) Finally, Zach tries to create a file named newfile using touch (page 985). If

104 Chapter 4 The Filesystem

Max were to give him write permission to the info directory, Zach would be able to create new files in it:
$ ls -ld /home/max/info
drwx---r-x. 2 max pubs 4096 08-21 09:31 /home/max/info
$ ls -l /home/max/info
-rw-------. 1 max pubs 34 08-21 09:31 financial
-rw-r--r--. 1 max pubs 30 08-21 09:32 notes
$ cat /home/max/info/financial
cat: financial: Permission denied
$ touch /home/max/info/newfile
touch: cannot touch '/home/max/info/newfile': Permission denied

ACLs: Access Control Lists
ACLs (Access Control Lists) provide finer-grained control over which users can
access specific directories and files than do traditional permissions (page 98). Using
ACLs you can specify the ways in which each of several users can access a directory
or file. Because ACLs can reduce performance, do not enable them on filesystems
that hold system files, where the traditional Linux permissions are sufficient. Also,
be careful when moving, copying, or archiving files: Not all utilities preserve ACLs.
In addition, you cannot copy ACLs to filesystems that do not support ACLs.
An ACL comprises a set of rules. A rule specifies how a specific user or group can
access the file that the ACL is associated with. There are two kinds of rules: access
rules and default rules. (The documentation refers to access ACLs and default ACLs,
even though there is only one type of ACL: There is one type of list [ACL] and there
are two types of rules an ACL can contain.)
An access rule specifies access information for a single file or directory. A default ACL
pertains to a directory only; it specifies default access information (an ACL) for any
file in the directory that is not given an explicit ACL.

Most utilities do not preserve ACLs
caution When used with the –p (preserve) or –a (archive) option, cp preserves ACLs when it copies files.
The mv utility also preserves ACLs. When you use cp with the –p or –a option and it is not able
to copy ACLs, and in the case where mv is unable to preserve ACLs, the utility performs the operation and issues an error message:
$ mv report /tmp
mv: preserving permissions for '/tmp/report': Operation not supported

Other utilities, such as tar, cpio, and dump, do not support ACLs. You can use cp with the –a
option to copy directory hierarchies, including ACLs.
You can never copy ACLs to a filesystem that does not support ACLs or to a filesystem that does
not have ACL support turned on.

ACLs: Access Control Lists 105

Enabling ACLs
The following explanation of how to enable ACLs pertains to Linux. See page 1047
if you are running Mac OS X.
The acl package must be installed before you can use ACLs. Most Linux distributions
officially support ACLs on ext2, ext3, and ext4 filesystems only, although informal
support for ACLs is available on other filesystems. To use ACLs on an ext2/ext3/ext4
filesystem, you must mount the device with the acl option (no_acl is the default). For
example, if you want to mount the device represented by /home so you can use ACLs
on files in /home, you can add acl to its options list in /etc/fstab:
$ grep home /etc/fstab
LABEL=/home
remount option

/home

ext4

defaults,acl

1 2

After changing fstab, you need to remount /home before you can use ACLs. If no one
else is using the system, you can unmount it and mount it again (working with root
privileges) as long as the working directory is not in the /home hierarchy. Alternatively
you can use the remount option to mount to remount /home while the device is in use:
# mount -v -o remount /home
/dev/sda3 on /home type ext4 (rw,acl)

Working with Access Rules
The setfacl utility modifies a file’s ACL and getfacl displays a file’s ACL. These utilities
are available under Linux only. If you are running OS X you must use chmod as
explained on page 1047. When you use getfacl to obtain information about a file that
does not have an ACL, it displays some of the same information as an ls –l command,
albeit in a different format:
$ ls -l report
-rw-r--r--. 1 max pubs 9537 01-12 23:17 report
$ getfacl report
# file: report
# owner: max
# group: pubs
user::rwgroup::r-other::r--

The first three lines of the getfacl output comprise the header; they specify the name
of the file, the owner of the file, and the group the file is associated with. For more
information refer to “ls –l: Displays Permissions” on page 98. The ––omit-header (or
just ––omit) option causes getfacl not to display the header:
$ getfacl --omit-header report
user::rwgroup::r-other::r--

In the line that starts with user, the two colons (::) with no name between them indicate that the line specifies the permissions for the owner of the file. Similarly, the two

106 Chapter 4 The Filesystem

adjacent colons in the group line indicate the line specifies permissions for the group
the file is associated with. The two colons following other are for consistency: No
name can be associated with other.
The setfacl ––modify (or –m) option adds or modifies one or more rules in a file’s ACL
using the following syntax:
setfacl ––modify ugo:name:permissions file-list
where ugo can be either u, g, or o to indicate that the command sets file permissions
for a user, a group, or all other users, respectively; name is the name of the user or
group that permissions are being set for; permissions is the permissions in either symbolic or absolute format; and file-list is the list of files the permissions are to be
applied to. You must omit name when you specify permissions for other users (o).
Symbolic permissions use letters to represent file permissions (rwx, r–x, and so on),
whereas absolute permissions use an octal number. While chmod uses three sets of
permissions or three octal numbers (one each for the owner, group, and other users),
setfacl uses a single set of permissions or a single octal number to represent the permissions being granted to the user or group represented by ugo and name. See the
discussion of chmod on pages 99 and 741 for more information about symbolic and
absolute representations of file permissions.
For example, both of the following commands add a rule to the ACL for the report
file that gives Sam read and write permission to that file:
$ setfacl --modify u:sam:rw- report

or
$ setfacl --modify u:sam:6 report
$ getfacl report
# file: report
# owner: max
# group: pubs
user::rwuser:sam:rwgroup::r-mask::rwother::r--

The line containing user:sam:rw– shows that the user named sam has read and write
access (rw–) to the file. See page 98 for an explanation of how to read access permissions.
See the following optional section for a description of the line that starts with mask.
When a file has an ACL, ls –l displays a plus sign (+) following the permissions, even
if the ACL is empty:
$ ls -l report
-rw-rw-r--+ 1 max pubs 9537 01-12 23:17 report

optional

Effective Rights Mask
The line in the output of getfacl that starts with mask specifies the effective rights
mask. This mask limits the effective permissions granted to ACL groups and users.

ACLs: Access Control Lists 107

It does not affect the owner of the file or the group the file is associated with. In other
words, it does not affect traditional Linux permissions. However, because setfacl
always sets the effective rights mask to the least restrictive ACL permissions for the
file, the mask has no effect unless you set it explicitly after you set up an ACL for the
file. You can set the mask by specifying mask in place of ugo and by not specifying
a name in a setfacl command.
The following example sets the effective rights mask to read for the report file:
$ setfacl -m mask::r-- report

The mask line in the following getfacl output shows the effective rights mask set to
read (r––). The line that displays Sam’s file access permissions shows them still set to
read and write. However, the comment at the right end of the line shows that his
effective permission is read.
$ getfacl report
# file: report
# owner: max
# group: pubs
user::rwuser:sam:rwgroup::r-mask::r-other::r--

#effective:r--

As the next example shows, setfacl can modify ACL rules and can set more than one
ACL rule at a time:
$ setfacl -m u:sam:r--,u:zach:rw- report
$ getfacl --omit-header report
user::rwuser:sam:r-user:zach:rwgroup::r-mask::rwother::r--

The –x option removes ACL rules for a user or a group. It has no effect on permissions for the owner of the file or the group that the file is associated with. The next
example shows setfacl removing the rule that gives Sam permission to access the file:
$ setfacl -x u:sam report
$ getfacl --omit-header report
user::rwuser:zach:rwgroup::r-mask::rwother::r--

You must not specify permissions when you use the –x option. Instead, specify only
the ugo and name. The –b option, followed by a filename only, removes all ACL rules
and the ACL itself from the file or directory you specify.

108 Chapter 4 The Filesystem

Both setfacl and getfacl have many options. Use the ––help option to display brief lists
of options or refer to the man pages for details.

Setting Default Rules for a Directory
The following example shows that the dir directory initially has no ACL. The setfacl
command uses the –d (default) option to add two default rules to the ACL for dir.
These rules apply to all files in the dir directory that do not have explicit ACLs. The
rules give members of the pubs group read and execute permissions and give members of the adm group read, write, and execute permissions.
$ ls -ld dir
drwx------. 2 max pubs 4096 02-12 23:15 dir
$ getfacl dir
# file: dir
# owner: max
# group: pubs
user::rwx
group::--other::--$ setfacl -d -m g:pubs:r-x,g:adm:rwx dir

The following ls command shows that the dir directory now has an ACL, as indicated
by the + to the right of the permissions. Each of the default rules that getfacl displays
starts with default:. The first two default rules and the last default rule specify the permissions for the owner of the file, the group that the file is associated with, and all
other users. These three rules specify the traditional Linux permissions and take precedence over other ACL rules. The third and fourth rules specify the permissions for
the pubs and adm groups. Next is the default effective rights mask.
$ ls -ld dir
drwx------+ 2 max pubs 4096 02-12 23:15 dir
$ getfacl dir
# file: dir
# owner: max
# group: pubs
user::rwx
group::--other::--default:user::rwx
default:group::--default:group:pubs:r-x
default:group:adm:rwx
default:mask::rwx
default:other::---

Remember that the default rules pertain to files held in the directory that are not
assigned ACLs explicitly. You can also specify access rules for the directory itself.
When you create a file within a directory that has default rules in its ACL, the effective rights mask for that file is created based on the file’s permissions. In some cases
the mask can override default ACL rules.

Links

109

In the next example, touch creates a file named new in the dir directory. The ls command shows this file has an ACL. Based on the value of umask (page 994), both the
owner and the group that the file is associated with have read and write permissions
for the file. The effective rights mask is set to read and write so that the effective permission for pubs is read and the effective permissions for adm are read and write.
Neither group has execute permission.
$ cd dir
$ touch new
$ ls -l new
-rw-rw----+ 1 max pubs 0 02-13 00:39 new
$ getfacl --omit new
user::rwgroup::--group:pubs:r-x
#effective:r-group:adm:rwx
#effective:rwmask::rwother::---

If you change the file’s traditional permissions to read, write, and execute for the
owner and the group, the effective rights mask changes to read, write, and execute,
and the groups specified by the default rules gain execute access to the file.
$ chmod 770 new
$ ls -l new
-rwxrwx---+ 1 max pubs 0 02-13 00:39 new
$ getfacl --omit new
user::rwx
group::--group:pubs:r-x
group:adm:rwx
mask::rwx
other::---

Links
A link is a pointer to a file. Each time you create a file using vim, touch, cp, or by some
other means, you are putting a pointer in a directory. This pointer associates a filename with a place on the disk. When you specify a filename in a command, you are
indirectly pointing to the place on the disk that holds the information you want.
Sharing files can be useful when two or more people are working on the same project
and need to share some information. You can make it easy for other users to access
one of your files by creating additional links to the file.
To share a file with another user, first give the user permission to read from and write
to the file (page 99). You might also have to change the access permissions of the parent directory of the file to give the user read, write, or execute permission (page 103).
When the permissions are appropriately set, the user can create a link to the file so
each of you can access the file from your separate directory hierarchies.

110 Chapter 4 The Filesystem

correspond

personal

memos

business

to_do

to_do

to_do

to_do

personal

memos

business
business

Links
Figure 4-13

Using links to cross-classify files

A link can also be useful to a single user with a large directory hierarchy. You can
create links to cross-classify files in your directory hierarchy, using different classifications for different tasks. For example, if you have the file layout depicted in
Figure 4-2 on page 83, a file named to_do might appear in each subdirectory of the
correspond directory—that is, in personal, memos, and business. If you find it difficult to keep track of everything you need to do, you can create a separate directory
named to_do in the correspond directory. You can then link each subdirectory’s todo list into that directory. For example, you could link the file named to_do in the
memos directory to a file named memos in the to_do directory. This set of links is
shown in Figure 4-13.
Although it might sound complicated, this technique keeps all your to-do lists conveniently in one place. The appropriate list is easily accessible in the task-related directory
when you are busy composing letters, writing memos, or handling personal business.

About the discussion of hard links
tip Two kinds of links exist: hard links and symbolic (soft) links. Hard links are older and becoming
outdated. The section on hard links is marked as optional; you can skip it, although it discusses
inodes and gives you insight into the structure of the filesystem.

optional

Hard Links
A hard link to a file appears as another file. If the file appears in the same directory
as the linked-to file, the links must have different filenames because two files in the
same directory cannot have the same name. You can create a hard link to a file only
from within the filesystem that holds the file.

ln: Creates a Hard Link
The ln (link) utility (without the –s or ––symbolic option) creates a hard link to an
existing file using the following syntax:
ln existing-file new-link

Links

111

/

home

max

zach

memo

planning

/home/max/letter and /home/zach/draft

Figure 4-14

Two links to the same file: /home/max/letter and /home/zach/draft

The next command shows Zach making the link shown in Figure 4-14 by creating a
new link named /home/max/letter to an existing file named draft in Zach’s home
directory:
$ pwd
/home/zach
$ ln draft /home/max/letter

The new link appears in the /home/max directory with the filename letter. In practice,
Max might need to change directory permissions so Zach will be able to create the
link. Even though /home/max/letter appears in Max’s directory, Zach is the owner
of the file because he created it.
The ln utility creates an additional pointer to an existing file, but it does not make
another copy of the file. Because there is only one file, the file status information—such as access permissions, owner, and the time the file was last modified—is
the same for all links; only the filenames differ. When Zach modifies
/home/zach/draft, for example, Max sees the changes in /home/max/letter.

cp Versus ln
The following commands verify that ln does not make an additional copy of a file.
Create a file, use ln to make an additional link to the file, change the contents of the
file through one link, and verify the change through the other link:
$ cat file_a
This is file A.
$ ln file_a file_b
$ cat file_b
This is file A.
$ vim file_b
...
$ cat file_b
This is file B after the change.
$ cat file_a
This is file B after the change.

112 Chapter 4 The Filesystem

If you try the same experiment using cp instead of ln and change a copy of the file,
the difference between the two utilities will become clearer. After you change a copy
of a file, the two files are different:
$ cat file_c
This is file C.
$ cp file_c file_d
$ cat file_d
This is file C.
$ vim file_d
...
$ cat file_d
This is file D after the change.
$ cat file_c
This is file C.
ls and link counts

You can use ls with the –l option, followed by the names of the files you want to compare, to confirm that the status information is the same for two links to the same file
and is different for files that are not linked. In the following example, the 2 in the
links field (just to the left of max) shows there are two links to file_a and file_b (from
the previous example):
$ ls -l file_a file_b file_c file_d
-rw-r--r--. 2 max pubs 33 05-24 10:52
-rw-r--r--. 2 max pubs 33 05-24 10:52
-rw-r--r--. 1 max pubs 16 05-24 10:55
-rw-r--r--. 1 max pubs 33 05-24 10:57

file_a
file_b
file_c
file_d

Although it is easy to guess which files are linked to one another in this example, ls
does not explicitly tell you.
ls and inodes

Use ls with the –i option to determine without a doubt which files are linked. The –i
option lists the inode (page 1074) number for each file. An inode is the control structure for a file. (HFS+, the default filesystem under Mac OS X, does not have inodes
but, through an elaborate scheme, appears to have inodes.) If the two filenames have
the same inode number, they share the same control structure and are links to the
same file. Conversely, when two filenames have different inode numbers, they are different files. The following example shows that file_a and file_b have the same inode
number and that file_c and file_d have different inode numbers:
$ ls -i file_a file_b file_c file_d
3534 file_a 3534 file_b 5800 file_c

7328 file_d

All links to a file are of equal value: The operating system cannot distinguish the order
in which multiple links were created. When a file has two links, you can remove either
one and still access the file through the remaining link. You can remove the link used
to create the file, for example, and as long as one link remains, still access the file
through that link.

Symbolic Links
In addition to hard links, Linux supports symbolic links, also called soft links or
symlinks. A hard link is a pointer to a file (the directory entry points to the inode),

Links

113

whereas a symbolic link is an indirect pointer to a file (the directory entry contains
the pathname of the pointed-to file—a pointer to the hard link to the file).
Dereferencing
symbolic links

To dereference a symbolic link means to follow the link to the target file rather than
work with the link itself. See page 115 for information on dereferencing symbolic links.

Advantages of
symbolic links

Symbolic links were developed because of the limitations inherent in hard links. You
cannot create a hard link to a directory, but you can create a symbolic link to a directory.
In many cases the Linux file hierarchy encompasses several filesystems. Because each
filesystem keeps separate control information (that is, separate inode tables or filesystem structures) for the files it holds, it is not possible to create hard links between
files in different filesystems. A symbolic link can point to any file, regardless of where
it is located in the file structure, but a hard link to a file must be in the same filesystem
as the other hard link(s) to the file. When you create links only among files in your
home directory, you will not notice this limitation.
A major advantage of a symbolic link is that it can point to a nonexistent file. This
ability is useful if you need a link to a file that is periodically removed and re-created.
A hard link keeps pointing to a “removed” file, which the link keeps alive even after
a new file is created. In contrast, a symbolic link always points to the newly created
file and does not interfere when you delete the old file. For example, a symbolic link
could point to a file that gets checked in and out under a source code control system,
a .o file that is re-created by the C compiler each time you run make, or a log file that
is repeatedly archived.
Although they are more general than hard links, symbolic links have some disadvantages. Whereas all hard links to a file have equal status, symbolic links do not have
the same status as hard links. When a file has multiple hard links, it is analogous to
a person having multiple full legal names, as many married women do. In contrast,
symbolic links are analogous to nicknames. Anyone can have one or more nicknames, but these nicknames have a lesser status than legal names. The following
sections describe some of the peculiarities of symbolic links.

ln: Creates Symbolic Links
The ln utility with the ––symbolic (or –s) option creates a symbolic link. The following example creates a symbolic link /tmp/s3 to the file sum in Max’s home directory.
When you use an ls –l command to look at the symbolic link, ls displays the name of
the link and the name of the file it points to. The first character of the listing is l (for
link). The size of a symbolic link is the number of characters in the target pathname.
$ ln --symbolic /home/max/sum /tmp/s3
$ ls -l /home/max/sum /tmp/s3
-rw-rw-r--. 1 max pubs 38 06-12 09:51 /home/max/sum
lrwxrwxrwx. 1 max pubs 13 06-12 09:52 /tmp/s3 -> /home/max/sum
$ cat /tmp/s3
This is sum.

The sizes and times of the last modifications of the two files are different. Unlike a
hard link, a symbolic link to a file does not have the same status information as the
file itself.

114 Chapter 4 The Filesystem

You can also use ln to create a symbolic link to a directory. When you use the
––symbolic option, ln works as expected whether the file you are creating a link to
is an ordinary file or a directory.

Use absolute pathnames with symbolic links
caution Symbolic links are literal and are not aware of directories. A link that points to a relative pathname,
which includes simple filenames, assumes the relative pathname is relative to the directory that
the link was created in (not the directory the link was created from). In the following example, the
link points to the file named sum in the /tmp directory. Because no such file exists, cat gives an
error message:
$ pwd
/home/max
$ ln --symbolic sum /tmp/s4
$ ls -l /home/max/sum /tmp/s4
lrwxrwxrwx. 1 max pubs 3 06-12 10:13 /tmp/s4 -> sum
-rw-rw-r--. 1 max pubs 38 06-12 09:51 /home/max/sum
$ cat /tmp/s4
cat: /tmp/s4: No such file or directory

optional

cd and Symbolic Links
When you use a symbolic link as an argument to cd to change directories, the results
can be confusing, particularly if you did not realize that you were using a symbolic link.
If you use cd to change to a directory that is represented by a symbolic link, the pwd
shell builtin (page 153) lists the name of the symbolic link. The pwd utility (/bin/pwd)
lists the name of the linked-to directory, not the link, regardless of how you got there.
You can also use the pwd builtin with the –P (physical) option to display the linkedto directory. This option displays a pathname that does not contain symbolic links.
$ ln -s /home/max/grades /tmp/grades.old
$ pwd
/home/max
$ cd /tmp/grades.old
$ pwd
/tmp/grades.old
$ /bin/pwd
/home/max/grades
$ pwd -P
/home/max/grades

When you change directories back to the parent, you end up in the directory holding
the symbolic link (unless you use the –P option to cd):
$ cd ..
$ pwd
/tmp
$ /bin/pwd
/tmp

Under Mac OS X, /tmp is a symbolic link to /private/tmp. When you are running OS
X, after you give the cd .. command in the previous example, the working directory
is /private/tmp.

Links

115

rm: Removes a Link
When you create a file, there is one hard link to it. You can then delete the file or,
using more accurate terminology, remove the link using the rm utility. When you
remove the last hard link to a file, you can no longer access the information stored
there, and the operating system releases the space the file occupied on the disk for use
by other files. This space is released even if symbolic links to the file remain. When
there is more than one hard link to a file, you can remove a hard link and still access
the file from any remaining link. Unlike DOS and Windows, Linux does not provide
an easy way to undelete a file once you have removed it. A skilled hacker, however,
can sometimes piece the file together with time and effort.
When you remove all hard links to a file, you will not be able to access the file through
a symbolic link. In the following example, cat reports that the file total does not exist
because it is a symbolic link to a file that has been removed:
$ ls -l sum
-rw-r--r--. 1 max pubs 981 05-24 11:05 sum
$ ln -s sum total
$ rm sum
$ cat total
cat: total: No such file or directory
$ ls -l total
lrwxrwxrwx. 1 max pubs 6 05-24 11:09 total -> sum

When you remove a file, be sure to remove all symbolic links to it. Remove a symbolic
link in the same way you remove other files:
$ rm total

Dereferencing Symbolic Links
A filename points at a file. A symbolic link is a file whose name refers to another file
(a target file) without pointing directly at the target file: It is a reference to the target
file. See page 112 for more information on symbolic links.
To dereference a symbolic link means to follow the link to the target file rather than
work with the link itself. When you dereference a symbolic link, you end up with a
pointer to the file (the filename of the target file). The term no-dereference is a double
negative: It means reference. To no-dereference a symbolic link means to work with
the link itself (do not dereference the symbolic link).
Many utilities have dereference and no-dereference options, usually invoked by the
–L (––dereference) option and the –P (––no-dereference) option, respectively. Some
utilities, such as chgrp, cp, and ls, also have a partial dereference option that is usually
invoked by –H. With a –H option, a utility dereferences files listed on the command
line only, not files found by traversing the hierarchy of a directory listed on the command line.
This section explains the –L (– –dereference) and –H (partial dereference) options
twice, once using ls and then using chgrp. It also covers the chgrp –P (––no-dereference)
option.

116 Chapter 4 The Filesystem

Dereferencing Symbolic Links Using ls
No options

Most utilities default to no-dereference, although many do not have an explicit
no-dereference option. For example, the GNU ls utility, which is used in most
Linux distributions, does not have a –P (––no-dereference) option, although the
BSD ls utility, which is used in OS X, does.
In the following example, ls with the –l option displays information about the files
in the working directory and does not dereference the sam.memo symbolic link; it
displays the symbolic link including the pathname of the file the link points to (the
target file). The first character of the sam.memo line is an l, indicating the line
describes a symbolic link; Max created the symbolic link and owns it.
$ ls -l
-rw-r--r--. 1 max pubs 1129 04-10 15:53 memoD
-rw-r--r--. 1 max pubs 14198 04-10 15:56 memoE
lrwxrwxrwx. 1 max pubs
19 04-10 15:57 sam.memo -> /home/max/sam/memoA

The next command specifies on the command line the file the symbolic link points
to (the target file) and displays information about that file. The file type, permissions,
owner, and time for the file are different from that of the link. Sam created the file
and owns it.
$ ls -l /home/max/sam/memoA
-rw-r--r--. 1 sam sam 2126 04-10 15:54 /home/max/sam/memoA
–L (––dereference)

Next, the –L (––dereference) option to ls displays information about the files in the
working directory and dereferences the sam.memo symbolic link; it displays the file
the link points to (the target file). The first character of the sam.memo line is a –, indicating the line describes a regular file. The command displays the same information
about memoA as the preceding command, except it displays the name of the link
(sam.memo), not that of the target file (memoA).
$ ls -lL
-rw-r--r--. 1 max pubs 1129 04-10 15:53 memoD
-rw-r--r--. 1 max pubs 14198 04-10 15:56 memoE
-rw-r--r--. 1 sam sam
2126 04-10 15:54 sam.memo

–H

When you do not specify a symbolic link as an argument to ls, the –H (partial dereference; this short option has no long version) option displays the same information
as the –l option.
$ ls -lH
-rw-r--r--. 1 max pubs 1129 04-10 15:53 memoD
-rw-r--r--. 1 max pubs 14198 04-10 15:56 memoE
lrwxrwxrwx. 1 max pubs
19 04-10 15:57 sam.memo -> /home/max/sam/memoA

When you specify a symbolic link as an argument to ls, the –H option causes ls to
dereference the symbolic link; it displays information about the file the link points to
(the target file; memoA in the example). As with –L, it refers to the file by the name
of the symbolic link.
$ ls -lH sam.memo
-rw-r--r--. 1 sam sam 2126 04-10 15:54 sam.memo

Links

117

In the next example, the shell expands the * to a list of the names of the files in the
working directory and passes that list to ls. Specifying an ambiguous file reference
that expands to a symbolic link produces the same results as explicitly specifying the
symbolic link (because ls does not know it was called with an ambiguous file reference, it just sees the list of files the shell passes to it).
$ ls -lH *
-rw-r--r--. 1 max pubs 1129 04-10 15:53 memoD
-rw-r--r--. 1 max pubs 14198 04-10 15:56 memoE
-rw-r--r--. 1 sam sam
2126 04-10 15:54 sam.memo

optional
readlink

The readlink utility displays the absolute pathname of a file, dereferencing symbolic
links when needed. With the –f (––canonicalize) option, readlink follows nested symbolic links; all links except the last must exist. Following is an example:

$ ls -l /etc/alternatives/mta-mailq
lrwxrwxrwx. 1 root root 23 01-11 15:35 /etc/alternatives/mta-mailq -> /usr/bin/mailq.sendmail
$ ls -l /usr/bin/mailq.sendmail
lrwxrwxrwx. 1 root root 25 01-11 15:32 /usr/bin/mailq.sendmail -> ../sbin/sendmail.sendmail
$ readlink -f /etc/alternatives/mta-mailq
/usr/sbin/sendmail.sendmail

Dereferencing Symbolic Links Using chgrp
No options

The following examples demonstrate the difference between the –H and –L options,
this time using chgrp. Initially all files and directories in the working directory are
associated with the zach group:
$ ls -lR
.:
-rw-r--r-- 1 zach zach 102 07-02 12:31 bb
drwxr-xr-x 2 zach zach 4096 07-02 15:34 dir1
drwxr-xr-x 2 zach zach 4096 07-02 15:33 dir4
./dir1:
-rw-r--r-- 1 zach zach 102 07-02 12:32 dd
lrwxrwxrwx 1 zach zach
7 07-02 15:33 dir4.link -> ../dir4
./dir4:
-rw-r--r-- 1 zach zach 125 07-02 15:33 gg
-rw-r--r-- 1 zach zach 375 07-02 15:33 hh

–H

When you call chgrp with the –R and –H options (when used with chrgp, –H does not
work without –R), chgrp dereferences only symbolic links you list on the command
line and those in directories you list on the command line. The chgrp utility changes
the group association of the files these links point to. It does not dereference symbolic
links it finds as it descends directory hierarchies, nor does it change symbolic links
themselves. While descending the dir1 hierarchy, chgrp does not change dir4.link, but
it does change dir4, the directory dir4.link points to.

118 Chapter 4 The Filesystem
$ chgrp -RH pubs bb dir1
$ ls -lR
.:
-rw-r--r-- 1 zach pubs 102 07-02 12:31 bb
drwxr-xr-x 2 zach pubs 4096 07-02 15:34 dir1
drwxr-xr-x 2 zach pubs 4096 07-02 15:33 dir4
./dir1:
-rw-r--r-- 1 zach pubs 102 07-02 12:32 dd
lrwxrwxrwx 1 zach zach
7 07-02 15:33 dir4.link -> ../dir4
./dir4:
-rw-r--r-- 1 zach zach 125 07-02 15:33 gg
-rw-r--r-- 1 zach zach 375 07-02 15:33 hh

The –H option under Mac OS X
caution The chgrp –H option works slightly differently under Mac OS X than it does under Linux. Under
OS X, chgrp –RH changes the group of the symbolic link it finds in a directory listed on the command line and does not change the file the link points to. (It does not dereference the symbolic
link.) When you run the preceding example under OS X, the group association of dir4 is not
changed, but the group association of dir4.link is.
If your program depends on how the –H option functions with a utility under OS X, test the option
with that utility to determine exactly how it works.
–L

When you call chgrp with the –R and –L options (when used with chgrp, –L does not
work without –R), chgrp dereferences all symbolic links: those you list on the command line and those it finds as it descends the directory hierarchy. It does not change
the symbolic links themselves. This command changes the files in the directory
dir4.link points to:
$ chgrp -RL pubs bb dir1
$ ls -lR
.:
-rw-r--r-- 1 zach pubs 102 07-02 12:31 bb
drwxr-xr-x 2 zach pubs 4096 07-02 15:34 dir1
drwxr-xr-x 2 zach pubs 4096 07-02 15:33 dir4
./dir1:
-rw-r--r-- 1 zach pubs 102 07-02 12:32 dd
lrwxrwxrwx 1 zach zach
7 07-02 15:33 dir4.link -> ../dir4
./dir4:
-rw-r--r-- 1 zach pubs 125 07-02 15:33 gg
-rw-r--r-- 1 zach pubs 375 07-02 15:33 hh

–P

When you call chgrp with the –R and –P options (when used with chgrp, –P does not
work without –R), chgrp does not dereference symbolic links. It does change the
group of the symbolic link itself.

Chapter Summary

119

$ ls -l bb*
-rw-r--r-- 1 zach zach 102 07-02 12:31 bb
lrwxrwxrwx 1 zach zach
2 07-02 16:02 bb.link -> bb
$ chgrp -PR pubs bb.link
$ ls -l bb*
-rw-r--r-- 1 zach zach 102 07-02 12:31 bb
lrwxrwxrwx 1 zach pubs
2 07-02 16:02 bb.link -> bb

Chapter Summary
Linux has a hierarchical, or treelike, file structure that makes it possible to organize
files so you can find them quickly and easily. The file structure contains directory files
and ordinary files. Directories contain other files, including other directories; ordinary files generally contain text, programs, or images. The ancestor of all files is the
root directory and is represented by / standing alone or at the left end of a pathname.
Most Linux filesystems support 255-character filenames. Nonetheless, it is a good
idea to keep filenames simple and intuitive. Filename extensions can help make filenames more meaningful.
When you are logged in, you are always associated with a working directory. Your
home directory is the working directory from the time you log in until you use cd to
change directories.
An absolute pathname starts with the root directory and contains all the filenames
that trace a path to a given file. The pathname starts with a slash, representing the
root directory, and contains additional slashes following each of the directories in the
path, except for the last directory in the case of a path that points to a directory file.
A relative pathname is similar to an absolute pathname but traces the path starting
from the working directory. A simple filename is the last element of a pathname and
is a form of a relative pathname; it represents a file in the working directory.
A Linux filesystem contains many important directories, including /usr/bin, which
stores most of the Linux utilities, and /dev, which stores device files, many of which
represent physical pieces of hardware. An important standard Linux file is
/etc/passwd; it contains information about users, such as a user’s ID and full name.
Among the attributes associated with each file are access permissions. They determine who can access the file and how the file may be accessed. Three groups of users
can potentially access the file: the owner, the members of a group, and all other users.
An ordinary file can be accessed in three ways: read, write, and execute. The ls utility
with the –l option displays these permissions. For directories, execute access is redefined to mean that the directory can be searched.

120 Chapter 4 The Filesystem

The owner of a file or a user working with root privileges can use the chmod utility
to change the access permissions of a file. This utility specifies read, write, and execute permissions for the file’s owner, the group, and all other users on the system.
ACLs (Access Control Lists) provide finer-grained control over which users can
access specific directories and files than do traditional permissions. Using ACLs you
can specify the ways in which each of several users can access a directory or file. Few
utilities preserve ACLs when working with files.
An ordinary file stores user data, such as textual information, programs, or images.
A directory is a standard-format disk file that stores information, including names,
about ordinary files and other directory files. An inode is a data structure, stored on
disk, that defines a file’s existence and is identified by an inode number. A directory
relates each of the filenames it stores to an inode.
A link is a pointer to a file. You can have several links to a file so you can share the
file with other users or have the file appear in more than one directory. Because only
one copy of a file with multiple links exists, changing the file through any one link
causes the changes to appear in all the links. Hard links cannot link directories or
span filesystems, whereas symbolic links can.
Table 4-3 summarizes the utilities introduced in this chapter.

Table 4-3

Utilities introduced in Chapter 4

Utility

Function

cd

Associates you with another working directory (page 92)

chmod

Changes access permissions on a file (page 99)

getfacl

Displays a file’s ACL (page 105)

ln

Makes a link to an existing file (page 110)

mkdir

Creates a directory (page 90)

pwd

Displays the pathname of the working directory (page 86)

rmdir

Deletes a directory (page 93)

setfacl

Modifies a file’s ACL (page 105)

Exercises
1. Is each of the following an absolute pathname, a relative pathname, or a
simple filename?
a. milk_co

Exercises 121

b. correspond/business/milk_co
c. /home/max
d. /home/max/literature/promo
e. ..
f. letter.0210
2. List the commands you can use to perform these operations:
a. Make your home directory the working directory
b. Identify the working directory
3. If the working directory is /home/max with a subdirectory named
literature, give three sets of commands you can use to create a subdirectory
named classics under literature. Also give several sets of commands you can
use to remove the classics directory and its contents.
4. The df utility displays all mounted filesystems along with information about
each. Use the df utility with the –h (human-readable) option to answer the
following questions.
a. How many filesystems are mounted on the local system?
b. Which filesystem stores your home directory?
c. Assuming your answer to exercise 4a is two or more, attempt to create a
hard link to a file on another filesystem. What error message is displayed?
What happens when you attempt to create a symbolic link to the file
instead?
5. Suppose you have a file that is linked to a file owned by another user. How
can you ensure that changes to the file are no longer shared?
6. You should have read permission for the /etc/passwd file. To answer the following questions, use cat or less to display /etc/passwd. Look at the fields
of information in /etc/passwd for the users on the local system.
a. Which character is used to separate fields in /etc/passwd?
b. How many fields are used to describe each user?
c. How many users are on the local system?
d. How many different login shells are in use on your system? (Hint: Look
at the last field.)
e. The second field of /etc/passwd stores user passwords in encoded form.
If the password field contains an x, your system uses shadow passwords
and stores the encoded passwords elsewhere. Does your system use
shadow passwords?

122 Chapter 4 The Filesystem

7. If /home/zach/draft and /home/max/letter are links to the same file and the
following sequence of events occurs, what will be the date in the opening of
the letter?
a. Max gives the command vim letter.
b. Zach gives the command vim draft.
c. Zach changes the date in the opening of the letter to January 31, writes
the file, and exits from vim.
d. Max changes the date to February 1, writes the file, and exits from vim.
8. Suppose a user belongs to a group that has all permissions on a file named
jobs_list, but the user, as the owner of the file, has no permissions. Describe
which operations, if any, the user/owner can perform on jobs_list. Which
command can the user/owner give that will grant the user/owner all permissions on the file?
9. Does the root directory have any subdirectories you cannot search as an
ordinary user? Does the root directory have any subdirectories you cannot
read as a regular user? Explain.
10. Assume you are given the directory structure shown in Figure 4-2 on
page 83 and the following directory permissions:
d--x--x--drwxr-xr-x

3 zach pubs 512 2013-03-10 15:16 business
2 zach pubs 512 2013-03-10 15:16 business/milk_co

For each category of permissions—owner, group, and other—what happens
when you run each of the following commands? Assume the working directory is the parent of correspond and that the file cheese_co is readable by
everyone.
a. cd correspond/business/milk_co
b. ls –l correspond/business
c. cat correspond/business/cheese_co

Advanced Exercises
11. What is an inode? What happens to the inode when you move a file within
a filesystem?
12. What does the .. entry in a directory point to? What does this entry point
to in the root (/) directory?
13. How can you create a file named –i? Which techniques do not work, and
why do they not work? How can you remove the file named –i?

Advanced Exercises

14. Suppose the working directory contains a single file named andor. What
error message is displayed when you run the following command line?
$ mv andor and\/or

Under what circumstances is it possible to run the command without producing an error?
15. The ls –i command displays a filename preceded by the inode number of the
file (page 112). Write a command to output inode/filename pairs for the
files in the working directory, sorted by inode number. (Hint: Use a
pipeline.)
16. Do you think the system administrator has access to a program that can
decode user passwords? Why or why not? (See exercise 6.)
17. Is it possible to distinguish a file from a hard link to a file? That is, given a
filename, can you tell whether it was created using an ln command?
Explain.
18. Explain the error messages displayed in the following sequence of
commands:
$ ls -l
drwxrwxr-x. 2 max pubs 1024 03-02 17:57 dirtmp
$ ls dirtmp
$ rmdir dirtmp
rmdir: dirtmp: Directory not empty
$ rm dirtmp/*
rm: No match.

123

5
The Shell

Chapter5
5

In This Chapter

Objectives

The Command Line . . . . . . . . . . . . 126

After reading this chapter you should be able to:

Standard Input and Standard
Output . . . . . . . . . . . . . . . . . . . . 133

 Describe a simple command

Redirection . . . . . . . . . . . . . . . . . . 135

 Understand command-line syntax and run commands
that include options and arguments

Pipelines . . . . . . . . . . . . . . . . . . . . 141

 Explain how the shell interprets the command line

Running a Command in the
Background . . . . . . . . . . . . . . . . 146

 Redirect output of a command to a file, overwriting
the file or appending to it

kill: Aborting a Background Job . . . 147
Filename Generation/Pathname
Expansion . . . . . . . . . . . . . . . . . 148
Builtins . . . . . . . . . . . . . . . . . . . . . 153

 Redirect input for a command so it comes from a file
 Connect commands using a pipeline
 Run commands in the background
 Use special characters as wildcards to generate
filenames
 Explain the difference between a stand-alone utility
and a shell builtin

126 Chapter 5 The Shell

This chapter takes a close look at the shell and explains how to use some of its features.
It discusses command-line syntax and describes how the shell processes a command line
and initiates execution of a program. This chapter also explains how to redirect input
to and output from a command, construct pipelines and filters on the command line,
and run a command in the background. The final section covers filename expansion and
explains how you can use this feature in your everyday work.
Except as noted, everything in this chapter applies to the Bourne Again (bash) and
TC (tcsh) Shells. The exact wording of the shell output differs from shell to shell:
What the shell you are using displays might differ slightly from what appears in this
book. For shell-specific information, refer to Chapters 8 ( bash ) and 9 (tcsh ).
Chapter 10 covers writing and executing bash shell scripts.

The Command Line
Command

This book uses the term command to refer to both the characters you type on the
command line and the program that action invokes.

Command line

A command line comprises a simple command (below), a pipeline (page 141), or a
list (page 145).

A Simple Command
The shell executes a program when you enter a command in response to its prompt.
For example, when you give an ls command, the shell executes the utility program
named ls. You can cause the shell to execute other types of programs—such as shell
scripts, application programs, and programs you have written—in the same way. The
line that contains the command, including any arguments, is called a simple command.
The following sections discuss simple commands; see page 130 for a more technical
and complete description of a simple command.

Syntax
Command-line syntax dictates the ordering and separation of the elements on a
command line. When you press the RETURN key after entering a command, the shell
scans the command line for proper syntax. The syntax for a simple command is
command [arg1] [arg2] ... [argn] RETURN
Whitespace (any combination of SPACEs and/or TABs) must separate elements on the command line. The command is the name of the command, arg1 through argn are
arguments, and RETURN is the keystroke that terminates the command line. The brackets
in the command-line syntax indicate that the arguments they enclose are optional. Not
all commands require arguments: Some commands do not allow arguments; other
commands allow a variable number of arguments; and still others require a specific
number of arguments. Options, a special kind of argument, are usually preceded by one
or two hyphens (–).

The Command Line

127

Command Name
Usage message

Some useful Linux command lines consist of only the name of the command without
any arguments. For example, ls by itself lists the contents of the working directory.
Commands that require arguments typically give a short error message, called a usage
message, when you use them without arguments, with incorrect arguments, or with
the wrong number of arguments.
For example, the mkdir (make directory) utility requires an argument that specifies the
name of the directory you want it to create. Without this argument, it displays a usage
message (operand is another term for “argument”):
$ mkdir
mkdir: missing operand
Try 'mkdir --help' for more information.

Arguments
Token

On the command line each sequence of nonblank characters is called a token or
word. An argument is a token that a command acts on (e.g., a filename, a string of
characters, a number). For example, the argument to a vim or emacs command is the
name of the file you want to edit.
The following command line uses cp to copy the file named temp to tempcopy:
$ cp temp tempcopy

Arguments are numbered starting with the command itself, which is argument zero.
In this example, cp is argument zero, temp is argument one, and tempcopy is argument two. The cp utility requires at least two arguments on the command line.
Argument one is the name of an existing file. In this case, argument two is the name
of the file that cp is creating or overwriting. Here the arguments are not optional;
both arguments must be present for the command to work. When you do not supply
the right number or kind of arguments, cp displays a usage message. Try typing cp
and then pressing RETURN.

Options
An option is an argument that modifies the effects of a command. These arguments
are called options because they are usually optional. You can frequently specify more
than one option, modifying the command in several ways. Options are specific to and
interpreted by the program that the command line calls, not the shell.
By convention, options are separate arguments that follow the name of the command
and usually precede other arguments, such as filenames. Many utilities require options
to be prefixed with a single hyphen. However, this requirement is specific to the utility
and not the shell. GNU long (multicharacter) program options are frequently prefixed
with two hyphens. For example, – –help generates a (sometimes extensive) usage
message.

128 Chapter 5 The Shell

$ ls
hold
mark names
oldstuff temp zach
house max
office personal test
$ ls -r
zach temp
oldstuff names mark
hold
test personal office
max
house
$ ls -x
hold
house
mark max
names office
oldstuff personal temp test zach
$ ls -rx
zach
test temp personal oldstuff office
names max
mark house
hold

Figure 5-1

Using options

The first command in Figure 5-1 shows the output of an ls command without any
options. By default, ls lists the contents of the working directory in alphabetical order,
vertically sorted in columns. Next the –r (reverse order; because this is a GNU utility,
you can also specify ––reverse) option causes the ls utility to display the list of files
in reverse alphabetical order, still sorted in columns. The –x option causes ls to display the list of files in horizontally sorted rows.
Combining options

When you need to use several options, you can usually group multiple single-letter
options into one argument that starts with a single hyphen; do not put SPACEs between
the options. You cannot combine options that are preceded by two hyphens in this
way. Specific rules for combining options depend on the program you are running.
Figure 5-1 shows both the –r and –x options with the ls utility. Together these options
generate a list of filenames in horizontally sorted rows in reverse alphabetical order.
Most utilities allow you to list options in any order; thus ls –xr produces the same
results as ls –rx. The command ls –x –r also generates the same list.

The ––help option
tip Many utilities display a (sometimes extensive) help message when you call them with an argument
of ––help. All utilities developed by the GNU Project (page 3) accept this option. Following is the
help message displayed by the bzip2 compression utility (page 64).
$ bzip2 --help
bzip2, a block-sorting file compressor.

Version 1.0.6, 6-Sept-2010.

usage: bunzip2 [flags and input files in any order]
-h
-d
-z
-k
-f

--help
--decompress
--compress
--keep
--force

print this message
force decompression
force compression
keep (don't delete) input files
overwrite existing output files

...
If invoked as 'bzip2', default action is to compress.
as 'bunzip2', default action is to decompress.
as 'bzcat', default action is to decompress to stdout.
...

The Command Line
Option arguments

129

Some utilities have options that require arguments. These arguments are not
optional. For example, the gcc utility (C compiler) has a –o (output) option that must
be followed by the name you want to give the executable file that gcc generates. Typically an argument to an option is separated from its option letter by a SPACE:
$ gcc -o prog prog.c

Some utilities sometimes require an equal sign between an option and its argument.
For example, you can specify the width of output from diff in two ways:
$ diff -W 60 filea fileb

or
$ diff --width=60 filea fileb

Displaying readable file sizes: the –h option
tip Most utilities that report on file sizes specify the size of a file in bytes. Bytes work well when you
are dealing with smaller files, but the numbers can be difficult to read when you are working with
file sizes that are measured in gigabytes or terabytes. Use the –h (or ––human-readable) option
to display file sizes in kilobytes, megabytes, gigabytes, and terabytes. Experiment with the df –h
(disk free) and ls –lh commands.
Arguments that start
with a hyphen

Another convention allows utilities to work with arguments, such as filenames, that
start with a hyphen. If a file named –l is in the working directory, the following
command is ambiguous:
$ ls -l

This command could be a request to display a long listing of all files in the working directory (–l option) or a request for a listing of the file named –l. The ls utility interprets it as
the former. Avoid creating a file whose name begins with a hyphen. If you do create such
a file, many utilities follow the convention that a –– argument (two consecutive hyphens)
indicates the end of the options (and the beginning of the arguments). To disambiguate
the preceding command, you can type
$ ls -- -l

Using two consecutive hyphens to indicate the end of the options is a convention, not
a hard-and-fast rule, and a number of utilities do not follow it (e.g., find). Following
this convention makes it easier for users to work with a program you write.
For utilities that do not follow this convention, there are other ways to specify a filename that begins with a hyphen. You can use a period to refer to the working
directory and a slash to indicate the following filename refers to a file in the working
directory:
$ ls ./-l

You can also specify the absolute pathname of the file:
$ ls /home/max/-l

130 Chapter 5 The Shell

optional

Simple Commands
This section expands on the discussion of command-line syntax starting on page 126.
A simple command comprises zero or more variable assignments followed by a command line. It is terminated by a control operator (e.g., &, ;, |, NEWLINE; page 289). A
simple command has the following syntax:
[name=value ...] command-line
The shell assigns a value to each name and places it in the environment (page 468)
of the program that command-line calls so it is available to the called program and
its children as a variable. The shell evaluates the name=value pairs from left to right,
so if name appears more than once in this list, the rightmost value takes precedence.
The command-line might include redirection operators such as > and < (page 135).
The exit status (page 465) of a simple command is its return value. Under tcsh you
must use env (page 471) to place variables in the environment of a called program
without declaring them in the calling shell.
Placing a variable
in the environment
of a child

The following commands demonstrate how you can assign a value to a name (variable) and place that name in the environment of a child program; the variable is not
available to the interactive shell you are running (the parent program). The script
named echo_ee displays the value of the variable named ee. The first call to echo_ee
shows ee is not set in the child shell running the script. When the call to echo_ee is
preceded by assigning a value to ee, the script displays the value of ee in the child
shell. The final command shows ee has not been set in the interactive shell.
$ cat echo_ee
echo "The value of the ee variable is: $ee"
$ ./echo_ee
The value of the ee variable is:
$ ee=88 ./echo_ee
The value of the ee variable is: 88
$ echo $ee
$

Processing the Command Line
As you enter a command line, the tty device driver (part of the Linux kernel) examines
each character to see whether it must take immediate action. When you press CONTROL-H
(to erase a character) or CONTROL-U (to kill a line), the device driver immediately adjusts
the command line as required; the shell never sees the character(s) you erased or the line
you killed. Often a similar adjustment occurs when you press CONTROL-W (to erase a
word). When the character you entered does not require immediate action, the device
driver stores the character in a buffer and waits for additional characters. When you
press RETURN, the device driver passes the command line to the shell for processing.
Parsing the
command line

When the shell processes a command line, it looks at the line as a whole and parses
(breaks) it into its component parts (Figure 5-2). Next the shell looks for the name of

The Command Line

131

Get first word
and save as
command name

NEWLINE

no

Get more
of the
command line

no

Display
not found

yes

Execute program

yes

Does
program
exist?

Issue prompt

Figure 5-2 Processing the command line

the command. Usually the name of the command is the first item on the command line
after the prompt (argument zero). The shell takes the first characters on the command
line up to the first blank (TAB or SPACE) and then looks for a command with that name.
The command name (the first token) can be specified on the command line either as a
simple filename or as a pathname. For example, you can call the ls command in either
of the following ways:
$ ls

or
$ /bin/ls

optional The shell does not require the name of the program to appear first on the command
line. Thus you can structure a command line as follows:
$ >bb ) instructs the shell to redirect the output of a command
to the specified file instead of to the screen (Figure 5-6). The syntax of a command
line that redirects output is
command [arguments] > filename
where command is any executable program (e.g., an application program or a utility),
arguments are optional arguments, and filename is the name of the ordinary file the
shell redirects the output to.
Figure 5-7 uses cat to demonstrate output redirection. This figure contrasts with
Figure 5-5, where standard input and standard output are associated with the keyboard and screen. The input in Figure 5-7 comes from the keyboard. The redirect
output symbol on the command line causes the shell to associate cat’s standard output
with the sample.txt file specified following this symbol.

Redirecting output can destroy a file I
caution Use caution when you redirect output to a file. If the file exists, the shell will overwrite it and destroy
its contents. For more information see the tip “Redirecting output can destroy a file II” on
page 139.

After giving the command and typing the text shown in Figure 5-7, the sample.txt file
contains the text you entered. You can use cat with an argument of sample.txt to display
this file. The next section shows another way to use cat to display the file.

Standard Input and Standard Output 137

$ cat > sample.txt
This text is being entered at the keyboard and
cat is copying it to a file.
Press CONTROL-D to indicate the
end of file.
CONTROL-D

$

Figure 5-7

cat with its output redirected

Figure 5-7 shows that redirecting standard output from cat is a handy way to create
a file without using an editor. The drawback is that once you enter a line and press
RETURN, you cannot edit the text until after you finish creating the file. While you are
entering a line, the erase and kill keys work to delete text on that line. This procedure
is useful for creating short, simple files.
Figure 5-8 shows how to run cat and use the redirect output symbol to catenate (join
one after the other—the derivation of the name of the cat utility) several files into one
larger file. The first three commands display the contents of three files: stationery,
tape, and pens. The next command shows cat with three filenames as arguments.
When you call it with more than one filename, cat copies the files, one at a time, to
standard output. This command redirects standard output to the file supply_orders.
The final cat command shows that supply_orders contains the contents of the three
original files.

Redirecting Standard Input
Just as you can redirect standard output, so you can redirect standard input. The
redirect input symbol (<) instructs the shell to redirect a command’s input to come
$ cat stationery
2,000 sheets letterhead ordered:
$ cat tape
1 box masking tape ordered:
5 boxes filament tape ordered:
$ cat pens
12 doz. black pens ordered:

October 7
October 14
October 28
October 4

$ cat stationery tape pens > supply_orders
$ cat supply_orders
2,000 sheets letterhead ordered:
1 box masking tape ordered:
5 boxes filament tape ordered:
12 doz. black pens ordered:

Figure 5-8

Using cat to catenate files

October
October
October
October

7
14
28
4

138 Chapter 5 The Shell

Sh
ell

Standard
input

Shell

File
Standard
output

Command

Figure 5-9

Redirecting standard input

from the specified file instead of from the keyboard (Figure 5-9). The syntax of a
command line that redirects input is
command [arguments] < filename
where command is any executable program (such as an application program or a utility),
arguments are optional arguments, and filename is the name of the ordinary file the shell
redirects the input from.
Figure 5-10 shows cat with its input redirected from the supply_orders file created in
Figure 5-8 and standard output going to the screen. This setup causes cat to display
the supply_orders file on the screen. The system automatically supplies an EOF signal
at the end of an ordinary file.
Utilities that take
input from a file or
standard input

Giving a cat command with input redirected from a file yields the same result as giving a cat command with the filename as an argument. The cat utility is a member of
a class of utilities that function in this manner. Other members of this class of utilities
include lpr, sort, grep, and Perl. These utilities first examine the command line that
called them. If the command line includes a filename as an argument, the utility takes
its input from the specified file. If no filename argument is present, the utility takes
its input from standard input. It is the utility or program—not the shell or operating
system—that functions in this manner.
$ cat < supply_orders
2,000 sheets letterhead ordered:
1 box masking tape ordered:
5 boxes filament tape ordered:
12 doz. black pens ordered:

Figure 5-10

cat with its input redirected

October
October
October
October

7
14
28
4

Standard Input and Standard Output 139

Redirecting output can destroy a file II
caution Depending on which shell you are using and how the environment is set up, a command such as
the following can yield undesired results:
$ cat orange pear > orange
cat: orange: input file is output file

Although cat displays an error message, the shell destroys the contents of the existing orange file.
The new orange file will have the same contents as pear because the first action the shell takes
when it sees the redirection symbol (>) is to remove the contents of the original orange file. If you
want to catenate two files into one, use cat to put the two files into a temporary file and then use
mv to rename the temporary file:
$ cat orange pear > temp
$ mv temp orange

What happens in the next example can be even worse. The user giving the command wants to
search through files a, b, and c for the word apple and redirect the output from grep (page 56)
to the file a.output. Unfortunately the user enters the filename as a output, omitting the period and
inserting a SPACE in its place:
$ grep apple a b c > a output
grep: output: No such file or directory

The shell obediently removes the contents of a and then calls grep. The error message could take
a moment to appear, giving you a sense that the command is running correctly. Even after you see
the error message, it might take a while to realize that you have destroyed the contents of a.

noclobber: Prevents Overwriting Files
The shell provides the noclobber feature, which prevents you from overwriting a file
using redirection. Enable this feature by setting noclobber using the command set –o
noclobber. The same command with +o unsets noclobber. Under tcsh use set noclobber
and unset noclobber. With noclobber set, if you redirect output to an existing file, the
shell displays an error message and does not execute the command. Run under bash
and tcsh, the following examples create a file using touch, set noclobber, attempt to
redirect the output from echo to the newly created file, unset noclobber, and perform
the same redirection:
bash

tcsh

$ touch tmp
$ set -o noclobber
$ echo "hi there" > tmp
-bash: tmp: cannot overwrite existing file
$ set +o noclobber
$ echo "hi there" > tmp
tcsh
tcsh
tcsh
tmp:
tcsh
tcsh

$ touch tmp
$ set noclobber
$ echo "hi there" > tmp
File exists.
$ unset noclobber
$ echo "hi there" > tmp

140 Chapter 5 The Shell

$ date > whoson
$ cat whoson
Wed Mar 27 14:31:18 PST 2013
$ who >> whoson
$ cat whoson
Wed Mar 27 14:31:18 PST 2013
sam
tty1
2013-03-27
max
pts/4
2013-03-27
max
pts/5
2013-03-27
zach
pts/7
2013-03-26

Figure 5-11

05:00(:0)
12:23(:0.0)
12:33(:0.0)
08:45 (172.16.192.1)

Redirecting and appending output

You can override noclobber by putting a pipe symbol (tcsh uses an exclamation point)
after the redirect symbol (>|). In the following example, the user creates a file by redirecting the output of date. Next the user sets the noclobber variable and redirects output
to the same file again. The shell displays an error message. Then the user places a pipe
symbol after the redirect symbol, and the shell allows the user to overwrite the file.
$ date > tmp2
$ set -o noclobber
$ date > tmp2
-bash: tmp2: cannot overwrite existing file
$ date >| tmp2

For more information on using noclobber under tcsh, refer to page 397.

Do not trust noclobber
caution Appending output is simpler than the two-step procedure described in the preceding caution box
but you must be careful to include both greater than signs. If you accidentally use only one greater
than sign and the noclobber feature is not set, the shell will overwrite the orange file. Even if you
have the noclobber feature turned on, it is a good idea to keep backup copies of the files you are
manipulating in case you make a mistake.
Although it protects you from overwriting a file using redirection, noclobber does not stop you
from overwriting a file using cp or mv. These utilities include the –i (interactive) option that helps
protect you from this type of mistake by verifying your intentions when you try to overwrite a file.
For more information see the tip “cp can destroy a file” on page 54.

Appending Standard Output to a File
The append output symbol (>>) causes the shell to add new information to the end
of a file, leaving existing information intact. This symbol provides a convenient way
of catenating two files into one. The following commands demonstrate the action of
the append output symbol. The second command accomplishes the catenation
described in the preceding caution box:
$ cat orange
this is orange
$ cat pear >> orange
$ cat orange
this is orange
this is pear

Standard Input and Standard Output 141

The first command displays the contents of the orange file. The second command
appends the contents of the pear file to the orange file. The final command displays
the result.
Figure 5-11 shows how to create a file that contains the date and time (the output from
date), followed by a list of who is logged in (the output from who). The first command
in the example redirects the output from date to the file named whoson. Then cat displays
the file. The next command appends the output from who to the whoson file. Finally cat
displays the file containing the output of both utilities.

/dev/null: Making Data Disappear
The /dev/null device is a data sink, commonly referred to as a bit bucket. You can
redirect output you do not want to keep or see to /dev/null, and the output will
disappear without a trace:
$ echo "hi there" > /dev/null
$

Reading from /dev/null yields a null string. The following command truncates the
file named messages to zero length while preserving the ownership and permissions
of the file:
$ ls -lh messages
-rw-rw-r--. 1 sam pubs 125K 03-16 14:30 messages
$ cat /dev/null > messages
$ ls -lh messages
-rw-rw-r--. 1 sam pubs 0 03-16 14:32 messages

Pipelines
A pipeline consists of one or more commands separated by a pipe symbol (|). The
shell connects standard output (and optionally standard error) of the command preceding the pipe symbol to standard input of the command following the pipe symbol.
A pipeline has the same effect as redirecting standard output of one command to a
file and then using that file as standard input to another command. A pipeline does
away with separate commands and the intermediate file. The syntax of a pipeline is
command_a [arguments] | command_b [arguments]
The preceding command line uses a pipeline to effect the same result as the following
three commands:
command_a [arguments] > temp
command_b [arguments] < temp
rm temp
In the preceding sequence of commands, the first line redirects standard output
from command_a to an intermediate file named temp. The second line redirects
standard input for command_b to come from temp. The final line deletes temp. The
pipeline syntax is not only easier to type but also is more efficient because it does
not create a temporary file.

142 Chapter 5 The Shell

optional More precisely, a bash pipeline comprises one or more simple commands (page 130)
separated by a | or |& control operator. A pipeline has the following syntax:
[time] [!] command1 [| | |& command2 ... ]
where time is an optional utility that summarizes the system resources used by the
pipeline, ! logically negates the exit status returned by the pipeline, and the commands are simple commands (page 130) separated by | or |&. The | control operator
sends standard output of command1 to standard input of command2. The |& control operator is short for 2>&1 | (see “Sending errors through a pipeline” on
page 283) and sends standard output and standard error of command1 to standard
input of command2. The exit status of a pipeline is the exit status of the last simple
command unless pipefail (page 353) is set, in which case the exit status is the rightmost simple command that failed (returned a nonzero exit status) or zero if all simple
commands completed successfully.

Examples of Pipelines
tr

You can include in a pipeline any utility that accepts input either from a file specified
on the command line or from standard input. You can also include utilities that
accept input only from standard input. For example, the tr (translate; page 987) utility takes its input from standard input only. In its simplest usage tr has the following
syntax:
tr string1 string2
The tr utility accepts input from standard input and looks for characters that match
one of the characters in string1. Upon finding a match, it translates the matched character in string1 to the corresponding character in string2. That is, the first character
in string1 translates into the first character in string2, and so forth. The tr utility sends
its output to standard output. In both of the following tr commands, tr displays the
contents of the abstract file with the letters a, b, and c translated into A, B, and C,
respectively:
$ cat abstract
I took a cab today!
$
I
$
I

cat abstract | tr abc ABC
took A CAB todAy!
tr abc ABC < abstract
took A CAB todAy!

The tr utility does not change the contents of the original file; it cannot change the
original file because it does not “know” the source of its input.
lpr

The lpr (line printer) utility accepts input from either a file or standard input. When
you type the name of a file following lpr on the command line, it places that file in
the print queue. When you do not specify a filename on the command line, lpr takes

Standard Input and Standard Output 143

$ ls > temp
$ lpr temp
$ rm temp

or
$ ls | lpr

Figure 5-12

A pipeline

input from standard input. This feature enables you to use a pipeline to redirect input
to lpr. The first set of commands in Figure 5-12 shows how you can use ls and lpr with
an intermediate file (temp) to send a list of the files in the working directory to the
printer. If the temp file exists, the first command overwrites its contents. The second
set of commands uses a pipeline to send the same list (with the exception of temp) to
the printer.
sort

The commands in Figure 5-13 redirect the output from the who utility to temp and
then display this file in sorted order. The sort utility (page 58) takes its input from the
file specified on the command line or, when a file is not specified, from standard
input; it sends its output to standard output. The sort command line in Figure 5-13
takes its input from standard input, which is redirected (<) to come from temp. The
output sort sends to the screen lists the users in sorted (alphabetical) order. Because
sort can take its input from standard input or from a file named on the command line,
omitting the < symbol from Figure 5-13 yields the same result.
Figure 5-14 achieves the same result without creating the temp file. Using a pipeline,
the shell redirects the output from who to the input of sort. The sort utility takes input
from standard input because no filename follows it on the command line.
$ who > temp
$ sort < temp
max
pts/4
max
pts/5
sam
tty1
zach
pts/7
$ rm temp

Figure 5-13

12:23
12:33
05:00
08:45

Using a temporary file to store intermediate results

$ who | sort
max
pts/4
max
pts/5
sam
tty1
zach
pts/7

Figure 5-14

2013-03-24
2013-03-24
2013-03-24
2013-03-23

2013-03-24
2013-03-24
2013-03-24
2013-03-23

12:23
12:33
05:00
08:45

A pipeline doing the work of a temporary file

144 Chapter 5 The Shell
grep

When many people are using the system and you want information about only one of
them, you can send the output from who to grep (pages 56 and 833) using a pipeline.
The grep utility displays the line containing the string you specify—sam in the following
example:
$ who | grep sam
sam
tty1

less and more

2013-03-24 05:00

Another way of handling output that is too long to fit on the screen, such as a list of
files in a crowded directory, is to use a pipeline to send the output through less or
more (both on page 53).
$ ls | less

The less utility displays text one screen at a time. To view another screen of text, press
the SPACE bar. To view one more line, press RETURN. Press h for help and q to quit.

optional The pipe symbol (|) implies continuation. Thus the following command line
$ who | grep 'sam'
sam
tty1

2013-03-24 05:00

is the same as these command lines
$ who |
> grep 'sam'
sam
tty1

2013-03-24 05:00

When the shell parses a line that ends with a pipe symbol, it requires more input
before it can execute the command line. In an interactive environment, it issues a
secondary prompt (>; page 311) as shown above. Within a shell script, it processes
the next line as a continuation of the line that ends with the pipe symbol. See
page 499 for information about control operators and implicit command-line
continuation.

Filters
A filter is a command that processes an input stream of data to produce an output
stream of data. A command line that includes a filter uses a pipe symbol to connect
standard output of one command to standard input of the filter. Another pipe symbol
connects standard output of the filter to standard input of another command. Not
all utilities can be used as filters.
In the following example, sort is a filter, taking standard input from standard output
of who and using a pipe symbol to redirect standard output to standard input of lpr.
This command line sends the sorted output of who to the printer:
$ who | sort | lpr

The preceding example demonstrates the power of the shell combined with the versatility of Linux utilities. The three utilities who, sort, and lpr were not designed to

Standard Input and Standard Output 145

$ who | tee who.out | grep sam
sam
tty1
2013-03-24 05:00
$ cat who.out
sam
tty1
2013-03-24 05:00
max
pts/4
2013-03-24 12:23
max
pts/5
2013-03-24 12:33
zach
pts/7
2013 -03-23 08:45

Figure 5-15

tee sends its output to a file and to standard output

work with one another, but they all use standard input and standard output in the
conventional way. By using the shell to handle input and output, you can piece standard utilities together on the command line to achieve the results you want.
tee

The tee utility copies its standard input both to a file and to standard output. This
utility is aptly named: It takes a single stream of input and sends the output in two
directions. In Figure 5-15 the output of who is sent via a pipeline to standard input
of tee. The tee utility saves a copy of standard input in a file named who.out and
also sends a copy to standard output. Standard output of tee goes via a pipeline
to standard input of grep, which displays only those lines containing the string
sam. Use tee with the –a (append) option to cause it to append to a file instead of
overwriting it.

optional

Lists
A list is one or more pipelines (including simple commands), each separated from the
next by one of the following control operators: ;, &, &&, or ||. The && and || control
operators have equal precedence; they are followed by ; and &, which have equal precedence. The ; control operator is covered on page 290 and & on page 290. See
page 499 for information about control operators and implicit command-line
continuation.
An AND list has the following syntax:
pipeline1 && pipeline2
where pipeline2 is executed if and only if pipeline1 returns a true (zero) exit status.
In the following example, the first command in the list fails (and displays an error
message) so the shell does not execute the second command (cd /newdir; because it
is not executed, it does not display an error message):
$ mkdir /newdir && cd /newdir
mkdir: cannot create directory '/newdir': Permission denied

The exit status of AND and OR lists is the exit status of the last command in the list
that is executed. The exit status of the preceding list is false because mkdir was the
last command executed and it failed.

146 Chapter 5 The Shell

An OR list has the following syntax:
pipeline1 || pipeline2
where pipeline2 is executed if and only if pipeline1 returns a false (nonzero) exit
status. In the next example, the first command ( ping tests the connection to a
remote machine and sends standard output and standard error to /dev/null) in the list fails
so the shell executes the second command (it displays a message). If the first command had completed successfully, the shell would not have executed the second
command (and would not have displayed the message). The list returns an exit
status of true.
$ ping -c1 station &>/dev/null || echo "station is down"
station is down

For more information refer to “&& and || Boolean Control Operators” on page 291.

Running a Command in the Background
Foreground

All commands up to this point have been run in the foreground. When you run a command in the foreground, the shell waits for it to finish before displaying another
prompt and allowing you to continue. When you run a command in the background,
you do not have to wait for the command to finish before running another command.

Jobs

A job is another name for a process running a pipeline (which can be a simple command). You can have only one foreground job on a screen, but you can have many
background jobs. By running more than one job at a time, you are using one of
Linux’s features: multitasking. Running a command in the background can be useful
when the command will run for a long time and does not need supervision. It leaves
the screen free so you can use it for other work.

Job number,
PID number

To run a command in the background, type an ampersand (&; a control operator)
just before the RETURN that ends the command line. The shell assigns a small number
to the job and displays this job number between brackets. Following the job number,
the shell displays the process identification (PID) number—a larger number assigned
by the operating system. Each of these numbers identifies the command running in
the background. The shell then displays another prompt, and you can enter another
command. When the background job finishes, the shell displays a message giving
both the job number and the command line used to run the command.
The following example runs in the background; it is a pipeline that sends the output
of ls to lpr, which sends it to the printer.
$ ls -l | lpr &
[1] 22092
$

The [1] following the command line indicates that the shell has assigned job number
1 to this job. The 22092 is the PID number of the first command in the job. (The

Running a Command in the Background

147

TC Shell shows PID numbers for all commands in a job.) When this background
job completes execution, you see the message
[1]+ Done

ls -l | lpr

(In place of ls –l, the shell might display something similar to ls ––color=auto –l. This
difference is due to the fact that ls is aliased [page 342] to ls ––color=auto.)

Moving a Job from the Foreground to the Background
CONTROL-Z

and bg

You can suspend a foreground job (stop it from running) by pressing the suspend key,
usually CONTROL-Z . The shell then stops the process and disconnects standard input from
the keyboard. It does, however, still send standard output and standard error to the
screen. You can put a suspended job in the background and restart it by using the bg
command followed by the job number. You do not need to specify the job number
when there is only one suspended job.
Redirect the output of a job you run in the background to keep it from interfering
with whatever you are working on in the foreground (on the screen). Refer to
“Control Operators: Separate and Group Commands” on page 289 for more detail
about background tasks.

fg

Only the foreground job can take input from the keyboard. To connect the keyboard
to a program running in the background, you must bring the program to the foreground. To do so, type fg without any arguments when only one job is in the
background. When more than one job is in the background, type fg, or a percent sign
(%), followed by the number of the job you want to bring to the foreground. The
shell displays the command you used to start the job (promptme in the following
example), and you can enter input the program requires to continue.
$ fg 1
promptme

kill: Aborting a Background Job
The interrupt key (usually CONTROL-C ) cannot abort a background process because the
keyboard is not attached to the job; you must use kill (page 846) for this purpose. Follow kill on the command line with either the PID number of the process you want to
abort or a percent sign (%) followed by the job number.
Determining the
PID of a process
using ps

If you forget a PID number, you can use the ps (process status) utility (page 324) to
display it. The following example runs a find command in the background, uses ps to
display the PID number of the process, and aborts the job using kill:
$ find / -name memo55 > mem.out &
[1] 18228
$ ps | grep find
18228 pts/10
00:00:01 find
$ kill 18228
[1]+ Terminated
find / -name memo55 > mem.out
$

148 Chapter 5 The Shell
Determining the
number of a job
using jobs

If you forget a job number, you can use the jobs command to display a list of jobs
that includes job numbers. The next example is similar to the previous one except it
uses the job number instead of the PID number to identify the job to be killed. Sometimes the message saying the job is terminated does not appear until you press RETURN
after the RETURN that executes the kill command.
$ find / -name memo55 > mem.out &
[1] 18236
$ bigjob &
[2] 18237
$ jobs
[1]- Running
[2]+ Running
$ kill %1
$ RETURN
[1]- Terminated
$

find / -name memo55 > mem.out &
bigjob &

find / -name memo55 > mem.out

Filename Generation/Pathname Expansion
Wildcards, globbing

When you specify an abbreviated filename that contains special characters, also called
metacharacters, the shell can generate filenames that match the names of existing files.
These special characters are also referred to as wildcards because they act much as the
jokers do in a deck of cards. When one of these characters appears in an argument on
the command line, the shell expands that argument in sorted order into a list of filenames
and passes the list to the program called by the command line. Filenames that contain
these special characters are called ambiguous file references because they do not refer to
one specific file. The process the shell performs on these filenames is called pathname
expansion or globbing.
Ambiguous file references can quickly refer to a group of files with similar names,
saving the effort of typing the names individually. They can also help find a file whose
name you do not remember in its entirety. If no filename matches the ambiguous file
reference, the shell generally passes the unexpanded reference—special characters and
all—to the command. See “Brace Expansion” on page 356 for a technique that generates strings that do not necessarily match filenames.

The ? Special Character
The question mark (?) is a special character that causes the shell to generate filenames.
It matches any single character in the name of an existing file. The following command
uses this special character in an argument to the lpr utility:
$ lpr memo?

Filename Generation/Pathname Expansion 149

The shell expands the memo? argument and generates a list of files in the working
directory that have names composed of memo followed by any single character. The
shell then passes this list to lpr. The lpr utility never “knows” the shell generated the filenames it was called with. If no filename matches the ambiguous file reference, the shell
passes the string itself (memo?) to lpr or, if it is set up to do so, passes a null string (see
nullglob on page 353).
The following example uses ls first to display the names of all files in the working
directory and then to display the filenames that memo? matches:
$ ls
mem
memo

memo12
memo5

memo9
memoa

memomax
memos

$ ls memo?
memo5 memo9

memoa

memos

newmemo5

The memo? ambiguous file reference does not match mem, memo, memo12,
memomax, or newmemo5. You can also use a question mark in the middle of an
ambiguous file reference:
$ ls
7may4report
may14report

may4report
may4report.79

$ ls may?report
may4report mayqreport
echo

mayqreport
mayreport

may_report

may_report
may.report

may.report

You can use echo and ls to practice generating filenames. The echo utility displays the
arguments the shell passes to it:
$ echo may?report
may4report mayqreport may_report may.report

The shell first expands the ambiguous file reference into a list of files in the working
directory that match the string may?report. It then passes this list to echo, as though
you had entered the list of filenames as arguments to echo. The echo utility displays
the list of filenames.
A question mark does not match a leading period (one that indicates a hidden filename;
page 86). When you want to match filenames that begin with a period, you must
explicitly include the period in the ambiguous file reference.

The * Special Character
The asterisk (*) performs a function similar to that of the question mark but matches
any number of characters, including zero characters, in a filename. The following
example first shows all files in the working directory and then shows commands that

150 Chapter 5 The Shell

display all the filenames that begin with the string memo, end with the string mo, and
contain the string alx:
$ ls
amemo
mem

memalx
memo

memo.0612
memoa

memoalx.0620
memoalx.keep

memorandum
memosally

sallymemo
user.memo

$ echo memo*
memo memo.0612 memoa memoalx.0620 memoalx.keep memorandum memosally
$ echo *mo
amemo memo sallymemo user.memo
$ echo *alx*
memalx memoalx.0620 memoalx.keep

The ambiguous file reference memo* does not match amemo, mem, sallymemo, or
user.memo. Like the question mark, an asterisk does not match a leading period in a
filename.
The –a option causes ls to display hidden filenames (page 86). The command echo *
does not display . (the working directory), .. (the parent of the working directory),
.aaa, or .profile. In contrast, the command echo .* displays only those four names:
$ ls
aaa memo.0612
$ ls -a
.
aaa
.. .aaa

memo.sally

memo.0612
memo.sally

report

.profile
report

sally.0612

sally.0612
saturday

saturday

thurs

thurs

$ echo *
aaa memo.0612 memo.sally report sally.0612 saturday thurs
$ echo .*
. .. .aaa .profile

In the following example, .p* does not match memo.0612, private, reminder, or
report. The ls .* command causes ls to list .private and .profile in addition to the contents of the . directory (the working directory) and the .. directory (the parent of the
working directory). When called with the same argument, echo displays the names
of files (including directories) in the working directory that begin with a dot (.) but
not the contents of directories.
$ ls -a
. .. memo.0612

private

.private

.profile

$ echo .p*
.private .profile
$ ls .*
.private .profile
.:
memo.0612 private
..:

reminder

report

reminder

report

Filename Generation/Pathname Expansion 151
...
$ echo .*
. .. .private .profile

You can plan to take advantage of ambiguous file references when you establish
conventions for naming files. For example, when you end the names of all text files
with .txt, you can reference that group of files with *.txt. The next command uses
this convention to send all text files in the working directory to the printer. The
ampersand causes lpr to run in the background.
$ lpr *.txt &

The shell expands ambiguous file references
tip The shell does the expansion when it processes an ambiguous file reference, not the program
that the shell runs. In the examples in this section, the utilities (ls, cat, echo, lpr) never see the
ambiguous file references. The shell expands the ambiguous file references and passes a list of
ordinary filenames to the utility. In the previous examples, echo demonstrates this fact because
it simply displays its arguments; it never displays the ambiguous file reference.

The [ ] Special Characters
A pair of brackets surrounding one or more characters causes the shell to match
filenames containing the individual characters within the brackets. Whereas memo?
matches memo followed by any character, memo[17a] is more restrictive: It matches
only memo1, memo7, and memoa. The brackets define a character class that
includes all the characters within the brackets. (GNU calls this a character list; a
GNU character class is something different.) The shell expands an argument that
includes a character-class definition by substituting each member of the character
class, one at a time, in place of the brackets and their contents. The shell then passes
the list of matching filenames to the program it is calling.
Each character-class definition can replace only a single character within a filename.
The brackets and their contents are like a question mark that substitutes only the
members of the character class.
The first of the following commands lists the names of all files in the working directory
that begin with a, e, i, o, or u. The second command displays the contents of the files
named page2.txt, page4.txt, page6.txt, and page8.txt.
$ echo [aeiou]*
...
$ less page[2468].txt
...

A hyphen within brackets defines a range of characters within a character-class definition. For example, [6–9] represents [6789], [a–z] represents all lowercase letters in
English, and [a–zA–Z] represents all letters, both uppercase and lowercase, in
English.

152 Chapter 5 The Shell

The following command lines show three ways to print the files named part0, part1,
part2, part3, and part5. Each of these command lines causes the shell to call lpr with
five filenames:
$ lpr part0 part1 part2 part3 part5
$ lpr part[01235]
$ lpr part[0-35]

The first command line explicitly specifies the five filenames. The second and third
command lines use ambiguous file references, incorporating character-class definitions. The shell expands the argument on the second command line to include all files
that have names beginning with part and ending with any of the characters in the
character class. The character class is explicitly defined as 0, 1, 2, 3, and 5. The third
command line also uses a character-class definition but defines the character class to
be all characters in the range 0–3 plus 5.
The following command line prints 39 files, part0 through part38:
$ lpr part[0-9] part[12][0-9] part3[0-8]

The first of the following commands lists the files in the working directory whose
names start with a through m. The second lists files whose names end with x, y, or z.
$ echo [a-m]*
...
$ echo *[x-z]
...

optional When an exclamation point (!) or a caret (^) immediately follows the opening bracket
([) that starts a character-class definition, the character class matches any character
not between the brackets. Thus [^tsq]* matches any filename that does not begin
with t, s, or q.
The following examples show that *[^ab] matches filenames that do not end with
the letter a or b and that [^b-d]* matches filenames that do not begin with b, c, or d.
$ ls
aa ab

ac

ad

ba

bb

$ ls *[^ab]
ac ad bc bd

cc

dd

bc

bd

cc

dd

$ ls [^b-d]*
aa ab ac ad

You can cause a character class to match a hyphen (–) or a closing bracket (]) by placing
it immediately before the final (closing) bracket.

Chapter Summary

153

The next example demonstrates that the ls utility cannot interpret ambiguous file references. First ls is called with an argument of ?old. The shell expands ?old into a
matching filename, hold, and passes that name to ls. The second command is the same
as the first, except the ? is quoted (by preceding it with a backslash [\]; refer to “Special
Characters” on page 50). Because the ? is quoted, the shell does not recognize it as a
special character and passes it to ls. The ls utility generates an error message saying
that it cannot find a file named ?old (because there is no file named ?old).
$ ls ?old
hold
$ ls \?old
ls: ?old: No such file or directory

Like most utilities and programs, ls cannot interpret ambiguous file references; that
work is left to the shell.

Builtins
A builtin is a utility (also called a command) that is built into a shell. Each of the
shells has its own set of builtins. When it runs a builtin, the shell does not fork a new
process. Consequently builtins run more quickly and can affect the environment of
the current shell. Because builtins are used in the same way as utilities, you will not
typically be aware of whether a utility is built into the shell or is a stand-alone utility.
For example, echo is a shell builtin. It is also a stand-alone utility. The shell always
executes a shell builtin before trying to find a command or utility with the same
name. See page 476 for an in-depth discussion of builtin commands, page 491 for a
list of bash builtins, and page 407 for a list of tcsh builtins.
Listing bash
builtins

To display a list of bash builtins, give the command info bash shell builtin. To display
a page with information on each builtin, move the cursor to the Bash Builtins line
and press RETURN. Alternatively, you can view the builtins man page.

Getting help with
bash builtins

You can use the bash help command to display information about bash builtins. See
page 39 for more information.

Listing tcsh builtins

To list tcsh builtins, give the command man tcsh to display the tcsh man page and then
search for the second occurrence of Builtin commands by using the following two
commands: /Builtin commands (search for the string) and n (search for the next
occurrence of the string).

Chapter Summary
The shell is the Linux command interpreter. It scans the command line for proper syntax, picking out the command name and arguments. The name of the command is
argument zero. The first argument is argument one, the second is argument two, and

154 Chapter 5 The Shell

so on. Many programs use options to modify the effects of a command. Most Linux
utilities identify an option by its leading one or two hyphens.
When you give it a command, the shell tries to find an executable program with the
same name as the command. When it does, the shell executes the program. When it
does not, the shell tells you it cannot find or execute the program. If the command is
a simple filename, the shell searches the directories listed in the PATH variable to
locate the command.
When it executes a command, the shell assigns one file or device to the command’s
standard input and another file to its standard output. By default, the shell causes a
command’s standard input to come from the keyboard and its standard output to go
to the screen. You can instruct the shell to redirect a command’s standard input from
or standard output to any file or device. You can also connect standard output of one
command to standard input of another command to form a pipeline. A filter is a command that reads its standard input from standard output of one command and writes
its standard output to standard input of another command.
When a command runs in the foreground, the shell waits for the command to finish
before it displays a prompt and allows you to continue. When you put an ampersand
(&) at the end of a command line, the shell executes the command in the background
and displays another prompt immediately. Run slow commands in the background
when you want to enter other commands at the shell prompt. The jobs builtin displays a list of suspended jobs and jobs running in the background and includes the
job number of each.
The shell interprets special characters on a command line to generate filenames. A
reference that uses special characters (wildcards) to abbreviate a list of one or more
filenames is called an ambiguous file reference. A question mark represents any single
character, and an asterisk represents zero or more characters. A single character
might also be represented by a character class: a list of characters within brackets.
A builtin is a utility that is built into a shell. Each shell has its own set of builtins.
When it runs a builtin, the shell does not fork a new process. Consequently builtins
run more quickly and can affect the environment of the current shell.

Utilities and Builtins Introduced in This Chapter
Table 5-1 lists the utilities introduced in this chapter.

Table 5-1

New utilities

Utility

Function

tr

Maps one string of characters to another (page 142)

tee

Sends standard input to both a file and standard output (page 145)

bg

Moves a process to the background (page 147)

fg

Moves a process to the foreground (page 147)

jobs

Displays a list of suspended jobs and jobs running in the background (page 148)

Exercises 155

Exercises
1. What does the shell ordinarily do while a command is executing? What
should you do if you do not want to wait for a command to finish before
running another command?
2. Using sort as a filter, rewrite the following sequence of commands:
$ sort list > temp
$ lpr temp
$ rm temp

3. What is a PID number? Why are these numbers useful when you run processes
in the background? Which utility displays the PID numbers of the commands
you are running?
4. Assume the following files are in the working directory:
$ ls
intro
notesa

notesb
ref1

ref2
ref3

section1
section2

section3
section4a

section4b
sentrev

Give commands for each of the following, using wildcards to express
filenames with as few characters as possible.
a. List all files that begin with section.
b. List the section1, section2, and section3 files only.
c. List the intro file only.
d. List the section1, section3, ref1, and ref3 files.
5. Refer to Part VII or the info or man pages to determine which command will
a. Display the number of lines in its standard input that contain the word a or A.
b. Display only the names of the files in the working directory that contain
the pattern $(.
c. List the files in the working directory in reverse alphabetical order.
d. Send a list of files in the working directory to the printer, sorted by size.
6. Give a command to
a. Redirect standard output from a sort command to a file named
phone_list. Assume the input file is named numbers.
b. Translate all occurrences of the characters [ and { to the character (, and
all occurrences of the characters ] and } to the character ), in the file
permdemos.c. (Hint: Refer to tr on page 987.)
c. Create a file named book that contains the contents of two other files:
part1 and part2.
7. The lpr and sort utilities accept input either from a file named on the command
line or from standard input.
a. Name two other utilities that function in a similar manner.
b. Name a utility that accepts its input only from standard input.

156 Chapter 5 The Shell

8. Give an example of a command that uses grep
a. With both input and output redirected.
b. With only input redirected.
c. With only output redirected.
d. Within a pipeline.
In which of the preceding cases is grep used as a filter?
9. Explain the following error message. Which filenames would a subsequent
ls command display?
$ ls
abc abd abe abf abg abh
$ rm abc ab*
rm: cannot remove 'abc': No such file or directory

Advanced Exercises
10. When you use the redirect output symbol (>) on a command line, the shell
creates the output file immediately, before the command is executed. Demonstrate that this is true.
11. In experimenting with variables, Max accidentally deletes his PATH variable.
He decides he does not need the PATH variable. Discuss some of the problems
he could soon encounter and explain the reasons for these problems. How
could he easily return PATH to its original value?
12. Assume permissions on a file allow you to write to the file but not to delete it.
a. Give a command to empty the file without invoking an editor.
b. Explain how you might have permission to modify a file that you cannot
delete.
13. If you accidentally create a filename that contains a nonprinting character,
such as a CONTROL character, how can you remove the file?
14. Why does the noclobber variable not protect you from overwriting an
existing file with cp or mv?
15. Why do command names and filenames usually not have embedded SPACEs?
How would you create a filename containing a SPACE? How would you
remove it? (This is a thought exercise, not recommended practice. If you
want to experiment, create a file and work in a directory that contains only
your experimental file.)
16. Create a file named answer and give the following command:
$ > answers.0102 < answer cat

Explain what the command does and why. What is a more conventional
way of expressing this command?

I

PART II
The Editors
CHAPTER 6
The vim Editor

159

CHAPTER 7
The emacs Editor

213

6
The vim Editor

Chapter6
6

In This Chapter

Objectives

Tutorial: Using vim to Create
and Edit a File . . . . . . . . . . . . . . 161

After reading this chapter you should be able to:

Introduction to vim Features . . . . 168
Online Help . . . . . . . . . . . . . . . . . . 168
Command Mode: Moving the
Cursor . . . . . . . . . . . . . . . . . . . . . 174
Input Mode . . . . . . . . . . . . . . . . . . 178
Command Mode: Deleting and
Changing Text . . . . . . . . . . . . . . 179
Searching and Substituting . . . . . 183
Copying, Moving, and
Deleting Text . . . . . . . . . . . . . . . 190
The General-Purpose Buffer . . . . . 191
Reading and Writing Files . . . . . . . 193
The .vimrc Startup File . . . . . . . . . 195

 Use vim to create and edit a file
 View vim online help
 Explain the difference between Command and Input
modes
 Explain the purpose of the Work buffer
 List the commands that open a line above the cursor,
append text to the end of a line, move the cursor to the
first line of the file, and move the cursor to the middle
line of the screen
 Describe Last Line mode and list some commands that
use this mode
 Describe how to set and move the cursor to a marker
 List the commands that move the cursor backward and
forward by characters and words
 Describe how to read a file into the Work buffer
 Explain how to search backward and forward for text
and how to repeat that search

160 Chapter 6 The vim Editor

This chapter begins with a history and description of vi, the original, powerful,
sometimes cryptic, interactive, visually oriented text editor. The chapter continues
with a tutorial that explains how to use vim (vi improved—a vi clone supplied with
or available for most Linux distributions) to create and edit a file. Much of the
tutorial and the balance of the chapter apply to vi and other vi clones. Following
the tutorial, the chapter delves into the details of many vim commands and explains
how to use parameters to customize vim to meet your needs. It concludes with a
quick reference/summary of vim commands.

History
Before vi was developed, the standard UNIX system editor was ed (available on most
Linux systems), a line-oriented editor that made it difficult to see the context of your
editing. Next came ex,1 a superset of ed. The most notable advantage that ex has over
ed is a display-editing facility that allows you to work with a full screen of text instead
of just a line. While using ex, you can bring up the display-editing facility by giving
a vi (Visual mode) command. People used this display-editing facility so extensively
that the developers of ex made it possible to start the editor with the display-editing
facility already running, rather than having to start ex and then give a vi command.
Appropriately they named the program vi. You can call the Visual mode from ex, and
you can go back to ex while you are using vi. Start by running ex; give a vi command
to switch to Visual mode, and give a Q command while in Visual mode to use ex. The
quit command exits from ex.
vi clones

Linux offers a number of versions, or clones, of vi. The most popular of these clones
are elvis (elvis.the-little-red-haired-girl.org), nvi (an implementation of the original vi
editor by Keith Bostic), vile (invisible-island.net/vile/vile.html), and vim
(www.vim.org). Each clone offers additional features beyond those provided by the
original vi.
The examples in this book are based on vim. Several Linux distributions support
multiple versions of vim. For example, Fedora provides /bin/vi, a minimal build of
vim that is compact and faster to load but offers fewer features, and /usr/bin/vim,
a full-featured version of vim.
If you use one of the clones other than vim, or vi itself, you might notice slight differences from the examples presented in this chapter. The vim editor is compatible
with almost all vi commands and runs on many platforms, including Windows,
Macintosh, OS/2, UNIX, and Linux. Refer to the vim home page (www.vim.org) for
more information and a very useful Tips section.

1. The ex program is usually a link to vi, which is a version of vim on some systems.

Tutorial: Using vim to Create and Edit a File 161
What vim is not

The vim editor is not a text formatting program. It does not justify margins or provide
the output formatting features of a sophisticated word processing system such as
OpenOffice.org Writer. Rather, vim is a sophisticated text editor meant to be used to
write code (C, HTML, Java, and so on), short notes, and input to a text formatting
system, such as groff or troff. You can use fmt (page 812) to minimally format a text
file you create with vim.

Reading this chapter

Because vim is so large and powerful, this chapter describes only some of its features.
Nonetheless, if vim is completely new to you, you might find even this limited set of
commands overwhelming. The vim editor provides a variety of ways to accomplish
most editing tasks. A useful strategy for learning vim is to begin by learning a subset
of commands to accomplish basic editing tasks. Then, as you become more comfortable with the editor, you can learn other commands that enable you to edit a file more
quickly and efficiently. The following tutorial section introduces a basic, useful set of
vim commands and features that will enable you to create and edit a file.

Tutorial: Using vim to Create and Edit a File
This section explains how to start vim, enter text, move the cursor, correct text, save
the file to the disk, and exit from vim. The tutorial discusses three of the modes of
operation of vim and explains how to switch from one mode to another.
vimtutor

In addition to working with this tutorial, you might want to try vim’s instructional
program, vimtutor. Enter its name as a command to run it.

vimtutor and vim help files are not installed by default
tip To run vimtutor and to get help as described on page 165, you must install the vim-enhanced or
vim-runtime package. See Appendix C for instructions.
Specifying a
terminal

Because vim takes advantage of features that are specific to various kinds of terminals,
you must tell it what type of terminal or terminal emulator you are using. On many
systems, and usually when you work on a terminal emulator, your terminal type is
set automatically. If you need to specify your terminal type explicitly, refer to “Specifying a Terminal” on page 1024.

Starting vim
Start vim with the following command to create and edit a file named practice (you
might need to use the command vi or vim.tiny in place of vim):
$ vim practice

When you press RETURN, the command line disappears, and the screen looks similar to
the one shown in Figure 6-1 on the next page.

162 Chapter 6 The vim Editor

Figure 6-1

Starting vim

The tildes (~) at the left of the screen indicate the file is empty. They disappear as you
add lines of text to the file. If the screen looks like a distorted version of the one shown
in Figure 6-1, the terminal type is probably not set correctly (see “Problem,” next).

The vi command might run vim
tip On some systems the command vi runs vim in vi-compatible mode (page 168).
The practice file is new so it contains no text. The vim editor displays a message similar to the one shown in Figure 6-1 on the status (bottom) line of the terminal to
indicate you are creating and editing a new file. When you edit an existing file, vim
displays the first few lines of the file and gives status information about the file on
the status line.
Problem

If you start vim with a terminal type that is not in the terminfo database, vim displays
an error message and waits for you to press RETURN or sets the terminal type to ansi,
which works on many terminals.

Emergency exit

To reset the terminal type, press ESCAPE and then give the following command to exit
from vim and display the shell prompt:
:q!

When you enter the colon (:), vim moves the cursor to the bottom line of the screen.
The characters q! tell vim to quit without saving your work. (You will not ordinarily
exit from vim this way because you typically want to save your work.) You must press
RETURN after you give this command. When the shell displays its prompt, refer to
“Specifying a Terminal” on page 1024 and start vim again.
If you start vim without a filename, it displays information about itself (Figure 6-2).

Tutorial: Using vim to Create and Edit a File 163

Figure 6-2

Starting vim without a filename

Command and Input Modes
Two of vim’s modes of operation are Command mode (also called Normal mode) and
Input mode (Figure 6-3). While vim is in Command mode, you can give vim commands. For example, you can delete text or exit from vim. You can also command
vim to enter Input mode. In Input mode, vim accepts anything you enter as text and
displays it on the screen. Press ESCAPE to return vim to Command mode. By default the
vim editor keeps you informed about which mode it is in: It displays INSERT at the
lower-left corner of the screen while it is in Insert mode.

Colon (:)
Last
Line
mode

Command
mode
RETURN
Insert,
Append,
Open,
Replace,
Change

ESCAPE

Input
mode

Figure 6-3

Modes in vim

164 Chapter 6 The vim Editor

Figure 6-4

Entering text with vim

The following command causes vim to display line numbers next to the text you are
editing:
:set number RETURN
Last Line mode

The colon (:) in the preceding command puts vim into another mode, Last Line mode.
While in this mode, vim keeps the cursor on the bottom line of the screen. When you
finish entering the command by pressing RETURN, vim restores the cursor to its place in
the text. Give the command :set nonumber RETURN to turn off line numbering.

vim is case
sensitive

When you give vim a command, remember that the editor is case sensitive. In other
words, vim interprets the same letter as two different commands, depending on
whether you enter an uppercase or lowercase character. Beware of the CAPS LOCK (SHIFTLOCK) key. If you set this key to enter uppercase text while you are in Input mode and
then exit to Command mode, vim interprets your commands as uppercase letters. It
can be confusing when this happens because vim does not appear to be executing the
commands you are entering.

Entering Text
i/a (Input mode)

When you start vim, you must put it in Input mode before you can enter text. To put vim
in Input mode, press the i (insert before cursor) key or the a (append after cursor) key.
If you are not sure whether vim is in Input mode, press the ESCAPE key; vim returns to Command mode if it is in Input mode or beeps, flashes, or does nothing if it is already in
Command mode. You can put vim back in Input mode by pressing the i or a key again.
While vim is in Input mode, you can enter text by typing on the keyboard. If the text
does not appear on the screen as you type, vim is not in Input mode.

Tutorial: Using vim to Create and Edit a File 165

Figure 6-5

The main vim Help screen

To continue with this tutorial, enter the sample paragraph shown in Figure 6-4, pressing the RETURN key at the end of each line. If you do not press RETURN before the cursor
reaches the right side of the screen or window, vim wraps the text so that it appears to
start a new line. Physical lines will not correspond to programmatic (logical) lines in
this situation, so editing will be more difficult. While you are using vim, you can correct typing mistakes. If you notice a mistake on the line you are entering, you can
correct it before you continue (page 166). You can correct other mistakes later. When
you finish entering the paragraph, press ESCAPE to return vim to Command mode.

Getting Help
You must have the vim-runtime package installed to use vim’s help system; see
Appendix C.
To get help while you are using vim, enter the command :help [feature] followed by
The editor must be in Command mode when you enter this command. The
colon moves the cursor to the last line of the screen. If you type :help, vim displays
an introduction to vim Help (Figure 6-5). Each dark band near the bottom of the
screen names the file that is displayed above it. (Each area of the screen that displays
a file, such as the two areas shown in Figure 6-5, is a vim “window.”) The help.txt
file occupies most of the screen (the upper window) in Figure 6-5. The file that is
being edited (practice) occupies a few lines in the lower portion of the screen (the
lower window).
RETURN.

Read through the introduction to Help by scrolling the text as you read. Press j or the
DOWN ARROW key to move the cursor down one line at a time; press CONTROL-D or CONTROL-U
to scroll the cursor down or up half a window at a time. Give the command :q to close
the Help window.

166 Chapter 6 The vim Editor

Figure 6-6

Help with insert commands

You can display information about the insert commands by giving the command
:help insert while vim is in Command mode (Figure 6-6).

Correcting Text as You Insert It
The keys that back up and correct a shell command line serve the same functions
when vim is in Input mode. These keys include the erase, line kill, and word kill keys
(usually CONTROL-H, CONTROL-U, and CONTROL-W, respectively). Although vim might not
remove deleted text from the screen as you back up over it using one of these keys,
the editor does remove it when you type over the text or press RETURN.

Moving the Cursor
You need to be able to move the cursor on the screen so you can delete, insert, and
correct text. While vim is in Command mode, the RETURN key, the SPACE bar, and the ARROW
keys move the cursor. If you prefer to keep your hand closer to the center of the keyboard, if your terminal does not have ARROW keys, or if the emulator you are using does
not support them, you can use the h, j, k, and l (lowercase “l”) keys to move the cursor left, down, up, and right, respectively.

Deleting Text
x (Delete character)
dw (Delete word)
dd (Delete line)

You can delete a single character by moving the cursor until it is over the character
you want to delete and then giving the command x. You can delete a word by positioning the cursor on the first letter of the word and then giving the command dw
(Delete word). You can delete a line of text by moving the cursor until it is anywhere
on the line and then giving the command dd.

Tutorial: Using vim to Create and Edit a File 167

Undoing Mistakes
u (Undo)

If you delete a character, line, or word by mistake or give any command you want to
reverse, give the command u (Undo) immediately after the command you want to
undo. The vim editor will restore the text to the way it was before you gave the last
command. If you give the u command again, vim will undo the command you gave
before the one it just undid. You can use this technique to back up over many of your
actions. With the compatible parameter (page 168) set, however, vim can undo only
the most recent change.

:redo (Redo)

If you undo a command you did not mean to undo, give a Redo command: CONTROL-R
or :redo (followed by a RETURN). The vim editor will redo the undone command. As
with the Undo command, you can give the Redo command many times in a row.

Entering Additional Text
i (Insert)
a (Append)

When you want to insert new text within existing text, move the cursor so it is on
the character that follows the new text you plan to enter. Then give the i (Insert) command to put vim in Input mode, enter the new text, and press ESCAPE to return vim to
Command mode. Alternatively, you can position the cursor on the character that precedes the new text and use the a (Append) command.

o/O (Open)

To enter one or more lines, position the cursor on the line above where you want the
new text to go. Give the command o (Open). The vim editor opens a blank line below
the line the cursor was on, puts the cursor on the new, empty line, and goes into Input
mode. Enter the new text, ending each line with a RETURN. When you are finished entering text, press ESCAPE to return vim to Command mode. The O command works in the
same way o works, except it opens a blank line above the line the cursor is on.

Correcting Text
To correct text, use dd, dw, or x to remove the incorrect text. Then use i, a, o, or O
to insert the correct text.
For example, to change the word pressing to hitting in Figure 6-4 on page 164, you
might use the ARROW keys to move the cursor until it is on top of the p in pressing. Then
give the command dw to delete the word pressing. Put vim in Input mode by giving
an i command, enter the word hitting followed by a SPACE, and press ESCAPE. The word
is changed, and vim is in Command mode, waiting for another command. A shorthand for the two commands dw followed by the i command is cw (Change word).
The command cw puts vim into Input mode.

Page breaks for the printer
tip

CONTROL-L tells the printer to skip to the top of the next page. You can enter this character anywhere
in a document by pressing CONTROL-L while you are in Input mode. If ^L does not appear, press
CONTROL-V before CONTROL-L.

168 Chapter 6 The vim Editor

Ending the Editing Session
While you are editing, vim keeps the edited text in an area named the Work buffer.
When you finish editing, you must write out the contents of the Work buffer to a disk
file so the edited text is saved and available when you next want it.
Make sure vim is in Command mode and use the ZZ command (you must use uppercase Zs) to write the newly entered text to the disk and end the editing session. After
you give the ZZ command, vim returns control to the shell. You can exit with :q! if
you do not want to save your work.

Do not confuse ZZ with CONTROL-Z
caution When you exit from vim with ZZ, make sure that you type ZZ and not CONTROL-Z (typically the
suspend key). When you press CONTROL-Z, vim disappears from your screen, almost as though
you had exited from it. In fact, vim will continue running in the background with your work
unsaved. Refer to “Job Control” on page 294. If you try to start editing the same file with a new
vim command, vim displays a message about a swap file.

The compatible Parameter
The compatible parameter makes vim more compatible with vi. By default this
parameter is not set. To get started with vim, you can ignore this parameter.
Setting the compatible parameter changes many aspects of how vim works. For
example, when the compatible parameter is set, the Undo command (page 167) can
undo only the most recent change; in contrast, with the compatible parameter
unset, you can call Undo repeatedly to undo many changes. This chapter notes
when the compatible parameter affects a command. To obtain more details on the
compatible parameter, give the command :help compatible RETURN. To display a complete list of vim’s differences from the original vi, use :help vi-diff RETURN. See page 165
for a discussion of the help command.
From the command line use the –C option to set the compatible parameter and the
–N option to unset it. Refer to “Setting Parameters from Within vim” on page 175 for
information on how to change the compatible parameter while you are running vim.

Introduction to vim Features
This section covers online help, modes of operation, the Work buffer, emergency procedures, and other vim features. To see which features are incorporated in a particular
build, give a vim command followed by the ––version option.

Online Help
As covered briefly earlier, vim provides help while you are using it. Give the command
:help feature to display information about feature. As you scroll through the various
help texts, you will see words with a bar on either side, such as |tutor|. These words

Introduction to vim Features 169

are active links: Move the cursor on top of an active link and press CONTROL-] to jump
to the linked text. Use CONTROL-o (lowercase “o”) to jump back to where you were in
the help text. You can also use the active link words in place of feature. For example,
you might see the reference |credits|; you could enter :help credits RETURN to read more
about credits. Enter :q! to close a help window.
Some common features that you might want to investigate by using the help system
are insert, delete, and opening-window. Although opening-window is not intuitive,
you will get to know the names of features as you spend more time with vim. You
can also give the command :help doc-file-list to view a complete list of the help files.
Although vim is a free program, the author requests that you donate the money you
would have spent on similar software to help the children in Uganda (give the command :help iccf for more information).

Terminology
This chapter uses the following terms:
Current character
Current line
Status line

The character the cursor is on.
The line the cursor is on.
The last or bottom line of the screen. This line is reserved for Last Line mode and
status information. Text you are editing does not appear on this line.

Modes of Operation
The vim editor is part of the ex editor, which has five modes of operation:
• ex Command mode
• ex Input mode
• vim Command mode
• vim Input mode
• vim Last Line mode
While in Command mode, vim accepts keystrokes as commands, responding to each
command as you enter it. It does not display the characters you type in this mode.
While in Input mode, vim accepts and displays keystrokes as text that it eventually
puts into the file you are editing. All commands that start with a colon (:) put vim in
Last Line mode. The colon moves the cursor to the status line of the screen, where
you enter the rest of the command.
In addition to the position of the cursor, there is another important difference
between Last Line mode and Command mode. When you give a command in Command mode, you do not terminate the command with a RETURN. In contrast, you must
terminate all Last Line mode commands with a RETURN.
You do not normally use the ex modes. When this chapter refers to Input and
Command modes, it means the vim modes, not the ex modes.

170 Chapter 6 The vim Editor

When an editing session begins, vim is in Command mode. Several commands, including Insert and Append, put vim in Input mode. When you press the ESCAPE key, vim
always reverts to Command mode.
The Change and Replace commands combine the Command and Input modes. The
Change command deletes the text you want to change and puts vim in Input mode so
you can insert new text. The Replace command deletes the character(s) you overwrite
and inserts the new one(s) you enter. Figure 6-3 on page 163 shows the modes and
the methods for changing between them.

Watch the mode and the CAPS LOCK key
tip Almost anything you type in Command mode means something to vim. If you think vim is in Input
mode when it is in Command mode, typing in text can produce confusing results. When you are
learning to use vim, make sure the showmode parameter (page 198) is set (it is by default) to
remind you which mode you are using. You might also find it useful to turn on the status line by
giving a :set laststatus=2 command (page 197).
Also keep your eye on the CAPS LOCK key. In Command mode, typing uppercase letters produces
different results than typing lowercase ones. It can be disorienting to give commands and have
vim give the “wrong” responses.

The Display
The vim editor uses the status line and several symbols to give information about what
is happening during an editing session.

Status Line
The vim editor displays status information on the bottom line of the display area. This
information includes error messages, information about the deletion or addition of
blocks of text, and file status information. In addition, vim displays Last Line mode
commands on the status line.

Redrawing the Screen
Sometimes the screen might become garbled or overwritten. When vim puts characters on the screen, it sometimes leaves @ on a line instead of deleting the line. When
output from a program becomes intermixed with the display of the Work buffer,
things can get even more confusing. The output does not become part of the Work
buffer but affects only the display. If the screen gets overwritten, press ESCAPE to make
sure vim is in Command mode, and press CONTROL-L to redraw (refresh) the screen.

Tilde (~) Symbol
If the end of the file is displayed on the screen, vim marks lines that would appear past
the end of the file with a tilde (~) at the left of the screen. When you start editing a new
file, the vim editor marks each line on the screen (except the first line) with this symbol.

Correcting Text as You Insert It
While vim is in Input mode, you can use the erase and line kill keys to back up over
text so you can correct it. You can also use CONTROL-W to back up over words.

Introduction to vim Features 171

Work Buffer
The vim editor does all its work in the Work buffer. At the beginning of an editing
session, vim reads the file you are editing from the disk into the Work buffer. During
the editing session, it makes all changes to this copy of the file but does not change
the file on the disk until you write the contents of the Work buffer back to the disk.
Normally when you end an editing session, you tell vim to write the contents of the
Work buffer, which makes the changes to the text final. When you edit a new file,
vim creates the file when it writes the contents of the Work buffer to the disk, usually
at the end of the editing session.
Storing the text you are editing in the Work buffer has both advantages and disadvantages. If you accidentally end an editing session without writing out the contents
of the Work buffer, your work is lost. However, if you unintentionally make some
major changes (such as deleting the entire contents of the Work buffer), you can end
the editing session without implementing the changes.
To look at a file but not to change it while you are working with vim, you can use the
view utility:
$ view filename

Calling the view utility is the same as calling the vim editor with the –R (readonly)
option. Once you have invoked the editor in this way, you cannot write the contents
of the Work buffer back to the file whose name appeared on the command line. You
can always write the Work buffer out to a file with a different name. If you have
installed mc (Midnight Commander; page 879), the view command calls mcview and
not vim.

Line Length and File Size
The vim editor operates on files of any format, provided the length of a single line
(that is, the characters between two NEWLINE characters) can fit into available memory.
The total length of the file is limited only by available disk space and memory.

Windows
The vim editor allows you to open, close, and hide multiple windows, each of which
allows you to edit a different file. Most of the window commands consist of CONTROL-W followed by another letter. For example, CONTROL-W s opens another window (splits the screen)
that is editing the same file. CONTROL-W n opens a second window that is editing an empty
file. CONTROL-W w moves the cursor between windows, and CONTROL-W q (or :q) quits (closes)
a window. Give the command :help windows to display a complete list of windows
commands.

File Locks
When you edit an existing file, vim displays the first few lines of the file, gives status
information about the file on the status line, and locks the file. When you try to open
a locked file with vim, you will see a message similar to the one shown in Figure 6-7.
You will see this type of message in two scenarios: when you try to edit a file that

172 Chapter 6 The vim Editor

Figure 6-7

Attempting to open a locked file

someone is already editing (perhaps you are editing it in another window, in the background, or on another terminal) and when you try to edit a file that you were editing
when vim or the system crashed.
Although it is advisable to follow the instructions that vim displays, a second user can
edit a file and write it out with a different filename. Refer to the next sections for
more information.

Abnormal Termination of an Editing Session
You can end an editing session in one of two ways: When you exit from vim, you can
save the changes you made during the editing session or you can abandon those
changes. You can use the ZZ or :wq command from Command mode to save the
changes and exit from vim (see “Ending the Editing Session” on page 168).
To end an editing session without writing out the contents of the Work buffer, give
the following command:
:q!

Use the :q! command cautiously. When you use this command to end an editing session, vim does not preserve the contents of the Work buffer, so you will lose any work
you did since the last time you wrote the Work buffer to disk. The next time you edit
or use the file, it will appear as it did the last time you wrote the Work buffer to disk.
Sometimes you might find that you created or edited a file but vim will not let you
exit. For example, if you forgot to specify a filename when you first called vim, you
will get a message saying No file name when you give a ZZ command. If vim does
not let you exit normally, you can use the Write command (:w) to name the file and
write it to disk before you quit vim. Give the following command, substituting the
name of the file for filename (remember to follow the command with a RETURN):

Introduction to vim Features 173

:w filename
After you give the Write command, you can use :q to quit using vim. You do not need
to include the exclamation point (as in q!); it is necessary only when you have made
changes since the last time you wrote the Work buffer to disk. Refer to page 193 for
more information about the Write command.

When you cannot write to a file
tip It might be necessary to write a file using :w filename if you do not have write permission for the
file you are editing. If you give a ZZ command and see the message "filename " is read only, you
do not have write permission for the file. Use the Write command with a temporary filename to
write the file to disk under a different filename. If you do not have write permission for the working
directory, however, vim might not be able to write the file to the disk. Give the command again,
using an absolute pathname of a dummy (nonexistent) file in your home directory in place of the
filename. (For example, Max might give the command :w /home/max/tempor :w ~/temp.)
If vim reports File exists, you will need to use :w! filename to overwrite the existing file (make
sure you want to overwrite the file). Refer to page 194.

Recovering Text After a Crash
The vim editor temporarily stores the file you are working on in a swap file. If the
system crashes while you are editing a file with vim, you can often recover its text
from the swap file. When you attempt to edit a file that has a swap file, you will see
a message similar to the one shown in Figure 6-7 on page 172. If someone else is
editing the file, quit or open the file as a readonly file.
In the following example, Max uses the –r option to check whether the swap file
exists for a file named memo, which he was editing when the system crashed:
$ vim -r
Swap files found:
In current directory:
1.
.party.swp
owned by: max
dated: Thu Jan 26 11:36:44 2012
file name: ~max/party
modified: YES
user name: max
host name: coffee
process ID: 18439
2.
.memo.swp
owned by: max
dated: Sat Mar 23 17:14:05 2013
file name: ~max/memo
modified: no
user name: max
host name: coffee
process ID: 27733 (still running)
In directory ~/tmp:
-- none -In directory /var/tmp:
-- none -In directory /tmp:
-- none --

Backward

Backward

174 Chapter 6 The vim Editor

Forward
Forward

Figure 6-8

Forward and backward

With the –r option, vim displays a list of swap files it has saved (some might be old).
If your work was saved, give the same command followed by a SPACE and the name of
the file. You will then be editing a recent copy of your Work buffer. Give the command
:w filename immediately to save the salvaged copy of the Work buffer to disk under
a name different from the original file; then check the recovered file to make sure it is
OK. Following is Max’s exchange with vim as he recovers memo. Subsequently he
deletes the swap file:
$ vim -r memo
Using swap file ".memo.swp"
Original file "~/memo"
Recovery completed. You should check if everything is OK.
(You might want to write out this file under another name
and run diff with the original file to check for changes)
Delete the .swp file afterwards.
Hit ENTER or type command to continue
:w memo2
:q
$ rm .memo.swp

You must recover files on the system you were using
tip The recovery feature of vim is specific to the system you were using when the crash occurred. If
you are running on a cluster, you must log in on the system you were using before the crash to
use the –r option successfully.

Command Mode: Moving the Cursor
While vim is in Command mode, you can position the cursor over any character on
the screen. You can also display a different portion of the Work buffer on the screen.
By manipulating the screen and cursor position, you can place the cursor on any character in the Work buffer.

Command Mode: Moving the Cursor

h

175

l
SPACE

Figure 6-9

Moving the cursor by characters

You can move the cursor forward or backward through the text. As illustrated in
Figure 6-8, forward means toward the right and bottom of the screen and the end of
the file. Backward means toward the left and top of the screen and the beginning of
the file. When you use a command that moves the cursor forward past the end (right)
of a line, the cursor generally moves to the beginning (left) of the next line. When you
move it backward past the beginning of a line, the cursor generally moves to the end
of the previous line.
Long lines

Sometimes a line in the Work buffer might be too long to appear as a single line on
the screen. In such a case vim wraps the current line onto the next line (unless you set
the nowrap option [page 197]).
You can move the cursor through the text by any Unit of Measure (that is, character,
word, line, sentence, paragraph, or screen). If you precede a cursor-movement command with a number, called a Repeat Factor, the cursor moves that number of units
through the text. Refer to pages 203 through page 206 for precise definitions of
these terms.

Moving the Cursor by Characters
l/h

The SPACE bar moves the cursor forward, one character at a time, toward the right side
of the screen. The l (lowercase “l”) key and the RIGHT ARROW key (Figure 6-9) do the
same thing. For example, the command 7 SPACE or 7l moves the cursor seven characters to the right. These keys cannot move the cursor past the end of the current line
to the beginning of the next line. The h and LEFT ARROW keys are similar to the l and
RIGHT ARROW keys but work in the opposite direction.

Moving the Cursor to a Specific Character
f/F

You can move the cursor to the next occurrence of a specified character on the current
line by using the Find command. For example, the following command moves the
cursor from its current position to the next occurrence of the character a, if one
appears on the same line:
fa

You can also find the previous occurrence by using a capital F. The following command moves the cursor to the position of the closest previous a in the current line:
Fa

A semicolon (;) repeats the last Find command.

176 Chapter 6 The vim Editor

B

b

w

W

belief,.really...It
Moving the cursor by words

Figure 6-10

Moving the Cursor by Words
w/W

The w (word) key moves the cursor forward to the first letter of the next word
(Figure 6-10). Groups of punctuation count as words. This command goes to the
next line if the next word is located there. The command 15w moves the cursor to
the first character of the fifteenth subsequent word.
The W key is similar to the w key but moves the cursor by blank-delimited words,
including punctuation, as it skips forward. (Refer to “Blank-Delimited Word” on
page 204.)

b/B
e/E

The b (back) key moves the cursor backward to the first letter of the previous word.
The B key moves the cursor backward by blank-delimited words. Similarly the e key
moves the cursor to the end of the next word; E moves it to the end of the next blankdelimited word.

Moving the Cursor by Lines
j/k

The RETURN key moves the cursor to the beginning of the next line; the j and DOWN ARROW
keys move the cursor down one line to the character just below the current character
(Figure 6-11). If no character appears immediately below the current character, the
cursor moves to the end of the next line. The cursor will not move past the last line
of text in the work buffer.
The k and UP ARROW keys are similar to the j and DOWN ARROW keys but work in the opposite direction. The minus (–) key is similar to the RETURN key but works in the opposite
direction.

needed.as
–

k

with.their
RETURN

j

working..To
Figure 6-11

Moving the cursor by lines

Command Mode: Moving the Cursor

177

{

H

(

M
Cursor

)

L
}

Figure 6-12

Moving the cursor by sentences, paragraphs, H, M, and L

Moving the Cursor by Sentences and Paragraphs
)/(
}/{

The ) and } keys move the cursor forward to the beginning of the next sentence or
the next paragraph, respectively (Figure 6-12). The ( and { keys move the cursor
backward to the beginning of the current sentence or paragraph, respectively. You
can find more information on sentences and paragraphs starting on page 204.

Moving the Cursor Within the Screen
H/M/L

The H (home) key positions the cursor at the left end of the top line of the screen,
the M (middle) key moves the cursor to the middle line, and the L (lower) key moves
it to the bottom line (Figure 6-12).

Viewing Different Parts of the Work Buffer
The screen displays a portion of the text that is in the Work buffer. You can display
the text preceding or following the text on the screen by scrolling the display. You
can also display a portion of the Work buffer based on a line number.
CONTROL-D
CONTROL-U

Press CONTROL-D to scroll the screen down (forward) through the file so that vim displays
half a screen of new text. Use CONTROL-U to scroll the screen up (backward) by the same
amount. If you precede either of these commands with a number, vim scrolls that
number of lines each time you press CONTROL-D or CONTROL-U for the rest of the session
(unless you again change the number of lines to scroll). See page 198 for a discussion
of the scroll parameter.

CONTROL-F
CONTROL-B

The CONTROL-F (forward) and CONTROL-B (backward) keys display almost a whole screen
of new text, leaving a couple of lines from the previous screen for continuity. On
many keyboards you can use the PAGE DOWN and PAGE UP keys in place of CONTROL-F and
CONTROL-B , respectively.

178 Chapter 6 The vim Editor

I

i

a

A

This.is.a.line.of.text.
Figure 6-13
Line numbers (G)

The I, i, a, and A commands

When you enter a line number followed by G (goto), vim positions the cursor on that
line in the Work buffer. If you press G without a number, vim positions the cursor on
the last line in the Work buffer. Line numbers are implicit; the file does not need to
have actual line numbers for this command to work. Refer to “Line numbers” on
page 197 if you want vim to display line numbers.

Input Mode
The Insert, Append, Open, Change, and Replace commands put vim in Input mode.
While vim is in this mode, you can put new text into the Work buffer. To return vim
to Command mode when you finish entering text, press the ESCAPE key. Refer to “Show
mode” on page 198 if you want vim to remind you when it is in Input mode (it does
by default).

Inserting Text
Insert (i/I)

The i (Insert) command puts vim in Input mode and places the text you enter before
the current character. The I command places text at the beginning of the current line
(Figure 6-13). Although the i and I commands sometimes overwrite text on the
screen, the characters in the Work buffer are not changed; only the display is affected.
The overwritten text is redisplayed when you press ESCAPE and vim returns to Command mode. Use i or I to insert a few characters or words into existing text or to insert
text in a new file.

Appending Text
Append (a/A)

The a (Append) command is similar to the i command, except that it places the text
you enter after the current character (Figure 6-13). The A command places the text
after the last character on the current line.

Opening a Line for Text
Open (o/O)

The o (Open) and O commands open a blank line within existing text, place the cursor at the beginning of the new (blank) line, and put vim in Input mode. The O
command opens a line above the current line; the o command opens one below the
current line. Use these commands when you are entering several new lines within
existing text.

Command Mode: Deleting and Changing Text 179

Replacing Text
Replace (r/R)

The r and R (Replace) commands cause the new text you enter to overwrite (replace)
existing text. The single character you enter following an r command overwrites the
current character. After you enter that character, vim returns to Command
mode—you do not need to press the ESCAPE key.
The R command causes all subsequent characters to overwrite existing text until you
press ESCAPE to return vim to Command mode.

Replacing TABs
tip The Replace commands might appear to behave strangely when you replace TAB characters. TAB
characters can appear as several SPACEs—until you try to replace them. A TAB is one character and
is replaced by a single character. Refer to “Invisible characters” on page 197 for information on
displaying TABs as visible characters.

Quoting Special Characters in Input Mode
CONTROL-V

While you are in Input mode, you can use the Quote command, CONTROL-V, to enter any
character into the text, including characters that normally have special meaning to
vim. Among these characters are CONTROL-L (or CONTROL-R), which redraws the screen;
CONTROL-W, which backs the cursor up a word to the left; CONTROL-M, which enters a NEWLINE;
and ESCAPE , which ends Input mode.
To insert one of these characters into the text, type CONTROL-V followed by the character.
quotes the single character that follows it. For example, to insert the
sequence ESCAPE [2J into a file you are creating in vim, you would type the character
sequence CONTROL-V ESCAPE [2J. This character sequence clears the screen of a DEC VT100 and other similar terminals. Although you would not ordinarily want to type this
sequence into a document, you might want to use it or another ESCAPE sequence in a
shell script you are creating in vim. Refer to Chapter 10 for information about writing
shell scripts.
CONTROL-V

Command Mode: Deleting and Changing Text
This section describes the commands to delete and replace, or change, text in the document you are editing. The Undo command is covered here because it allows you to
restore deleted or changed text.

Undoing Changes
Undo (u/U)

The u command (Undo) restores text that you just deleted or changed by mistake. A
single Undo command restores only the most recently deleted text. If you delete a line
and then change a word, the first Undo restores only the changed word; you have to
give a second Undo command to restore the deleted line. With the compatible parameter (page 168) set, vim can undo only the most recent change. The U command

180 Chapter 6 The vim Editor

restores the last line you changed to the way it was before you started changing it,
even after several changes.

Deleting Characters
Delete character
(x/X)

The x command deletes the current character. You can precede the x command by a
Repeat Factor (page 206) to delete several characters on the current line, starting
with the current character. The X command deletes the character to the left of the
cursor.

Deleting Text
Delete (d/D)

The d (Delete) command removes text from the Work buffer. The amount of text that
d removes depends on the Repeat Factor and the Unit of Measure (page 203). After
the text is deleted, vim is still in Command mode.

Use dd to delete a single line
tip The command d RETURN deletes two lines: the current line and the following one. Use dd to delete
just the current line, or precede dd by a Repeat Factor (page 206) to delete several lines.

You can delete from the current cursor position up to a specific character on the same
line. To delete up to the next semicolon (;), give the command dt; (see page 183 for
more information on the t command). To delete the remainder of the current line, use
D or d$. Table 6-1 lists some Delete commands. Each command, except the last
group that starts with dd, deletes from/to the current character.

Exchange characters and lines
tip If two characters are out of order, position the cursor on the first character and give the commands
xp.
If two lines are out of order, position the cursor on the first line and give the commands ddp.
See page 191 for more information on the Put commands.

Table 6-1

Delete command examples

Command

Result

dl

Deletes current character (same as the x command)

d0

Deletes from beginning of line

d^

Deletes from first character of line (not including leading SPACEs or TABs)

dw

Deletes to end of word

d3w

Deletes to end of third word

db

Deletes from beginning of word

dW

Deletes to end of blank-delimited word

Command Mode: Deleting and Changing Text 181

Table 6-1

Delete command examples (continued)

Command

Result

dB

Deletes from beginning of blank-delimited word

d7B

Deletes from seventh previous beginning of blank-delimited word

d)

Deletes to end of sentence

d4)

Deletes to end of fourth sentence

d(

Deletes from beginning of sentence

d}

Deletes to end of paragraph

d{

Deletes from beginning of paragraph

d7{

Deletes from seventh paragraph preceding beginning of paragraph

d/text

Deletes up to next occurrence of word text

dfc

Deletes on current line up to and including next occurrence of character c

dtc

Deletes on current line up to next occurrence of c

D

Deletes to end of line

d$

Deletes to end of line

dd

Deletes current line

5dd

Deletes five lines starting with current line

dL

Deletes through last line on screen

dH

Deletes from first line on screen

dG

Deletes through end of Work buffer

d1G

Deletes from beginning of Work buffer

Changing Text
Change (c/C)

The c (Change) command replaces existing text with new text. The new text does not
have to occupy the same space as the existing text. You can change a word to several
words, a line to several lines, or a paragraph to a single character. The C command
replaces the text from the cursor position to the end of the line.
The c command deletes the amount of text specified by the Repeat Factor and the
Unit of Measure (page 203) and puts vim in Input mode. When you finish entering
the new text and press ESCAPE, the old word, line, sentence, or paragraph is changed to
the new one. Pressing ESCAPE without entering new text deletes the specified text (that
is, it replaces the specified text with nothing).

182 Chapter 6 The vim Editor

Table 6-2 lists some Change commands. Except for the last two, each command
changes text from/to the current character.

dw works differently from cw
tip The dw command deletes all characters through (including) the SPACE at the end of a word. The
cw command changes only the characters in the word, leaving the trailing SPACE intact.

Table 6-2

Change command examples

Command

Result

cl

Changes current character

cw

Changes to end of word

c3w

Changes to end of third word

cb

Changes from beginning of word

cW

Changes to end of blank-delimited word

cB

Changes from beginning of blank-delimited word

c7B

Changes from beginning of seventh previous blank-delimited word

c$

Changes to end of line

c0

Changes from beginning of line

c)

Changes to end of sentence

c4)

Changes to end of fourth sentence

c(

Changes from beginning of sentence

c}

Changes to end of paragraph

c{

Changes from beginning of paragraph

c7{

Changes from beginning of seventh preceding paragraph

ctc

Changes on current line up to next occurrence of c

C

Changes to end of line

cc

Changes current line

5cc

Changes five lines starting with current line

Replacing Text
Substitute (s/S)

The s and S (Substitute) commands also replace existing text with new text
(Table 6-3). The s command deletes the current character and puts vim into Input
mode. It has the effect of replacing the current character with whatever you type until
you press ESCAPE. The S command does the same thing as the cc command: It changes

Searching and Substituting 183

the current line. The s command replaces characters only on the current line. If you
specify a Repeat Factor before an s command and this action would replace more
characters than are present on the current line, s changes characters only to the end
of the line (same as C).

Table 6-3

Substitute command examples

Command

Result

s

Substitutes one or more characters for current character

S

Substitutes one or more characters for current line

5s

Substitutes one or more characters for five characters, starting with current
character

Changing Case
The tilde (~) character changes the case of the current character from uppercase to
lowercase, or vice versa. You can precede the tilde with a number to specify the number of characters you want the command to affect. For example, the command 5~
transposes the next five characters starting with the character under the cursor, but
will not transpose characters past the end of the current line.

Searching and Substituting
Searching for and replacing a character, a string of text, or a string that is matched
by a regular expression is a key feature of any editor. The vim editor provides simple
commands for searching for a character on the current line. It also provides more
complex commands for searching for—and optionally substituting for—single and
multiple occurrences of strings or regular expressions anywhere in the Work buffer.

Searching for a Character
Find (f/F)

You can search for and move the cursor to the next occurrence of a specified character on the current line using the f (Find) command. Refer to “Moving the Cursor to
a Specific Character” on page 175.

Find (t/T)

The next two commands are used in the same manner as the Find commands. The t
command places the cursor on the character before the next occurrence of the specified character. The T command places the cursor on the character after the previous
occurrence of the specified character.
A semicolon (;) repeats the last f, F, t, or T command.
You can combine these search commands with other commands. For example, the
command d2fq deletes the text from the current character to the second occurrence
of the letter q on the current line.

184 Chapter 6 The vim Editor

Searching for a String
Search (//?)

The vim editor can search backward or forward through the Work buffer to find a
string of text or a string that matches a regular expression (Appendix A). To find the
next occurrence of a string (forward), press the forward slash (/) key, enter the text
you want to find (called the search string), and press RETURN. When you press the slash
key, vim displays a slash on the status line. As you enter the string of text, it is also
displayed on the status line. When you press RETURN, vim searches for the string. If this
search is successful, vim positions the cursor on the first character of the string. If you
use a question mark (?) in place of the forward slash, vim searches for the previous
occurrence of the string. If you need to include a forward slash in a forward search
or a question mark in a backward search, you must quote it by preceding it with a
backslash (\).

Two distinct ways of quoting characters
tip You use CONTROL-V to quote special characters in text that you are entering into a file (page 179).
This section discusses the use of a backslash (\) to quote special characters in a search string.
The two techniques of quoting characters are not interchangeable.
Next (n/N)

The N and n keys repeat the last search but do not require you to reenter the search
string. The n key repeats the original search exactly, and the N key repeats the search
in the opposite direction of the original search.
If you are searching forward and vim does not find the search string before it gets to
the end of the Work buffer, the editor typically wraps around and continues the
search at the beginning of the Work buffer. During a backward search, vim wraps
around from the beginning of the Work buffer to the end. Also, vim normally performs case-sensitive searches. Refer to “Wrap scan” (page 199) and “Ignore case in
searches” (page 197) for information about how to change these search parameters.

Normal Versus Incremental Searches
When vim performs a normal search (its default behavior), you enter a slash or question mark followed by the search string and press RETURN. The vim editor then moves
the cursor to the next or previous occurrence of the string you are searching for.
When vim performs an incremental search, you enter a slash or question mark. As
you enter each character of the search string, vim moves the highlight to the next
or previous occurrence of the string you have entered so far. When the highlight is
on the string you are searching for, you must press RETURN to move the cursor to the
highlighted string. If the string you enter does not match any text, vim does not
highlight anything.
The type of search that vim performs depends on the incsearch parameter (page 197).
Give the command :set incsearch to turn on incremental searching; use noincsearch
to turn it off. When you set the compatible parameter (page 168), vim turns off incremental searching.

Searching and Substituting 185

Special Characters in Search Strings
Because the search string is a regular expression, some characters take on a special
meaning within the search string. The following paragraphs list some of these characters. See also “Extended Regular Expressions” on page 1017.
The first two items in the following list (^ and $) always have their special meanings
within a search string unless you quote them by preceding them with a backslash (\).
You can turn off the special meanings within a search string for the rest of the items
in the list by setting the nomagic parameter. For more information refer to “Allow
special characters in searches” on page 196.

^

Beginning-of-Line Indicator

When the first character in a search string is a caret (also called a circumflex), it
matches the beginning of a line. For example, the command /^the finds the next line
that begins with the string the.

$ End-of-Line Indicator
A dollar sign matches the end of a line. For example, the command /!$ finds the next
line that ends with an exclamation point and / $ matches the next line that ends with
a SPACE.

.

Any-Character Indicator

A period matches any character, anywhere in the search string. For example, the command /l..e finds line, followed, like, included, all memory, or any other word or
character string that contains an l followed by any two characters and an e. To search
for a period, use a backslash to quote the period (\.).

\> End-of-Word Indicator
This pair of characters matches the end of a word. For example, the command /s\>
finds the next word that ends with an s. Whereas a backslash (\) is typically used to
turn off the special meaning of a character, the character sequence \> has a special
meaning, while > alone does not.

\<

Beginning-of-Word Indicator

This pair of characters matches the beginning of a word. For example, the command /\

Finds the next occurrence of the word and
Example: and

/^The

Finds the next line that starts with The
Examples:
The . . .
There . . .

/^[0-9][0-9])

Finds the next line that starts with a two-digit number followed by a right
parenthesis
Examples:
77)...
01)...
15)...

/ \<[adr]

Finds the next word that starts with a, d, or r
Examples: apple drive road argument right

/^[A-Za-z]

Finds the next line that starts with an uppercase or lowercase letter
Examples:
will not find a line starting with the number 7 . . .
Dear Mr. Jones . . .
in the middle of a sentence like this . . .

Substituting One String for Another
A Substitute command combines the effects of a Search command and a Change
command. That is, it searches for a string (regular expression) just as the / command

Searching and Substituting 187

does, allowing the same special characters discussed in the previous section. When
it finds the string or matches the regular expression, the Substitute command
changes the string or regular expression it matches. The syntax of the Substitute
command is
:[g][address]s/search-string/replacement-string[/option]
As with all commands that begin with a colon, vim executes a Substitute command
from the status line.

The Substitute Address
If you do not specify an address, Substitute searches only the current line. If you use
a single line number as the address, Substitute searches that line. If the address is two
line numbers separated by a comma, Substitute searches those lines and the lines
between them. Refer to “Line numbers” on page 197 if you want vim to display line
numbers. Wherever a line number is allowed in the address, you might also use an
address string enclosed between slashes. The vim editor operates on the next line that
the address string matches. When you precede the first slash of the address string
with the letter g (for global), vim operates on all lines in the file that the address string
matches. (This g is not the same as the one that goes at the end of the Substitute command to cause multiple replacements on a single line; see “Searching for and
Replacing Strings” on the next page).
Within the address, a period represents the current line, a dollar sign represents the
last line in the Work buffer, and a percent sign represents the entire Work buffer. You
can perform address arithmetic using plus and minus signs. Table 6-5 shows some
examples of addresses.

Table 6-5

Addresses

Address

Portion of Work buffer addressed

5

Line 5

77,100

Lines 77 through 100 inclusive

1,.

Beginning of Work buffer through current line

.,$

Current line through end of Work buffer

1,$

Entire Work buffer

%

Entire Work buffer

/pine/

The next line containing the word pine

g/pine/

All lines containing the word pine

.,.+10

Current line through tenth following line (11 lines in all)

188 Chapter 6 The vim Editor

Searching for and Replacing Strings
An s comes after the address in the command syntax, indicating that this is a Substitute command. A delimiter follows the s, marking the beginning of the search-string.
Although the examples in this book use a forward slash, you can use as a delimiter
any character that is not a letter, number, blank, or backslash. You must use the same
delimiter at the end of the search-string.
Next comes the search-string. It has the same format as the search string in the / command and can include the same special characters (page 185). (The search-string is
a regular expression; refer to Appendix A for more information.) Another delimiter
marks the end of the search-string and the beginning of the replacement-string.
The replacement-string replaces the text matched by the search-string and is typically
followed by the delimiter character. You can omit the final delimiter when no option
follows the replacement-string; a final delimiter is required if an option is present.
Several characters have special meanings in the search-string, and other characters
have special meanings in the replacement-string. For example, an ampersand (&) in
the replacement-string represents the text that was matched by the search-string. A
backslash in the replacement-string quotes the character that follows it. Refer to
Table 6-6 and Appendix A.

Table 6-6

Search and replace examples

Command

Result

:s/bigger/biggest/

Replaces the first occurrence of the string bigger on the current line
with biggest
Example:
bigger  biggest

:1,.s/Ch 1/Ch 2/g

Replaces every occurrence of the string Ch 1, before or on the
current line, with the string Ch 2
Examples:
Ch 1  Ch 2
Ch 12  Ch 22

:1,$s/ten/10/g

Replaces every occurrence of the string ten with the string 10
Examples:
ten 

10

often  of10
tenant 
:g/chapter/s/ten/10/

10ant

Replaces the first occurrence of the string ten with the string 10 on
all lines containing the word chapter
Examples:
chapter ten  chapter 10
chapters will often  chapters will of10

Searching and Substituting 189

Table 6-6

Search and replace examples (continued)

Command

Result

:%s/\/10/g

Replaces every occurrence of the word ten with the string 10
Example:
ten 

:.,.+10s/every/each/g

10

Replaces every occurrence of the string every with the string each
on the current line through the tenth following line
Examples:
every 

each

everything  eachthing
:s/\/"&"/

Replaces the word short on the current line with "short" (enclosed
within quotation marks)
Example:
the shortest of the short  the shortest of the "short"

Normally, the Substitute command replaces only the first occurrence of any text that
matches the search-string on a line. If you want a global substitution—that is, if you
want to replace all matching occurrences of text on a line—append the g (global)
option after the delimiter that ends the replacement-string. Another useful option, c
(check), causes vim to ask whether you would like to make the change each time it
finds text that matches the search-string. Pressing y replaces the search-string, q terminates the command, l (last) makes the replacement and quits, a (all) makes all
remaining replacements, and n continues the search without making that
replacement.
The address string need not be the same as the search-string. For example,
:/candle/s/wick/flame/

substitutes flame for the first occurrence of wick on the next line that contains the
string candle. Similarly,
:g/candle/s/wick/flame/

performs the same substitution for the first occurrence of wick on each line of the file
containing the string candle and
:g/candle/s/wick/flame/g

performs the same substitution for all occurrences of wick on each line that contains
the string candle.
If the search-string is the same as the address, you can leave the search-string blank.
For example, the command :/candle/s//lamp/ is equivalent to the command
:/candle/s/candle/lamp/.

190 Chapter 6 The vim Editor

Miscellaneous Commands
This section describes three commands that do not fit naturally into any other
groups.

Join
Join (J)

The J (Join) command joins the line below the current line to the end of the current
line, inserting a SPACE between what was previously two lines and leaving the cursor
on this SPACE. If the current line ends with a period, vim inserts two SPACEs.
You can always “unjoin” (break) a line into two lines by replacing the SPACE or SPACEs
where you want to break the line with a RETURN.

Status
Status (CONTROL-G)

The Status command, CONTROL-G , displays the name of the file you are editing, information about whether the file has been modified or is a readonly file, the number
of the current line, the total number of lines in the Work buffer, and the percentage
of the Work buffer preceding the current line. You can also use :f to display status
information. Following is a sample status line:
"/usr/share/dict/words" [readonly] line 28501 of 98569 --28%-- col 1

. (Period)
Repeat last
command (.)

The . (period) command repeats the most recent command that made a change. If you
had just given a d2w command (delete the next two words), for example, the . command would delete the next two words. If you had just inserted text, the . command
would repeat the insertion of the same text. This command is useful if you want to
change some occurrences of a word or phrase in the Work buffer. Search for the first
occurrence of the word (use /) and then make the change you want (use cw). You can
then use n to search for the next occurrence of the word and . to make the same
change to it. If you do not want to make the change, give the n command again to
find the next occurrence.

Copying, Moving, and Deleting Text
The vim editor has a General-Purpose buffer and 26 Named buffers that can hold text
during an editing session. These buffers are useful if you want to move or copy a portion of text to another location in the Work buffer. A combination of the Delete and
Put commands removes text from one location in the Work buffer and places it in
another location in the Work buffer. The Yank and Put commands copy text to
another location in the Work buffer, without changing the original text.

Copying, Moving, and Deleting Text

191

The General-Purpose Buffer
The vim editor stores the text that you most recently changed, deleted, or yanked
(covered below) in the General-Purpose buffer. The Undo command retrieves text
from the General-Purpose buffer when it restores text.

Copying Text to the Buffer
Yank (y/Y)

The Yank command (y) is identical to the Delete (d) command except that it does not
delete text from the Work buffer. The vim editor places a copy of the yanked text in the
General-Purpose buffer. You can then use a Put command to place another copy of it
elsewhere in the Work buffer. Use the Yank command just as you use the Delete command. The uppercase Y command yanks an entire line into the General-Purpose buffer.

Use yy to yank one line
tip Just as d RETURN deletes two lines, so y RETURN yanks two lines. Use the yy command to yank and
dd to delete the current line.

D works differently from Y
tip The D command (page 180) does not work in the same manner as the Y command. Whereas D
deletes to the end of the line, Y yanks the entire line regardless of the cursor position.

Copying Text from the Buffer
Put (p/P)

The Put commands, p and P, copy text from the General-Purpose buffer to the Work
buffer. If you delete or yank characters or words into the General-Purpose buffer, p
inserts them after the current character, and P inserts them before this character. If you
delete or yank lines, sentences, or paragraphs, P inserts the contents of the GeneralPurpose buffer before the current line, and p inserts them after the current line.
Put commands do not destroy the contents of the General-Purpose buffer. Thus you
can place the same text at several points within the file by giving one Delete or Yank
command and several Put commands.

Deleting Text Copies It into the Buffer
Any of the Delete commands described earlier in this chapter (page 179) place the
deleted text in the General-Purpose buffer. Just as you can use the Undo command
to put the deleted text back where it came from, so you can use a Put command to
put the deleted text at another location in the Work buffer.
Suppose you delete a word from the middle of a sentence by giving the dw command
and then move the cursor to a SPACE between two words and give a p command; vim
places the word you just deleted at the new location. If you delete a line using the dd
command and then move the cursor to the line below the line where you want the
deleted line to appear and give a P command, vim places the line at the new location.

192 Chapter 6 The vim Editor

optional

Named Buffers
You can use a Named buffer with any of the Delete, Yank, or Put commands. Each
of the 26 Named buffers is named by a letter of the alphabet. Each Named buffer can
store a different block of text and you can recall each block as needed. Unlike the
General-Purpose buffer, vim does not change the contents of a Named buffer unless
you issue a command that specifically overwrites that buffer. The vim editor maintains the contents of the Named buffers throughout an editing session.
The vim editor stores text in a Named buffer if you precede a Delete or Yank command with a double quotation mark (") and a buffer name (for example, "kyy
yanks a copy of the current line into buffer k). You can put information from the
Work buffer into a Named buffer in two ways. First, if you give the name of the
buffer as a lowercase letter, vim overwrites the contents of the buffer when it deletes
or yanks text into the buffer. Second, if you use an uppercase letter for the buffer
name, vim appends the newly deleted or yanked text to the end of the buffer. This
feature enables you to collect blocks of text from various sections of a file and
deposit them at one place in the file with a single command. Named buffers are also
useful when you are moving a section of a file and do not want to give a Put command immediately after the corresponding Delete command, and when you want
to insert a paragraph, sentence, or phrase repeatedly in a document.
If you have one sentence you use throughout a document, you can yank that sentence into a Named buffer and put it wherever you need it by using the following
procedure: After entering the first occurrence of the sentence and pressing ESCAPE to
return to Command mode, leave the cursor on the line containing the sentence. (The
sentence must appear on a line or lines by itself for this procedure to work.) Then
yank the sentence into Named buffer a by giving the "ayy command (or "a2yy if
the sentence takes up two lines). Now anytime you need the sentence, you can return
to Command mode and give the command "ap to put a copy of the sentence below
the current line.
This technique provides a quick and easy way to insert text that you use frequently
in a document. For example, if you were editing a legal document, you might store
the phrase The Plaintiff alleges that the Defendant in a Named buffer to save yourself
the trouble of typing it every time you want to use it. Similarly, if you were creating
a letter that frequently used a long company name, such as National Standards Institute, you might put it into a Named buffer.

Numbered Buffers
In addition to the 26 Named buffers and 1 General-Purpose buffer, 9 Numbered buffers are available. They are, in one sense, readonly buffers. The vim editor fills them
with the nine most recently deleted chunks of text that are at least one line long. The
most recently deleted text is held in "1, the next most recent in "2, and so on. If

Reading and Writing Files

193

you delete a block of text and then give other vim commands so that you cannot
reclaim the deleted text with an Undo command, you can use "1p to paste the most
recently deleted chunk of text below the location of the cursor. If you have deleted
several blocks of text and want to reclaim a specific one, proceed as follows: Paste
the contents of the first buffer with "1p. If the first buffer does not hold the text you
are looking for, undo the paste operation with u and then give the period (.) command to repeat the previous command. The Numbered buffers work in a unique
way with the period command: Instead of pasting the contents of buffer "1, the
period command pastes the contents of the next buffer ("2). Another u and period
would replace the contents of buffer "2 with that of buffer "3, and so on through
the nine buffers.

Reading and Writing Files
Exit (ZZ)

The vim editor reads a disk file into the Work buffer when you specify a filename on
the command line you use to call vim. A ZZ command that terminates an editing session writes the contents of the Work buffer back to the disk file. This section discusses
other ways of reading text into the Work buffer and writing it to a file.

Reading Files
Read (:r)

The Read command reads a file into the Work buffer. The new file does not overwrite
any text in the Work buffer but rather is positioned following the single address you
specify (or the current line if you do not specify an address). You can use an address
of 0 to read the file into the beginning of the Work buffer. The Read command has
the following syntax:
:[address]r [filename]
As with other commands that begin with a colon, when you enter the colon it appears
on the status line. The filename is the pathname of the file that you want to read and
must be terminated by RETURN. If you omit the filename, vim reads from the disk the
file you are editing.

Writing Files
Write (:w)

The Write command writes part or all of the Work buffer to a file. You can specify
an address to write part of the Work buffer and a filename to specify a file to receive
the text. If you do not specify an address or filename, vim writes the entire contents
of the Work buffer to the file you are editing, updating the file on the disk.
During a long editing session, it is a good idea to use the Write command occasionally. If a problem develops later, a recent copy of the Work buffer is then safe on the
disk. If you use a :q! command to exit from vim, the disk file reflects the version of
the Work buffer at the time you last used a Write command.

194 Chapter 6 The vim Editor

The Write command has two syntaxes:
:[address]w[!] [filename]
:[address]w>> filename
The second syntax appends text to an existing file. The address specifies the portion
of the Work buffer vim will write to the file. The address follows the form of the
address that the Substitute command uses (page 187). If you do not specify an
address, vim writes the entire contents of the Work buffer. The optional filename is
the pathname of the file you are writing to. If you do not specify a filename, vim
writes to the file you are editing.
w!

Because the Write command can quickly destroy a large amount of work, vim
demands that you enter an exclamation point (!) following the w as a safeguard
against accidentally overwriting a file. The only times you do not need an exclamation point are when you are writing out the entire contents of the Work buffer to
the file being edited (using no address and no filename) and when you are writing
part or all of the Work buffer to a new file. When you are writing part of the file
to the file being edited or when you are overwriting another file, you must use an
exclamation point.

Identifying the Current File
The File command (:f) provides the same information as the Status command (CONTROL-G;
page 190). The filename the File command displays is the one the Write command uses
if you give a :w command without a filename.

Setting Parameters
You can tailor the vim editor to your needs and habits by setting vim parameters.
Parameters perform such functions as displaying line numbers, automatically inserting RETURNs, and establishing incremental and nonstandard searches.
You can set parameters in several ways. For example, you can set them to establish
the environment for the current editing session while you are using vim. Alternatively,
you can set the parameters in your ~/.bash_profile (bash) or ~/.tcshrc (tcsh) shell
startup file or in the vim startup file, ~/.vimrc. When you set the parameters in any
of these files, the same customized environment will be available each time vim starts
and you can begin editing immediately.

Setting Parameters from Within vim
To set a parameter while you are using vim, enter a colon (:), the word set, a SPACE,
and the parameter (refer to “Parameters” on the next page). The command appears
on the status line as you type it and takes effect when you press RETURN. The following
command establishes incremental searches for the current editing session:
:set incsearch

Setting Parameters 195

Setting Parameters in a Startup File
VIMINIT

If you are using bash , you can put a line with the following syntax in your
~/.bash_profile startup file (page 278):
export VIMINIT='set param1 param2 ...'
Replace param1 and param2 with parameters selected from Table 6-7. VIMINIT is
a shell variable that vim reads. The following statement causes vim to ignore the case
of characters in searches, display line numbers, use the TC Shell to execute Linux
commands, and wrap text 15 characters from the right edge of the screen:
export VIMINIT='set ignorecase number shell=/bin/tcsh wrapmargin=15'

If you use the parameter abbreviations, it looks like this:
export VIMINIT='set ic nu sh=/bin/tcsh wm=15'

If you are using tcsh, put the following line in your ~/.tcshrc startup file (page 372):
setenv VIMINIT 'set param1 param2 ...'
Again, replace param1 and param2 with parameters from Table 6-7. The values
between the single quotation marks are the same as those shown in the preceding
examples.

The .vimrc Startup File
Instead of setting vim parameters in your shell startup file, you can create a ~/.vimrc
file in your home directory and set the parameters there. Creating a .vimrc file causes
vim to start with the compatible parameter unset (page 168). Lines in a .vimrc file follow this syntax:
set param1 param2 ...
Following are examples of .vimrc files that perform the same function as VIMINIT
described previously:
$ cat ~/.vimrc
set ignorecase
set number
set shell=/bin/tcsh
set wrapmargin=15
$ cat ~/.vimrc
set ic nu sh=/bin/tcsh wm=15

Parameters set by the VIMINIT variable take precedence over those set in the
.vimrc file.

Parameters
Table 6-7 lists some of the most useful vim parameters. The vim editor displays a complete list of parameters and indicates how they are currently set when you give the
command :set all followed by a RETURN. The command :set RETURN displays a list of

196 Chapter 6 The vim Editor

options that are set to values other than their default values. Two classes of parameters exist: those that contain an equal sign (and can take on a value) and those that
are optionally prefixed with no (switches that are on or off). You can change the sense
of a switch parameter by giving the command :set [no]param. For example, give the
command :set number (or :set nonumber) to turn on (or off) line numbering. To
change the value of a parameter that takes on a value (and uses an equal sign), give
a command such as :set shiftwidth=15.
Most parameters have abbreviations—for example, nu for number, nonu for
nonumber, and sw for shiftwidth. The abbreviations are listed in the left column of
Table 6-7, following the name of the parameter.

Table 6-7

Parameters

Parameter

Effect

Allow special
characters in searches

Refer to “Special Characters in Search Strings” on page 185. By
default the following characters have special meanings when used in
a search string:

magic

.

[

]

*

When you set the nomagic parameter, these characters no longer
have special meanings. The magic parameter restores their special
meanings.
The ^ and $ characters always have special meanings within search
strings, regardless of how you set this parameter.
Automatic indention
autoindent, ai

The automatic indention feature works with the shiftwidth parameter
to provide a regular set of indentions for programs or tabular
material. This feature is off by default. You can turn it on by setting
autoindent and turn it off by setting noautoindent.
When automatic indention is on and vim is in Input mode, pressing
CONTROL-T moves the cursor from the left margin (or an indention) to
the next indention position, pressing RETURN moves the cursor to the

left side of the next line under the first character of the previous line,
and pressing CONTROL-D backs up over indention positions. The
CONTROL-T and CONTROL-D keys work only before text is placed on a line.
Automatic write
autowrite, aw

Flash
flash, fl

By default vim asks you before writing out the Work buffer when you
have not explicitly told it to do so (as when you give a :n command
to edit the next file). The autowrite option causes vim to write the
Work buffer automatically when you use commands, such as :n, to
edit to another file. You can disable this parameter by setting the
noautowrite or noaw option.
The vim editor normally causes the terminal to beep when you give
an invalid command or press ESCAPE when it is in Command mode.
Setting the parameter flash causes the terminal to flash instead of
beep. Set noflash to cause it to beep. Not all terminals and emulators
support this parameter.

Setting Parameters 197

Table 6-7

Parameters (continued)

Parameter

Effect

Ignore case in searches

The vim editor normally performs case-sensitive searches,
differentiating between uppercase and lowercase letters. It performs
case-insensitive searches when you set the ignorecase parameter.
Set noignorecase to restore case-sensitive searches.

ignorecase, ic

Incremental search
incsearch, is

Invisible characters
list
Status line
laststatus=n, ls=n

Line numbers
number, nu

Refer to “Normal Versus Incremental Searches” on page 184. By default
vim does not perform incremental searches. To cause vim to perform
incremental searches, set the parameter incsearch. To cause vim not
to perform incremental searches, set the parameter noincsearch.
To cause vim to display each TAB as ^I and to mark the end of each
line with a $, set the list parameter. To display TABs as whitespace and
not mark ends of lines, set nolist.
This parameter displays a status line that shows the name of the file
you are editing, a [+] if the file has been changed since it was last
written out, and the position of the cursor. When setting the
parameter laststatus=n, n equal to 0 (zero) turns off the status line,
1 displays the status line when at least two vim windows are
displayed, and 2 always displays the status line.
The vim editor does not normally display the line number associated
with each line. To display line numbers, set the parameter number. To
cause line numbers not to be displayed, set the parameter nonumber.
Line numbers are not part of the file, are not stored with the file, and
are not displayed when the file is printed. They appear on the screen
only while you are using vim.

Line wrap
wrap

Line wrap margin
wrapmargin=nn, wm=nn

The line wrap controls how vim displays lines that are too long to fit
on the screen. To cause vim to wrap long lines and continue them
on the next line, set wrap (set by default). If you set nowrap, vim
truncates long lines at the right edge of the screen.
The line wrap margin causes vim to break the text that you are
inserting at approximately the specified number of characters from
the right margin. The vim editor breaks the text by inserting a NEWLINE
character at the closest blank-delimited word boundary. Setting the
line wrap margin is handy if you want all the text lines to be
approximately the same length. This feature relieves you of the need
to remember to press RETURN to end each line of input.
When setting the parameter wrapmargin=nn, nn is the number of
characters from the right side of the screen where you want vim to
break the text. This number is not the column width of the text but
rather the distance from the end of the text to the right edge of the
screen. Setting the wrap margin to 0 (zero) turns this feature off. By
default the line wrap margin is off (set to 0).

198 Chapter 6 The vim Editor

Table 6-7

Parameters (continued)

Parameter

Effect

Report

This parameter causes vim to display a report on the status line
whenever you make a change that affects at least nn lines. For
example, if report is set to 7 and you delete seven lines, vim displays
the message 7 lines deleted. When you delete six or fewer lines, vim
does not display a message. The default for report is 5.

report=nn

Scroll
scroll=nn, scr=nn

This parameter controls the number of lines that CONTROL-D and
CONTROL-U (page 177) scroll text on the screen. By default scroll is set
to half the window height.
There are two ways to change the value of scroll. First you can enter
a number before pressing CONTROL-D or CONTROL-U; vim sets scroll to
that number. Alternatively, you can set scroll explicitly with
scroll=nn, where nn is the number of lines you want to scroll with
each CONTROL-D or CONTROL-U command.

Shell
shell=path, sh=path

Shift width
shiftwidth=nn, sw=nn

Show match
showmatch, sm

Show mode
showmode, smd

While you are using vim, you can cause it to spawn a new shell. You
can either create an interactive shell (if you want to run several
commands) or run a single command. The shell parameter
determines which shell vim invokes. By default vim sets the shell
parameter to your login shell. To change it, set the parameter
shell=path, where path is the absolute pathname of the shell you want
to use.
This parameter controls the functioning of CONTROL-T and CONTROL-D
in Input mode when automatic indention is on (see “Automatic
indention” in this table). When setting the parameter shiftwidth=nn,
nn is the spacing of the indention positions (8 by default). Setting the
shift width is similar to setting the TAB stops on a typewriter; with
shiftwidth, however, the distance between TAB stops remains
constant.
This parameter is useful for programmers who are working in
languages that use braces ({ }) or parentheses as expression
delimiters (Lisp, C, Tcl, and so on). When showmatch is set and you
are entering code (in Input mode) and type a closing brace or
parenthesis, the cursor jumps briefly to the matching opening brace
or parenthesis (that is, the preceding corresponding element at the
same nesting level). After it highlights the matching element, the
cursor resumes its previous position. When you type a right brace or
parenthesis that does not have a match, vim beeps. Use
noshowmatch to turn off automatic matching.
Set the parameter showmode to display the mode in the lower-right
corner of the screen when vim is in Input mode (default). Set
noshowmode to cause vim not to display the mode.

Advanced Editing Techniques 199

Table 6-7

Parameters (continued)

Parameter

Effect

vi compatibility

Refer to “The compatible Parameter” on page 168. By default, vim
does not attempt to be compatible with vi. To cause vim to be
compatible with vi, set the parameter compatible. To cause vim not
to be compatible with vi, set the parameter nocompatible.

compatible, cp

Wrap scan
wrapscan, ws

By default, when a search for the next occurrence of a search string
reaches the end of the Work buffer, vim continues the search at the
beginning of the Work buffer. The reverse is true with a search for the
previous occurrence of a search string. The nowrapscan parameter
stops the search at either end of the Work buffer. Set the wrapscan
parameter if you want searches to wrap around the ends of the Work
buffer.

Advanced Editing Techniques
This section presents several commands you might find useful once you have become
comfortable using vim.

optional

Using Markers
While you are using vim, you can set and use markers to make addressing more convenient. Set a marker by giving the command mc, where c is any character. (Letters
are preferred because some characters, such as a single quotation mark, have special
meanings when used as markers.) The vim editor does not preserve markers when you
exit from vim.
Once you have set a marker, you can use it in a manner similar to a line number. You
can move the cursor to the beginning of a line that contains a marker by preceding
the marker name with a single quotation mark. For example, to set marker t, position
the cursor on the line you want to mark and give the command mt. During the
remainder of this editing session, unless you reset marker t or delete the line it marks,
you can return to the beginning of the line you marked by giving the command 't.
You can delete all text from the current line through the line containing marker r with
the following command:
d'r

You can use a back tick (‘, also called a grave accent or reverse single quotation
mark) to go to the exact position of the marker on the line. After setting marker t,
you can move the cursor to the location of this marker (not the beginning of the line

200 Chapter 6 The vim Editor

that holds the marker) with the command ‘t. The following command deletes all the
text from the current line up to the character where the marker r was placed; the rest
of the line containing the marker remains intact:
d‘r

You can use markers in addresses of commands instead of line numbers. The following command replaces all occurrences of The with THE on all lines from marker m
to the current line (marker m must precede the current line):
:'m,.s/The/THE/g

Editing Other Files
The following command causes vim to edit the file you specify with filename:
:e[!] [filename]
If you want to save the contents of the Work buffer, you must write it out (using :w)
before you give this command. If you do not want to save the contents of the Work
buffer, vim insists you use an exclamation point to acknowledge that you will lose the
work you did since the last time you wrote out the Work buffer. If you do not supply
a filename, vim edits the file you are working on.
:e!

The command :e! starts an editing session over again. This command returns the
Work buffer to the state it was in the last time you wrote it out or, if you have not
written it out, the state it was in when you started editing the file. It is useful when
you make mistakes while editing a file and decide that it would be easier to start over
than to fix the mistakes.
Because the :e command does not destroy the contents of the General-Purpose or
Named buffers, you can store text from one file in a buffer, use a :e command to edit
a second file, and put text from the buffer in the second file.

:e#

The command :e# closes the current file and opens the last file you were editing,
placing the cursor on the line that it was on when you last closed the file. If you do
not save the file you are working on before you give this command, vim prompts
you to do so. Setting the autowrite parameter (page 196) will not stop vim from
prompting you.

:n
:rew

The :e# command can help you copy blocks of text from one file to another. When
you call vim with the names of several files as arguments, you can use :n to edit the
next file, :e# to edit the file you just edited, and :rew to rewind the sequence of files
so that you are editing the first file again. As you move between files, you can copy
text from one file into a buffer and paste that text into another file. You can use :n!
to force vim to close a file without writing out changes before it opens the next file.

Macros and Shortcuts
:map

The vim editor allows you to create both macros and shortcuts. The :map command
defines a key or sequence of keys that perform some action in Command mode. The

Advanced Editing Techniques 201

following command maps CONTROL-X to the commands that will find the next left
bracket on the current line (f[), delete all characters from that bracket to the next
right bracket (df]) on the same line, delete the next character (x), move the cursor
down two lines (2j), and finally move the cursor to the beginning of the line (0):
:map ^X f[df]x2j0

Although you can use ESCAPE and CONTROL sequences, it is a good idea to avoid remapping characters or sequences that are vim commands. Type :map by itself to see a list
of the current mappings. You might need to use CONTROL-V (page 179) to quote some of
the characters you want to enter into the :map string.
:abbrev

The :abbrev command is similar to :map but creates abbreviations you can use while
in Input mode. When you are in Input mode and type a string you have defined with
:abbrev, followed by a SPACE, vim replaces the string and the SPACE with the characters
you specified when you defined the string. For ease of use, avoid common sequences
of characters when creating abbreviations. The following command defines ZZ as an
abbreviation for Sam the Great:
:abbrev ZZ Sam the Great

Even though ZZ is a vim command, it is used only in Command mode. It has no special
meaning in Input mode, where you use abbreviations.

Executing Shell Commands from Within vim
:sh

You can execute shell commands in several ways while you are using vim. For
instance, you can spawn a new interactive shell by giving the following command and
pressing RETURN:
:sh

The vim shell parameter (page 198) determines which shell is spawned (usually bash
or tcsh). By default shell is the same as your login shell.
After you have finished your work in the shell, you can return to vim by exiting from
the shell (press CONTROL-D or give an exit command).

If :sh does not work correctly
tip The :sh command might behave strangely depending on how the shell has been configured. You
might get warnings with the :sh command or it might even hang. Experiment with the :sh command to be sure it works correctly with your configuration. If it does not, you might want to set
the vim shell parameter to another shell before using :sh. For example, the following command
causes vim to use tcsh with the :sh command:
:set shell=/bin/tcsh

You might need to change the SHELL environment variable after starting :sh to show the correct
shell.

202 Chapter 6 The vim Editor

Edit only one copy of a file
caution When you create a new shell by giving the command :sh, remember you are still using vim. A common mistake is to try to edit the same file from the new shell, forgetting that vim is already editing
the file from a different shell. Because you can lose information by editing the same file from two
instances of an editor, vim warns you when you make this mistake. Refer to “File Locks” on
page 171 to see an example of the message that vim displays.
:!command

You can execute a shell command line from vim by giving the following command,
replacing command with the command line you want to execute and terminating the
command with a RETURN:
:!command
The vim editor spawns a new shell that executes the command. When the command
runs to completion, the newly spawned shell returns control to the editor.

:!!command

You can execute a command from vim and have it replace the current line with the
output from the command. If you do not want to replace any text, put the cursor on
a blank line before giving the following command:
!!command
Nothing happens when you enter the first exclamation point. When you enter the second one, vim moves the cursor to the status line and allows you to enter the command
you want to execute. Because this command puts vim in Last Line mode, you must
end the command with a RETURN (as you would end most shell commands).
You can also execute a command from vim with standard input to the command
coming from all or part of the file you are editing and standard output from the
command replacing the input in the file you are editing. This type of command is
handy for sorting a list in place within a file.
To specify the block of text that will become standard input for the command, move
the cursor to one end of the block of text. Then enter an exclamation point followed
by a command that would normally move the cursor to the other end of the block of
text. For example, if the cursor is at the beginning of the file and you want to specify
the whole file, give the command !G. If you want to specify the part of the file
between the cursor and marker b, give the command !'b. After you give the cursormovement command, vim displays an exclamation point on the status line and waits
for you to enter a shell command.
To sort a list of names in a file, move the cursor to the beginning of the list and set
marker q with an mq command. Then move the cursor to the end of the list and give
the following command:
!'qsort

Press RETURN and wait. After a few seconds, the sorted list should replace the original
list on the screen. If the command did not behave as expected, you can usually undo
the change with a u command. Refer to page 942 for more information on sort.

Units of Measure 203

! can destroy a file
caution If you enter the wrong command or mistype a command, you can destroy a file (for example, if
the command hangs or stops vim from working). For this reason it is a good idea to save your
file before using this command. The Undo command (page 179) can be a lifesaver. A :e! command (page 200) will get rid of the changes, returning the buffer to the state it was in last time
you saved it.
As with the :sh command, the default shell might not work properly with the ! command. You
might want to test the shell with a sample file before executing this command with your real work.
If the default shell does not work properly, change the shell parameter.

Units of Measure
Many vim commands operate on a block of text—ranging from one character to
many paragraphs. You specify the size of a block of text with a Unit of Measure. You
can specify multiple Units of Measure by preceding a Unit of Measure with a Repeat
Factor (page 206). This section defines the various Units of Measure.

Character
A character is one character—visible or not, printable or not—including
TABs. Some examples of characters are
a

q

A

.

5

R

-

>

SPACEs

and

TAB SPACE

Word
A word, similar to a word in the English language, is a string of one or more characters bounded on both sides by any combination of one or more of the following
elements: a punctuation mark, SPACE, TAB, numeral, or NEWLINE. In addition, vim considers
each group of punctuation marks to be a word (Table 6-8).

Table 6-8

Words

Word count

Text

1

pear

2

pear!

2

pear!)

3

pear!) The

4

pear!) "The

11

This is a short, concise line (no frills).

204 Chapter 6 The vim Editor

Blank-Delimited Word
A blank-delimited word is the same as a word but includes adjacent punctuation.
Blank-delimited words are separated by one or more of the following elements: a
SPACE, TAB, or NEWLINE (Table 6-9).

Table 6-9

Blank-delimited words

Word count

Text

1

pear

1

pear!

1

pear!)

2

pear!) The

2

pear!) "The

8

This is a short, concise line (no frills).

Line
A line is a string of characters bounded by NEWLINEs that is not necessarily displayed as
a single physical line on the screen. You can enter a very long single (logical) line that
wraps around (continues on the next physical line) several times or disappears off the
right edge of the display. It is a good idea to avoid creating long logical lines; ideally
you would terminate lines with a RETURN before they reach the right side of the screen.
Terminating lines in this manner ensures that each physical line contains one logical
line and avoids confusion when you edit and format text. Some commands do not
appear to work properly on physical lines that are longer than the width of the screen.
For example, with the cursor on a long logical line that wraps around several physical
lines, pressing RETURN once appears to move the cursor down more than one line. You
can use fmt (page 812) to break long logical lines into shorter ones.

Sentence
A sentence is an English sentence or the equivalent. A sentence starts at the end of
the previous sentence and ends with a period, exclamation point, or question mark,
followed by two SPACEs or a NEWLINE (Table 6-10).

Table 6-10

Sentences

Sentence count

Text

One: only one SPACE
after the first period
and a NEWLINE after the
second period

That's it. This is one sentence.

Units of Measure 205

Table 6-10

Sentences (continued)

Sentence count

Text

Two: two SPACEs after
the first period and a
NEWLINE after the
second period

That's it.

Three: two SPACEs after
the first two question
marks and a NEWLINE
after the exclamation
point

What?

One: NEWLINE after the
period

This sentence takes

This is two sentences.

Three sentences?

One line!

up a total of
three lines.

Paragraph
A paragraph is preceded and followed by one or more blank lines. A blank line is
composed of two NEWLINE characters in a row (Table 6-11).

Table 6-11

Paragraphs

Paragraph count

Text

One: blank line before
and after text

One paragraph

One: blank line before
and after text

Three: three blocks of
text separated by blank
lines

This might appear to be
more than one paragraph.
Just because there are
two indentions does not mean
it qualifies as two paragraphs.

Even though in
English this is only
one sentence,
vim considers it to be
three paragraphs.

206 Chapter 6 The vim Editor

Screen (Window)
Under vim, a screen or terminal emulator window can display one or more logical
windows of information. A window displays all or part of a Work buffer. Figure 6-5
on page 165 shows a screen with two windows.

Repeat Factor
A number that precedes a Unit of Measure (page 203) is a Repeat Factor. Just as the
5 in 5 inches causes you to consider 5 inches as a single Unit of Measure, so a Repeat
Factor causes vim to group more than one Unit of Measure and consider it as a single
Unit of Measure. For example, the command w moves the cursor forward 1 word,
the command 5w moves it forward 5 words, and the command 250w moves it forward 250 words. If you do not specify a Repeat Factor, vim assumes a Repeat Factor
of 1. If the Repeat Factor would move the cursor past the end of the file, the cursor
is left at the end of the file.

Chapter Summary
This summary of vim includes all the commands covered in this chapter, plus a few
more. Table 6-12 lists some of the ways you can call vim from the command line.

Table 6-12

Calling vim

Command

Result

vim filename

Edits filename starting at line 1

vim +n filename

Edits filename starting at line n

vim + filename

Edits filename starting at the last line

vim +/pattern filename Edits filename starting at the first line containing pattern
vim –r filename

Recovers filename after a system crash

vim –R filename

Edits filename readonly (same as opening the file with view)

You must be in Command mode to use commands that move the cursor by Units of
Measure (Table 6-13). You can use these Units of Measure with Change, Delete, and
Yank commands. Each of these commands can be preceded by a Repeat Factor.

Table 6-13

Moving the cursor by Units of Measure

Command
SPACE, l (ell),
RIGHT ARROW

Moves the cursor
or

Space to the right

h or LEFT ARROW

Space to the left

w

Word to the right

Chapter Summary

Table 6-13

207

Moving the cursor by Units of Measure (continued)

Command

Moves the cursor

W

Blank-delimited word to the right

b

Word to the left

B

Blank-delimited word to the left

$

End of line

e

End of word to the right

E

End of blank-delimited word to the right

0 (zero)

Beginning of line (cannot be used with a Repeat Factor)

RETURN

Beginning of next line

j or DOWN ARROW

Down one line

–

Beginning of previous line

k or UP ARROW

Up one line

)

End of sentence

(

Beginning of sentence

}

End of paragraph

{

Beginning of paragraph

%

Move to matching brace of same type at same nesting level

Table 6-14 shows the commands that enable you to view different parts of the Work
buffer.

Table 6-14

Viewing the Work buffer

Command

Moves the cursor

CONTROL-D

Forward one-half window

CONTROL-U

Backward one-half window

CONTROL-F or

Forward one window

PAGE DOWN
CONTROL-B or

Backward one window

PAGE UP

nG

To line n (without n, to the last line)

H

To top of window

M

To middle of window

L

To bottom of window

208 Chapter 6 The vim Editor

The commands in Table 6-15 enable you to add text to the buffer. All these commands, except r, leave vim in Input mode. You must press ESCAPE to return to
Command mode.

Table 6-15

Adding text

Command

Adds text

i

Before cursor

I

Before first nonblank character on line

a

After cursor

A

At end of line

o

Opens a line below current line

O

Opens a line above current line

r

Replaces current character (no ESCAPE needed)

R

Replaces characters, starting with current character (overwrite until ESCAPE)

Table 6-16 lists commands that delete and change text. In this table M is a Unit of
Measure that you can precede with a Repeat Factor, n is an optional Repeat Factor,
and c is any character.

Table 6-16

Deleting and changing text

Command

Result

nx

Deletes the number of characters specified by n, starting with the current
character

nX

Deletes n characters before the current character, starting with the character
preceding the current character

dM

Deletes text specified by M

n dd

Deletes n lines

dtc

Deletes to the next character c on the current line

D

Deletes to end of the line

n~

Changes case of the next n characters

The following commands leave vim in Input mode. You must press ESCAPE to return to
Command mode.

ns

Substitutes n characters

S

Substitutes for the entire line

Chapter Summary

Table 6-16

209

Deleting and changing text (continued)

Command

Result

cM

Changes text specified by M

n cc

Changes n lines

ctc

Changes to the next character c on the current line

C

Changes to end of line

Table 6-17 lists search commands. Here rexp is a regular expression that can be a
simple string of characters.

Table 6-17

Searching

Command

Result

/rexp RETURN

Searches forward for rexp

?rexp RETURN

Searches backward for rexp

n

Repeats original search exactly

N

Repeats original search, in the opposite direction

/RETURN

Repeats original search forward

?RETURN

Repeats original search backward

fc

Positions the cursor on the next character c on the current line

Fc

Positions the cursor on the previous character c on the current line

tc

Positions the cursor on the character before (to the left of) the next character
c on the current line

Tc

Positions the cursor on the character after (to the right of) the previous
character c on the current line

;

Repeats the last f, F, t, or T command

The syntax of a Substitute command is
:[address]s/search-string/replacement-string[/g]
where address is one line number or two line numbers separated by a comma. A
. (period) represents the current line, $ represents the last line, and % represents
the entire file. You can use a marker or a search string in place of a line number.
The search-string is a regular expression that can be a simple string of characters.
The replacement-string is the replacement string. A g indicates a global replacement (more than one replacement per line).

210 Chapter 6 The vim Editor

Table 6-18 lists miscellaneous vim commands.

Table 6-18

Miscellaneous commands

Command

Result

J

Joins the current line and the following line

.

Repeats the most recent command that made a change

:w filename

Writes the contents of the Work buffer to filename (or to the current file if there
is no filename)

:q

Quits vim

ZZ

Writes the contents of the Work buffer to the current file and quits vim

:f or CONTROL-G

Displays the filename, status, current line number, number of lines in the Work
buffer, and percentage of the Work buffer preceding the current line

CONTROL-V

Inserts the next character literally even if it is a vim command (use in Input mode)

Table 6-19 lists commands that yank and put text. In this table M is a Unit of Measure that you can precede with a Repeat Factor and n is a Repeat Factor. You can
precede any of these commands with the name of a buffer using the form "x, where
x is the name of the buffer (a–z).

Table 6-19

Yanking and putting text

Command

Result

yM

Yanks text specified by M

n yy

Yanks n lines

Y

Yanks to end of line

P

Puts text before or above

p

Puts text after or below

Table 6-20 lists advanced vim commands.

Table 6-20

Advanced commands

Command

Result

mx

Sets marker x, where x is a letter from a to z.

''(two single

Moves cursor back to its previous location.

quotation marks)

'x

Moves cursor to line with marker x.

‘x

Moves cursor to character with marker x.

:e filename

Edits filename, requiring you to write changes to the current file (with :w or
autowrite) before editing the new file. Use :e! filename to discard changes to
the current file. Use :e! without a filename to discard changes to the current
file and start editing the saved version of the current file.

Exercises 211

Table 6-20

Advanced commands (continued)

Command

Result

:n

Edits the next file when vim is started with multiple filename arguments.
Requires you to write changes to the current file (with :w or autowrite) before
editing the next file. Use :n! to discard changes to the current file and edit the
next file.

:rew

Rewinds the filename list when vim is started with multiple filename
arguments and starts editing with the first file. Requires you to write changes
to the current file (with :w or autowrite) before editing the first file. Use :rew!
to discard changes to the current file and edit the first file.

:sh

Starts a shell. Exit from the shell to return to vim.

:!command

Starts a shell and executes command.

!!command

Starts a shell, executes command, and places output in the Work buffer,
replacing the current line.

Exercises
1. How can you cause vim to enter Input mode? How can you make vim revert
to Command mode?
2. What is the Work buffer? Name two ways of writing the contents of the
Work buffer to the disk.
3. Suppose that you are editing a file that contains the following paragraph
and the cursor is on the second tilde (~):
The vim editor has a command, tilde (~),
that changes lowercase letters to
uppercase, and vice versa.
The ~ command works with a Unit of Measure or
a Repeat Factor, so you can change
the case of more than one character at a time.

How can you
a. Move the cursor to the end of the paragraph?
b. Move the cursor to the beginning of the word Unit?
c. Change the word character to letter?
4. While working in vim, with the cursor positioned on the first letter of a
word, you give the command x followed by p. Explain what happens.
5. What are the differences between the following commands?
a. i and I
b. a and A
c. o and O

212 Chapter 6 The vim Editor

d. r and R
e. u and U
6. Which command would you use to search backward through the Work buffer for lines that start with the word it?
7. Which command substitutes all occurrences of the phrase this week with the
phrase next week?
8. Consider the following scenario: You start vim to edit an existing file. You
make many changes to the file and then realize that you deleted a critical
section of the file early in your editing session. You want to get that section
back but do not want to lose all the other changes you made. What would
you do?
9. How can you move the current line to the beginning of the file?
10. Use vim to create the letter_e file of e’s used on page 64. Use as few vim
commands as possible. Which vim commands did you use?

Advanced Exercises
11. Which commands can you use to take a paragraph from one file and insert
it in a second file?
12. Create a file that contains the following list, and then execute commands
from within vim to sort the list and display it in two columns. (Hint: Refer
to page 915 for more information on pr.)
Command mode
Input mode
Last Line mode
Work buffer
General-Purpose buffer
Named buffer
Regular Expression
Search String
Replacement String
Startup File
Repeat Factor

13. How do the Named buffers differ from the General-Purpose buffer?
14. Assume that your version of vim does not support multiple Undo commands. If you delete a line of text, then delete a second line, and then a third
line, which commands would you use to recover the first two lines that you
deleted?
15. Which command would you use to swap the words hither and yon on any
line with any number of words between them? (You need not worry about
special punctuation, just uppercase and lowercase letters and spaces.)

7
The emacs Editor

Chapter7
7

In This Chapter

Objectives

History . . . . . . . . . . . . . . . . . . . . . . 214

After reading this chapter you should be able to:

emacs Versus vim . . . . . . . . . . . . . 215

 Use emacs to create and edit a file

Tutorial: Getting Started with
emacs. . . . . . . . . . . . . . . . . . . . . 216

 Save and retrieve the buffer

Basic Editing Commands . . . . . . . 223
Online Help . . . . . . . . . . . . . . . . . . 229
Advanced Editing . . . . . . . . . . . . . 232
Major Modes: Language-Sensitive
Editing. . . . . . . . . . . . . . . . . . . . . . 246
Customizing emacs. . . . . . . . . . . . 256

 Use emacs online help
 Describe how to move the cursor by characters, words,
lines, and paragraphs
 List the commands that move the cursor backward and
forward by characters and words
 Explain how to search backward and forward for text
and what an incremental search is
 Describe emacs key notation
 Split a window
 Describe the process of undoing changes

214 Chapter 7 The emacs Editor

History
In 1956 the Lisp (List processing) language was developed at MIT by John
McCarthy. In its original conception, Lisp had only a few scalar (atomic) data
types and only one data structure (page 1064): a list. Lists could contain atomic
data or other lists. Lisp supported recursion and nonnumeric data (exciting concepts in those Fortran and COBOL days) and, in the Cambridge culture at least,
was once the favored implementation language. Richard Stallman and Guy Steele
were part of this MIT Lisp culture. In 1975 they collaborated on emacs, which
Stallman maintained by himself for a long time. This chapter discusses the emacs
editor as implemented by the Free Software Foundation (GNU), version 23. The
emacs home page is www.gnu.org/software/emacs.
The emacs editor was prototyped as a series of extension commands or macros for
the late 1960s text editor TECO (Text Editor and COrrector). Its acronymic name,
Editor MACroS, reflects this origin, although there have been many humorous reinterpretations, including ESCAPE META ALT CONTROL SHIFT, Emacs Makes All Computing Simple,
and the unkind translation Eight Megabytes And Constantly Swapping.

Evolution
Over time emacs has grown and evolved through more than 20 major revisions to
the mainstream GNU version. The emacs editor, which is coded in C, contains a complete Lisp interpreter and fully supports the X Window System and mouse
interaction. The original TECO macros are long gone, but emacs is still very much a
work in progress. Version 22 has significant internationalization upgrades: an
extended UTF-8 internal character set four times bigger than Unicode, along with
fonts and keyboard input methods for more than 30 languages. Also, the user interface is moving in the direction of a WYSIWYG (what you see is what you get) word
processor, which makes it easier for beginners to use the editor.
The emacs editor has always been considerably more than a text editor. Not having been
developed originally in a UNIX environment, it does not adhere to the UNIX/Linux philosophy. Whereas a UNIX/Linux utility is typically designed to do one thing and to
be used in conjunction with other utilities, emacs is designed to “do it all.” Taking
advantage of the underlying programming language (Lisp), emacs users tend to customize and extend the editor rather than to use existing utilities or create new
general-purpose tools. Instead they share their ~/.emacs (customization) files.
Well before the emergence of the X Window System, Stallman put a great deal of
thought and effort into designing a window-oriented work environment, and he used
emacs as his research vehicle. Over time he built facilities within emacs for reading
and composing email messages, reading and posting netnews, giving shell commands,
compiling programs and analyzing error messages, running and debugging these programs, and playing games. Eventually it became possible to enter the emacs
environment and not come out all day, switching from window to window and from

History 215

file to file. If you had only an ordinary serial, character-based terminal, emacs gave
you tremendous leverage.
In an X Window System environment, emacs does not need to control the whole
display. Instead, it usually operates only one or two windows. The original,
character-based work environment is still available and is covered in this chapter.
As a language-sensitive editor, emacs has special features that you can turn on to help
edit text, nroff, TeX, Lisp, C, Fortran, and so on. These feature sets are called modes,
but they are not related to the Command and Input modes found in vi, vim, and other
editors. Because you never need to switch emacs between Input and Command
modes, emacs is a modeless editor.

emacs Versus vim
See en.wikipedia.org/wiki/Editor_war for an interesting discussion of the ongoing
editor wars; or search the Web for emacs vs vi.
Like vim, emacs is a display editor: It displays on the screen the text you are editing
and changes the display as you type each command or insert new text. Unlike vim,
emacs does not require you to keep track of whether you are in Command mode or
Insert mode: Commands always use CONTROL or other special keys. The emacs editor
inserts ordinary characters into the text you are editing (as opposed to using ordinary
characters as commands), another trait of modeless editing. For many people this
approach is convenient and natural.
As with vim, you use emacs to edit a file in a work area, or buffer, and have the option
of writing this buffer back to the file on the disk when you are finished. With emacs,
however, you can have many work buffers and switch among them without having
to write the buffer out and read it back in. Furthermore, you can display multiple buffers at one time, each in its own window within emacs. This way of displaying files
is often helpful when you are cutting and pasting text or when you want to keep C
declarations visible while editing related code in another part of a file.
Like vim, emacs has a rich, extensive command set for moving about in the buffer and
altering text. This command set is not “cast in concrete”—you can change or customize commands at any time. Any key can be coupled (bound) to any command to
match a particular keyboard better or to fulfill a personal whim. Usually key bindings
are set in the ~/.emacs startup file, but they can also be changed interactively during
a session. All the key bindings described in this chapter are standard on the current
versions of GNU emacs.

Too many key bindings
caution If you change too many key bindings, you might produce a command set that you will not remember
or that will make it impossible for you to return to the standard bindings in the same session.

Finally, and very unlike vim, emacs allows you to use Lisp to write new commands
or override old ones. Stallman calls this feature online extensibility, but it would take

216 Chapter 7 The emacs Editor

a gutsy Lisp guru to write and debug a new command while editing text. It is much
more common to add debugged commands to the .emacs file, where they are loaded
when you start emacs. Experienced emacs users often write modes, or environments,
that are conditionally loaded by emacs for specific tasks. For more information on
the .emacs file, see page 257.

The screen and emacs windows
tip In this chapter, the term screen denotes a character-based terminal screen or a terminal emulator
window in a graphical environment. The term window refers to an emacs window within a screen.

emacs and the X Window System
tip Since version 19, GNU emacs has fully embraced the X Window System environment. If you start
emacs from a terminal emulator window running in a graphical environment, you will bring up
the X interface (GUI) to emacs. This book does not cover the graphical interface; use the –nw
option when you start emacs to bring up the textual interface in any environment. See “Starting
emacs” below.

Tutorial: Getting Started with emacs
The emacs editor has many, many features, and there are many ways to use it. Its
complete manual includes more than 35 chapters. Nevertheless, you can do a considerable amount of meaningful work with a relatively small subset of the commands.
This section describes a simple editing session, explaining how to start and exit from
emacs and how to move the cursor and delete text. Coverage of some issues is postponed or simplified in the interest of clarity.

emacs online tutorial
tip The emacs editor provides an online tutorial. After starting emacs, press CONTROL-H t to start the
tutorial. Press CONTROL-X CONTROL-C to exit from emacs. If you have more than one emacs window
open, see the tip “Closing the help window” on page 230.

Starting emacs
To edit a file named sample using emacs as a text-based editor, enter the following
command:
$ emacs -nw -q sample

The –nw option, which must be the first option on the emacs command line, tells
emacs not to use its X interface (GUI). The –q option tells emacs not to read the
~/.emacs startup file. Not reading this file guarantees that emacs will behave in a
standard manner and can be useful for beginners or for other users who want to
bypass a .emacs file.
The preceding command starts emacs, reads the file named sample into a buffer, and
displays its contents on the screen or window. If no file has this name, emacs displays
a blank screen with (New File) at the bottom (Figure 7-1). If the file exists, emacs

Tutorial: Getting Started with emacs 217

Menubar

Position of
window in buffer

Mode Line

Echo Area/
Minibuffer
Buffer name

Major mode

Figure 7-1

The emacs new file screen

displays the file and a different message (Figure 7-3, page 219). If you start emacs
without naming a file on the command line, it displays a welcome screen that
includes usage information and a list of basic commands (Figure 7-2).
Initially, emacs displays a single window. At the top of the window is a reverse-video
menubar that you can access using a mouse or keyboard. From the keyboard, F10 , META-‘
(back tick), or META-x tmm-menubar RETURN displays the Menubar Completion List window.
For more information refer to “Using the Menubar from the Keyboard” on page 228.
At the bottom of the emacs window is a reverse-video titlebar called the Mode Line.
At a minimum, the Mode Line shows which buffer the window is viewing, whether

Figure 7-2

The emacs welcome screen

218 Chapter 7 The emacs Editor

the buffer has been changed, which major and minor modes are in effect, and how
far down the buffer the window is positioned. When multiple windows appear on the
screen, one Mode Line appears in each window. At the bottom of the screen, emacs
leaves a single line open. This Echo Area and Minibuffer line (they coexist on one
line) is used for messages and special one-line commands.

The emacs manual
tip The emacs manual is available from within emacs. While you are running emacs, give the command CONTROL-H r. Then use the ARROW keys to scroll to the section you want to view and press
RETURN. Alternatively, type m (which moves the cursor to the Minibuffer) followed by the name of
the section (menu) you want to view. Type TAB to cause emacs to complete the menu name; menu
completion works similarly to pathname completion (page 240). See the tip on page 230 for
instructions on how to close the Help window and page 229 for more information on online help.
For example, to view the Minibuffer section of the online manual, give the command CONTROL-H r
m minibuffer RETURN. You can also give the command CONTROL-H r m min TAB RETURN.

If you make an error while you are typing in the Minibuffer, emacs displays the error
message in the Echo Area. The error message overwrites the command you were typing, but emacs restores the command in a few seconds. The brief display of the error
messages gives you time to read it before you continue typing the command from
where you left off. More detailed information is available from the Minibuffer menu
of the emacs online manual (see the preceding tip).
A cursor is either in the window or in the Minibuffer. All input and nearly all editing
take place at the cursor. As you type ordinary characters, emacs inserts them at the
cursor position. If characters are under the cursor or to its right, they are pushed to
the right as you type, so no characters are lost.

Exiting
The command to exit from emacs is CONTROL-X CONTROL-C. You can give this command at
almost any time (in some modes you might have to press CONTROL-G first). It stops emacs
gracefully, asking if you want to keep the changes you made during the editing session.
If you want to cancel a half-typed command or stop a running command before it is
done, press CONTROL-G. The emacs editor displays Quit in the Echo Area and waits for
another command.

Inserting Text
Typing an ordinary (printing) character pushes the cursor and any characters to the
right of the cursor one position to the right and inserts the new character in the space
opened by moving the characters.

Deleting Characters
Depending on the keyboard and the emacs startup file, different keys might delete
characters in different ways. CONTROL-D typically deletes the character under the cursor,
as do DELETE and DEL. BACKSPACE typically deletes the character to the left of the cursor.
Try each of these keys and see what it does.

Tutorial: Getting Started with emacs 219

More about deleting characters
tip If the instructions described in this section do not work, read the emacs info section on deletion.
Give this command from a shell prompt:
$ info emacs

From info give the command m deletion to display a document that describes in detail how to
delete small amounts of text. Use the SPACE bar to scroll through the document. Type q to exit from
info. You can read the same information in the emacs online manual (CONTROL-H r; page 229)

Start emacs and type a few lines of text. If you make a mistake, correct the error using
the deletion characters discussed previously. The RETURN key inserts an invisible endof-line character in the buffer and returns the cursor to the left margin, one line down.
It is possible to back up past the start of a line and up to the end of the previous line.
Figure 7-3 shows a sample buffer.

Use the ARROW keys
tip Sometimes the easiest way to move the cursor is by using the LEFT ARROW, RIGHT ARROW, UP ARROW,
and DOWN ARROW keys.

Moving the Cursor
You can position the cursor over any character in the emacs window and move the
window so it displays any portion of the buffer. You can move the cursor forward or
backward through the text (Figure 6-8, page 174) by various textual units—for
example, characters, words, sentences, lines, and paragraphs. Any of the cursormovement commands can be preceded by a repetition count (CONTROL-U followed by a
numeric argument), which causes the cursor to move that number of textual units
through the text. Refer to page 225 for a discussion of numeric arguments.

Figure 7-3

Sample buffer

220 Chapter 7 The emacs Editor

CONTROL-B
Figure 7-4

CONTROL-F

Moving the cursor by characters

Moving the Cursor by Characters
CONTROL-F

Pressing the RIGHT ARROW key or CONTROL-F moves the cursor forward (to the right) one
character. If the cursor is at the end of a line, these commands wrap it to the beginning
of the next line. For example, the command CONTROL-U 7 CONTROL-F moves the cursor seven
characters forward.

CONTROL-B

Pressing the LEFT ARROW key or CONTROL-B moves the cursor backward (to the left) one character. For example, the command CONTROL-U 7 CONTROL-B moves the cursor seven characters
backward. The command CONTROL-B works in a manner similar to CONTROL-F (Figure 7-4).

Moving the Cursor by Words
META-f

Pressing META-f moves the cursor forward one word. To invoke this command, hold
down the META or ALT key while you press f. If the keyboard you are using does not have
either of these keys, press ESCAPE, release it, and then press f. This command leaves the
cursor on the first character that is not part of the word the cursor started on. The
command CONTROL-U 4 META-f moves the cursor forward one space past the end of the
fourth word. For more information refer to “Keys: Notation and Use” on page 223.

META-b

Pressing META-b moves the cursor backward one word, leaving the cursor on the first
letter of the word it started on. If the cursor was on the first letter of a word, META-b
moves the cursor to the first letter of the preceding word. The command META-b works
in a manner similar to META-f (Figure 7-5).

Moving the Cursor by Lines
CONTROL-A
CONTROL-E
CONTROL-P
CONTROL-N

Pressing CONTROL-A moves the cursor to the beginning of the line it is on; CONTROL-E moves
it to the end. Pressing the UP ARROW key or CONTROL-P moves the cursor up one line to
the position directly above where the cursor started; pressing the DOWN ARROW key or
CONTROL-N moves it down. As with the other cursor-movement keys, you can precede
CONTROL-P and CONTROL-N with CONTROL-U and a numeric argument to move the cursor up
or down multiple lines. You can also use pairs of these commands to move the cursor
up to the beginning of the previous line, down to the end of the following line, and
so on (Figure 7-6).
META-b

META-f

belief,.really...It
Figure 7-5

Moving the cursor by words

Tutorial: Getting Started with emacs 221

CONTROL-P
CONTROL-P
CONTROL-A

needed.as

CONTROL-P
CONTROL-E

CONTROL-A

with.their

CONTROL-E

CONTROL-N
CONTROL-A

working.to

CONTROL-N
CONTROL-E

CONTROL-N

Figure 7-6

Moving the cursor by lines

Moving the Cursor by Sentences, Paragraphs, and
Window Position
META-a, META-e
META-{, META-}

META-r

Pressing META-a moves the cursor to the beginning of the sentence the cursor is on; META-e
moves the cursor to the end. META-{ moves the cursor to the beginning of the paragraph
the cursor is on; META-} moves it to the end. (Sentences and paragraphs are defined starting
on page 248.) You can precede any of these commands with a repetition count (CONTROL-U
followed by a numeric argument) to move the cursor by that many sentences or
paragraphs.
Pressing META-r moves the cursor to the beginning of the middle line of the window. You
can precede this command with CONTROL-U and a line number (here CONTROL-U does not
indicate a repetition count but rather a screen line number). The command CONTROL-U 0
META-r moves the cursor to the beginning of the top line (line zero) in the window. The
command CONTROL-U – (minus sign) moves the cursor to the beginning of the last line of
the window (Figure 7-7, next page).

Editing at the Cursor Position
Entering text requires no commands once you position the cursor in the window at
the location you want to enter text. When you type text, emacs displays that text at
the position of the cursor. Any text under or to the right of the cursor is pushed to
the right. If you type enough characters so the text would extend past the right edge
of the window, emacs displays a backslash (\) near the right edge of the window and
wraps the text to the next line. The backslash appears on the screen but is not saved
as part of the file and is never printed. Although you can create an arbitrarily long
line, some Linux tools have problems with text files containing such lines. To split a
line into two lines, position the cursor at the location you want to split the line and
press RETURN.
Deleting text

Pressing BACKSPACE removes characters to the left of the cursor. The cursor and the
remainder of the text on this line both move to the left each time you press BACKSPACE.

222 Chapter 7 The emacs Editor

CONTROL-U
0 META-r

META-{

META-r

META-a

Cursor

META-e

CONTROL-U
– META-r

Figure 7-7

META-}

Moving the cursor by sentences, paragraphs, and window position
To join a line with the line above it, position the cursor on the first character of the
second line and press BACKSPACE.
Press CONTROL-D to delete the character under the cursor. The cursor remains stationary,
but the remainder of the text on the line moves left to replace the deleted character.
See the tip “More about deleting characters” on page 219 if either of these keys does
not work as described here.

Saving and Retrieving the Buffer
No matter what changes you make to a buffer during an emacs session, the associated
file does not change until you save the buffer. If you leave emacs without saving the
buffer (emacs allows you to do so if you are persistent), the file is not changed and
emacs discards the work you did during the session.
Backups

As it writes a buffer’s edited contents back to the file, emacs might optionally first
make a backup of the original file. You can choose to make no backups, one level of
backup (default), or an arbitrary number of levels of backups. The level one backup
filenames are formed by appending a tilde (~) to the original filename. The multilevel
backups have .~n~ appended to the filename, where n is the sequential backup number, starting with 1. The version-control variable dictates how emacs saves backups.
See page 257 for instructions on assigning a value to an emacs variable.

Saving the buffer

The command CONTROL-X CONTROL-S saves the current buffer in its associated file. The
emacs editor confirms a successful save by displaying an appropriate message in the
Echo Area.

Visiting another file

When you are editing a file with emacs and want to edit another file (emacs documentation refers to editing a file as visiting a file), you can copy the new file into a new emacs
buffer by giving the command CONTROL-X CONTROL-F. The emacs editor prompts for a filename, reads that file into a new buffer, and displays that buffer in the current window.
Having two files open in one editing session is more convenient than exiting from
emacs, returning to the shell, and then starting a new copy of emacs to edit a second file.

Basic Editing Commands 223

Visiting a file with CONTROL-X CONTROL-F
tip When you give the command CONTROL-X CONTROL-F to visit a file, emacs displays the pathname of
the directory in which it assumes the file is located. Normally it displays the pathname of the working directory, but in some situations emacs displays a different pathname, such as the pathname
of your home directory. Edit this pathname if it is not pointing to the correct directory. This command provides pathname completion (page 240).

Basic Editing Commands
This section takes a more detailed look at the fundamental emacs editing commands.
It covers editing a single file in a single emacs window.

Keys: Notation and Use
Although emacs has been internationalized, its keyboard input is still an evolved and
extended ASCII code, usually with one keystroke producing one byte. ASCII keyboards have a typewriter-style SHIFT key and a CONTROL key. Some keyboards also have
a META (diamond or ALT) key that controls the eighth bit. It takes seven bits to describe
an ASCII character; the eighth bit of an eight-bit byte can be used to communicate
additional information. Because so much of the emacs command set is in the nonprinting CONTROL or META case, Stallman was one of the first to develop a nonnumeric
notation for describing keystrokes.
His solution, which is still used in the emacs community, is clear and unambiguous
(Table 7-1). It uses the capital letters C and M to denote holding down the CONTROL and
META (or ALT) keys, respectively, and a few simple acronyms for the most common special characters, such as RET (this book uses RETURN), LFD (LINEFEED), DEL (DELETE), ESC (ESCAPE),
SPC (SPACE), and TAB. Most emacs documentation, including the online help, uses this
notation.

Table 7-1

emacs key notation

Character

Classic emacs notation

(lowercase) a

a

(uppercase) SHIFT-a

A

CONTROL-a

C-a

CONTROL-A

C-a (do not use SHIFT), equivalent to CONTROL-a

META-a

M-a

META-A

M-A (do use SHIFT), different from M-a

CONTROL-META-a

C-M-a

META-CONTROL-a

M-C-a (not used frequently)

224 Chapter 7 The emacs Editor

The emacs use of keys had some problems. Many keyboards had no META key, and
some operating systems discarded the META bit. In addition, the emacs command set
clashes with the increasingly outdated XON-XOFF flow control, which also uses
CONTROL-S and CONTROL-Q.
Under OS X, most keyboards do not have a META or ALT key. See page 1049 for an
explanation of how to set up the OPTION key to perform the same functions as the META
key on a Macintosh.
The missing META key issue was resolved by making an optional two-key sequence
starting with ESCAPE equate to a META character. If the keyboard you are using does
not have a META or ALT key, you can use the two-key ESCAPE sequence by pressing the
ESCAPE key, releasing it, and then pressing the key following the META key in this book.
For example, you can type ESCAPE a instead of META-a or type ESCAPE CONTROL-A instead of
CONTROL-META-a.
Stallman considers XON-XOFF flow control to be a historical issue, and has no
plans to change the emacs command set. However, the online help emacs FAQ offers
several workarounds for this issue.

The notation used in this book
tip This book uses an uppercase letter following the CONTROL key and a lowercase letter following the
META key. In either case you do not have to hold down the SHIFT key while entering a CONTROL or
META character. Although the META uppercase character (that is, META-A) is a different character, it

is usually set up to cause no action or to have the same effect as its lowercase counterpart.

Key Sequences and Commands
In emacs the relationship between key sequences (one or more keys that you press
together or in sequence to issue an emacs command) and commands is very flexible,
and there is considerable opportunity for exercising your personal preference. You
can translate and remap key sequences to other commands and replace or reprogram
commands.
Although most emacs documentation glosses over the details and talks about keystrokes as though they were the commands, it is important to recognize that the
underlying machinery remains separate from the key sequences and to understand
that you can change the behavior of the key sequences and the commands. For more
information refer to “Customizing emacs” on page 256.

META-x:

Running a Command Without a Key Binding
The emacs keymaps (the tables, or vectors, that emacs uses to translate key sequences
into commands [page 258]) are very crowded, and often it is not possible to bind
every command to a key sequence. You can execute any command by name by preceding it with META-x. When you press META-x, the emacs editor prompts you for a
command in the Echo Area. After you enter the command name and press RETURN, it
executes the command.

Basic Editing Commands 225
Smart completion

When a command has no common key sequence, it is sometimes described as META-x
command-name. The emacs editor provides smart completion for most answers it
prompts for. After you type part of a response to a prompt, press SPACE or TAB to cause
emacs to complete, if possible, to the end of the current word or the whole command,
respectively. Forcing a completion past the last unambiguous point or typing a question mark (?) opens a Completion List window that displays a list of alternatives.
Smart completion works in a manner similar to pathname completion (page 240).

Numeric Arguments
Some of the emacs editing commands accept a numeric argument as a repetition
count. Place this argument immediately before the key sequence for the command.
The absence of an argument almost always means a count of 1. Even an ordinary
alphabetic character can have a numeric argument, which means “insert this many
times.” Use either of the following techniques to give a numeric argument to a
command:
• Press META with each digit (0–9) or the minus sign (–). For example, to insert
10 z characters, type META-1 META-0 z.
• Use CONTROL-U to begin a string of digits, including the minus sign. For example,
to move the cursor forward 20 words, type CONTROL-U 20 META-f.
CONTROL-U

For convenience, CONTROL-U defaults to multiply by 4 when you do not follow it with a
string of one or more digits. For example, entering CONTROL-U r means insert rrrr (4 *
1), whereas CONTROL-U CONTROL-U r means insert rrrrrrrrrrrrrrrr (4 * 4 * 1). For quick partial scrolling of a tall window, you might find it convenient to use repeated sequences
of CONTROL-U CONTROL-V to scroll down 4 lines, CONTROL-U META-v to scroll up 4 lines, CONTROL-U
CONTROL-U CONTROL-V to scroll down 16 lines, or CONTROL-U CONTROL-U META-v to scroll up 16 lines.

Point and the Cursor
Point is the place in a buffer where editing takes place and is where the cursor is positioned. Strictly speaking, Point is at the left edge of the cursor—think of it as lying
between two characters.
Each window has its own Point, but there is only one cursor. When the cursor is in
a window, moving the cursor also moves Point. Switching the cursor out of a window
does not change that window’s Point; it is in the same place when you switch the cursor back to that window.
All of the cursor-movement commands described previously also move Point.

Scrolling Through a Buffer
CONTROL-V
META-v
CONTROL-L

A buffer is likely to be much larger than the window through which it is viewed, so
you need a way of moving the display of the buffer contents up or down so as to position the interesting part in the window. Scrolling forward refers to moving the text
upward, with new lines entering at the bottom of the window. Press CONTROL-V or the

226 Chapter 7 The emacs Editor
PAGE DOWN key to scroll forward one window (minus two lines for context). Scrolling
backward refers to moving the text downward, with new lines entering at the top of
the window. Press META-v or the PAGE UP key to scroll backward one window (again leaving two lines for context). Pressing CONTROL-L clears the screen and repaints it, moving
the line the cursor is on to the middle line of the window. This command is useful if
the screen becomes garbled.

A numeric argument to CONTROL-V or META-v means “scroll that many lines”; for example,
CONTROL-U 10 CONTROL-V means scroll forward ten lines. A numeric argument to CONTROL-L
means “scroll the text so the cursor is on that line of the window,” where 0 means
the top line and –1 means the bottom line, just above the Mode Line. Scrolling occurs
automatically if you exceed the window limits when pressing CONTROL-P or CONTROL-N.
META-<
META->

You can move the cursor to the beginning of the buffer with
the buffer with META->.

META-<

or to the end of

Erasing Text
Delete versus kill

When you erase text you can discard it or move it into a holding area and optionally
bring it back later. The term delete means permanently discard, and the term kill
means move to a holding area. The holding area, called the Kill Ring, can hold several
pieces of killed text. You can use the text in the Kill Ring in many ways (refer to “Cut
and Paste: Yanking Killed Text” on page 235).

META-d
CONTROL-K

The META-d command kills from the cursor forward to the end of the current word. CONTROL-K kills from the cursor forward to the end of the current line. It does not delete
the line-ending LINEFEED character unless Point and the cursor are just to the left of the
LINEFEED. This setup allows you to reach the left end of a line with CONTROL-A, kill the
whole line with CONTROL-K, and then immediately type a replacement line without having to reopen a hole for the new line. Another consequence is that, from the beginning
of the line, it takes the command CONTROL-K CONTROL-K (or CONTROL-U 2 CONTROL-K) to kill the
text and close the hole.

Searching for Text
The emacs editor allows you to search for text in the following ways:
• Incrementally for a character string
• Incrementally for a regular expression (possible but uncommon)
• For a complete character string
• For a complete regular expression (Appendix A)
You can run each of the four types of searches either forward or backward in the
buffer.
The complete searches behave in the same manner as searches carried out in other
editors. Searching begins only when the search string is complete. In contrast, an
incremental search begins when you type the first character of the search string and

Basic Editing Commands 227

keeps going as you enter additional characters. Initially this approach might sound
confusing, but it is surprisingly useful.

Incremental Searches
CONTROL-S
CONTROL-R

A single command selects the direction of and starts an incremental search. CONTROL-S
starts a forward incremental search and CONTROL-R starts a reverse incremental search.
When you start an incremental search, emacs prompts you with I-search: in the Echo
Area. When you enter a character, it immediately searches for that character in the
buffer. If it finds that character, emacs moves Point and cursor to that position so you
can see the search progress. If the search fails, emacs tells you so.
After you enter each character of the search string, you can take one of several actions
depending on the result of the search to that point. The following paragraphs list
results and corresponding actions.
• The search finds the string you are looking for, leaving the cursor positioned
just to its right. You can stop the search and leave the cursor in its new position by pressing RETURN. (Any emacs command not related to searching will
also stop the search but remembering exactly which ones apply can be difficult. For a new user, RETURN is safer.)
• The search finds a string but it is not the one you are looking for. You can
refine the search string by adding another letter, press CONTROL-R or CONTROL-S
to look for the next occurrence of this search string, or press RETURN to stop
the search and leave the cursor where it is.
• The search hits the beginning or end of the buffer and reports Failing
I-Search. You can proceed in one of the following ways:
◆

If you mistyped the search string, press BACKSPACE as needed to remove
characters from the search string. The text and cursor in the window
jump backward in step as you remove characters.

◆

If you want to wrap past the beginning or end of the buffer and continue searching, you can force a wrap by pressing CONTROL-R or CONTROL-S.

◆

If the search has not found the string you are looking for but you want
to leave the cursor at its current position, press RETURN to stop the search.

◆

If the search has gone wrong and you just want to get back to where
you started, press CONTROL-G (the quit character). From an unsuccessful
search, a single CONTROL-G backs out all the characters in the search string
that could not be found. If this action returns you to a place you wish
to continue searching from, you can add characters to the search string
again. If you do not want to continue the search from that position,
pressing CONTROL-G a second time stops the search and leaves the cursor
where it was initially.

228 Chapter 7 The emacs Editor

Nonincremental Searches
CONTROL-S RETURN
CONTROL-R RETURN

If you prefer that your searches succeed or fail without showing all the intermediate
results, you can give the nonincremental command CONTROL-S RETURN to search forward
or CONTROL-R RETURN to search backward. Searching does not begin until you enter a
search string in response to the emacs prompt and press RETURN again. Neither of these
commands wraps past the end of the buffer.

Regular Expression Searches
You can perform both incremental and nonincremental regular expression searching
in emacs. Use the commands listed in Table 7-2 to begin a regular expression search.

Table 7-2

Searching for regular expressions

Command

Result

META-CONTROL-s

Incrementally searches forward for a regular expression;
prompts for a regular expression one character at a time

META-CONTROL-r

Incrementally searches backward for a regular expression;
prompts for a regular expression one character at a time

META-CONTROL-s RETURN

Prompts for and then searches forward for a complete regular
expression

META-CONTROL-r RETURN

Prompts for and then searches backward for a complete regular
expression

Using the Menubar from the Keyboard
This section describes how to use the keyboard to make selections from the emacs
menubar (Figure 7-1, page 217). In a graphical environment you can also use a
mouse for this purpose. The menubar selections are appropriate to the Major mode
emacs is in (see “Major Modes: Language-Sensitive Editing” on page 246). For
example, when you are editing a C program, the menubar includes a C menu that
holds commands specific to editing and indenting C programs.
To make a selection from the menubar, first press the F10 function key, META-‘ (back
tick), or META-x tmm-menubar RETURN. The emacs editor displays the Menubar Completion List window populated with the top-level menubar selections (File, Edit,
Options, and so on), with the current selection displayed in the Minibuffer.
Figure 7-8 shows the Menubar Completion List window with File as the current
selection in the Minibuffer.
With the Menubar Completion List window open, you can perform any of the following
actions:
• Cancel the menu selection by pressing CONTROL-G or ESCAPE ESCAPE ESCAPE. The
display returns to the state it was in before you opened the Menubar
Completion List window.

Online Help

Figure 7-8

229

The top-level Menubar Completion List window

• Use the UP ARROW and DOWN ARROW keys to display successive menu selections in
the Minibuffer. Press RETURN to choose the displayed selection.
• Type the one-character abbreviation of a selection as shown in the Menubar
Completion List window to choose the selection. You do not need to press
RETURN.
• Press PAGE UP or META-v to move the cursor to the Menubar Completion List
window. Use the ARROW keys to move the cursor between selections. Press
RETURN to choose the selection the cursor is on. You can type ESCAPE ESCAPE ESCAPE
to back out of this window and return the cursor to the Minibuffer.
When you make a choice from the top-level menu, emacs displays the corresponding
second-level menu in the Menubar Completion List window. Repeat one of the preceding actions to make a selection from this menu. When you make a final selection,
emacs closes the Menubar Completion List window and takes the action you
selected. More information is available from the Menu Bar menu of the emacs online
manual (see the tip on page 218).

Online Help
CONTROL-H

The emacs help system is always available. With the default key bindings, you can
start it with CONTROL-H. The help system then prompts you for a one-letter help command. If you do not know which help command you want, type ? or CONTROL-H to
switch the current window to a list of help commands, each with a one-line description; emacs again requests a one-letter help command. If you decide you do not want
help after all, type CONTROL-G to cancel the help request and return to the former buffer.

230 Chapter 7 The emacs Editor

If the help output is only a single line, it appears in the Echo Area. If it is more than
one line, the output appears in its own window. Use CONTROL-V and META-v to scroll forward and backward through the buffer (page 225). You can move the cursor between
windows with CONTROL-X o (lowercase “o”). See page 243 for a discussion of working
with multiple windows.

Closing the help window
tip To delete the help window while the cursor is in the window that holds the text you are editing,
type CONTROL-X 1 (one). Alternatively, you can move the cursor to the help window (CONTROL-X o
[lowercase “o”]) and type CONTROL-X 0 (zero) to delete the current window.
If help displays a window that occupies the entire screen, as is the case with CONTROL-H n (emacs
news) and CONTROL-H t (emacs tutorial), you can kill the help buffer by pressing CONTROL-X k or
switch buffers by pressing CONTROL-X b (both discussed on page 242).

On many terminals the BACKSPACE or LEFT ARROW key generates CONTROL-H. If you forget that
you are using emacs and try to back over a few characters, you might unintentionally
enter the help system. This action does not pose a danger to the buffer you are editing,
but it can be unsettling to lose the window contents and not have a clear picture of how
to restore it. While you are being prompted for the type of help you want, you can type
CONTROL-G to remove the prompt and return to editing the buffer. Some users elect to put
help on a different key (page 258). Table 7-3 lists some of the help commands.

Table 7-3

Help commands

Command

Type of help offered

CONTROL-H a

Prompts for a string and displays a list of commands whose
names contain that string.

CONTROL-H b

Displays a long table of the key bindings in effect.

CONTROL-H c

key-sequence

Displays the name of the command bound to key-sequence.
Multiple key sequences are allowed. For a long key sequence
where only the first part is recognized, the command describes
the first part and quietly inserts the unrecognized part into the
buffer. This can happen with three-character function keys (F1,
F2, and so on, on the keyboard) that generate character
sequences such as ESCAPE [ SHIFT.

CONTROL-H f

Prompts for the name of a Lisp function and displays the
documentation for it. Because commands are Lisp functions,
you can use a command name with this command.

CONTROL-H i

Displays the top info (page 36) menu where you can browse for
emacs or other documentation.

CONTROL-H k

key-sequence

Displays the name and documentation of the command bound
to key-sequence. (See the notes on CONTROL-H c.)

Online Help

Table 7-3

231

Help commands (continued)

Command

Type of help offered

CONTROL-H l (lowercase “l”)

Displays the last 100 characters typed. The record is kept after
the first-stage keyboard translation. If you have customized the
keyboard translation table, you must make a mental reverse
translation.

CONTROL-H m

Displays the documentation and special key bindings for the
current Major mode (Text, C, Fundamental, and so on,
[page 247]).

CONTROL-H n

Displays the emacs news file, which lists recent changes to
emacs, ordered with the most recent changes first.

CONTROL-H r

Displays the emacs manual.

CONTROL-H t

Runs an emacs tutorial session.

CONTROL-H v

Prompts for a Lisp variable name and displays the
documentation for that variable.

CONTROL-H w

Prompts for a command name and identifies any key sequence
bound to that command. Multiple key sequences are allowed.
(See the notes on CONTROL-H c.)

optional As this abridged presentation makes clear, you can use the help system to browse
through the emacs internal Lisp system. For the curious, following is Stallman’s list
of strings that match many names in the Lisp system. To get a view of the internal
functionality of emacs, you can use any of these strings with CONTROL-H a (help system
list of commands) or META-x apropos (prompts for a string and lists variables whose
names contain that string).
backward

dir

insert

previous

view

beginning

down

kill

region

what

buffer

end

line

register

window

case

file

list

screen

word

change

fill

mark

search

yank

char

find

mode

sentence

defun

forward

next

set

delete

goto

page

sexp

describe

indent

paragraph

up

232 Chapter 7 The emacs Editor

Advanced Editing
The basic emacs commands suffice for many editing tasks but the serious user will
quickly discover the need for more power. This section presents some of the more
advanced emacs capabilities.

Undoing Changes
An editing session begins when you read a file into an emacs buffer. At that point the
buffer content matches the file exactly. As you insert text and give editing commands,
the buffer content becomes increasingly more different from the file. If you are satisfied
with the changes, you can write the altered buffer back out to the file and end the session.
Near the left end of the Mode Line (Figure 7-1, page 217) is an indicator that shows
the modification state of the buffer displayed in the window. The three possible states
are –– (not modified), ** (modified), and %% (readonly).
The emacs editor keeps a record of all keys you have pressed (text and commands)
since the beginning of the editing session, up to a limit currently set at 20,000 characters. If you are within this limit, it is possible to undo the entire session for this
buffer, one change at a time. If you have multiple buffers (page 242), each buffer has
its own undo record.
Undoing is considered so important that it has a backup key sequence, in case a keyboard cannot easily handle the primary sequence. The two sequences are CONTROL-_
(underscore, which on old ASR-33 TTY keyboards was LEFT ARROW) and CONTROL-X u.
When you type CONTROL-_ , emacs undoes the last command and moves the cursor to
the position of the change in the buffer so you can see what happened. If you type
CONTROL-_ a second time, the next-to-last command is undone, and so on. If you keep
typing CONTROL-_, eventually the buffer will be returned to its original unmodified state
and the ** Mode Line indicator will change to ––.
When you break the string of Undo commands by typing text or giving any command
except Undo, all reverse changes you made during the string of undos become a part
of the change record and can themselves be undone. This strategy offers a way to redo
some or all of the undo operations. If you decide you backed up too far, type a command (something innocuous that does not change the buffer, such as CONTROL-F ), and
begin undoing your changes in reverse. Table 7-4 lists some examples of Undo
commands.

Table 7-4

Undo commands

Commands

Result

CONTROL-_

Undoes the last change

CONTROL-_ CONTROL-F CONTROL-_

Undoes the last change and changes it back
again

Advanced Editing 233

Table 7-4

Undo commands (continued)

Commands

Result

CONTROL-_ CONTROL-_

Undoes the last two changes

CONTROL-_ CONTROL-_ CONTROL-F CONTROL-_ CONTROL-_ Undoes two changes and changes them both

back again
CONTROL-_ CONTROL-_ CONTROL-F CONTROL-_

Undoes two changes and changes the most
recent one back again

If you do not remember the last change you made, you can type CONTROL-_ and undo
it. If you wanted to make this change, type CONTROL-F CONTROL-_ to make the change again.
If you modified a buffer by accident, you can keep typing CONTROL-_ until the Mode Line
indicator shows –– once more.
If the buffer is completely ruined and you want to start over, issue the command META-x
revert-buffer to discard the current buffer contents and reread the associated file. The
emacs editor asks you to confirm your intentions.

Point, Mark, and Region
Point is the current editing position in a buffer. You can move Point anywhere within
the buffer by moving the cursor. It is also possible to set a marker called Mark in the
buffer. The contiguous characters between Point and Mark (either one might come
first) are called Region. Many commands operate on a buffer’s Region, not just on
the characters near Point.

Moving Mark and Establishing Region
CONTROL-@
CONTROL-SPACE
CONTROL-X CONTROL-X

Mark is not as easy to move as Point. Once set, Mark can be moved only by setting
it somewhere else. Each buffer has only one Mark. The CONTROL-@ (or CONTROL-SPACE) command explicitly sets Mark at the current cursor (and Point) position. Some keyboards
generate CONTROL-@ when you type CONTROL-Q. Although this is not really a backup key
binding, it is occasionally a convenient alternative. You can use CONTROL-X CONTROL-X to
exchange Point and Mark (and move the cursor to the new Point).
To establish Region, you usually position the cursor (and Point) at one end of the
desired Region, set Mark with CONTROL-@, and then move the cursor (and Point) to the
other end of Region. If you forget where you left Mark, you can move the cursor
back to it again by giving the command CONTROL-X CONTROL-X. You can move the cursor
back and forth with repeated CONTROL-X CONTROL-X commands to show Region more
clearly.
If a Region boundary is not to your liking, you can swap Point and Mark using
to move the cursor from one end of Region to the other and then
move Point. Continue until you are satisfied with Region.
CONTROL-X CONTROL-X

234 Chapter 7 The emacs Editor

Operating on Region
Table 7-5 lists selected commands that operate on Region. Give the command CONTROL-H
a region to see a complete list of these commands.

Table 7-5

Operating on Region

Command

Result

META-w

Copies Region nondestructively (without killing it) to the Kill
Ring

CONTROL-W

Kills Region

META-x print-region

Sends Region to the printer

META-x append-to-buffer

Prompts for a buffer and appends Region to that buffer

META-x append-to-file

Prompts for a filename and appends Region to that file

META-x capitalize-region

Converts Region to uppercase

CONTROL-X CONTROL-L

Converts Region to lowercase

The Mark Ring
Each time you set Mark in a buffer, you are also pushing Mark’s former location onto
the buffer’s Mark Ring. The Mark Ring is organized as a FIFO (first in, first out) list
and holds the 16 most recent locations where Mark was set. Each buffer has its own
Mark Ring. This record of recent Mark history is useful because it often holds locations that you want to jump back to quickly. Jumping to a location pointed to by the
Mark Ring can be faster and easier than scrolling or searching your way through the
buffer to find the site of a previous change.
CONTROL-U
CONTROL-@

To work your way backward along the trail of former Mark locations, use the command CONTROL-U CONTROL-@ one or more times. Each time you give the command, emacs
• Moves Point (and the cursor) to the current Mark location
• Saves the current Mark location at the oldest end of the Mark Ring
• Pops off the youngest (most recent) Mark Ring entry and sets Mark
Each additional CONTROL-U CONTROL-@ command causes emacs to move Point and the cursor
to the previous entry on the Mark Ring.
Although this process might seem complex, it really just makes a safe jump to a
previous Mark location. It is safe because each jump’s starting point is recirculated
through the Mark Ring, where it is easy to find again. You can jump to all previous
locations on the Mark Ring (it might be fewer than 16) by giving the command
CONTROL-U CONTROL-@ repeatedly. You can go around the ring as many times as you like
and stop whenever you want.

Advanced Editing 235

Setting Mark Automatically
Some commands set Mark automatically: The idea is to leave a bookmark before
moving Point a long distance. For example, META-> sets Mark before jumping to the
end of the buffer. You can then return to your starting position with CONTROL-U CONTROL-@.
Searches behave similarly. To help you avoid surprises the message Mark Set appears
in the Echo Area whenever Mark is set, either explicitly or implicitly.

Cut and Paste: Yanking Killed Text
Recall that killed text is not discarded but rather is kept in the Kill Ring. The Kill Ring
holds the last 30 pieces of killed text and is visible from all buffers.
Retrieving text from the Kill Ring is called yanking. The meaning of this term in
emacs is the opposite of that used in vim: In vim yanking pulls text from the buffer,
and putting puts text into the buffer. Killing and yanking—which are roughly analogous to cutting and pasting—are emacs’s primary mechanisms for moving and
copying text. Table 7-6 lists the most common kill and yank commands.

Table 7-6

Common kill and yank commands

Command

Result

META-d

Kills to end of current word

META-D

Kills from beginning of previous word

CONTROL-K

Kills to end of line, not including LINEFEED

CONTROL-U 1 CONTROL-K

Kills to end of line, including LINEFEED

CONTROL-U 0 CONTROL-K

Kills from beginning of line

META-w

Copies Region to the Kill Ring but does not erase Region from
the buffer

CONTROL-W

Kills Region

META-z char

Kills up to next occurrence of char

CONTROL-Y

Yanks the most recently killed text into the current buffer at
Point, sets Mark at the beginning of this text, and positions
Point and the cursor at the end; follow with CONTROL-Y to swap
Point and Mark

META-y

Erases the just-yanked text, rotates the Kill Ring, and yanks the
next item (only after CONTROL-Y or META-y)

To move two lines of text, move Point to the beginning of the first line and then enter
to kill two lines. Move Point to the destination position and then
enter CONTROL-Y.
CONTROL-U 2 CONTROL-K

236 Chapter 7 The emacs Editor

To copy two lines of text, move Point to the beginning of the first line and give the
commands CONTROL-U 2 CONTROL-K CONTROL-Y to kill the lines and then yank them back immediately. Move Point to the destination position and type CONTROL-Y.
To copy a larger piece of the buffer, set Region to cover this piece and type CONTROL-W
CONTROL-Y to kill Region and yank it back. Next move Point to the destination and type
CONTROL-Y. You can also set Region and use META-w to copy Region to the Kill Ring.
The Kill Ring is organized as a fixed-length FIFO list, with each new entry causing
the eldest to be discarded (once you build up to 30 entries). Simple cut-and-paste
operations generally use only the newest entry. The older entries are retained to give
you time to change your mind about a deletion. If you do change your mind, you can
“mine” the Kill Ring like an archaeological dig, working backward through time and
down through the strata of killed material to copy a specific item back into the buffer.
To view every entry in the Kill Ring, begin a yanking session by pressing CONTROL-Y. This
action copies the youngest entry in the Kill Ring to the buffer at the current cursor
position. If this entry is not the item you want, continue the yanking session by pressing META-y. This action erases the previous yank and copies the next youngest entry to
the buffer at the current cursor position. If this still is not the item you wanted, press
META-y again to erase it and retrieve a copy of the next entry, and so on. You can continue giving META-y commands all the way back to the oldest entry. If you continue to
press META-y, you will eventually wrap back to the youngest entry again. In this manner
you can examine each entry as many times as you wish.
The sequence used in a yanking session consists of CONTROL-Y followed by any mixture
of CONTROL-Y and META-y. If you type any other command after META-y, the sequence is broken and you must give the CONTROL-Y command again to start another yanking session.
As you work backward in the Kill Ring, it is useful to think of this process as advancing a Last Yank pointer back through history to increasingly older entries. This
pointer is not reset to the youngest entry until you give a new kill command. Using
this technique, you can work backward partway through the Kill Ring with CONTROL-Y
and a few META-y commands, give some commands that do not kill, and then pick up
where you left off with another CONTROL-Y and a succession of META-y commands.
It is also possible to position the Last Yank pointer with positive or negative numeric
arguments to META-y. Refer to the online documentation for more information.

Inserting Special Characters
As stated earlier, emacs inserts everything that is not a command into the buffer at
the position of the cursor. To insert characters that would ordinarily be emacs commands, you can use the emacs escape character: CONTROL-Q. There are two ways of using
this escape character:
•

CONTROL-Q followed by any other character inserts that character in the buffer,
no matter which command interpretation it was supposed to have.

Advanced Editing 237

•

CONTROL-Q

followed by three octal digits inserts a byte with that value in the

buffer.

CONTROL-Q
tip Depending on the way your terminal is set up, CONTROL-Q might clash with software flow control.
If CONTROL-Q seems to have no effect, it is most likely being used for flow control. In that case you
must bind another key to the command quoted-insert (page 258).

Global Buffer Commands
The vim editor and its predecessors have global commands for bufferwide search and
replace operations. They operate on the entire buffer. The emacs editor has a similar
family of commands. They operate on the portion of the buffer between Point and
the end of the buffer. If you wish to operate on the entire buffer, use META-< to move
Point to the beginning of the buffer before issuing the command.

Line-Oriented Operations
The commands listed in Table 7-7 take a regular expression and apply it to the lines
between Point and the end of the buffer.

Table 7-7

Line-oriented operations

Command

Result

META-x occur

Prompts for a regular expression and copies each line with a
match for the expression to a buffer named *Occur*

META-x delete-matching-lines

Prompts for a regular expression and deletes each line with a
match for the expression

META-x delete-non-matching-lines Prompts for a regular expression and deletes each line that does

not have a match for that expression

The META-x occur command puts its output in a special buffer named *Occur*, which
you can peruse and discard or use as a jump menu to reach each line quickly. To use
the *Occur* buffer as a jump menu, switch to it (CONTROL-X o [lowercase “o”]), move
the cursor to the copy of the desired destination line, and give the command CONTROL-C
CONTROL-C. This command moves the cursor to the buffer that was searched and positions it on the line that the regular expression matched.
As with any buffer change, you can undo the effect of the delete commands.

Unconditional and Interactive Replacement
The commands listed in Table 7-8 (next page) operate on the characters between Point
and the end of the buffer, changing every string match or regular expression match. An
unconditional replacement makes all replacements automatically. An interactive replacement gives you the opportunity to see and approve each replacement before it is made.

238 Chapter 7 The emacs Editor

Table 7-8

Replacement commands

Command

Result

META-x replace-string

Prompts for string and newstring and replaces every instance of
string with newstring. Point is left at the site of the last
replacement, but Mark is set when you give the command, so
you can return to it with CONTROL-U CONTROL-@.

META-x replace-regexp

Prompts for regexp and newstring and replaces every match for
regexp with newstring. Point is left at the site of the last
replacement, but Mark is set when you give the command, so
you can return to it with CONTROL-U CONTROL-@.

META-% string or
META-x query-replace

The first form uses string; the second form prompts for string.
Both forms prompt for newstring, query each instance of string,
and, depending on your response, replace it with newstring.
Point is left at the site of the last replacement, but Mark is set
when you give the command, so you can return to it with
CONTROL-U CONTROL-@.

META-x query-replace-regexp

Prompts for regexp and newstring, queries each match for
regexp, and, depending on your response, replaces it with
newstring. Point is left at the site of the last replacement, but
Mark is set when you give the command, so you can return to
it with CONTROL-U CONTROL-@.

If you perform an interactive replacement, emacs displays each instance of string or
match for regexp and prompts you for an action to take. Table 7-9 lists some of the
possible responses.

Table 7-9

Responses to interactive replacement prompts

Response

Meaning

RETURN

Do not do any more replacements; quit now.

SPACE

Make this replacement and go on.

DELETE

Do not make this replacement. Skip it and go on.

, (comma)

Make this replacement, display the result, and ask for another
command. Any command is legal except DELETE is treated like
SPACE and does not undo the change.

. (period)

Make this replacement and quit searching.

! (exclamation point)

Replace this and all remaining instances without asking any
more questions.

Advanced Editing 239

Visiting and Saving Files
When you visit (emacs terminology for “call up”) a file, emacs reads it into a buffer
(page 242), allows you to edit the buffer, and eventually usually saves the buffer back
to the file. The commands discussed here relate to visiting and saving files.
META-x pwd
META-x cd

Each emacs buffer keeps a record of its default directory (the directory the file was
read from or the working directory, if it is a new file) that is prepended to any relative
pathname you specify. This convenience is meant to save some typing. Enter META-x
pwd to print the default directory for the current buffer or META-x cd to prompt for a
new default directory and assign it to this buffer. The next section discusses pathname
completion, which you can use when emacs prompts for a pathname.

Visiting Files
The emacs editor works well when you visit a file that has already been called up and
whose image is now in a buffer. After a check of the modification time to ensure that
the file has not been changed since it was last called up, emacs simply switches to that
buffer. Table 7-10 lists commands used to visit files.

Table 7-10

Visiting files

Command

Result

CONTROL-X CONTROL-F

Prompts for a filename and reads its contents into a new
buffer. Assigns the file’s simple filename as the buffer name.
Other buffers are unaffected. It is common practice and
often useful to have several files open simultaneously for
editing.

CONTROL-X CONTROL-V

Prompts for a filename and replaces the current buffer with a
buffer containing the contents of the requested file. The current
buffer is destroyed.

CONTROL-X 4 CONTROL-F

Prompts for a filename and reads its contents into a new buffer.
Assigns the file’s simple filename as the buffer name. Creates a
new window for this buffer and selects that window. The
window selected before the command still displays the buffer it
was showing before this operation, although the new window
might cover up part of the old window.

To create a new file, simply call it up. An empty buffer is created and properly
named so you can eventually save it. The message (New File) appears in the Echo
Area, reflecting emacs’s understanding of the situation. If this new file grew out of
a typographical error, you can give the command CONTROL-X CONTROL-V and enter the
correct name.

240 Chapter 7 The emacs Editor

Figure 7-9

A Pathname Completion List window

Pathname Completion
When you are prompted for the pathname of a file in the Minibuffer, you can type
the pathname followed by a RETURN. Alternatively, you can use pathname completion,
which is similar to bash filename completion (page 338), to help you enter a
pathname.
While you are entering a pathname in the Minibuffer, press TAB and emacs will complete the pathname as far as possible. If the completed pathname is satisfactory,
press RETURN. In some cases, emacs cannot complete a pathname. For example, a
directory in the pathname you entered might not exist or you might not have permission to read it. If emacs cannot complete a pathname, it displays a message in
the Echo Area. If the characters following the rightmost slash (/) in the pathname
you are typing match more than one filename, when you press TAB emacs displays
[Complete, but not unique]. If you press TAB a second time, emacs opens a Pathname
Completion List window that displays a list of possible completions (Figure 7-9).
You can open this window manually while you are entering a pathname by typing
a question mark (?).
With the Pathname Completion List window open, you can
• Cancel the selection by pressing CONTROL-G or ESCAPE ESCAPE ESCAPE. The display
returns to the state it was in before you opened the Pathname Completion
List window.
• Type more characters in the Minibuffer to finish the pathname. Press RETURN
to select the pathname; emacs closes the completion window.

Advanced Editing 241

• Type more characters in the Minibuffer to make the completion unambiguous
and press TAB again.
• Press META-v or PAGE UP to move the cursor into the Pathname Completion List
window. Use the ARROW keys to move the cursor between selections. Press
RETURN to choose the selection the cursor is on. You can press CONTROL-G or ESCAPE
ESCAPE ESCAPE to back out of this window and return the cursor to the
Minibuffer.
When you press RETURN, emacs closes the Pathname Completion List window, adds the
filename you selected to the end of the pathname you were typing, and moves the cursor to the end of the pathname you were typing in the Minibuffer. You can continue
typing in the Minibuffer and perform more completions before you press RETURN to
accept the pathname. More information is available from the Completion and Completion Commands menus of the Minibuffer menu of the emacs online manual (see
the tip on page 218).

Saving Files
You save a buffer by copying its contents back to the original file you called up.
Table 7-11 (next page) lists the relevant commands.

You can exit without first getting a warning
caution Clearing the modified flag (META-~) allows you to exit without saving a modified buffer with no
warning. Make sure you know what you are doing when you use META-~.

Did you modify a buffer by mistake?
caution When you give a CONTROL-X s command, you might discover files whose buffers were modified by
mistake as emacs tries to save the wrong changes back to the file. When emacs prompts you
to confirm the save, do not answer y if you are not sure. First exit from the CONTROL-X s dialog by
typing n to any saves you are not sure about. You then have several options:
• Save the suspicious buffer to a temporary file with CONTROL-X CONTROL-W and analyze it later.
• Undo the changes with a string of CONTROL-_ commands until the ** indicator disappears
from the buffer’s Mode Line.
• If you are sure that all the changes are wrong, use META-x revert-buffer to get a fresh copy
of the file.
• Kill the buffer outright. Because it is modified, emacs asks whether you are sure before
carrying out this command.
• Give the META-~ (tilde) command to clear the modified condition and ** indicator. A
subsequent CONTROL-X s then believes that the buffer does not need to be written.

242 Chapter 7 The emacs Editor

Table 7-11

Saving files

Command

Result

CONTROL-X CONTROL-S

This workhorse file-saving command saves the current buffer
into its original file. If the current buffer is not modified, emacs
displays the message (No changes need to be saved).

CONTROL-X s

For each modified buffer, you are asked whether you wish to
save it. Answer y or n. This command is given automatically as
you exit from emacs and allows you to save any buffers that
have been modified but not yet written out. Give this command
to save intermediate copies of your work.

META-x set-visited-file-name

Prompts for a filename and sets this name as the “original”
name for the current buffer.

CONTROL-X CONTROL-W

Prompts for a filename, sets this name as the “original” name
for the current buffer, and saves the current buffer into that file.
Equivalent to META-x set-visited-file-name followed by
CONTROL-X CONTROL-S.

META-~ (tilde)

Clears the modified flag from the current buffer. If you
mistakenly type META-~ against a buffer with changes you want
to keep, you need to make sure the modified condition and its
** indicator are turned back on before leaving emacs, or all
the changes you made will be lost. One easy way to mark a buffer
as modified is to insert a SPACE and then remove it using DELETE.

Buffers
An emacs buffer is a storage object that you can edit. It often holds the contents of
a file but can also exist without being associated with a file. You can select only one
buffer at a time, designated as the current buffer. Most commands operate only on
the current buffer, even when windows show multiple buffers on the screen. For the
most part each buffer is its own world: It has its own name, its own modes, its own
file associations, its own modified state, and perhaps its own special key bindings.
You can use the commands shown in Table 7-12 to create, select, list, and manipulate
buffers.

Table 7-12

Working with buffers

Command

Result

CONTROL-X b

Prompts for a buffer name and selects it. If the buffer you name
does not exist, this command creates it.

CONTROL-X 4 b

Prompts for a buffer name and selects it in another window. The
existing window is not disturbed, although the new window
might overlap it.

Advanced Editing 243

Table 7-12

Working with buffers (continued)

Command

Result

CONTROL-X CONTROL-B

Creates a buffer named *Buffer List* and displays it in another
window. The existing window is not disturbed, although the new
window might overlap it. The new buffer is not selected. In the
*Buffer List* buffer, each buffer’s data is shown along with the
name, size, mode(s), and original filename. A % appears for a
readonly buffer, a * indicates a modified buffer, and . appears
for the selected buffer.

META-x rename-buffer

Prompts for a new buffer name and gives this new name to the
current buffer.

CONTROL-X CONTROL-Q

Toggles the current buffer’s readonly status and the associated
%% Mode Line indicator. This command can prevent you from
accidentally modifying a buffer or allow you to modify a buffer
when visiting a readonly file.

META-x append-to-buffer

Prompts for a buffer name and appends Region to the end of
that buffer.

META-x prepend-to-buffer

Prompts for a buffer name and prepends Region to the
beginning of that buffer.

META-x copy-to-buffer

Prompts for a buffer name and deletes the contents of the buffer
before copying Region to that buffer.

META-x insert-buffer

Prompts for a buffer name and inserts the contents of that
buffer in the current buffer at Point.

CONTROL-X k

Prompts for a buffer name and deletes that buffer. If the buffer
has been modified but not saved, emacs asks you to confirm
the operation.

META-x kill-some-buffers

Goes through the list of buffers and offers the chance to delete
each buffer. As with CONTROL-X k, emacs asks you to confirm
the kill command if a modified buffer has not been saved.

Windows
An emacs window is a viewport that looks into a buffer. The emacs screen begins by
displaying a single window, but this screen space can later be divided among two or
more windows. On the screen the current window holds the cursor and views the current buffer. For a tip on terminology, see “The screen and emacs windows” on
page 216.
CONTROL-X b
buffer-name

A window displays one buffer at a time. The command CONTROL-X b buffer-name
switches the buffer that the current window displays. Multiple windows can display

244 Chapter 7 The emacs Editor

Figure 7-10

Splitting a window horizontally

the same buffer with each window displaying a different part of the buffer. Any
change to a buffer is reflected in all windows displaying that buffer. Also, a buffer can
exist without a window open on it.

Splitting a Window
One way to divide the screen is to split the starting window explicitly into two or more
pieces. The command CONTROL-X 2 splits the current window in two, with one new window
appearing above the other. A numeric argument is taken as the size of the upper window
in lines. The command CONTROL-X 3 splits the current window in two, with the new windows being arranged side by side (Figure 7-10). A numeric argument is taken as the
number of columns to give the left window. For example, CONTROL-U CONTROL-X 2 splits the
current window in two; because of the special “times 4” interpretation of CONTROL-U
standing alone, the upper window is given four lines (barely enough to be useful).
Although these commands split the current window, both windows continue to view
the same buffer. You can select a new buffer in either or both new windows, or you
can scroll each window to show different portions of the same buffer.

Manipulating Windows
CONTROL-X o
META-CONTROL-V

You can use CONTROL-X o (lowercase “o”) to select the other window. If more than two
windows appear on the screen, a sequence of CONTROL-X o commands cycles through
them in top-to-bottom, left-to-right order. The META-CONTROL-V command scrolls the
other window. If more than two windows are visible, the command scrolls the window that CONTROL-X o would select next. You can use either a positive or negative
scrolling argument, just as with CONTROL-V scrolling in the current window.

Other-Window Display
CONTROL-X 4b
CONTROL-X 4f

In normal emacs operation, explicit window splitting is not nearly as common as
the implicit splitting done by the family of CONTROL-X 4 commands. The CONTROL-X 4b

Advanced Editing 245

command, for example, prompts for a buffer name and selects it in the other window. If no other window exists, this command begins with a half-and-half split that
arranges the windows one above the other. The CONTROL-X 4f command prompts for
a filename, calls the file up in the other window, and selects the other window. If
no other window exists, this command begins with a half-and-half split that
arranges the windows one above the other.

Adjusting and Deleting Windows
CONTROL-X 0
CONTROL-X 1

Windows might be destroyed when they get in the way. No data is lost in the window’s
associated buffer with this operation, and you can make another window whenever
you like. The CONTROL-X 0 (zero) command deletes the current window and gives its
space to its neighbors; CONTROL-X 1 deletes all windows except the current window.

META-x shrink-

You can also adjust the dimensions of the current window at the expense of its neigh-

window
bors. To make a window shorter, give the command META-x shrink-window. Press
CONTROL-X ^
CONTROL-X } CONTROL-X ^ to increase the height of a window, CONTROL-X } to make the window wider,
CONTROL-X {

and CONTROL-X { to make the window narrower. Each of these commands adds or subtracts one line or column to or from the window, unless you precede the command
with a numeric argument.

The emacs editor has its own guidelines for a window’s minimum useful size and
might destroy a window before you force one of its dimensions to zero. Although the
window might disappear, the buffer remains intact.

Foreground Shell Commands
The emacs editor can run a subshell (a shell that is a child of emacs—refer to “Executing a Command” on page 325) to execute a single command line, optionally with
standard input coming from Region of the current buffer and optionally with standard output replacing Region (Table 7-13). This process is analogous to executing a
shell command from the vim editor and having the input come from the file you are
editing and the output go back to the same file (page 202). As with vim, how well this
process works depends in part on the capabilities of the shell.

Table 7-13

Foreground shell commands

Command

Result

META-! (exclamation point)

Prompts for a shell command, executes it, and displays the output

CONTROL-U META-! (exclamation

Prompts for a shell command, executes it, and inserts the
output at Point

point)
META-| (vertical bar)

Prompts for a shell command, gives Region as input, filters it
through the command, and displays the output

CONTROL-U META-| (vertical bar)

Prompts for a shell command, gives Region as input, filters it
through the command, deletes the old Region, and inserts the
output in that position

The emacs editor can also start an interactive subshell that runs continuously in its
own buffer. See “Shell Mode” on page 255 for more information.

246 Chapter 7 The emacs Editor

Background Shell Commands
The emacs editor can run processes in the background, with their output being fed
into a growing emacs buffer that does not have to remain in view. You can continue
editing while the background process runs and look at its output later. Any shell command can be run in this way.
The growing output buffer is always named *compilation*. You can read it, copy
from it, or edit it in any way, without waiting for the background process to finish.
Most commonly this buffer is used to review the output of program compilation and
to correct any syntax errors found by the compiler.
META-x compile

To run a process in the background, give the command META-x compile to prompt for
a shell command and begin executing it as a background process. The screen splits
in half to show the *compilation* buffer.
You can switch to the *compilation* buffer and watch the execution, if you wish.
To make the display scroll as you watch, position the cursor at the very end of the
text with a META-> command. If you are not interested in this display, just remove the
window with CONTROL-X 0 (zero) if you are in it or CONTROL-X 1 otherwise and keep working. You can switch back to the *compilation* buffer later with CONTROL-X b.
To kill the background process give the command META-x kill-compilation. The emacs
editor asks for confirmation and then kills the background process.
If standard format error messages appear in *compilation*, you can automatically
visit the line in the file where each error occurred. Give the command CONTROL-X‘ (back
tick) to split the screen into two windows and visit the file and line of the next error
message. Scroll the *compilation* buffer until this error message appears at the top
of its window. Use CONTROL-U CONTROL-X‘ to start over with the first error message and
visit that file and line.

Major Modes: Language-Sensitive Editing
The emacs editor has a large collection of feature sets, each specific to a certain variety of text. The feature sets are called Major modes. A buffer can have only one Major
mode at a time.
A buffer’s Major mode is private to the buffer and does not affect editing in any other
buffer. If you switch to a new buffer having a different mode, rules for the new mode
take effect immediately. To avoid confusion, the name of a buffer’s Major mode
appears in the Mode Line of any window viewing that buffer (Figure 7-1 on page 217).
The three classes of Major modes are used for the following tasks:
• Editing human languages (for example, text, nroff, TeX)
• Editing programming languages (for example, C, Fortran, Lisp)
• Special purposes (for example, shell, mail, dired, ftp)

Major Modes: Language-Sensitive Editing 247

In addition, one Major mode—Fundamental—does nothing special. A Major mode
usually sets up the following:
• Special commands unique to the mode, possibly with their own key bindings.
Whereas languages might have just a few special commands, special-purpose
modes might have dozens.
• Mode-specific character syntax and regular expressions defining wordconstituent characters, delimiters, comments, whitespace, and so on. This
setup conditions the behavior of commands oriented to syntactic units,
such as words, sentences, comments, or parenthesized expressions.

Selecting a Major Mode
META-x modename

The emacs editor chooses and sets a mode when a file is called up by matching the
filename against a set of regular expression patterns describing the filename and filename extension. The explicit command to enter a Major mode is META-x modename.
This command is used mostly to correct the Major mode when emacs guesses wrong.
To have a file define its own mode, include the text – *– modename – *– somewhere
in the first nonblank line of the file, possibly inside a comment suitable for the programming language the file is written in.

Human-Language Modes
A human language is meant eventually to be used by humans, possibly after being formatted by a text-formatting program. Human languages share many conventions
about the structure of words, sentences, and paragraphs. With regard to these textual
units, the major human language modes all behave in the same way.
Beyond this area of commonality, each mode offers additional functionality oriented
to a specific text formatter, such as TeX, LaTeX, or nroff/troff. Text-formatter extensions are beyond the scope of this chapter; the focus here is on the commands relating
to human textual units.

Words
As mnemonic aids, the bindings for words are defined parallel to the characteroriented bindings CONTROL-F, CONTROL-B, CONTROL-D, DELETE, and CONTROL-T.
META-f
META-b

Just as CONTROL-F and CONTROL-B move forward and backward over characters, so META-f
and META-b move forward and backward over words. They might start from a position
inside or outside the word to be traversed, but in all cases Point finishes just beyond
the word, adjacent to the last character skipped over. Both commands accept a
numeric argument specifying the number of words to be traversed.

META-d
META-DELETE

Just as CONTROL-D and DELETE delete characters forward and backward, so the keys META-d
and META-DELETE kill words forward and backward. They leave Point in exactly the same
finishing position as META-f and META-b do, but they kill the words they pass over. They
also accept a numeric argument.

META-t META-t

transposes the word before Point with the word after Point.

248 Chapter 7 The emacs Editor

Sentences
META-a
META-e
CONTROL-X DELETE
META-k

As mnemonic aids, three of the bindings for sentences are defined parallel to the
line-oriented bindings: CONTROL-A, CONTROL-E, and CONTROL-K. The META-a command moves
backward to the beginning of a sentence; META-e moves forward to the end of a sentence. In addition, CONTROL-X DELETE kills backward to the beginning of a sentence; META-k
kills forward to the end of a sentence.
The emacs editor recognizes the ends of sentences by referring to a regular expression
that is kept in a variable named sentence-end. Briefly, emacs looks for the characters
., ?, or ! followed by two SPACEs or an end-of-line marker, possibly with close quotation
marks or close braces. Give the command CONTROL-H v sentence-end RETURN to display the
value of this variable.
The META-a and META-e commands leave Point adjacent to the first or last nonblank character in the sentence. They accept a numeric argument specifying the number of
sentences to traverse; a negative argument runs them in reverse.
The META-k and CONTROL-X DELETE commands kill sentences forward and backward, in a
manner analogous to CONTROL-K line kill. They leave Point in the same position as META-a
and META-e do, but they kill the sentences they pass over. They also accept a numeric
argument. CONTROL-X DELETE is useful for quickly backing out of a half-finished sentence.

Paragraphs
META-{
META-}
META-h

The META-{ command moves backward to the most recent paragraph beginning; META-}
moves forward to the next paragraph ending. The META-h command marks the paragraph the cursor is on as Region (that is, it puts Point at the beginning and Mark at
the end), or marks the next paragraph if the cursor is between paragraphs.
The META-} and META-{ commands leave Point at the beginning of a line, adjacent to the
first character or last character, respectively, of the paragraph. They accept a numeric
argument specifying the number of paragraphs to traverse and run in reverse if given
a negative argument.
In human-language modes, paragraphs are separated by blank lines and textformatter command lines, and an indented line starts a paragraph. Recognition
is based on the regular expressions stored in the variables paragraph-separate
and paragraph-start. A paragraph is composed of complete lines, including the
final line terminator. If a paragraph starts following one or more blank lines,
the last blank line before the paragraph belongs to the paragraph.

Fill
The emacs editor can fill a paragraph to fit a specified width, breaking lines and
rearranging them as necessary. It breaks lines between words and does not hyphenate
words. The emacs editor can fill automatically as you type or in response to an
explicit command.
META-x auto-fill-

mode

The META-x auto-fill-mode command toggles Auto Fill mode on and off. When this mode
is on, emacs automatically breaks lines when you press SPACE or RETURN and are currently
beyond the specified line width. This feature is useful when you are entering new text.

Major Modes: Language-Sensitive Editing 249
META-q
META-x fill-region

Auto Fill mode does not automatically refill the entire paragraph you are currently
working on. If you add new text in the middle of a paragraph, Auto Fill mode breaks
the new text as you type but does not refill the complete paragraph. To refill a complete paragraph or Region of paragraphs, use either META-q to refill the current
paragraph or META-x fill-region to refill each paragraph in Region (between Point and
Mark).
You can change the filling width from its default value of 70 by setting the fill-column
variable. Give the command CONTROL-X f to set fill-column to the current cursor position and the command CONTROL-U nnn CONTROL-X f to set fill-column to nnn, where 0 is
the left margin.

Case Conversion
The emacs editor can force words or Regions to all uppercase, all lowercase, or initial
caps (the first letter of each word uppercase, the rest lowercase) characters. Refer to
Table 7-14.

Table 7-14

Case conversion

Command

Result

META-l (lowercase “l”)

Converts word to the right of Point to lowercase

META-u

Converts word to the right of Point to uppercase

META-c

Converts word to the right of Point to initial caps

CONTROL-X CONTROL-L

Converts Region to lowercase

CONTROL-X CONTROL-U

Converts Region to uppercase

The word-oriented conversions move Point over the word just converted (just as META-f
does), allowing you to walk through text and convert each word with META-l, META-u, or
META-c, or skip over words to be left alone with META-f. A positive numeric argument converts that number of words to the right of Point, moving Point as it goes. A negative
numeric argument converts that number of words to the left of Point but leaves Point
stationary. This feature is useful for quickly changing the case of words you have just
typed. Table 7-15 shows some examples.

Table 7-15

Examples of case conversion

Characters and commands

Result

HELLOMETA-–META-l (lowercase “l”) hello
helloMETA-–META-u

HELLO

helloMETA-–META-c

Hello

When the cursor (Point) is in the middle of a word, the case conversion commands
convert the characters to the left of the cursor.

250 Chapter 7 The emacs Editor

Text Mode
META-x text-mode

With very few exceptions, the commands for human-language textual units are
always turned on and available, even when the programming-language modes are
activated. Text mode adds very little to these basic commands but is still worth turning on just to activate the TAB key function (next). Use the command META-x text-mode
to activate Text mode.

META-x edit-tab-

In Text mode, pressing TAB runs the function tab-to-tab-stop. By default TAB stops are
set every eight columns. You can adjust them with META-x edit-tab-stops, which switches
to a special *Tab Stops* buffer. The current TAB stops are laid out in this buffer on a
scale for you to edit. The new stops are installed when or if you type CONTROL-C CONTROL-C.
You can kill this buffer (CONTROL-X k) or switch away from it (CONTROL-X b) without changing
the TAB stops.

stops

The TAB stops you set with the META-x edit-tab-stops command affect only the interpretation of TAB characters arriving from the keyboard. The emacs editor automatically
inserts enough spaces to reach the TAB stop. This command does not affect the interpretation of TAB characters already in the buffer or the underlying file. If you edit the
TAB stops and then use them, when you print the file the hard copy will look the same
as the text on the screen.

C Mode
Programming languages are read by humans but are interpreted by machines. Besides
continuing to handle some of the human-language text units (for example, words and
sentences), the major programming-language modes address several additional
issues:
• Handling balanced expressions enclosed by parentheses, brackets, or braces
as textual units
• Handling comments as textual units
• Indention
The emacs editor includes Major modes to support C, Fortran, and several variants
of Lisp. In addition, many users have contributed modes for their favorite languages.
In these modes the commands for human textual units are still available, with occasional redefinitions. For example, a paragraph is bounded only by blank lines and
indention does not signal a paragraph start. In addition, each mode has custom code
to handle the language-specific conventions for balanced expressions, comments, and
indention. This chapter discusses only C mode.

Expressions
The emacs Major modes are limited to lexical analysis. They can recognize most
tokens (for example, symbols, strings, and numbers) and all matched sets of

Major Modes: Language-Sensitive Editing 251

parentheses, brackets, and braces. This is enough for Lisp but not for C. The C
mode lacks a full-function syntax analyzer and is not prepared to recognize all of
C’s possible expressions.1
Table 7-16 lists the emacs commands applicable to parenthesized expressions and
some tokens. By design the bindings run parallel to the CONTROL commands for characters and the META commands for words. All of these commands accept a numeric
argument and run in reverse if that argument is negative.

Table 7-16

Commands for expressions and tokens

Command

Result

CONTROL-META-f

Moves forward over an expression. The exact behavior depends
on which character lies to the right of Point (or left of Point,
depending on which direction you are moving Point).
• If the first nonwhitespace is an opening delimiter
(parenthesis, bracket, or brace), Point is moved just
past the matching closing delimiter.
• If the first nonwhitespace is a token, Point is moved
just past the end of this token.

CONTROL-META-b

Moves backward over an expression.

CONTROL-META-k

Kills an expression forward. This command leaves Point at the
same finishing position as CONTROL-META-f but kills the
expression it traverses.

CONTROL-META-@

Sets Mark at the position CONTROL-META-f would move to but
does not change Point. To see the marked Region clearly, give
a pair of CONTROL-X CONTROL-X commands to exchange Point
and Mark.

Function Definitions
In emacs a balanced expression at the outermost level is considered to be a function
definition and is often called a defun, even though that term is specific to Lisp. More
generally it is understood to be a function definition in the language at hand.
In C mode a function definition includes the return data type, the function name, and
the argument declarations appearing before the { character. Table 7-17 shows the
commands for operating on function definitions.

1. In the emacs documentation the recurring term sexp refers to the Lisp term S-expression. Unfortunately,
it is sometimes used interchangeably with expression, even though the language might not be Lisp.

252 Chapter 7 The emacs Editor

Table 7-17

Function definition commands

Command

Result

CONTROL-META-a

Moves to the beginning of the most recent function definition.
Use this command to scan backward through a buffer one
function at a time.

CONTROL-META-e

Moves to the end of the next function definition. Use this
command to scan forward through a buffer one function at a
time.

CONTROL-META-h

Marks as Region the current function definition (or next
function definition, if the cursor is between two functions). This
command sets up an entire function definition for a Regionoriented operation such as kill.

Function indention style
caution The emacs editor assumes an opening brace at the left margin is part of a function definition. This
heuristic speeds up the reverse scan for a definition’s leading edge. If your code has an indention
style that puts the opening brace elsewhere, you might get unexpected results.

Indention
The emacs C mode has extensive logic to control the indention of C programs. You
can adjust this logic for many different styles of C indention (Table 7-18).

Table 7-18

Indention commands

Command

Result

TAB

Adjusts the indention of the current line. TAB inserts or deletes
whitespace at the beginning of the line until the indention
conforms to the current context and rules in effect. Point is not
moved unless it lies in the whitespace area; in that case it is
moved to the end of the whitespace. TAB inserts leading SPACEs;
you can press TAB with the cursor at any position on the line. If
you want to insert a TAB in the text, use META-i or CONTROL-Q TAB.

LINEFEED

Shorthand for RETURN followed by TAB. The LINEFEED key is a
convenience for entering new code, giving you an autoindent as
you begin each line.

The next two commands indent multiple lines with a single command.
CONTROL-META-q

Reindents all lines inside the next pair of matched braces.
CONTROL-META-q assumes the left brace is correctly indented and

drives the indention from there. If you need to adjust the left
brace, type TAB just to the left of the brace before giving this
command. All lines up to the matching brace are indented as if
you had typed TAB on each one.

Major Modes: Language-Sensitive Editing 253

Table 7-18

Indention commands (continued)

Command

Result

CONTROL-META-\

Reindents all lines in Region. Put Point just to the left of a left
brace and then give the command. All lines up to the matching
brace are indented as if you had typed TAB on each one.

Customizing Indention
Many styles of C programming have evolved, and emacs does its best to support
automatic indention for all of them. The indention coding was completely rewritten
for emacs version 19; it supports C, C++, Objective-C, and Java. The new emacs
syntactic analysis is much more precise and can classify each syntactic element of
each line of a program into a single syntactic category (out of about 50), such as
statement, string, or else-clause. Based on that analysis, emacs refers to the offset
table named c-offsets-alist to look up how much each line should be indented from
the preceding line.
To customize indention, you must change the offset table. Although you can define
a completely new offset table for each customized style, it is typically more convenient to feed in a short list of exceptions to the standard rules. Each mainstream style
(GNU, K&R [Kernighan and Ritchie], BSD, and so on) has such an exception list;
all are collected in c-style-alist. Here is one entry from c-style-alist:
("gnu"
(c-basic-offset . 2)
(c-comment-only-line-offset . (0 . 0))
(c-offsets-alist . ((statement-block-intro . +)
(knr-argdecl-intro . 5)
(substatement-open . +)
(label . 0)
(statement-case-open . +)
(statement-cont . +)
(arglist-intro . c-lineup-arglist-intro-after-paren)
(arglist-close . c-lineup-arglist)
))
)

Constructing a custom style is beyond the scope of this book. If you are curious, the
long story is available in emacs info beginning at “Customizing C Indentation.” The
sample .emacs file given in this chapter (page 259) adds a very simple custom style
and arranges to use it on every .c file that is edited.

Comments
Each buffer has its own comment-column variable, which you can view with the
CONTROL-H v comment-column RETURN help command. Table 7-19 (next page) lists commands that facilitate working with comments.

254 Chapter 7 The emacs Editor

Table 7-19

Comment commands

Command

Result

META-;

Inserts a comment on the current line or aligns an existing
comment. This command’s behavior differs according to the
situation.
• If no comment is on this line, META-; creates an empty
comment at the value of comment-column.
• If text already on this line overlaps the position of comment-column, META-; creates an empty comment one
SPACE after the end of the text.
• If a comment is already on this line but not at the current value of comment-column, META-; realigns the
comment at that column. If text is in the way, it places
the comment one SPACE after the end of the text.
Once an aligned (possibly empty) comment exists on the line,
Point moves to the start of the comment text.

CONTROL-X ;

Sets comment-column to the column after Point. The left
margin is column 0.

CONTROL-U – CONTROL-X ;

Kills the comment on the current line. This command sets
comment-column from the first comment found above this line
and then performs a META-; command to insert or align a
comment at that position.

CONTROL-U CONTROL-X ;

Sets comment-column to the position of the first comment
found above this line and then executes a META-; command to
insert or align a comment on this line.

Special-Purpose Modes
The emacs editor includes a third family of Major modes that are not oriented toward
a particular language or toward ordinary editing. Instead, these modes perform some
special function. The following modes might define their own key bindings and commands to accomplish that function:
• Rmail: reads, archives, and composes email
• Dired: moves around an ls –l display and operates on files
• VIP: simulates a complete vi environment
• VC: allows you to drive version-control systems (including RCS, CVS, and
Subversion) from within emacs
• GUD (Grand Unified Debugger): allows you to run and debug C (and other)
programs from within emacs
• Tramp: allows you to edit files on any remote system you can reach with ftp or scp
• Shell: runs an interactive subshell from inside an emacs buffer
This book discusses only Shell mode.

Major Modes: Language-Sensitive Editing 255

Shell Mode
One-time shell commands and Region filtering were discussed earlier under “Foreground Shell Commands” on page 245. Each emacs buffer in Shell mode has an
underlying interactive shell permanently associated with it. This shell takes its input
from the last line of the buffer and sends its output back to the buffer, advancing Point
as it goes. If you do not edit the buffer, it holds a record of the complete shell session.
The shell runs asynchronously, whether or not you have its buffer in view. The emacs
editor uses idle time to read the shell’s output and add it to the buffer.
META-x shell

Type META-x shell to create a buffer named *shell* and start a subshell. If a buffer
named *shell* already exists, emacs just switches to that buffer. The shell that this
command runs is taken from one of the following sources:
• The Lisp variable explicit-shell-file-name
• The environment variable ESHELL
• The environment variable SHELL
To start a second shell, first give the command META-x rename-buffer to change the
name of the existing shell’s buffer, and then give the command META-x shell to start
another shell. You can create as many subshells and buffers as you like, all running
in parallel.
A special set of commands is defined in Shell mode (Table 7-20). These commands
are bound mostly to two-key sequences starting with CONTROL-C. Each sequence is
similar to the ordinary control characters found in Linux but uses a leading
CONTROL-C.

Table 7-20

Shell mode commands

Command

Result

RETURN

If Point is at the end of the buffer, emacs inserts the RETURN and
sends this (the last) line to the shell. If Point is elsewhere, it
copies this line to the end of the buffer, peeling off the old shell
prompt (see the regular expression shell-prompt-pattern), if
one existed. Then this copied line—now the last in the buffer—is
sent to the shell.

CONTROL-C CONTROL-D

Sends CONTROL-D to the shell or its subshell.

CONTROL-C CONTROL-C

Sends CONTROL-C to the shell or its subshell.

CONTROL-C CONTROL-\

Sends a quit signal to the shell or its subshell.

CONTROL-C CONTROL-U

Kills the text on the current line not yet completed.

CONTROL-C CONTROL-R

Scrolls back to the beginning of the last shell output, putting the
first line of output at the top of the window.

CONTROL-C CONTROL-O

Deletes the last batch of shell output.

256 Chapter 7 The emacs Editor

optional

Customizing emacs
At the heart of emacs is a Lisp interpreter written in C. This version of Lisp is significantly extended and includes many special editing commands. The interpreter’s main
task is to execute the Lisp-coded system that implements the look-and-feel of emacs.
Reduced to its essentials, this system implements a continuous loop that watches keystrokes arrive, parses them into commands, executes those commands, and updates
the screen. This behavior can be customized in a number of ways.
• As single keystrokes arrive, they are mapped immediately through a keyboard translation table. By changing the entries in this table, it is possible
to swap keys. If you are used to vi or vim, for example, you might want to
swap DELETE and CONTROL-H. Then CONTROL-H backspaces as it does in vim, and
DELETE (which is not used by vim) is the help key. If you use DELETE as an interrupt key, you might want to choose another key to swap with CONTROL-H.
• The mapped keystrokes are gathered into small groups called key
sequences. A key sequence might be only a single key, such as CONTROL-N, or
might include two or more keys, such as CONTROL-X CONTROL-F. Once gathered,
the key sequences are used to select a particular procedure to be executed.
The rules for gathering each key sequence and the specific procedure name
to be executed when that sequence comes in are codified in a series of tables
called keymaps. By altering the keymaps, you can change the gathering rules
or change which procedure is associated with which sequence. For example,
if you are used to vi’s or vim’s use of CONTROL-W to back up over the word you
are entering, you might want to change emacs’s CONTROL-W binding from the
standard kill-region to delete-word-backward.
• The command behavior is often conditioned by one or more environment
variables or options. It might be possible to get the behavior you want by
setting some of these variables.
• The command itself is usually a Lisp program that can be reprogrammed to
make it behave as desired. Although this task is not appropriate for beginners, the Lisp source to nearly all commands is available and the internal
Lisp system is fully documented. As mentioned earlier, it is common practice
to load customized Lisp code at startup time, even if you did not write the
code yourself.
Most emacs documentation glosses over the translation, gathering, and procedure
selection steps and talks about keystrokes as though they were commands. However,
it is important to know that the underlying machinery exists and to understand that
you can change its behavior.

Customizing emacs 257

The .emacs Startup File
Each time you start emacs, it loads the file of Lisp code named ~/.emacs. Using this
file is the most common way to customize emacs. Two command-line options control
the use of the .emacs file. The –q option ignores the .emacs file so emacs starts without it; this is one way to get past a bad .emacs file. The –u user option uses the
~user/.emacs file (the .emacs file from the home directory of user).
The .emacs startup file is generally concerned only with key bindings and option settings; it is possible to write the Lisp statements for this file in a straightforward style.
Each parenthesized Lisp statement is a Lisp function call. Inside the parentheses the
first symbol is the function name; the rest of the SPACE-separated tokens are arguments
to that function.
Assigning a value to
a variable

The most common function in the .emacs file, setq, is a simple assignment to a
global variable. The first argument is the name of the variable to set and the second
argument is its value. The following example sets the variable named c-indent-level
to 8:
(setq c-indent-level 8)

Displaying the value
of a variable

While you are running emacs, the command CONTROL-H v prompts for the name of a
variable. When you enter the name of a variable and press RETURN , emacs displays the
value of the variable.

Setting the default
value of a variable

You can set the default value for a variable that is buffer-private by using the function
named setq-default. To set a specific element of a vector, use the function name aset.
The first argument is the name of the vector, the second is the offset, and the third is
the value of the target entry. In the startup file the new values are usually constants.
Table 7-21 shows the formats of these constants.

Table 7-21

Formats of constants in .emacs

Command

Result

Numbers

Decimal integers, with an optional minus sign

Strings

Similar to C strings but with extensions for CONTROL and META
characters: \C-s yields CONTROL-S, \M-s yields META-s, and \M-\C-s
yields CONTROL-META-s

Characters

Not like C characters; start with ? and continue with a printing
character or with a backslash escape sequence (for example,
?a, ?\C-i, ?\033)

Booleans

Not 1 and 0; use t for true and nil for false

Other Lisp objects

Begin with a single quotation mark and continue with the
object’s name

258 Chapter 7 The emacs Editor

Remapping Keys
The emacs command loop begins each cycle by translating incoming keystrokes into
the name of the command to be executed. The basic translation operation uses the
ASCII value of the incoming character to index a 128-element vector called a
keymap.
Sometimes a character’s eighth bit is interpreted as the META case, but this cannot
always be relied on. At the point of translation all META characters appear with the
ESCAPE prefix, whether or not they were typed that way.
Each position in this vector is one of the following:
• Not defined: No translation possible in this map.
• The name of another keymap: Switches to that keymap and waits for the
next character to arrive.
• The name of a Lisp function to be called: Translation process is done; call
this command.
Because keymaps can reference other keymaps, an arbitrarily complex recognition
tree can be set up. The mainstream emacs bindings use at most three keys, with a very
small group of well-known prefix keys, each with its well-known keymap name.
Each buffer can have a local keymap that is used first for any keystrokes arriving
while a window into that buffer is selected. The local keymap allows the regular mapping to be extended or overridden on a per-buffer basis and is most often used to add
bindings for a Major mode.
The basic translation flow runs as follows:
• Map the first character through the buffer’s local keymap. If it is defined as
a Lisp function name, translation is done and emacs executes that function.
If it is not defined, use this same character to index the global top-level
keymap.
• Map the first character through the top-level global keymap global-map. At
this and each following stage, the following conditions hold:
◆

If the entry for this character is not defined, it is an error. Send a bell to
the terminal and discard all the characters entered in this key sequence.

◆

If the entry for this character is defined as a Lisp function name, translation is done and the function is executed.

◆

If the entry for this character is defined as the name of another keymap,
switch to that keymap and wait for another character to select one of
its elements.

Everything input during the remapping process must be either a command or an
error. Ordinary characters that are to be inserted in the buffer are usually bound to

Customizing emacs 259

the command self-insert-command. Each of the well-known prefix characters is each
associated with a keymap (Table 7-22).

Table 7-22

Keymap prefixes

Keymap prefix

Applies to

ctl-x-map

For characters following CONTROL-X

ctl-x-4-map

For characters following CONTROL-X 4

esc-map

For characters following ESCAPE (including META characters)

help-map

For characters following CONTROL-H

mode-specific-map

For characters following CONTROL-C

To see the current state of the keymaps, type CONTROL-H b. They appear in the following
order: local, global, and shorter maps for each prefix key. Each line specifies the name
of the Lisp function to be called; the documentation for that function can be retrieved
with the command CONTROL-H f function-name or CONTROL-H k key-sequence.
The most common type of keymap customization is making small changes to the
global command assignments without creating any new keymaps or commands. This
type of customization is most easily done in the .emacs file using the Lisp function
define-key. The define-key function takes three arguments:
• The keymap name
• A single character defining a position in that map
• The command to be executed when this character appears
For instance, to bind the command backward-kill-word to
statement

CONTROL-W,

use the

(define-key global-map "\C-w" 'backward-kill-word)

The \ character causes C-w to be interpreted as CONTROL-W instead of three letters
(equivalent to \^w). The unmatched single quotation mark in front of the command
name is correct. This Lisp escape character keeps the name from being evaluated too
soon. To bind the command kill-region to CONTROL-X CONTROL-K, use the statement
(define-key ctl-x-map "\C-k" 'kill-region)

A Sample .emacs File
The following ~/.emacs file produces a plain editing environment that minimizes surprises for vi and vim users. If any section or any line is not appropriate for your
situation, you can edit it or make it a comment by placing one or more semicolons
(;) beginning in column 1.

260 Chapter 7 The emacs Editor
;;; Preference Variables
(setq
(setq
(setq
(setq
(setq

make-backup-files nil)
backup-by-copying t)
delete-auto-save-files t)
blink-matching-paren nil)
require-final-newline 'ask)

;Do not make backup files
;If you do, at least do not destroy links
;Delete autosave files when writing orig
;Do not blink opening delim
;Ask about missing final newline

;; Reverse mappings for C-h and DEL.
;; Sometimes useful to get DEL character from the Backspace key,
;; and online help from the Delete key.
;; NB: F1 is always bound to online help.
(keyboard-translate ?\C-h ?\177)
(keyboard-translate ?\177 ?\C-h)
;; Some vi sugar: emulate the CR command
;; that positions us to first non-blank on next line.
(defun forward-line-1-skipws ()
"Position to first nonwhitespace character on next line."
(interactive)
(if (= (forward-line) 0)
;if moved OK to start of next line
(skip-chars-forward " \t")))
;skip over horizontal whitespace
;; Bind this to M-n. ("enhanced next-line")
;; C-M-n is arguably more "correct" but (1) it takes three fingers
;; and (2) C-M-n is already bound to forward-list.
(define-key esc-map "n" 'forward-line-1-skipws)

;; C mode customization: set vanilla (8-space bsd) indention style
(require 'cc-mode)

;kiss: be sure it's here

(setq c-default-style
'(
(java-mode . "java")
(awk-mode . "awk")
(c-mode . "bsd")
(other . "gnu")
))
;; See also CC Mode in online help for more style setup examples.
;; end of c mode style setup

More Information
A lot of emacs documentation is available in both paper and electronic form. The
emacs info page and emacs help functions (page 229) provide an abundance of information. See also the GNU emacs Web page at www.gnu.org/software/emacs.

Chapter Summary

261

The comp.emacs and gnu.emacs.help newsgroups offer support for and a general
discussion about emacs.

Access to emacs
The emacs editor is included in the repositories of most Linux distributions. You can
download and install emacs with apt-get (page 1034) or yum (page 1028). You can
download the latest version of the source code from www.gnu.org.
The Free Software Foundation can be reached at these addresses:
Mail

Email

Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301, USA
gnu@gnu.org

Phone

+1 617-542-5942

Fax

+1 617 542 2652

Web

www.gnu.org

Chapter Summary
You can precede many of the commands in the following tables with a numeric argument to make the command repeat the number of times specified by the argument.
Precede a numeric argument with CONTROL-U to keep emacs from entering the argument
as text.
Table 7-23 lists commands that move the cursor.

Table 7-23

Moving the cursor

Command

Result

CONTROL-F

Forward by characters

CONTROL-B

Backward by characters

META-f

Forward by words

META-b

Backward by words

META-e

To end of sentence

META-a

To beginning of sentence

META-}

To end of paragraph

META-{

To beginning of paragraph

META->

Forward to end of buffer

262 Chapter 7 The emacs Editor

Table 7-23

Moving the cursor (continued)

META-<

Backward to beginning of buffer

CONTROL-ESCAPE

To end of line

CONTROL-A

To beginning of line

CONTROL-N

Forward (down) one line

CONTROL-P

Backward (up) one line

CONTROL-V

Scroll forward (down) one window

META-v

Scroll backward (up) one window

CONTROL-L

Clear and repaint screen, and scroll current line to center of
window

META-r

To beginning of middle line

CONTROL-U num META-r

To beginning of line number num (0 = top, – = bottom)

Table 7-24 lists commands that kill and delete text.

Table 7-24

Killing and deleting text

Command

Result

CONTROL-DELETE

Deletes character under cursor

DELETE

Deletes character to left of cursor

META-d

Kills forward to end of current word

META-DELETE

Kills backward to beginning of previous word

META-k

Kills forward to end of sentence

CONTROL-X DELETE

Kills backward to beginning of sentence

CONTROL-K

Kills forward to, but not including, line-ending LINEFEED; if there
is no text between the cursor and the LINEFEED, kills the LINEFEED

CONTROL-U 1 CONTROL-K

Kills from cursor forward to and including LINEFEED

CONTROL-U 0 CONTROL-K

Kills from cursor backward to beginning of line

META-z char

Kills forward to, but not including, next occurrence of char

META-w

Copies Region to Kill Ring (does not delete Region from buffer)

CONTROL-W

Kills Region (deletes Region from buffer)

CONTROL-Y

Yanks most recently killed text into current buffer at Point; sets
Mark at beginning of this text, with Point and cursor at the end

META-y

Erases just-yanked text, rotates Kill Ring, and yanks next item
(only after CONTROL-Y or META-y)

Chapter Summary

263

Table 7-25 lists commands that search for strings and regular expressions.

Table 7-25

Search commands

Command

Result

CONTROL-S

Prompts incrementally for a string and searches forward

CONTROL-S RETURN

Prompts for a complete string and searches forward

CONTROL-R

Prompts incrementally for a string and searches backward

CONTROL-R RETURN

Prompts for a complete string and searches backward

META-CONTROL-S

Prompts incrementally for a regular expression and searches
forward

META- CONTROL-S RETURN

Prompts for a complete regular expression and searches
forward

META-CONTROL-R

Prompts incrementally for a regular expression and searches
backward

META- CONTROL-R RETURN

Prompts for a complete regular expression and searches
backward

Table 7-26 lists commands that provide online help.

Table 7-26

Online help

Command

Result

CONTROL-H a

Prompts for string and displays a list of commands whose
names contain string

CONTROL-H b

Displays a (long) table of all key bindings now in effect

CONTROL-H c

key-sequence

Displays the name of the command bound to key-sequence

CONTROL-H k

key-sequence

Displays the name of and documentation for the command
bound to key-sequence

CONTROL-H f

Prompts for the name of a Lisp function and displays the
documentation for that function

CONTROL-H i (lowercase “i”)

Displays the top menu of info (page 36)

CONTROL-H l (lowercase “l”)

Displays the last 100 characters typed

CONTROL-H m

Displays the documentation and special key bindings for the
current Major mode

CONTROL-H n

Displays the emacs news file

CONTROL-H t

Starts an emacs tutorial session

CONTROL-H v

Prompts for a Lisp variable name and displays the
documentation for that variable

CONTROL-H w

Prompts for a command name and displays the key sequence,
if any, bound to that command

264 Chapter 7 The emacs Editor

Table 7-27 lists commands that work with a Region.

Table 7-27

Working with a Region

Command

Result

META-W

Copies Region nondestructively to the Kill Ring

CONTROL-W

Kills (deletes) Region

META-x print-region

Copies Region to the print spooler

META-x append-to-buffer

Prompts for buffer name and appends Region to that buffer

META-x append-to-file

Prompts for filename and appends Region to that file

CONTROL-X CONTROL-U

Converts Region to uppercase

CONTROL-X CONTROL-L

Converts Region to lowercase

Table 7-28 lists commands that work with lines.

Table 7-28

Working with lines

Command

Result

META-x occur

Prompts for a regular expression and lists each line containing
a match for the expression in a buffer named *Occur*

META-x delete-matching-lines

Prompts for a regular expression and deletes lines from Point
forward that have a match for the regular expression

META-x delete-non-matching-lines Prompts for a regular expression and deletes lines from Point

forward that do not have a match for the regular expression

Table 7-29 lists commands that replace strings and regular expressions unconditionally and interactively.

Table 7-29

Commands that replace text

Command

Result

META-x replace-string

Prompts for two strings and replaces each instance of the first
string with the second string from Mark forward; sets Mark at
the start of the command

META-%

As above but queries for each replacement (see Table 7-30 for
a list of responses)

or
META-x query-replace
META-x replace-regexp

Prompts for a regular expression and a string, and replaces
each match for the regular expression with the string; sets Mark
at the start of the command

META-x query-replace-regexp

As above but queries for each replacement (see Table 7-30 for
a list of responses)

Chapter Summary

265

Table 7-30 lists responses to replacement queries.

Table 7-30

Responses to replacement queries

Command

Result

RETURN

Quits searching (does not make or query for any more
replacements)

SPACE

Makes this replacement and continues querying

DELETE

Does not make this replacement and continues querying

, (comma)

Makes this replacement, displays the result, and asks for
another command

. (period)

Makes this replacement and does not make or query for any
more replacements

! (exclamation point)

Replaces this and all remaining instances without querying

Table 7-31 lists commands that work with windows.

Table 7-31

Working with windows

Command

Result

CONTROL-X b

Prompts for and displays a different buffer in current window

CONTROL-X 2

Splits current window vertically into two

CONTROL-X 3

Splits current window horizontally into two

CONTROL-X o (lowercase “o”)

Selects other window

META-CONTROL-V

Scrolls other window

CONTROL-X 4b

Prompts for buffer name and selects it in other window

CONTROL-X 4f

Prompts for filename and selects it in other window

CONTROL-X 0 (zero)

Deletes current window

CONTROL-X 1 (one)

Deletes all windows except current window

META-x shrink-window

Makes current window one line shorter

CONTROL-X ^

Makes current window one line taller

CONTROL-X }

Makes current window one character wider

CONTROL-X {

Makes current window one character narrower

266 Chapter 7 The emacs Editor

Table 7-32 lists commands that work with files.

Table 7-32

Working with files

Command

Result

CONTROL-X CONTROL-F

Prompts for a filename and reads its contents into a new buffer;
assigns the file’s simple filename as the buffer name.

CONTROL-X CONTROL-V

Prompts for a filename and reads its contents into the current
buffer (overwriting the contents of the current buffer).

CONTROL-X 4 CONTROL-F

Prompts for a filename and reads its contents into a new buffer;
assigns the file’s simple filename as the buffer name. Creates a
new window for the new buffer and selects that window. This
command splits the screen in half if you begin with only one
window.

CONTROL-X CONTROL-S

Saves the current buffer to the original file.

CONTROL-X s

Prompts for whether to save each modified buffer (y/n).

META-x set-visited-file-name

Prompts for a filename and sets the current buffer’s “original”
name to that filename.

CONTROL-X CONTROL-W

Prompts for a filename, sets the current buffer’s “original” name
to that filename, and saves the current buffer in that file.

META-~ (tilde)

Clears modified flag from the current buffer. Use with caution.

Table 7-33 lists commands that work with buffers.

Table 7-33

Working with buffers

Command

Result

CONTROL-X CONTROL-S

Saves current buffer in its associated file.

CONTROL-X CONTROL-F

Prompts for filename and visits (opens) that file.

CONTROL-X b

Prompts for buffer name and selects it. If that buffer does not
exist, creates it.

CONTROL-X 4b

Prompts for buffer name and displays that buffer in another
window. The existing window is not disturbed, although the new
window might overlap it.

CONTROL-X CONTROL-B

Creates a buffer named *Buffer List* and displays it in another
window. The existing window is not disturbed, although the new
window might overlap it. The new buffer is not selected. In the
*Buffer List* buffer, each buffer’s data is displayed with its
name, size, mode(s), and original filename.

META-x rename-buffer

Prompts for a new buffer name and assigns this new name to
the current buffer.

Chapter Summary

Table 7-33

267

Working with buffers (continued)

Command

Result

CONTROL-X CONTROL-Q

Toggles the current buffer’s readonly status and the associated
%% Mode Line indicator.

META-x append-to-buffer

Prompts for buffer name and appends Region to the end of that
buffer.

META-x prepend-to-buffer

Prompts for buffer name and prepends Region to the beginning
of that buffer.

META-x copy-to-buffer

Prompts for buffer name, deletes contents of that buffer, and
copies Region to that buffer.

META-x insert-buffer

Prompts for buffer name and inserts entire contents of that
buffer in current buffer at Point.

CONTROL-X k

Prompts for buffer name and deletes that buffer.

META-x kill-some-buffers

Goes through the entire buffer list and offers the chance to
delete each buffer.

Table 7-34 lists commands that run shell commands in the foreground. These
commands might not work with all shells.

Table 7-34

Foreground shell commands

Command

Result

META-! (exclamation point)

Prompts for shell command, executes it, and displays the
output

CONTROL-U META-!

Prompts for shell command, executes it, and inserts the output
at Point

(exclamation point)
META-| (vertical bar)

Prompts for shell command, supplies Region as input to that
command, and displays output of command

CONTROL-U META-| (vertical bar)

Prompts for shell command, supplies Region as input to that
command, deletes old Region, and inserts output of command
in place of Region

Table 7-35 lists commands that run shell commands in the background.

Table 7-35

Background shell commands

Command

Result

META-x compile

Prompts for shell command and runs that command in the
background, with output going to the buffer named
*compilation*

META-x kill-compilation

Kills background process

268 Chapter 7 The emacs Editor

Table 7-36 lists commands that convert text from uppercase to lowercase, and vice versa.

Table 7-36

Case conversion commands

Command

Result

META-l (lowercase “l”)

Converts word to right of Point to lowercase

META-u

Converts word to right of Point to uppercase

META-c

Converts word to right of Point to initial caps

CONTROL-X CONTROL-L

Converts Region to lowercase

CONTROL-X CONTROL-U

Converts Region to uppercase

Table 7-37 lists commands that work in C mode.

Table 7-37

C mode commands

Command

Result

CONTROL-META-f

Moves forward over expression

CONTROL-META-b

Moves backward over expression

CONTROL-META-k

Moves forward over expression and kills it

CONTROL-META-@

Sets Mark at the position CONTROL-META-f would move to, without
changing Point

CONTROL-META-a

Moves to beginning of the most recent function definition

CONTROL-META-e

Moves to end of the next function definition

CONTROL-META-h

Moves Point to beginning and Mark to end of current (or next,
if between) function definition

Type META-x shell to create a buffer named *shell* and start a subshell. Table 7-38 lists
commands that work on this buffer.

Table 7-38

Shell mode commands

Command

Result

RETURN

Sends current line to the shell

CONTROL-C CONTROL-D

Sends CONTROL-D to shell or its subshell

CONTROL-C CONTROL-C

Sends CONTROL-C to shell or its subshell

CONTROL-C CONTROL-\

Sends quit signal to shell or its subshell

CONTROL-C CONTROL-U

Kills text on the current line not yet completed

CONTROL-C CONTROL-R

Scrolls back to beginning of last shell output, putting first line
of output at the top of the window

CONTROL-C CONTROL-O (uppercase

Deletes last batch of shell output

“O”)

Exercises 269

Exercises
1. Given a buffer full of English text, answer the following questions:
a. How would you change every instance of his to hers?
b. How would you make this change only in the final paragraph?
c. Is there a way to look at every usage in context before changing it?
d. How would you deal with the possibility that His might begin a
sentence?
2. Which command moves the cursor to the end of the current paragraph? Can
you use this command to skip through the buffer in one-paragraph steps?
3. Suppose that you are lost in the middle of typing a long sentence.
a. Is there an easy way to kill the botched sentence and start over?
b. What if only one word is incorrect? Is there an alternative to backspacing
one letter at a time?
4. After you have been working on a paragraph for a while, most likely some
lines will have become too short and others too long. Is there a command
to “neaten up” the paragraph without rebreaking all the lines by hand?
5. Is there a way to change the entire contents of the buffer to capital letters?
Can you think of a way to change just one paragraph?
6. How would you reverse the order of two paragraphs?
7. How would you reverse two words?
8. Imagine that you saw a Usenet posting with something particularly funny
in it and saved the posting to a file. How would you incorporate this file into
your own buffer? What if you wanted to use only a couple of paragraphs
from the posting? How would you add > to the beginning of each included
line?
9. On the keyboard alone emacs has always offered a full set of editing possibilities. Generally several techniques will accomplish the same goal for any
editing task. In the X environment the choice is enlarged still further with
a new group of mouse-oriented visual alternatives. From these options you
must select the way that you like to solve a given editing puzzle best.
Consider this Shakespearean fragment:
1. Full fathom five thy father lies;
2.
Of his bones are coral made;
3. Those are pearls that were his eyes:
4.
Nothing of him that doth fade,
5. But doth suffer a sea-change

270 Chapter 7 The emacs Editor
6. Into something rich and strange.
7. Sea-nymphs hourly ring his knell:
8.
Ding-dong.
9. Hark! now I hear them-10. Ding-dong, bell!

The following fragment has been typed with some errors:
1. Full fathiom five tyy father lies;
2. These are pearls that were his eyes:
3.
Of his bones are coral made;
4.
Nothin of him that doth fade,
5. But doth susffer a sea-change
6. Into something rich and strange.
7. Sea-nymphs hourly ring his knell:
8.
Ding=dong.
9. Hard! now I hear them-10. Ding-dong, bell!

Use only the keyboard to answer the following:
a. How many ways can you think of to move the cursor to the spelling
errors?
b. Once the cursor is on or near the errors, how many ways can you think
of to fix them?
c. Are there ways to fix errors without explicitly navigating to or searching
for them? How many can you think of?
d. Lines 2 and 3 in the retyped material are transposed. How many ways can
you think of to correct this situation?

Advanced Exercises
10. Assume that your buffer contains the C code shown here, with the Major
mode set for C and the cursor positioned at the end of the while line as
shown by the black square:
/*

* Copy string
* return s1
*/

s2 to s1.

s1 must be large enough

char *strcpy(char *s1, char
{
char *os1;
os1 = s1;
while (*s1++ =
;
return os1;
}

*s2)

*s2++)

Advanced Exercises

271

/*

*
*

Copy source into dest, stopping after '\0' is copied, and
return a pointer to the '\0' at the end of dest. Then our
caller
* can catenate to the dest * string without another strlen call.
*/
char *stpcpy (char *dest, char *source)
{
while ((*dest++ = *source++) != '\0') ■
; /* void loop body */
return (dest - 1);
}

a. Which command moves the cursor to the opening brace of strcpy? Which
command moves the cursor past the closing brace? Can you use these
commands to skip through the buffer in one-procedure steps?
b. Assume the cursor is just past the closing parenthesis of the while condition. How do you move to the matching opening parenthesis? How do
you move back to the matching close parenthesis again? Does the same
command set work for matched [] (square brackets) and {} (braces)?
How does this differ from the vim % command?
c. One procedure is indented in the Berkeley indention style; the other is
indented in the GNU style. Which command reindents a line in accordance with the current indention style you have set up? How would you
reindent an entire procedure?
d. Suppose that you want to write five string procedures and intend to use
strcpy as a starting point for further editing. How would you make five
copies of the strcpy procedure?
e. How would you compile the code without leaving emacs?

I

PART III
The Shells
CHAPTER 8
The Bourne Again Shell (bash)
CHAPTER 9
The TC Shell (tcsh)

369

275

8
The Bourne Again
Shell (bash)

Chapter8
8

In This Chapter

Objectives

Startup Files . . . . . . . . . . . . . . . . . 278

After reading this chapter you should be able to:

Redirecting Standard Error . . . . . . 282

 Describe the purpose and history of bash

Writing and Executing a Simple
Shell Script. . . . . . . . . . . . . . . . . 284

 List the startup files bash runs

Job Control. . . . . . . . . . . . . . . . . . . 294
Manipulating the Directory
Stack . . . . . . . . . . . . . . . . . . . . . 297

 Use three different methods to run a shell script
 Understand the purpose of the PATH variable
 Manage multiple processes using job control

Parameters and Variables . . . . . . 300

 Redirect error messages to a file

Locale . . . . . . . . . . . . . . . . . . . . . . 316

 Use control operators to separate and group
commands

Processes . . . . . . . . . . . . . . . . . . . 323
History . . . . . . . . . . . . . . . . . . . . . . 326
Reexecuting and Editing
Commands . . . . . . . . . . . . . . . . . 328

 Create variables and display the values of variables
and parameters

Functions . . . . . . . . . . . . . . . . . . . . 346

 List and describe common variables found on the
system

Controlling bash: Features and
Options . . . . . . . . . . . . . . . . . . . 349

 Reference, repeat, and modify previous commands
using history

Processing the Command Line. . . 354

 Use control characters to edit the command line
 Create, display, and remove aliases and functions
 Customize the bash environment using the set and
shopt builtins
 List the order of command-line expansion

276 Chapter 8 The Bourne Again Shell (bash)

This chapter picks up where Chapter 5 left off by focusing on the Bourne Again Shell
(bash). It notes where tcsh implementation of a feature differs from that of bash; if
appropriate, you are directed to the page where the alternative implementation is discussed. Chapter 10 expands on this chapter, exploring control flow commands and
more advanced aspects of programming the Bourne Again Shell. The bash home page
is at www.gnu.org/software/bash. The bash info page is a complete Bourne Again Shell
reference.
The Bourne Again Shell (bash) and the TC Shell (tcsh) are command interpreters and
high-level programming languages. As command interpreters, they process commands you enter on the command line in response to a prompt. When you use the
shell as a programming language, it processes commands stored in files called shell
scripts. Like other languages, shells have variables and control flow commands (e.g.,
for loops and if statements).
When you use a shell as a command interpreter, you can customize the environment
you work in. You can make the prompt display the name of the working directory,
create a function or an alias for cp that keeps it from overwriting certain kinds of files,
take advantage of keyword variables to change aspects of how the shell works, and
so on. You can also write shell scripts that do your bidding—anything from a oneline script that stores a long, complex command to a longer script that runs a set of
reports, prints them, and mails you a reminder when the job is done. More complex
shell scripts are themselves programs; they do not just run other programs.
Chapter 10 has some examples of these types of scripts.
Most system shell scripts are written to run under bash (or dash; next page). If you
will ever work in single-user/recovery mode—when you boot the system or perform
system maintenance, administration, or repair work, for example—it is a good idea
to become familiar with this shell.
This chapter expands on the interactive features of the shell described in Chapter 5,
explains how to create and run simple shell scripts, discusses job control, talks
about locale, introduces the basic aspects of shell programming, talks about history
and aliases, and describes command-line expansion. Chapter 9 covers interactive
use of the TC Shell and TC Shell programming, and Chapter 10 presents some more
challenging shell programming problems.

Background
bash Shell

The Bourne Again Shell is based on the Bourne Shell (an early UNIX shell; this book
refers to it as the original Bourne Shell to avoid confusion), which was written by
Steve Bourne of AT&T’s Bell Laboratories. Over the years the original Bourne Shell
has been expanded, but it remains the basic shell provided with many commercial
versions of UNIX.

Background
sh Shell

277

Because of its long and successful history, the original Bourne Shell has been used to
write many of the shell scripts that help manage UNIX systems. Some of these scripts
appear in Linux as Bourne Again Shell scripts. Although the Bourne Again Shell
includes many extensions and features not found in the original Bourne Shell, bash
maintains compatibility with the original Bourne Shell so you can run Bourne Shell
scripts under bash. On UNIX systems the original Bourne Shell is named sh.
On many Linux systems sh is a symbolic link to bash or dash, ensuring scripts that
require the presence of the Bourne Shell still run. When called as sh, bash does its best
to emulate the original Bourne Shell. Under OS X, sh is a copy of bash.

dash Shell

The bash executable file is almost 900 kilobytes, has many features, and is well suited
as a user login shell. The dash (Debian Almquist) shell is about 100 kilobytes, offers
Bourne Shell compatibility for shell scripts (noninteractive use), and because of its
size, can load and execute shell scripts much more quickly than bash.

Korn Shell

The Korn Shell (ksh), written by David Korn, ran on System V UNIX. This shell
extended many features of the original Bourne Shell and added many new features.
Some features of the Bourne Again Shell, such as command aliases and commandline editing, are based on similar features from the Korn Shell.

POSIX

The POSIX (Portable Operating System Interface) family of related standards is being
developed by PASC (IEEE’s Portable Application Standards Committee;
www.pasc.org/plato). A comprehensive FAQ on POSIX, including many links,
appears at www.opengroup.org/austin/papers/posix_faq.html.
POSIX standard 1003.2 describes shell functionality. The Bourne Again Shell provides
the features that match the requirements of this standard. Efforts are under way to
make the Bourne Again Shell fully comply with the POSIX standard. In the meantime,
if you invoke bash with the ––posix option, the behavior of the Bourne Again Shell will
closely match the POSIX requirements.

chsh: changes your login shell
tip The person who sets up your account determines which shell you use when you first log in on the
system or when you open a terminal emulator window in a GUI environment. Under most Linux
systems, bash is the default shell. You can run any shell you like after you are logged in. Enter
the name of the shell you want to use (bash, tcsh, or another shell) and press RETURN; the next
prompt will be that of the new shell. Give an exit command to return to the previous shell. Because
shells you call in this manner are nested (one runs on top of the other), you will be able to log out
only from your original shell. When you have nested several shells, keep giving exit commands
until you reach your original shell. You will then be able to log out.
The chsh utility changes your login shell more permanently. First give the command chsh. In
response to the prompts, enter your password and the absolute pathname of the shell you want
to use (/bin/bash, /bin/tcsh, or the pathname of another shell). When you change your login shell
in this manner using a terminal emulator under a GUI, subsequent terminal emulator windows
might not reflect the change until you log out of the system and log back in. See page 371 for an
example of how to use chsh.

278 Chapter 8 The Bourne Again Shell (bash)

Startup Files
When a shell starts, it runs startup files to initialize itself. Which files the shell runs
depends on whether it is a login shell, an interactive shell that is not a login shell (give
the command bash to run one of these shells), or a noninteractive shell (one used to
execute a shell script). You must have read access to a startup file to execute the commands in it. Typically Linux distributions put appropriate commands in some of
these files. This section covers bash startup files. See page 372 for information on tcsh
startup files and page 1050 for information on startup files under Mac OS X.

Login Shells
A login shell is the first shell that displays a prompt when you log in on a system from
the system console or a virtual console, remotely using ssh or another program, or by
another means. When you are running a GUI and open a terminal emulator such as
gnome-terminal, you are not logging in on the system (you do not provide your username
and password), so the shell the emulator displays is (usually) not a login shell; it is an
interactive nonlogin shell (next page). Login shells are, by their nature, interactive. See
“bash versus –bash” on page 459 for a way to tell which type of shell you are running.
This section describes the startup files that are executed by login shells and shells that
you start with the bash ––login option.
/etc/profile

The shell first executes the commands in /etc/profile, establishing systemwide default
characteristics for users running bash. In addition to executing the commands it
holds, some versions of profile execute the commands within each of the files with a
.sh filename extension in the /etc/profile.d directory. This setup allows a user working with root privileges to modify the commands profile runs without changing the
profile file itself. Because profile can be replaced when the system is updated, making
changes to files in the profile.d directory ensures the changes will remain when the
system is updated.

Set environment variables for all users in /etc/profile or in a
/etc/profile.d

*.sh file in

tip Setting and exporting a variable in /etc/profile or in a file with a .sh filename extension in the
/etc/profile.d directory makes that variable available to every user’s login shell. Variables that
are exported (placed in the environment) are also available to all interactive and noninteractive
subshells of the login shell.
.bash_profile,
.bash_login,
and .profile

Next the shell looks for ~/.bash_profile, ~/.bash_login, or ~/.profile (~/ is shorthand
for your home directory), in that order, executing the commands in the first of these
files it finds. You can put commands in one of these files to override the defaults set
in /etc/profile.
By default, a typical Linux distribution sets up new accounts with ~/.bash_profile
and ~/.bashrc files. The default ~/.bash_profile file calls ~/.bashrc, which calls
/etc/bashrc.

Startup Files 279
.bash_logout

When you log out, bash executes commands in the ~/.bash_logout file. This file often
holds commands that clean up after a session, such as those that remove temporary files.

Interactive Nonlogin Shells
The commands in the preceding startup files are not executed by interactive, nonlogin
shells. However, these shells inherit from the login shell variables that are declared
and exported in these startup files.
.bashrc

/etc/bashrc

An interactive nonlogin shell executes commands in the ~/.bashrc file. The default
~/.bashrc file calls /etc/bashrc.
Although not called by bash directly, many ~/.bashrc files call /etc/bashrc.

Noninteractive Shells
The commands in the previously described startup files are not executed by noninteractive shells, such as those that run shell scripts. However, if these shells are forked
by a login shell, they inherit variables that are declared and exported in these startup
files. Specifically, crontab files (page 763) do not inherit variables from startup files.
BASH_ENV

Noninteractive shells look for the environment variable BASH_ENV (or ENV if the
shell is called as sh) and execute commands in the file named by this variable.

Setting Up Startup Files
Although many startup files and types of shells exist, usually all you need are the
.bash_profile and .bashrc files in your home directory. Commands similar to the
following in .bash_profile run commands from .bashrc for login shells (when
.bashrc exists). With this setup, the commands in .bashrc are executed by login
and nonlogin shells.
if [ -f ~/.bashrc ]; then . ~/.bashrc; fi

The [ –f ~/.bashrc ] tests whether the file named .bashrc in your home directory exists.
See pages 419, 422, and 978 for more information on test and its synonym [ ]. See
page 280 for information on the . (dot) builtin.

Set PATH in .bash_profile
tip Because commands in .bashrc might be executed many times, and because subshells inherit environment (exported) variables, it is a good idea to put commands that add to existing variables in
the .bash_profile file. For example, the following command adds the bin subdirectory of the home
directory to PATH (page 308) and should go in .bash_profile:
PATH=$PATH:$HOME/bin

When you put this command in .bash_profile and not in .bashrc, the string is added to the PATH
variable only once, when you log in.
Modifying a variable in .bash_profile causes changes you make in an interactive session to
propagate to subshells. In contrast, modifying a variable in .bashrc overrides changes inherited
from a parent shell.

280 Chapter 8 The Bourne Again Shell (bash)

Sample .bash_profile and .bashrc files follow. Some commands used in these files are
not covered until later in this chapter. In any startup file, you must place in the environment (export) those variables and functions that you want to be available to child
processes. For more information refer to “Environment, Environment Variables, and
Inheritance” on page 468.
$ cat ~/.bash_profile
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
PATH=$PATH:/usr/local/bin
export PS1='[\h \W \!]\$ '

# Read local startup file if it exists
# Add /usr/local/bin to PATH
# Set prompt

The first command in the preceding .bash_profile file executes the commands in the
user’s .bashrc file if it exists. The next command adds to the PATH variable
(page 308). Typically PATH is set and exported in /etc/profile, so it does not need to
be exported in a user’s startup file. The final command sets and exports PS1
(page 309), which controls the user’s prompt.
The first command in the .bashrc file shown below executes the commands in the
/etc/bashrc file if it exists. Next the file sets noclobber (page 139), unsets MAILCHECK
(page 309), exports LANG (page 314) and VIMINIT (for vim initialization; page 195),
and defines several aliases. The final command defines a function (page 346) that swaps
the names of two files.
$ cat ~/.bashrc
if [ -f /etc/bashrc ]; then
source /etc/bashrc
fi

# read global startup file if it exists

set -o noclobber
unset MAILCHECK
export LANG=C
export VIMINIT='set ai aw'
alias df='df -h'
alias rm='rm -i'
alias lt='ls -ltrh | tail'
alias h='history | tail'
alias ch='chmod 755 '

#
#
#
#
#
#

prevent overwriting files
turn off "you have new mail" notice
set LANG variable
set vim options
set up aliases
always do interactive rm's

function switch() {
local tmp=$$switch
mv "$1" $tmp
mv "$2" "$1"
mv $tmp "$2"
}

# a function to exchange
# the names of two files

. (Dot) or source: Runs a Startup File in the Current Shell
After you edit a startup file such as .bashrc, you do not have to log out and log in
again to put the changes into effect. Instead, you can run the startup file using the

Commands That Are Symbols 281

. (dot) or source builtin (they are the same command under bash; only source is available under tcsh [page 410]). As with other commands, the . must be followed by a
SPACE on the command line. Using . or source is similar to running a shell script, except
these commands run the script as part of the current process. Consequently, when
you use . or source to run a script, changes you make to variables from within the
script affect the shell you run the script from. If you ran a startup file as a regular shell
script and did not use the . or source builtin, the variables created in the startup file
would remain in effect only in the subshell running the script—not in the shell you
ran the script from. You can use the . or source command to run any shell script—not
just a startup file—but undesirable side effects (such as changes in the values of shell
variables you rely on) might occur. For more information refer to “Environment,
Environment Variables, and Inheritance” on page 468.
In the following example, .bashrc sets several variables and sets PS1, the bash
prompt, to the name of the host. The . builtin puts the new values into effect.
$ cat ~/.bashrc
export TERM=xterm
# set
export PS1="$(hostname -f): " # set
export CDPATH=:$HOME
# add
stty kill '^u'
# set

the terminal type
the prompt string
HOME to CDPATH string
kill line to control-u

$ . ~/.bashrc
guava:

Commands That Are Symbols
The Bourne Again Shell uses the symbols (, ), [, ], and $ in a variety of ways. To
minimize confusion, Table 8-1 lists the most common use of each of these symbols
and the page on which it is discussed.

Table 8-1

Builtin commands that are symbols

Symbol

Command

()

Subshell (page 292)

$( )

Command substitution (page 361)

(( ))

Arithmetic evaluation; a synonym for let (use when the enclosed value
contains an equal sign; page 492)

$(( ))

Arithmetic expansion (not for use with an enclosed equal sign; page 359)

[]

The test command (pages 419, 422, and 978)

[[ ]]

Conditional expression; similar to [ ] but adds string comparisons (page 493)

282 Chapter 8 The Bourne Again Shell (bash)

Redirecting Standard Error
Chapter 5 covered the concept of standard output and explained how to redirect
standard output of a command. In addition to standard output, commands can send
output to standard error. A command might send error messages to standard error
to keep them from getting mixed up with the information it sends to standard output.
Just as it does with standard output, by default the shell directs standard error to the
screen. Unless you redirect one or the other, you might not know the difference
between the output a command sends to standard output and the output it sends to
standard error. One difference is that the system buffers standard output but does not
buffer standard error. This section describes the syntax used by bash to redirect standard error and to distinguish between standard output and standard error. See
page 379 if you are using tcsh.
File descriptors

A file descriptor is the place a program sends its output to and gets its input from.
When you execute a program, the shell opens three file descriptors for the program:
0 (standard input), 1 (standard output), and 2 (standard error). The redirect output
symbol (> [page 136]) is shorthand for 1>, which tells the shell to redirect standard
output. Similarly < (page 137) is short for 0<, which redirects standard input. The
symbols 2> redirect standard error. For more information refer to “File Descriptors”
on page 452.
The following examples demonstrate how to redirect standard output and standard
error to different files and to the same file. When you run the cat utility with the name
of a file that does not exist and the name of a file that does exist, cat sends an error
message to standard error and copies the file that does exist to standard output.
Unless you redirect them, both messages appear on the screen.
$ cat y
This is y.
$ cat x
cat: x: No such file or directory
$ cat x y
cat: x: No such file or directory
This is y.

When you redirect standard output of a command, output sent to standard error is
not affected and still appears on the screen.
$ cat x y > hold
cat: x: No such file or directory
$ cat hold
This is y.

Similarly, when you send standard output through a pipeline, standard error is not
affected. The following example sends standard output of cat through a pipeline to
tr (page 987), which in this example converts lowercase characters to uppercase. The

Redirecting Standard Error

283

text that cat sends to standard error is not translated because it goes directly to the
screen rather than through the pipeline.
$ cat x y | tr "[a-z]" "[A-Z]"
cat: x: No such file or directory
THIS IS Y.

The following example redirects standard output and standard error to different files.
The shell redirects standard output (file descriptor 1) to the filename following 1>.
You can specify > in place of 1>. The shell redirects standard error (file descriptor 2)
to the filename following 2>.
$ cat x y 1> hold1 2> hold2
$ cat hold1
This is y.
$ cat hold2
cat: x: No such file or directory
Combining
standard output and
standard error

In the next example, the &> token redirects standard output and standard error to
a single file. The >& token performs the same function under tcsh (page 379).
$ cat x y &> hold
$ cat hold
cat: x: No such file or directory
This is y.

Duplicating a file
descriptor

In the next example, first 1> redirects standard output to hold, and then 2>&1 declares
file descriptor 2 to be a duplicate of file descriptor 1. As a result, both standard output
and standard error are redirected to hold.
$ cat x y 1> hold 2>&1
$ cat hold
cat: x: No such file or directory
This is y.

In this case, 1> hold precedes 2>&1. If they had appeared in the opposite order, standard error would have been made a duplicate of standard output before standard
output was redirected to hold. Only standard output would have been redirected
to hold in that case.
Sending errors
through a pipeline

The next example declares file descriptor 2 to be a duplicate of file descriptor 1 and
sends the output for file descriptor 1 (as well as file descriptor 2) through a pipeline
to the tr command.
$ cat x y 2>&1 | tr "[a-z]" "[A-Z]"
CAT: X: NO SUCH FILE OR DIRECTORY
THIS IS Y.

The token |& is shorthand for 2>&1 |:
$ cat x y |& tr "[a-z]" "[A-Z]"
CAT: X: NO SUCH FILE OR DIRECTORY
THIS IS Y.

284 Chapter 8 The Bourne Again Shell (bash)
Sending errors to
standard error

You can use 1>&2 (or simply >&2; the 1 is not required) to redirect standard output
of a command to standard error. Shell scripts use this technique to send the output
of echo to standard error. In the following script, standard output of the first echo is
redirected to standard error:
$ cat message_demo
echo This is an error message. 1>&2
echo This is not an error message.

If you redirect standard output of message_demo, error messages such as the one produced by the first echo appear on the screen because you have not redirected standard
error. Because standard output of a shell script is frequently redirected to a file, you
can use this technique to display on the screen any error messages generated by the
script. The lnks script (page 427) uses this technique. You can use the exec builtin to
create additional file descriptors and to redirect standard input, standard output, and
standard error of a shell script from within the script (page 482).
The Bourne Again Shell supports the redirection operators shown in Table 8-2.

Table 8-2

Redirection operators

Operator

Meaning

< filename

Redirects standard input from filename.

> filename

Redirects standard output to filename unless filename exists and noclobber
(page 139) is set. If noclobber is not set, this redirection creates filename if it
does not exist and overwrites it if it does exist.

>! filename

Redirects standard output to filename, even if the file exists and noclobber
(page 139) is set.

>> filename

Redirects and appends standard output to filename; creates filename if it does
not exist.

&> filename

Redirects standard output and standard error to filename.

<&m

Duplicates standard input from file descriptor m (page 453).

[n] >&m

Duplicates standard output or file descriptor n if specified from file descriptor
m (page 453).

[n]<&–

Closes standard input or file descriptor n if specified (page 453).

[n] >&–

Closes standard output or file descriptor n if specified.

Writing and Executing a Simple Shell Script
A shell script is a file that holds commands the shell can execute. The commands in
a shell script can be any commands you can enter in response to a shell prompt. For

Writing and Executing a Simple Shell Script 285

example, a command in a shell script might run a utility, a compiled program, or
another shell script. Like the commands you give on the command line, a command
in a shell script can use ambiguous file references and can have its input or output
redirected from or to a file or sent through a pipeline. You can also use pipelines and
redirection with the input and output of the script itself.
In addition to the commands you would ordinarily use on the command line, control
flow commands (also called control structures) find most of their use in shell scripts.
This group of commands enables you to alter the order of execution of commands
in a script in the same way you would alter the order of execution of statements using
a structured programming language. Refer to “Control Structures” on page 418
(bash) and page 398 (tcsh) for specifics.
The shell interprets and executes the commands in a shell script, one after another.
Thus a shell script enables you to simply and quickly initiate a complex series of tasks
or a repetitive procedure.

chmod: Makes a File Executable
To execute a shell script by giving its name as a command, you must have permission
to read and execute the file that contains the script (refer to “Access Permissions” on
page 98). Read permission enables you to read the file that holds the script. Execute
permission tells the system that the owner, group, and/or public has permission to
execute the file; it implies the content of the file is executable.
When you create a shell script using an editor, the file does not typically have its
execute permission set. The following example shows a file named whoson that
contains a shell script:
$ cat whoson
date
echo "Users Currently Logged In"
who
$ ./whoson
bash: ./whoson: Permission denied

You cannot execute whoson by giving its name as a command because you do not have
execute permission for the file. The system does not recognize whoson as an executable
file and issues the error message Permission denied when you try to execute it. (See the
tip on the next page if the shell issues a command not found error message.) When you
give the filename as an argument to bash (bash whoson), bash assumes the argument is
a shell script and executes it. In this case bash is executable, and whoson is an argument
that bash executes, so you do not need execute permission to whoson. You must have
read permission.
The chmod utility changes the access privileges associated with a file. Figure 8-1 (next
page) shows ls with the –l option displaying the access privileges of whoson before
and after chmod gives execute permission to the file’s owner.

286 Chapter 8 The Bourne Again Shell (bash)

$ ls -l whoson
-rw-rw-r--. 1 max pubs 40 05-24 11:30 whoson
$ chmod u+x whoson
$ ls -l whoson
-rwxrw-r--. 1 max pubs 40 05-24 11:30 whoson
$ ./whoson
Sat May 25 11:40:49 PDT 2013
Users Currently Logged In
zach
pts/7
2013-05-23
hls
pts/1
2013-05-24
sam
pts/12
2013-05-24
max
pts/4
2013-05-24

Figure 8-1

18:17
09:59
06:29 (guava)
09:08

Using chmod to make a shell script executable

The first ls displays a hyphen (–) as the fourth character, indicating the owner does
not have permission to execute the file. Next chmod gives the owner execute permission: u+x causes chmod to add (+) execute permission (x) for the owner (u). (The u
stands for user, although it means the owner of the file.) The second argument is the
name of the file. The second ls shows an x in the fourth position, indicating the owner
has execute permission.

Command not found?
tip If you give the name of a shell script as a command without including the leading ./, the shell
typically displays the following error message:
$ whoson
bash: whoson: command not found

This message indicates the shell is not set up to search for executable files in the working directory.
Enter this command instead:
$ ./whoson

The ./ tells the shell explicitly to look for an executable file in the working directory. Although not
recommended for security reasons, you can change the PATH variable so the shell searches the
working directory automatically; see PATH on page 308.

If other users will execute the file, you must also change group and/or public access permissions for the file. Any user must have execute access to use the file’s name as a
command. If the file is a shell script, the user trying to execute the file must have read
access to the file as well. You do not need read access to execute a binary executable
(compiled program).
The final command in Figure 8-1 shows the shell executing the file when its name is
given as a command. For more information refer to “Access Permissions” on page 98
as well as the discussions of ls and chmod in Part VII.

Writing and Executing a Simple Shell Script 287

#! Specifies a Shell
You can put a special sequence of characters on the first line of a shell script to tell
the operating system which shell (or other program) should execute the file and
which options you want to include. Because the operating system checks the initial
characters of a program before attempting to execute it using exec, these characters
save the system from making an unsuccessful attempt. If #! (sometimes said out loud
as hashbang or shebang) are the first two characters of a script, the system interprets
the characters that follow as the absolute pathname of the program that is to execute
the script. This pathname can point to any program, not just a shell, and can be useful
if you have a script you want to run with a shell other than the shell you are running
the script from. The following example specifies that bash should run the script:
$ cat bash_script
#!/bin/bash
echo "This is a Bourne Again Shell script."

The bash –e and –u options can make your programs less fractious
tip The bash –e (errexit) option causes bash to exit when a simple command (e.g., not a control
structure) fails. The bash –u (nounset) option causes bash to display a message and exit when
it tries to expand an unset variable. See Table 8-13 on page 351 for details. It is easy to turn these
options on in the !# line of a bash script:
#!/bin/bash -eu

These options can prevent disaster when you mistype lines like this in a script:
MYDIR=/tmp/$$
cd $MYDIr; rm -rf .

During development, you can also specify the –x option in the !# line to turn on debugging
(page 430).

The next example runs under Perl and can be run directly from the shell without
explicitly calling Perl on the command line:
$ cat ./perl_script.pl
#!/usr/bin/perl -w
print "This is a Perl script.\n";
$ ./perl_script.pl
This is a Perl script.

The next example shows a script that should be executed by tcsh:
$ cat tcsh_script
#!/bin/tcsh
echo "This is a tcsh script."
set person = zach
echo "person is $person"

Because of the #! line, the operating system ensures that tcsh executes the script no
matter which shell you run it from.

288 Chapter 8 The Bourne Again Shell (bash)

You can use ps –f within a shell script to display the name of the program that is
executing the script. The three lines that ps displays in the following example show
the process running the parent bash shell, the process running the tcsh script, and
the process running the ps command:
$ cat tcsh_script2
#!/bin/tcsh
ps -f
$ ./tcsh_script2
UID
PID PPID
max
3031 3030
max
9358 3031
max
9375 9358

C
0
0
0

STIME
Nov16
21:13
21:13

TTY
pts/4
pts/4
pts/4

TIME
00:00:00
00:00:00
00:00:00

CMD
-bash
/bin/tcsh ./tcsh_script2
ps -f

If you do not follow #! with the name of an executable program, the shell reports it
cannot find the program you asked it to run. You can optionally follow #! with SPACEs
before the name of the program. If you omit the #! line and try to run, for example,
a tcsh script from bash, the script will run under bash and might generate error messages or not run properly. See page 666 for an example of a stand-alone sed script
that uses #!.

# Begins a Comment
Comments make shell scripts and all code easier to read and maintain by you and
others. The comment syntax is common to both the Bourne Again Shell and the TC
Shell.
If a hashmark (#) in the first character position of the first line of a script is not immediately followed by an exclamation point (!) or if a hashmark occurs in any other
location in a script, the shell interprets it as the beginning of a comment. The shell
then ignores everything between the hashmark and the end of the line (the next NEWLINE
character).

Executing a Shell Script
fork and exec
system calls

As discussed earlier, you can execute commands in a shell script file that you do not
have execute permission for by using a bash command to exec a shell that runs the
script directly. In the following example, bash creates a new shell that takes its input
from the file named whoson:
$ bash whoson

Because the bash command expects to read a file containing commands, you do not
need execute permission for whoson. (You do need read permission.) Even though bash
reads and executes the commands in whoson, standard input, standard output, and
standard error remain directed from/to the terminal. Alternatively, you can supply
commands to bash using standard input:

Control Operators: Separate and Group Commands 289
$ bash < whoson

Although you can use bash to execute a shell script, these techniques cause the script
to run more slowly than if you give yourself execute permission and directly invoke the
script. Users typically prefer to make the file executable and run the script by typing its
name on the command line. It is also easier to type the name, and this practice is consistent with the way other kinds of programs are invoked (so you do not need to know
whether you are running a shell script or an executable file). However, if bash is not
your interactive shell or if you want to see how the script runs with different shells, you
might want to run a script as an argument to bash or tcsh.

sh does not call the original Bourne Shell
caution The original Bourne Shell was invoked with the command sh. Although you can call bash or, on
some systems dash, with an sh command, it is not the original Bourne Shell. The sh command
(/bin/sh) is a symbolic link to /bin/bash or /bin/dash, so it is simply another name for the bash or
dash command. When you call bash using the command sh, bash tries to mimic the behavior
of the original Bourne Shell as closely as possible—but it does not always succeed.

Control Operators: Separate and Group Commands
Whether you give the shell commands interactively or write a shell script, you must
separate commands from one another. This section, which applies to the Bourne
Again and TC Shells, reviews the ways to separate commands that were covered in
Chapter 5 and introduces a few new ones.
The tokens that separate, terminate, and group commands are called control
operators. Each of the control operators implies line continuation as explained
on page 499. Following is a list of the control operators and the page each is
discussed on.
• ; Command separator (next page)
•

NEWLINE

Command initiator (next page)

• & Background task (next page)
• | Pipeline (next page)
• |& Standard error pipeline (page 283)
• () Groups commands (page 292)
• || Boolean OR (page 292)
• && Boolean AND (page 292)
• ;; Case terminator (page 442)

290 Chapter 8 The Bourne Again Shell (bash)

; and NEWLINE Separate Commands
The NEWLINE character is a unique control operator because it initiates execution of the
command preceding it. You have seen this behavior throughout this book each time
you press the RETURN key at the end of a command line.
The semicolon (;) is a control operator that does not initiate execution of a command
and does not change any aspect of how the command functions. You can execute a
series of commands sequentially by entering them on a single command line and separating each from the next using a semicolon (;). You initiate execution of the
sequence of commands by pressing RETURN:
$ x ; y ; z

If x, y, and z are commands, the preceding command line yields the same results as
the next three commands. The difference is that in the next example the shell issues
a prompt after each of the commands finishes executing, whereas the preceding command line causes the shell to issue a prompt only after z is complete:
$ x
$ y
$ z
Whitespace

Although the whitespace (SPACEs and/or TABs) around the semicolons in the previous
example makes the command line easier to read, it is not necessary. None of the control
operators needs to be surrounded by whitespace.

| and & Separate Commands and Do Something Else
The pipe symbol (|) and the background task symbol (&) are also control operators.
They do not start execution of a command but do change some aspect of how the
command functions. The pipe symbol alters the source of standard input or the destination of standard output. The background task symbol causes the shell to execute
the task in the background and display a prompt immediately so you can continue
working on other tasks.
Each of the following command lines initiates a pipeline (page 141) comprising three
simple commands:
$ x | y | z
$ ls -l | grep tmp | less

In the first pipeline, the shell redirects standard output of x to standard input of y and
redirects y’s standard output to z’s standard input. Because it runs the entire pipeline
in the foreground, the shell does not display a prompt until task z runs to completion:
z does not finish until y finishes, and y does not finish until x finishes. In the second
pipeline, x is an ls –l command, y is grep tmp, and z is the pager less. The shell displays a long (wide) listing of the files in the working directory that contain the string
tmp, sent via a pipeline through less.

Control Operators: Separate and Group Commands 291

The next command line executes a list (page 145) running the simple commands d
and e in the background and the simple command f in the foreground:
$ d & e & f
[1] 14271
[2] 14272

The shell displays the job number between brackets and the PID number for each
process running in the background. It displays a prompt as soon as f finishes, which
might be before d or e finishes.
Before displaying a prompt for a new command, the shell checks whether any background jobs have completed. For each completed job, the shell displays its job
number, the word Done, and the command line that invoked the job; the shell then
displays a prompt. When the job numbers are listed, the number of the last job started
is followed by a + character, and the job number of the previous job is followed by a
– character. Other job numbers are followed by a SPACE character. After running the
last command, the shell displays the following lines before issuing a prompt:
[1][2]+

Done
Done

d
e

The next command line executes a list that runs three commands as background jobs.
The shell displays a shell prompt immediately:
$ d
[1]
[2]
[3]

& e & f &
14290
14291
14292

The next example uses a pipe symbol to send the output from one command to the
next command and an ampersand (&) to run the entire pipeline in the background.
Again the shell displays the prompt immediately. The shell commands that are part
of a pipeline form a single job. That is, the shell treats a pipeline as a single job, no
matter how many commands are connected using pipe (|) symbols or how complex
they are. The Bourne Again Shell reports only one process in the background
(although there are three):
$ d | e | f &
[1] 14295

The TC Shell shows three processes (all belonging to job 1) in the background:
tcsh $ d | e | f &
[1] 14302 14304 14306

&& and || Boolean Control Operators
The && (AND) and || (OR) Boolean operators are called short-circuiting control
operators. If the result of using one of these operators can be decided by looking
only at the left operand, the right operand is not evaluated. The result of a Boolean
operation is either 0 (true) or 1 (false).

292 Chapter 8 The Bourne Again Shell (bash)
&&

The && operator causes the shell to test the exit status of the command preceding
it. If the command succeeds, bash executes the next command; otherwise, it skips the
next command. You can use this construct to execute commands conditionally.
$ mkdir bkup && cp -r src bkup

This compound command creates the directory bkup. If mkdir succeeds, the content
of directory src is copied recursively to bkup.
||

The || control operator also causes bash to test the exit status of the first command
but has the opposite effect: The remaining command(s) are executed only if the first
command failed (that is, exited with nonzero status).
$ mkdir bkup || echo "mkdir of bkup failed" >> /tmp/log

The exit status of a command list is the exit status of the last command in the list.
You can group lists with parentheses. For example, you could combine the previous
two examples as
$ (mkdir bkup && cp -r src bkup) || echo "mkdir failed" >> /tmp/log

In the absence of parentheses, && and || have equal precedence and are grouped from
left to right. The following examples use the true and false utilities. These utilities do
nothing and return true (0) and false (1) exit statuses, respectively:
$ false; echo $?
1

The $? variable holds the exit status of the preceding command (page 465). The next
two commands yield an exit status of 1 ( false):
$
$
1
$
$
1

true || false && false
echo $?
(true || false) && false
echo $?

Similarly the next two commands yield an exit status of 0 (true):
$
$
0
$
$
0

false && false || true
echo $?
(false && false) || true
echo $?

See “Lists” on page 145 for more examples.

optional

( ) Groups Commands
You can use the parentheses control operator to group commands. When you use this
technique, the shell creates a copy of itself, called a subshell, for each group. It treats
each group of commands as a list and creates a new process to execute each command

Control Operators: Separate and Group Commands 293

(refer to “Process Structure” on page 323 for more information on creating subshells). Each subshell has its own environment, meaning it has its own set of variables
whose values can differ from those in other subshells.
The following command line executes commands a and b sequentially in the background while executing c in the background. The shell displays a prompt
immediately.
$ (a ; b) & c &
[1] 15520
[2] 15521

The preceding example differs from the earlier example d & e & f & in that tasks a
and b are initiated sequentially, not concurrently.
Similarly the following command line executes a and b sequentially in the background and, at the same time, executes c and d sequentially in the background. The
subshell running a and b and the subshell running c and d run concurrently. The shell
displays a prompt immediately.
$ (a ; b) & (c ; d) &
[1] 15528
[2] 15529

The next script copies one directory to another. The second pair of parentheses creates
a subshell to run the commands following the pipe symbol. Because of these parentheses,
the output of the first tar command is available for the second tar command, despite the
intervening cd command. Without the parentheses, the output of the first tar command
would be sent to cd and lost because cd does not process standard input. The shell variables $1 and $2 hold the first and second command-line arguments (page 459),
respectively. The first pair of parentheses, which creates a subshell to run the first two
commands, allows users to call cpdir with relative pathnames. Without them, the first
cd command would change the working directory of the script (and consequently the
working directory of the second cd command). With them, only the working directory
of the subshell is changed.
$ cat cpdir
(cd $1 ; tar -cf - . ) | (cd $2 ; tar -xvf - )
$ ./cpdir /home/max/sources /home/max/memo/biblio

The cpdir command line copies the files and directories in the /home/max/sources
directory to the directory named /home/max/memo/biblio. Running this shell script
is the same as using cp with the –r option. See page 754 for more information on cp.

\ Continues a Command
Although it is not a control operator, you can use a backslash (\) character in the middle of commands. When you enter a long command line and the cursor reaches the
right side of the screen, you can use a backslash to continue the command on the next
line. The backslash quotes, or escapes, the NEWLINE character that follows it so the shell
does not treat the NEWLINE as a control operator. Enclosing a backslash within single
quotation marks or preceding it with another backslash turns off the power of a

294 Chapter 8 The Bourne Again Shell (bash)

backslash to quote special characters such as NEWLINE (not tcsh; see prompt2 on
page 394). Enclosing a backslash within double quotation marks has no effect on the
power of the backslash (not tcsh).
Although you can break a line in the middle of a word (token), it is typically simpler,
and makes code easier to read, if you break a line immediately before or after
whitespace.

optional You can enter a RETURN in the middle of a quoted string on a command line without
using a backslash. (See prompt2 on page 394 for tcsh behavior.) The
you enter will then be part of the string:

NEWLINE (RETURN)

$ echo "Please enter the three values
> required to complete the transaction."
Please enter the three values
required to complete the transaction.

In the three examples in this section, the shell does not interpret RETURN as a control
operator because it occurs within a quoted string. The greater than sign (>) is a secondary prompt (PS2; page 311) indicating the shell is waiting for you to continue the
unfinished command. In the next example, the first RETURN is quoted (escaped) so the
shell treats it as a separator and does not interpret it literally.
$ echo "Please enter the three values \
> required to complete the transaction."
Please enter the three values required to complete the transaction.

Single quotation marks cause the shell to interpret a backslash literally:
$ echo 'Please enter the three values \
> required to complete the transaction.'
Please enter the three values \
required to complete the transaction.

Job Control
As explained on page 146, a job is another name for a process running a pipeline
(which can be a simple command). You run one or more jobs whenever you give the
shell a command. For example, if you type date on the command line and press RETURN,
you have run a job. You can also create several jobs on a single command line by
entering several simple commands separated by control operators (& in the following
example):
$ find . -print | sort | lpr & grep -l max /tmp/* > maxfiles &
[1] 18839
[2] 18876

Job Control

295

The portion of the command line up to the first & is one job—a pipeline comprising
three simple commands connected by pipe symbols: find, sort, and lpr. The second job
is a pipeline that is a simple command (grep). The & characters following each pipeline put each job in the background, so bash does not wait for them to complete
before displaying a prompt.
Using job control you can move jobs from the foreground to the background, and
vice versa; temporarily stop jobs; and list jobs that are running in the background or
stopped.

jobs: Lists Jobs
The jobs builtin lists all background jobs. In the following example, the sleep command
runs in the background and creates a background job that jobs reports on:
$ sleep 60 &
[1] 7809
$ jobs
[1] + Running

sleep 60 &

fg: Brings a Job to the Foreground
The shell assigns a job number to each job you run in the background. For each job
run in the background, the shell lists the job number and PID number immediately,
just before it issues a prompt:
$ gnome-calculator &
[1] 1246
$ date &
[2] 1247
$ Sat Dec 7 11:44:40 PST 2013
[2]+ Done
date
$ find /usr -name ace -print > findout &
[2] 1269
$ jobs
[1]- Running
gnome-calculator &
[2]+ Running
find /usr -name ace -print > findout &

The shell discards job numbers when a job is finished and reuses discarded job numbers.
When you start or put a job in the background, the shell assigns a job number that is
one more than the highest job number in use.
In the preceding example, the jobs command lists the first job, gnome-calculator, as
job 1. The date command does not appear in the jobs list because it finished before
jobs was run. Because the date command was completed before find was run, the find
command became job 2.
To move a background job to the foreground, use the fg builtin followed by the job
number. Alternatively, you can give a percent sign (%) followed by the job number
as a command. Either of the following commands moves job 2 to the foreground.

296 Chapter 8 The Bourne Again Shell (bash)

When you move a job to the foreground, the shell displays the command it is now
executing in the foreground.
$ fg 2
find /usr -name ace -print > findout

or
$ %2
find /usr -name ace -print > findout

You can also refer to a job by following the percent sign with a string that uniquely
identifies the beginning of the command line used to start the job. Instead of the preceding command, you could have used either fg %find or fg %f because both
uniquely identify job 2. If you follow the percent sign with a question mark and a
string, the string can match any part of the command line. In the preceding example,
fg %?ace would also bring job 2 to the foreground.
Often the job you wish to bring to the foreground is the only job running in the background or is the job that jobs lists with a plus (+). In these cases, calling fg without
an argument brings the job to the foreground.

Suspending a Job
Pressing the suspend key (usually CONTROL-Z) immediately suspends (temporarily stops)
the job in the foreground and displays a message that includes the word Stopped.
CONTROL-Z

[2]+

Stopped

find /usr -name ace -print > findout

For more information refer to “Moving a Job from the Foreground to the Background”
on page 147.

bg: Sends a Job to the Background
To move the foreground job to the background, you must first suspend the job
(above). You can then use the bg builtin to resume execution of the job in the
background.
$ bg
[2]+ find /usr -name ace -print > findout &

If a background job attempts to read from the terminal, the shell stops the job and
displays a message saying the job has been stopped. You must then move the job to
the foreground so it can read from the terminal.
$ (sleep 5; cat > mytext) &
[1] 1343
$ date
Sat Dec 7 11:58:20 PST 2013

Manipulating the Directory Stack 297
[1]+ Stopped
$ fg
( sleep 5; cat >mytext )
Remember to let the cat out!

( sleep 5; cat >mytext )

CONTROL-D

$

In the preceding example, the shell displays the job number and PID number of the
background job as soon as it starts, followed by a prompt. Demonstrating that you
can give a command at this point, the user gives the command date, and its output
appears on the screen. The shell waits until just before it issues a prompt (after date
has finished) to notify you that job 1 is stopped. When you give an fg command, the
shell puts the job in the foreground, and you can enter the data the command is waiting for. In this case the input needs to be terminated using CONTROL-D, which sends an
EOF (end of file) signal to cat. The shell then displays another prompt.
The shell keeps you informed about changes in the status of a job, notifying you when
a background job starts, completes, or stops, perhaps because it is waiting for input
from the terminal. The shell also lets you know when a foreground job is suspended.
Because notices about a job being run in the background can disrupt your work, the
shell delays displaying these notices until just before it displays a prompt. You can
set notify (page 353) to cause the shell to display these notices without delay.
If you try to exit from a nonlogin shell while jobs are stopped, the shell issues a
warning and does not allow you to exit. If you then use jobs to review the list of
jobs or you immediately try to exit from the shell again, the shell allows you to exit.
If huponexit (page 352) is not set (it is not set by default), stopped jobs remain
stopped and background jobs keep running in the background. If it is set, the shell
terminates these jobs.

Manipulating the Directory Stack
Both the Bourne Again Shell and the TC Shell allow you to store a list of directories
you are working with, enabling you to move easily among them. This list is referred
to as a stack. It is analogous to a stack of dinner plates: You typically add plates to
and remove plates from the top of the stack, so this type of stack is named a LIFO
(last in, first out) stack.

dirs: Displays the Stack
The dirs builtin displays the contents of the directory stack. If you call dirs when the
directory stack is empty, it displays the name of the working directory:
$ dirs
~/literature

298 Chapter 8 The Bourne Again Shell (bash)

home

sam

demo

names

literature

promo

Figure 8-2

The directory structure used in the examples

The dirs builtin uses a tilde (~) to represent the name of a user’s home directory. The
examples in the next several sections assume you are referring to the directory structure shown in Figure 8-2.

pushd: Pushes a Directory on the Stack
When you supply the pushd (push directory) builtin with one argument, it pushes the
directory specified by the argument on the stack, changes directories to the specified
directory, and displays the stack. The following example is illustrated in Figure 8-3:
$ pushd ../demo
~/demo ~/literature
$ pwd
/home/sam/demo
$ pushd ../names
~/names ~/demo ~/literature
$ pwd
/home/sam/names

When you call pushd without an argument, it swaps the top two directories on the
stack, makes the new top directory (which was the second directory) the new working directory, and displays the stack (Figure 8-4).
names
21 pushd
demo
12 pushd
literature

Figure 8-3

Creating a directory stack

Manipulating the Directory Stack 299

pushd

pushd

names

demo

names

demo

names

demo

literature

literature

literature

Figure 8-4

Using pushd to change working directories

$ pushd
~/demo ~/names ~/literature
$ pwd
/home/sam/demo

Using pushd in this way, you can easily move back and forth between two directories.
You can also use cd – to change to the previous directory, whether or not you have
explicitly created a directory stack. To access another directory in the stack, call
pushd with a numeric argument preceded by a plus sign. The directories in the stack
are numbered starting with the top directory, which is number 0. The following pushd
command continues with the previous example, changing the working directory to
literature and moving literature to the top of the stack:
$ pushd +2
~/literature ~/demo ~/names
$ pwd
/home/sam/literature

popd: Pops a Directory Off the Stack
To remove a directory from the stack, use the popd (pop directory) builtin. As the
following example and Figure 8-5 show, without an argument, popd removes the

literature
popd
demo
names

Figure 8-5

Using popd to remove a directory from the stack

300 Chapter 8 The Bourne Again Shell (bash)

top directory from the stack and changes the working directory to the new top
directory:
$ dirs
~/literature ~/demo ~/names
$ popd
~/demo ~/names
$ pwd
/home/sam/demo

To remove a directory other than the top one from the stack, use popd with a numeric
argument preceded by a plus sign. The following example removes directory number
1, demo. Removing a directory other than directory number 0 does not change the
working directory.
$ dirs
~/literature ~/demo ~/names
$ popd +1
~/literature ~/names

Parameters and Variables
Shell parameter

Within a shell, a shell parameter is associated with a value you or a shell script can
access. This section introduces the following kinds of shell parameters: user-created
variables, keyword variables, positional parameters, and special parameters.

Variables

Parameters whose names consist of letters, digits, and underscores are referred to as
variables. A variable name must start with a letter or underscore, not with a number.
Thus A76, MY_CAT, and _ _ _ X _ _ _ are valid variable names, whereas
69TH_STREET (starts with a digit) and MY-NAME (contains a hyphen) are not.

User-created
variables

Variables that you name and assign values to are user-created variables. You can
change the values of user-created variables at any time, or you can make them
readonly so that their values cannot be changed.

Shell variables
and environment
variables

By default, a variable is available only in the shell it was created in (i.e., local); this type
of variable is called a shell variable. You can use export to make a variable available in
shells spawned from the shell it was created in (i.e., global); this type of variable is called
an environment variable. One naming convention is to use mixed-case or lowercase
letters for shell variables and only uppercase letters for environment variables. Refer to
“Variables” on page 467 for more information on shell variables and environment
variables.
To declare and initialize a variable in bash, use the following syntax:
VARIABLE=value
There can be no whitespace on either side of the equal sign (=). An example follows:

Parameters and Variables 301
$ myvar=abc

Under tcsh the assignment must be preceded by the word set and the SPACEs on either
side of the equal sign are optional:
$ set myvar = abc
Declaring and
initializing a variable
for a script

The Bourne Again Shell permits you to put variable assignments at the beginning of
a command line. This type of assignment places variables in the environment of the
command shell—that is, the variable is accessible only from the program (and the
children of the program) the command runs. It is not available from the shell running
the command. The my_script shell script displays the value of TEMPDIR. The following command runs my_script with TEMPDIR set to /home/sam/temp. The echo
builtin shows that the interactive shell has no value for TEMPDIR after running
my_script. If TEMPDIR had been set in the interactive shell, running my_script in
this manner would have had no effect on its value.
$ cat my_script
echo $TEMPDIR
$ TEMPDIR=/home/sam/temp ./my_script
/home/sam/temp
$ echo $TEMPDIR
$

Keyword variables

Keyword variables have special meaning to the shell and usually have short, mnemonic names. When you start a shell (by logging in, for example), the shell inherits
several keyword variables from the environment. Among these variables are HOME,
which identifies your home directory, and PATH, which determines which directories
the shell searches and in which order to locate commands you give the shell. The shell
creates and initializes (with default values) other keyword variables when you start
it. Still other variables do not exist until you set them.
You can change the values of most keyword shell variables. It is usually not necessary
to change the values of keyword variables initialized in the /etc/profile or
/etc/csh.cshrc systemwide startup files. If you need to change the value of a bash keyword variable, do so in one of your startup files (for bash see page 278; for tcsh see
page 372). Just as you can make user-created variables environment variables, so you
can make keyword variables environment variables—a task usually done automatically in startup files. You can also make a keyword variable readonly. See page 307
for a discussion of keyword variables.

Positional and
special parameters

The names of positional and special parameters do not resemble variable names.
Most of these parameters have one-character names (for example, 1, ?, and #) and
are referenced (as are all variables) by preceding the name with a dollar sign ($1, $?,
and $#). The values of these parameters reflect different aspects of your ongoing
interaction with the shell.

302 Chapter 8 The Bourne Again Shell (bash)

Whenever you run a command, each argument on the command line becomes the
value of a positional parameter (page 458). Positional parameters enable you to
access command-line arguments, a capability you will often require when you write
shell scripts. The set builtin (page 460) enables you to assign values to positional
parameters.
Other frequently needed shell script values, such as the name of the last command
executed, the number of positional parameters, and the status of the most recently
executed command, are available as special parameters (page 463). You cannot
assign values to special parameters.

User-Created Variables
The first line in the following example declares the variable named person and initializes
it with the value max:
$ person=max
$ echo person
person
$ echo $person
max
Parameter
substitution

Because the echo builtin copies its arguments to standard output, you can use it to display the values of variables. The second line of the preceding example shows that
person does not represent max. Instead, the string person is echoed as person. The shell
substitutes the value of a variable only when you precede the name of the variable with
a dollar sign ($). Thus the command echo $person displays the value of the variable
person; it does not display $person because the shell does not pass $person to echo as
an argument. Because of the leading $, the shell recognizes that $person is the name of
a variable, substitutes the value of the variable, and passes that value to echo. The echo
builtin displays the value of the variable (not its name), never “knowing” you called it
with the name of a variable.

Quoting the $

You can prevent the shell from substituting the value of a variable by quoting the
leading $. Double quotation marks do not prevent the substitution; single quotation
marks or a backslash (\) do.
$ echo $person
max
$ echo "$person"
max
$ echo '$person'
$person
$ echo \$person
$person

SPACEs

Because they do not prevent variable substitution but do turn off the special meanings
of most other characters, double quotation marks are useful when you assign values
to variables and when you use those values. To assign a value that contains SPACEs or

Parameters and Variables 303
TABs to a variable, use double quotation marks around the value. Although double
quotation marks are not required in all cases, using them is a good habit.

$ person="max and zach"
$ echo $person
max and zach
$ person=max and zach
bash: and: command not found

When you reference a variable whose value contains TABs or multiple adjacent SPACEs,
you must use quotation marks to preserve the spacing. If you do not quote the variable, the shell collapses each string of blank characters into a single SPACE before
passing the variable to the utility:
$ person="max
and
$ echo $person
max and zach
$ echo "$person"
max
and
zach
Pathname
expansion in
assignments

zach"

When you execute a command with a variable as an argument, the shell replaces the
name of the variable with the value of the variable and passes that value to the program being executed. If the value of the variable contains a special character, such as
* or ?, the shell might expand that variable.
The first line in the following sequence of commands assigns the string max* to the
variable memo. All shells interpret special characters as special when you reference
a variable that contains an unquoted special character. In the following example, the
shell expands the value of the memo variable because it is not quoted:
$ memo=max*
$ ls
max.report
max.summary
$ echo $memo
max.report max.summary

Above, the shell expands the $memo variable to max*, expands max* to max.report
and max.summary, and passes these two values to echo. In the next example, the
Bourne Again Shell does not expand the string because bash does not perform pathname expansion (page 148) when it assigns a value to a variable.
$ echo "$memo"
max*

All shells process a command line in a specific order. Within this order bash (but not
tcsh) expands variables before it interprets commands. In the preceding echo command line, the double quotation marks quote the asterisk (*) in the expanded value
of $memo and prevent bash from performing pathname expansion on the expanded
memo variable before passing its value to the echo command.

304 Chapter 8 The Bourne Again Shell (bash)

optional
Braces around
variables

The $VARIABLE syntax is a special case of the more general syntax ${VARIABLE},
in which the variable name is enclosed by ${}. The braces insulate the variable name
from adjacent characters. Braces are necessary when catenating a variable value with
a string:
$
$
$
$

PREF=counter
WAY=$PREFclockwise
FAKE=$PREFfeit
echo $WAY $FAKE

$

The preceding example does not work as expected. Only a blank line is output
because although PREFclockwise and PREFfeit are valid variable names, they are not
initialized. By default the shell evaluates an unset variable as an empty (null) string
and displays this value (bash) or generates an error message (tcsh). To achieve the
intent of these statements, refer to the PREF variable using braces:
$ PREF=counter
$ WAY=${PREF}clockwise
$ FAKE=${PREF}feit
$ echo $WAY $FAKE
counterclockwise counterfeit

The Bourne Again Shell refers to command-line arguments using the positional
parameters $1, $2, $3, and so forth up to $9. You must use braces to refer to arguments past the ninth argument: ${10}. The name of the command is held in $0
(page 458).

unset: Removes a Variable
Unless you remove a variable, it exists as long as the shell in which it was created
exists. To remove the value of a variable but not the variable itself, assign a null value
to the variable. In the following example, set (page 460) displays a list of all variables
and their values; grep extracts the line that shows the value of person.
$ echo $person
zach
$ person=
$ echo $person
$ set | grep person
person=

You can remove a variable using the unset builtin. The following command removes
the variable person:
$ unset person
$ echo $person
$ set | grep person
$

Parameters and Variables 305

Variable Attributes
This section discusses attributes and explains how to assign attributes to variables.

readonly: Makes the Value of a Variable Permanent
You can use the readonly builtin (not in tcsh) to ensure the value of a variable cannot
be changed. The next example declares the variable person to be readonly. You must
assign a value to a variable before you declare it to be readonly; you cannot change
its value after the declaration. When you attempt to change the value of or unset a
readonly variable, the shell displays an error message:
$ person=zach
$ echo $person
zach
$ readonly person
$ person=helen
bash: person: readonly variable
$ unset person
bash: unset: person: cannot unset: readonly variable

If you use the readonly builtin without an argument, it displays a list of all readonly shell
variables. This list includes keyword variables that are automatically set as readonly as
well as keyword or user-created variables that you have declared as readonly. See the
next page for an example (readonly and declare –r produce the same output).

declare: Lists and Assigns Attributes to Variables
The declare builtin (not in tcsh) lists and sets attributes and values for shell variables.
The typeset builtin (another name for declare) performs the same function but is deprecated. Table 8-3 lists five of these attributes.

Table 8-3

Variable attributes (declare)

Attribute

Meaning

–a

Declares a variable as an array (page 474)

–f

Declares a variable to be a function name (page 346)

–i

Declares a variable to be of type integer (page 306)

–r

Makes a variable readonly; also readonly (above)

–x

Makes a variable an environment variable; also export (page 468)

The following commands declare several variables and set some attributes. The first
line declares person1 and initializes it to max. This command has the same effect with
or without the word declare.
$
$
$
$

declare
declare
declare
declare

person1=max
-r person2=zach
-rx person3=helen
-x person4

306 Chapter 8 The Bourne Again Shell (bash)
readonly and
export

The readonly and export builtins are synonyms for the commands declare –r and
declare –x, respectively. You can declare a variable without initializing it, as the preceding declaration of the variable person4 illustrates. This declaration makes person4
an environment variable so it is available to all subshells. Until person4 is initialized,
it has a null value.
You can list the options to declare separately in any order. The following is equivalent
to the preceding declaration of person3:
$ declare -x -r person3=helen

Use the + character in place of – when you want to remove an attribute from a variable.
You cannot remove the readonly attribute. After the following command is given, the
variable person3 is no longer exported, but it is still readonly:
$ declare +x person3

See page 468 for more information on exporting variables.
Listing variable
attributes

Without any arguments or options, declare lists all shell variables. The same list is
output when you run set (page 461) without any arguments.
If you call declare with options but no variable names, the command lists all shell
variables that have the specified attributes set. For example, the command declare –r
displays a list of all readonly variables. This list is the same as that produced by the
readonly command without any arguments. After the declarations in the preceding
example have been given, the results are as follows:
$ declare -r
declare -r BASHOPTS="checkwinsize:cmdhist:expand_aliases: ... "
declare -ir BASHPID
declare -ar BASH_VERSINFO='([0]="4" [1]="2" [2]="24" [3]="1" ... '
declare -ir EUID="500"
declare -ir PPID="1936"
declare -r SHELLOPTS="braceexpand:emacs:hashall:histexpand: ... "
declare -ir UID="500"
declare -r person2="zach"
declare -rx person3="helen"

The first seven entries are keyword variables that are automatically declared as
readonly. Some of these variables are stored as integers (–i). The –a option indicates
that BASH_VERSINFO is an array variable; the value of each element of the array
is listed to the right of an equal sign.
Integer

By default, the values of variables are stored as strings. When you perform arithmetic
on a string variable, the shell converts the variable into a number, manipulates it, and
then converts it back to a string. A variable with the integer attribute is stored as an
integer. Assign the integer attribute as follows:
$ declare -i COUNT

You can use declare to display integer variables:

Parameters and Variables 307
$ declare -i
declare -ir BASHPID
declare -i COUNT
declare -ir EUID="1000"
declare -i HISTCMD
declare -i LINENO
declare -i MAILCHECK="60"
declare -i OPTIND="1"
...

Keyword Variables
Keyword variables are either inherited or declared and initialized by the shell when
it starts. You can assign values to these variables from the command line or from a
startup file. Typically these variables are environment variables (exported) so they are
available to subshells you start as well as your login shell.

HOME: Your Home Directory
By default, your home directory is the working directory when you log in. Your home
directory is established when your account is set up; under Linux its name is stored
in the /etc/passwd file. Mac OS X uses Open Directory (page 1042) to store this
information.
$ grep sam /etc/passwd
sam:x:500:500:Sam the Great:/home/sam:/bin/bash

When you log in, the shell inherits the pathname of your home directory and assigns it
to the environment variable HOME (tcsh uses home). When you give a cd command
without an argument, cd makes the directory whose name is stored in HOME the
working directory:
$ pwd
/home/max/laptop
$ echo $HOME
/home/max
$ cd
$ pwd
/home/max

This example shows the value of the HOME variable and the effect of the cd builtin.
After you execute cd without an argument, the pathname of the working directory is
the same as the value of HOME: your home directory.
Tilde (~)

The shell uses the value of HOME to expand pathnames that use the shorthand tilde
(~) notation (page 89) to denote a user’s home directory. The following example uses
echo to display the value of this shortcut and then uses ls to list the files in Max’s laptop
directory, which is a subdirectory of his home directory.
$ echo ~
/home/max
$ ls ~/laptop
tester
count

lineup

308 Chapter 8 The Bourne Again Shell (bash)

PATH: Where the Shell Looks for Programs
When you give the shell an absolute or relative pathname as a command, it looks
in the specified directory for an executable file with the specified filename. If the
file with the pathname you specified does not exist, the shell reports No such file
or directory. If the file exists as specified but you do not have execute permission
for it, or in the case of a shell script you do not have read and execute permission
for it, the shell reports Permission denied.
When you give a simple filename as a command, the shell searches through certain
directories (your search path) for the program you want to execute. It looks in several
directories for a file that has the same name as the command and that you have execute permission for (a compiled program) or read and execute permission for (a shell
script). The PATH (tcsh uses path) variable controls this search.
The default value of PATH is determined when bash is compiled. It is not set in a
startup file, although it might be modified there. Normally the default specifies that the
shell search several system directories used to hold common commands. These system
directories include /bin and /usr/bin and other directories appropriate to the local system. When you give a command, if the shell does not find the executable—and, in the
case of a shell script, readable—file named by the command in any of the directories
listed in PATH, the shell generates one of the aforementioned error messages.
Working directory

The PATH variable specifies the directories in the order the shell should search them.
Each directory must be separated from the next by a colon. The following command
sets PATH so a search for an executable file starts with the /usr/local/bin directory.
If it does not find the file in this directory, the shell looks next in /bin and then in
/usr/bin. If the search fails in those directories, the shell looks in the ~/bin directory,
a subdirectory of the user’s home directory. Finally the shell looks in the working
directory. Exporting PATH makes sure it is an environment variable so it is available
to subshells, although it is typically exported when it is declared so exporting it again
is not necessary:
$ export PATH=/usr/local/bin:/bin:/usr/bin:~/bin:

A null value in the string indicates the working directory. In the preceding example,
a null value (nothing between the colon and the end of the line) appears as the last
element of the string. The working directory is represented by a leading colon (not
recommended; see the following security tip), a trailing colon (as in the example), or
two colons next to each other anywhere in the string. You can also represent the
working directory explicitly using a period (.).
Because Linux stores many executable files in directories named bin (binary), users
typically put their executable files in their own ~/bin directories. If you put your own
bin directory toward the end of PATH, as in the preceding example, the shell looks
there for any commands it cannot find in directories listed earlier in PATH.
If you want to add directories to PATH, you can reference the old value of the PATH
variable in setting PATH to a new value (but see the preceding security tip). The following command adds /usr/local/bin to the beginning of the current PATH and the
bin directory in the user’s home directory (~/bin) to the end:

Parameters and Variables 309
$ PATH=/usr/local/bin:$PATH:~/bin

Set PATH in ~/.bash_profile; see the tip on page 279.

PATH and security
security Do not put the working directory first in PATH when security is a concern. If you are working as
root, you should never put the working directory first in PATH. It is common for root’s PATH to
omit the working directory entirely. You can always execute a file in the working directory by
prepending ./ to the name: ./myprog.
Putting the working directory first in PATH can create a security hole. Most people type ls as the
first command when entering a directory. If the owner of a directory places an executable file
named ls in the directory, and the working directory appears first in a user’s PATH, the user giving
an ls command from the directory executes the ls program in the working directory instead of the
system ls utility, possibly with undesirable results.

MAIL: Where Your Mail Is Kept
The MAIL variable (mail under tcsh) usually contains the pathname of the file that holds
your mail (your mailbox, usually /var/mail/name, where name is your username).
However, you can use MAIL to watch any file (including a directory): Set MAIL to the
name of the file you want to watch.
If MAIL is set and MAILPATH (below) is not set, the shell informs you when the file
specified by MAIL is modified (such as when mail arrives). In a graphical environment
you can unset MAIL so the shell does not display mail reminders in a terminal emulator
window (assuming you are using a graphical mail program).
Most Mac OS X systems do not use local files for incoming mail. Instead, mail is
typically kept on a remote mail server. The MAIL variable and other mail-related
shell variables have no effect unless you have a local mail server.
The MAILPATH variable (not in tcsh) contains a list of filenames separated by
colons. If this variable is set, the shell informs you when any one of the files is
modified (for example, when mail arrives). You can follow any of the filenames
in the list with a question mark (?) and a message. The message replaces the you
have mail message when you receive mail while you are logged in.
The MAILCHECK variable (not in tcsh) specifies how often, in seconds, the shell
checks the directories specified by MAIL or MAILPATH. The default is 60 seconds.
If you set this variable to zero, the shell checks before it issues each prompt.

PS1: User Prompt (Primary)
The default Bourne Again Shell prompt is a dollar sign ($). When you run bash with
root privileges, bash typically displays a hashmark (#) prompt. The PS1 variable
(prompt under tcsh; page 393) holds the prompt string the shell uses to let you know
it is waiting for a command. When you change the value of PS1, you change the
appearance of your prompt.

310 Chapter 8 The Bourne Again Shell (bash)

You can customize the prompt displayed by PS1. For example, the assignment
$ PS1="[\u@\h \W \!]$ "

displays the following prompt:
[user@host directory event]$
where user is the username, host is the hostname up to the first period, directory is
the basename of the working directory, and event is the event number (page 327) of
the current command.
If you are working on more than one system, it can be helpful to incorporate the system name into your prompt. The first example that follows changes the prompt to
the name of the local host, a SPACE, and a dollar sign (or, if the user is running with
root privileges, a hashmark), followed by a SPACE. A SPACE at the end of the prompt
makes commands you enter following the prompt easier to read. The second example
changes the prompt to the time followed by the name of the user. The third example
changes the prompt to the one used in this book (a hashmark for a user running with
root privileges and a dollar sign otherwise):
$ PS1='\h \$ '
guava $
$ PS1='\@ \u $ '
09:44 PM max $
$ PS1='\$ '
$

Table 8-4 describes some of the symbols you can use in PS1. See Table 9-4 on
page 393 for the corresponding tcsh symbols. For a complete list of special characters
you can use in the prompt strings, open the bash man page and search for the third
occurrence of PROMPTING (enter the command /PROMPTING followed by a
RETURN and then press n two times).

Table 8-4

PS1 symbols

Symbol

Display in prompt

\$

# if the user is running with root privileges; otherwise, $

\w

Pathname of the working directory

\W

Basename of the working directory

\!

Current event (history) number (page 331)

\d

Date in Weekday Month Date format

\h

Machine hostname, without the domain

\H

Full machine hostname, including the domain

\u

Username of the current user

Parameters and Variables 311

Table 8-4

PS1 symbols (continued)

Symbol

Display in prompt

\@

Current time of day in 12-hour, AM/PM format

\T

Current time of day in 12-hour HH:MM:SS format

\A

Current time of day in 24-hour HH:MM format

\t

Current time of day in 24-hour HH:MM:SS format

PS2: User Prompt (Secondary)
The PS2 variable holds the secondary prompt (prompt2 under tcsh). On the first line
of the next example, an unclosed quoted string follows echo. The shell assumes the
command is not finished and on the second line displays the default secondary
prompt (>). This prompt indicates the shell is waiting for the user to continue the
command line. The shell waits until it receives the quotation mark that closes the
string and then executes the command:
$ echo "demonstration of prompt string
> 2"
demonstration of prompt string
2

The next command changes the secondary prompt to Input => followed by a SPACE.
On the line with who, a pipe symbol (|) implies the command line is continued
(page 499) and causes bash to display the new secondary prompt. The command grep
sam (followed by a RETURN) completes the command; grep displays its output.
$ PS2="Input => "
$ who |
Input => grep sam
sam
tty1

2012-05-01 10:37 (:0)

PS3: Menu Prompt
The PS3 variable holds the menu prompt (prompt3 in tcsh) for the select control
structure (page 449).

PS4: Debugging Prompt
The PS4 variable holds the bash debugging symbol (page 431; not in tcsh).

IFS: Separates Input Fields (Word Splitting)
The IFS (Internal Field Separator) shell variable (not in tcsh) specifies the characters
you can use to separate arguments on a command line. It has the default value of SPACETAB-NEWLINE. Regardless of the value of IFS, you can always use one or more SPACE or TAB
characters to separate arguments on the command line, provided these characters are
not quoted or escaped. When you assign character values to IFS, these characters can
also separate fields—but only if they undergo expansion. This type of interpretation
of the command line is called word splitting and is discussed on page 362.

312 Chapter 8 The Bourne Again Shell (bash)

Be careful when changing IFS
caution Changing IFS has a variety of side effects, so work cautiously. You might find it useful to save
the value of IFS before changing it. You can then easily restore the original value if a change
yields unexpected results. Alternatively, you can fork a new shell using a bash command before
experimenting with IFS; if you run into trouble, you can exit back to the old shell, where IFS is
working properly.

The following example demonstrates how setting IFS can affect the interpretation of
a command line:
$ a=w:x:y:z
$ cat $a
cat: w:x:y:z: No such file or directory
$ IFS=":"
$ cat $a
cat: w: No
cat: x: No
cat: y: No
cat: z: No

such
such
such
such

file
file
file
file

or
or
or
or

directory
directory
directory
directory

The first time cat is called, the shell expands the variable a, interpreting the string
w:x:y:z as a single word to be used as the argument to cat. The cat utility cannot find
a file named w:x:y:z and reports an error for that filename. After IFS is set to a
colon (:), the shell expands the variable a into four words, each of which is an argument to cat. Now cat reports errors for four files: w, x, y, and z. Word splitting based
on the colon (:) takes place only after the variable a is expanded.
The shell splits all expanded words on a command line according to the separating
characters found in IFS. When there is no expansion, there is no splitting. Consider
the following commands:
$ IFS="p"
$ export VAR

Although IFS is set to p, the p on the export command line is not expanded, so the
word export is not split.
The following example uses variable expansion in an attempt to produce an export
command:
$ IFS="p"
$ aa=export
$ echo $aa
ex ort

This time expansion occurs, so the p in the token export is interpreted as a separator (as
the echo command shows). Next, when you try to use the value of the aa variable to
export the VAR variable, the shell parses the $aa VAR command line as ex ort VAR. The
effect is that the command line starts the ex editor with two filenames: ort and VAR.

Parameters and Variables 313
$ $aa VAR
2 files to edit
"ort" [New File]
Entering Ex mode. Type "visual" to go to Normal mode.
:q
E173: 1 more file to edit
:q
$

If IFS is unset, bash uses its default value (SPACE-TAB-NEWLINE). If IFS is null, bash does not
split words.

Multiple separator characters
tip Although the shell treats sequences of multiple SPACE or TAB characters as a single separator, it
treats each occurrence of another field-separator character as a separator.

CDPATH: Broadens the Scope of cd
The CDPATH variable (cdpath under tcsh) allows you to use a simple filename as an
argument to the cd builtin to change the working directory to a directory other than
a child of the working directory. If you typically work in several directories, this variable can speed things up and save you the tedium of using cd with longer pathnames
to switch among them.
When CDPATH is not set and you specify a simple filename as an argument to cd, cd
searches the working directory for a subdirectory with the same name as the argument.
If the subdirectory does not exist, cd displays an error message. When CDPATH is set,
cd searches for an appropriately named subdirectory in the directories in the CDPATH
list. If it finds one, that directory becomes the working directory. With CDPATH set,
you can use cd and a simple filename to change the working directory to a child of any
of the directories listed in CDPATH.
The CDPATH variable takes on the value of a colon-separated list of directory
pathnames (similar to the PATH variable). It is usually set in the ~/.bash_profile
startup file with a command line such as the following:
export CDPATH=$HOME:$HOME/literature

This command causes cd to search your home directory, the literature directory, and
then the working directory when you give a cd command. If you do not include the
working directory in CDPATH, cd searches the working directory if the search of all
the other directories in CDPATH fails. If you want cd to search the working directory
first, include a colon (:) as the first entry in CDPATH:
export CDPATH=:$HOME:$HOME/literature

If the argument to the cd builtin is anything other than a simple filename (i.e., if the
argument contains a slash [/]), the shell does not consult CDPATH.

314 Chapter 8 The Bourne Again Shell (bash)

Keyword Variables: A Summary
Table 8-5 lists the bash keyword variables. See page 391 for information on tcsh
variables.

Table 8-5

bash keyword variables

Variable

Value

BASH_ENV

The pathname of the startup file for noninteractive shells (page 279)

CDPATH

The cd search path (page 313)

COLUMNS

The width of the display used by select (page 448)

HISTFILE

The pathname of the file that holds the history list (default: ~/.bash_history;
page 326)

HISTFILESIZE

The maximum number of entries saved in HISTFILE (default: 1,000–2,000;
page 326)

HISTSIZE

The maximum number of entries saved in the history list (default: 1,000;
page 326)

HOME

The pathname of the user’s home directory (page 307); used as the default
argument for cd and in tilde expansion (page 89)

IFS

Internal Field Separator (page 311); used for word splitting (page 362)

INPUTRC

The pathname of the Readline startup file (default: ~/.inputrc; page 339)

LANG

The locale category when that category is not specifically set using one of the
LC_ variables (page 317)

LC_

A group of variables that specify locale categories including LC_ALL,
LC_COLLATE, LC_CTYPE, LC_MESSAGES, and LC_NUMERIC; use the locale
builtin (page 318) to display a more complete list including values

LINES

The height of the display used by select (page 448)

MAIL

The pathname of the file that holds a user’s mail (page 309)

MAILCHECK

How often, in seconds, bash checks for mail (default: 60; page 309)

MAILPATH

A colon-separated list of file pathnames that bash checks for mail in
(page 309)

OLDPWD

The pathname of the previous working directory

PATH

A colon-separated list of directory pathnames that bash looks for commands
in (page 308)

PROMPT_COMMAND A command that bash executes just before it displays the primary prompt

Special Characters 315

Table 8-5

bash keyword variables (continued)

Variable

Value

PS1

Prompt String 1; the primary prompt (page 309)

PS2

Prompt String 2; the secondary prompt (page 311)

PS3

The prompt issued by select (page 448)

PS4

The bash debugging symbol (page 431)

PWD

The pathname of the working directory

REPLY

Holds the line that read accepts (page 478); also used by select (page 448)

Special Characters
Table 8-6 lists most of the characters that are special to the bash and tcsh shells.

Table 8-6

Shell special characters

Character

Use

NEWLINE

A control operator that initiates execution of a command (page 290)

;

A control operator that separates commands (page 290)

()

A control operator that groups commands (page 292) for execution by a
subshell; these characters are also used to identify a function (page 346)

(( ))

Evaluates an arithmetic expression (page 492)

&

A control operator that executes a command in the background (pages 146
and 290)

|

A control operator that sends standard output of the preceding command to
standard input of the following command (pipeline; page 290)

|&

A control operator that sends standard output and standard error of the
preceding command to standard input of the following command (page 283)

>

Redirects standard output (page 136)

>>

Appends standard output (page 140)

<

Redirects standard input (page 137)

<<

Here document (page 450)

*

Matches any string of zero or more characters in an ambiguous file reference
(page 149)

316 Chapter 8 The Bourne Again Shell (bash)

Table 8-6 Shell special characters (continued)
Character

Use

?

Matches any single character in an ambiguous file reference (page 148)

\

Quotes the following character (page 50)

'

Quotes a string, preventing all substitution (page 50)

"

Quotes a string, allowing only variable and command substitution (pages 50
and 302)

‘...‘

Performs command substitution [deprecated, see $()]

[]

Character class in an ambiguous file reference (page 151)

$(())

Evaluates an arithmetic expression (page 359)

$

References a variable (page 300)

.

(dot builtin)

Executes a command in the current shell (page 280)

#

Begins a comment (page 288)

{}

Surrounds the contents of a function (page 346)

:

(null builtin)

&&
(Boolean AND)

Returns true (page 485)
A control operator that executes the command on the right only if the
command on the left succeeds (returns a zero exit status; page 292)

|| (Boolean OR) A control operator that executes the command on the right only if the
command on the left fails (returns a nonzero exit status; page 292)
!
$()

(Boolean NOT) Reverses the exit status of a command
(not in tcsh) Performs command substitution (preferred form; page 361)

Locale
In conversational English, a locale is a place or location. When working with Linux,
a locale specifies the way locale-aware programs display certain kinds of data such
as times and dates, money and other numeric values, telephone numbers, and measurements. It can also specify collating sequence and printer paper size.
Localization and
internationalization

Localization and internationalization go hand in hand: Internationalization is
the process of making software portable to multiple locales while localization is
the process of adapting software so that it meets the language, cultural, and
other requirements of a specific locale. Linux is well internationalized so you can
easily specify a locale for a given system or user. Linux uses variables to specify
a locale.

Locale

317

i18n

The term i18n is an abbreviation of the word internationalization: the letter i followed
by 18 letters (nternationalizatio) followed by the letter n.

l10n

The term l10n is an abbreviation of the word localization: the letter l followed by 10
letters (ocalizatio) followed by the letter n.

LC_: Locale Variables
The bash man page lists the following locale variables; other programs use additional
locale variables. See the locale man pages (sections 1, 5, and 7) or use the locale ––help
option for more information.
• LANG—Specifies the locale category for categories not specified by an LC_
variable (except see LC_ALL). Many setups use only this locale variable and
do not specify any of the LC_ variables.
• LC_ALL—Overrides the value of LANG and all other LC_ variables.
• LC_COLLATE—Specifies the collating sequence for the sort utility
(page 942) and for sorting the results of pathname expansion (page 303).
• LC_CTYPE—Specifies how characters are interpreted and how character
classes within pathname expansion and pattern matching behave. Also affects
the sort utility (page 942) when you specify the –d (––dictionary-order) or –i
(––ignore-nonprinting) options.
• LC_MESSAGES—Specifies how affirmative and negative answers appear
and the language messages are displayed in.
• LC_NUMERIC—Specifies how numbers are formatted (e.g., are thousands
separated by a comma or a period?).

Internationalized C programs call setlocale()
tip Internationalized C programs call setlocale(). Other languages have analogous facilities. Shell
scripts are typically internationalized to the degree that the routines they call are. Without a call to
setlocale(), the hello, world program will always display hello, world, regardless of how you set
LANG.

You can set one or more of the LC_ variables to a value using the following syntax:
xx_YY.CHARSET
where xx is the ISO-639 language code (e.g., en = English, fr = French, zu = Zulu), YY
is the ISO-3166 country code (e.g., FR = France, GF = French Guiana, PF = French
Polynesia), and CHARSET is the name of the character set (e.g., UTF-8 [page 1100],
ASCII [page 1055], ISO-8859-1 [Western Europe]; also called the character map or
charmap). On some systems you can specify CHARSET using lowercase letters. For
example, en_GB.UTF-8 can specify English as written in Great Britain, en_US.UTF-8
can specify English as written in the United States, and fr_FR.UTF-8 can specify French
as written in France.

318 Chapter 8 The Bourne Again Shell (bash)

The C locale
tip Setting the locale to C forces a program to process and display strings as the program was written
(i.e., without translating input or output), which frequently means the program works in English.
Many system scripts set LANG to C so they run in a known environment. Some text processing
utilities run slightly faster when you set LANG to C. Setting LANG to C before you run sort can help
ensure you get the results you expect.
If you want to make sure your shell script will work properly, put the following line near the top of
the file:
export LANG=C

Following is an example of a difference that setting LANG can cause. It shows that having LANG
set to different values can cause commands to behave differently, especially with regard to sorting.
$ echo $LANG
en_US.UTF-8
$ ls
m666 Makefile
$ ls [l-n]*
m666 Makefile

merry
merry

$ export LANG=C
$ ls
Makefile m666 merry
$ ls [l-n]*
m666 merry

locale: Displays Locale Information
The locale utility displays information about the current and available locales. Without
options, locale displays the value of the locale variables. In the following example, only
the LANG variable is set, although you cannot determine this fact from the output.
Unless explicitly set, each of the LC_ variables derives its value from LANG.
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Typically you will want all locale variables to have the same value. However, in some
cases you might want to change the value of one or more locale variables. For example,
if you are using paper size A4 but working in English, you could change the value of
LC_PAPER to nl_NL.utf8.

Locale

319

The –a (all) option causes locale to display the names of available locales; –v (verbose;
not in OS X) displays more complete information.
$ locale -av
locale: aa_DJ
archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------title | Afar language locale for Djibouti (CaduLaaqo Dialects).
source | Ge'ez Frontier Foundation
address | 7802 Solomon Seal Dr., Springfield, VA 22152, USA
email | locales@geez.org
language | aa
territory | DJ
revision | 0.20
date | 2003-07-05
codeset | ISO-8859-1
...

The –m (maps) option causes locale to display the names of available character maps. On
Linux systems, locale definition files are kept in the /usr/share/i18n/locales directory;
on OS X systems, they are kept in /usr/share/locale.
Following are some examples of how some LC_ variables change displayed values. Each
of these command lines sets an LC_ variable and places it in the environment of the utility
it calls. The +%x format causes date to display the locale’s date representation. The last
example does not work under OS X.
$ LC_TIME=en_GB.UTF-8 date +%x
24/01/12
$ LC_TIME=en_US.UTF-8 date +%x
01/24/2012
$ ls xx
ls: impossible d'accéder à xx: Aucun fichier ou dossier de ce type
$ LC_MESSAGES=en_US.UTF-8 ls xx
ls: cannot access xx: No such file or directory

Setting the Locale
You might have to install a language package for a locale before you can specify a locale.
If you are working in a GUI, it is usually easiest to change the locale using the GUI.
For all Linux distributions and OS X, put locale variable assignments in ~/.profile or
~/.bash_profile to affect both GUI and bash command-line logins for a single user.
Remember to export the variables. The following line in one of these files will set all
LC_ variables for the given user to French as spoken in France:
export LANG=fr_FR.UTF-8

Under tcsh, put the following line in ~/.tcshrc or ~/.cshrc to have the same effect:
setenv LANG fr_FR.UTF-8

The following paragraphs explain how to use the command-line interface to change
the locale for all users; the technique varies by distribution.

320 Chapter 8 The Bourne Again Shell (bash)
Fedora/RHEL

Put locale variable assignments (previous page) in /etc/profile.d/zlang.sh (you will
need to create this file; the filename was chosen to be executed after lang.sh) to
affect both GUI and command-line logins for all users. Under tcsh, put the variable
assignment in /etc/profile.d/zlang.csh.

Debian/Ubuntu/Mint

Put locale variable assignments (previous page) in /etc/default/locale to affect both
GUI and command-line logins for all users.

openSUSE

Put locale variable assignments (previous page) in /etc/profile.local (you might need
to create this file) to affect both GUI and command-line logins for all users. The
/etc/sysconfig/language file controls the locale of GUI logins; see the file for
instructions.

Mac OS X

Put locale variable assignments (previous page) in /etc/profile to affect both GUI and
command-line logins for all users.

Time
UTC

On networks with systems in different time zones it can be helpful to set all systems
to the UTC (page 1100) time zone. Among other benefits, doing so can make it easier
for an administrator to compare logged events on different systems over time. Each
user account can be set to the local time for that user.

Time zone

The time zone for a user is specified by an environment variable or, if one is not set,
by the time zone for the system.

TZ

The TZ variable gives a program access to information about the local time zone.
This variable is typically set in a startup file (pages 278 and 372) and placed in the
environment (page 468) so called programs have access to it. It has two syntaxes.
The first syntax of the TZ variable is
nam±val[nam2]
where nam is a string comprising three or more letters that typically name the time
zone (e.g., PST; its value is not significant) and ±val is the offset of the time zone from
UTC, with positive values indicating the local time zone is west of the prime meridian
and negative values indicating the local time zone is east of the prime meridian. If the
nam2 is present, it indicates the time zone observes daylight savings time; it is the
name of the daylight savings time zone (e.g., PDT).
In the following example, date is called twice, once without setting the TZ variable
and then with the TZ variable set in the environment in which date is called:
$ date
Thu May

3 10:08:06 PDT 2012

$ TZ=EST+5EDT date
Thu May 3 13:08:08 EDT 2012

The second syntax of the TZ variable is

Time 321

continent/country
where continent is the name of the continent or ocean and country is the name of the
country that includes the desired time zone. This syntax points to a file in the
/usr/share/zoneinfo hierarchy (next page). See tzselect (below) if you need help determining these values.
In the next example, date is called twice, once without setting the TZ variable and
then with the TZ variable set in the environment in which date is called:
$ date
Thu May

3 10:09:27 PDT 2012

$ TZ=America/New_York date
Thu May 3 13:09:28 EDT 2012

See www.gnu.org/software/libc/manual/html_node/TZ-Variable.html for extensive
documentation on the TZ variable.
tzconfig

The tzconfig utility was available under Debian/Ubuntu and is now deprecated; use
dpkg-reconfigure tzdata in its place.

tzselect

The tzselect utility can help you determine the name of a time zone by asking you first
to name the continent or ocean and then the country the time zone is in. If necessary,
it asks for a time zone region (e.g., Pacific Time). This utility does not change system
settings but rather displays a line telling you the name of the time zone. In the following example, the time zone is named Europe/Paris. Newer releases keep time zone
information in /usr/share/zoneinfo (next page). Specifications such as Europe/Paris
refer to the file in that directory (/usr/share/zoneinfo/Europe/Paris).
$ tzselect
Please identify a location so that time zone rules can be set correctly.
Please select a continent or ocean.
1) Africa
...
8) Europe
9) Indian Ocean
10) Pacific Ocean
11) none - I want to specify the time zone using the Posix TZ format.
#? 8
Please select a country.
1) Aaland Islands
18) Greece
35) Norway
...
15) France
32) Monaco
49) Vatican City
16) Germany
33) Montenegro
17) Gibraltar
34) Netherlands
#? 15
...
Here is that TZ value again, this time on standard output so that you
can use the /usr/bin/tzselect command in shell scripts:
Europe/Paris

322 Chapter 8 The Bourne Again Shell (bash)
/etc/timezone

Under some distributions, including Debian/Ubuntu/Mint, the /etc/timezone file
holds the name of the local time zone.
$ cat /etc/timezone
America/Los_Angeles

/usr/share/zoneinfo

The /usr/share/zoneinfo directory hierarchy holds time zone data files. Some time
zones are held in regular files in the zoneinfo directory (e.g., Japan and GB) while
others are held in subdirectories (e.g., Azores and Pacific). The following example
shows a small part of the /usr/share/zoneinfo directory hierarchy and illustrates
how file (page 802) reports on a time zone file.
$ find /usr/share/zoneinfo
/usr/share/zoneinfo
/usr/share/zoneinfo/Atlantic
/usr/share/zoneinfo/Atlantic/Azores
/usr/share/zoneinfo/Atlantic/Madeira
/usr/share/zoneinfo/Atlantic/Jan_Mayen
...
/usr/share/zoneinfo/Japan
/usr/share/zoneinfo/GB
/usr/share/zoneinfo/US
/usr/share/zoneinfo/US/Pacific
/usr/share/zoneinfo/US/Arizona
/usr/share/zoneinfo/US/Michigan
...
$ file /usr/share/zoneinfo/Atlantic/Azores
/usr/share/zoneinfo/Atlantic/Azores: timezone data, version 2, 12 gmt
time flags, 12 std time flags, no leap seconds, 220 transition times, 12
abbreviation chars

/etc/localtime

Some Linux distributions use a link at /etc/localtime to a file in /usr/share/zoneinfo
to specify the local time zone. Others copy the file from the zoneinfo directory to
localtime. Following is an example of setting up this link; to create this link you must
work with root privileges.
# date
Tue Jan 24 13:55:00 PST 2012
# cd /etc
# ln -sf /usr/share/zoneinfo/Europe/Paris localtime
# date
Tue Jan 24 22:55:38 CET 2012

On some of these systems, the /etc/systemconfig/clock file sets the ZONE variable
to the name of the time zone:
$ cat /etc/sysconfig/clock
# The time zone of the system is defined by the contents of /etc/localtime.
# This file is only for evaluation by system-config-date, do not rely on its
# contents elsewhere.
ZONE="Europe/Paris"
Mac OS X

On a Macintosh running OS X, you can use systemsetup to work with the time zone.

Processes 323
$ systemsetup -gettimezone
Time Zone: America/Los_Angeles
$ systemsetup -listtimezones
Time Zones:
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
...
$ systemsetup -settimezone America/Los_Angeles
Set TimeZone: America/Los_Angeles

Processes
A process is the execution of a command by the Linux kernel. The shell that starts
when you log in is a process, like any other. When you specify the name of a utility
as a command, you initiate a process. When you run a shell script, another shell
process is started, and additional processes are created for each command in the
script. Depending on how you invoke the shell script, the script is run either by the
current shell or, more typically, by a subshell (child) of the current shell. Running
a shell builtin, such as cd, does not start a new process.

Process Structure
fork() system call

Like the file structure, the process structure is hierarchical, with parents, children,
and a root. A parent process forks (or spawns) a child process, which in turn can fork
other processes. The term fork indicates that, as with a fork in the road, one process
turns into two. Initially the two forks are identical except that one is identified as the
parent and one as the child. The operating system routine, or system call, that creates
a new process is named fork().

init daemon

A Linux system begins execution by starting the init daemon, a single process called
a spontaneous process, with PID number 1. This process holds the same position in
the process structure as the root directory does in the file structure: It is the ancestor
of all processes the system and users work with. When a command-line system is in
multiuser mode, init runs getty or mingetty processes, which display login: prompts on
terminals and virtual consoles. When a user responds to the prompt and presses
RETURN, getty or mingetty passes control to a utility named login, which checks the username and password combination. After the user logs in, the login process becomes the
user’s shell process.
When you enter the name of a program on the command line, the shell forks a new
process, creating a duplicate of the shell process (a subshell). The new process
attempts to exec (execute) the program. Like fork(), exec() is a system call. If the
program is a binary executable, such as a compiled C program, exec() succeeds, and
the system overlays the newly created subshell with the executable program. If the
command is a shell script, exec() fails. When exec fails, the program is assumed to

324 Chapter 8 The Bourne Again Shell (bash)

be a shell script, and the subshell runs the commands in the script. Unlike a login
shell, which expects input from the command line, the subshell takes its input from
a file—namely, the shell script.

Process Identification
PID numbers

Linux assigns a unique PID (process identification) number at the inception of each
process. As long as a process exists, it keeps the same PID number. During one session
the same process is always executing the login shell (page 278). When you fork a new
process—for example, when you use an editor—the PID number of the new (child)
process is different from that of its parent process. When you return to the login shell,
it is still being executed by the same process and has the same PID number as when
you logged in.
The following example shows that the process running the shell forked (is the parent
of) the process running ps. When you call it with the –f option, ps displays a full listing of information about each process. The line of the ps display with bash in the
CMD column refers to the process running the shell. The column headed by PID identifies the PID number. The column headed by PPID identifies the PID number of the
parent of the process. From the PID and PPID columns you can see that the process
running the shell (PID 21341) is the parent of the processes running sleep (PID
22789) and ps (PID 22790).
$ sleep 10 &
[1] 22789
$ ps -f
UID
PID PPID C STIME TTY
max
21341 21340 0 10:42 pts/16
max
22789 21341 0 17:30 pts/16
max
22790 21341 0 17:30 pts/16

TIME CMD
00:00:00 bash
00:00:00 sleep 10
00:00:00 ps -f

Refer to page 921 for more information on ps and the columns it displays when
you specify the –f option. A second pair of sleep and ps –f commands shows that
the shell is still being run by the same process but that it forked another process
to run sleep:
$ sleep 10 &
[1] 22791
$ ps -f
UID
PID PPID C STIME TTY
max
21341 21340 0 10:42 pts/16
max
22791 21341 0 17:31 pts/16
max
22792 21341 0 17:31 pts/16

TIME CMD
00:00:00 bash
00:00:00 sleep 10
00:00:00 ps -f

You can also use pstree (or ps ––forest, with or without the –e option) to see the
parent–child relationship of processes. The next example shows the –p option to
pstree, which causes it to display PID numbers:
$ pstree -p
systemd(1)-+-NetworkManager(655)---{NetworkManager}(702)
|-abrtd(657)---abrt-dump-oops(696)

Processes 325
|-accounts-daemon(1204)---{accounts-daemo}(1206)
|-agetty(979)
...
|-login(984)---bash(2071)-+-pstree(2095)
|
`-sleep(2094)
...

The preceding output is abbreviated. The first line shows the PID 1 (systemd init) and
a few of the processes it is running. The line that starts with –login shows a textual
user running sleep in the background and pstree in the foreground. The tree for a user
running a GUI is much more complex. Refer to “$$: PID Number” on page 464 for
a description of how to instruct the shell to report on PID numbers.

Executing a Command
fork() and sleep()

When you give the shell a command, it usually forks [spawns using the fork() system
call] a child process to execute the command. While the child process is executing the
command, the parent process (running the shell) sleeps [implemented as the sleep()
system call]. While a process is sleeping, it does not use any computer time; it remains
inactive, waiting to wake up. When the child process finishes executing the command, it tells its parent of its success or failure via its exit status and then dies. The
parent process (which is running the shell) wakes up and prompts for another
command.

Background process

When you run a process in the background by ending a command with the ampersand
control operator (&), the shell forks a child process without going to sleep and without
waiting for the child process to run to completion. The parent process, which is executing the shell, reports the job number and PID number of the child process and
prompts for another command. The child process runs in the background, independent
of its parent.

Builtins

Although the shell forks a process to run most commands, some commands are built
into the shell (e.g., cd, alias, jobs, pwd). The shell does not fork a process to run builtins.
For more information refer to “Builtins” on page 153.

Variables

Within a given process, such as a login shell or subshell, you can declare, initialize, read,
and change variables. Some variables, called shell variables, are local to a process.
Other variables, called environment variables, are available to child processes. For
more information refer to “Variables” on page 467.

Hash table

The first time you specify a command as a simple filename (and not a relative or absolute pathname), the shell looks in the directories specified by the PATH (bash;
page 308) or path (tcsh; page 393) variable to find that file. When it finds the file, the
shell records the absolute pathname of the file in its hash table. When you give the
command again, the shell finds it in its hash table, saving the time needed to search
through the directories in PATH. The shell deletes the hash table when you log out
and starts a new hash table when you start a session. This section shows some of the
ways you can use the bash hash builtin; tcsh uses different commands for working
with its hash table.

326 Chapter 8 The Bourne Again Shell (bash)

When you call the hash builtin without any arguments, it displays the hash table.
When you first log in, the hash table is empty:
$ hash
hash: hash table empty
$ who am i
sam
pts/2
2013-03-09 14:24 (plum)
$ hash
hits
command
1
/usr/bin/who

The hash –r option causes bash to empty the hash table, as though you had just logged
in; tcsh uses rehash for a similar purpose.
$ hash -r
$ hash
hash: hash table empty

Having bash empty its hash table is useful when you move a program to a different
directory in PATH and bash cannot find the program in its new location, or when
you have two programs with the same name and bash is calling the wrong one. Refer
to the bash info page for more information on the hash builtin.

History
The history mechanism, a feature adapted from the C Shell, maintains a list of
recently issued command lines, called events, that provides a quick way to reexecute
any events in the list. This mechanism also enables you to edit and then execute previous commands and to reuse arguments from them. You can use the history list to
replicate complicated commands and arguments that you used previously and to
enter a series of commands that differ from one another in minor ways. The history
list also serves as a record of what you have done. It can prove helpful when you have
made a mistake and are not sure what you did or when you want to keep a record
of a procedure that involved a series of commands.

history can help track down mistakes
tip When you have made a mistake on a command line (not an error within a script or program) and
are not sure what you did wrong, look at the history list to review your recent commands. Sometimes this list can help you figure out what went wrong and how to fix things.

The history builtin displays the history list. If it does not, read the next section, which
describes the variables you might need to set.

Variables That Control History
The TC Shell’s history mechanism is similar to bash’s but uses different variables and
has some other differences. See page 374 for more information.

History 327

The value of the HISTSIZE variable determines the number of events preserved in the
history list during a session. A value in the range of 100 to 1,000 is normal.
When you exit from the shell, the most recently executed commands are saved in the file
whose name is stored in the HISTFILE variable (default is ~/.bash_history). The next
time you start the shell, this file initializes the history list. The value of the HISTFILESIZE
variable determines the number of lines of history saved in HISTFILE (see Table 8-7).

Table 8-7

Event number

History variables

Variable

Default

Function

HISTSIZE

1,000 events

Maximum number of events saved during a session

HISTFILE

~/.bash_history

Location of the history file

HISTFILESIZE

1,000–2,000 events Maximum number of events saved between sessions

The Bourne Again Shell assigns a sequential event number to each command line.
You can display this event number as part of the bash prompt by including \! in PS1
(page 309). Examples in this section show numbered prompts when they help to
illustrate the behavior of a command.
Enter the following command manually to establish a history list of the 100 most
recent events; place it in ~/.bash_profile to affect future sessions:
$ HISTSIZE=100

The following command causes bash to save the 100 most recent events across login
sessions:
$ HISTFILESIZE=100

After you set HISTFILESIZE, you can log out and log in again, and the 100 most
recent events from the previous login session will appear in your history list.
Enter the command history to display the events in the history list. This list is ordered
with the oldest events at the top. A tcsh history list includes the time the command
was executed. The following history list includes a command to modify the bash
prompt so it displays the history event number. The last event in the history list is the
history command that displayed the list.
32 $ history | tail
23 PS1="\! bash$ "
24 ls -l
25 cat temp
26 rm temp
27 vim memo
28 lpr memo
29 vim memo
30 lpr memo
31 rm memo
32 history | tail

328 Chapter 8 The Bourne Again Shell (bash)

As you run commands and your history list becomes longer, it might run off the top
of the screen when you use the history builtin. Send the output of history through a
pipeline to less to browse through it or give the command history 10 or history | tail
to look at the ten most recent commands.

Handy history aliases
tip Creating the following aliases makes working with history easier. The first allows you to give the
command h to display the ten most recent events. The second alias causes the command hg string
to display all events in the history list that contain string. Put these aliases in your ~/.bashrc file
to make them available each time you log in. See page 342 for more information on aliases.
$ alias 'h=history | tail'
$ alias 'hg=history | grep'

Reexecuting and Editing Commands
You can reexecute any event in the history list. Not having to reenter long command
lines allows you to reexecute events more easily, quickly, and accurately than you
could if you had to retype the command line in its entirety. You can recall, modify,
and reexecute previously executed events in three ways: You can use the fc builtin
(next), the exclamation point commands (page 331), or the Readline Library, which
uses a one-line vi- or emacs-like editor to edit and execute events (page 335).

Which method to use?
tip If you are more familiar with vi or emacs and less familiar with the C or TC Shell, use fc or the
Readline Library. If you are more familiar with the C or TC Shell, use the exclamation point commands. If it is a toss-up, try the Readline Library; it will benefit you in other areas of Linux more
than learning the exclamation point commands will.

fc: Displays, Edits, and Reexecutes Commands
The fc (fix command) builtin (not in tcsh) enables you to display the history list and
to edit and reexecute previous commands. It provides many of the same capabilities
as the command-line editors.

Viewing the History List
When you call fc with the –l option, it displays commands from the history list.
Without any arguments, fc –l lists the 16 most recent commands in a list that
includes event numbers, with the oldest appearing first:
$ fc -l
1024
1025
1026
1027
1028
1029
1030
1031

cd
view calendar
vim letter.adams01
aspell -c letter.adams01
vim letter.adams01
lpr letter.adams01
cd ../memos
ls

History 329
1032
1033
1034
1035
1036
1037
1038
1039
1040

rm *0405
fc -l
cd
whereis aspell
man aspell
cd /usr/share/doc/*aspell*
pwd
ls
ls man-html

The fc builtin can take zero, one, or two arguments with the –l option. The arguments
specify the part of the history list to be displayed:
fc –l [first [last]]
The fc builtin lists commands beginning with the most recent event that matches
first. The argument can be an event number, the first few characters of the command
line, or a negative number, which specifies the nth previous command. Without last,
fc displays events through the most recent. If you include last, fc displays commands
from the most recent event that matches first through the most recent event that
matches last.
The next command displays the history list from event 1030 through event 1035:
$ fc -l 1030 1035
1030
cd ../memos
1031
ls
1032
rm *0405
1033
fc -l
1034
cd
1035
whereis aspell

The following command lists the most recent event that begins with view through the
most recent command line that begins with whereis:
$ fc -l view whereis
1025
view calendar
1026
vim letter.adams01
1027
aspell -c letter.adams01
1028
vim letter.adams01
1029
lpr letter.adams01
1030
cd ../memos
1031
ls
1032
rm *0405
1033
fc -l
1034
cd
1035
whereis aspell

To list a single command from the history list, use the same identifier for the first and
second arguments. The following command lists event 1027:
$ fc -l 1027 1027
1027
aspell -c letter.adams01

330 Chapter 8 The Bourne Again Shell (bash)

Editing and Reexecuting Previous Commands
You can use fc to edit and reexecute previous commands.
fc [–e editor] [first [last]]
When you call fc with the –e option followed by the name of an editor, fc calls the
editor with event(s) in the Work buffer. By default, fc invokes the vi(m) or nano editor.
Without first and last, it defaults to the most recent command. The next example
invokes the vim editor (Chapter 6) to edit the most recent command. When you exit
from the editor, the shell executes the command.
$ fc -e vi

The fc builtin uses the stand-alone vim editor. If you set the EDITOR variable, you do
not need to use the –e option to specify an editor on the command line. Because the value
of EDITOR has been changed to /usr/bin/emacs and fc has no arguments, the following
command edits the most recent command using the emacs editor (Chapter 7):
$ export EDITOR=/usr/bin/emacs
$ fc

If you call it with a single argument, fc invokes the editor on the specified command.
The following example starts the editor with event 1029 in the Work buffer:
$ fc 1029

As described earlier, you can identify commands either by using numbers or by specifying the first few characters of the command name. The following example calls the
editor to work on events from the most recent event that begins with the letters vim
through event 1030:
$ fc vim 1030

Clean up the fc buffer
caution When you execute an fc command, the shell executes whatever you leave in the editor buffer,
possibly with unwanted results. If you decide you do not want to execute a command, delete
everything from the buffer before you exit from the editor.

Reexecuting Commands Without Calling the Editor
You can also reexecute previous commands without using an editor. If you call fc with
the –s option, it skips the editing phase and reexecutes the command. The following
example reexecutes event 1029:
$ fc -s 1029
lpr letter.adams01

The next example reexecutes the previous command:
$ fc -s

When you reexecute a command, you can tell fc to substitute one string for another.
The next example substitutes the string john for the string adams in event 1029 and
executes the modified event:
$ fc -s adams=john 1029
lpr letter.john01

History 331

Using an Exclamation Point (!) to Reference Events
The C Shell history mechanism uses an exclamation point to reference events. This
technique, which is available under bash and tcsh, is frequently more cumbersome to
use than fc but nevertheless has some useful features. For example, the !! command
reexecutes the previous event, and the shell replaces the !$ token with the last word
from the previous command line.
You can reference an event by using its absolute event number, its relative event number,
or the text it contains. All references to events, called event designators, begin with an
exclamation point (!). One or more characters follow the exclamation point to specify
an event.
You can put history events anywhere on a command line. To escape an exclamation
point so the shell interprets it literally instead of as the start of a history event, precede
it with a backslash (\) or enclose it within single quotation marks.

Event Designators
An event designator specifies a command in the history list. Table 8-8 lists event
designators.

!! reexecutes the
previous event

Table 8-8

Event designators

Designator

Meaning

!

Starts a history event unless followed immediately by SPACE, NEWLINE, =,
or (.

!!

The previous command.

!n

Command number n in the history list.

!–n

The n th preceding command.

!string

The most recent command line that started with string.

!?string [?]

The most recent command that contained string. The last ? is optional.

!#

The current command (as you have it typed so far).

You can reexecute the previous event by giving a !! command. In the following example,
event 45 reexecutes event 44:
44 $ ls -l text
-rw-rw-r--. 1 max pubs 45 04-30 14:53 text
45 $ !!
ls -l text
-rw-rw-r--. 1 max pubs 45 04-30 14:53 text

The !! command works whether or not your prompt displays an event number. As
this example shows, when you use the history mechanism to reexecute an event, the
shell displays the command it is reexecuting.

332 Chapter 8 The Bourne Again Shell (bash)
!n event number

A number following an exclamation point refers to an event. If that event is in the
history list, the shell executes it. Otherwise, the shell displays an error message. A
negative number following an exclamation point references an event relative to the
current event. For example, the command !–3 refers to the third preceding event.
After you issue a command, the relative event number of a given event changes (event
–3 becomes event –4). Both of the following commands reexecute event 44:
51 $ !44
ls -l text
-rw-rw-r--. 1 max pubs 45 04-30 14:53 text
52 $ !-8
ls -l text
-rw-rw-r--. 1 max pubs 45 04-30 14:53 text

!string event text

When a string of text follows an exclamation point, the shell searches for and
executes the most recent event that began with that string. If you enclose the
string within question marks, the shell executes the most recent event that contained that string. The final question mark is optional if a RETURN would
immediately follow it.
68 $ history 10
59 ls -l text*
60 tail text5
61 cat text1 text5 > letter
62 vim letter
63 cat letter
64 cat memo
65 lpr memo
66 pine zach
67 ls -l
68 history
69 $ !l
ls -l
...
70 $ !lpr
lpr memo
71 $ !?letter?
cat letter
...

optional Word Designators
A word designator specifies a word (token) or series of words from an event (a command line). Table 8-9 on page 334 lists word designators. The words on a command
line are numbered starting with 0 (the first word, usually the command), continuing
with 1 (the first word following the command), and ending with n (the last word on
the command line).
To specify a particular word from a previous event, follow the event designator (such
as !14) with a colon and the number of the word in the previous event. For example,

History 333

!14:3 specifies the third word following the command from event 14. You can specify
the first word following the command (word number 1) using a caret (^) and the last
word using a dollar sign ($). You can specify a range of words by separating two
word designators with a hyphen.
72 $ echo apple grape orange pear
apple grape orange pear
73 $ echo !72:2
echo grape
grape
74 $ echo !72:^
echo apple
apple
75 $ !72:0 !72:$
echo pear
pear
76 $ echo !72:2-4
echo grape orange pear
grape orange pear
77 $ !72:0-$
echo apple grape orange pear
apple grape orange pear

As the next example shows, !$ refers to the last word of the previous event. You can
use this shorthand to edit, for example, a file you just displayed using cat:
$ cat report.718
...
$ vim !$
vim report.718
...

If an event contains a single command, the word numbers correspond to the argument numbers. If an event contains more than one command, this correspondence
does not hold for commands after the first. In the next example, event 78 contains
two commands separated by a semicolon so the shell executes them sequentially; the
semicolon is word number 5.
78 $ !72 ; echo helen zach barbara
echo apple grape orange pear ; echo helen zach barbara
apple grape orange pear
helen zach barbara
79 $ echo !78:7
echo helen
helen
80 $ echo !78:4-7
echo pear ; echo helen
pear
helen

334 Chapter 8 The Bourne Again Shell (bash)

Table 8-9

Word designators

Designator

Meaning

n

The n th word. Word 0 is normally the command name.

^

The first word (after the command name).

$

The last word.

m –n

All words from word number m through word number n; m defaults to 0 if you
omit it (0–n ).

n*

All words from word number n through the last word.

*

All words except the command name. The same as 1*.

%

The word matched by the most recent ?string ? search.

Modifiers
On occasion you might want to change an aspect of an event you are reexecuting. Perhaps you entered a complex command line with a typo or incorrect pathname or you
want to specify a different argument. You can modify an event or a word of an event
by putting one or more modifiers after the word designator or after the event designator
if there is no word designator. Each modifier must be preceded by a colon (:).
Substitute modifier

The following example shows the substitute modifier correcting a typo in the previous
event:
$ car /home/zach/memo.0507 /home/max/letter.0507
bash: car: command not found
$ !!:s/car/cat
cat /home/zach/memo.0507 /home/max/letter.0507
...

The substitute modifier has the following syntax:
[g]s/old/new/
where old is the original string (not a regular expression) and new is the string that
replaces old. The substitute modifier substitutes the first occurrence of old with new.
Placing a g before the s causes a global substitution, replacing all occurrences of old.
Although / is the delimiter in the examples, you can use any character that is not in
either old or new. The final delimiter is optional if a RETURN would immediately follow
it. As with the vim Substitute command, the history mechanism replaces an ampersand (&) in new with old. The shell replaces a null old string (s//new/) with the
previous old string or the string within a command you searched for using ?string?.
Quick substitution

An abbreviated form of the substitute modifier is quick substitution. Use it to reexecute the most recent event while changing some of the event text. The quick
substitution character is the caret (^). For example, the command
$ ^old^new^

History 335

produces the same results as
$ !!:s/old/new/

Thus substituting cat for car in the previous event could have been entered as
$ ^car^cat
cat /home/zach/memo.0507 /home/max/letter.0507
...

You can omit the final caret if it would be followed immediately by a RETURN. As with
other command-line substitutions, the shell displays the command line as it appears
after the substitution.
Other modifiers

Modifiers (other than the substitute modifier) perform simple edits on the part of the
event that has been selected by the event designator and the optional word designators. You can use multiple modifiers, each preceded by a colon (:).
The following series of commands uses ls to list the name of a file, repeats the command
without executing it (p modifier), and repeats the last command, removing the last part
of the pathname (h modifier) again without executing it:
$ ls /etc/ssh/ssh_config
/etc/ssh/ssh_config
$ !!:p
ls /etc/ssh/ssh_config
$ !!:h:p
ls /etc/ssh

Table 8-10 lists event modifiers other than the substitute modifier.

Table 8-10

Event modifiers

Modifier

Function

e

(extension)

Removes all but the filename extension

h

(head)

Removes the last part of a pathname

p

(print)

Displays the command but does not execute it

q

(quote)

Quotes the substitution to prevent further substitutions on it

r

(root)

Removes the filename extension

t

(tail)

Removes all elements of a pathname except the last

x

Like q but quotes each word in the substitution individually

The Readline Library
Command-line editing under the Bourne Again Shell is implemented through the
Readline Library, which is available to any application written in C. Any application that uses the Readline Library supports line editing that is consistent with that

336 Chapter 8 The Bourne Again Shell (bash)

provided by bash. Programs that use the Readline Library, including bash, read
~/.inputrc (page 339) for key binding information and configuration settings. The
––noediting command-line option turns off command-line editing in bash.
vi mode

You can choose one of two editing modes when using the Readline Library in bash:
emacs or vi(m). Both modes provide many of the commands available in the stand-alone
versions of the emacs and vim editors. You can also use the ARROW keys to move around.
Up and down movements move you backward and forward through the history list. In
addition, Readline provides several types of interactive word completion (page 338).
The default mode is emacs; you can switch to vi mode using the following command:
$ set -o vi

emacs mode

The next command switches back to emacs mode:
$ set -o emacs

vi Editing Mode
Before you start, make sure the shell is in vi mode.
When you enter bash commands while in vi editing mode, you are in Input mode
(page 163). As you enter a command, if you discover an error before you press RETURN,
you can press ESCAPE to switch to vim Command mode. This setup is different from the
stand-alone vim editor’s initial mode. While in Command mode you can use many vim
commands to edit the command line. It is as though you were using vim to edit a copy
of the history file with a screen that has room for only one command. When you use
the k command or the UP ARROW to move up a line, you access the previous command.
If you then use the j command or the DOWN ARROW to move down a line, you return to
the original command. To use the k and j keys to move between commands, you must
be in Command mode; you can use the ARROW keys in both Command and Input modes.

The command-line vim editor starts in Input mode
tip The stand-alone vim editor starts in Command mode, whereas the command-line vim editor
starts in Input mode. If commands display characters and do not work properly, you are in Input
mode. Press ESCAPE and enter the command again.

In addition to cursor-positioning commands, you can use the search-backward (?)
command followed by a search string to look back through the history list for the
most recent command containing a string. If you have moved back in the history list,
use a forward slash (/) to search forward toward the most recent command. Unlike
the search strings in the stand-alone vim editor, these search strings cannot contain
regular expressions. You can, however, start the search string with a caret (^) to force
the shell to locate commands that start with the search string. As in vim, pressing n
after a successful search looks for the next occurrence of the same string.
You can also use event numbers to access events in the history list. While you are in
Command mode (press ESCAPE), enter the event number followed by a G to go to the
command with that event number.

History 337

When you use /, ?, or G to move to a command line, you are in Command mode, not
Input mode: You can edit the command or press RETURN to execute it.
When the command you want to edit is displayed, you can modify the command line
using vim Command mode editing commands such as x (delete character), r (replace
character), ~ (change case), and . (repeat last change). To switch to Input mode, use
an Insert (i, I), Append (a, A), Replace (R), or Change (c, C) command. You do not
have to return to Command mode to execute a command; simply press RETURN, even
if the cursor is in the middle of the command line. For more information refer to the
vim tutorial on page 161. Refer to page 206 for a summary of vim commands.

emacs Editing Mode
Unlike the vim editor, emacs is modeless. You need not switch between Command
mode and Input mode because most emacs commands are control characters
(page 223), allowing emacs to distinguish between input and commands. Like vim,
the emacs command-line editor provides commands for moving the cursor on the
command line and through the command history list and for modifying part or all
of a command. However, in a few cases, the emacs command-line editor commands
differ from those used in the stand-alone emacs editor.
In emacs you perform cursor movement by using both CONTROL and ESCAPE commands.
To move the cursor one character backward on the command line, press CONTROL-B. Press
CONTROL-F to move one character forward. As in vim, you can precede these movements
with counts. To use a count you must first press ESCAPE; otherwise, the numbers you type
will appear on the command line.
Like vim, emacs provides word and line movement commands. To move backward or
forward one word on the command line, press ESCAPE b or ESCAPE f, respectively. To move
several words using a count, press ESCAPE followed by the number and the appropriate
escape sequence. To move to the beginning of the line, press CONTROL-A; to move to the
end of the line, press CONTROL-E; and to move to the next instance of the character c, press
CONTROL-X CONTROL-F followed by c.
You can add text to the command line by moving the cursor to the position you
want to enter text and typing. To delete text, move the cursor just to the right of
the characters you want to delete and press the erase key (page 29) once for each
character you want to delete.

CONTROL-D can terminate your screen session
caution If you want to delete the character directly under the cursor, press CONTROL-D. If you enter CONTROL-D
at the beginning of the line, it might terminate your shell session.

If you want to delete the entire command line, press the line kill key (page 30). You can
press this key while the cursor is anywhere in the command line. Use CONTROL-K to delete
from the cursor to the end of the line. Refer to page 261 for a summary of emacs
commands.

338 Chapter 8 The Bourne Again Shell (bash)

Readline Completion Commands
You can use the TAB key to complete words you are entering on the command line. This
facility, called completion, works in both vi and emacs editing modes and is similar
to the completion facility available in tcsh. Several types of completion are possible,
and which one you use depends on which part of a command line you are typing
when you press TAB.

Command Completion
If you are typing the name of a command, pressing TAB initiates command completion,
in which bash looks for a command whose name starts with the part of the word you
have typed. If no command starts with the characters you entered, bash beeps. If there
is one such command, bash completes the command name. If there is more than one
choice, bash does nothing in vi mode and beeps in emacs mode. Pressing TAB a second
time causes bash to display a list of commands whose names start with the prefix you
typed and allows you to continue typing the command name.
In the following example, the user types bz and presses TAB. The shell beeps (the user
is in emacs mode) to indicate that several commands start with the letters bz. The user
enters another TAB to cause the shell to display a list of commands that start with bz
followed by the command line as the user has entered it so far:
$ bz  TAB (beep)  TAB
bzcat
bzdiff
bzcmp
bzgrep
$ bz■

bzip2
bzip2recover

bzless
bzmore

Next the user types c and presses TAB twice. The shell displays the two commands that
start with bzc. The user types a followed by TAB. At this point the shell completes the
command because only one command starts with bzca.
$ bzc  TAB (beep)
bzcat bzcmp
$ bzca  TAB  t



TAB

■

Pathname Completion
Pathname completion, which also uses TABs, allows you to type a portion of a pathname and have bash supply the rest. If the portion of the pathname you have typed
is sufficient to determine a unique pathname, bash displays that pathname. If more
than one pathname would match it, bash completes the pathname up to the point
where there are choices so that you can type more.
When you are entering a pathname, including a simple filename, and press TAB, the
shell beeps (if the shell is in emacs mode—in vi mode there is no beep). It then extends
the command line as far as it can.
$ cat films/dar



TAB (beep) cat films/dark_■

History 339

In the films directory every file that starts with dar has k_ as the next characters,
so bash cannot extend the line further without making a choice among files. The
shell leaves the cursor just past the _ character. At this point you can continue typing the pathname or press TAB twice. In the latter case bash beeps, displays the
choices, redisplays the command line, and again leaves the cursor just after the
_ character.
$ cat films/dark_  TAB (beep)
dark_passage dark_victory
$ cat films/dark_■



TAB

When you add enough information to distinguish between the two possible files and
press TAB, bash displays the unique pathname. If you enter p followed by TAB after the
_ character, the shell completes the command line:
$ cat films/dark_p



TAB



assage

Because there is no further ambiguity, the shell appends a SPACE so you can either finish
typing the command line or press RETURN to execute the command. If the complete
pathname is that of a directory, bash appends a slash (/) in place of a SPACE.

Variable Completion
When you are typing a variable name, pressing TAB results in variable completion,
wherein bash attempts to complete the name of the variable. In case of an ambiguity,
pressing TAB twice displays a list of choices:
$ echo $HO  TAB (beep)  TAB
$HOME
$HOSTNAME $HOSTTYPE
$ echo $HOM  TAB  E

Pressing RETURN executes the command
caution Pressing RETURN causes the shell to execute the command regardless of where the cursor is on
the command line.

.inputrc: Configuring the Readline Library
The Bourne Again Shell and other programs that use the Readline Library read the file
specified by the INPUTRC environment variable to obtain initialization information.
If INPUTRC is not set, these programs read the ~/.inputrc file. They ignore lines of
.inputrc that are blank or that start with a hashmark (#).

Variables
You can set variables in .inputrc to control the behavior of the Readline Library using
the following syntax:
set variable value

340 Chapter 8 The Bourne Again Shell (bash)

Table 8-11 lists some variables and values you can use. See “Readline Variables” in
the bash man or info page for a complete list.

Table 8-11

Readline variables

Variable

Effect

editing-mode

Set to vi to start Readline in vi mode. Set to emacs to start
Readline in emacs mode (the default). Similar to the set –o vi
and set –o emacs shell commands (page 336).

horizontal-scroll-mode

Set to on to cause long lines to extend off the right edge of the
display area. Moving the cursor to the right when it is at the right
edge of the display area shifts the line to the left so you can see
more of the line. Shift the line back by moving the cursor back
past the left edge. The default value is off, which causes long
lines to wrap onto multiple lines of the display.

mark-directories

Set to off to cause Readline not to place a slash (/) at the end of
directory names it completes. The default value is on.

mark-modified-lines

Set to on to cause Readline to precede modified history lines
with an asterisk. The default value is off.

Key Bindings
You can map keystroke sequences to Readline commands, changing or extending the
default bindings. Like the emacs editor, the Readline Library includes many commands that are not bound to a keystroke sequence. To use an unbound command,
you must map it using one of the following forms:
keyname: command_name
"keystroke_sequence": command_name
In the first form, you spell out the name for a single key. For example, CONTROL-U would
be written as control-u. This form is useful for binding commands to single keys.
In the second form, you specify a string that describes a sequence of keys that will be
bound to the command. You can use the emacs-style backslash escape sequences
(page 223) to represent the special keys CONTROL (\C), META (\M), and ESCAPE (\e). Specify
a backslash by escaping it with another backslash: \\. Similarly a double or single
quotation mark can be escaped with a backslash: \" or \'.
The kill-whole-line command, available in emacs mode only, deletes the current line.
Put the following command in .inputrc to bind the kill-whole-line command (which
is unbound by default) to the keystroke sequence CONTROL-R:
control-r: kill-whole-line
bind

Give the command bind –P to display a list of all Readline commands. If a command
is bound to a key sequence, that sequence is shown. Commands you can use in vi
mode start with vi. For example, vi-next-word and vi-prev-word move the cursor to

History 341

the beginning of the next and previous words, respectively. Commands that do not
begin with vi are generally available in emacs mode.
Use bind –q to determine which key sequence is bound to a command:
$ bind -q kill-whole-line
kill-whole-line can be invoked via "\C-r".

You can also bind text by enclosing it within double quotation marks (emacs mode
only):
"QQ": "The Linux Operating System"

This command causes bash to insert the string The Linux Operating System when
you type QQ on the command line.

Conditional Constructs
You can conditionally select parts of the .inputrc file using the $if directive. The syntax
of the conditional construct is
$if test[=value]
commands
[$else
commands]
$endif
where test is mode, term, or a program name such as bash. If test equals value (or if test
is true when value is not specified), this structure executes the first set of commands. If
test does not equal value (or if test is false when value is not specified), it executes the
second set of commands if they are present or exits from the structure if they are not
present.
The power of the $if directive lies in the three types of tests it can perform.
1. You can test to see which mode is currently set.
$if mode=vi

The preceding test is true if the current Readline mode is vi and false
otherwise. You can test for vi or emacs.
2. You can test the type of terminal.
$if term=xterm

The preceding test is true if the TERM variable is set to xterm. You can test
for any value of TERM.
3. You can test the application name.
$if bash

The preceding test is true when you are running bash and not another program
that uses the Readline Library. You can test for any application name.

342 Chapter 8 The Bourne Again Shell (bash)

These tests can customize the Readline Library based on the current mode, the type
of terminal, and the application you are using. They give you a great deal of power
and flexibility when you are using the Readline Library with bash and other
programs.
The following commands in .inputrc cause CONTROL-Y to move the cursor to the beginning
of the next word regardless of whether bash is in vi or emacs mode:
$ cat ~/.inputrc
set editing-mode vi
$if mode=vi
"\C-y": vi-next-word
$else
"\C-y": forward-word
$endif

Because bash reads the preceding conditional construct when it is started, you must
set the editing mode in .inputrc. Changing modes interactively using set will not
change the binding of CONTROL-Y.
For more information on the Readline Library, open the bash man page and give the
command /^READLINE, which searches for the word READLINE at the beginning
of a line.

If Readline commands do not work, log out and log in again
tip The Bourne Again Shell reads ~/.inputrc when you log in. After you make changes to this file, you
must log out and log in again before the changes will take effect.

Aliases
An alias is a (usually short) name that the shell translates into another (usually longer)
name or command. Aliases allow you to define new commands by substituting a
string for the first token of a simple command. They are typically placed in the
~/.bashrc (bash) or ~/.tcshrc (tcsh) startup files so that they are available to interactive subshells.
Under bash the syntax of the alias builtin is
alias [name[=value]]
Under tcsh the syntax is
alias [name[ value]]
In the bash syntax no SPACEs are permitted around the equal sign. If value contains
SPACEs or TABs, you must enclose value within quotation marks. Unlike aliases under
tcsh, a bash alias does not accept an argument from the command line in value. Use
a bash function (page 346) when you need to use an argument.

Aliases 343

An alias does not replace itself, which avoids the possibility of infinite recursion in
handling an alias such as the following:
$ alias ls='ls -F'

You can nest aliases. Aliases are disabled for noninteractive shells (that is, shell scripts).
Use the unalias builtin to remove an alias. When you give an alias builtin command
without any arguments, the shell displays a list of all defined aliases:
$ alias
alias ll='ls -l'
alias l='ls -ltr'
alias ls='ls -F'
alias zap='rm -i'

To view the alias for a particular name, enter the command alias followed by the
name of the alias. Most Linux distributions define at least some aliases. Enter an alias
command to see which aliases are in effect. You can delete the aliases you do not want
from the appropriate startup file.

Single Versus Double Quotation Marks in Aliases
The choice of single or double quotation marks is significant in the alias syntax when
the alias includes variables. If you enclose value within double quotation marks, any
variables that appear in value are expanded when the alias is created. If you enclose
value within single quotation marks, variables are not expanded until the alias is
used. The following example illustrates the difference.
The PWD keyword variable holds the pathname of the working directory. Max creates two aliases while he is working in his home directory. Because he uses double
quotation marks when he creates the dirA alias, the shell substitutes the value of the
working directory when he creates this alias. The alias dirA command displays the
dirA alias and shows that the substitution has already taken place:
$ echo $PWD
/home/max
$ alias dirA="echo Working directory is $PWD"
$ alias dirA
alias dirA='echo Working directory is /home/max'

When Max creates the dirB alias, he uses single quotation marks, which prevent the
shell from expanding the $PWD variable. The alias dirB command shows that the
dirB alias still holds the unexpanded $PWD variable:
$ alias dirB='echo Working directory is $PWD'
$ alias dirB
alias dirB='echo Working directory is $PWD'

After creating the dirA and dirB aliases, Max uses cd to make cars his working directory and gives each of the aliases as a command. The alias he created using double
quotation marks displays the name of the directory he created the alias in as the

344 Chapter 8 The Bourne Again Shell (bash)

working directory (which is wrong). In contrast, the dirB alias displays the proper
name of the working directory:
$ cd cars
$ dirA
Working directory is /home/max
$ dirB
Working directory is /home/max/cars

How to prevent the shell from invoking an alias
tip The shell checks only simple, unquoted commands to see if they are aliases. Commands given as
relative or absolute pathnames and quoted commands are not checked. When you want to give a
command that has an alias but do not want to use the alias, precede the command with a backslash,
specify the command’s absolute pathname, or give the command as ./command.

Examples of Aliases
The following alias allows you to type r to repeat the previous command or r abc to
repeat the last command line that began with abc:
$ alias r='fc -s'

If you use the command ls –ltr frequently, you can create an alias that substitutes
ls –ltr when you give the command l:
$ alias l='ls
$ l
-rw-r-----. 1
-rw-r--r--. 1
-rw-r--r--. 1
-rw-r--r--. 1
drwxrwxr-x. 2
drwxrwxr-x. 2
-rwxr-xr-x. 1
drwxrwxr-x. 2

-ltr'
max
max
max
max
max
max
max
max

pubs 3089 02-11 16:24 XTerm.ad
pubs 30015 03-01 14:24 flute.ps
pubs
641 04-01 08:12 fixtax.icn
pubs
484 04-09 08:14 maptax.icn
pubs 1024 08-09 17:41 Tiger
pubs 1024 09-10 11:32 testdir
pubs
485 09-21 08:03 floor
pubs 1024 09-27 20:19 Test_Emacs

Another common use of aliases is to protect yourself from mistakes. The following
example substitutes the interactive version of the rm utility when you enter the command zap:
$ alias zap='rm -i'
$ zap f*
rm: remove 'fixtax.icn'? n
rm: remove 'flute.ps'? n
rm: remove 'floor'? n

The –i option causes rm to ask you to verify each file that would be deleted,
thereby helping you avoid deleting the wrong file. You can also alias rm with the
rm –i command: alias rm='rm –i'.
The aliases in the next example cause the shell to substitute ls –l each time you give an ll
command and ls –F each time you use ls. The –F option causes ls to print a slash (/) at
the end of directory names and an asterisk (*) at the end of the names of executable files.

Aliases 345
$ alias ls='ls -F'
$ alias ll='ls -l'
$ ll
drwxrwxr-x. 2 max pubs 1024 09-27 20:19 Test_Emacs/
drwxrwxr-x. 2 max pubs 1024 08-09 17:41 Tiger/
-rw-r-----. 1 max pubs 3089 02-11 16:24 XTerm.ad
-rw-r--r--. 1 max pubs
641 04-01 08:12 fixtax.icn
-rw-r--r--. 1 max pubs 30015 03-01 14:24 flute.ps
-rwxr-xr-x. 1 max pubs
485 09-21 08:03 floor*
-rw-r--r--. 1 max pubs
484 04-09 08:14 maptax.icn
drwxrwxr-x. 2 max pubs 1024 09-10 11:32 testdir/

In this example, the string that replaces the alias ll (ls –l) itself contains an alias (ls).
When it replaces an alias with its value, the shell looks at the first word of the
replacement string to see whether it is an alias. In the preceding example, the
replacement string contains the alias ls, so a second substitution occurs to produce
the final command ls –F –l. (To avoid a recursive plunge, the ls in the replacement
text, although an alias, is not expanded a second time.)
When given a list of aliases without the =value or value field, the alias builtin displays
the value of each defined alias. The alias builtin reports an error if an alias has not
been defined:
$ alias ll l ls zap wx
alias ll='ls -l'
alias l='ls -ltr'
alias ls='ls -F'
alias zap='rm -i'
bash: alias: wx: not found

You can avoid alias substitution by preceding the aliased command with a backslash (\):
$ \ls
Test_Emacs XTerm.ad
Tiger
fixtax.icn

flute.ps
floor

maptax.icn
testdir

Because the replacement of an alias name with the alias value does not change the rest
of the command line, any arguments are still received by the command that is executed:
$ ll f*
-rw-r--r--. 1 max pubs
641 04-01 08:12 fixtax.icn
-rw-r--r--. 1 max pubs 30015 03-01 14:24 flute.ps
-rwxr-xr-x. 1 max pubs
485 09-21 08:03 floor*

You can remove an alias using the unalias builtin. When the zap alias is removed, it is no
longer displayed by the alias builtin, and its subsequent use results in an error message:
$ unalias zap
$ alias
alias ll='ls -l'
alias l='ls -ltr'
alias ls='ls -F'
$ zap maptax.icn
bash: zap: command not found

346 Chapter 8 The Bourne Again Shell (bash)

Functions
A shell function (tcsh does not have functions) is similar to a shell script in that it
stores a series of commands for execution at a later time. However, because the shell
stores a function in the computer’s main memory (RAM) instead of in a file on the
disk, the shell can access it more quickly than the shell can access a script. The shell
also preprocesses (parses) a function so it starts more quickly than a script. Finally
the shell executes a shell function in the same shell that called it. If you define too
many functions, the overhead of starting a subshell (as when you run a script) can
become unacceptable.
You can declare a shell function in the ~/.bash_profile startup file, in the script that
uses it, or directly from the command line. You can remove functions using the unset
builtin. The shell does not retain functions after you log out.

Removing variables and functions that have the same name
tip If you have a shell variable and a function that have the same name, using unset removes the shell
variable. If you then use unset again with the same name, it removes the function.

The syntax that declares a shell function is
[function] function-name () {
commands
}
where the word function is optional (and is frequently omitted; it is not portable),
function-name is the name you use to call the function, and commands comprise
the list of commands the function executes when you call it. The commands can be
anything you would include in a shell script, including calls to other functions.
The opening brace ({) can appear on the line following the function name. Aliases and
variables are expanded when a function is read, not when it is executed. You can use
the break statement (page 441) within a function to terminate its execution.
You can declare a function on a single line. Because the closing brace must appear as
a separate command, you must place a semicolon before the closing brace when you
use this syntax:
$ say_hi() { echo "hi" ; }
$ say_hi
hi

Shell functions are useful as a shorthand as well as to define special commands. The
following function starts a process named process in the background, with the output
normally displayed by process being saved in .process.out.

Functions

347

start_process() {
process > .process.out 2>&1 &
}

The next example creates a simple function that displays the date, a header, and a list
of the people who are logged in on the system. This function runs the same commands as the whoson script described on page 285. In this example the function is
being entered from the keyboard. The greater than (>) signs are secondary shell
prompts (PS2); do not enter them.
$ function whoson () {
>
date
>
echo "Users Currently Logged On"
>
who
> }
$ whoson
Fri Aug 9 15:44:58 PDT 2013
Users Currently Logged On
hls
console
2013-08-08 08:59
max
pts/4
2013-08-08 09:33
zach
pts/7
2013-08-08 09:23
Function local
variables

(:0)
(0.0)
(guava)

You can use the local builtin only within a function. This builtin causes its arguments
to be local to the function it is called from and its children. Without local, variables
declared in a function are available to the shell that called the function (functions are
run in the shell they are called from). The following function demonstrates the use
of local.
$ demo () {
> x=4
> local y=8
> echo "demo: $x $y"
> }
$ demo
demo: 4 8
$ echo $x
4
$ echo $y
$

The demo function, which is entered from the keyboard, declares two variables, x
and y, and displays their values. The variable x is declared with a normal assignment
statement while y is declared using local. After running the function, the shell that
called the function has access to x but knows nothing of y. See page 476 for another
example of function local variables.
Export a function

An export –f command places the named function in the environment so it is available
to child processes.

348 Chapter 8 The Bourne Again Shell (bash)
Functions in
startup files

If you want the whoson function to be available without having to enter it each time
you log in, put its definition in ~/.bash_profile. Then run .bash_profile, using the
. (dot) command to put the changes into effect immediately:
$ cat ~/.bash_profile
export TERM=vt100
stty kill '^u'
whoson () {
date
echo "Users Currently Logged On"
who
}
$ . ~/.bash_profile

You can specify arguments when you call a function. Within the function these arguments
are available as positional parameters (page 458). The following example shows the arg1
function entered from the keyboard:
$ arg1 ( ) { echo "$1" ; }
$ arg1 first_arg
first_arg

See the function switch () on page 280 for another example of a function.

optional The following function allows you to place variables in the environment (export them)
using tcsh syntax. The env utility lists all environment variables and their values and
verifies that setenv worked correctly:
$ cat .bash_profile
...
# setenv - keep tcsh users happy
setenv() {
if [ $# -eq 2 ]
then
eval $1=$2
export $1
else
echo "Usage: setenv NAME VALUE" 1>&2
fi
}
$ . ~/.bash_profile
$ setenv TCL_LIBRARY /usr/local/lib/tcl
$ env | grep TCL_LIBRARY
TCL_LIBRARY=/usr/local/lib/tcl
eval

The $# special parameter (page 463) takes on the value of the number of commandline arguments. This function uses the eval builtin to force bash to scan the command
$1=$2 twice. Because $1=$2 begins with a dollar sign ($), the shell treats the entire
string as a single token—a command. With variable substitution performed, the command name becomes TCL_LIBRARY=/usr/local/lib/tcl, which results in an error.

Controlling bash: Features and Options 349

With eval, a second scanning splits the string into the three desired tokens, and the
correct assignment occurs. See page 487 for more information on eval.

Controlling bash: Features and Options
This section explains how to control bash features and options using command-line
options and the set and shopt builtins. The shell sets flags to indicate which options
are set (on) and expands $– to a list of flags that are set; see page 466 for more
information.

bash Command-Line Options
You can specify short and long command-line options. Short options consist of a
hyphen followed by a letter; long options have two hyphens followed by multiple
characters. Long options must appear before short options on a command line that
calls bash. Table 8-12 lists some commonly used command-line options.

Table 8-12

bash command-line options

Option

Explanation

Syntax

Help

Displays a usage message.

––help

No edit

Prevents users from using the Readline Library
(page 335) to edit command lines in an interactive
shell.

––noediting

No profile

Prevents reading these startup files (page 278):
/etc/profile, ~/.bash_profile, ~/.bash_login, and
~/.profile.

––noprofile

No rc

Prevents reading the ~/.bashrc startup file
(page 279). This option is on by default if the shell is
called as sh.

––norc

POSIX

Runs bash in POSIX mode.

––posix

Version

Displays bash version information and exits.

––version

Login

Causes bash to run as though it were a login shell.

–l (lowercase “l”)

shopt

Runs a shell with the opt shopt option (page 350). A
–O (uppercase “O”) sets the option; +O unsets it.

[±]O [opt ]

End of options

On the command line, signals the end of options.
Subsequent tokens are treated as arguments even if
they begin with a hyphen (–).

––

350 Chapter 8 The Bourne Again Shell (bash)

Shell Features
You can control the behavior of the Bourne Again Shell by turning features on and
off. Different methods turn different features on and off: The set builtin controls one
group of features, and the shopt builtin controls another group. You can also control
many features from the command line you use to call bash.

Features, options, variables, attributes?
tip To avoid confusing terminology, this book refers to the various shell behaviors that you can control
as features. The bash info page refers to them as “options” and “values of variables controlling
optional shell behavior.” In some places you might see them referred to as attributes.

set ±o: Turns Shell Features On and Off
The set builtin, when used with the –o or +o option, enables, disables, and lists certain
bash features (the set builtin in tcsh works differently). For example, the following
command turns on the noclobber feature (page 139):
$ set -o noclobber

You can turn this feature off (the default) by giving this command:
$ set +o noclobber

The command set –o without an option lists each of the features controlled by set,
followed by its state (on or off). The command set +o without an option lists the same
features in a form you can use as input to the shell. Table 8-13 lists bash features. This
table does not list the –i option because you cannot set it. The shell sets this option
when it is invoked as an interactive shell. See page 460 for a discussion of other uses
of set.

shopt: Turns Shell Features On and Off
The shopt (shell option) builtin (not in tcsh) enables, disables, and lists certain bash
features that control the behavior of the shell. For example, the following command
causes bash to include filenames that begin with a period (.) when it expands ambiguous file references (the –s stands for set):
$ shopt -s dotglob

You can turn this feature off (the default) by giving the following command (the –u
stands for unset):
$ shopt -u dotglob

The shell displays how a feature is set if you give the name of the feature as the only
argument to shopt:
$ shopt dotglob
dotglob
off

Controlling bash: Features and Options 351

Without any options or arguments, shopt lists the features it controls and their states.
The command shopt –s without an argument lists the features controlled by shopt
that are set or on. The command shopt –u lists the features that are unset or off.
Table 8-13 lists bash features.

Setting set ±o features using shopt
tip You can use shopt to set/unset features that are otherwise controlled by set ±o. Use the regular
shopt syntax using –s or –u and include the –o option. For example, the following command turns
on the noclobber feature:
$ shopt -o -s noclobber

Table 8-13

bash features

Feature

Description

Syntax

Alternative syntax

allexport

Automatically places in the environment
(exports) all variables and functions you
create or modify after giving this command
(default is off).

set –o allexport

set –a

braceexpand

Causes bash to perform brace expansion
(default is on; page 356).

set –o braceexpand

set –B

cdspell

Corrects minor spelling errors in directory
names used as arguments to cd (default
is off).

shopt –s cdspell

cmdhist

Saves all lines of a multiline command in
the same history entry, adding semicolons
as needed (default is on).

shopt –s cmdhist

dotglob

Causes shell special characters
(wildcards; page 148) in an ambiguous
file reference to match a leading period in
a filename. By default, special characters
do not match a leading period: You must
always specify the filenames . and ..
explicitly because no pattern ever
matches them (default is off).

shopt –s dotglob

emacs

Specifies emacs editing mode for
command-line editing (default is on;
page 337).

set –o emacs

errexit

Causes bash to exit when a pipeline
(page 141), which can be a simple
command (page 130; not a control
structure), fails (default is off).

set –o errexit

set –e

352 Chapter 8 The Bourne Again Shell (bash)

Table 8-13

bash features (continued)

Feature

Description

Syntax

execfail

Causes a shell script to continue running
when it cannot find the file that is given as
an argument to exec. By default, a script
terminates when exec cannot find the file
that is given as its argument (default is off).

shopt –s execfail

expand_aliases

Causes aliases (page 342) to be expanded
(default is on for interactive shells and off
for noninteractive shells).

shopt –s expand_aliases

hashall

Causes bash to remember where
commands it has found using PATH
(page 308) are located (default is on).

set –o hashall

histappend

Causes bash to append the history list to
the file named by HISTFILE (page 326)
when the shell exits (default is off [bash
overwrites this file]).

shopt –s histappend

histexpand

Turns on the history mechanism (which
uses exclamation points by default;
page 331). Turn this feature off to turn off
history expansion (default is on).

set –o histexpand

history

Enables command history (default is on;
page 326).

set –o history

huponexit

Specifies that bash send a SIGHUP signal
to all jobs when an interactive login shell
exits (default is off).

shopt –s huponexit

ignoreeof

Specifies that bash must receive ten EOF
characters before it exits. Useful on noisy
dial-up lines (default is off).

set –o ignoreeof

monitor

Enables job control (default is on;
page 294).

set –o monitor

nocaseglob

Causes ambiguous file references
(page 148) to match filenames without
regard to case (default is off).

shopt –s nocaseglob

noclobber

Helps prevent overwriting files (default is
off; page 139).

set –o noclobber

set –C

noglob

Disables pathname expansion (default is
off; page 148).

set –o noglob

set –f

Alternative syntax

set –h

set –H

set –m

Controlling bash: Features and Options 353

Table 8-13

bash features (continued)

Feature

Description

Syntax

Alternative syntax

notify

With job control (page 294) enabled,
reports the termination status of
background jobs immediately (default is
off: bash displays the status just before
the next prompt).

set –o notify

set –b

nounset

Displays an error when the shell tries to
expand an unset variable; bash exits from
a script but not from an interactive shell
(default is off: bash substitutes a null
value for an unset variable).

set –o nounset

set –u

nullglob

Causes bash to substitute a null string for
ambiguous file references (page 148) that
do not match a filename (default is off:
bash passes these file references as is).

shopt –s nullglob

pipefail

Sets the exit status of a pipeline to the exit
status of the last (rightmost) simple
command that failed (returned a nonzero
exit status) in the pipeline; if no command
failed, exit status is set to zero (default is
off: bash sets the exit status of a pipeline
to the exit status of the final command in
the pipeline).

set –o pipefail

posix

Runs bash in POSIX mode (default is off).

set –o posix

verbose

Displays each command line after bash
reads it but before bash expands it (default
is off). See also xtrace.

set –o verbose

vi

Specifies vi editing mode for commandline editing (default is off; page 336).

set –o vi

xpg_echo

Causes the echo builtin to expand
backslash escape sequences without the
need for the –e option (default is off;
page 445).

shopt –s xpg_echo

xtrace

Turns on shell debugging: Displays the
value of PS4 (page 311) followed by each
input line after the shell reads and expands
it (default is off; see page 430 for a
discussion). See also verbose.

set –o xtrace

set –v

set –x

354 Chapter 8 The Bourne Again Shell (bash)

Processing the Command Line
Whether you are working interactively or running a shell script, bash needs to read a
command line before it can start processing it—bash always reads at least one line before
processing a command. Some bash builtins, such as if and case, as well as functions and
quoted strings, span multiple lines. When bash recognizes a command that covers more
than one line, it reads the entire command before processing it. In interactive sessions,
bash prompts with the secondary prompt (PS2, > by default; page 311) as you type each
line of a multiline command until it recognizes the end of the command:
$ ps -ef |
> grep emacs
zach
26880 24579
zach
26890 24579

1 14:42 pts/10
0 14:42 pts/10

00:00:00 emacs notes
00:00:00 grep emacs

$ function hello () {
> echo hello there
> }
$

For more information refer to “Implicit Command-Line Continuation” on page 499.
After reading a command line, bash applies history expansion and alias substitution
to the command line.

History Expansion
“Reexecuting and Editing Commands” on page 328 discusses the commands you can
give to modify and reexecute command lines from the history list. History expansion
is the process bash uses to turn a history command into an executable command line.
For example, when you enter the command !!, history expansion changes that command line so it is the same as the previous one. History expansion is turned on by
default for interactive shells; set +o histexpand turns it off. History expansion does
not apply to noninteractive shells (shell scripts).

Alias Substitution
Aliases (page 342) substitute a string for the first word of a simple command. By
default, alias substitution is turned on for interactive shells and off for noninteractive
shells; shopt –u expand_aliases turns it off.

Parsing and Scanning the Command Line
After processing history commands and aliases, bash does not execute the command
immediately. One of the first things the shell does is to parse (isolate strings of characters
in) the command line into tokens (words). After separating tokens and before executing
the command, the shell scans the tokens and performs command-line expansion.

Command-Line Expansion
Both interactive and noninteractive shells transform the command line using commandline expansion before passing the command line to the program being called. You can

Processing the Command Line 355

use a shell without knowing much about command-line expansion, but you can use what
a shell has to offer to a better advantage with an understanding of this topic. This section
covers Bourne Again Shell command-line expansion; TC Shell command-line expansion
is covered starting on page 374.
The Bourne Again Shell scans each token for the various types of expansion and substitution in the following order. Most of these processes expand a word into a single
word. Only brace expansion, word splitting, and pathname expansion can change
the number of words in a command (except for the expansion of the variable "$@"—
see page 462).
1. Brace expansion (next page)
2. Tilde expansion (page 358)
3. Parameter and variable expansion (page 358)
4. Arithmetic expansion (page 359)
5. Command substitution (page 361)
6. Word splitting (page 362)
7. Pathname expansion (page 362)
8. Process substitution (page 364)
9. Quote removal (page 364)

Order of Expansion
The order in which bash carries out these steps affects the interpretation of commands. For example, if you set a variable to a value that looks like the instruction
for output redirection and then enter a command that uses the variable’s value to
perform redirection, you might expect bash to redirect the output.
$ SENDIT="> /tmp/saveit"
$ echo xxx $SENDIT
xxx > /tmp/saveit
$ cat /tmp/saveit
cat: /tmp/saveit: No such file or directory

In fact, the shell does not redirect the output—it recognizes input and output redirection before it evaluates variables. When it executes the command line, the shell
checks for redirection and, finding none, evaluates the SENDIT variable. After
replacing the variable with > /tmp/saveit, bash passes the arguments to echo,
which dutifully copies its arguments to standard output. No /tmp/saveit file is
created.

Quotation marks can alter expansion
tip Double and single quotation marks cause the shell to behave differently when performing expansions. Double quotation marks permit parameter and variable expansion but suppress other types
of expansion. Single quotation marks suppress all types of expansion.

356 Chapter 8 The Bourne Again Shell (bash)

Brace Expansion
Brace expansion, which originated in the C Shell, provides a convenient way to specify a series of strings or numbers. Although brace expansion is frequently used to
specify filenames, the mechanism can be used to generate arbitrary strings; the shell
does not attempt to match the brace notation with the names of existing files. Brace
expansion is turned on in interactive and noninteractive shells by default; you can
turn it off using set +o braceexpand. The shell also uses braces to isolate variable
names (page 304).
The following example illustrates how brace expansion works. The ls command does
not display any output because there are no files in the working directory. The echo
builtin displays the strings the shell generates using brace expansion.
$ ls
$ echo chap_{one,two,three}.txt
chap_one.txt chap_two.txt chap_three.txt

The shell expands the comma-separated strings inside the braces on the command
line into a SPACE-separated list of strings. Each string from the list is prepended with
the string chap_, called the preamble, and appended with the string .txt, called the
postscript. Both the preamble and the postscript are optional. The left-to-right order
of the strings within the braces is preserved in the expansion. For the shell to treat
the left and right braces specially and for brace expansion to occur, at least one
comma must be inside the braces and no unquoted whitespace can appear inside the
braces. You can nest brace expansions.
Brace expansion can match filenames. This feature is useful when there is a long
preamble or postscript. The following example copies four files—main.c, f1.c, f2.c,
and tmp.c—located in the /usr/local/src/C directory to the working directory:
$ cp /usr/local/src/C/{main,f1,f2,tmp}.c .

You can also use brace expansion to create directories with related names:
$ ls -F
file1 file2 file3
$ mkdir vrs{A,B,C,D,E}
$ ls -F
file1 file2 file3 vrsA/

vrsB/

vrsC/

vrsD/

vrsE/

The –F option causes ls to display a slash (/) after a directory and an asterisk (*) after
an executable file. If you tried to use an ambiguous file reference instead of braces to
specify the directories, the result would be different (and not what you wanted):
$ rmdir vrs*
$ mkdir vrs[A-E]
$ ls -F
file1 file2 file3

vrs[A-E]/

An ambiguous file reference matches the names of existing files. In the preceding example, because it found no filenames matching vrs[A–E], bash passed the ambiguous file

Processing the Command Line 357

reference to mkdir, which created a directory with that name. Brackets in ambiguous file
references are discussed on page 151.
Sequence
expression

Under newer versions of bash, brace expansion can include a sequence expression to
generate a sequence of characters. It can generate a sequential series of numbers or
letters using the following syntax:
{n1..n2[..incr]}
where n1 and n2 are numbers or single letters and incr is a number. This syntax
works on bash version 4.0+; give the command echo $BASH_VERSION to see which
version you are using. The incr does not work under OS X. When you specify invalid
arguments, bash copies the arguments to standard output. Following are some
examples:
$ echo {4..8}
4 5 6 7 8
$ echo {8..16..2}
8 10 12 14 16
$ echo {a..m..3}
a d g j m
$ echo {a..m..b}
{a..m..b}
$ echo {2..m}
{2..m}

See page 487 for a way to use variables to specify the values used by a sequence
expression. Page 432 shows an example in which a sequence expression is used to
specify step values in a for...in loop.
seq

Older versions of bash do not support sequence expressions. Although you can use
the seq utility to perform a similar function, seq does not work with letters and displays an error when given invalid arguments. The seq utility uses the following
syntax:
seq n1 [incr] n2
The –s option causes seq to use the specified character to separate its output. Following
are some examples:
$ seq 4 8
4
5
6
7
8
$ seq -s\ 8 2 16
8 10 12 14 16
$ seq a d
seq: invalid floating point argument: a
Try 'seq --help' for more information.

358 Chapter 8 The Bourne Again Shell (bash)

Tilde Expansion
Chapter 4 introduced a shorthand notation to specify your home directory or the
home directory of another user. This section provides a more detailed explanation of
tilde expansion.
The tilde (~) is a special character when it appears at the start of a token on a command line. When it sees a tilde in this position, bash looks at the following string of
characters—up to the first slash (/) or to the end of the word if there is no slash—as
a possible username. If this possible username is null (that is, if the tilde appears as
a word by itself or if it is immediately followed by a slash), the shell substitutes the
value of the HOME variable for the tilde. The following example demonstrates this
expansion, where the last command copies the file named letter from Max’s home
directory to the working directory:
$ echo $HOME
/home/max
$ echo ~
/home/max
$ echo ~/letter
/home/max/letter
$ cp ~/letter .

If the string of characters following the tilde forms a valid username, the shell substitutes the path of the home directory associated with that username for the tilde and
name. If the string is not null and not a valid username, the shell does not make any
substitution:
$ echo ~zach
/home/zach
$ echo ~root
/root
$ echo ~xx
~xx

Tildes are also used in directory stack manipulation (page 297). In addition, ~+ is a
synonym for PWD (the name of the working directory), and ~– is a synonym for
OLDPWD (the name of the previous working directory).

Parameter and Variable Expansion
On a command line, a dollar sign ($) that is not followed by an open parenthesis
introduces parameter or variable expansion. Parameters include both command-line,
or positional, parameters (page 458) and special parameters (page 463). Variables
include both user-created variables (page 302) and keyword variables (page 307).
The bash man and info pages do not make this distinction.
The shell does not expand parameters and variables that are enclosed within single
quotation marks and those in which the leading dollar sign is escaped (i.e., preceded
with a backslash). The shell does expand parameters and variables enclosed within
double quotation marks.

Processing the Command Line 359

Arithmetic Expansion
The shell performs arithmetic expansion by evaluating an arithmetic expression and
replacing it with the result. See page 388 for information on arithmetic expansion
under tcsh. Under bash the syntax for arithmetic expansion is
$((expression))
The shell evaluates expression and replaces $((expression)) with the result. This syntax
is similar to the syntax used for command substitution [$(...)] and performs a parallel
function. You can use $((expression)) as an argument to a command or in place of any
numeric value on a command line.
The rules for forming expression are the same as those found in the C programming
language; all standard C arithmetic operators are available (see Table 10-8 on
page 495). Arithmetic in bash is done using integers. Unless you use variables of type
integer (page 306) or actual integers, however, the shell must convert string-valued
variables to integers for the purpose of the arithmetic evaluation.
You do not need to precede variable names within expression with a dollar sign ($).
In the following example, after read (page 477) assigns the user’s response to age, an
arithmetic expression determines how many years are left until age 100:
$ cat age_check
#!/bin/bash
read -p "How old are you? " age
echo "Wow, in $((100-age)) years, you'll be 100!"
$ ./age_check
How old are you? 55
Wow, in 45 years, you'll be 100!

You do not need to enclose the expression within quotation marks because bash does
not perform pathname expansion until later. This feature makes it easier for you to
use an asterisk (*) for multiplication, as the following example shows:
$ echo There are $((60*60*24*365)) seconds in a non-leap year.
There are 31536000 seconds in a non-leap year.

The next example uses wc, cut, arithmetic expansion, and command substitution
(page 361) to estimate the number of pages required to print the contents of the file
letter.txt. The output of the wc (word count) utility (page 1000) used with the –l
option is the number of lines in the file, in columns (character positions) 1 through
4, followed by a SPACE and the name of the file (the first command following). The cut
utility (page 766) with the –c1–4 option extracts the first four columns.
$ wc -l letter.txt
351 letter.txt
$ wc -l letter.txt | cut -c1-4
351

360 Chapter 8 The Bourne Again Shell (bash)

The dollar sign and single parenthesis instruct the shell to perform command substitution; the dollar sign and double parentheses indicate arithmetic expansion:
$ echo $(( $(wc -l letter.txt | cut -c1-4)/66 + 1))
6

The preceding example sets up a pipeline that sends standard output from wc to standard input of cut. Because of command substitution, the output of both commands
replaces the commands between the $( and the matching ) on the command line.
Arithmetic expansion then divides this number by 66, the number of lines on a page.
A 1 is added because integer division discards remainders.

Fewer dollar signs ($)
tip When you specify variables within $(( and )), the dollar signs that precede individual variable
references are optional. This format also allows you to include whitespace around operators,
making expressions easier to read.
$ x=23 y=37
$ echo $(( 2
157
$ echo $(( 2
157

* $x + 3 * $y ))
* x + 3 * y ))

Another way to get the same result without using cut is to redirect the input to wc
instead of having wc get its input from a file you name on the command line. When
you redirect its input, wc does not display the name of the file:
$ wc -l < letter.txt
351

It is common practice to assign the result of arithmetic expansion to a variable:
$ numpages=$(( $(wc -l < letter.txt)/66 + 1))
let builtin

The let builtin (not in tcsh) evaluates arithmetic expressions just as the $(( )) syntax
does. The following command is equivalent to the preceding one:
$ let "numpages=$(wc -l < letter.txt)/66 + 1"

The double quotation marks keep the SPACEs (both those you can see and those that
result from the command substitution) from separating the expression into separate
arguments to let. The value of the last expression determines the exit status of let. If
the value of the last expression is 0, the exit status of let is 1; otherwise, the exit status
is 0.
You can supply let with multiple arguments on a single command line:
$ let a=5+3 b=7+2
$ echo $a $b
8 9

When you refer to variables when doing arithmetic expansion with let or $(( )), the
shell does not require a variable name to begin with a dollar sign ($). Nevertheless,

Processing the Command Line 361

it is a good practice to do so for consistency, because in most places you must precede
a variable name with a dollar sign.

Command Substitution
Command substitution replaces a command with the output of that command. The
preferred syntax for command substitution under bash is
$(command)
Under bash you can also use the following, older syntax, which is the only syntax
allowed under tcsh:

‘command‘
The shell executes command within a subshell and replaces command, along with
the surrounding punctuation, with standard output of command. Standard error of
command is not affected.
In the following example, the shell executes pwd and substitutes the output of the
command for the command and surrounding punctuation. Then the shell passes the
output of the command, which is now an argument, to echo, which displays it.
$ echo $(pwd)
/home/max

The next script assigns the output of the pwd builtin to the variable where and displays
a message containing the value of this variable:
$ cat where
where=$(pwd)
echo "You are using the $where directory."
$ ./where
You are using the /home/zach directory.

Although it illustrates how to assign the output of a command to a variable, this
example is not realistic. You can more directly display the output of pwd without
using a variable:
$ cat where2
echo "You are using the $(pwd) directory."
$ ./where2
You are using the /home/zach directory.

The following command uses find to locate files with the name README in the
directory tree rooted at the working directory. This list of files is standard output
of find and becomes the list of arguments to ls.
$ ls -l $(find . -name README -print)

The next command line shows the older ‘command‘ syntax:
$ ls -l

‘find

. -name README -print‘

362 Chapter 8 The Bourne Again Shell (bash)

One advantage of the newer syntax is that it avoids the rather arcane rules for token
handling, quotation mark handling, and escaped back ticks within the old syntax.
Another advantage of the new syntax is that it can be nested, unlike the old syntax.
For example, you can produce a long listing of all README files whose size exceeds
the size of ./README using the following command:
$ ls -l $(find . -name README -size +$(echo $(cat ./README | wc -c)c ) -print )

Try giving this command after giving a set –x command (page 430) to see how bash
expands it. If there is no README file, the command displays the output of ls –l.
For additional scripts that use command substitution, see pages 427, 446, and 486.

$(( versus $(
tip The symbols $(( constitute a single token. They introduce an arithmetic expression, not a command
substitution. Thus, if you want to use a parenthesized subshell (page 292) within $(), you must put
a SPACE between the $( and the following (.

Word Splitting
The results of parameter and variable expansion, command substitution, and arithmetic
expansion are candidates for word splitting. Using each character of IFS (page 311) as
a possible delimiter, bash splits these candidates into words or tokens. If IFS is unset,
bash uses its default value (SPACE-TAB-NEWLINE). If IFS is null, bash does not split words.

Pathname Expansion
Pathname expansion (page 148), also called filename generation or globbing, is the
process of interpreting ambiguous file references and substituting the appropriate list
of filenames. Unless noglob (page 352) is set, the shell performs this function when
it encounters an ambiguous file reference—a token containing any of the unquoted
characters *, ?, [, or ]. If bash cannot locate any files that match the specified pattern,
the token with the ambiguous file reference remains unchanged. The shell does not
delete the token or replace it with a null string but rather passes it to the program as
is (except see nullglob on page 353). The TC Shell generates an error message.
In the first echo command in the following example, the shell expands the ambiguous
file reference tmp* and passes three tokens (tmp1, tmp2, and tmp3) to echo. The
echo builtin displays the three filenames it was passed by the shell. After rm removes
the three tmp* files, the shell finds no filenames that match tmp* when it tries to
expand it. It then passes the unexpanded string to the echo builtin, which displays the
string it was passed.
$ ls
tmp1 tmp2 tmp3
$ echo tmp*
tmp1 tmp2 tmp3
$ rm tmp*
$ echo tmp*
tmp*

By default, the same command causes the TC Shell to display an error message:

Processing the Command Line 363
tcsh $ echo tmp*
echo: No match

A period that either starts a pathname or follows a slash (/) in a pathname must
be matched explicitly unless you have set dotglob (page 351). The option
nocaseglob (page 352) causes ambiguous file references to match filenames without regard to case.
Quotation marks

Putting double quotation marks around an argument causes the shell to suppress
pathname and all other kinds of expansion except parameter and variable expansion.
Putting single quotation marks around an argument suppresses all types of expansion. The second echo command in the following example shows the variable $max
between double quotation marks, which allow variable expansion. As a result the
shell expands the variable to its value: sonar. This expansion does not occur in the
third echo command, which uses single quotation marks. Because neither single nor
double quotation marks allow pathname expansion, the last two commands display
the unexpanded argument tmp* .
$ echo tmp* $max
tmp1 tmp2 tmp3 sonar
$ echo "tmp* $max"
tmp* sonar
$ echo 'tmp* $max'
tmp* $max

The shell distinguishes between the value of a variable and a reference to the variable
and does not expand ambiguous file references if they occur in the value of a variable.
As a consequence you can assign to a variable a value that includes special characters,
such as an asterisk (*).
Levels of expansion

In the next example, the working directory has three files whose names begin with
letter. When you assign the value letter* to the variable var, the shell does not expand
the ambiguous file reference because it occurs in the value of a variable (in the assignment statement for the variable). No quotation marks surround the string letter*;
context alone prevents the expansion. After the assignment the set builtin (with the
help of grep) shows the value of var to be letter*.
$ ls letter*
letter1 letter2 letter3
$ var=letter*
$ set | grep var
var='letter*'
$ echo '$var'
$var
$ echo "$var"
letter*
$ echo $var
letter1 letter2 letter3

The three echo commands demonstrate three levels of expansion. When $var is
quoted with single quotation marks, the shell performs no expansion and passes the
character string $var to echo, which displays it. With double quotation marks, the
shell performs variable expansion only and substitutes the value of the var variable
for its name, preceded by a dollar sign. No pathname expansion is performed on this

364 Chapter 8 The Bourne Again Shell (bash)

command because double quotation marks suppress it. In the final command, the
shell, without the limitations of quotation marks, performs variable substitution and
then pathname expansion before passing the arguments to echo.

Process Substitution
The Bourne Again Shell can replace filename arguments with processes. An argument
with the syntax <(command) causes command to be executed and the output to be
written to a named pipe (FIFO). The shell replaces that argument with the name of
the pipe. If that argument is then used as the name of an input file during processing,
the output of command is read. Similarly an argument with the syntax >(command)
is replaced by the name of a pipe that command reads as standard input.
The following example uses sort (page 942) with the –m (merge, which works correctly
only if the input files are already sorted) option to combine two word lists into a single
list. Each word list is generated by a pipe that extracts words matching a pattern from
a file and sorts the words in that list.
$ sort -m -f <(grep "[^A-Z]..$" memo1 | sort) <(grep ".*aba.*" memo2 |sort)

Quote Removal
After bash finishes with the preceding list, it performs quote removal. This process
removes from the command line single quotation marks, double quotation marks,
and backslashes that are not a result of an expansion.

Chapter Summary
The shell is both a command interpreter and a programming language. As a command interpreter, it executes commands you enter in response to its prompt. As a
programming language, it executes commands from files called shell scripts. When
you start a shell, it typically runs one or more startup files.
Running a
shell script

When the file holding a shell script is in the working directory, there are three basic
ways to execute the shell script from the command line.
1. Type the simple filename of the file that holds the script.
2. Type an absolute or relative pathname, including the simple filename
preceded by ./ .
3. Type bash or tcsh followed by the name of the file.
Technique 1 requires the working directory to be in the PATH variable. Techniques
1 and 2 require you to have execute and read permission for the file holding the script.
Technique 3 requires you to have read permission for the file holding the script.

Job control

A job is another name for a process running a pipeline (which can be a simple command).
You can bring a job running in the background into the foreground using the fg builtin.
You can put a foreground job into the background using the bg builtin, provided you first
suspend the job by pressing the suspend key (typically CONTROL-Z). Use the jobs builtin to
display the list of jobs that are running in the background or are suspended.

Chapter Summary

365

Variables

The shell allows you to define variables. You can declare and initialize a variable by
assigning a value to it; you can remove a variable declaration using unset. Shell variables are local to the process they are defined in. Environment variables are global
and are placed in the environment using the export (bash) or setenv (tcsh) builtin so
they are available to child processes. Variables you declare are called user-created
variables. The shell defines keyword variables. Within a shell script you can work
with the positional (command-line) parameters the script was called with.

Locale

Locale specifies the way locale-aware programs display certain kinds of data, such
as times and dates, money and other numeric values, telephone numbers, and measurements. It can also specify collating sequence and printer paper size.

Process

Each process is the execution of a single command and has a unique identification
(PID) number. When you give the shell a command, it forks a new (child) process to
execute the command (unless the command is built into the shell). While the child
process is running, the shell is in a state called sleep. By ending a command line with
an ampersand (&), you can run a child process in the background and bypass the
sleep state so the shell prompt returns immediately after you press RETURN. Each command in a shell script forks a separate process, each of which might in turn fork other
processes. When a process terminates, it returns its exit status to its parent process.
An exit status of zero signifies success; a nonzero value signifies failure.

History

The history mechanism maintains a list of recently issued command lines called events,
that provides a way to reexecute previous commands quickly. There are several ways
to work with the history list; one of the easiest is to use a command-line editor.

Command-line
editors

When using an interactive Bourne Again Shell, you can edit a command line and commands from the history list, using either of the Bourne Again Shell’s command-line
editors (vim or emacs). When you use the vim command-line editor, you start in Input
mode, unlike with the stand-alone version of vim. You can switch between Command
and Input modes. The emacs editor is modeless and distinguishes commands from
editor input by recognizing control characters as commands.

Aliases

An alias is a name the shell translates into another name or command. Aliases allow
you to define new commands by substituting a string for the first token of a simple
command. The Bourne Again and TC Shells use different syntaxes to define an alias,
but aliases in both shells work similarly.

Functions

A shell function is a series of commands that, unlike a shell script, is parsed prior to
being stored in memory. As a consequence shell functions run faster than shell scripts.
Shell scripts are parsed at runtime and are stored on disk. A function can be defined
on the command line or within a shell script. If you want the function definition to
remain in effect across login sessions, you can define it in a startup file. Like functions
in many programming languages, a shell function is called by giving its name followed by any arguments.

Shell features

There are several ways to customize the shell’s behavior. You can use options on the
command line when you call bash. You can also use the bash set and shopt builtins
to turn features on and off.

Command-line
expansion

When it processes a command line, the Bourne Again Shell replaces some words
with expanded text. Most types of command-line expansion are invoked by the

366 Chapter 8 The Bourne Again Shell (bash)

appearance of a special character within a word (for example, the leading dollar
sign that denotes a variable). Table 8-6 on page 315 lists these special characters.
The expansions take place in a specific order. Following the history and alias expansions, the common expansions are parameter and variable expansion, command
substitution, and pathname expansion. Surrounding a word with double quotation
marks suppresses all types of expansion except parameter and variable expansion.
Single quotation marks suppress all types of expansion, as does quoting (escaping)
a special character by preceding it with a backslash.

Exercises
1. Explain the following unexpected result:
$ whereis date
date: /bin/date ...
$ echo $PATH
.:/usr/local/bin:/usr/bin:/bin
$ cat > date
echo "This is my own version of date."
$ ./date
Tue May 21 11:45:49 PDT 2013

2. What are two ways you can execute a shell script when you do not have execute
permission for the file containing the script? Can you execute a shell script if
you do not have read permission for the file containing the script?
3. What is the purpose of the PATH variable?
a. Set the PATH variable and place it in the environment so it causes the
shell to search the following directories in order:
• /usr/local/bin
• /usr/bin
• /bin
• /usr/kerberos/bin
• The bin directory in your home directory
• The working directory
b. If there is an executable file named doit in /usr/bin and another file with
the same name in your ~/bin directory, which one will be executed?
c. If your PATH variable is not set to search the working directory, how can
you execute a program located there?
d. Which command can you use to add the directory /usr/games to the end
of the list of directories in PATH?
4. Assume you have made the following assignment:
$ person=zach

Exercises 367

Give the output of each of the following commands.
a. echo $person
b. echo '$person'
c. echo "$person"
5. The following shell script adds entries to a file named journal-file in your
home directory. This script helps you keep track of phone conversations and
meetings.
$ cat journal
# journal: add journal entries to the file
# $HOME/journal-file
file=$HOME/journal-file
date >> $file
echo -n "Enter name of person or group: "
read name
echo "$name" >> $file
echo >> $file
cat >> $file
echo "----------------------------------------------------" >>
$file
echo >> $file

a. What do you have to do to the script to be able to execute it?
b. Why does the script use the read builtin the first time it accepts input from
the terminal and the cat utility the second time?
6. Assume the /home/zach/grants/biblios and /home/zach/biblios directories
exist. Specify Zach’s working directory after he executes each sequence of
commands. Explain what happens in each case.
a.

$ pwd
/home/zach/grants
$ CDPATH=$(pwd)
$ cd
$ cd biblios

b.

$ pwd
/home/zach/grants
$ CDPATH=$(pwd)
$ cd $HOME/biblios

7. Name two ways you can identify the PID number of the login shell.
8. Enter the following command:
$ sleep 30 | cat /etc/services

Is there any output from sleep? Where does cat get its input from? What has
to happen before the shell will display a prompt?

368 Chapter 8 The Bourne Again Shell (bash)

Advanced Exercises
9. Write a sequence of commands or a script that demonstrates variable
expansion occurs before pathname expansion.
10. Write a shell script that outputs the name of the shell executing it.
11. Explain the behavior of the following shell script:
$ cat quote_demo
twoliner="This is line 1.
This is line 2."
echo "$twoliner"
echo $twoliner

a. How many arguments does each echo command see in this script? Explain.
b. Redefine the IFS shell variable so the output of the second echo is the
same as the first.
12. Add the exit status of the previous command to your prompt so it behaves
similarly to the following:
$ [0] ls xxx
ls: xxx: No such file or directory
$ [1]

13. The dirname utility treats its argument as a pathname and writes to standard output the path prefix—that is, everything up to but not including the last component:
$ dirname a/b/c/d
a/b/c

If you give dirname a simple filename (no / characters) as an argument,
dirname writes a . to standard output:
$ dirname simple
.

Implement dirname as a bash function. Make sure it behaves sensibly when
given such arguments as /.
14. Implement the basename utility, which writes the last component of its
pathname argument to standard output, as a bash function. For example,
given the pathname a/b/c/d, basename writes d to standard output:
$ basename a/b/c/d
d

15. The Linux basename utility has an optional second argument. If you give
the command basename path suffix, basename removes the suffix and the
prefix from path:
$ basename src/shellfiles/prog.bash .bash
prog
$ basename src/shellfiles/prog.bash .c
prog.bash

Add this feature to the function you wrote for exercise 14.

9
The TC Shell (tcsh)

Chapter9
9

In This Chapter

Objectives

Shell Scripts . . . . . . . . . . . . . . . . . 370

After reading this chapter you should be able to:

Entering and Leaving the
TC Shell . . . . . . . . . . . . . . . . . . . 371
Features Common to the Bourne
Again and TC Shells. . . . . . . . . . 373
Redirecting Standard Error . . . . . . 379
Word Completion . . . . . . . . . . . . . 380
Editing the Command Line . . . . . . 383

 Identify tcsh startup files
 Explain the function of the history, histfile, and
savehist variables
 Set up an alias that uses a command-line argument
 Redirect standard error and standard output of a script
to two different files

Variables . . . . . . . . . . . . . . . . . . . . 385

 Set up and use filename, command, and variable
completion

Reading User Input . . . . . . . . . . . . 391

 Correct command-line spelling errors

Control Structures . . . . . . . . . . . . . 398

 Explain and use the @ builtin to work with numeric
variables

Builtins . . . . . . . . . . . . . . . . . . . . . 407

 Explain the use of the noclobber variable
 Use an if structure to evaluate the status of a file
 Describe eight tcsh builtins

370 Chapter 9 The TC Shell (tcsh)

The TC Shell (tcsh) performs the same function as the Bourne Again Shell and other
shells: It provides an interface between you and the Linux operating system. The TC
Shell is an interactive command interpreter as well as a high-level programming language. Although you use only one shell at any given time, you should be able to
switch back and forth comfortably between shells as the need arises. In fact, you
might want to run different shells in different windows. Chapters 8 and 10 apply to
tcsh as well as to bash so they provide a good background for this chapter. This chapter explains tcsh features that are not found in bash and those that are implemented
differently from their bash counterparts. The tcsh home page is www.tcsh.org.
The TC Shell is an expanded version of the C Shell (csh), which originated on Berkeley
UNIX. The “T” in TC Shell comes from the TENEX and TOPS-20 operating systems,
which inspired command completion and other features in the TC Shell. A number of
features not found in csh are present in tcsh, including file and username completion,
command-line editing, and spelling correction. As with csh, you can customize tcsh to
make it more tolerant of mistakes and easier to use. By setting the proper shell variables, you can have tcsh warn you when you appear to be accidentally logging out or
overwriting a file. Many popular features of the original C Shell are now shared by
bash and tcsh.
Assignment
statement

Although some of the functionality of tcsh is present in bash, differences arise in
the syntax of some commands. For example, the tcsh assignment statement has the
following syntax:
set variable = value
Having SPACEs on either side of the equal sign, although illegal in bash, is allowed (but
not mandatory) in tcsh. By convention shell variables in tcsh are generally named with
lowercase letters, not uppercase (you can use either). If you reference an undeclared
variable (one that has had no value assigned to it), tcsh generates an error message,
whereas by default bash does not. Finally the default tcsh prompt is a greater than
sign (>), but it is frequently set to a single $ character followed by a SPACE. The examples in this chapter use a prompt of tcsh $ to avoid confusion with the bash prompt.

Do not use tcsh as a programming language
tip If you have used UNIX and are comfortable with the C or TC Shell, you might want to use tcsh as
your login shell. However, you might find that the TC Shell is not as good a programming language
as bash. If you are going to learn only one shell programming language, learn bash. The Bourne
Again Shell and dash (page 277), which is a subset of bash, are used throughout Linux to program many system administration scripts.

Shell Scripts
The TC Shell can execute files containing tcsh commands, just as the Bourne Again
Shell can execute files containing bash commands. Although the concepts of writing
and executing scripts in the two shells are similar, the methods of declaring and
assigning values to variables and the syntax of control structures are different.

Entering and Leaving the TC Shell

371

You can run bash and tcsh scripts while using any one of the shells as a command
interpreter. Various methods exist for selecting the shell that runs a script. Refer to
“#! Specifies a Shell” on page 287 for more information.
If the first character of a shell script is a pound sign (#) and the following character is
not an exclamation point (!), the TC Shell executes the script under tcsh. If the first character is anything other than #, tcsh calls the sh link to dash or bash to execute the script.

echo: getting rid of the RETURN
tip The tcsh echo builtin accepts either a –n option or a trailing \c to get rid of the RETURN that echo
normally displays at the end of a line. The bash echo builtin accepts only the –n option (refer to
“read: Accepts User Input” on page 477).

Shell game
tip When you are working with an interactive TC Shell, if you run a script in which # is not the first
character of the script and you call the script directly (without preceding its name with tcsh), tcsh
calls the sh link to dash or bash to run the script. The following script was written to be run under
tcsh but, when called from a tcsh command line, is executed by bash. The set builtin (page 460)
works differently under bash and tcsh. As a result the following example (from page 391) issues
a prompt but does not wait for you to respond:
tcsh $ cat user_in
echo -n "Enter input: "
set input_line = "$<"
echo $input_line
tcsh $ user_in
Enter input:

Although in each case the examples are run from a tcsh command line, the following one calls
tcsh explicitly so that tcsh executes the script and it runs correctly:
tcsh $ tcsh user_in
Enter input: here is some input
here is some input

Entering and Leaving the TC Shell
chsh

You can execute tcsh by giving the command tcsh. If you are not sure which shell you
are using, use the ps utility to find out. It shows whether you are running tcsh, bash, sh
(linked to bash), or possibly another shell. The finger command followed by your username displays the name of your login shell, which is stored in the /etc/passwd file. (Mac
OS X uses Open Directory [page 1042] in place of this file.) If you want to use tcsh as
a matter of course, you can use the chsh (change shell) utility to change your login shell:
bash $ chsh
Changing shell for sam.
Password:
New shell [/bin/bash]: /bin/tcsh
Shell changed.
bash $

372 Chapter 9 The TC Shell (tcsh)

The shell you specify will remain in effect for your next login and all subsequent logins until you specify a different login shell. The /etc/passwd file stores the name of
your login shell.
You can leave tcsh in several ways. The approach you choose depends on two factors: whether the shell variable ignoreeof is set and whether you are using the shell
that you logged in on (your login shell) or another shell that you created after you
logged in. If you are not sure how to exit from tcsh, press CONTROL-D on a line by itself
with no leading SPACEs, just as you would to terminate standard input to a program.
You will either exit or receive instructions on how to exit. If you have not set
ignoreeof (page 397) and it has not been set for you in a startup file, you can exit
from any shell by using CONTROL-D (the same procedure you use to exit from the Bourne
Again Shell).
When ignoreeof is set, CONTROL-D does not work. The ignoreeof variable causes the shell
to display a message telling you how to exit. You can always exit from tcsh by giving
an exit command. A logout command allows you to exit from your login shell only.

Startup Files
When you log in on the TC Shell, it automatically executes various startup files.
These files are normally executed in the order described in this section, but you can
compile tcsh so that it uses a different order. You must have read access to a startup
file to execute the commands in it. See page 278 for information on bash startup files
and page 1050 for information on startup files under Mac OS X.
/etc/csh.cshrc and
/etc/csh.login

The shell first executes the commands in /etc/csh.cshrc and /etc/csh.login. A user
working with root privileges can set up these files to establish systemwide default
characteristics for tcsh users. They contain systemwide configuration information,
such as the default path, the location to check for mail, and so on.

.tcshrc and .cshrc

Next the shell looks for ~/.tcshrc or, if it does not exist, ~/.cshrc (~/ is shorthand for
your home directory; page 89). You can use these files to establish variables and
parameters that are specific to your shell. Each time you create a new shell, tcsh reinitializes these variables for the new shell. The following .tcshrc file sets several shell
variables, establishes two aliases (page 377), and adds two directories to path—one
at the beginning of the list and one at the end:
tcsh $ cat ~/.tcshrc
set noclobber
set dunique
set ignoreeof
set history=256
set path = (~/bin $path /usr/games)
alias h history
alias ll ls -l

.history

Login shells rebuild the history list from the contents of ~/.history. If the histfile
variable exists, tcsh uses the file that histfile points to in place of .history.

.login

Login shells read and execute the commands in ~/.login. This file contains commands
that you want to execute once, at the beginning of each session. You can use setenv

Features Common to the Bourne Again and TC Shells

373

(page 386) to declare environment (global) variables here. You can also declare the
type of terminal you are using and set some terminal characteristics in your .login file.
tcsh $ cat ~/.login
setenv history 200
setenv mail /var/spool/mail/$user
if ( -z $DISPLAY ) then
setenv TERM vt100
else
setenv TERM xterm
endif
stty erase '^h' kill '^u' -lcase tab3
date '+Login on %A %B %d at %I:%M %p'

The preceding .login file establishes the type of terminal you are using by setting the
TERM variable (the if statement [page 398] determines whether you are using a
graphical interface and therefore which value should be assigned to TERM). It then
runs stty (page 960) to set terminal characteristics and date (page 769) to display the
time you logged in.
/etc/csh.logout
and .logout

The TC Shell runs the /etc/csh.logout and ~/.logout files, in that order, when you exit
from a login shell. The following sample .logout file uses date to display the time you
logged out. The sleep command ensures that echo has time to display the message
before the system logs you out. The delay might be useful for dial-up lines that take
some time to display the message.
tcsh $ cat ~/.logout
date '+Logout on %A %B %d at %I:%M %p'
sleep 5

Features Common to the Bourne Again and
TC Shells
Most of the features common to both bash and tcsh are derived from the original
C Shell:
• Command-line expansion (also called substitution; next page)
• History (next page)
• Aliases (page 377)
• Job control (page 378)
• Filename substitution (page 378)
• Directory stack manipulation (page 379)
• Command substitution (page 379)
The chapters on bash discuss these features in detail. This section focuses on the
differences between the bash and tcsh implementations.

374 Chapter 9 The TC Shell (tcsh)

Command-Line Expansion (Substitution)
Refer to “Processing the Command Line” on page 354 for an introduction to command-line expansion in the Bourne Again Shell. The tcsh man page uses the term
substitution instead of expansion; the latter is used by bash. The TC Shell scans each
token on a command line for possible expansion in the following order:
1. History substitution (below)
2. Alias substitution (page 377)
3. Variable substitution (page 386)
4. Command substitution (page 379)
5. Filename substitution (page 378)
6. Directory stack substitution (page 379)

History
The TC Shell assigns a sequential event number to each command line. You can display this event number as part of the tcsh prompt (refer to “prompt” on page 393).
Examples in this section show numbered prompts when they help illustrate the
behavior of a command.

The history Builtin
As in bash, the tcsh history builtin displays the events in your history list. The list of
events is ordered with the oldest events at the top. The last event in the history list
is the history command that displayed the list. In the following history list, which is
limited to ten lines by the argument of 10 to the history command, command 23
modifies the tcsh prompt to display the history event number. The time each command was executed appears to the right of the event number.
32 $ history 10
23 23:59
24 23:59
25 23:59
26 0:00
27 0:00
28 0:00
29 0:00
30 0:00
31 0:00
32 0:00

set prompt = "! $ "
ls -l
cat temp
rm temp
vim memo
lpr memo
vim memo
lpr memo
rm memo
history

History Expansion
The same event and word designators work in both shells. For example, !! refers to
the previous event in tcsh, just as it does in bash. The command !328 executes event
number 328; !?txt? executes the most recent event containing the string txt. For more
information refer to “Using an Exclamation Point (!) to Reference Events” on
page 331. Table 9-1 lists the few tcsh word modifiers not found in bash.

Features Common to the Bourne Again and TC Shells

Table 9-1

375

Word modifiers

Modifier

Function

u

Converts the first lowercase letter into uppercase

l

Converts the first uppercase letter into lowercase

a

Applies the next modifier globally within a single word

You can use more than one word modifier in a command. For instance, the a modifier, when used in combination with the u or l modifier, enables you to change the
case of an entire word.
tcsh $ echo $VERSION
VERSION: Undefined variable.
tcsh $ echo !!:1:al
echo $version
tcsh 6.17.00 (Astron) 2009-07-10 (i386-intel-linux) options wide,nls, ...

In addition to using event designators to access the history list, you can use the commandline editor to access, modify, and execute previous commands (page 383).

Variables
The variables you set to control the history list in tcsh are different from those used in
bash. Whereas bash uses HISTSIZE and HISTFILESIZE to determine the number of
events that are preserved during and between sessions, respectively, tcsh uses history
and savehist (Table 9-2) for these purposes.

Table 9-2

history and savehist

History variables

Variable

Default

Function

history

100 events

Maximum number of events saved during a session

histfile

~/.history

Location of the history file

savehist

not set

Maximum number of events saved between sessions

When you exit from a tcsh shell, the most recently executed commands are saved in
your ~/.history file. The next time you start the shell, this file initializes the history
list. The value of the savehist variable determines the number of lines saved in the
.history file (not necessarily the same as the history variable). If savehist is not set,
tcsh does not save history information between sessions. The history and savehist
variables must be shell variables (i.e., declared using set, not setenv). The history variable holds the number of events remembered during a session and the savehist
variable holds the number remembered between sessions. See Table 9-2.
If you set the value of history too high, it can use too much memory. If it is unset or
set to zero, the shell does not save any commands. To establish a history list of the
500 most recent events, give the following command manually or place it in your
~/.tcshrc startup file:
tcsh $ set history = 500

376 Chapter 9 The TC Shell (tcsh)

The following command causes tcsh to save the 200 most recent events across login
sessions:
tcsh $ set savehist = 200

You can combine these two assignments into a single command:
tcsh $ set history=500 savehist=200

After you set savehist, you can log out and log in again; the 200 most recent events
from the previous login sessions will appear in your history list after you log back in.
Set savehist in your ~/.tcshrc file if you want to maintain your event list from login
to login.
histlit

If you set the variable histlit (history literal), history displays the commands in
the history list exactly as they were typed in, without any shell interpretation.
The following example shows the effect of this variable (compare the lines numbered 32):
tcsh $ cat /etc/csh.cshrc
...
tcsh $ cp !!:1 ~
cp /etc/csh.cshrc ~
tcsh $ set histlit
tcsh $ history
...
31 9:35
cat /etc/csh.cshrc
32 9:35
cp !!:1 ~
33 9:35
set histlit
34 9:35
history
tcsh $ unset histlit
tcsh $ history
...
31 9:35
cat /etc/csh.cshrc
32 9:35
cp /etc/csh.cshrc ~
33 9:35
set histlit
34 9:35
history
35 9:35
unset histlit
36 9:36
history

optional The bash and tcsh Shells expand history event designators differently. If you give the
command !250w, bash replaces it with command number 250 with a character w
appended to it. In contrast, tcsh looks back through your history list for an event that
begins with the string 250w to execute. The reason for the difference: bash interprets
the first three characters of 250w as the number of a command, whereas tcsh interprets those characters as part of the search string 250w. (If the 250 stands alone, tcsh
treats it as a command number.)
If you want to append w to command number 250, you can insulate the event number
from the w by surrounding it with braces:
!{250}w

Features Common to the Bourne Again and TC Shells

377

Aliases
The alias/unalias feature in tcsh closely resembles its counterpart in bash (page 342).
However, the alias builtin has a slightly different syntax:
alias name value
The following command creates an alias for ls:
tcsh $ alias ls "ls -lF"

The tcsh alias allows you to substitute command-line arguments, whereas bash does
not:
tcsh $ alias nam "echo Hello, \!^ is my name"
tcsh $ nam Sam
Hello, Sam is my name

The string \!* within an alias expands to all command-line arguments:
tcsh $ alias sortprint "sort \!* | lpr"

The next alias displays its second argument:
tcsh $ alias n2 "echo \!:2"

To display a list of current aliases, give the command alias. To display the alias for a
particular name, give the command alias followed by that name.

Special Aliases
Some alias names, called special aliases, have special meaning to tcsh. If you define
an alias that uses one of these names, tcsh executes it automatically as explained in
Table 9-3. Initially all special aliases are undefined. The following command sets the
cwdcmd alias so it displays the name of the working directory when you change to
a new working directory. The single quotation marks are critical in this example; see
page 343.
tcsh $ alias cwdcmd 'echo Working directory is now `pwd`'
tcsh $ cd /etc
Working directory is now /etc
tcsh $

Table 9-3

Special aliases

Alias

When executed

beepcmd

Whenever the shell would normally ring the terminal bell. Gives you a way to
have other visual or audio effects take place at those times.

cwdcmd

Whenever you change to another working directory.

periodic

Periodically, as determined by the number of minutes in the tperiod variable.
If tperiod is unset or has the value 0, periodic has no meaning.

precmd

Just before the shell displays a prompt.

shell

Specifies the absolute pathname of the shell that will run scripts that do not
start with #! (page 287).

378 Chapter 9 The TC Shell (tcsh)

History Substitution in Aliases
You can substitute command-line arguments by using the history mechanism, where
a single exclamation point represents the command line containing the alias. Modifiers are the same as those used by history (page 331). In the following example, the
exclamation points are quoted so the shell does not interpret them when building the
aliases:
21 $
22 $
test
23 $
24 $
temp

alias last echo \!:$
last this is just a test
alias fn2 echo \!:2:t
fn2 /home/sam/test /home/zach/temp /home/barbara/new

Event 21 defines for last an alias that displays the last argument. Event 23 defines for
fn2 an alias that displays the simple filename, or tail, of the second argument on the
command line.

Job Control
Job control is similar in both bash (page 294) and tcsh. You can move commands
between the foreground and the background, suspend jobs temporarily, and display
a list of current jobs. The % character references a job when it is followed by a job
number or a string prefix that uniquely identifies the job. You will see a minor difference when you run a multiple-process command line in the background from each
shell. Whereas bash displays only the PID number of the last background process in
each job, tcsh displays the numbers for all processes belonging to a job. The example
from page 294 looks like this under tcsh:
tcsh $ find . -print | sort | lpr & grep -l max /tmp/* > maxfiles &
[1] 18839 18840
18841
[2] 18876

Filename Substitution
The TC Shell expands the characters *, ?, and [ ] in a pathname just as bash does
(page 148). The * matches any string of zero or more characters, ? matches any single character, and [ ] defines a character class that matches single characters appearing
between the brackets.
The TC Shell expands command-line arguments that start with a tilde (~) into filenames in much the same way that bash does (page 381), with the ~ standing for the
user’s home directory or the home directory of the user whose name follows the tilde.
The bash special expansions ~+ and ~– are not available in tcsh.
Brace expansion (page 356) is available in tcsh. Like tilde expansion, it is regarded
as an aspect of filename substitution even though brace expansion can generate
strings that are not the names of existing files.

Redirecting Standard Error
globbing

379

In tcsh and its predecessor csh, the process of using patterns to match filenames is
referred to as globbing and the pattern itself is called a globbing pattern. If tcsh is
unable to identify one or more files that match a globbing pattern, it reports an error
(unless the pattern contains a brace). Setting the shell variable noglob suppresses filename substitution, including both tilde and brace interpretation.

Manipulating the Directory Stack
Directory stack manipulation in tcsh does not differ much from that in bash
(page 297). The dirs builtin displays the contents of the stack, and the pushd and popd
builtins push directories onto and pop directories off of the stack.

Command Substitution
The $(...) syntax for command substitution is not available in tcsh. In its place you
must use the original ‘...‘ syntax. Otherwise, the implementation in bash and tcsh
is identical. Refer to page 361 for more information on command substitution.

Redirecting Standard Error
Both bash and tcsh use a greater than symbol (>) to redirect standard output, but tcsh
does not use the bash notation 2> to redirect standard error. Under tcsh you use a
greater than symbol followed by an ampersand (>&) to combine and redirect standard output and standard error. Although you can use this notation under bash, few
people do. The following examples, like the bash examples on page 282, reference file
x, which does not exist, and file y, which contains a single line:
tcsh
cat:
tcsh
This
tcsh
tcsh
cat:
This

$ cat
x: No
$ cat
is y.
$ cat
$ cat
x: No
is y.

x
such file or directory
y
x y >& hold
hold
such file or directory

With an argument of y in the preceding example, cat sends a string to standard output.
An argument of x causes cat to send an error message to standard error.
Unlike bash, tcsh does not provide a simple way to redirect standard error separately
from standard output. A work-around frequently provides a reasonable solution.
The following example runs cat with arguments of x and y in a subshell (the parentheses ensure that the command within them runs in a subshell; page 292). Also
within the subshell, a > redirects standard output to the file outfile. Output sent to
standard error is not touched by the subshell but rather is sent to the parent shell,

380 Chapter 9 The TC Shell (tcsh)

where both it and standard output are sent to errfile. Because standard output has
already been redirected, errfile contains only output sent to standard error.
tcsh
tcsh
This
tcsh
cat:

$ (cat x y > outfile) >& errfile
$ cat outfile
is y.
$ cat errfile
x: No such file or directory

It can be useful to combine and redirect output when you want to execute a command
that runs slowly in the background and do not want its output cluttering up the
screen. For example, because the find utility (page 804) can take a long time to complete, it might be a good idea to run it in the background. The next command finds
in the filesystem hierarchy all files that contain the string biblio in their name. This
command runs in the background and sends its output to the findout file. Because
the find utility sends to standard error a report of directories that you do not have permission to search, the findout file contains a record of any files that are found as well
as a record of the directories that could not be searched.
tcsh $ find / -name "*biblio*" -print >& findout &

In this example, if you did not combine standard error with standard output and redirected only standard output, the error messages would appear on the screen and
findout would list only files that were found.
While a command that has its output redirected to a file is running in the background,
you can look at the output by using tail (page 965) with the –f option. The –f option
causes tail to display new lines as they are written to the file:
tcsh $ tail -f findout

To terminate the tail command, press the interrupt key (usually CONTROL-C).

Working with the Command Line
This section covers word completion, editing the command line, and correcting
spelling.

Word Completion
The TC Shell completes filenames, commands, and variable names on the command
line when you prompt it to do so. The generic term used to refer to all these features
under tcsh is word completion.

Filename Completion
The TC Shell can complete a filename after you specify a unique prefix. Filename
completion is similar to filename generation, but the goal of filename completion

Working with the Command Line 381

is to select a single file. Together these capabilities make it practical to use long,
descriptive filenames.
To use filename completion when you are entering a filename on the command line,
type enough of the name to identify the file uniquely and press TAB ; tcsh fills in the
name and adds a SPACE, leaving the cursor so you can enter additional arguments or
press RETURN. In the following example, the user types the command cat trig1A and
presses TAB; the system fills in the rest of the filename that begins with trig1A:
tcsh $ cat trig1A



TAB



cat trig1A.302488

■

If two or more filenames match the prefix that you have typed, tcsh cannot complete
the filename without obtaining more information. The shell maximizes the length of
the prefix by adding characters, if possible, and then beeps to signify that additional
input is needed to resolve the ambiguity:
tcsh $ ls h*
help.hist help.trig01 help.txt
tcsh $ cat h  TAB  cat help. (beep)

You can fill in enough characters to resolve the ambiguity and then press the TAB key
again. Alternatively, you can press CONTROL-D to cause tcsh to display a list of matching
filenames:
tcsh $ cat help.  CONTROL-D
help.hist
help.trig01 help.txt
tcsh $ cat help.■

After displaying the filenames, tcsh redraws the command line so you can disambiguate
the filename (and press TAB again) or finish typing the filename manually.

Tilde Completion
The TC Shell parses a tilde (~) appearing as the first character of a word and attempts
to expand it to a username when you enter a TAB:
tcsh $ cd ~za
tcsh $ pwd
/home/zach



TAB



cd ~zach/■



RETURN

By appending a slash (/), tcsh indicates that the completed word is a directory. The
slash also makes it easy to continue specifying a pathname.

Command and Variable Completion
You can use the same mechanism you use to list and complete filenames with command and variable names. When you specify a simple filename, the shell uses the
variable path to attempt to complete a command name. The choices tcsh lists might
be located in different directories.

382 Chapter 9 The TC Shell (tcsh)
tcsh $ up  TAB (beep)  CONTROL-D
up2date
updatedb
uptime
up2date-config
update-mime-database
up2date-nox
updmap
tcsh $ up  t  TAB  uptime ■  RETURN
9:59am up 31 days, 15:11, 7 users, load average: 0.03, 0.02, 0.00

If you set the autolist variable as in the following example, the shell lists choices automatically when you invoke completion by pressing TAB. You do not have to press
CONTROL-D.
tcsh $ set autolist
tcsh $ up  TAB (beep)
up2date
updatedb
uptime
up2date-config
update-mime-database
up2date-nox
updmap
tcsh $ up  t  TAB  uptime ■  RETURN
10:01am up 31 days, 15:14, 7 users, load average: 0.20, 0.06, 0.02

If you set autolist to ambiguous, the shell lists the choices when you press TAB only if
the word you enter is the longest prefix of a set of commands. Otherwise, pressing
TAB causes the shell to add one or more characters to the word until it is the longest
prefix; pressing TAB again then lists the choices:
tcsh $ set autolist=ambiguous
tcsh $ echo $h  TAB (beep)
histfile history home
tcsh $ echo $h■  i  TAB  echo $hist■  TAB
histfile history
tcsh $ echo $hist■  o  TAB  echo $history ■
1000



RETURN

The shell must rely on the context of the word within the input line to determine
whether it is a filename, a username, a command, or a variable name. The first word
on an input line is assumed to be a command name; if a word begins with the special
character $, it is viewed as a variable name; and so on. In the following example, the
second which command does not work properly: The context of the word up makes
it look like the beginning of a filename rather than the beginning of a command. The
TC Shell supplies which with an argument of updates (a nonexecutable file) and which
displays an error message:
tcsh $ ls up*
updates
tcsh $ which updatedb ups uptime
/usr/bin/updatedb
/usr/local/bin/ups
/usr/bin/uptime
tcsh $ which up  TAB  which updates
updates: Command not found.

Working with the Command Line 383

Editing the Command Line
bindkey

The tcsh command-line editing feature is similar to that available under bash. You can
use either emacs mode commands (default) or vi(m) mode commands. Change to
vi(m) mode commands by giving the command bindkey –v and to emacs mode commands by giving the command bindkey –e. The ARROW keys are bound to the obvious
motion commands in both modes, so you can move back and forth (up and down)
through the history list as well as left and right on the current command line.
Without an argument, the bindkey builtin displays the current mappings between
editor commands and the key sequences you can enter at the keyboard:
tcsh $ bindkey
Standard key bindings
"^@"
-> set-mark-command
"^A"
-> beginning-of-line
"^B"
-> backward-char
"^C"
-> tty-sigintr
"^D"
-> delete-char-or-list-or-eof
...
Multi-character bindings
"^[[A"
-> up-history
"^[[B"
-> down-history
"^[[C"
-> forward-char
"^[[D"
-> backward-char
"^[[H"
-> beginning-of-line
...
Arrow key bindings
down
-> down-history
up
-> up-history
left
-> backward-char
right
-> forward-char
home
-> beginning-of-line
end
-> end-of-line

The ^ indicates a CONTROL character (^B = CONTROL-B). The ^[ indicates a META or ALT character; in this case you press and hold the META or ALT key while you press the key for the
next character. If this substitution does not work or if the keyboard you are using does
not have a META or ALT key, press and release the ESCAPE key and then press the key for the
next character. For ^[[F you would press META-[ or ALT-[ followed by the F key or else ESCAPE
[ F. The down/up/left/right indicate ARROW keys, and home/end indicate the HOME and END
keys on the numeric keypad. See page 223 for more information on the META key.
Under OS X, most keyboards do not have a META or ALT key. See page 1049 for an
explanation of how to set up the OPTION key to perform the same functions as the META
key on a Macintosh.
The preceding example shows the output from bindkey with the user in emacs mode.
Change to vi(m) mode (bindkey –v) and give another bindkey command to display the
vi(m) key bindings. You can send the output of bindkey through a pipeline to less to
make it easier to read.

384 Chapter 9 The TC Shell (tcsh)

Correcting Spelling
You can have tcsh attempt to correct the spelling of command names, filenames, and
variables (but only using emacs-style key bindings). Spelling correction can take place
before and after you press RETURN.

Before You Press RETURN
For tcsh to correct a word or line before you press RETURN, you must indicate that you
want it to do so. The two functions for this purpose are spell-line and spell-word:
$ bindkey | grep spell
"^[$"
-> spell-line
"^[S"
-> spell-word
"^[s"
-> spell-word

The output from bindkey shows that spell-line is bound to META-$ (ALT-$ or ESCAPE $) and
spell-word is bound to META-S and META-s (ALT-s or ESCAPE s and ALT-S or ESCAPE S). To correct
the spelling of the word to the left of the cursor, press META-s. Pressing META-$ invokes
the spell-line function, which attempts to correct all words on a command line:
tcsh $ ls
bigfile.gz
tcsh $ gunzipp  META-s  gunzip bigfele.gz  META-s  gunzip bigfile.gz
tcsh $ gunzip bigfele.gz  META-$  gunzip bigfile.gz
tcsh $ ecno $usfr  META-$  echo $user

After You Press RETURN
The variable named correct controls what tcsh attempts to correct or complete after
you press RETURN and before it passes the command line to the command being called.
If you do not set correct, tcsh will not correct anything:
tcsh $ unset correct
tcsh $ ls morning
morning
tcsh $ ecno $usfr morbing
usfr: Undefined variable.

The shell reports the error in the variable name and not the command name because
it expands variables before it executes the command (page 374). When you give a bad
command name without any arguments, the shell reports on the bad command name.
Set correct to cmd to correct only commands; to all to correct commands, variables,
and filenames; or to complete to complete commands:
tcsh $ set correct = cmd
tcsh $ ecno $usfr morbing
CORRECT>echo $usfr morbing (y|n|e|a)? y
usfr: Undefined variable.

Variables 385
tcsh $ set correct = all
tcsh $ echo $usfr morbing
CORRECT>echo $user morning (y|n|e|a)? y
zach morning

With correct set to cmd, tcsh corrects the command name from ecno to echo. With
correct set to all, tcsh corrects both the command name and the variable. It would
also correct a filename if one was present on the command line.
The TC Shell displays a special prompt that lets you enter y to accept the modified
command line, n to reject it, e to edit it, or a to abort the command. Refer to
prompt3 on page 394 for a discussion of the special prompt used in spelling
correction.
In the next example, after setting the correct variable the user mistypes the name of
the ls command; tcsh then prompts for a correct command name. Because the command that tcsh has offered as a replacement is not ls, the user chooses to edit the
command line. The shell leaves the cursor following the command so the user can
correct the mistake:
tcsh $ set correct=cmd
tcsh $ lx -l  RETURN (beep)
CORRECT>lex -l (y|n|e|a)? e
tcsh $ lx -l■

If you assign the value complete to the variable correct, tcsh attempts command name
completion in the same manner as filename completion (page 380). In the following
example, after setting correct to complete the user enters the command up. The shell
responds with Ambiguous command because several commands start with these two
letters but differ in the third letter. The shell then redisplays the command line. The
user could press TAB at this point to get a list of commands that start with up but
decides to enter t and press RETURN. The shell completes the command because these
three letters uniquely identify the uptime utility:
tcsh $ set correct = complete
tcsh $ upRETURN
Ambiguous command
tcsh $ up  tRETURN  uptime
4:45pm up 5 days, 9:54, 5 users,

load average: 1.62, 0.83, 0.33

Variables
Although tcsh stores variable values as strings, you can work with these variables as
numbers. Expressions in tcsh can use arithmetic, logical, and conditional operators.
The @ builtin can evaluate integer arithmetic expressions.

386 Chapter 9 The TC Shell (tcsh)

This section uses the term numeric variable to describe a string variable that contains
a number that tcsh uses in arithmetic or logical arithmetic computations. However,
no true numeric variables exist in tcsh.
Variable name

A tcsh variable name consists of 1 to 20 characters, which can be letters, digits, and
underscores (_). The first character cannot be a digit but can be an underscore.

Variable Substitution
Three builtins declare, display, and assign values to variables: set, @, and setenv. The
set and setenv builtins both assume nonnumeric string variables. The @ builtin works
only with numeric variables. Both set and @ declare shell (local) variables. The setenv
builtin declares an environment (global) variable. Using setenv is similar to assigning
a value to a variable and then using export in the Bourne Again Shell. See “Environment, Environment Variables, and Inheritance” on page 468 for a discussion of shell
and environment variables.
Once the value—or merely the existence—of a variable has been established, tcsh
substitutes the value of that variable when the name of the variable, preceded by a
dollar sign ($), appears on a command line. If you quote the dollar sign by preceding it with a backslash or enclosing it within single quotation marks, the shell does
not perform the substitution. When a variable is within double quotation marks,
the substitution occurs even if you quote the dollar sign by preceding it with a
backslash.

String Variables
The TC Shell treats string variables similarly to the way the Bourne Again Shell does.
The major difference lies in their declaration and assignment: tcsh uses an explicit
command, set (or setenv), to declare and/or assign a value to a string variable.
tcsh $ set name = fred
tcsh $ echo $name
fred
tcsh $ set
argv
()
cwd
/home/zach
home
/home/zach
name
fred
path
(/usr/local/bin /bin /usr/bin /usr/X11R6/bin)
prompt $
shell
/bin/tcsh
status 0
term
vt100
user
zach

The first line in the example declares the variable name and assigns the string fred to
it. Unlike bash, tcsh allows—but does not require—SPACEs around the equal sign. The
next line displays the value of name. When you give a set command without any arguments, it displays a list of all shell (not environment) variables and their values. When

Variables 387

you give a set command with the name of a variable and no value, the command sets
the value of the variable to the null string.
You can use the unset builtin to remove a variable:
tcsh $ set name
tcsh $ echo $name
tcsh $ unset name
tcsh $ echo $name
name: Undefined variable.
setenv

The setenv builtin declares an environment variable. When using setenv you must separate the variable name from the string being assigned to it by inserting one or more
SPACEs and omitting the equal sign. In the following example, the tcsh command creates
a subshell, echo shows that the variable and its value are known to the subshell, and
exit returns to the original shell. Try this example, using set in place of setenv:
tcsh $ setenv SRCDIR /usr/local/src
tcsh $ tcsh
tcsh $ echo $SRCDIR
/usr/local/src
tcsh $ exit

Without arguments, setenv displays a list of the environment (global) variables—
variables that are passed to the shell’s child processes. By convention, environment
variables are named using uppercase letters.
As with set, giving setenv a variable name without a value sets the value of the variable to a null string. Although you can use unset to remove environment and local
variables, unsetenv can remove environment variables only.

Arrays of String Variables
An array is a collection of strings, each of which is identified by its index (1, 2, 3, and
so on). Arrays in tcsh use one-based indexing (i.e., the first element of the array has
the subscript 1). Before you can access individual elements of an array, you must
declare the entire array by assigning a value to each element of the array. The list of
values must be enclosed in parentheses and separated by SPACEs:
8 $ set colors = (red green blue orange yellow)
9 $ echo $colors
red green blue orange yellow
10 $ echo $colors[3]
blue
11 $ echo $colors[2-4]
green blue orange
12 $ set shapes = ('' '' '' '' '')
13 $ echo $shapes
14 $ set shapes[4] = square
15 $ echo $shapes[4]
square

388 Chapter 9 The TC Shell (tcsh)

Event 8 declares the array of string variables named colors to have five elements and
assigns values to each of them. If you do not know the values of the elements at the
time you declare an array, you can declare an array containing the necessary number
of null elements (event 12).
You can reference an entire array by preceding its name with a dollar sign (event 9).
A number in brackets following a reference to the array refers to an element of the
array (events 10, 14, and 15). Two numbers in brackets, separated by a hyphen, refer
to two or more adjacent elements of the array (event 11). Refer to “Special Variable
Forms” on page 391 for more information on arrays.

Numeric Variables
The @ builtin assigns the result of a numeric calculation to a numeric variable (as
described under “Variables” on page 385, tcsh has no true numeric variables). You
can declare single numeric variables using @, just as you can use set to declare nonnumeric variables. However, if you give it a nonnumeric argument, @ displays an
error message. Just as set does, the @ command used without any arguments lists all
shell variables.
Many of the expressions that the @ builtin can evaluate and the operators it recognizes are derived from the C programming language. The following syntax shows a
declaration or assignment using @ (the SPACE after the @ is required):
@ variable-name operator expression
The variable-name is the name of the variable you are assigning a value to. The operator is one of the C assignment operators: =, +=, –=, *=, /=, or %=. (See Table 14-4
on page 625 for a description of these operators.) The expression is an arithmetic
expression that can include most C operators (see the next section). You can use
parentheses within the expression for clarity or to change the order of evaluation.
Parentheses must surround parts of the expression that contain any of the following
characters: <, >, &, or |.

Do not use $ when assigning a value to a variable
tip As with bash, variables having a value assigned to them (those on the left of the operator) must
not be preceded by a dollar sign ($) in tcsh. Thus
tcsh $ @ $answer = 5 + 5

will yield
answer: Undefined variable.

or, if answer is defined,
@: Variable name must begin with a letter.

whereas
tcsh $ @ answer = 5 + 5

assigns the value 10 to the variable answer.

Variables 389

Expressions
An expression can be composed of constants, variables, and most of the bash operators (page 495). Expressions that involve files rather than numeric variables or
strings are described in Table 9-8 on page 399.
Expressions follow these rules:
1. The shell evaluates a missing or null argument as 0.
2. All results are decimal numbers.
3. Except for != and ==, the operators act on numeric arguments.
4. You must separate each element of an expression from adjacent elements
by a SPACE, unless the adjacent element is &, |, <, >, (, or ).
Following are some examples that use @:
216
217
0
218
219
7
220
221
0
222
223
12
224
225
13

$ @ count = 0
$ echo $count
$ @ count = ( 10 + 4 ) / 2
$ echo $count
$ @ result = ( $count < 5 )
$ echo $result
$ @ count += 5
$ echo $count
$ @ count++
$ echo $count

Event 216 declares the variable count and assigns it a value of 0. Event 218 shows
the result of an arithmetic operation being assigned to a variable. Event 220 uses @
to assign the result of a logical operation involving a constant and a variable to result.
The value of the operation is false (= 0) because the variable count is not less than 5.
Event 222 is a compressed form of the following assignment statement:
tcsh $ @ count = $count + 5

Event 224 uses a postfix operator to increment count by 1.
Postincrement and
postdecrement
operators

You can use the postincrement (++) and postdecrement (– –) operators only in
expressions containing a single variable name, as shown in the following example:
tcsh $ @ count = 0
tcsh $ @ count++
tcsh $ echo $count
1
tcsh $ @ next = $count++
@: Badly formed number.

390 Chapter 9 The TC Shell (tcsh)

Unlike in the C programming language and bash, expressions in tcsh cannot use
preincrement and predecrement operators.

Arrays of Numeric Variables
You must use the set builtin to declare an array of numeric variables before you can
use @ to assign values to the elements of that array. The set builtin can assign any values
to the elements of a numeric array, including zeros, other numbers, and null strings.
Assigning a value to an element of a numeric array is similar to assigning a value to
a simple numeric variable. The only difference is that you must specify the element,
or index, of the array. The syntax is
@ variable-name[index] operator expression
The index specifies the element of the array that is being addressed. The first element has
an index of 1. The index cannot be an expression but rather must be either a numeric
constant or a variable. In the preceding syntax the brackets around index are part of the
syntax and do not indicate that index is optional. If you specify an index that is too large
for the array you declared with set, tcsh displays @: Subscript out of range.
226 $ set ages = (0 0 0 0 0)
227 $ @ ages[2] = 15
228 $ @ ages[3] = ($ages[2] + 4)
229 $ echo $ages[3]
19
230 $ echo $ages
0 15 19 0 0
231 $ set index = 3
232 $ echo $ages[$index]
19
233 $ echo $ages[6]
ages: Subscript out of range.

Elements of a numeric array behave as though they were simple numeric variables.
Event 226 declares an array with five elements, each having a value of 0. Events 227
and 228 assign values to elements of the array, and event 229 displays the value of
one of the elements. Event 230 displays all the elements of the array, event 232 specifies an element by using a variable, and event 233 demonstrates the out-of-range
error message.

Braces
Like bash, tcsh allows you to use braces to distinguish a variable from the surrounding
text without the use of a separator:
$ set bb=abc
$ echo $bbdef
bbdef: Undefined variable.
$ echo ${bb}def
abcdef

Variables 391

Special Variable Forms
The special variable with the following syntax has the value of the number of elements
in the variable-name array:
$#variable-name
You can determine whether variable-name has been set by looking at the value of the
variable with the following syntax:
$?variable-name
This variable has a value of 1 if variable-name is set and 0 otherwise:
tcsh
tcsh
5
tcsh
1
tcsh
tcsh
0

$ set days = (mon tues wed thurs fri)
$ echo $#days
$ echo $?days
$ unset days
$ echo $?days

Reading User Input
Within a tcsh shell script, you can use the set builtin to read a line from the terminal
and assign it to a variable. The following portion of a shell script prompts the user
and reads a line of input into the variable input_line:
echo -n "Enter input: "
set input_line = "$<"

The value of the shell variable $< is a line from standard input. The quotation marks
around $< keep the shell from assigning only the first word of the line of input to the
variable input_line.

tcsh Variables
TC Shell variables can be set by the shell, inherited by the shell from its parent, or set
by the user and used by the shell. Some variables take on significant values (for example, the PID number of a background process). Other variables act as switches: on if
they are declared and off if they are not. Many of the shell variables are often set from
a startup file (page 372).

tcsh Variables That Take on Values
argv Contains the command-line arguments (positional parameters) from the command
line that invoked the shell. Like all tcsh arrays, this array uses one-based indexing;

argv[1] contains the first command-line argument. You can abbreviate references to

392 Chapter 9 The TC Shell (tcsh)

$argv[n] as $n. The token argv[*] references all the arguments together; you can
abbreviate it as $*. Use $0 to reference the name of the calling program. Refer to
“Positional Parameters” on page 458. The Bourne Again Shell does not use the argv
form, only the abbreviated form. You cannot assign values to the elements of argv.
$#argv or $# Holds the number of elements in the argv array. Refer to “Special Variable Forms”

on page 391.
autolist Controls command and variable completion (page 381).
autologout Enables tcsh’s automatic logout facility, which logs you out if you leave the shell idle
for too long. The value of the variable is the number of minutes of inactivity that tcsh

waits before logging you out. The default is 60 minutes except when you are running
in a graphical environment, in which case this variable is initially unset.
cdpath Affects the operation of cd in the same way as the CDPATH variable does in bash

(page 313). The cdpath variable is assigned an array of absolute pathnames (see path,
later in this section) and is usually set in the ~/.login file with a line such as the
following:
set cdpath = (/home/zach /home/zach/letters)

When you call cd with a simple filename, it searches the working directory for a subdirectory with that name. If one is not found, cd searches the directories listed in
cdpath for the subdirectory.
correct Set to cmd for automatic spelling correction of command names, to all to correct the

entire command line, and to complete for automatic completion of command names.
This variable works on corrections that are made after you press RETURN. Refer to
“After You Press RETURN” on page 384.
cwd The shell sets this variable to the name of the working directory. When you access a
directory through a symbolic link (page 112), tcsh sets cwd to the name of the sym-

bolic link.
dirstack The shell keeps the stack of directories used with the pushd, popd, and dirs builtins in

this variable. For more information refer to “Manipulating the Directory Stack” on
page 297.
fignore Holds an array of suffixes that tcsh ignores during filename completion.
gid The shell sets this variable to your group ID.
histfile Holds the full pathname of the file that saves the history list between login sessions

(page 375). The defaults is ~/.history.
history Specifies the size of the history list. Refer to “History” on page 374.
home or HOME Holds the pathname of the user’s home directory. The cd builtin refers to this variable,

as does the filename substitution of ~ (page 358).
mail Specifies files and directories, separated by whitespace, to check for mail. The TC

Shell checks for new mail every 10 minutes unless the first word of mail is a number,
in which case that number specifies how often the shell should check in seconds.

Variables 393
owd The shell keeps the name of your previous (old) working directory in this variable,
which is equivalent to ~– in bash.
path or PATH Holds a list of directories that tcsh searches for executable commands (page 307). If

this array is empty or unset, you can execute commands only by giving their pathnames. You can set path with a command such as the following:
tcsh $ set path = ( /usr/bin /bin /usr/local/bin /usr/bin/X11 ~/bin . )

prompt Holds the primary prompt, similar to the bash PS1 variable (page 309). If it is not set,

the prompt is >, or # when you are working with root privileges. The shell expands
an exclamation point in the prompt string to the current event number. The following
is a typical line from a .tcshrc file that sets the value of prompt:
set prompt = '! $ '

Table 9-4 lists some of the formatting sequences you can use in prompt to achieve
special effects.

Table 9-4

prompt formatting sequences

Sequence

Displays in prompt

%/

Value of cwd (the working directory)

%~

Same as %/, but replaces the path of the user’s home directory with a tilde

%! or %h or !

Current event number

%d

Day of the week

%D

Day of the month

%m

Hostname without the domain

%M

Full hostname, including the domain

%n

User’s username

%t

Time of day through the current minute

%p

Time of day through the current second

%W

Month as mm

%y

Year as yy

%Y

Year as yyyy

%#

A pound sign (#) if the user is running with root privileges; otherwise, a greater
than sign (>)

%?

Exit status of the preceding command

394 Chapter 9 The TC Shell (tcsh)
prompt2 Holds the secondary prompt, which tcsh uses to indicate it is waiting for additional

input. The default value is %R?. The TC Shell replaces %R with nothing when it is
waiting for you to continue an unfinished command, the word foreach while iterating
through a foreach structure (page 403), and the word while while iterating through
a while structure (page 405).
When you press RETURN in the middle of a quoted string on a command line without
ending the line with a backslash, tcsh displays an error message regardless of whether
you use single or double quotation marks:
% echo "Please enter the three values
Unmatched ".

In the next example, the first RETURN is quoted (escaped); the shell interprets it literally.
Under tcsh, single and double quotation marks produce the same result. The secondary prompt is a question mark (?).
% echo "Please enter the three values \
? required to complete the transaction."
Please enter the three values
> required to complete the transaction.

prompt3 Holds the prompt used during automatic spelling correction. The default value is

CORRECT>%R (y|n|e|a)?, where R is replaced by the corrected string.
savehist Specifies the number of commands saved from the history list when you log out.

These events are saved in a file named ~/.history. The shell uses these events as the
initial history list when you log in again, causing your history list to persist across
login sessions (page 375).
shell Holds the pathname of the shell you are using.
shlvl Holds the level of the shell. The TC Shell increments this variable each time you start

a subshell and decrements it each time you exit a subshell. The TC Shell sets it to 1
for a login shell.
status Holds the exit status returned by the last command. Similar to $? in bash

(page 465).
tcsh Holds the version number of tcsh you are running.
time Provides two functions: automatic timing of commands using the time builtin and the
format used by time. You can set this variable to either a single numeric value or an

array holding a numeric value and a string. The numeric value is used to control automatic timing; any command that takes more than that number of CPU seconds to run
has time display the command statistics when it finishes execution. A value of 0
results in statistics being displayed after every command. The string controls the formatting of the statistics using formatting sequences, including those listed in
Table 9-5.

Variables 395

Table 9-5

time formatting sequences

Sequence

Displays

%U

Time the command spent running user code, in CPU seconds (user mode)

%S

Time the command spent running system code, in CPU seconds (kernel mode)

%E

Wall clock time (total elapsed) taken by the command

%P

Percentage of time the CPU spent on this task during this period, computed as
(%U+%S)/%E

%W

Number of times the command’s processes were swapped out to disk

%X

Average amount of shared code memory used by the command, in kilobytes

%D

Average amount of data memory used by the command, in kilobytes

%K

Total memory used by the command (as %X+%D), in kilobytes

%M

Maximum amount of memory used by the command, in kilobytes

%F

Number of major page faults (pages of memory that had to be read from disk)

%I

Number of input operations

%O

Number of output operations

By default the time builtin uses the string
"%Uu %Ss %E %P%

%X+%Dk %I+%Oio %Fpf+%Ww"

which generates output in the following format:
tcsh $ time
0.200u 0.340s 17:32:33.27 0.0%

0+0k 0+0io 1165pf+0w

You might want to time commands to check system performance. If commands
consistently show many page faults and swaps, the system probably does not have
enough memory; you should consider adding more. You can use the information
that time reports to compare the performance of various system configurations and
program algorithms.
tperiod Controls how often, in minutes, the shell executes the special periodic alias

(page 377).
user The shell sets this variable to your username.
version The shell sets this variable to contain detailed information about the version of tcsh

the system is running.
watch Set to an array of user and terminal pairs to watch for logins and logouts. The word

any means any user or any terminal, so (any any) monitors all logins and logouts on
all terminals, whereas (zach ttyS1 any console $user any) watches for zach on ttyS1,

396 Chapter 9 The TC Shell (tcsh)

any user who accesses the system console, and any logins and logouts that use your
account (presumably to catch intruders). By default logins and logouts are checked
once every 10 minutes, but you can change this value by beginning the array with a
numeric value giving the number of minutes between checks. If you set watch to (1
any console), logins and logouts by any user on the console will be checked once per
minute. Reports are displayed just before a new shell prompt is issued. Also, the log
builtin forces an immediate check whenever it is executed. See who (next) for information about how you can control the format of the watch messages.
who Controls the format of the information displayed in watch messages (Table 9-6).

Table 9-6

who formatting sequence

Sequence

Displays

%n

Username

%a

Action taken by the user

%l

Terminal on which the action took place

%M

Full hostname of remote host (or local if none) from which the action took place

$m

Hostname without domain name

The default string used for watch messages when who is unset is "%n has %a %l
from %m", which generates the following line:
sam has logged on tty2 from local

$ As in bash, this variable contains the PID number of the current shell; use it as $$.

tcsh Variables That Act as Switches
The following shell variables act as switches; their values are not significant. If the
variable has been declared, the shell takes the specified action. If not, the action is not
taken or is negated. You can set these variables in a startup file, in a shell script, or
from the command line.
autocorrect Causes the shell to attempt spelling correction automatically, just before each attempt

at completion (page 384).
dunique Normally pushd blindly pushes the new working directory onto the directory stack,

meaning that you can end up with many duplicate entries on this stack. Set dunique
to cause the shell to look for and delete any entries that duplicate the one it is about
to push.
echo Causes the shell to display each command before it executes that command. Set echo
by calling tcsh with the –x option or by using set.
filec Enables filename completion (page 380) when running tcsh as csh (and csh is linked
to tcsh).
histlit Displays the commands in the history list exactly as entered, without interpretation

by the shell (page 376).

Variables 397
ignoreeof Prevents you from using CONTROL-D to exit from a shell so you cannot accidentally log

out. When this variable is declared, you must use exit or logout to leave a shell.
listjobs Causes the shell to list all jobs whenever a job is suspended.
listlinks Causes the ls–F builtin to show the type of file each symbolic link points to instead

of marking the symbolic link with an @ symbol.
loginsh Set by the shell if the current shell is running as a login shell.
nobeep Disables all beeping by the shell.
noclobber Prevents you from accidentally overwriting a file when you redirect output and pre-

vents you from creating a file when you attempt to append output to a nonexistent file
(Table 9-7). To override noclobber, add an exclamation point to the symbol you use for
redirecting or appending output (e.g., >! and >>!). For more information see page 139.

Table 9-7

How noclobber works

Command line

noclobber not declared

noclobber declared

x > fileout

Redirects standard output from
Redirects standard output from
process x to fileout. Overwrites fileout if process x to fileout. The shell displays
it exists.
an error message if fileout exists, and
does not overwrite the file.

x >> fileout

Redirects standard output from
process x to fileout. Appends new
output to the end of fileout if it exists.
Creates fileout if it does not exist.

Redirects standard output from
process x to fileout. Appends new
output to the end of fileout if it exists.
The shell displays an error message
if fileout does not exist, and does not
create the file.

noglob Prevents the shell from expanding ambiguous filenames. Allows you to use

*, ?, ~,
and [] literally on the command line or in a shell script without quoting them.

nonomatch Causes the shell to pass an ambiguous file reference that does not match a filename

to the command being called. The shell does not expand the file reference. When you
do not set nonomatch, tcsh generates a No match error message and does not execute
the command.
tcsh
cat:
tcsh
tcsh
cat:

$ cat questions?
No match
$ set nonomatch
$ cat questions?
questions?: No such file or directory

notify When set, tcsh sends a message to the screen immediately whenever a background job
completes. Ordinarily tcsh notifies you about job completion just before displaying

the next prompt. Refer to “Job Control” on page 294.
pushdtohome Causes a call to pushd without any arguments to change directories to your home

directory (equivalent to pushd –).
pushdsilent Causes pushd and popd not to display the directory stack.

398 Chapter 9 The TC Shell (tcsh)
rmstar Causes the shell to request confirmation when you give an rm

* command.

verbose Causes the shell to display each command after a history expansion (page 374). Set
verbose by calling tcsh with the –v option or by using set.
visiblebell Causes audible beeps to be replaced by flashing the screen.

Control Structures
The TC Shell uses many of the same control structures as the Bourne Again Shell. In
each case the syntax is different, but the effects are the same. This section summarizes
the differences between the control structures in the two shells. For more information
refer to “Control Structures” on page 418.

if
The syntax of the if control structure is
if (expression) simple-command
The if control structure works only with simple commands, not with pipelines
(page 141) or lists (page 145). You can use the if...then control structure (page 402)
to execute more complex commands.
tcsh $ cat if_1
#!/bin/tcsh
# Routine to show the use of a simple if control structure.
#
if ( $#argv == 0 ) echo "if_1: There are no arguments."

The if_1 script checks whether it was called with zero arguments. If the expression
enclosed in parentheses evaluates to true—that is, if zero arguments were on the command line—the if structure displays a message.
In addition to logical expressions such as the one the if_1 script uses, you can use
expressions that return a value based on the status of a file. The syntax for this type
of expression is
–n filename
where n is one of the values listed in Table 9-8.
If the result of the test is true, the expression has a value of 1; if it is false, the expression
has a value of 0. If the specified file does not exist or is not accessible, tcsh evaluates
the expression as 0. The following example checks whether the file specified on the
command line is an ordinary or directory file (and not a device or other special file):
tcsh $ cat if_2
#!/bin/tcsh
if -f $1 echo "$1 is an ordinary or directory file."

Control Structures

Table 9-8

399

Value of n

n

Meaning

b

File is a block special file

c

File is a character special file

d

File is a directory file

e

File exists

f

File is an ordinary or directory file

g

File has the set-group-ID bit set

k

File has the sticky bit (page 1095) set

l

File is a symbolic link

o

File is owned by user

p

File is a named pipe (FIFO)

r

The user has read access to the file

S

File is a socket special file

s

File is not empty (has nonzero size)

t

File descriptor (a single digit replacing filename) is open and connected to the
terminal

u

File has the set-user-ID bit set

w

User has write access to the file

X

File is either a builtin or an executable found by searching the directories in
$path

x

User has execute access to the file

z

File is 0 bytes long

You can combine operators where it makes sense. For example, –ox filename is true
if you own and have execute permission for the file. This expression is equivalent to
–o filename && –x filename.
Some operators return useful information about a file other than reporting true or
false. They use the same –n filename syntax, where n is one of the values shown in
Table 9-9 on the next page.

400 Chapter 9 The TC Shell (tcsh)

Table 9-9

Value of n

n

Meaning

A

The last time the file was accessed.*

A:

The last time the file was accessed displayed in a human-readable format.

M

The last time the file was modified.*

M:

The last time the file was modified displayed in a human-readable format.

C

The last time the file’s inode was modified.*

C:

The last time the file’s inode was modified displayed in a human-readable
format.

D

Device number for the file. This number uniquely identifies the device (a disk
partition, for example) on which the file resides.

I

Inode number for the file. The inode number uniquely identifies a file on a
particular device.

F

A string of the form device:inode. This string uniquely identifies a file
anywhere on the system.

N

Number of hard links to the file.

P

The file’s permissions, shown in octal, without a leading 0.

U

Numeric user ID of the file’s owner.

U:

Username of the file’s owner.

G

Numeric ID of the group the file is associated with.

G:

Name of the group the file is associated with.

Z

Number of bytes in the file.

*Time measured in seconds from the epoch (usually the start of January 1, 1970).

You can use only one of these operators in a given test, and it must appear as the last
operator in a multiple-operator sequence. Because 0 (zero) can be a valid response
from some of these operators (for instance, the number of bytes in a file might be 0),
most return –1 on failure instead of the 0 that the logical operators return on failure.
The one exception is F, which returns a colon if it cannot determine the device and
inode for the file.
When you want to use one of these operators outside of a control structure expression, you can use the filetest builtin to evaluate a file test and report the result:
tcsh $ filetest -z if_1
0
tcsh $ filetest -F if_1
2051:12694
tcsh $ filetest -Z if_1
131

Control Structures

401

goto
The goto statement has the following syntax:
goto label
A goto builtin transfers control to the statement beginning with label:. The following
script fragment demonstrates the use of goto:
tcsh $ cat goto_1
#!/bin/tcsh
#
# test for 2 arguments
#
if ($#argv == 2) goto goodargs
echo "Usage: $0 arg1 arg2"
exit 1
goodargs:
...

The goto_1 script displays a usage message (page 422) when it is called with more or
fewer than two arguments.

Interrupt Handling
The onintr (on interrupt) statement transfers control when you interrupt a shell
script. The syntax of an onintr statement is
onintr label
When you press the interrupt key during execution of a shell script, the shell transfers
control to the statement beginning with label:. This statement allows you to terminate a script gracefully when it is interrupted. For example, you can use it to ensure
that when you interrupt a shell script, the script removes temporary files before
returning control to the parent shell.
The following script demonstrates the use of onintr. It loops continuously until you
press the interrupt key, at which time it displays a message and returns control to the
shell:
tcsh $ cat onintr_1
#!/bin/tcsh
# demonstration of onintr
onintr close
while ( 1 )
echo "Program is running."
sleep 2
end
close:
echo "End of program."

If a script creates temporary files, you can use onintr to remove them.
close:
rm -f /tmp/$$*

402 Chapter 9 The TC Shell (tcsh)

The ambiguous file reference /tmp/$$* matches all files in /tmp that begin with the
PID number of the current shell. Refer to page 464 for a description of this technique
for naming temporary files.

if...then...else
The if...then...else control structure has three forms. The first form, an extension of
the simple if structure, executes more complex commands or a series of commands
if expression is true. This form is still a one-way branch.
if (expression) then
commands
endif
The second form is a two-way branch. If expression is true, the first set of commands
is executed. If it is false, the set of commands following else is executed.
if (expression) then
commands
else
commands
endif
The third form is similar to the if...then...elif structure (page 425). It performs tests
until it finds an expression that is true and then executes the corresponding
commands.
if (expression) then
commands
else if (expression) then
commands
...
else
commands
endif
The following program assigns a value of 0, 1, 2, or 3 to the variable class based on
the value of the first command-line argument. The program declares the variable
class at the beginning for clarity; you do not need to declare it before its first use. Also
for clarity, the script assigns the value of the first command-line argument to number.
tcsh $ cat if_else_1
#!/bin/tcsh
# routine to categorize the first
# command-line argument
set class
set number = $argv[1]
#

Control Structures

403

if ($number < 0) then
@ class = 0
else if (0 <= $number && $number < 100) then
@ class = 1
else if (100 <= $number && $number < 200) then
@ class = 2
else
@ class = 3
endif
#
echo "The number $number is in class $class."

The first if statement tests whether number is less than 0. If it is, the script assigns
0 to class and transfers control to the statement following endif. If it is not, the second if tests whether the number is between 0 and 100. The && Boolean AND
operator yields a value of true if the expression on each side is true. If the number
is between 0 and 100, 1 is assigned to class and control is transferred to the statement following endif. A similar test determines whether the number is between 100
and 200. If it is not, the final else assigns 3 to class. The endif closes the if control
structure.

foreach
The foreach structure parallels the bash for...in structure (page 431). The syntax is
foreach loop-index (argument-list)
commands
end
This structure loops through commands. The first time through the loop, the structure assigns the value of the first argument in argument-list to loop-index. When
control reaches the end statement, the shell assigns the value of the next argument
from argument-list to loop-index and executes the commands again. The shell
repeats this procedure until it exhausts argument-list.
The following tcsh script uses a foreach structure to loop through the files in the
working directory containing a specified string of characters in their filename and to
change the string. For example, you can use it to change the string memo in filenames
to letter. Thus the filenames memo.1, dailymemo, and memories would be changed
to letter.1, dailyletter, and letterries, respectively.
This script requires two arguments: the string to be changed (the old string) and the
new string. The argument-list of the foreach structure uses an ambiguous file reference to loop through all files in the working directory with filenames that contain the
first argument. For each filename that matches the ambiguous file reference, the mv
utility changes the filename. The echo and sed commands appear within back ticks
(‘) that indicate command substitution: Executing the commands within the back

404 Chapter 9 The TC Shell (tcsh)

ticks replaces the back ticks and everything between them. Refer to “Command Substitution” on page 361 for more information. The sed utility (page 653) substitutes
the first argument with the second argument in the filename. The $1 and $2 are
abbreviated forms of $argv[1] and $argv[2], respectively.
tcsh $ cat ren
#!/bin/tcsh
# Usage:
ren arg1 arg2
#
changes the string arg1 in the names of files
#
in the working directory into the string arg2
if ($#argv != 2) goto usage
foreach i ( *$1* )
mv $i ‘echo $i | sed -n s/$1/$2/p‘
end
exit 0
usage:
echo "Usage: ren arg1 arg2"
exit 1

optional The next script uses a foreach loop to assign the command-line arguments to the
elements of an array named buffer:
tcsh $ cat foreach_1
#!/bin/tcsh
# routine to zero-fill argv to 20 arguments
#
set buffer = (0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
set count = 1
#
if ($#argv > 20) goto toomany
#
foreach argument ($argv[*])
set buffer[$count] = $argument
@ count++
end
# REPLACE command ON THE NEXT LINE WITH
# THE PROGRAM YOU WANT TO CALL.
exec command $buffer[*]
#
toomany:
echo "Too many arguments given."
echo "Usage: foreach_1 [up to 20 arguments]"
exit 1

The foreach_1 script calls another program named command with a command line
guaranteed to contain 20 arguments. If foreach_1 is called with fewer than 20 arguments, it fills the command line with zeros to complete the 20 arguments for
command. Providing more than 20 arguments causes it to display a usage message
and exit with an error status of 1.

Control Structures

405

The foreach structure loops through the commands one time for each command-line
argument. Each time through the loop, foreach assigns the value of the next argument
from the command line to the variable argument. Then the script assigns each of these
values to an element of the array buffer. The variable count maintains the index for
the buffer array. A postfix operator increments the count variable using @ (@
count++). The exec builtin (bash and tcsh; page 480) calls command so a new process
is not initiated. (Once command is called, the process running this routine is no longer needed so a new process is not required.)

while
The syntax of the while structure is
while (expression)
commands
end
This structure continues to loop through commands while expression is true. If
expression is false the first time it is evaluated, the structure never executes
commands.
tcsh $ cat while_1
#!/bin/tcsh
# Demonstration of a while control structure.
# This routine sums the numbers between 1 and n;
# n is the first argument on the command line.
#
set limit = $argv[1]
set index = 1
set sum = 0
#
while ($index <= $limit)
@ sum += $index
@ index++
end
#
echo "The sum is $sum"

This program computes the sum of all integers up to and including n, where n is the
first argument on the command line. The += operator assigns the value of sum +
index to sum.

break and continue
You can interrupt a foreach or while structure with a break or continue statement.
These statements execute the remaining commands on the line before they transfer
control. The break statement transfers control to the statement after the end statement, terminating execution of the loop. The continue statement transfers control to
the end statement, which continues execution of the loop.

406 Chapter 9 The TC Shell (tcsh)

switch
The switch structure is analogous to the bash case structure (page 442):
switch (test-string)
case pattern:
commands
breaksw
case pattern:
commands
breaksw
...
default:
commands
breaksw
endsw
The breaksw statement transfers control to the statement following the endsw statement. If you omit breaksw, control falls through to the next command. You can use
any of the special characters listed in Table 10-2 on page 444 within pattern except
the pipe symbol (|).
tcsh $ cat switch_1
#!/bin/tcsh
# Demonstration of a switch control structure.
# This routine tests the first command-line argument
# for yes or no in any combination of uppercase and
# lowercase letters.
#
# test that argv[1] exists
if ($#argv != 1) then
echo "Usage: $0 [yes|no]"
exit 1
else
# argv[1] exists, set up switch based on its value
switch ($argv[1])
# case of YES
case [yY][eE][sS]:
echo "Argument one is yes."
breaksw
#
# case of NO
case [nN][oO]:
echo "Argument one is no."
breaksw
#

Builtins 407
# default case
default:
echo "Argument one is not yes or no."
breaksw
endsw
endif

Builtins
Builtins are commands that are part of (built into) the shell. When you give a simple
filename as a command, the shell first checks whether it is the name of a builtin. If it
is, the shell executes it as part of the calling process; the shell does not fork a new
process to execute the builtin. The shell does not need to search the directory structure for builtin programs because they are immediately available to the shell.
If the simple filename you give as a command is not a builtin, the shell searches the
directory structure for the program you want, using the PATH variable as a guide.
When it finds the program, the shell forks a new process to execute the program.
Although they are not listed in Table 9-10, the control structure keywords (if,
foreach, endsw, and so on) are builtins. Table 9-10 describes many of the tcsh builtins, some of which are also built into other shells.

Table 9-10

tcsh builtins

Builtin

Function

% job

A synonym for the fg builtin. The job is the job number of the job you want to
bring to the foreground (page 295).

% job &

A synonym for the bg builtin. The job is the number of the job you want to put
in the background (page 296).

@

Similar to the set builtin but evaluates numeric expressions. Refer to “Numeric
Variables” on page 388.

alias

Creates and displays aliases; tcsh uses a different syntax than bash. Refer to
“Aliases” on page 377.

alloc

Displays a report of the amount of free and used memory.

bg

Moves a suspended job into the background (page 296).

bindkey

Controls the mapping of keys to the tcsh command-line editor commands.
bindkey Without any arguments, bindkey lists all key bindings (page 383).

bindkey –l Lists all available editor commands and gives a short description of each.
bindkey –e Puts the command-line editor in emacs mode (page 383).

408 Chapter 9 The TC Shell (tcsh)

Table 9-10
Builtin

tcsh builtins (continued)
Function

bindkey –v Puts the command-line editor in vi(m) mode (page 383).
bindkey key Attaches the editor command command to the key key.
command
bindkey –b key Similar to the previous form but allows you to specify CONTROL keys by using
command the form C–x (where x is the character you type while you press the CONTROL
key), specify META key sequences as M-x (on most keyboards used with Linux,
the ALT key is the META key; Mac OS X uses the OPTION key [but you have to
set an option in the Terminal utility to use it; see page 1049]), and specify
function keys as F-x.
bindkey –c key Binds the key key to the command command. Here the command is not an
command editor command but rather a shell builtin or an executable program.
bindkey –s key Causes tcsh to substitute string when you press key.
string

builtins

Displays a list of all builtins.

cd or chdir

Changes the working directory (page 92).

dirs

Displays the directory stack (page 297).

echo

Displays its arguments. You can prevent echo from displaying a RETURN at the
end of a line by using the –n option (see “Reading User Input” on page 391)
or by using a trailing \c (see “read: Accepts User Input” on page 477). The
echo builtin is similar to the echo utility (page 794).

eval

Scans and evaluates the command line. When you put eval in front of a
command, the command is scanned twice by the shell before it is executed.
This feature is useful with a command that is generated through command or
variable substitution. Because of the order in which the shell processes a
command line, it is sometimes necessary to repeat the scan to achieve the
desired result (page 487).

exec

Overlays the program currently being executed with another program in the
same shell. The original program is lost. Refer to “exec: Executes a Command
or Redirects File Descriptors” on page 480 for more information; also refer to
source on page 410.

exit

Exits from a TC Shell. When you follow exit with a numeric argument, tcsh
returns that number as the exit status (page 465).

fg

Moves a job into the foreground (page 294).

filetest

Takes one of the file inquiry operators followed by one or more filenames and
applies the operator to each filename (page 400). Returns the results as a
SPACE-separated list.

Builtins 409

Table 9-10

tcsh builtins (continued)

Builtin

Function

glob

Like echo, but does not display SPACEs between its arguments and does not
follow its display with a NEWLINE.

hashstat

Reports on the efficiency of tcsh’s hash mechanism. The hash mechanism
speeds the process of searching through the directories in your search path.
See also rehash (page 410) and unhash (page 411).

history

Displays a list of recent commands (page 374).

jobs

Displays a list of jobs (suspended commands and those running in the
background).

kill

Terminates a job or process (page 486).

limit

Limits the computer resources that the current process and any processes it
creates can use. You can put limits on the number of seconds of CPU time the
process can use, the size of files that the process can create, and so forth.

log

Immediately produces the report that the watch shell variable (page 395)
would normally cause tcsh to produce every 10 minutes.

login

Logs in a user. Can be followed by a username.

logout

Ends a session if you are using a login shell.

ls–F

Similar to ls –F (page 862) but faster. (This builtin is the characters ls–F
without any SPACEs.)

nice

Lowers the processing priority of a command or a shell. This builtin is useful
if you want to run a command that makes large demands on the system and
you do not need the output right away. If you are working with root privileges,
you can use nice to raise the priority of a command. Refer to page 892 for
more information on the nice builtin and the nice utility.

nohup

Allows you to log out without terminating processes running in the
background. Some systems are set up this way by default. Refer to page 896
for information on the nohup builtin and the nohup utility.

notify

Causes the shell to notify you immediately when the status of one of your jobs
changes (page 294).

onintr

Controls the action an interrupt causes within a script (page 401). See “trap:
Catches a Signal” on page 483 for information on the equivalent command in
bash.

popd

Changes the working directory to the directory on the top of the directory stack
and removes that directory from the directory stack (page 297).

printenv

Displays all environment variable names and values.

410 Chapter 9 The TC Shell (tcsh)

Table 9-10

tcsh builtins (continued)

Builtin

Function

pushd

Changes the working directory and places the new directory at the top of the
directory stack (page 297).

rehash

Re-creates the internal tables used by the hash mechanism. Whenever a new
instance of tcsh is invoked, the hash mechanism creates a sorted list of all
available commands based on the value of path. After you add a command to
a directory in path, use rehash to re-create the sorted list of commands. If you
do not, tcsh might not be able to find the new command. Also refer to
hashstat (page 409) and unhash (page 411).

repeat

Takes two arguments—a count and a simple command (no pipelines or
lists)—and repeats the command the number of times specified by the count.

sched

Executes a command at a specified time. For example, the following command
causes the shell to display the message Dental appointment. at 10 AM:
tcsh $ sched 10:00 echo "Dental appointment."

Without any arguments, sched displays the list of scheduled commands.
When the time to execute a scheduled command arrives, tcsh executes the
command just before it displays a prompt.

set

Declares, initializes, and displays local variables (page 385).

setenv

Declares, initializes, and displays environment variables (page 385).

shift

Analogous to the bash shift builtin (page 461). Without an argument, shift
promotes the indexes of the argv array. You can use it with an argument of an
array name to perform the same operation on that array.

source

Executes the shell script given as its argument: source does not fork another
process. It is similar to the bash . (dot) builtin (page 280). The source builtin
expects a TC Shell script so no leading #! is required in the script. The current
shell executes source; thus the script can contain commands, such as set,
that affect the current shell. After you make changes to your .tcshrc or .login
file, you can use source to execute it from the shell, thereby putting the
changes into effect without logging out and in. You can nest source builtins.

stop

Stops a job or process that is running in the background. The stop builtin
accepts multiple arguments.

suspend

Stops the current shell and puts it in the background. It is similar to the
suspend key, which stops jobs running in the foreground. This builtin will not
suspend a login shell.

time

Executes the command you give it as an argument. It displays a summary of
time-related information about the executed command, according to the time
shell variable (page 394). Without an argument, time displays the times for the
current shell and its children.

Chapter Summary

Table 9-10

411

tcsh builtins (continued)

Builtin

Function

umask

Identifies or changes the access permissions that tcsh assigns to files you
create (page 994).

unalias

Removes an alias (page 377).

unhash

Turns off the hash mechanism. See also hashstat (page 409) and rehash
(page 410).

unlimit

Removes limits (page 409) on the current process.

unset

Removes a variable declaration (page 385).

unsetenv

Removes an environment variable declaration (page 385).

wait

Causes the shell to wait for all child processes to terminate. When you give a
wait command in response to a shell prompt, tcsh does not display a prompt
until all background processes have finished execution. If you interrupt it with
the interrupt key, wait displays a list of background processes before tcsh
displays a prompt.

where

When given the name of a command as an argument, locates all occurrences
of the command and, for each, tells you whether it is an alias, a builtin, or an
executable program in your path.

which

Similar to where but reports on only the command that would be executed,
not all occurrences. This builtin is much faster than the which utility and
reports on aliases and builtins.

Chapter Summary
Like the Bourne Again Shell, the TC Shell is both a command interpreter and a programming language. The TC Shell, which is based on the C Shell that was developed
at the University of California at Berkeley, includes popular features such as history,
alias, and job control.
You might prefer to use tcsh as a command interpreter, especially if you are familiar
with the C Shell. You can use chsh to change your login shell to tcsh. However, running tcsh as your interactive shell does not cause tcsh to run shell scripts; they will
continue to be run by bash unless you explicitly specify another shell on the first line
of the script or specify the script name as an argument to tcsh. Specifying the shell on
the first line of a shell script ensures the behavior you expect.
If you are familiar with bash, you will notice some differences between the two shells.
For instance, the syntax you use to assign a value to a variable differs. In addition,
tcsh allows SPACEs around the equal sign. Both numeric and nonnumeric variables are

412 Chapter 9 The TC Shell (tcsh)

created and given values using the set builtin. The @ builtin can evaluate numeric
expressions for assignment to numeric variables.
setenv

Because there is no export builtin in tcsh, you must use the setenv builtin to create an
environment (global) variable. You can also assign a value to the variable with the
setenv command. The command unset removes both shell and environment variables,
whereas the command unsetenv removes only environment variables.

Aliases

The syntax of the tcsh alias builtin is slightly different from that of alias in bash.
Unlike bash, the tcsh aliases permit you to substitute command-line arguments using
the history mechanism syntax.
Most other tcsh features, such as history, word completion, and command-line editing, closely resemble their bash counterparts. The syntax of the tcsh control structures
is slightly different but provides functionality equivalent to that found in bash.

Globbing

The term globbing, a carryover from the original Bourne Shell, refers to the matching
of strings containing special characters (such as * and ?) to filenames. If tcsh is unable
to generate a list of filenames matching a globbing pattern, it displays an error message. This behavior contrasts with that of bash, which simply leaves the pattern
alone.
Standard input and standard output can be redirected in tcsh, but there is no straightforward way to redirect them independently. Doing so requires the creation of a
subshell that redirects standard output to a file while making standard error available
to the parent process.

Exercises
1. Assume that you are working with the following history list:
37
38
39
40
41
42
43
44
45
46
47

mail zach
cd /home/sam/correspondence/business/cheese_co
less letter.0321
vim letter.0321
cp letter.0321 letter.0325
grep hansen letter.0325
vim letter.0325
lpr letter*
cd ../milk_co
pwd
vim wilson.0321 wilson.0329

Using the history mechanism, give commands to
a. Send mail to Zach.
b. Use vim to edit a file named wilson.0329.

Exercises 413

c. Send wilson.0329 to the printer.
d. Send both wilson.0321 and wilson.0329 to the printer.
2. a. How can you display the aliases currently in effect?
b. Write an alias named homedots that lists the names (only) of all hidden
files in your home directory.
3. a. How can you prevent a command from sending output to the terminal
when you start it in the background?
b. What can you do if you start a command in the foreground and later
decide that you want it to run in the background?
4. Which statement can you put in your ~/.tcshrc file to prevent accidentally
overwriting a file when you redirect output? How can you override this
feature?
5. Assume that the working directory contains the following files:
adams.ltr.03
adams.brief
adams.ltr.07
abelson.09
abelson.brief
anthony.073
anthony.brief
azevedo.99

What happens if you press TAB after typing the following commands?
a. less adams.l
b. cat a
c. ls ant
d. file az
What happens if you press CONTROL-D after typing the following commands?
e. ls ab
f. less a
6. Write an alias named backup that takes a filename as an argument and creates a copy of that file with the same name and a filename extension of .bak.
7. Write an alias named qmake (quiet make) that runs make with both standard output and standard error redirected to the file named make.log. The
command qmake should accept the same options and arguments as make.
8. How can you make tcsh always display the pathname of the working directory as part of its prompt?

414 Chapter 9 The TC Shell (tcsh)

Advanced Exercises
9. Which lines do you need to change in the Bourne Again Shell script
command_menu (page 444) to turn it into a TC Shell script? Make the
changes and verify that the new script works.
10. Users often find rm (and even rm –i) too unforgiving because this utility
removes files irrevocably. Create an alias named delete that moves files
specified by its argument(s) into the ~/.trash directory. Create a second
alias named undelete that moves a file from the ~/.trash directory into the
working directory. Put the following line in your ~/.logout file to remove
any files that you deleted during the login session:
/bin/rm -f $HOME/.trash/* >& /dev/null

Explain what could be different if the following line were put in your ~/.logout file instead:
rm $HOME/.trash/*

11. Modify the foreach_1 script (page 404) so that it takes the command to exec
as an argument.
12. Rewrite the program while_1 (page 405) so that it runs faster. Use the time
builtin to verify the improvement in execution time.
13. Write your own version of find named myfind that writes output to the file
findout but without the clutter of error messages, such as those generated
when you do not have permission to search a directory. The myfind command should accept the same options and arguments as find. Can you think
of a situation in which myfind does not work as desired?
14. When the foreach_1 script (page 404) is supplied with 20 or fewer arguments, why are the commands following toomany: not executed? (Why is
there no exit command?)

I

PART IV
Programming Tools
CHAPTER 10
Programming the Bourne Again Shell (bash)
CHAPTER 11
The Perl Scripting Language

517

CHAPTER 12
The Python Programming Language

563

CHAPTER 13
The MySQL Database Management System
CHAPTER 14
The AWK Pattern Processing Language
CHAPTER 15
The sed Editor 653

595

619

417

10
Programming the
Bourne Again Shell
(bash)

Chapter10
10

In This Chapter

Objectives

Control Structures . . . . . . . . . . . . . 418

After reading this chapter you should be able to:

File Descriptors . . . . . . . . . . . . . . . 452
Positional Parameters. . . . . . . . . . 458

 Use control structures to implement decision making
and repetition in shell scripts

Special Parameters. . . . . . . . . . . . 463

 Handle input to and output from scripts

Variables . . . . . . . . . . . . . . . . . . . . 467

 Use shell variables (local) and environment variables
(global)

Environment, Environment
Variables, and Inheritance . . . . 468
Array Variables . . . . . . . . . . . . . . . 474
Builtin Commands . . . . . . . . . . . . 476
Expressions . . . . . . . . . . . . . . . . . . 492
Shell Programs . . . . . . . . . . . . . . . 500
A Recursive Shell Script . . . . . . . . 501
The quiz Shell Script. . . . . . . . . . . 504

 Evaluate the value of numeric variables
 Use bash builtin commands to call other scripts inline,
trap signals, and kill processes
 Use arithmetic and logical expressions
 List standard programming practices that result in
well-written scripts

418 Chapter 10 Programming the Bourne Again Shell (bash)

Chapter 5 introduced the shells and Chapter 8 went into detail about the Bourne Again
Shell. This chapter introduces additional Bourne Again Shell commands, builtins, and
concepts that carry shell programming to a point where it can be useful. Although you
might make use of shell programming as a system administrator, you do not have to
read this chapter to perform system administration tasks. Feel free to skip this chapter
and come back to it if and when you like.
The first part of this chapter covers programming control structures, also called
control flow constructs. These structures allow you to write scripts that can loop
over command-line arguments, make decisions based on the value of a variable, set
up menus, and more. The Bourne Again Shell uses the same constructs found in
programming languages such as C.
The next part of this chapter discusses parameters and variables, going into detail
about array variables, shell versus environment variables, special parameters, and
positional parameters. The exploration of builtin commands covers type, which displays information about a command, and read, which allows a shell script to accept
user input. The section on the exec builtin demonstrates how to use exec to execute
a command efficiently by replacing a process and explains how to use exec to redirect
input and output from within a script.
The next section covers the trap builtin, which provides a way to detect and respond to
operating system signals (such as the signal generated when you press CONTROL-C). The
discussion of builtins concludes with a discussion of kill, which can abort a process, and
getopts, which makes it easy to parse options for a shell script. Table 10-6 on page 491
lists some of the more commonly used builtins.
Next the chapter examines arithmetic and logical expressions as well as the operators
that work with them. The final section walks through the design and implementation
of two major shell scripts.
This chapter contains many examples of shell programs. Although they illustrate certain concepts, most use information from earlier examples as well. This overlap not
only reinforces your overall knowledge of shell programming but also demonstrates
how you can combine commands to solve complex tasks. Running, modifying, and
experimenting with the examples in this book is a good way to become comfortable
with the underlying concepts.

Do not name a shell script test
tip You can unwittingly create a problem if you name a shell script test because a bash builtin has
the same name. Depending on how you call your script, you might run either your script or the
builtin, leading to confusing results.

Control Structures
The control flow commands alter the order of execution of commands within a shell
script. The TC Shell uses a different syntax for these commands (page 398) than

Control Structures

if
test-command

419

False

True

then
commands

fi

Figure 10-1

An if...then flowchart

the Bourne Again Shell does. Control structures include the if...then, for...in, while,
until, and case statements. In addition, the break and continue statements work in
conjunction with the control structures to alter the order of execution of commands
within a script.
Getting help with
control structures

You can use the bash help command to display information about bash control
structures. See page 39 for more information.

if...then
The if...then control structure has the following syntax (see page 398 for tcsh):
if test-command
then
commands
fi
The bold words in the syntax description are the items you supply to cause the
structure to have the desired effect. The nonbold words are the keywords the shell
uses to identify the control structure.
test builtin

Figure 10-1 shows that the if statement tests the status returned by the test-command
and transfers control based on this status. The end of the if structure is marked by a
fi statement (if spelled backward). The following script prompts for two words, reads
them, and then uses an if structure to execute commands based on the result returned
by the test builtin (tcsh uses the test utility) when it compares the two words. (See
page 978 for information on the test utility, which is similar to the test builtin.) The

420 Chapter 10 Programming the Bourne Again Shell (bash)
test builtin returns a status of true if the two words are the same and false if they are
not. Double quotation marks around $word1 and $word2 make sure test works
properly if you enter a string that contains a SPACE or other special character.
$ cat if1
read -p "word 1: " word1
read -p "word 2: " word2
if test "$word1" = "$word2"
then
echo "Match"
fi
echo "End of program."
$ ./if1
word 1: peach
word 2: peach
Match
End of program.

In the preceding example the test-command is test "$word1" = "$word2". The test
builtin returns a true status if its first and third arguments have the relationship
specified by its second argument. If this command returns a true status (= 0), the
shell executes the commands between the then and fi statements. If the command
returns a false status (not = 0), the shell passes control to the statement following
fi without executing the statements between then and fi. The effect of this if statement is to display Match if the two words are the same. The script always displays
End of program.
Builtins

In the Bourne Again Shell, test is a builtin—part of the shell. It is also a stand-alone
utility kept in /usr/bin/test. This chapter discusses and demonstrates many Bourne
Again Shell builtins. Each bash builtin might or might not be a builtin in tcsh. The
shell will use the builtin version if it is available and the utility if it is not. Each version
of a command might vary slightly from one shell to the next and from the utility to
any of the shell builtins. See page 476 for more information on shell builtins.

Checking arguments

The next program uses an if structure at the beginning of a script to confirm that you
have supplied at least one argument on the command line. The test –eq criterion compares two integers; the shell expands the $# special parameter (page 463) to the
number of command-line arguments. This structure displays a message and exits
from the script with an exit status of 1 if you do not supply at least one argument.
$ cat chkargs
if test $# -eq 0
then
echo "You must supply at least one argument."
exit 1
fi
echo "Program running."

Control Structures

421

$ ./chkargs
You must supply at least one argument.
$ ./chkargs abc
Program running.

A test like the one shown in chkargs is a key component of any script that requires
arguments. To prevent the user from receiving meaningless or confusing information
from the script, the script needs to check whether the user has supplied the appropriate arguments. Some scripts simply test whether arguments exist (as in chkargs);
other scripts test for a specific number or specific kinds of arguments.
You can use test to verify the status of a file argument or the relationship between
two file arguments. After verifying that at least one argument has been given on the
command line, the following script tests whether the argument is the name of an ordinary file (not a directory or other type of file). The test builtin with the –f criterion
and the first command-line argument ($1) checks the file.
$ cat is_ordfile
if test $# -eq 0
then
echo "You must supply at least one argument."
exit 1
fi
if test -f "$1"
then
echo "$1 is an ordinary file."
else
echo "$1 is NOT an ordinary file."
fi

You can test many other characteristics of a file using test criteria; see Table 10-1.

Table 10-1

test builtin criteria

Criterion

Tests file to see if it

–d

Exists and is a directory file

–e

Exists

–f

Exists and is an ordinary file (not a directory)

–r

Exists and is readable

–s

Exists and has a size greater than 0 bytes

–w

Exists and is writable

–x

Exists and is executable

422 Chapter 10 Programming the Bourne Again Shell (bash)

Other test criteria provide ways to test relationships between two files, such as
whether one file is newer than another. Refer to examples later in this chapter and to
test on page 978 for more information.

Always test the arguments
tip To keep the examples in this book short and focused on specific concepts, the code to verify
arguments is often omitted or abbreviated. It is good practice to test arguments in shell programs that other people will use. Doing so results in scripts that are easier to debug, run, and
maintain.
[ ] is a synonym
for test

The following example—another version of chkargs—checks for arguments in a
way that is more traditional for Linux shell scripts. This example uses the bracket
([]) synonym for test. Rather than using the word test in scripts, you can surround
the arguments to test with brackets. The brackets must be surrounded by
whitespace (SPACEs or TABs).
$ cat chkargs2
if [ $# -eq 0 ]
then
echo "Usage: chkargs2 argument..." 1>&2
exit 1
fi
echo "Program running."
exit 0
$ ./chkargs2
Usage: chkargs2 argument...
$ ./chkargs2 abc
Program running.

Usage messages

The error message that chkargs2 displays is called a usage message and uses the 1>&2
notation to redirect its output to standard error (page 284). After issuing the usage
message, chkargs2 exits with an exit status of 1, indicating an error has occurred. The
exit 0 command at the end of the script causes chkargs2 to exit with a 0 status after
the program runs without an error. The Bourne Again Shell returns the exit status of
the last command the script ran if you omit the status code.
The usage message is commonly used to specify the type and number of arguments
the script requires. Many Linux utilities provide usage messages similar to the one in
chkargs2. If you call a utility or other program with the wrong number or wrong kind
of arguments, it will often display a usage message. Following is the usage message
that cp displays when you call it with only one argument:
$ cp a
cp: missing destination file operand after 'a'
Try 'cp --help' for more information.

Control Structures

True

if
test-command

then
commands

423

False

else
commands

fi

Figure 10-2

An if...then...else flowchart

if...then...else
The introduction of an else statement turns the if structure into the two-way branch
shown in Figure 10-2. The if...then...else control structure (available in tcsh with a
slightly different syntax) has the following syntax:
if test-command
then
commands
else
commands
fi
Because a semicolon (;) ends a command just as a NEWLINE does, you can place then on
the same line as if by preceding it with a semicolon. (Because if and then are separate
builtins, they require a control operator between them; a semicolon and NEWLINE work
equally well [page 290].) Some people prefer this notation for aesthetic reasons; others
like it because it saves space.
if test-command; then
commands
else
commands
fi

424 Chapter 10 Programming the Bourne Again Shell (bash)

True

if
test-command

then
commands

True

False

elif
test-command

then
commands

False

else
commands

fi

Figure 10-3 An if...then...elif flowchart

If the test-command returns a true status, the if structure executes the commands
between the then and else statements and then diverts control to the statement following fi. If the test-command returns a false status, the if structure executes the
commands following the else statement.
When you run the out script with arguments that are filenames, it displays the files
on the terminal. If the first argument is –v (called an option in this case), out uses less
(page 53) to display the files one screen at a time. After determining that it was called
with at least one argument, out tests its first argument to see whether it is –v. If the
result of the test is true (the first argument is –v), out uses the shift builtin (page 461)
to shift the arguments to get rid of the –v and displays the files using less. If the result
of the test is false (the first argument is not –v), the script uses cat to display the files.

Control Structures

425

$ cat out
if [ $# -eq 0 ]
then
echo "Usage: $0 [-v] filenames..." 1>&2
exit 1
fi
if [ "$1" = "-v" ]
then
shift
less -- "$@"
else
cat -- "$@"
fi

optional In out, the –– argument to cat and less tells these utilities that no more options follow
on the command line and not to consider leading hyphens (–) in the following list as
indicating options. Thus – – allows you to view a file whose name starts with a
hyphen (page 129). Although not common, filenames beginning with a hyphen do
occasionally occur. (You can create such a file by using the command cat > –fname.)
The –– argument works with all Linux utilities that use the getopts builtin (page 488)
to parse their options; it does not work with more and a few other utilities. This argument is particularly useful when used in conjunction with rm to remove a file whose
name starts with a hyphen (rm –– –fname), including any you create while experimenting with the –– argument.

if...then...elif
The if...then...elif control structure (Figure 10-3; not in tcsh) has the following
syntax:
if test-command
then
commands
elif test-command
then
commands
...
else
commands
fi
The elif statement combines the else statement and the if statement and enables you
to construct a nested set of if...then...else structures (Figure 10-3). The difference
between the else statement and the elif statement is that each else statement must be
paired with a fi statement, whereas multiple nested elif statements require only a single
closing fi statement.

426 Chapter 10 Programming the Bourne Again Shell (bash)

The following example shows an if...then...elif control structure. This shell script
compares three words the user enters. The first if statement uses the Boolean AND
operator (–a) as an argument to test. The test builtin returns a true status if the first
and second logical comparisons are true (that is, word1 matches word2 and word2
matches word3). If test returns a true status, the script executes the command following the next then statement, passes control to the statement following fi, and
terminates.
$ cat if3
read -p "word 1: " word1
read -p "word 2: " word2
read -p "word 3: " word3
if [ "$word1" = "$word2" -a "$word2" = "$word3" ]
then
echo "Match: words 1, 2, & 3"
elif [ "$word1" = "$word2" ]
then
echo "Match: words 1 & 2"
elif [ "$word1" = "$word3" ]
then
echo "Match: words 1 & 3"
elif [ "$word2" = "$word3" ]
then
echo "Match: words 2 & 3"
else
echo "No match"
fi

$ ./if3
word 1: apple
word 2: orange
word 3: pear
No match
$ ./if3
word 1: apple
word 2: orange
word 3: apple
Match: words 1 & 3
$ ./if3
word 1: apple
word 2: apple
word 3: apple
Match: words 1, 2, & 3

If the three words are not the same, the structure passes control to the first elif,
which begins a series of tests to see if any pair of words is the same. As the nesting
continues, if any one of the elif statements is satisfied, the structure passes control
to the next then statement and subsequently to the statement following fi. Each
time an elif statement is not satisfied, the structure passes control to the next elif
statement. The double quotation marks around the arguments to echo that contain
ampersands (&) prevent the shell from interpreting the ampersands as special
characters.

Control Structures

optional

427

The lnks Script
The following script, named lnks, demonstrates the if...then and if...then...elif control
structures. This script finds hard links to its first argument, a filename. If you provide
the name of a directory as the second argument, lnks searches for links in the directory
hierarchy rooted at that directory. If you do not specify a directory, lnks searches the
working directory and its subdirectories. This script does not locate symbolic links.
$ cat lnks
#!/bin/bash
# Identify links to a file
# Usage: lnks file [directory]
if [ $# -eq 0 -o $# -gt 2 ]; then
echo "Usage: lnks file [directory]" 1>&2
exit 1
fi
if [ -d "$1" ]; then
echo "First argument cannot be a directory." 1>&2
echo "Usage: lnks file [directory]" 1>&2
exit 1
else
file="$1"
fi
if [ $# -eq 1 ]; then
directory="."
elif [ -d "$2" ]; then
directory="$2"
else
echo "Optional second argument must be a directory." 1>&2
echo "Usage: lnks file [directory]" 1>&2
exit 1
fi
# Check that file exists and is an ordinary file
if [ ! -f "$file" ]; then
echo "lnks: $file not found or is a special file" 1>&2
exit 1
fi
# Check link count on file
set -- $(ls -l "$file")
linkcnt=$2
if [ "$linkcnt" -eq 1 ]; then
echo "lnks: no other hard links to $file" 1>&2
exit 0
fi
# Get the inode of the given file
set $(ls -i "$file")
inode=$1
# Find and print the files with that inode number
echo "lnks: using find to search for links..." 1>&2
find "$directory" -xdev -inum $inode -print

428 Chapter 10 Programming the Bourne Again Shell (bash)

Max has a file named letter in his home directory. He wants to find links to this file
in his and other users’ home directory file hierarchies. In the following example, Max
calls lnks from his home directory to perform the search. If you are running Mac OS
X, substitute /Users for /home. The second argument to lnks, /home, is the pathname
of the directory where Max wants to start the search. The lnks script reports that
/home/max/letter and /home/zach/draft are links to the same file:
$ ./lnks letter /home
lnks: using find to search for links...
/home/max/letter
/home/zach/draft

In addition to the if...then...elif control structure, lnks introduces other features that
are commonly used in shell programs. The following discussion describes lnks section
by section.
Specify the shell

The first line of the lnks script uses #! (page 287) to specify the shell that will execute
the script:
#!/bin/bash

In this chapter, the #! notation appears only in more complex examples. It ensures
that the proper shell executes the script, even when the user is running a different shell
or the script is called from a script running a different shell.
Comments

The second and third lines of lnks are comments; the shell ignores text that follows a
hashmark (#) up to the next NEWLINE character. These comments in lnks briefly identify
what the file does and explain how to use it:
# Identify links to a file
# Usage: lnks file [directory]

Usage messages

The first if statement tests whether lnks was called with zero arguments or more than
two arguments:
if [ $# -eq 0 -o $# -gt 2 ]; then
echo "Usage: lnks file [directory]" 1>&2
exit 1
fi

If either of these conditions is true, lnks sends a usage message to standard error
and exits with a status of 1. The double quotation marks around the usage message
prevent the shell from interpreting the brackets as special characters. The brackets
in the usage message indicate that the directory argument is optional.
The second if statement tests whether the first command-line argument ($1) is a
directory (the –d argument to test returns true if the file exists and is a directory):
if [ -d "$1" ]; then
echo "First argument cannot be a directory." 1>&2
echo "Usage: lnks file [directory]" 1>&2
exit 1
else
file="$1"
fi

If the first argument is a directory, lnks displays a usage message and exits. If it is not
a directory, lnks saves the value of $1 in the file variable because later in the script

Control Structures

429

set resets the command-line arguments. If the value of $1 is not saved before the set
command is issued, its value is lost.
Test the arguments

The next section of lnks is an if...then...elif statement:
if [ $# -eq 1 ]; then
directory="."
elif [ -d "$2" ]; then
directory="$2"
else
echo "Optional second argument must be a directory." 1>&2
echo "Usage: lnks file [directory]" 1>&2
exit 1
fi

The first test-command determines whether the user specified a single argument on
the command line. If the test-command returns 0 (true), the directory variable is
assigned the value of the working directory (.). If the test-command returns a nonzero
value (false), the elif statement tests whether the second argument is a directory. If it
is a directory, the directory variable is set equal to the second command-line argument, $2. If $2 is not a directory, lnks sends a usage message to standard error and
exits with a status of 1.
The next if statement in lnks tests whether $file does not exist. This test keeps lnks
from wasting time looking for links to a nonexistent file. The test builtin, when called
with the three arguments !, –f, and $file, evaluates to true if the file $file does not
exist:
[ ! -f "$file" ]

The ! operator preceding the –f argument to test negates its result, yielding false if
the file $file does exist and is an ordinary file.
Next lnks uses set and ls –l to check the number of links $file has:
# Check link count on file
set -- $(ls -l "$file")
linkcnt=$2
if [ "$linkcnt" -eq 1 ]; then
echo "lnks: no other hard links to $file" 1>&2
exit 0
fi

The set builtin uses command substitution (page 361) to set the positional parameters
to the output of ls –l. The second field in this output is the link count, so the user-created
variable linkcnt is set equal to $2. The –– used with set prevents set from interpreting
as an option the first argument produced by ls –l (the first argument is the access permissions for the file and typically begins with –). The if statement checks whether
$linkcnt is equal to 1; if it is, lnks displays a message and exits. Although this message
is not truly an error message, it is redirected to standard error. The way lnks has been
written, all informational messages are sent to standard error. Only the final product
of lnks—the pathnames of links to the specified file—is sent to standard output, so you
can redirect the output.

430 Chapter 10 Programming the Bourne Again Shell (bash)

If the link count is greater than 1, lnks goes on to identify the inode (page 1074) for
$file. As explained on page 112, comparing the inodes associated with filenames is a
good way to determine whether the filenames are links to the same file. The lnks script
uses set to set the positional parameters to the output of ls –i. The first argument to
set is the inode number for the file, so the user-created variable named inode is
assigned the value of $1:
# Get the inode of the given file
set $(ls -i "$file")
inode=$1

Finally lnks uses the find utility (page 804) to search for files having inode numbers
that match $inode:
# Find and print the files with that inode number
echo "lnks: using find to search for links..." 1>&2
find "$directory" -xdev -inum $inode -print

The find utility searches the directory hierarchy rooted at the directory specified by
its first argument ($directory) for files that meet the criteria specified by the remaining arguments. In this example, the remaining arguments send the names of files
having inode numbers matching $inode to standard output. Because files in different
filesystems can have the same inode number yet not be linked, find must search only
directories in the same filesystem as $directory. The –xdev (cross-device) argument
prevents find from searching directories on other filesystems. Refer to page 109 for
more information about filesystems and links.
The echo command preceding the find command in lnks, which tells the user that find is
running, is included because find can take a long time to run. Because lnks does not include
a final exit statement, the exit status of lnks is that of the last command it runs, find.

Debugging Shell Scripts
When you are writing a script such as lnks, it is easy to make mistakes. You can use
the shell’s –x option to help debug a script. This option causes the shell to display each
command after it expands it but before it runs the command. Tracing a script’s execution in this way can give you information about where a problem lies.
You can run lnks (above) and cause the shell to display each command before it is
executed. Either set the –x option for the current shell (set –x) so all scripts display
commands as they are run or use the –x option to affect only the shell running the
script called by the command line.
$ bash -x lnks letter /home
+ '[' 2 -eq 0 -o 2 -gt 2 ']'
+ '[' -d letter ']'
+ file=letter
+ '[' 2 -eq 1 ']'
+ '[' -d /home ']'
+ directory=/home
+ '[' '!' -f letter ']'
...

Control Structures

431

Assign next
argument in
argument-list
to loop-index

do

commands

Another
argument in
argument-list
?

Yes

No

done

Figure 10-4
PS4

A for...in flowchart

Each command the script executes is preceded by the value of the PS4 variable—a plus
sign (+) by default—so you can distinguish debugging output from output produced by
the script. You must export PS4 if you set it in the shell that calls the script. The next
command sets PS4 to >>>> followed by a SPACE and exports it:
$ export PS4='>>>> '

You can also set the –x option of the shell running the script by putting the following
set command near the beginning of the script:
set -x

You can put set –x anywhere in the script to turn debugging on starting at that location.
Turn debugging off using set +x. The set –o xtrace and set +o xtrace commands do the
same things as set –x and set +x, respectively.

for...in
The for...in control structure (tcsh uses foreach) has the following syntax:
for loop-index in argument-list
do
commands
done
The for...in structure (Figure 10-4) assigns the value of the first argument in the
argument-list to the loop-index and executes the commands between the do and

432 Chapter 10 Programming the Bourne Again Shell (bash)

done statements. The do and done statements mark the beginning and end of the for
loop, respectively.
After it passes control to the done statement, the structure assigns the value of the
second argument in the argument-list to the loop-index and repeats the commands.
It repeats the commands between the do and done statements one time for each argument in the argument-list. When the structure exhausts the argument-list, it passes
control to the statement following done.
The following for...in structure assigns apples to the user-created variable fruit and
then displays the value of fruit, which is apples. Next the structure assigns oranges
to fruit and repeats the process. When it exhausts the argument list, the structure
transfers control to the statement following done, which displays a message.
$ cat fruit
for fruit in apples oranges pears bananas
do
echo "$fruit"
done
echo "Task complete."
$ ./fruit
apples
oranges
pears
bananas
Task complete.

The next script lists the names of the directory files in the working directory by
looping through the files in the working directory and using test to determine which
are directory files:
$ cat dirfiles
for i in *
do
if [ -d "$i" ]
then
echo "$i"
fi
done

The ambiguous file reference character * matches the names of all files (except hidden
files) in the working directory. Prior to executing the for loop, the shell expands the *
and uses the resulting list to assign successive values to the index variable i.

optional

Step Values
As an alternative to explicitly specifying values for argument-list, you can specify step
values. A for...in loop that uses step values assigns an initial value to or increments the
loop-index, executes the statements within the loop, and tests a termination condition
at the end of the loop.
The following example uses brace expansion with a sequence expression (page 357) to
generate the argument-list. This syntax works on bash version 4.0 and above; give the
command echo $BASH_VERSION to see which version you are using. The increment

Control Structures

433

does not work under OS X. The first time through the loop, bash assigns a value of 0
to count (the loop-index) and executes the statement between do and done. At the bottom of the loop, bash tests whether the termination condition has been met (is
count>10?). If it has, bash passes control to the statement following done; if not, bash
increments count by the increment value (2) and makes another pass through the loop.
It repeats this process until the termination condition is met.
$ cat step1
for count in {0..10..2}
do
echo -n "$count "
done
echo
$ ./step1
0 2 4 6 8 10

Older versions of bash do not support sequence expressions; you can use the seq utility
to perform the same function:
$ for count in $(seq 0 2 10); do echo -n "$count "; done; echo
0 2 4 6 8 10

The next example uses bash’s C-like syntax to specify step values. This syntax gives
you more flexibility in specifying the termination condition and the increment value.
Using this syntax, the first parameter initializes the loop-index, the second parameter
specifies the condition to be tested, and the third parameter specifies the increment.
$ cat rand
# $RANDOM evaluates to a
# This program simulates
for ((x=1; x<=10; x++))
do
echo -n "Roll #$x: "
echo -n
$(( $RANDOM
echo " " $(( $RANDOM
done

random value 0 < x < 32,767
10 rolls of a pair of dice

% 6 + 1 ))
% 6 + 1 ))

for
The for control structure (not in tcsh) has the following syntax:
for loop-index
do
commands
done
In the for structure, the loop-index takes on the value of each of the command-line arguments, one at a time. The for structure is the same as the for...in structure (Figure 10-4,
page 431) except in terms of where it gets values for the loop-index. The for structure
performs a sequence of commands, usually involving each argument in turn.
The following shell script shows a for structure displaying each command-line argument. The first line of the script, for arg, implies for arg in "$@", where the shell
expands "$@" into a list of quoted command-line arguments (i.e., "$1" "$2"
"$3" ...). The balance of the script corresponds to the for...in structure.

434 Chapter 10 Programming the Bourne Again Shell (bash)
$ cat for_test
for arg
do
echo "$arg"
done
$ for_test candy gum chocolate
candy
gum
chocolate

The next example uses a different syntax. In it, the loop-index is named count and
is set to an initial value of 0. The condition to be tested is count<=10: bash continues
executing the loop as long as this condition is true (as long as count is less than or
equal to 10; see Table 10-8 on page 495 for a list of operators). Each pass through
the loop, bash adds 2 to the value of count (count+=2).
$ cat step2
for (( count=0; count<=10; count+=2 ))
do
echo -n "$count "
done
echo
$ ./step2
0 2 4 6 8 10

optional

The whos Script
The following script, named whos, demonstrates the usefulness of the implied "$@"
in the for structure. You give whos one or more users’ full names or usernames as
arguments, and whos displays information about the users. The whos script gets the
information it displays from the first and fifth fields in the /etc/passwd file. The first
field contains a username, and the fifth field typically contains the user’s full name.
You can provide a username as an argument to whos to display the user’s name or
provide a name as an argument to display the username. The whos script is similar
to the finger utility, although whos delivers less information. Mac OS X uses Open
Directory in place of the passwd file; see page 1042 for a similar script that runs
under OS X.
$ cat whos
#!/bin/bash
if [ $# -eq 0 ]
then
echo "Usage: whos id..." 1>&2
exit 1
fi

Control Structures

435

for id
do
gawk -F: '{print $1, $5}' /etc/passwd |
grep -i "$id"
done

In the next example, whos identifies the user whose username is chas and the user
whose name is Marilou Smith:
$ ./whos chas "Marilou Smith"
chas Charles Casey
msmith Marilou Smith
Use of "$@"

The whos script uses a for statement to loop through the command-line arguments.
In this script the implied use of "$@" in the for loop is particularly beneficial because
it causes the for loop to treat an argument that contains a SPACE as a single argument.
This example encloses Marilou Smith in quotation marks, which causes the shell to
pass it to the script as a single argument. Then the implied "$@" in the for statement
causes the shell to regenerate the quoted argument Marilou Smith so that it is again
treated as a single argument. The double quotation marks in the grep statement perform the same function.

gawk

For each command-line argument, whos searches the /etc/passwd file. Inside the for
loop, the gawk utility (Chapter 12; awk and mawk work the same way) extracts the first
($1) and fifth ($5) fields from each line in /etc/passwd. The –F: option causes gawk
to use a colon (:) as a field separator when it reads /etc/passwd, allowing it to break
each line into fields. The gawk command sets and uses the $1 and $5 arguments; they
are included within single quotation marks and are not interpreted by the shell. Do
not confuse these arguments with positional parameters, which correspond to command-line arguments. The first and fifth fields are sent to grep (page 833) via a
pipeline. The grep utility searches for $id (to which the shell has assigned the value
of a command-line argument) in its input. The –i option causes grep to ignore case
as it searches; grep displays each line in its input that contains $id.

A pipe symbol (|) at
the end of a line

Under bash (and not tcsh), a control operator such as a pipe symbol ( | ) implies
continuation: bash “knows” another command must follow it. Therefore, in
whos, the NEWLINE following the pipe symbol at the end of the line with the gawk
command does not have to be quoted. For more information refer to “Implicit
Command-Line Continuation” on page 499.

while
The while control structure (see page 405 for tcsh) has the following syntax:
while test-command
do
commands
done

436 Chapter 10 Programming the Bourne Again Shell (bash)

while
test-command

False

done

True

do
commands

Figure 10-5

A while flowchart

As long as the test-command (Figure 10-5) returns a true exit status, the while
structure continues to execute the series of commands delimited by the do and
done statements. Before each loop through the commands, the structure executes
the test-command. When the exit status of the test-command is false, the structure
passes control to the statement after the done statement.
test builtin

The following shell script first initializes the number variable to zero. The test builtin
then determines whether number is less than 10. The script uses test with the –lt argument to perform a numerical test. For numerical comparisons, you must use –ne (not
equal), –eq (equal), –gt (greater than), –ge (greater than or equal to), –lt (less than),
or –le (less than or equal to). For string comparisons, use = (equal) or != (not equal)
when you are working with test. In this example, test has an exit status of 0 (true) as
long as number is less than 10. As long as test returns true, the structure executes the
commands between the do and done statements. See page 978 for information on the
test utility, which is very similar to the test builtin.
$ cat count
#!/bin/bash
number=0
while [ "$number" -lt 10 ]
do
echo -n "$number"
((number +=1))
done
echo
$ ./count
0123456789
$

The echo command following do displays number. The –n prevents echo from issuing
a NEWLINE following its output. The next command uses arithmetic evaluation [((...));
page 492] to increment the value of number by 1. The done statement terminates the
loop and returns control to the while statement to start the loop over again. The final
echo causes count to send a NEWLINE character to standard output, so the next prompt
is displayed at the left edge of the display rather than immediately following the 9.

Control Structures

optional

437

The spell_check Script
The aspell utility (page 721; not available under OS X) checks the words in a file against
a dictionary of correctly spelled words. With the list command, aspell runs in list mode:
Input comes from standard input and aspell sends each potentially misspelled word to
standard output. The following command produces a list of possible misspellings in the
file letter.txt:
$ aspell list < letter.txt
quikly
portible
frendly

The next shell script, named spell_check, shows another use of a while structure. To
find the incorrect spellings in a file, spell_check calls aspell to check a file against a
system dictionary. But it goes a step further: It enables you to specify a list of correctly
spelled words and removes these words from the output of aspell. This script is useful
for removing words you use frequently, such as names and technical terms, that do
not appear in a standard dictionary. Although you can duplicate the functionality of
spell_check by using additional aspell dictionaries, the script is included here for its
instructive value.
The spell_check script requires two filename arguments: the file containing the list of
correctly spelled words and the file you want to check. The first if statement verifies
that the user specified two arguments. The next two if statements verify that both
arguments are readable files. (The exclamation point negates the sense of the following operator; the –r operator causes test to determine whether a file is readable. The
result is a test that determines whether a file is not readable.)
$ cat spell_check
#!/bin/bash
# remove correct spellings from aspell output
if [ $# -ne
then
echo
echo
echo
exit
fi

2 ]
"Usage: spell_check dictionary filename" 1>&2
"dictionary: list of correct spellings" 1>&2
"filename: file to be checked" 1>&2
1

if [ ! -r "$1" ]
then
echo "spell_check: $1 is not readable" 1>&2
exit 1
fi
if [ ! -r "$2" ]
then
echo "spell_check: $2 is not readable" 1>&2
exit 1
fi

438 Chapter 10 Programming the Bourne Again Shell (bash)
aspell list < "$2" |
while read line
do
if ! grep "^$line$" "$1" > /dev/null
then
echo $line
fi
done

The spell_check script sends the output from aspell (with the list argument, so it
produces a list of misspelled words on standard output) through a pipeline to
standard input of a while structure, which reads one line at a time (each line has
one word on it) from standard input. The test-command (that is, read line) returns
a true exit status as long as it receives a line from standard input.
Inside the while loop, an if statement monitors the return value of grep, which determines whether the line that was read is in the user’s list of correctly spelled words.
The pattern grep searches for (the value of $line) is preceded and followed by special
characters that specify the beginning and end of a line (^ and $, respectively). These
special characters ensure that grep finds a match only if the $line variable matches an
entire line in the file of correctly spelled words. (Otherwise, grep would match a
string, such as paul, in the output of aspell if the file of correctly spelled words contained the word paulson.) These special characters, together with the value of the
$line variable, form a regular expression (Appendix A).
The output of grep is redirected to /dev/null (page 141) because the output is not
needed; only the exit code is important. The if statement checks the negated exit status of grep (the leading exclamation point negates or changes the sense of the exit
status—true becomes false, and vice versa), which is 0 or true (false when negated)
when a matching line is found. If the exit status is not 0 or false (true when negated),
the word was not in the file of correctly spelled words. The echo builtin sends a list
of words that are not in the file of correctly spelled words to standard output.
Once it detects the EOF (end of file), the read builtin returns a false exit status, control
passes out of the while structure, and the script terminates.
Before you use spell_check, create a file of correct spellings containing words that you
use frequently but that are not in a standard dictionary. For example, if you work for a
company named Blinkenship and Klimowski, Attorneys, you would put Blinkenship
and Klimowski in the file. The following example shows how spell_check checks the
spelling in a file named memo and removes Blinkenship and Klimowski from the output
list of incorrectly spelled words:
$ aspell list < memo
Blinkenship
Klimowski
targat
hte
$ cat word_list
Blinkenship
Klimowski
$ ./spell_check word_list memo
targat
hte

Control Structures

until
test-command

True

439

done

False

do
commands

Figure 10-6

An until flowchart

until
The until (not in tcsh) and while (see page 405 for tcsh) structures are similar, differing only in the sense of the test performed at the top of the loop. Figure 10-6 shows
that until continues to loop until the test-command returns a true exit status. The
while structure loops while the test-command continues to return a true or nonerror
condition. The until control structure has the following syntax:
until test-command
do
commands
done
The following script demonstrates an until structure that includes read (page 477).
When the user enters the correct string of characters, the test-command is satisfied
and the structure passes control out of the loop.
$ cat until1
secretname=zach
name=noname
echo "Try to guess the secret name!"
echo
until [ "$name" = "$secretname" ]
do
read -p "Your guess: " name
done
echo "Very good."
$ ./until1
Try to guess the secret name!
Your
Your
Your
Your
Very

guess:
guess:
guess:
guess:
good

helen
barbara
rachael
zach

440 Chapter 10 Programming the Bourne Again Shell (bash)

The following locktty script is similar to the lock command on Berkeley UNIX and
the Lock Screen menu selection in GNOME. The script prompts for a key (password)
and uses an until control structure to lock the terminal. The until statement causes
the system to ignore any characters typed at the keyboard until the user types the key
followed by a RETURN on a line by itself, which unlocks the terminal. The locktty script
can keep people from using your terminal while you are away from it for short periods of time. It saves you from having to log out if you are concerned about other users
using your session.
$ cat locktty
#! /bin/bash
trap '' 1 2 3 18
stty -echo
read -p "Key: " key_1
echo
read -p "Again: " key_2
echo
key_3=
if [ "$key_1" = "$key_2" ]
then
tput clear
until [ "$key_3" = "$key_2" ]
do
read key_3
done
else
echo "locktty: keys do not match" 1>&2
fi
stty echo

Forget your password for locktty?
tip If you forget your key (password), you will need to log in from another (virtual) terminal and give
a command to kill the process running locktty (e.g., killall –9 locktty).
trap builtin

The trap builtin (page 483; not in tcsh) at the beginning of the locktty script stops a
user from being able to terminate the script by sending it a signal (for example, by
pressing the interrupt key). Trapping signal 20 means that no one can use CONTROL-Z
(job control, a stop from a tty) to defeat the lock. Table 10-5 on page 483 provides
a list of signals. The stty –echo command (page 960) turns on keyboard echo (causes
the terminal not to display characters typed at the keyboard), preventing the key the
user enters from appearing on the screen. After turning off keyboard echo, the script
prompts the user for a key, reads it into the user-created variable key_1, prompts the
user to enter the same key again, and saves it in key_2. The statement key_3= creates
a variable with a NULL value. If key_1 and key_2 match, locktty clears the screen (with
the tput command) and starts an until loop. The until loop keeps reading from the terminal and assigning the input to the key_3 variable. Once the user types a string that

Control Structures

441

matches one of the original keys (key_2), the until loop terminates and keyboard echo
is turned on again.

break and continue
You can interrupt a for, while, or until loop by using a break or continue statement.
The break statement transfers control to the statement following the done statement,
thereby terminating execution of the loop. The continue command transfers control
to the done statement, continuing execution of the loop.
The following script demonstrates the use of these two statements. The for...in structure loops through the values 1–10. The first if statement executes its commands
when the value of the index is less than or equal to 3 ($index –le 3). The second if
statement executes its commands when the value of the index is greater than or equal
to 8 ($index –ge 8). In between the two ifs, echo displays the value of the index. For
all values up to and including 3, the first if statement displays continue, executes a
continue statement that skips echo $index and the second if statement, and continues
with the next for statement. For the value of 8, the second if statement displays the
word break and executes a break statement that exits from the for loop.
$ cat brk
for index in 1 2 3 4 5 6 7 8 9 10
do
if [ $index -le 3 ] ; then
echo "continue"
continue
fi
#
echo $index
#
if [ $index -ge 8 ] ; then
echo "break"
break
fi
done
$ ./brk
continue
continue
continue
4
5
6
7
8
break
$

442 Chapter 10 Programming the Bourne Again Shell (bash)

case

test-string
=
pattern-1
?

commands-1

test-string
=
pattern-2
?

commands-2

test-string
=
pattern-3
?

commands-3

esac

Figure 10-7

A case flowchart

case
The case structure (Figure 10-7; tcsh uses switch) is a multiple-branch decision mechanism. The path taken through the structure depends on a match or lack of a match
between the test-string and one of the patterns. When the test-string matches one of
the patterns, the shell transfers control to the commands following the pattern. The
commands are terminated by a double semicolon (;;) control operator. When control
reaches this control operator, the shell transfers control to the command following
the esac statement. The case control structure has the following syntax:
case test-string in
pattern-1)
commands-1
;;

Control Structures

443

pattern-2)
commands-2
;;
pattern-3)
commands-3
;;
...
esac
The following case structure uses the character the user enters as the test-string. This
value is held in the variable letter. If the test-string has a value of A, the structure executes the command following the pattern A. The right parenthesis is part of the case
control structure, not part of the pattern. If the test-string has a value of B or C, the
structure executes the command following the matching pattern. The asterisk (*)
indicates any string of characters and serves as a catchall in case there is no match.
If no pattern matches the test-string and if there is no catchall (*) pattern, control
passes to the command following the esac statement, without the case structure taking any action.
$ cat case1
read -p "Enter A, B, or C: " letter
case "$letter" in
A)
echo "You entered A"
;;
B)
echo "You entered B"
;;
C)
echo "You entered C"
;;
*)
echo "You did not enter A, B, or C"
;;
esac
$ ./case1
Enter A, B, or C: B
You entered B

The next execution of case1 shows the user entering a lowercase b. Because the teststring b does not match the uppercase B pattern (or any other pattern in the case
statement), the program executes the commands following the catchall pattern and
displays a message:
$ ./case1
Enter A, B, or C: b
You did not enter A, B, or C

444 Chapter 10 Programming the Bourne Again Shell (bash)

The pattern in the case structure is a glob (it is analogous to an ambiguous file reference).
It can include any special characters and strings shown in Table 10-2.

Table 10-2

Patterns

Pattern

Function

*

Matches any string of characters. Use for the default case.

?

Matches any single character.

[...]

Defines a character class. Any characters enclosed within brackets are tried,
one at a time, in an attempt to match a single character. A hyphen between two
characters specifies a range of characters.

|

Separates alternative choices that satisfy a particular branch of the case
structure.

The next script accepts both uppercase and lowercase letters:
$ cat case2
read -p "Enter A, B, or C: " letter
case "$letter" in
a|A)
echo "You entered A"
;;
b|B)
echo "You entered B"
;;
c|C)
echo "You entered C"
;;
*)
echo "You did not enter A, B, or C"
;;
esac
$ ./case2
Enter A, B, or C: b
You entered B

optional The following example shows how to use the case structure to create a simple menu.
The command_menu script uses echo to present menu items and prompt the user for
a selection. (The select control structure [page 448] is a much easier way of coding a
menu.) The case structure then executes the appropriate utility depending on the
user’s selection.
$ cat command_menu
#!/bin/bash
# menu interface to simple commands
echo -e "\n
COMMAND MENU\n"
echo " a. Current date and time"
echo " b. Users currently logged in"

Control Structures

445

echo " c. Name of the working directory"
echo -e " d. Contents of the working directory\n"
read -p "Enter a, b, c, or d: " answer
echo
#
case "$answer" in
a)
date
;;
b)
who
;;
c)
pwd
;;
d)
ls
;;
*)
echo "There is no selection: $answer"
;;
esac
$ ./command_menu
COMMAND MENU
a.
b.
c.
d.

Current date and time
Users currently logged in
Name of the working directory
Contents of the working directory

Enter a, b, c, or d: a
Sun Jan
echo –e

6 12:31:12 PST 2013

The –e option causes echo to interpret \n as a NEWLINE character. If you do not include
this option, echo does not output the extra blank lines that make the menu easy to
read but instead outputs the (literal) two-character sequence \n. The –e option causes
echo to interpret several other backslash-quoted characters (Table 10-3). Remember
to quote (i.e., place double quotation marks around the string) the backslash-quoted
character so the shell does not interpret it but rather passes the backslash and the
character to echo. See xpg_echo (page 353) for a way to avoid using the –e option.

Table 10-3

Special characters in echo (must use –e)

Quoted
character

echo displays

\a

Alert (bell)

\b

BACKSPACE

\c

Suppress trailing NEWLINE

446 Chapter 10 Programming the Bourne Again Shell (bash)

Table 10-3

Special characters in echo (must use –e) (continued)

Quoted
character

echo displays

\f

FORMFEED

\n

NEWLINE

\r

RETURN

\t

Horizontal TAB

\v

Vertical TAB

\\

Backslash

\nnn

The character with the ASCII octal code nnn; if nnn is not valid, echo displays
the string literally

You can also use the case control structure to take various actions in a script, depending
on how many arguments the script is called with. The following script, named safedit,
uses a case structure that branches based on the number of command-line arguments
($#). It calls vim and saves a backup copy of a file you are editing.
$ cat safedit
#!/bin/bash
PATH=/bin:/usr/bin
script=$(basename $0)
case $# in
0)
vim
exit 0
;;
1)
if [ ! -f "$1" ]
then
vim "$1"
exit 0
fi
if [ ! -r "$1" -o ! -w "$1" ]
then
echo "$script: check permissions on $1" 1>&2
exit 1
else
editfile=$1
fi
if [ ! -w "." ]
then
echo "$script: backup cannot be " \
"created in the working directory" 1>&2
exit 1
fi
;;

Control Structures

447

*)
echo "Usage: $script [file-to-edit]" 1>&2
exit 1
;;
esac
tempfile=/tmp/$$.$script
cp $editfile $tempfile
if vim $editfile
then
mv $tempfile bak.$(basename $editfile)
echo "$script: backup file created"
else
mv $tempfile editerr
echo "$script: edit error--copy of " \
"original file is in editerr" 1>&2
fi

If you call safedit without any arguments, the case structure executes its first branch
and calls vim without a filename argument. Because an existing file is not being edited,
safedit does not create a backup file. (See the :w command on page 173 for an explanation of how to exit from vim when you have called it without a filename.) If you
call safedit with one argument, it runs the commands in the second branch of the case
structure and verifies that the file specified by $1 does not yet exist or is the name of
a file for which the user has read and write permission. The safedit script also verifies
that the user has write permission for the working directory. If the user calls safedit
with more than one argument, the third branch of the case structure presents a usage
message and exits with a status of 1.
Set PATH

At the beginning of the script, the PATH variable is set to search /bin and /usr/bin.
Setting PATH in this way ensures that the commands executed by the script are standard utilities, which are kept in those directories. By setting this variable inside a
script, you can avoid the problems that might occur if users have set PATH to search
their own directories first and have scripts or programs with the same names as the
utilities the script calls. You can also include absolute pathnames within a script to
achieve this end, although this practice can make a script less portable.

Name of the
program

The next line declares a variable named script and initializes it with the simple filename
of the script:
script=$(basename $0)

The basename utility sends the simple filename component of its argument to standard output, which is assigned to the script variable, using command substitution.
The $0 holds the command the script was called with (page 458). No matter which
of the following commands the user calls the script with, the output of basename is
the simple filename safedit:
$ /home/max/bin/safedit memo
$ ./safedit memo
$ safedit memo

After the script variable is set, it replaces the filename of the script in usage and error
messages. By using a variable that is derived from the command that invoked the

448 Chapter 10 Programming the Bourne Again Shell (bash)

script rather than a filename that is hardcoded into the script, you can create links to
the script or rename it, and the usage and error messages will still provide accurate
information.
Naming
temporary files

Another feature of safedit relates to the use of the $$ parameter in the name of a temporary file. The statement following the esac statement creates and assigns a value to
the tempfile variable. This variable contains the name of a temporary file that is
stored in the /tmp directory, as are many temporary files. The temporary filename
begins with the PID number of the shell and ends with the name of the script. Using
the PID number ensures that the filename is unique. Thus safedit will not attempt to
overwrite an existing file, as might happen if two people were using safedit at the
same time. The name of the script is appended so that, should the file be left in /tmp
for some reason, you can figure out where it came from.
The PID number is used in front of—rather than after—$script in the filename
because of the 14-character limit placed on filenames by some older versions of
UNIX. Linux systems do not have this limitation. Because the PID number ensures
the uniqueness of the filename, it is placed first so that it cannot be truncated. (If the
$script component is truncated, the filename is still unique.) For the same reason,
when a backup file is created inside the if control structure a few lines down in the
script, the filename consists of the string bak. followed by the name of the file being
edited. On an older system, if bak were used as a suffix rather than a prefix and the
original filename were 14 characters long, .bak might be lost and the original file
would be overwritten. The basename utility extracts the simple filename of $editfile
before it is prefixed with bak..
The safedit script uses an unusual test-command in the if structure: vim $editfile. The
test-command calls vim to edit $editfile. When you finish editing the file and exit from
vim, vim returns an exit code. The if control structure uses that exit code to determine
which branch to take. If the editing session completed successfully, vim returns 0 and
the statements following the then statement are executed. If vim does not terminate
normally (as would occur if the user killed [page 846] the vim process), vim returns a
nonzero exit status and the script executes the statements following else.

select
The select control structure (not in tcsh) is based on the one found in the Korn Shell.
It displays a menu, assigns a value to a variable based on the user’s choice of items,
and executes a series of commands. The select control structure has the following
syntax:
select varname [in arg . . . ]
do
commands
done
The select structure displays a menu of the arg items. If you omit the keyword in and
the list of arguments, select uses the positional parameters in place of the arg items. The

Control Structures

449

menu is formatted with numbers before each item. For example, a select structure that
begins with
select fruit in apple banana blueberry kiwi orange watermelon STOP

displays the following menu:
1) apple
2) banana

3) blueberry
4) kiwi

5) orange
6) watermelon

7) STOP

The select structure uses the values of the LINES (default is 24) and COLUMNS
(default is 80) variables to specify the size of the display. With COLUMNS set to 20,
the menu looks like this:
1)
2)
3)
4)
5)
6)
7)
PS3

apple
banana
blueberry
kiwi
orange
watermelon
STOP

After displaying the menu, select displays the value of PS3, the select prompt. The
default value of PS3 is ?#, but it is typically set to a more meaningful value. When
you enter a valid number (one in the menu range) in response to the PS3 prompt,
select sets varname to the argument corresponding to the number you entered. An
invalid entry causes the shell to set varname to null. Either way, select stores your
response in the keyword variable REPLY and then executes the commands between
do and done. If you press RETURN without entering a choice, the shell redisplays the
menu and the PS3 prompt.
The select structure continues to issue the PS3 prompt and execute the commands until
something causes it to exit—typically a break or exit statement. A break statement
exits from the loop and an exit statement exits from the script.
The following script illustrates the use of select:
$ cat fruit2
#!/bin/bash
PS3="Choose your favorite fruit from these possibilities: "
select FRUIT in apple banana blueberry kiwi orange watermelon STOP
do
if [ "$FRUIT" == "" ]; then
echo -e "Invalid entry.\n"
continue
elif [ $FRUIT = STOP ]; then
echo "Thanks for playing!"
break
fi
echo "You chose $FRUIT as your favorite."
echo -e "That is choice number $REPLY.\n"
done

450 Chapter 10 Programming the Bourne Again Shell (bash)
$ ./fruit2
1) apple
3) blueberry
5) orange
7) STOP
2) banana
4) kiwi
6) watermelon
Choose your favorite fruit from these possibilities: 3
You chose blueberry as your favorite.
That is choice number 3.
Choose your favorite fruit from these possibilities: 99
Invalid entry.
Choose your favorite fruit from these possibilities: 7
Thanks for playing!

After setting the PS3 prompt and establishing the menu with the select statement,
fruit2 executes the commands between do and done. If the user submits an invalid
entry, the shell sets varname ($FRUIT) to a null value. If $FRUIT is null, echo displays an error message; continue then causes the shell to redisplay the PS3 prompt.
If the entry is valid, the script tests whether the user wants to stop. If so, echo displays
an appropriate message and break exits from the select structure (and from the
script). If the user enters a valid response and does not want to stop, the script displays the name and number of the user’s response. (See page 445 for information
about the echo –e option.)

Here Document
A Here document allows you to redirect input to a shell script from within the shell
script itself. A Here document is so named because it is here—immediately accessible
in the shell script—instead of there, perhaps in another file.
The following script, named birthday, contains a Here document. The two less than
symbols (<<) in the first line indicate a Here document follows. One or more characters that delimit the Here document follow the less than symbols—this example
uses a plus sign. Whereas the opening delimiter must appear adjacent to the less than
symbols, the closing delimiter must be on a line by itself. The shell sends everything
between the two delimiters to the process as standard input. In the example it is as
though you have redirected standard input to grep from a file, except that the file is
embedded in the shell script:
$ cat birthday
grep -i "$1" <<+
Max
June 22
Barbara February 3
Darlene May 8
Helen
March 13
Zach
January 23
Nancy
June 26
+
$ ./birthday Zach
Zach
January 23

Control Structures

451

$ ./birthday june
Max
June 22
Nancy
June 26

When you run birthday, it lists all the Here document lines that contain the argument
you called it with. In this case the first time birthday is run, it displays Zach’s birthday
because it is called with an argument of Zach. The second run displays all the birthdays
in June. The –i argument causes grep’s search not to be case sensitive.

optional The next script, named bundle,1 includes a clever use of a Here document. The bundle
script is an elegant example of a script that creates a shell archive (shar) file. The
script creates a file that is itself a shell script containing several other files as well as
the code needed to re-create the original files:
$ cat bundle
#!/bin/bash
# bundle: group files into distribution package
echo "# To unbundle, bash this file"
for i
do
echo "echo $i 1>&2"
echo "cat >$i <<'End of $i'"
cat $i
echo "End of $i"
done

Just as the shell does not treat special characters that occur in standard input of a shell
script as special, so the shell does not treat the special characters that occur between
the delimiters in a Here document as special.
As the following example shows, the output of bundle is a shell script, which is redirected to a file named bothfiles. It contains the contents of each file given as an
argument to bundle (file1 and file2 in this case) inside a Here document. To extract the
original files from bothfiles, you simply give it as an argument to a bash command.
Before each Here document is a cat command that causes the Here document to be
written to a new file when bothfiles is run:
$ cat file1
This is a file.
It contains two lines.
$ cat file2
This is another file.
It contains
three lines.

1. Thanks to Brian W. Kernighan and Rob Pike, The Unix Programming Environment (Englewood Cliffs,
N.J.: Prentice-Hall, 1984), 98. Reprinted with permission.

452 Chapter 10 Programming the Bourne Again Shell (bash)
$ ./bundle file1 file2 > bothfiles
$ cat bothfiles
# To unbundle, bash this file
echo file1 1>&2
cat >file1 <<'End of file1'
This is a file.
It contains two lines.
End of file1
echo file2 1>&2
cat >file2 <<'End of file2'
This is another file.
It contains
three lines.
End of file2

In the next example, file1 and file2 are removed before bothfiles is run. The bothfiles
script echoes the names of the files it creates as it creates them. The ls command then
shows that bothfiles has re-created file1 and file2:
$ rm file1 file2
$ bash bothfiles
file1
file2
$ ls
bothfiles
file1
file2

File Descriptors
As discussed on page 282, before a process can read from or write to a file, it must
open that file. When a process opens a file, Linux associates a number (called a file
descriptor) with the file. A file descriptor is an index into the process’s table of open
files. Each process has its own set of open files and its own file descriptors. After
opening a file, a process reads from and writes to that file by referring to its file
descriptor. When it no longer needs the file, the process closes the file, freeing the file
descriptor.
A typical Linux process starts with three open files: standard input (file descriptor 0),
standard output (file descriptor 1), and standard error (file descriptor 2). Often these
are the only files the process needs. Recall that you redirect standard output with the
symbol > or the symbol 1> and that you redirect standard error with the symbol 2>.
Although you can redirect other file descriptors, because file descriptors other than
0, 1, and 2 do not have any special conventional meaning, it is rarely useful to do so.
The exception is in programs that you write yourself, in which case you control the
meaning of the file descriptors and can take advantage of redirection.

File Descriptors 453

Opening a File Descriptor
The Bourne Again Shell opens files using the exec builtin with the following syntax:
exec n> outfile
exec m< infile
The first line opens outfile for output and holds it open, associating it with file
descriptor n. The second line opens infile for input and holds it open, associating it
with file descriptor m.

Duplicating a File Descriptor
The <& token duplicates an input file descriptor; >& duplicates an output file
descriptor. You can duplicate a file descriptor by making it refer to the same file as
another open file descriptor, such as standard input or output. Use the following
syntax to open or redirect file descriptor n as a duplicate of file descriptor m:
exec n<&m
Once you have opened a file, you can use it for input and output in two ways. First,
you can use I/O redirection on any command line, redirecting standard output to a
file descriptor with >&n or redirecting standard input from a file descriptor with
<&n. Second, you can use the read (page 477) and echo builtins. If you invoke other
commands, including functions (page 346), they inherit these open files and file
descriptors. When you have finished using a file, you can close it using the following
syntax:
exec n<&–

File Descriptor Examples
When you call the following mycp function with two arguments, it copies the file
named by the first argument to the file named by the second argument. If you supply
only one argument, the script copies the file named by the argument to standard output. If you invoke mycp with no arguments, it copies standard input to standard
output.

A function is not a shell script
tip The mycp example is a shell function; it will not work as you expect if you execute it as a shell
script. (It will work: The function will be created in a very short-lived subshell, which is of little use.)
You can enter this function from the keyboard. If you put the function in a file, you can run it as
an argument to the . (dot) builtin (page 280). You can also put the function in a startup file if you
want it to be always available (page 348).

454 Chapter 10 Programming the Bourne Again Shell (bash)
function mycp () {
case $# in
0)
# Zero arguments
# File descriptor 3 duplicates standard input
# File descriptor 4 duplicates standard output
exec 3<&0 4<&1
;;
1)
# One argument
# Open the file named by the argument for input
# and associate it with file descriptor 3
# File descriptor 4 duplicates standard output
exec 3< $1 4<&1
;;
2)
# Two arguments
# Open the file named by the first argument for input
# and associate it with file descriptor 3
# Open the file named by the second argument for output
# and associate it with file descriptor 4
exec 3< $1 4> $2
;;
*)
echo "Usage: mycp [source [dest]]"
return 1
;;
esac
# Call cat with input coming from file descriptor 3
# and output going to file descriptor 4
cat <&3 >&4
# Close file descriptors 3 and 4
exec 3<&- 4<&}

The real work of this function is done in the line that begins with cat. The rest of the
script arranges for file descriptors 3 and 4, which are the input and output of the cat
command, respectively, to be associated with the appropriate files.

optional The next program takes two filenames on the command line, sorts both, and sends
the output to temporary files. The program then merges the sorted files to standard
output, preceding each line by a number that indicates which file it came from.
$ cat sortmerg
#!/bin/bash
usage () {
if [ $# -ne 2 ]; then
echo "Usage: $0 file1 file2" 2>&1
exit 1
fi
}
# Default temporary directory
: ${TEMPDIR:=/tmp}

File Descriptors 455
# Check argument count
usage "$@"
# Set up temporary files for sorting
file1=$TEMPDIR/$$.file1
file2=$TEMPDIR/$$.file2
# Sort
sort $1 > $file1
sort $2 > $file2
# Open $file1 and $file2 for reading. Use file descriptors 3 and 4.
exec 3<$file1
exec 4<$file2
# Read the first line from each file to figure out how to start.
read Line1 <&3
status1=$?
read Line2 <&4
status2=$?
# Strategy: while there is still input left in both files:
#
Output the line that should come first.
#
Read a new line from the file that line came from.
while [ $status1 -eq 0 -a $status2 -eq 0 ]
do
if [[ "$Line2" > "$Line1" ]]; then
echo -e "1.\t$Line1"
read -u3 Line1
status1=$?
else
echo -e "2.\t$Line2"
read -u4 Line2
status2=$?
fi
done
# Now one of the files is at end of file.
# Read from each file until the end.
# First file1:
while [ $status1 -eq 0 ]
do
echo -e "1.\t$Line1"
read Line1 <&3
status1=$?
done
# Next file2:
while [[ $status2 -eq 0 ]]
do
echo -e "2.\t$Line2"
read Line2 <&4
status2=$?
done
# Close and remove both input files
exec 3<&- 4<&rm -f $file1 $file2
exit 0

456 Chapter 10 Programming the Bourne Again Shell (bash)

Determining Whether a File Descriptor Is Associated
with the Terminal
The test –t criterion takes an argument of a file descriptor and causes test to return
a value of 0 (true) or not 0 (false) based on whether the specified file descriptor is
associated with the terminal (screen or keyboard). It is typically used to determine
whether standard input, standard output, and/or standard error is coming
from/going to the terminal.
In the following example, the is.term script uses the test –t criterion ([ ] is a synonym
for test; page 978) to see if file descriptor 1 (initially standard output) of the process
running the shell script is associated with the screen. The message the script displays
is based on whether test returns true (file descriptor 1 is associated with the screen)
or false (file descriptor 1 is not associated with the screen).
$ cat is.term
if [ -t 1 ] ; then
echo "FD 1 (stdout) IS going to the screen"
else
echo "FD 1 (stdout) is NOT going to the screen"
fi

When you run is.term without redirecting standard output, the script displays FD 1
(stdout) IS going to the screen because standard output of the is.term script has not
been redirected:
$ ./is.term
FD 1 (stdout) IS going to the screen

When you redirect standard output of a program using > on the command line, bash
closes file descriptor 1 and then reopens it, associating it with the file specified following
the redirect symbol.
The next example redirects standard output of the is.term script: The newly opened file
descriptor 1 associates standard output with the file named hold. Now the test command
([ –t 1 ]) fails, returning a value of 1 (false), because standard output is not associated
with a terminal. The script writes FD 1 (stdout) is NOT going to the screen to hold:
$ ./is.term > hold
$ cat hold
FD 1 (stdout) is NOT going to the screen

If you redirect standard error from is.term, the script will report FD 1 (stdout) IS going
to the screen and will write nothing to the file receiving the redirection; standard output
has not been redirected. You can use [ –t 2 ] to test if standard error is going to the
screen:
$ ./is.term 2> hold
FD 1 (stdout) IS going to the screen

File Descriptors 457

In a similar manner, if you send standard output of is.term through a pipeline, test
reports standard output is not associated with a terminal. In this example, cat copies
standard input to standard output:
$ ./is.term | cat
FD 1 (stdout) is NOT going to the screen

optional You can also experiment with test on the command line. This technique allows you
to make changes to your experimental code quickly by taking advantage of command
history and editing (page 328). To better understand the following examples, first
verify that test (called as [ ]) returns a value of 0 (true) when file descriptor 1 is associated with the screen and a value other than 0 (false) when file descriptor 1 is not
associated with the screen. The $? special parameter (page 465) holds the exit status
of the previous command.
$ [ -t 1 ]
$ echo $?
0
$ [ -t 1 ] > hold
$ echo $?
1

As explained on page 292, the && (AND) control operator first executes the command preceding it. Only if that command returns a value of 0 (true) does && execute
the command following it. In the following example, if [ –t 1 ] returns 0, && executes
echo "FD 1 to screen". Although the parentheses (page 292) are not required in this
example, they are needed in the next one.
$ ( [ -t 1 ] && echo "FD 1 to screen" )
FD 1 to screen

Next, the output from the same command line is sent through a pipeline to cat, so
test returns 1 (false) and && does not execute echo.
$ ( [ -t 1 ] && echo "FD 1 to screen" ) | cat
$

The following example is the same as the previous one, except test checks whether file
descriptor 2 is associated with the screen. Because the pipeline redirects only standard
output, test returns 0 (true) and && executes echo.
$ ( [ -t 2 ] && echo "FD 2 to screen" ) | cat
FD 2 to screen

In this example, test checks whether file descriptor 2 is associated with the screen(it is)
and echo sends its output to file descriptor 1 (which goes through the pipeline to cat).

458 Chapter 10 Programming the Bourne Again Shell (bash)

Parameters
Shell parameters were introduced on page 300. This section goes into more detail
about positional parameters and special parameters.

Positional Parameters
Positional parameters comprise the command name and command-line arguments.
These parameters are called positional because you refer to them by their position on
the command line. You cannot use an assignment statement to change the value of a
positional parameter. However, the bash set builtin (page 460) enables you to change
the value of any positional parameter except the name of the calling program (the
command name). The tcsh set builtin does not change the values of positional
parameters.

$0: Name of the Calling Program
The shell expands $0 to the name of the calling program (the command you used to
call the program—usually the name of the program you are running). This parameter
is numbered zero because it appears before the first argument on the command line:
$ cat abc
echo "This script was called by typing $0"
$ ./abc
This script was called by typing ./abc
$ /home/sam/abc
This script was called by typing /home/sam/abc

The preceding shell script uses echo to verify the way the script you are executing was
called. You can use the basename utility and command substitution to extract the
simple filename of the script:
$ cat abc2
echo "This script was called by typing $(basename $0)"
$ /home/sam/abc2
This script was called by typing abc2

When you call a script through a link, the shell expands $0 to the value of the link.
The busybox utility (page 729) takes advantage of this feature so it knows how it was
called and which utility to run.
$ ln -s abc2 mylink
$ /home/sam/mylink
This script was called by typing mylink

When you display the value of $0 from an interactive shell, the shell displays its name
because that is the name of the calling program (the program you are running).
$ echo $0
bash

Parameters

459

bash versus –bash
tip On some systems, echo $0 displays –bash while on others it displays bash. The former indicates
a login shell (page 278); the latter indicates a shell that is not a login shell. In a GUI environment,
some terminal emulators launch login shells while others do not.

$1–$n: Positional Parameters
The shell expands $1 to the first argument on the command line, $2 to the second
argument, and so on up to $n. These parameters are short for ${1}, ${2}, ${3}, and
so on. For values of n less than or equal to 9, the braces are optional. For values of
n greater than 9, the number must be enclosed within braces. For example, the
twelfth positional parameter is represented by ${12}. The following script displays
positional parameters that hold command-line arguments:
$ cat display_5args
echo First 5 arguments are $1 $2 $3 $4 $5
$ ./display_5args zach max helen
First 5 arguments are zach max helen

The display_5args script displays the first five command-line arguments. The shell
expands each parameter that represents an argument that is not present on the
command line to a null string. Thus the $4 and $5 parameters have null values in
this example.

Always quote positional parameters
caution You can “lose” positional parameters if you do not quote them. See the following text for an
example.

Enclose references to positional parameters between double quotation marks. The
quotation marks are particularly important when you are using positional parameters as arguments to commands. Without double quotation marks, a positional
parameter that is not set or that has a null value disappears:
$ cat showargs
echo "$0 was called with $# arguments, the first is :$1:."
$ ./showargs a b c
./showargs was called with 3 arguments, the first is :a:.
$ echo $xx
$ ./showargs $xx a b c
./showargs was called with 3 arguments, the first is :a:.
$ ./showargs "$xx" a b c
./showargs was called with 4 arguments, the first is ::.

The showargs script displays the number of arguments it was called with ($#) followed
by the value of the first argument between colons. In the preceding example, showargs

460 Chapter 10 Programming the Bourne Again Shell (bash)

is initially called with three arguments. Next the echo command shows that the $xx
variable, which is not set, has a null value. The $xx variable is the first argument to the
second and third showargs commands; it is not quoted in the second command and
quoted using double quotation marks in the third command. In the second showargs
command, the shell expands the arguments to a b c and passes showargs three arguments. In the third showargs command, the shell expands the arguments to "" a b c,
which results in calling showargs with four arguments. The difference in the two calls
to showargs illustrates a subtle potential problem when using positional parameters
that might not be set or that might have a null value.

set: Initializes Positional Parameters
When you call the set builtin with one or more arguments, it assigns the values of the
arguments to the positional parameters, starting with $1 (not in tcsh). The following
script uses set to assign values to the positional parameters $1, $2, and $3:
$ cat set_it
set this is it
echo $3 $2 $1
$ ./set_it
it is this

optional A single hyphen (–) on a set command line marks the end of options and the start of
values the shell assigns to positional parameters. A – also turns off the xtrace (–x) and
verbose (–v) options (Table 8-13 on page 351). The following set command turns on
posix mode and sets the first two positional parameters as shown by the echo
command:
$ set -o posix - first.param second.param
$ echo $*
first.param second.param

A double hyphen (––) on a set command line without any following arguments unsets
the positional parameters; when followed by arguments, – – sets the positional
parameters, including those that begin with a hyphen (–).
$ set -$ echo $*
$

Combining command substitution (page 361) with the set builtin is a convenient way to
alter standard output of a command to a form that can be easily manipulated in a shell
script. The following script shows how to use date and set to provide the date in a useful
format. The first command shows the output of date. Then cat displays the contents of
the dateset script. The first command in this script uses command substitution to set the
positional parameters to the output of the date utility. The next command, echo $*, displays all positional parameters resulting from the previous set. Subsequent commands
display the values of $1, $2, $3, and $6. The final command displays the date in a format
you can use in a letter or report.

Parameters

461

$ date
Wed Aug 15 17:35:29 PDT 2012
$ cat dateset
set $(date)
echo $*
echo
echo "Argument 1: $1"
echo "Argument 2: $2"
echo "Argument 3: $3"
echo "Argument 6: $6"
echo
echo "$2 $3, $6"
$ ./dateset
Wed Aug 15 17:35:34 PDT 2012
Argument
Argument
Argument
Argument

1:
2:
3:
6:

Wed
Aug
15
2012

Aug 15, 2012

You can also use the +format argument to date (page 769) to specify the content and
format of its output.
set displays shell
variables

When called without arguments, set displays a list of the shell variables that are set,
including user-created variables and keyword variables. Under bash, this list is the
same as that displayed by declare (page 305) when it is called without any arguments.
$ set
BASH_VERSION='4.2.24(1)-release'
COLORS=/etc/DIR_COLORS
COLUMNS=89
LESSOPEN='||/usr/bin/lesspipe.sh %s'
LINES=53
LOGNAME=sam
MAIL=/var/spool/mail/sam
MAILCHECK=60
...

The bash set builtin can also perform other tasks. For more information refer to “set:
Works with Shell Features, Positional Parameters, and Variables” on page 472.

shift: Promotes Positional Parameters
The shift builtin promotes each positional parameter. The first argument (which was
represented by $1) is discarded. The second argument (which was represented by $2)
becomes the first argument (now $1), the third argument becomes the second, and
so on. Because no “unshift” command exists, you cannot bring back arguments that
have been discarded. An optional argument to shift specifies the number of positions
to shift (and the number of arguments to discard); the default is 1.

462 Chapter 10 Programming the Bourne Again Shell (bash)

The following demo_shift script is called with three arguments. Double quotation
marks around the arguments to echo preserve the spacing of the output but allow the
shell to expand variables. The program displays the arguments and shifts them
repeatedly until no arguments are left to shift.
$ cat demo_shift
echo "arg1= $1
shift
echo "arg1= $1
shift
echo "arg1= $1
shift
echo "arg1= $1
shift

arg2= $2

arg3= $3"

arg2= $2

arg3= $3"

arg2= $2

arg3= $3"

arg2= $2

arg3= $3"

$ ./demo_shift alice helen zach
arg1= alice
arg2= helen
arg3= zach
arg1= helen
arg2= zach
arg3=
arg1= zach
arg2=
arg3=
arg1=
arg2=
arg3=

Repeatedly using shift is a convenient way to loop over all command-line arguments
in shell scripts that expect an arbitrary number of arguments. See page 425 for a shell
script that uses shift.

$* and $@: Expand to All Positional Parameters
The shell expands the $* parameter to all positional parameters, as the display_all

program demonstrates:

$ cat display_all
echo All arguments are $*
$ ./display_all a b c d e f g h i j k l m n o p
All arguments are a b c d e f g h i j k l m n o p

*

"$ " versus "$@"
The $* and $@ parameters work the same way except when they are enclosed within
double quotation marks. Using "$*" yields a single argument with the first character
in IFS (page 311; normally a SPACE ) between the positional parameters. Using "$@"
produces a list wherein each positional parameter is a separate argument. This difference typically makes "$@" more useful than "$*" in shell scripts.
The following scripts help explain the difference between these two parameters. In
the second line of both scripts, the single quotation marks keep the shell from interpreting the enclosed special characters, allowing the shell to pass them to echo so echo
can display them. The bb1 script shows that set "$*" assigns multiple arguments to
the first command-line parameter.

Parameters

463

$ cat bb1
set "$*"
echo $# parameters with '"$*"'
echo 1: $1
echo 2: $2
echo 3: $3
$ ./bb1 a b c
1 parameters with "$*"
1: a b c
2:
3:

The bb2 script shows that set "$@" assigns each argument to a different command-line
parameter.
$ cat bb2
set "$@"
echo $# parameters with '"$@"'
echo 1: $1
echo 2: $2
echo 3: $3
$ ./bb2 a b c
3 parameters with "$@"
1: a
2: b
3: c

Special Parameters
Special parameters enable you to access useful values pertaining to positional parameters and the execution of shell commands. As with positional parameters, the shell
expands a special parameter when it is preceded by a $. Also as with positional
parameters, you cannot modify the value of a special parameter using an assignment
statement.

$#: Number of Positional Parameters
The shell expands $# to the decimal number of arguments on the command line
(positional parameters), not counting the name of the calling program:
$ cat num_args
echo "This script was called with $# arguments."
$ ./num_args sam max zach
This script was called with 3 arguments.

The next example shows set initializing four positional parameters and echo displaying
the number of parameters set initialized:
$ set a b c d; echo $#
4

464 Chapter 10 Programming the Bourne Again Shell (bash)

$$: PID Number
The shell expands the $$ parameter to the PID number of the process that is executing
it. In the following interaction, echo displays the value of this parameter and the ps utility
confirms its value. Both commands show the shell has a PID number of 5209:
$ echo $$
5209
$ ps
PID TTY
5209 pts/1
6015 pts/1

TIME CMD
00:00:00 bash
00:00:00 ps

Because echo is built into the shell, the shell does not create another process when you
give an echo command. However, the results are the same whether echo is a builtin
or not, because the shell expands $$ before it forks a new process to run a command.
Try giving this command using the echo utility (/bin/echo), which is run by another
process, and see what happens.
Naming
temporary files

In the following example, the shell substitutes the value of $$ and passes that value
to cp as a prefix for a filename:
$ echo $$
8232
$ cp memo $$.memo
$ ls
8232.memo memo

Incorporating a PID number in a filename is useful for creating unique filenames
when the meanings of the names do not matter; this technique is often used in shell
scripts for creating names of temporary files. When two people are running the same
shell script, having unique filenames keeps the users from inadvertently sharing the
same temporary file.
The following example demonstrates that the shell creates a new shell process when
it runs a shell script. The id2 script displays the PID number of the process running
it (not the process that called it—the substitution for $$ is performed by the shell that
is forked to run id2):
$ cat id2
echo "$0 PID= $$"
$ echo $$
8232
$ ./id2
./id2 PID= 8362
$ echo $$
8232

The first echo displays the PID number of the interactive shell. Then id2 displays its
name ($0) and the PID number of the subshell it is running in. The last echo shows
that the PID number of the interactive shell has not changed.

Parameters

465

$!: PID Number of Most Recent Background Process
The shell expands $! to the value of the PID number of the most recent process
that ran in the background (not in tcsh). The following example executes sleep as
a background task and uses echo to display the value of $!:
$ sleep 60 &
[1] 8376
$ echo $!
8376

$?: Exit Status
When a process stops executing for any reason, it returns an exit status to its parent
process. The exit status is also referred to as a condition code or a return code. The
shell expands the $? ($status under tcsh) parameter to the exit status of the most
recently executed command.
By convention, a nonzero exit status is interpreted as false and means the command
failed; a zero is interpreted as true and indicates the command executed successfully.
In the following example, the first ls command succeeds and the second fails; the exit
status displayed by echo reflects these outcomes:
$ ls es
es
$ echo $?
0
$ ls xxx
ls: xxx: No such file or directory
$ echo $?
1

You can specify the exit status a shell script returns by using the exit builtin, followed by a number, to terminate the script. If you do not use exit with a number
to terminate a script, the exit status of the script is that of the last command the
script ran.
$ cat es
echo This program returns an exit status of 7.
exit 7
$ es
This program returns an exit status of 7.
$ echo $?
7
$ echo $?
0

The es shell script displays a message and terminates execution with an exit command
that returns an exit status of 7, the user-defined exit status in this script. The first echo
then displays the exit status of es. The second echo displays the exit status of the first
echo: This value is 0, indicating the first echo executed successfully.

466 Chapter 10 Programming the Bourne Again Shell (bash)

$–: Flags of Options That Are Set
The shell expands the $– parameter to a string of one-character bash option flags (not
in tcsh). These flags are set by the set or shopt builtins, when bash is invoked, or by
bash itself (e.g., –i). For more information refer to “Controlling bash: Features and
Options” on page 349. The following command displays typical bash option flags for
an interactive shell:
$ echo $himBH

Table 8-13 on page 351 lists each of these flags (except i) as options to set in the
Alternative syntax column. When you start an interactive shell, bash sets the i (interactive) option flag. You can use this flag to determine if a shell is being run
interactively. In the following example, display_flags displays the bash option flags.
When run as a script in a subshell, it shows the i option flag is not set; when run using
source (page 280), which runs a script in the current shell, it shows the i option flag
is set.
$ cat display_flags
echo $$ ./display_flags
hB
$ source ./display_flags
himBH

$_: Last Argument of Previously Executed Command
When bash starts, as when you run a shell script, it expands the $_ parameter to the
pathname of the file it is running. After running a command, it expands this parameter
to the last argument of the previously executed command.
$ cat last_arg
echo $_
echo here I am
echo $_
$ ./last_arg
./last_arg
here I am
am

In the next example, the shell never executes the echo command; it expands $_ to the
last argument of the ls command (which it executed, albeit unsuccessfully).
$ ls xx && echo hi
ls: cannot access xx: No such file or directory
$ echo $_
xx

Variables 467

The tcsh shell expands the $_ parameter to the most recently executed command line.
tcsh $ who am i
sam
pts/1
tcsh $ echo $_
who am i

2013-02-29 16:48 (172.16.192.1)

Variables
Variables, introduced on page 300, are shell parameters denoted by a name. Variables can have zero or more attributes (page 305; e.g., export, readonly). You, or a
shell program, can create and delete variables, and can assign values and attributes
to variables. This section adds to the previous coverage with a discussion of the shell
variables, environment variables, inheritance, expanding null and unset variables,
array variables, and variables in functions.

Shell Variables
By default, when you create a variable it is available only in the shell you created
it in; it is not available in subshells. This type of variable is called a shell variable.
In the following example, the first command displays the PID number of the interactive shell the user is working in (2802) and the second command initializes the
variable x to 5. Then a bash command spawns a new shell (PID 29572). This new
shell is a child of the shell the user was working in (a subprocess; page 323). The
ps –l command shows the PID and PPID (parent PID) numbers of each shell: PID
29572 is a child of PID 2802. The final echo command shows the variable x is
not set in the spawned (child) shell: It is a shell variable and is local to the shell
it was created in.
$ echo $$
2802
$ x=5
$ echo $x
5
$ bash
$ echo $$
29572
$ ps -l
F S
UID
PID PPID
0 S 1000 2802 2786
0 S 1000 29572 2802
0 R 1000 29648 29572
$ echo $x
$

C PRI
0 80
0 80
0 80

NI
0
0
0

ADDR SZ
- 5374
- 5373
- 1707

WCHAN
wait
wait
-

TTY
pts/2
pts/2
pts/2

TIME
00:00:00
00:00:00
00:00:00

CMD
bash
bash
ps

468 Chapter 10 Programming the Bourne Again Shell (bash)

Environment, Environment Variables, and Inheritance
This section explains the concepts of the command execution environment and
inheritance.

Environment
When the Linux kernel invokes a program, the kernel passes to the program a list comprising an array of strings. This list, called the command execution environment or
simply the environment, holds a series of name-value pairs in the form name=value.

Environment Variables
When bash is invoked, it scans its environment and creates parameters for each namevalue pair, assigning the corresponding value to each name. Each of these parameters
is an environment variable; these variables are in the shell’s environment. Environment
variables are sometimes referred to as global variables or exported variables.
Inheritance

A child process (a subprocess; see page 323 for more information about the process
structure) inherits its environment from its parent. An inherited variable is an environment variable for the child, so its children also inherit the variable: All children
and grandchildren, to any level, inherit environment variables from their ancestor. A
process can create, remove, and change the value of environment variables, so a child
process might not inherit the same environment its parent inherited.
Because of process locality (next), a parent cannot see changes a child makes to an
environment variable and a child cannot see changes a parent makes to an environment variable once the child has been spawned (created). Nor can unrelated processes
see changes to variables that have the same name in each process, such as commonly
inherited environment variables (e.g., PATH).

Process Locality: Shell Variables
Variables are local, which means they are specific to a process: Local means local to
a process. For example, when you log in on a terminal or open a terminal emulator,
you start a process that runs a shell. Assume in that shell the LANG environment
variable (page 317) is set to en_US.UTF-8.
If you then log in on a different terminal or open a second terminal emulator, you
start another process that runs a different shell. Assume in that shell the LANG environment variable is also set to en_US.UTF-8. When you change the value of LANG
on the second terminal to de_DE.UTF-8, the value of LANG on the first terminal
does not change. It does not change because variables (both names and values) are
local to a process and each terminal is running a separate process (even though both
processes are running shells).

export: Puts Variables in the Environment
When you run an export command with variable names as arguments, the shell
places the names (and values, if present) of those variables in the environment.
Without arguments, export lists environment (exported) variables.

Variables 469

Under tcsh, setenv (page 386) assigns a value to a variable and places the name (and
value) of that variable in the environment. The examples in this section use the bash
syntax but the theory applies to both shells.
The following extest1 shell script assigns the value of american to the variable
named cheese and then displays its name (the shell expands $0 to the name of the
calling program) and the value of cheese. The extest1 script then calls subtest,
which attempts to display the same information, declares a cheese variable by initializing it, displays the value of the variable, and returns control to the parent
process, which is executing extest1. Finally, extest1 again displays the value of the
original cheese variable.
$ cat extest1
cheese=american
echo "$0 1: $cheese"
./subtest
echo "$0 2: $cheese"
$ cat subtest
echo "$0 1: $cheese"
cheese=swiss
echo "$0 2: $cheese"
$ ./extest1
./extest1 1: american
./subtest 1:
./subtest 2: swiss
./extest1 2: american

The subtest script never receives the value of cheese from extest1 (and extest1 never
loses the value): cheese is a shell variable, not an environment variable (it is not in
the environment of the parent process and therefore is not available in the child process). When a process attempts to display the value of a variable that has not been
declared and is not in the environment, as is the case with subtest, the process displays
nothing; the value of an undeclared variable is that of the null string. The final echo
shows the value of cheese in extest1 has not changed: In bash—unlike in the real
world—a child can never affect its parent’s attributes.
The extest2 script is the same as extest1 except it uses export to put cheese in the environment of the current process. The result is that cheese appears in the environment
of the child process running the subtest script.
$ cat extest2
export cheese=american
echo "$0 1: $cheese"
./subtest
echo "$0 2: $cheese"
$ ./extest2
./extest2 1:
./subtest 1:
./subtest 2:
./extest2 2:

american
american
swiss
american

470 Chapter 10 Programming the Bourne Again Shell (bash)

Here the child process inherits the value of cheese as american and, after displaying this
value, changes its copy to swiss. When control is returned to the parent, the parent’s
copy of cheese retains its original value: american.
Alternatively, as the next program shows, you can put a variable in the environment of
a child shell without declaring it in the parent shell. See page 130 for more information
on this command-line syntax.
$ cheese=cheddar ./subtest
./subtest 1: cheddar
./subtest 2: swiss
$ echo $cheese
$

You can export a variable without/before assigning a value to it. Also, you do not
need to export an already-exported variable after you change its value. For example,
you do not usually need to export PATH when you assign a value to it in
~/.bash_profile because it is typically exported in a global startup file.
You can place several export declarations (initializations) on a single line:
$ export cheese=swiss coffee=colombian avocados=us
Unexport

An export –n or declare +x command removes the export attribute from the named
environment variable (unexports the variable), demoting it to become a shell variable
while preserving its value.

Export a function

An export –f command places the named function (page 346) in the environment so
it is available to child processes.

printenv: Displays Environment Variable Names and Values
The printenv utility displays environment variable names and values. When called
without an argument, it displays all environment variables. When called with the
name of an environment variable, it displays the value of that variable. When called
with the name of a variable that is not in the environment or has not been declared,
it displays nothing. You can also use export (page 468) and env (next page) to display
a list of environment variables.
$ x=5
$ export y=10
$ printenv x
$ printenv y
10
$ printenv
...
SHELL=/bin/bash
TERM=xterm
USER=sam
PWD=/home/sam
y=10
...

# not in the environment
# in the environment

Variables 471

env: Runs a Program in a Modified Environment
The env utility runs a program as a child of the current shell, allowing you to modify
the environment the current shell exports to the newly created process. See page 130
for an easier way to place a variable in the environment of a child process. The env
utility has the following syntax:
env [options] [-] [name=value] ... [command-line]
where options is one of the following options:
––ignore-environment

–i or –
Causes command-line to run in a clean environment; no environment variables
are available to the newly created process.
––unset=name –u name

Unsets the environment variable named name so it is not available to the newly
created process.
Just as on a bash command line (page 130), zero or more name=value pairs may be
used to set or modify environment variables in the newly created process, except you
cannot specify a name without a value. The env utility evaluates the name=value
pairs from left to right, so if name appears more than once in this list, the rightmost
value takes precedence.
The command-line is the command (including any options and arguments) that env
executes. The env utility takes its first argument that does not contain an equal sign
as the beginning of the command line and, if you specify a command that does not
include a slash (i.e., if you specify a simple filename), uses the value of PATH
(page 308) to locate the command. It does not work with builtin commands.
In the following example, env runs display_xx, a script that displays the value of the
xx variable. On the command line, env initializes the variable xx in the environment
of the script it calls and echo in the script displays the value of xx.
$ cat display_xx
echo "Running $0"
echo $xx
$ env xx=remember ./display_xx
Running ./display_xx
remember

If you want to declare only environment variables for a program, it is simpler to use
the following bash syntax (page 130):
$ xx=remember ./display_xx
Running ./display_xx
remember

472 Chapter 10 Programming the Bourne Again Shell (bash)

When called without a command-line, env displays a list of environment variables
(it behaves similarly to printenv [page 470]):
$ env
...
SHELL=/bin/bash
TERM=xterm
USER=sam
PWD=/home/sam
y=10
...

set: Works with Shell Features, Positional Parameters,
and Variables
The set builtin can perform the following tasks:
• Set or unset shell features (also called attributes; page 350).
• Assign values to positional parameters (page 460).
• Display variables that are available to the current shell. These variables
comprise shell variables (variables not in the environment) and environment
variables. The set builtin displays variables in a format you can use in a shell
script or as input to set to declare and initialize variables. Output is sorted
based on the current locale (page 316). You cannot reset readonly variables.
$ set
...
BASH=/bin/bash
COLUMNS=70
PWD=/home/sam
SHELL=/bin/bash
x=5
y=10
...

Expanding Null and Unset Variables
The expression ${name} (or just $name if it is not ambiguous) expands to the value
of the name variable. If name is null or not set, bash expands ${name} to a null string.
The Bourne Again Shell provides the following alternatives to accepting the null
string as the value of the variable:
• Use a default value for the variable.
• Use a default value and assign that value to the variable.
• Display an error.
You can choose one of these alternatives by using a modifier with the variable name.
In addition, you can use set –o nounset (page 353) to cause bash to display an error
message and exit from a script whenever the script references an unset variable.

Variables 473

:– Uses a Default Value
The :– modifier uses a default value in place of a null or unset variable while allowing
a nonnull variable to represent itself:
${name:–default}
The shell interprets :– as “If name is null or unset, expand default and use the
expanded value in place of name; else use name.”
The following command lists the contents of the directory named by the LIT variable.
If LIT is null or unset, it lists the contents of /home/max/literature:
$ ls ${LIT:-/home/max/literature}

The shell expands variables in default:
$ ls ${LIT:-$HOME/literature}

:= Assigns a Default Value
The :– modifier does not change the value of a variable. However, you can change
the value of a null or unset variable to the expanded value of default by using the :=
modifier:
${name:=default}
The shell expands the expression ${name:=default} in the same manner as it expands
${name:–default}, but also sets the value of name to the expanded value of default.
If a script contains a line such as the following and LIT is unset or null at the time
this line is executed, the shell assigns LIT the value /home/max/literature:
$ ls ${LIT:=/home/max/literature}
: (null) builtin

Some shell scripts include lines that start with the : (null) builtin followed on the same
line by the := expansion modifier. This syntax sets variables that are null or unset.
The : builtin evaluates each token in the remainder of the command line but does not
execute any commands.
Use the following syntax to set a default for a null or unset variable in a shell script
(a SPACE follows the first colon). Without the leading colon (:), the shell would evaluate
and attempt to execute the “command” that results from the evaluation.
: ${name:=default}
When a script needs a directory for temporary files and uses the value of TEMPDIR
for the name of this directory, the following line assigns to TEMPDIR the value /tmp
if TEMPDIR is null:
: ${TEMPDIR:=/tmp}

474 Chapter 10 Programming the Bourne Again Shell (bash)

:? Sends an Error Message to Standard Error
Sometimes a script needs the value of a variable, but you cannot supply a reasonable
default at the time you write the script. In this case you want the script to exit if the
variable is not set. If the variable is null or unset, the :? modifier causes the script to
send an error message to standard error and terminate with an exit status of 1. Interactive shells do not exit when you use :?.
${name:?message}
If you omit message, the shell displays parameter null or not set. In the following
command, TESTDIR is not set, so the shell sends to standard error the expanded
value of the string following :?. In this case the string includes command substitution for date with the %T syntax (page 769), followed by the string error, variable
not set.
cd ${TESTDIR:?$(date +%T) error, variable not set.}
bash: TESTDIR: 16:16:14 error, variable not set.

Array Variables
The Bourne Again Shell supports one-dimensional array variables. The subscripts are
integers with zero-based indexing (i.e., the first element of the array has the subscript
0). The following syntax declares and assigns values to an array:
name=(element1 element2 ...)
The following example assigns four values to the array NAMES:
$ NAMES=(max helen sam zach)

You reference a single element of an array as follows; the braces are not optional.
$ echo ${NAMES[2]}
sam

The subscripts [*] and [@] both extract the entire array but work differently when
used within double quotation marks. An @ produces an array that is a duplicate of
the original array; an * produces a single element of an array (or a plain variable)
that holds all the elements of the array separated by the first character in IFS (normally a SPACE; page 311). In the following example, the array A is filled with the
elements of the NAMES variable using an *, and B is filled using an @. The declare
builtin (page 305) with the –a option displays the values of the arrays (and reminds
you that bash uses zero-based indexing for arrays):
$ A=("${NAMES[*]}")
$ B=("${NAMES[@]}")
$ declare -a
declare -a A='([0]="max helen sam zach")'
declare -a B='([0]="max" [1]="helen" [2]="sam" [3]="zach")'
...
declare -a NAMES='([0]="max" [1]="helen" [2]="sam" [3]="zach")'

Variables 475

The output of declare shows that NAMES and B have multiple elements. In contrast,
A, which was assigned its value with an * within double quotation marks, has only
one element: A has all its elements enclosed between double quotation marks.
In the next example, echo attempts to display element 1 of array A. Nothing is displayed because A has only one element and that element has an index of 0. Element
0 of array A holds all four names. Element 1 of B holds the second item in the array
and element 0 holds the first item.
$ echo ${A[1]}
$ echo ${A[0]}
max helen sam zach
$ echo ${B[1]}
helen
$ echo ${B[0]}
max

The ${#name[*]} operator returns the number of elements in an array:
$ echo ${#NAMES[*]}
4

The same operator, when given the index of an element of an array in place of
returns the length of the element:

*,

$ echo ${#NAMES[1]}
5

You can use subscripts on the left side of an assignment statement to replace selected
elements of an array:
$ NAMES[1]=max
$ echo ${NAMES[*]}
max max sam zach

Variables in Functions
Because functions run in the same environment as the shell that calls them, variables
are implicitly shared by a shell and a function it calls.
$
>
>
>

nam () {
echo $myname
myname=zach
}

$ myname=sam
$ nam
sam
$ echo $myname
zach

In the preceding example, the myname variable is set to sam in the interactive shell.
The nam function then displays the value of myname (sam) and sets myname to zach.
The final echo shows that, in the interactive shell, the value of myname has been
changed to zach.

476 Chapter 10 Programming the Bourne Again Shell (bash)
Function local
variables

Local variables are helpful in a function written for general use. Because the function
is called by many scripts that might be written by different programmers, you need
to make sure the names of the variables used within the function do not conflict with
(i.e., duplicate) the names of the variables in the programs that call the function.
Local variables eliminate this problem. When used within a function, the local builtin
declares a variable to be local to the function it is defined in.
The next example shows the use of a local variable in a function. It features two variables named count. The first is declared and initialized to 10 in the interactive shell.
Its value never changes, as echo verifies after count_down is run. The other count is
declared, using local, to be local to the count_down function. Its value, which is
unknown outside the function, ranges from 4 to 1, as the echo command within the
function confirms.
The following example shows the function being entered from the keyboard; it is not
a shell script. See the tip “A function is not a shell script” on page 453.
$
>
>
>
>
>
>
>
>
>
>

count_down () {
local count
count=$1
while [ $count -gt 0 ]
do
echo "$count..."
((count=count-1))
sleep 1
done
echo "Blast Off."
}

$ count=10
$ count_down 4
4...
3...
2...
1...
Blast Off.
$ echo $count
10

The count=count–1 assignment is enclosed between double parentheses, which
cause the shell to perform an arithmetic evaluation (page 492). Within the double
parentheses you can reference shell variables without the leading dollar sign ($). See
page 347 for another example of function local variables.

Builtin Commands
Builtin commands, which were introduced in Chapter 5, do not fork a new process
when you execute them. This section discusses the type, read, exec, trap, kill, and
getopts builtins. Table 10-6 on page 491 lists many bash builtin commands. See
Table 9-10 on page 407 for a list of tcsh builtins.

Builtin Commands 477

type: Displays Information About a Command
The type builtin (use which under tcsh) provides information about a command:
$ type cat echo who if lt
cat is hashed (/bin/cat)
echo is a shell builtin
who is /usr/bin/who
if is a shell keyword
lt is aliased to 'ls -ltrh | tail'

The preceding output shows the files that would be executed if you gave cat or who
as a command. Because cat has already been called from the current shell, it is in the
hash table (page 325) and type reports that cat is hashed. The output also shows that
a call to echo runs the echo builtin, if is a keyword, and lt is an alias.

read: Accepts User Input
A common use for user-created variables is storing information that a user enters in
response to a prompt. Using read, scripts can accept input from the user and store
that input in variables. See page 391 for information about reading user input under
tcsh. The read builtin reads one line from standard input and assigns the words on
the line to one or more variables:
$ cat read1
echo -n "Go ahead: "
read firstline
echo "You entered: $firstline"
$ ./read1
Go ahead: This is a line.
You entered: This is a line.

The first line of the read1 script uses echo to prompt for a line of text. The –n option
suppresses the following NEWLINE, allowing you to enter a line of text on the same line
as the prompt. The second line reads the text into the variable firstline. The third line
verifies the action of read by displaying the value of firstline.
The –p (prompt) option causes read to send to standard error the argument that follows
it; read does not terminate this prompt with a NEWLINE. This feature allows you to both
prompt for and read the input from the user on one line:
$ cat read1a
read -p "Go ahead: " firstline
echo "You entered: $firstline"
$ ./read1a
Go ahead: My line.
You entered: My line.

The variable in the preceding examples is quoted (along with the text string) because
you, as the script writer, cannot anticipate which characters the user might enter in
response to the prompt. Consider what would happen if the variable were not quoted
and the user entered * in response to the prompt:

478 Chapter 10 Programming the Bourne Again Shell (bash)
$ cat read1_no_quote
read -p "Go ahead: " firstline
echo You entered: $firstline
$ ./read1_no_quote
Go ahead: *
You entered: read1 read1_no_quote script.1
$ ls
read1
read1_no_quote
script.1

The ls command lists the same words as the script, demonstrating that the shell
expands the asterisk into a list of files in the working directory. When the variable
$firstline is surrounded by double quotation marks, the shell does not expand the
asterisk. Thus the read1 script behaves correctly:
$ ./read1
Go ahead: *
You entered:
REPLY

*

When you do not specify a variable to receive read’s input, bash puts the input into
the variable named REPLY. The following read1b script performs the same task as
read1:
$ cat read1b
read -p "Go ahead: "
echo "You entered: $REPLY"

The read2 script prompts for a command line, reads the user’s response, and assigns
it to the variable cmd. The script then attempts to execute the command line that
results from the expansion of the cmd variable:
$ cat read2
read -p "Enter a command: " cmd
$cmd
echo "Thanks"

In the following example, read2 reads a command line that calls the echo builtin. The
shell executes the command and then displays Thanks. Next read2 reads a command
line that executes the who utility:
$ ./read2
Enter a command: echo Please display this message.
Please display this message.
Thanks
$ ./read2
Enter a command: who
max
pts/4
2013-06-17 07:50 (:0.0)
sam
pts/12
2013-06-17 11:54 (guava)
Thanks

If cmd does not expand into a valid command line, the shell issues an error message:
$ ./read2
Enter a command: xxx
./read2: line 2: xxx: command not found
Thanks

Builtin Commands 479

The read3 script reads values into three variables. The read builtin assigns one word
(a sequence of nonblank characters) to each variable:
$ cat read3
read -p "Enter something: " word1 word2 word3
echo "Word 1 is: $word1"
echo "Word 2 is: $word2"
echo "Word 3 is: $word3"
$ ./read3
Enter something: this is something
Word 1 is: this
Word 2 is: is
Word 3 is: something

When you enter more words than read has variables, read assigns one word to each
variable, assigning all leftover words to the last variable. Both read1 and read2
assigned the first word and all leftover words to the one variable the scripts each had
to work with. In the following example, read assigns five words to three variables: It
assigns the first word to the first variable, the second word to the second variable,
and the third through fifth words to the third variable.
$ ./read3
Enter something: this is something else, really.
Word 1 is: this
Word 2 is: is
Word 3 is: something else, really.

Table 10-4 lists some of the options supported by the read builtin.

Table 10-4

read options

Option

Function

–a aname

(array) Assigns each word of input to an element of array aname.

–d delim

(delimiter) Uses delim to terminate the input instead of NEWLINE.

–e

(Readline) If input is coming from a keyboard, uses the Readline Library
(page 335) to get input.

–n num

(number of characters) Reads num characters and returns. As soon as the user
types num characters, read returns; there is no need to press RETURN.

–p prompt

(prompt) Displays prompt on standard error without a terminating NEWLINE
before reading input. Displays prompt only when input comes from the
keyboard.

–s

(silent) Does not echo characters.

–un

(file descriptor) Uses the integer n as the file descriptor that read takes its
input from. Thus
read –u4 arg1 arg2

is equivalent to
read arg1 arg2 <&4

See “File Descriptors” (page 452) for a discussion of redirection and file
descriptors.

480 Chapter 10 Programming the Bourne Again Shell (bash)

The read builtin returns an exit status of 0 if it successfully reads any data. It has a
nonzero exit status when it reaches the EOF (end of file).
The following example runs a while loop from the command line. It takes its input
from the names file and terminates after reading the last line from names.
$ cat names
Alice Jones
Robert Smith
Alice Paulson
John Q. Public
$ while read first rest
> do
> echo $rest, $first
> done < names
Jones, Alice
Smith, Robert
Paulson, Alice
Q. Public, John
$

The placement of the redirection symbol (<) for the while structure is critical. It is
important that you place the redirection symbol at the done statement and not at the
call to read.

optional Each time you redirect input, the shell opens the input file and repositions the read
pointer at the start of the file:
$ read line1 < names; echo $line1; read line2 < names; echo $line2
Alice Jones
Alice Jones

Here each read opens names and starts at the beginning of the names file. In the
following example, names is opened once, as standard input of the subshell created
by the parentheses. Each read then reads successive lines of standard input:
$ (read line1; echo $line1; read line2; echo $line2) < names
Alice Jones
Robert Smith

Another way to get the same effect is to open the input file with exec and hold it open
(refer to “File Descriptors” on page 452):
$ exec 3< names
$ read -u3 line1; echo $line1; read -u3 line2; echo $line2
Alice Jones
Robert Smith
$ exec 3<&-

exec: Executes a Command or Redirects File Descriptors
The exec builtin (not in tcsh) has two primary purposes: to run a command without
creating a new process and to redirect a file descriptor—including standard input,

Builtin Commands 481

output, or error—of a shell script from within the script (page 452). When the shell
executes a command that is not built into the shell, it typically creates a new process.
The new process inherits environment (exported) variables from its parent but does
not inherit variables that are not exported by the parent (page 468). In contrast, exec
executes a command in place of (overlays) the current process.

exec: Executes a Command
The exec builtin used for running a command has the following syntax:
exec command arguments
exec versus . (dot)

Insofar as exec runs a command in the environment of the original process, it is similar to the . (dot) command (page 280). However, unlike the . command, which can
run only shell scripts, exec can run both scripts and compiled programs. Also,
whereas the . command returns control to the original script when it finishes running,
exec does not. Finally the . command gives the new program access to local variables,
whereas exec does not.

exec does not
return control

Because the shell does not create a new process when you use exec, the command runs
more quickly. However, because exec does not return control to the original program,
it can be used only as the last command in a script. The following script shows that
control is not returned to the script:
$ cat exec_demo
who
exec date
echo "This line is never displayed."
$ ./exec_demo
zach
pts/7
May 20 7:05 (guava)
hls
pts/1
May 20 6:59 (:0.0)
Fri May 24 11:42:56 PDT 2013

The next example, a modified version of the out script (page 425), uses exec to execute the final command the script runs. Because out runs either cat or less and then
terminates, the new version, named out2, uses exec with both cat and less:
$ cat out2
if [ $# -eq 0 ]
then
echo "Usage: out2 [-v] filenames" 1>&2
exit 1
fi
if [ "$1" = "-v" ]
then
shift
exec less "$@"
else
exec cat -- "$@"
fi

482 Chapter 10 Programming the Bourne Again Shell (bash)

exec: Redirects Input and Output
The second major use of exec is to redirect a file descriptor—including standard input,
output, or error—from within a script. The next command causes all subsequent input
to a script that would have come from standard input to come from the file named
infile:
exec < infile

Similarly the following command redirects standard output and standard error to
outfile and errfile, respectively:
exec > outfile 2> errfile

When you use exec in this manner, the current process is not replaced with a new process
and exec can be followed by other commands in the script.
/dev/tty

When you redirect the output from a script to a file, you must make sure the user sees
any prompts the script displays. The /dev/tty device is a pseudonym for the screen
the user is working on; you can use this device to refer to the user’s screen without
knowing which device it is. (The tty utility displays the name of the device you are
using.) By redirecting the output from a script to /dev/tty, you ensure that prompts
and messages go to the user’s terminal, regardless of which terminal the user is logged
in on. Messages sent to /dev/tty are also not diverted if standard output and standard
error from the script are redirected.
The to_screen1 script sends output to three places: standard output, standard error,
and the user’s screen. When run with standard output and standard error redirected,
to_screen1 still displays the message sent to /dev/tty on the user’s screen. The out and
err files hold the output sent to standard output and standard error, respectively.
$ cat to_screen1
echo "message to standard output"
echo "message to standard error" 1>&2
echo "message to screen" > /dev/tty
$ ./to_screen1 > out 2> err
message to screen
$ cat out
message to standard output
$ cat err
message to standard error

The following command redirects standard output from a script to the user’s screen:
exec > /dev/tty

Putting this command at the beginning of the previous script changes where the output goes. In to_screen2, exec redirects standard output to the user’s screen so the
> /dev/tty is superfluous. Following the exec command, all output sent to standard
output goes to /dev/tty (the screen). Output to standard error is not affected.

Builtin Commands 483
$ cat to_screen2
exec > /dev/tty
echo "message to standard output"
echo "message to standard error" 1>&2
echo "message to screen" > /dev/tty
$ ./to_screen2 > out 2> err
message to standard output
message to screen

One disadvantage of using exec to redirect the output to /dev/tty is that all subsequent
output is redirected unless you use exec again in the script.
You can also redirect the input to read (standard input) so that it comes from /dev/tty
(the keyboard):
read name < /dev/tty

or
exec < /dev/tty

trap: Catches a Signal
A signal is a report to a process about a condition. Linux uses signals to report interrupts generated by the user (for example, pressing the interrupt key) as well as bad
system calls, broken pipelines, illegal instructions, and other conditions. The trap
builtin (tcsh uses onintr) catches (traps) one or more signals, allowing you to direct the
actions a script takes when it receives a specified signal.
This discussion covers six signals that are significant when you work with shell
scripts. Table 10-5 lists these signals, the signal numbers that systems often ascribe
to them, and the conditions that usually generate each signal. Give the command
kill –l (lowercase “l”), trap –l (lowercase “l”), or man 7 signal to display a list of
all signal names.

Table 10-5

Signals

Type

Name

Number

Generating condition

Not a real signal

EXIT

0

Exit because of exit command or reaching the
end of the program (not an actual signal but
useful in trap)

Hang up

SIGHUP or
HUP

1

Disconnect the line

Terminal
interrupt

SIGINT or
INT

2

Press the interrupt key (usually CONTROL-C)

484 Chapter 10 Programming the Bourne Again Shell (bash)

Table 10-5

Signals (continued)

Type

Name

Number

Generating condition

Quit

SIGQUIT or
QUIT

3

Press the quit key (usually CONTROL-SHIFT-| or
CONTROL-SHIFT-\)

Kill

SIGKILL or
KILL

9

The kill builtin with the –9 option (cannot be
trapped; use only as a last resort)

Software
termination

SIGTERM or
TERM

15

Default of the kill command

Stop

SIGTSTP or
TSTP

20

Press the suspend key (usually CONTROL-Z)

Debug

DEBUG

Execute commands specified in the trap
statement after each command (not an actual
signal but useful in trap)

Error

ERR

Execute commands specified in the trap
statement after each command that returns a
nonzero exit status (not an actual signal but
useful in trap)

When it traps a signal, a script takes whatever action you specify: It can remove files
or finish other processing as needed, display a message, terminate execution immediately, or ignore the signal. If you do not use trap in a script, any of the six actual
signals listed in Table 10-5 (not EXIT, DEBUG, or ERR) will terminate the script.
Because a process cannot trap a KILL signal, you can use kill –KILL (or kill –9) as a
last resort to terminate a script or other process. (See page 486 for more information
on kill.)
The trap command has the following syntax:
trap ['commands'] [signal]
The optional commands specifies the commands the shell executes when it catches
one of the signals specified by signal. The signal can be a signal name or number—for
example, INT or 2. If commands is not present, trap resets the trap to its initial condition, which is usually to exit from the script.
Quotation marks

The trap builtin does not require single quotation marks around commands as
shown in the preceding syntax but it is a good practice to use them. The single
quotation marks cause shell variables within the commands to be expanded when
the signal occurs, rather than when the shell evaluates the arguments to trap. Even
if you do not use any shell variables in the commands, you need to enclose any

Builtin Commands 485

command that takes arguments within either single or double quotation marks.
Quoting commands causes the shell to pass to trap the entire command as a single
argument.
After executing the commands, the shell resumes executing the script where it left off.
If you want trap to prevent a script from exiting when it receives a signal but not to
run any commands explicitly, you can specify a null (empty) commands string, as
shown in the locktty script (page 440). The following command traps signal number
15, after which the script continues:
trap '' 15

The following script demonstrates how the trap builtin can catch the terminal interrupt
signal (2). You can use SIGINT, INT, or 2 to specify this signal. The script returns an
exit status of 1:
$ cat inter
#!/bin/bash
trap 'echo PROGRAM INTERRUPTED; exit 1' INT
while true
do
echo "Program running."
sleep 1
done
$ ./inter
Program running.
Program running.
Program running.
CONTROL-C

PROGRAM INTERRUPTED
$
: (null) builtin

The second line of inter sets up a trap for the terminal interrupt signal using INT.
When trap catches the signal, the shell executes the two commands between the single quotation marks in the trap command. The echo builtin displays the message
PROGRAM INTERRUPTED, exit terminates the shell running the script, and the
parent shell displays a prompt. If exit were not there, the shell would return control
to the while loop after displaying the message. The while loop repeats continuously
until the script receives a signal because the true utility always returns a true exit
status. In place of true you can use the : (null) builtin, which is written as a colon
and always returns a 0 (true) status.
The trap builtin frequently removes temporary files when a script is terminated
prematurely, thereby ensuring the files are not left to clutter the filesystem. The
following shell script, named addbanner, uses two traps to remove a temporary file
when the script terminates normally or because of a hangup, software interrupt,
quit, or software termination signal:

486 Chapter 10 Programming the Bourne Again Shell (bash)
$ cat addbanner
#!/bin/bash
script=$(basename $0)
if [ ! -r "$HOME/banner" ]
then
echo "$script: need readable $HOME/banner file" 1>&2
exit 1
fi
trap 'exit 1' 1 2 3 15
trap 'rm /tmp/$$.$script 2> /dev/null' EXIT
for file
do
if [ -r "$file" -a -w "$file" ]
then
cat $HOME/banner $file > /tmp/$$.$script
cp /tmp/$$.$script $file
echo "$script: banner added to $file" 1>&2
else
echo "$script: need read and write permission for $file" 1>&2
fi
done

When called with one or more filename arguments, addbanner loops through the files,
adding a header to the top of each. This script is useful when you use a standard format
at the top of your documents, such as a standard layout for memos, or when you want
to add a standard header to shell scripts. The header is kept in a file named ~/banner.
Because addbanner uses the HOME variable, which contains the pathname of the user’s
home directory, the script can be used by several users without modification. If Max had
written the script with /home/max in place of $HOME and then given the script to
Zach, either Zach would have had to change it or addbanner would have used Max’s
banner file when Zach ran it (assuming Zach had read permission for the file).
The first trap in addbanner causes it to exit with a status of 1 when it receives a
hangup, software interrupt (terminal interrupt or quit signal), or software termination signal. The second trap uses EXIT in place of signal-number, which causes trap
to execute its command argument whenever the script exits because it receives an exit
command or reaches its end. Together these traps remove a temporary file whether
the script terminates normally or prematurely. Standard error of the second trap is
sent to /dev/null whenever trap attempts to remove a nonexistent temporary file. In
those cases rm sends an error message to standard error; because standard error is
redirected, the user does not see the message.
See page 440 for another example that uses trap.

kill: Aborts a Process
The kill builtin sends a signal to a process or job. The kill command has the following
syntax:

Builtin Commands 487

kill [–signal] PID
where signal is the signal name or number (for example, INT or 2) and PID is the
process identification number of the process that is to receive the signal. You can
specify a job number (page 146) as %n in place of PID. If you omit signal, kill sends
a TERM (software termination, number 15) signal. For more information on signal
names and numbers, see Table 10-5 on page 483.
The following command sends the TERM signal to job number 1, regardless of
whether it is running or stopped in the background:
$ kill -TERM %1

Because TERM is the default signal for kill, you can also give this command as kill
%1. Give the command kill –l (lowercase “l”) to display a list of signal names.
A program that is interrupted can leave matters in an unpredictable state: Temporary
files might be left behind (when they are normally removed), and permissions might
be changed. A well-written application traps signals and cleans up before exiting.
Most carefully written applications trap the INT, QUIT, and TERM signals.
To terminate a program, first try INT (press CONTROL-C, if the job running is in the foreground). Because an application can be written to ignore this signal, you might need
to use the KILL signal, which cannot be trapped or ignored; it is a “sure kill.” Refer
to page 846 for more information on kill. See also the related utility killall (page 848).

eval: Scans, Evaluates, and Executes a Command Line
The eval builtin scans the command that follows it on the command line. In doing
so, eval processes the command line in the same way bash does when it executes a
command line (e.g., it expands variables, replacing the name of a variable with its
value). For more information refer to “Processing the Command Line” on page 354.
After scanning (and expanding) the command line, it passes the resulting command
line to bash to execute.
The following example first assigns the value frog to the variable name. Next eval
scans the command $name=88 and expands the variable $name to frog, yielding the
command frog=88, which it passes to bash to execute. The last command displays the
value of frog.
$ name=frog
$ eval $name=88
$ echo $frog
88
Brace expansion
with a sequence
expression

The next example uses eval to cause brace expansion with a sequence expression
(page 357) to accept variables, which it does not normally do. The following command
demonstrates brace expansion with a sequence expression:
$ echo {2..5}
2 3 4 5

488 Chapter 10 Programming the Bourne Again Shell (bash)

One of the first things bash does when it processes a command line is to perform
brace expansion; later it expands variables (page 354). When you provide an invalid
argument in brace expansion, bash does not perform brace expansion; instead, it
passes the string to the program being called. In the next example, bash cannot
expand {$m..$n} during the brace expansion phase because it contains variables, so
it continues processing the command line. When it gets to the variable expansion
phase, it expands $m and $n and then passes the string {2..5} to echo.
$ m=2 n=5
$ echo {$m..$n}
{2..5}

When eval scans the same command line, it expands the variables as explained
previously and yields the command echo {2..5}. It then passes that command to
bash, which can now perform brace expansion:
$ eval echo {$m..$n}
2 3 4 5

getopts: Parses Options
The getopts builtin (not in tcsh) parses command-line arguments, making it easier
to write programs that follow the Linux argument conventions. The syntax for
getopts is
getopts optstring varname [arg ...]
where optstring is a list of the valid option letters, varname is the variable that receives
the options one at a time, and arg is the optional list of parameters to be processed. If
arg is not present, getopts processes the command-line arguments. If optstring starts
with a colon (:), the script must take care of generating error messages; otherwise,
getopts generates error messages.
The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to track and store option-related values. When a shell script starts,
the value of OPTIND is 1. Each time getopts is called and locates an argument, it
increments OPTIND to the index of the next option to be processed. If the option
takes an argument, bash assigns the value of the argument to OPTARG.
To indicate that an option takes an argument, follow the corresponding letter in
optstring with a colon (:). For example, the optstring dxo:lt:r instructs getopts to
search for the –d, –x, –o, –l, –t, and –r options and tells it the –o and –t options
take arguments.
Using getopts as the test-command in a while control structure allows you to loop
over the options one at a time. The getopts builtin checks the option list for options
that are in optstring. Each time through the loop, getopts stores the option letter it
finds in varname.

Builtin Commands 489

As an example, assume you want to write a program that can take three options:
1. A –b option indicates that the program should ignore whitespace at the start
of input lines.
2. A –t option followed by the name of a directory indicates that the program
should store temporary files in that directory. Otherwise, it should use /tmp.
3. A –u option indicates that the program should translate all output to
uppercase.
In addition, the program should ignore all other options and end option processing
when it encounters two hyphens (––).
The problem is to write the portion of the program that determines which options
the user has supplied. The following solution does not use getopts:
SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while [[ "$1" = -* ]] # [[ = ]] does pattern match
do
case $1 in
-b)
SKIPBLANKS=TRUE ;;
-t)
if [ -d "$2" ]
then
TMPDIR=$2
shift
else
echo "$0: -t takes a directory argument." >&2
exit 1
fi ;;
-u)
CASE=upper ;;
--)
break
;;
# Stop processing options
*)
echo "$0: Invalid option $1 ignored." >&2 ;;
esac
shift
done

This program fragment uses a loop to check and shift arguments while the argument is not – –. As long as the argument is not two hyphens, the program
continues to loop through a case statement that checks for possible options. The
–– case label breaks out of the while loop. The * case label recognizes any option;
it appears as the last case label to catch any unknown options, displays an error
message, and allows processing to continue. On each pass through the loop, the
program uses shift so it accesses the next argument on the next pass through the
loop. If an option takes an argument, the program uses an extra shift to get past
that argument.

490 Chapter 10 Programming the Bourne Again Shell (bash)

The following program fragment processes the same options using getopts:
SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while getopts :bt:u arg
do
case $arg in
b)
SKIPBLANKS=TRUE ;;
t)
if [ -d "$OPTARG" ]
then
TMPDIR=$OPTARG
else
echo "$0: $OPTARG is not a directory." >&2
exit 1
fi ;;
u)
CASE=upper ;;
:)
echo "$0: Must supply an argument to -$OPTARG." >&2
exit 1 ;;
\?)
echo "Invalid option -$OPTARG ignored." >&2 ;;
esac
done

In this version of the code, the while structure evaluates the getopts builtin each time
control transfers to the top of the loop. The getopts builtin uses the OPTIND variable
to keep track of the index of the argument it is to process the next time it is called.
There is no need to call shift in this example.
In the getopts version of the script, the case patterns do not start with a hyphen
because the value of arg is just the option letter (getopts strips off the hyphen). Also,
getopts recognizes –– as the end of the options, so you do not have to specify it
explicitly, as in the case statement in the first example.
Because you tell getopts which options are valid and which require arguments, it can
detect errors in the command line and handle them in two ways. This example uses
a leading colon in optstring to specify that you check for and handle errors in your
code; when getopts finds an invalid option, it sets varname to ? and OPTARG to
the option letter. When it finds an option that is missing an argument, getopts sets
varname to : and OPTARG to the option lacking an argument.
The \? case pattern specifies the action to take when getopts detects an invalid option. The
: case pattern specifies the action to take when getopts detects a missing option argument.
In both cases getopts does not write any error message but rather leaves that task to you.
If you omit the leading colon from optstring, both an invalid option and a missing option
argument cause varname to be assigned the string ?. OPTARG is not set and getopts
writes its own diagnostic message to standard error. Generally this method is less desirable because you have less control over what the user sees when an error occurs.
Using getopts will not necessarily make your programs shorter. Its principal advantages
are that it provides a uniform programming interface and that it enforces standard
option handling.

Builtin Commands 491

A Partial List of Builtins
Table 10-6 lists some of the bash builtins. You can use type (page 477) to see if a
command runs a builtin. See “Listing bash builtins” on page 153 for instructions
on how to display complete lists of builtins.

Table 10-6

bash builtins

Builtin

Function

:

Returns 0 or true (the null builtin; pages 473 and 485)

. (dot)

Executes a shell script as part of the current process (page 280)

bg

Puts a suspended job in the background (page 296)

break

Exits from a looping control structure (page 441)

cd

Changes to another working directory (page 92)

continue

Starts with the next iteration of a looping control structure (page 441)

echo

Displays its arguments (page 61)

eval

Scans and evaluates the command line (page 487)

exec

Executes a shell script or program in place of the current process (page 480)

exit

Exits from the current shell (usually the same as CONTROL-D from an interactive
shell; page 465)

export

Makes the variable an environment variable (page 468)

fg

Brings a job from the background to the foreground (page 295)

getopts

Parses arguments to a shell script (page 488)

jobs

Displays a list of background jobs (page 295)

kill

Sends a signal to a process or job (page 846)

pwd

Displays the name of the working directory (page 86)

read

Reads a line from standard input (page 477)

readonly

Declares a variable to be readonly (page 305)

set

Sets shell flags or positional parameters; with no argument, lists all variables
(pages 350, 386, and 460)

shift

Promotes each positional parameter (page 461)

test

Compares arguments (pages 419 and 978)

times

Displays total times for the current shell and its children

trap

Traps a signal (page 483)

492 Chapter 10 Programming the Bourne Again Shell (bash)

Table 10-6

bash builtins (continued)

Builtin

Function

type

Displays how each argument would be interpreted as a command (page 477)

umask

Sets and displays the file-creation mask (page 994)

unset

Removes a variable or function (page 304)

wait

Waits for a background process to terminate

Expressions
An expression comprises constants, variables, and operators that the shell can
process to return a value. This section covers arithmetic, logical, and conditional
expressions as well as operators. Table 10-8 on page 495 lists the bash operators.

Arithmetic Evaluation
The Bourne Again Shell can perform arithmetic assignments and evaluate many different types of arithmetic expressions, all using integers. The shell performs
arithmetic assignments in a number of ways. One is with arguments to the let builtin:
$ let "VALUE=VALUE * 10 + NEW"

In the preceding example, the variables VALUE and NEW hold integer values.
Within a let statement you do not need to use dollar signs ($) in front of variable
names. Double quotation marks must enclose a single argument, or expression,
that contains SPACEs. Because most expressions contain SPACEs and need to be quoted,
bash accepts ((expression)) as a synonym for let "expression", obviating the need
for both quotation marks and dollar signs:
$ ((VALUE=VALUE * 10 + NEW))

You can use either form wherever a command is allowed and can remove the SPACEs.
In these examples, the asterisk (*) does not need to be quoted because the shell does
not perform pathname expansion on the right side of an assignment (page 303):
$ let VALUE=VALUE*10+NEW

Because each argument to let is evaluated as a separate expression, you can assign values
to more than one variable on a single line:
$ let "COUNT = COUNT + 1" VALUE=VALUE*10+NEW

You must use commas to separate multiple assignments within a set of double
parentheses:
$ ((COUNT = COUNT + 1, VALUE=VALUE*10+NEW))

Expressions

493

Arithmetic evaluation versus arithmetic expansion
tip Arithmetic evaluation differs from arithmetic expansion. As explained on page 359, arithmetic
expansion uses the syntax $((expression)), evaluates expression, and replaces $((expression)) with
the result. You can use arithmetic expansion to display the value of an expression or to assign that
value to a variable.
Arithmetic evaluation uses the let expression or ((expression)) syntax, evaluates expression, and
returns a status code. You can use arithmetic evaluation to perform a logical comparison or an
assignment.
Logical expressions

You can use the ((expression)) syntax for logical expressions, although that task is
frequently left to [[expression]] (next). The next example expands the age_check
script (page 359) to include logical arithmetic evaluation in addition to arithmetic
expansion:
$ cat age2
#!/bin/bash
read -p "How old are
if ((30 < age && age
echo "Wow, in
else
echo "You are
fi

you? " age
< 60)); then
$((60-age)) years, you'll be 60!"
too young or too old to play."

$ ./age2
How old are you? 25
You are too young or too old to play.

The test-statement for the if structure evaluates two logical comparisons joined by a
Boolean AND and returns 0 (true) if they are both true or 1 (false) otherwise.

Logical Evaluation (Conditional Expressions)
The syntax of a conditional expression is
[[ expression ]]
where expression is a Boolean (logical) expression. You must precede a variable name
with a dollar sign ($) within expression. The result of executing this builtin, as with
the test builtin, is a return status. The conditions allowed within the brackets are
almost a superset of those accepted by test (page 978). Where the test builtin uses –a
as a Boolean AND operator, [[ expression ]] uses &&. Similarly, where test uses –o
as a Boolean OR operator, [[ expression ]] uses ||.
To see how conditional expressions work, replace the line that tests age in the age2
script with the following conditional expression. You must surround the [[ and ]]
tokens with whitespace or a command terminator, and place dollar signs before the
variables:
if [[ 30 < $age && $age < 60 ]]; then

You can also use test’s relational operators –gt, –ge, –lt, –le, –eq, and –ne:
if [[ 30 -lt $age && $age -lt 60 ]]; then

494 Chapter 10 Programming the Bourne Again Shell (bash)
String comparisons

The test builtin tests whether strings are equal. The [[ expression ]] syntax adds
comparison tests for strings. The > and < operators compare strings for order (for
example, "aa" < "bbb"). The = operator tests for pattern match, not just equality:
[[ string = pattern ]] is true if string matches pattern. This operator is not symmetrical; the pattern must appear on the right side of the equal sign. For example,
[[ artist = a* ]] is true (= 0), whereas [[ a* = artist ]] is false (= 1):
$
$
0
$
$
1

[[ artist = a* ]]
echo $?
[[ a* = artist ]]
echo $?

The next example uses a command list that starts with a compound condition. The
condition tests whether the directory bin and the file src/myscript.bash exist. If the
result is true, cp copies src/myscript.bash to bin/myscript. If the copy succeeds, chmod
makes myscript executable. If any of these steps fails, echo displays a message.
Implicit command-line continuation (page 499) obviates the need for backslashes at
the ends of lines.
$ [[ -d bin && -f src/myscript.bash ]] &&
cp src/myscript.bash bin/myscript &&
chmod +x bin/myscript ||
echo "Cannot make executable version of myscript"

String Pattern Matching
The Bourne Again Shell provides string pattern-matching operators that can manipulate
pathnames and other strings. These operators can delete from strings prefixes or suffixes
that match patterns. Table 10-7 lists the four operators.

Table 10-7

String operators

Operator

Function

#

Removes minimal matching prefixes

##

Removes maximal matching prefixes

%

Removes minimal matching suffixes

%%

Removes maximal matching suffixes

The syntax for these operators is
${varname op pattern}
where op is one of the operators listed in Table 10-7 and pattern is a match pattern
similar to that used for filename generation. These operators are commonly used to
manipulate pathnames to extract or remove components or to change suffixes:

Expressions

495

$ SOURCEFILE=/usr/local/src/prog.c
$ echo ${SOURCEFILE#/*/}
local/src/prog.c
$ echo ${SOURCEFILE##/*/}
prog.c
$ echo ${SOURCEFILE%/*}
/usr/local/src
$ echo ${SOURCEFILE%%/*}
$ echo ${SOURCEFILE%.c}
/usr/local/src/prog
$ CHOPFIRST=${SOURCEFILE#/*/}
$ echo $CHOPFIRST
local/src/prog.c
$ NEXT=${CHOPFIRST%%/*}
$ echo $NEXT
local
String length

The shell expands ${#name} to the number of characters in name:
$ echo $SOURCEFILE
/usr/local/src/prog.c
$ echo ${#SOURCEFILE}
21

Arithmetic Operators
Arithmetic expansion and arithmetic evaluation in bash use the same syntax, precedence,
and associativity of expressions as the C language. Table 10-8 lists arithmetic operators
in order of decreasing precedence (priority of evaluation); each group of operators has
equal precedence. Within an expression you can use parentheses to change the order of
evaluation.

Table 10-8

Arithmetic operators

Type of operator/operator

Function

Post

var++ Postincrement
var –– Postdecrement
Pre
++var Preincrement
––var Predecrement
Unary
– Unary minus
+ Unary plus

496 Chapter 10 Programming the Bourne Again Shell (bash)

Table 10-8

Arithmetic operators (continued)

Type of operator/operator

Function

Negation
! Boolean NOT (logical negation)
~ Complement (bitwise negation)
Exponentiation

**

Exponent

Multiplication, division,
remainder

*

Multiplication

/ Division
% Remainder
Addition, subtraction
+ Addition
– Subtraction
Bitwise shifts
<< Left bitwise shift
>> Right bitwise shift
Comparison
<= Less than or equal
>= Greater than or equal
< Less than
> Greater than
Equality, inequality
== Equality
!= Inequality
Bitwise
& Bitwise AND
^ Bitwise XOR (exclusive OR)
| Bitwise OR

Expressions

Table 10-8

497

Arithmetic operators (continued)

Type of operator/operator

Function

Boolean (logical)
&& Boolean AND
|| Boolean OR
Conditional evaluation
? : Ternary operator
Assignment
=, *=, /=, %=, +=, –=, Assignment
<<=, >>=, &=, ^=, |=
Comma
, Comma
Pipe symbol

The | control operator has higher precedence than arithmetic operators. For example,
the command line
$ cmd1 | cmd2 || cmd3 | cmd4 && cmd5 | cmd6

is interpreted as if you had typed
$ ((cmd1 | cmd2) || (cmd3 | cmd4)) && (cmd5 | cmd6)

Do not rely on rules of precedence: use parentheses
tip Do not rely on the precedence rules when you use command lists (page 145). Instead, use parentheses
to explicitly specify the order in which you want the shell to interpret the commands.
Increment and
decrement

The postincrement, postdecrement, preincrement, and predecrement operators work
with variables. The pre- operators, which appear in front of the variable name (as in
++COUNT and ––VALUE), first change the value of the variable (++ adds 1; –– subtracts 1) and then provide the result for use in the expression. The post- operators
appear after the variable name (as in COUNT++ and VALUE––); they first provide
the unchanged value of the variable for use in the expression and then change the
value of the variable.
$ N=10
$ echo
10
$ echo
12
$ echo
9
$ echo
6
$ echo
10

$N
$((--N+3))
$N
$((N++ - 3))
$N

498 Chapter 10 Programming the Bourne Again Shell (bash)
Remainder

The remainder operator (%) yields the remainder when its first operand is divided
by its second. For example, the expression $((15%7)) has the value 1.

Ternary

The ternary operator, ? :, decides which of two expressions should be evaluated,
based on the value returned by a third expression. The syntax is
expression1 ? expression2 : expression3
If expression1 produces a false (0) value, expression3 is evaluated; otherwise,
expression2 is evaluated. The value of the entire expression is the value of
expression2 or expression3, depending on which is evaluated. If expression1 is true,
expression3 is not evaluated. If expression1 is false, expression2 is not evaluated.
$
$
$
1
$
1

((N=10,Z=0,COUNT=1))
((T=N>COUNT?++Z:--Z))
echo $T
echo $Z

Assignment

The assignment operators, such as +=, are shorthand notations. For example, N+=3
is the same as ((N=N+3)).

Other bases

The following commands use the syntax base#n to assign base 2 (binary) values. First
v1 is assigned a value of 0101 (5 decimal) and then v2 is assigned a value of 0110 (6
decimal). The echo utility verifies the decimal values.
$
$
$
5

((v1=2#0101))
((v2=2#0110))
echo "$v1 and $v2"
and 6

Next the bitwise AND operator (&) selects the bits that are on in both 5 (0101 binary)
and 6 (0110 binary). The result is binary 0100, which is 4 decimal.
$ echo $(( v1 & v2 ))
4

The Boolean AND operator (&&) produces a result of 1 if both of its operands are
nonzero and a result of 0 otherwise. The bitwise inclusive OR operator (|) selects the
bits that are on in either 0101 or 0110, resulting in 0111, which is 7 decimal. The
Boolean OR operator (||) produces a result of 1 if either of its operands is nonzero
and a result of 0 otherwise.
$ echo $(( v1 && v2 ))
1
$ echo $(( v1 | v2 ))
7
$ echo $(( v1 || v2 ))
1

Implicit Command-Line Continuation

499

Next the bitwise exclusive OR operator (^) selects the bits that are on in either, but
not both, of the operands 0101 and 0110, yielding 0011, which is 3 decimal. The
Boolean NOT operator (!) produces a result of 1 if its operand is 0 and a result of 0
otherwise. Because the exclamation point in $(( ! v1 )) is enclosed within double
parentheses, it does not need to be escaped to prevent the shell from interpreting the
exclamation point as a history event. The comparison operators produce a result of
1 if the comparison is true and a result of 0 otherwise.
$
3
$
0
$
1
$
0

echo $(( v1 ^ v2 ))
echo $(( ! v1 ))
echo $(( v1 < v2 ))
echo $(( v1 > v2 ))

Implicit Command-Line Continuation
Each of the following control operators (page 289) implies continuation:
; ;; | & && |& ||

For example, there is no difference between this set of commands:
cd mydir && rm

*.o

and this set:
cd mydir &&
rm *.o

Both sets of commands remove all files with a filename extension of .o only if the
cd mydir command is successful. If you give the second set of commands in an interactive
shell, the shell issues a secondary prompt (>; page 311) after you enter the first line and
waits for you to complete the command line.
The following commands create the directory named mydir if mydir does not exist.
You can put the commands on one line or two.
[ -d mydir ] ||
mkdir mydir
Pipe symbol (|)
implies continuation

Similarly the pipe symbol implies continuation:
sort names
grep -i '^[a-m]'
sed 's/Street/St/'
pr --header="Names from A-M"
lpr

|
|
|
|

500 Chapter 10 Programming the Bourne Again Shell (bash)

When a command line ends with a pipe symbol, you do not need backslashes to
indicate continuation.
sort names
grep -i '^[a-m]'
sed 's/Street/St/'
pr --header="Names from A-M"
lpr

|
|
|
|

\
\
\
\

Although it will work, the following example is also a poor way to write code because
it is hard to read and understand:
sort names \
| grep -i '^[a-m]' \
| sed 's/Street/St/' \
| pr --header="Names from A-M" \
| lpr

Another way to improve the readability of code you write is to take advantage of
implicit command-line continuation to break lines without using backslashes. These
commands are easier to read and understand:
$ [ -e /home/sam/memos/helen.personnel/november ] &&
~sam/report_a november alphaphonics totals

than these commands:
$ [ -e /home/sam/memos/helen.personnel/november ] && ~sam/report_a \
november alphaphonics totals

Shell Programs
The Bourne Again Shell has many features that make it a good programming language.
The structures that bash provides are not a random assortment, but rather have been
chosen to provide most of the structural features found in other procedural languages,
such as C and Perl. A procedural language provides the following abilities:
• Declare, assign, and manipulate variables and constant data. The Bourne
Again Shell provides both string variables, together with powerful string
operators, and integer variables, along with a complete set of arithmetic
operators.
• Break large problems into small ones by creating subprograms. The Bourne
Again Shell allows you to create functions and call scripts from other
scripts. Shell functions can be called recursively; that is, a Bourne Again
Shell function can call itself. You might not need to use recursion often, but
it might allow you to solve some apparently difficult problems with ease.
• Execute statements conditionally using statements such as if.

Shell Programs 501

• Execute statements iteratively using statements such as while and for.
• Transfer data to and from the program, communicating with both data files
and users.
Programming languages implement these capabilities in different ways but with the
same ideas in mind. When you want to solve a problem by writing a program, you
must first figure out a procedure that leads you to a solution—that is, an algorithm.
Typically you can implement the same algorithm in roughly the same way in different
programming languages, using the same kinds of constructs in each language.
Chapter 8 and this chapter have introduced numerous bash features, many of which
are useful for both interactive use and shell programming. This section develops two
complete shell programs, demonstrating how to combine some of these features
effectively. The programs are presented as problems for you to solve, with sample
solutions provided.

A Recursive Shell Script
A recursive construct is one that is defined in terms of itself. Alternatively, you
might say that a recursive program is one that can call itself. This concept might
seem circular, but it need not be. To avoid circularity, a recursive definition must
have a special case that is not self-referential. Recursive ideas occur in everyday life.
For example, you can define an ancestor as your mother, your father, or one of their
ancestors. This definition is not circular; it specifies unambiguously who your
ancestors are: your mother or your father, or your mother’s mother or father or
your father’s mother or father, and so on.
A number of Linux system utilities can operate recursively. See the –R option to the
chmod (page 741), chown (page 746), and cp (page 754) utilities for examples.
Solve the following problem by using a recursive shell function:
Write a shell function named makepath that, given a pathname, creates all components
in that pathname as directories. For example, the command makepath a/b/c/d should
create directories a, a/b, a/b/c, and a/b/c/d. (The mkdir –p option creates directories in
this manner. Solve the problem without using mkdir –p.)
One algorithm for a recursive solution follows:
1. Examine the path argument. If it is a null string or if it names an existing
directory, do nothing and return.
2. If the path argument is a simple path component, create it (using mkdir) and
return.
3. Otherwise, call makepath using the path prefix of the original argument.
This step eventually creates all the directories up to the last component,
which you can then create using mkdir.

502 Chapter 10 Programming the Bourne Again Shell (bash)

In general, a recursive function must invoke itself with a simpler version of the problem
than it was given until it is finally called with a simple case that does not need to call
itself. Following is one possible solution based on this algorithm:
makepath

# This is a function
# Enter it at the keyboard; do not run it as a shell script
#
function makepath()
{
if [[ ${#1} -eq 0 || -d "$1" ]]
then
return 0
# Do nothing
fi
if [[ "${1%/*}" = "$1" ]]
then
mkdir $1
return $?
fi
makepath ${1%/*} || return 1
mkdir $1
return $?
}

In the test for a simple component (the if statement in the middle of the function), the
left expression is the argument after the shortest suffix that starts with a / character has
been stripped away (page 494). If there is no such character (for example, if $1 is max),
nothing is stripped off and the two sides are equal. If the argument is a simple filename
preceded by a slash, such as /usr, the expression ${1%/*} evaluates to a null string. To
make the function work in this case, you must take two precautions: Put the left expression within quotation marks and ensure that the recursive function behaves sensibly
when it is passed a null string as an argument. In general, good programs are robust:
They should be prepared for borderline, invalid, or meaningless input and behave
appropriately in such cases.
By giving the following command from the shell you are working in, you turn on
debugging tracing so that you can watch the recursion work:
$ set -o xtrace

(Give the same command but replace the hyphen with a plus sign [+] to turn debugging
off.) With debugging turned on, the shell displays each line in its expanded form as it
executes the line. A + precedes each line of debugging output.
In the following example, the first line that starts with + shows the shell calling
makepath. The makepath function is initially called from the command line with
arguments of a/b/c. It then calls itself with arguments of a/b and finally a. All the
work is done (using mkdir) as each call to makepath returns.
$
+
+
+
+

./makepath a/b/c
makepath a/b/c
[[ 5 -eq 0 ]]
[[ -d a/b/c ]]
[[ a/b = \a\/\b\/\c ]]

Shell Programs 503
+
+
+
+
+
+
+
+
+
+
+
+
+
+

makepath a/b
[[ 3 -eq 0 ]]
[[ -d a/b ]]
[[ a = \a\/\b ]]
makepath a
[[ 1 -eq 0 ]]
[[ -d a ]]
[[ a = \a ]]
mkdir a
return 0
mkdir a/b
return 0
mkdir a/b/c
return 0

The function works its way down the recursive path and back up again.
It is instructive to invoke makepath with an invalid path and see what happens. The
following example, which is run with debugging turned on, tries to create the path
/a/b. Creating this path requires that you create directory a in the root directory.
Unless you have permission to write to the root directory, you are not permitted to
create this directory.
$ ./makepath /a/b
+ makepath /a/b
+ [[ 4 -eq 0 ]]
+ [[ -d /a/b ]]
+ [[ /a = \/\a\/\b ]]
+ makepath /a
+ [[ 2 -eq 0 ]]
+ [[ -d /a ]]
+ [[ '' = \/\a ]]
+ makepath
+ [[ 0 -eq 0 ]]
+ return 0
+ mkdir /a
mkdir: cannot create directory '/a': Permission denied
+ return 1
+ return 1

The recursion stops when makepath is denied permission to create the /a directory.
The error returned is passed all the way back, so the original makepath exits with
nonzero status.

Use local variables with recursive functions
tip The preceding example glossed over a potential problem that you might encounter when you use a recursive function. During the execution of a recursive function, many separate instances of that function
might be active simultaneously. All but one of them are waiting for their child invocation to complete.
Because functions run in the same environment as the shell that calls them, variables are implicitly
shared by a shell and a function it calls. As a consequence, all instances of the function share a
single copy of each variable. Sharing variables can give rise to side effects that are rarely what you
want. As a rule, you should use local to make all variables of a recursive function local. See
page 476 for more information.

504 Chapter 10 Programming the Bourne Again Shell (bash)

The quiz Shell Script
Solve the following problem using a bash script:
Write a generic multiple-choice quiz program. The program should get its questions
from data files, present them to the user, and keep track of the number of correct and
incorrect answers. The user must be able to exit from the program at any time and
receive a summary of results to that point.
The detailed design of this program and even the detailed description of the problem
depend on a number of choices: How will the program know which subjects are
available for quizzes? How will the user choose a subject? How will the program
know when the quiz is over? Should the program present the same questions (for a
given subject) in the same order each time, or should it scramble them?
Of course, you can make many perfectly good choices that implement the specification
of the problem. The following details narrow the problem specification:
• Each subject will correspond to a subdirectory of a master quiz directory. This
directory will be named in the environment variable QUIZDIR, whose
default will be ~/quiz. For example, you could have the following directories
correspond to the subjects engineering, art, and politics: ~/quiz/engineering,
~/quiz/art, and ~/quiz/politics. Put the quiz directory in /usr/games if you
want all users to have access to it (requires root privileges).
• Each subject can have several questions. Each question is represented by a
file in its subject’s directory.
• The first line of each file that represents a question holds the text of the
question. If it takes more than one line, you must escape the NEWLINE with a
backslash. (This setup makes it easy to read a single question with the read
builtin.) The second line of the file is an integer that specifies the number of
choices. The next lines are the choices themselves. The last line is the correct
answer. Following is a sample question file:
Who discovered the principle of the lever?
4
Euclid
Archimedes
Thomas Edison
The Lever Brothers
Archimedes

• The program presents all the questions in a subject directory. At any point
the user can interrupt the quiz using CONTROL-C, whereupon the program will
summarize the results up to that point and exit. If the user does not interrupt
the program, the program summarizes the results and exits when it has
asked all questions for the chosen subject.
• The program scrambles the questions related to a subject before presenting them.

Shell Programs 505

Following is a top-level design for this program:
1. Initialize. This involves a number of steps, such as setting the counts of the
number of questions asked so far and the number of correct and wrong
answers to zero. It also sets up the program to trap CONTROL-C.
2. Present the user with a choice of subjects and get the user’s response.
3. Change to the corresponding subject directory.
4. Determine the questions to be asked (that is, the filenames in that directory).
Arrange them in random order.
5. Repeatedly present questions and ask for answers until the quiz is over or
is interrupted by the user.
6. Present the results and exit.
Clearly some of these steps (such as step 3) are simple, whereas others (such as step 4)
are complex and worthy of analysis on their own. Use shell functions for any complex
step, and use the trap builtin to handle a user interrupt.
Here is a skeleton version of the program with empty shell functions:
function initialize
{
# Initializes variables.
}
function choose_subj
{
# Writes choice to standard output.
}
function scramble
{
# Stores names of question files, scrambled,
# in an array variable named questions.
}
function ask
{
# Reads a question file, asks the question, and checks the
# answer. Returns 1 if the answer was correct, 0 otherwise. If it
# encounters an invalid question file, exits with status 2.
}
function summarize
{
# Presents the user's score.
}
# Main program
initialize

# Step 1 in top-level design

subject=$(choose_subj)
[[ $? -eq 0 ]] || exit 2

# Step 2
# If no valid choice, exit

506 Chapter 10 Programming the Bourne Again Shell (bash)
cd $subject || exit 2
echo
scramble

# Step 3
# Skip a line
# Step 4

for ques in ${questions[*]}; do
ask $ques
result=$?
(( num_ques=num_ques+1 ))
if [[ $result == 1 ]]; then
(( num_correct += 1 ))
fi
echo
sleep ${QUIZDELAY:=1}
done

# Step 5

summarize
exit 0

# Step 6

# Skip a line between questions

To make reading the results a bit easier for the user, a sleep call appears inside the
question loop. It delays $QUIZDELAY seconds (default = 1) between questions.
Now the task is to fill in the missing pieces of the program. In a sense this program
is being written backward. The details (the shell functions) come first in the file but
come last in the development process. This common programming practice is called
top-down design. In top-down design you fill in the broad outline of the program
first and supply the details later. In this way you break the problem up into smaller
problems, each of which you can work on independently. Shell functions are a great
help in using the top-down approach.
One way to write the initialize function follows. The cd command causes QUIZDIR
to be the working directory for the rest of the script and defaults to ~/quiz if
QUIZDIR is not set.
function initialize ()
{
trap 'summarize ; exit 0' INT
num_ques=0
num_correct=0
first_time=true
cd ${QUIZDIR:=~/quiz} || exit 2
}

#
#
#
#

Handle user interrupts
Number of questions asked so far
Number answered correctly so far
true until first question is asked

Be prepared for the cd command to fail. The directory might not be searchable or
conceivably another user might have removed it. The preceding function exits with
a status code of 2 if cd fails.
The next function, choose_subj, is a bit more complicated. It displays a menu using
a select statement:

Shell Programs 507
function choose_subj ()
{
subjects=($(ls))
PS3="Choose a subject for the quiz from the preceding list: "
select Subject in ${subjects[*]}; do
if [[ -z "$Subject" ]]; then
echo "No subject chosen. Bye." >&2
exit 1
fi
echo $Subject
return 0
done
}

The function first uses an ls command and command substitution to put a list of subject
directories in the subjects array. Next the select structure (page 448) presents the user
with a list of subjects (the directories found by ls) and assigns the chosen directory name
to the Subject variable. Finally the function writes the name of the subject directory to
standard output. The main program uses command substitution to assign this value to
the subject variable [subject=$(choose_subj)].
The scramble function presents a number of difficulties. In this solution it uses an
array variable (questions) to hold the names of the questions. It scrambles the entries
in an array using the RANDOM variable (each time you reference RANDOM, it has
the value of a [random] integer between 0 and 32767):
function scramble ()
{
declare -i index quescount
questions=($(ls))
quescount=${#questions[*]}
((index=quescount-1))
while [[ $index > 0 ]]; do
((target=RANDOM % index))
exchange $target $index
((index -= 1))
done
}

# Number of elements

This function initializes the array variable questions to the list of filenames (questions)
in the working directory. The variable quescount is set to the number of such files. Then
the following algorithm is used: Let the variable index count down from quescount – 1
(the index of the last entry in the array variable). For each value of index, the function
chooses a random value target between 0 and index, inclusive. The command
((target=RANDOM % index))

produces a random value between 0 and index – 1 by taking the remainder (the %
operator) when $RANDOM is divided by index. The function then exchanges the
elements of questions at positions target and index. It is convenient to take care of
this step in another function named exchange:

508 Chapter 10 Programming the Bourne Again Shell (bash)
function exchange ()
{
temp_value=${questions[$1]}
questions[$1]=${questions[$2]}
questions[$2]=$temp_value
}

The ask function also uses the select structure. It reads the question file named in its
argument and uses the contents of that file to present the question, accept the answer,
and determine whether the answer is correct. (See the code that follows.)
The ask function uses file descriptor 3 to read successive lines from the question file,
whose name was passed as an argument and is represented by $1 in the function.
It reads the question into the ques variable and the number of questions into
num_opts. The function constructs the variable choices by initializing it to a null
string and successively appending the next choice. Then it sets PS3 to the value of
ques and uses a select structure to prompt the user with ques. The select structure
places the user’s answer in answer, and the function then checks that response
against the correct answer from the file.
The construction of the choices variable is done with an eye toward avoiding a potential
problem. Suppose that one answer has some whitespace in it—then it might appear as
two or more arguments in choices. To avoid this problem, make sure that choices is an
array variable. The select statement does the rest of the work:
quiz

$ cat quiz
#!/bin/bash
# remove the # on the following line to turn on debugging
# set -o xtrace
#==================
function initialize ()
{
trap 'summarize ; exit 0' INT
num_ques=0
num_correct=0
first_time=true
cd ${QUIZDIR:=~/quiz} || exit 2
}

#
#
#
#

Handle user interrupts
Number of questions asked so far
Number answered correctly so far
true until first question is asked

#==================
function choose_subj ()
{
subjects=($(ls))
PS3="Choose a subject for the quiz from the preceding list: "
select Subject in ${subjects[*]}; do
if [[ -z "$Subject" ]]; then
echo "No subject chosen. Bye." >&2
exit 1
fi
echo $Subject
return 0
done
}

Shell Programs 509
#==================
function exchange ()
{
temp_value=${questions[$1]}
questions[$1]=${questions[$2]}
questions[$2]=$temp_value
}
#==================
function scramble ()
{
declare -i index quescount
questions=($(ls))
quescount=${#questions[*]}
((index=quescount-1))
while [[ $index > 0 ]]; do
((target=RANDOM % index))
exchange $target $index
((index -= 1))
done
}

# Number of elements

#==================
function ask ()
{
exec 3<$1
read -u3 ques || exit 2
read -u3 num_opts || exit 2
index=0
choices=()
while (( index < num_opts )) ; do
read -u3 next_choice || exit 2
choices=("${choices[@]}" "$next_choice")
((index += 1))
done
read -u3 correct_answer || exit 2
exec 3<&if [[ $first_time = true ]]; then
first_time=false
echo -e "You may press the interrupt key at any time to quit.\n"
fi
PS3=$ques"

"

# Make $ques the prompt for select
# and add some spaces for legibility
select answer in "${choices[@]}"; do
if [[ -z "$answer" ]]; then
echo Not a valid choice. Please choose again.
elif [[ "$answer" = "$correct_answer" ]]; then
echo "Correct!"
return 1
else
echo "No, the answer is $correct_answer."
return 0
fi
done
}

510 Chapter 10 Programming the Bourne Again Shell (bash)
#==================
function summarize ()
{
echo
# Skip a line
if (( num_ques == 0 )); then
echo "You did not answer any questions"
exit 0
fi
(( percent=num_correct*100/num_ques ))
echo "You answered $num_correct questions correctly, out of \
$num_ques total questions."
echo "Your score is $percent percent."
}
#==================
# Main program
initialize

# Step 1 in top-level design

subject=$(choose_subj)
[[ $? -eq 0 ]] || exit 2

# Step 2
# If no valid choice, exit

cd $subject || exit 2
echo
scramble

# Step 3
# Skip a line
# Step 4

for ques in ${questions[*]}; do
ask $ques
result=$?
(( num_ques=num_ques+1 ))
if [[ $result == 1 ]]; then
(( num_correct += 1 ))
fi
echo
sleep ${QUIZDELAY:=1}
done

# Step 5

summarize
exit 0

# Step 6

# Skip a line between questions

Chapter Summary
The shell is a programming language. Programs written in this language are called
shell scripts, or simply scripts. Shell scripts provide the decision and looping control
structures present in high-level programming languages while allowing easy access to
system utilities and user programs. Shell scripts can use functions to modularize and
simplify complex tasks.
Control structures

The control structures that use decisions to select alternatives are if...then,
if...then...else, and if...then...elif. The case control structure provides a multiway
branch and can be used when you want to express alternatives using a simple patternmatching syntax.

Chapter Summary

511

The looping control structures are for...in, for, until, and while. These structures
perform one or more tasks repetitively.
The break and continue control structures alter control within loops: break transfers control out of a loop, and continue transfers control immediately to the top of
a loop.
The Here document allows input to a command in a shell script to come from within
the script itself.
File descriptors

The Bourne Again Shell provides the ability to manipulate file descriptors. Coupled
with the read and echo builtins, file descriptors allow shell scripts to have as much
control over input and output as do programs written in lower-level languages.

Variables

By default, variables are local to the process they are declared in; these variables are called
shell variables. You can use export to cause variables to be environment variables, which
are available to children of the process they are declared in.
The declare builtin assigns attributes, such as readonly, to bash variables. The Bourne
Again Shell provides operators to perform pattern matching on variables, provide default
values for variables, and evaluate the length of variables. This shell also supports array
variables and local variables for functions and provides built-in integer arithmetic, using
the let builtin and an expression syntax similar to that found in the C programming
language.

Builtins

Bourne Again Shell builtins include type, read, exec, trap, kill, and getopts. The type
builtin displays information about a command, including its location; read allows a
script to accept user input.
The exec builtin executes a command without creating a new process. The new command overlays the current process, assuming the same environment and PID number
of that process. This builtin executes user programs and other Linux commands
when it is not necessary to return control to the calling process.
The trap builtin catches a signal sent to the process running the script and allows
you to specify actions to be taken upon receipt of one or more signals. You can use
this builtin to cause a script to ignore the signal that is sent when the user presses
the interrupt key.
The kill builtin terminates a running program. The getopts builtin parses command-line
arguments, making it easier to write programs that follow standard Linux/OS X
conventions for command-line arguments and options.

Utilities in scripts

In addition to using control structures, builtins, and functions, shell scripts generally
call Linux/OS X utilities. The find utility, for instance, is commonplace in shell scripts
that search for files in the system hierarchy and can perform a wide range of tasks.

Expressions

There are two basic types of expressions: arithmetic and logical. Arithmetic expressions
allow you to do arithmetic on constants and variables, yielding a numeric result. Logical
(Boolean) expressions compare expressions or strings, or test conditions, to yield a true
or false result. As with all decisions within shell scripts, a true status is represented by the
value 0; false, by any nonzero value.

512 Chapter 10 Programming the Bourne Again Shell (bash)
Good programming
practices

A well-written shell script adheres to standard programming practices, such as
specifying the shell to execute the script on the first line of the script, verifying the
number and type of arguments that the script is called with, displaying a standard
usage message to report command-line errors, and redirecting all informational
messages to standard error.

Exercises
1. Rewrite the journal script of Chapter 8 (exercise 5, page 367) by adding
commands to verify that the user has write permission for a file named
journal-file in the user’s home directory, if such a file exists. The script
should take appropriate actions if journal-file exists and the user does not
have write permission to the file. Verify that the modified script works.
2. The special parameter "$@" is referenced twice in the out script (page 425).
Explain what would be different if the parameter "$*" were used in its
place.
3. Write a filter that takes a list of files as input and outputs the basename
(page 447) of each file in the list.
4. Write a function that takes a single filename as an argument and adds execute
permission to the file for the user.
a. When might such a function be useful?
b. Revise the script so it takes one or more filenames as arguments and adds
execute permission for the user for each file argument.
c. What can you do to make the function available every time you log in?
d. Suppose that, in addition to having the function available on subsequent
login sessions, you want to make the function available in your current
shell. How would you do so?
5. When might it be necessary or advisable to write a shell script instead of a
shell function? Give as many reasons as you can think of.
6. Write a shell script that displays the names of all directory files, but no other
types of files, in the working directory.
7. Write a script to display the time every 15 seconds. Read the date man page
and display the time, using the %r field descriptor. Clear the window (using
the clear command) each time before you display the time.
8. Enter the following script named savefiles, and give yourself execute
permission to the file:

Exercises 513
$ cat savefiles
#! /bin/bash
echo "Saving files in working directory to the file savethem."
exec > savethem
for i in *
do
echo
"==================================================="
echo "File: $i"
echo
"==================================================="
cat "$i"
done

a. Which error message do you receive when you execute this script?
Rewrite the script so that the error does not occur, making sure the output
still goes to savethem.
b. What might be a problem with running this script twice in the same
directory? Discuss a solution to this problem.
9. Read the bash man or info page, try some experiments, and answer the following
questions:
a. How do you export a function?
b. What does the hash builtin do?
c. What happens if the argument to exec is not executable?
10. Using the find utility, perform the following tasks:
a. List all files in the working directory and all subdirectories that have been
modified within the last day.
b. List all files you have read access to on the system that are larger than 1
megabyte.
c. Remove all files named core from the directory structure rooted at your
home directory.
d. List the inode numbers of all files in the working directory whose filenames
end in .c.
e. List all files you have read access to on the root filesystem that have been
modified in the last 30 days.
11. Write a short script that tells you whether the permissions for two files,
whose names are given as arguments to the script, are identical. If the permissions for the two files are identical, output the common permission field.
Otherwise, output each filename followed by its permission field. (Hint: Try
using the cut utility.)

514 Chapter 10 Programming the Bourne Again Shell (bash)

12. Write a script that takes the name of a directory as an argument and searches
the file hierarchy rooted at that directory for zero-length files. Write the names
of all zero-length files to standard output. If there is no option on the command
line, have the script delete the file after displaying its name, asking the user for
confirmation, and receiving positive confirmation. A –f (force) option on the
command line indicates that the script should display the filename but not ask
for confirmation before deleting the file.

Advanced Exercises
13. Write a script that takes a colon-separated list of items and outputs the
items, one per line, to standard output (without the colons).
14. Generalize the script written in exercise 13 so the character separating the
list items is given as an argument to the function. If this argument is absent,
the separator should default to a colon.
15. Write a function named funload that takes as its single argument the name
of a file containing other functions. The purpose of funload is to make all
functions in the named file available in the current shell; that is, funload
loads the functions from the named file. To locate the file, funload searches
the colon-separated list of directories given by the environment variable
FUNPATH. Assume the format of FUNPATH is the same as PATH and the
search of FUNPATH is similar to the shell’s search of the PATH variable.
16. Rewrite bundle (page 451) so the script it creates takes an optional list of
filenames as arguments. If one or more filenames are given on the command
line, only those files should be re-created; otherwise, all files in the shell
archive should be re-created. For example, suppose all files with the filename extension .c are bundled into an archive named srcshell, and you want
to unbundle just the files test1.c and test2.c. The following command will
unbundle just these two files:
$ bash srcshell test1.c test2.c

17. Which kind of links will the lnks script (page 427) not find? Why?
18. In principle, recursion is never necessary. It can always be replaced by an
iterative construct, such as while or until. Rewrite makepath (page 502) as
a nonrecursive function. Which version do you prefer? Why?
19. Lists are commonly stored in environment variables by putting a colon
(:) between each of the list elements. (The value of the PATH variable is
an example.) You can add an element to such a list by catenating the new
element to the front of the list, as in
PATH=/opt/bin:$PATH

Advanced Exercises

If the element you add is already in the list, you now have two copies of it
in the list. Write a shell function named addenv that takes two arguments:
(1) the name of a shell variable and (2) a string to prepend to the list that is
the value of the shell variable only if that string is not already an element of
the list. For example, the call
addenv PATH /opt/bin

would add /opt/bin to PATH only if that pathname is not already in PATH.
Be sure your solution works even if the shell variable starts out empty. Also
make sure you check the list elements carefully. If /usr/opt/bin is in PATH
but /opt/bin is not, the example just given should still add /opt/bin to
PATH. (Hint: You might find this exercise easier to complete if you first
write a function locate_field that tells you whether a string is an element in
the value of a variable.)
20. Write a function that takes a directory name as an argument and writes to
standard output the maximum of the lengths of all filenames in that directory.
If the function’s argument is not a directory name, write an error message to
standard output and exit with nonzero status.
21. Modify the function you wrote for exercise 20 to descend all subdirectories
of the named directory recursively and to find the maximum length of any
filename in that hierarchy.
22. Write a function that lists the number of ordinary files, directories, block special
files, character special files, FIFOs, and symbolic links in the working directory.
Do this in two different ways:
a. Use the first letter of the output of ls –l to determine a file’s type.
b. Use the file type condition tests of the [[ expression ]] syntax to determine
a file’s type.
23. Modify the quiz program (page 508) so that the choices for a question are
randomly arranged.

515

11
The Perl Scripting
Language

Chapter11
K

In This Chapter

Objectives

Introduction to Perl . . . . . . . . . . . . 518

After reading this chapter you should be able to:

Help . . . . . . . . . . . . . . . . . . . . . . . . 519

 Use perldoc to display Perl documentation

Running a Perl Program . . . . . . . . 522
Syntax . . . . . . . . . . . . . . . . . . . . . . 524

 Run a Perl program on the command line and from a
file

Variables . . . . . . . . . . . . . . . . . . . . 526

 Explain the use of the say function

Control Structures . . . . . . . . . . . . . 533

 Name and describe the three types of Perl variables

Working with Files. . . . . . . . . . . . . 542

 Write a Perl program that uses each type of variable

Sort . . . . . . . . . . . . . . . . . . . . . . . . 546

 Describe the Perl control structures

Subroutines. . . . . . . . . . . . . . . . . . 547

 Write programs that read from and write to files

Regular Expressions . . . . . . . . . . . 550

 Use regular expressions in a Perl program

CPAN Modules. . . . . . . . . . . . . . . . 555

 Write a Perl program that incorporates a CPAN
module

Examples . . . . . . . . . . . . . . . . . . . . 558

 Demonstrate several Perl functions

518 Chapter 11 The Perl Scripting Language

In 1987 Larry Wall created the Perl (Practical Extraction and Report Language) programming language for working with text. Perl uses syntax and concepts from awk,
sed, C, the Bourne Shell, Smalltalk, Lisp, and English. It was designed to scan and
extract information from text files and generate reports based on that information.
Since its introduction in 1987, Perl has expanded enormously—its documentation
growing up with it. Today, in addition to text processing, Perl is used for system
administration, software development, and general-purpose programming.
Perl code is portable because Perl has been implemented on many operating systems
(see www.cpan.org/ports). Perl is an informal, practical, robust, easy-to-use, efficient,
complete, and down-and-dirty language that supports procedural and object-oriented
programming. It is not necessarily elegant.
One of the things that distinguishes Perl from many other languages is its linguistic origins. In English you say, “I will buy a car if I win the lottery.” Perl
allows you to mimic that syntax. Another distinction is that Perl has singular
and plural variables, the former holding single values and the latter holding lists
of values.

Introduction to Perl
A couple of quotes from the manual shed light on Perl’s philosophy:
Many of Perl’s syntactic elements are optional. Rather than
requiring you to put parentheses around every function call and
declare every variable, you can often leave such explicit elements
off and Perl will frequently figure out what you meant. This is
known as Do What I Mean, abbreviated DWIM. It allows programmers to be lazy and to code in a style with which they are
comfortable.
The Perl motto is “There’s more than one way to do it.” Divining
how many more is left as an exercise to the reader.
One of Perl’s biggest assets is its support by thousands of third-party modules. The
Comprehensive Perl Archive Network (CPAN; www.cpan.org) is a repository for
many of the modules and other information related to Perl. See page 555 for information on downloading, installing, and using these modules in Perl programs.
The best way to learn Perl is to work with it. Copy and modify the programs in this
chapter until they make sense to you. Many system tools are written in Perl. The first
line of most of these tools begins with #!/usr/bin/perl, which tells the shell to pass
the program to Perl for execution. Most files that contain the string /usr/bin/perl are

Introduction to Perl

519

Perl programs. The following command uses grep to search the /usr/bin and /usr/sbin
directories recursively (–r) for files containing the string /usr/bin/perl; it lists many
local system tools written in Perl:
$ grep -r /usr/bin/perl /usr/bin /usr/sbin | head -4
/usr/bin/defoma-user:#! /usr/bin/perl -w
/usr/bin/pod2latex:#!/usr/bin/perl
/usr/bin/pod2latex:
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
/usr/bin/splain:#!/usr/bin/perl

Review these programs—they demonstrate how Perl is used in the real world. Copy
a system program to a directory you own before modifying it. Do not run a system
program while running with root privileges unless you know what you are doing.

More Information
Local man

pages: See the perl and perltoc man pages for lists of Perl man pages

Web

Perl home page: www.perl.com
CPAN: www.cpan.org
blog: perlbuzz.com

Book

Programming Perl, third edition, by Wall, Christiansen, & Orwant, O’Reilly &
Associates (July 2000)

Help
Perl is a forgiving language. As such, it is easy to write Perl code that runs but does
not perform as you intended. Perl includes many tools that can help you find coding
mistakes. The –w option and the use warnings statement can produce helpful diagnostic messages. The use strict statement (see the perldebtut man page) can impose
order on a program by requiring, among other things, that you declare variables
before you use them. When all else fails, you can use Perl’s builtin debugger to step
through a program. See the perldebtut and perldebug man pages for more
information.

perldoc
You might need to install the perl-doc package before you can use perldoc.
The perldoc utility locates and displays local Perl documentation. It is similar to man
(page 33) but specific to Perl. It works with files that include lines of pod (plain old
documentation), a clean and simple documentation language. When embedded in a
Perl program, pod enables you to include documentation for the entire program, not
just code-level comments.
Following is a simple Perl program that includes pod. The two lines following =cut
are the program; the rest is pod-format documentation.

520 Chapter 11 The Perl Scripting Language
$ cat pod.ex1.pl
#!/usr/bin/perl
=head1 A Perl Program to Say I
This simple Perl program includes documentation in B format.
The following B<=cut> command tells B that what follows
is not documentation.
=cut
# A Perl program
print "Hi there.\n";
=head1 pod Documentation Resumes with Any pod Command
See the B page for more information
on B and B for complete Perl documentation.

You can use Perl to run the program:
$ perl pod.ex1.pl
Hi there.

Or you can use perldoc to display the documentation:
$ perldoc pod.ex1.pl
POD.EX1(1)

User Contributed Perl Documentation

POD.EX1(1)

A Perl Program to Say Hi there.
This simple Perl program includes documentation in pod format.
following =cut command tells perldoc that what follows is not
documentation.

The

pod Documentation Resumes with Any pod Command
See the perldoc.perl.org/perlpod.html page for more information on pod
and perldoc.perl.org for complete Perl documentation.

perl v5.10.0

2012-10-14

POD.EX1(1)

Most publicly distributed modules and scripts, as well as Perl itself, include
embedded pod-format documentation. For example, the following command displays information about the Perl print function:
$ perldoc -f print
print FILEHANDLE LIST
print LIST
print
Prints a string or a list of strings. Returns true if
successful. FILEHANDLE may be a scalar variable containing the
name of or a reference to the filehandle, thus introducing one
level of indirection. (NOTE: If FILEHANDLE is a variable and the
next token is a term, it may be misinterpreted as an operator
unless you interpose a "+" or put parentheses around the
...

Introduction to Perl

521

Once you have installed a module (page 555), you can use perldoc to display documentation for that module. The following example shows perldoc displaying
information on the locally installed Timestamp::Simple module:
$ perldoc Timestamp::Simple
Timestamp::Simple(3) User Contributed Perl Documentation Timestamp::Simple(3)
NAME
Timestamp::Simple - Simple methods for timestamping
SYNOPSIS
use Timestamp::Simple qw(stamp);
print stamp, "\n";
...

Give the command man perldoc or perldoc perldoc to display the perldoc man page
and read more about this tool.

Make Perl programs readable
tip Although Perl has many shortcuts that are good choices for one-shot programming, this chapter
presents code that is easy to understand and easy to maintain.

Terminology
This section defines some of the terms used in this chapter.
Module

A Perl module is a self-contained chunk of Perl code, frequently containing several
functions that work together. A module can be called from another module or from
a Perl program. A module must have a unique name. To help ensure unique names,
Perl provides a hierarchical namespace (page 1081) for modules, separating component s of a n ame wit h dou ble colo ns ( :: ). Ex am ple mod ule na mes a re
Timestamp::Simple and WWW::Mechanize.

Distribution

A Perl distribution is a set of one or more modules that perform a task. You can search
for distributions and modules at search.cpan.org. Examples of distributions include
Timestamp-Simple (the Timestamp-Simple-1.01.tar.gz archive file contains the Timestamp::Simple module only) and WWW-Mechanize (WWW-Mechanize-1.34.tar.gz
contains the WWW::Mechanize module, plus supporting modules including
WWW::Mechanize::Link and WWW::Mechanize::Image).

Package

A package defines a Perl namespace. For example, in the variable with the name
$WWW::Mechanize::ex, $ex is a scalar variable in the WWW::Mechanize package,
where “package” is used in the sense of a namespace. Using the same name, such as
WWW::Mechanize, for a distribution, a package, and a module can be confusing.

Block

A block is zero or more statements, delimited by curly braces ({}), that defines a scope.
The shell control structure syntax explanations refer to these elements as commands.
See the if...then control structure on page 419 for an example.

Package variable

A package variable is defined within the package it appears in. Other packages can
refer to package variables by using the variable’s fully qualified name (for example,
$Text::Wrap::columns). By default, variables are package variables unless you define
them as lexical variables.

522 Chapter 11 The Perl Scripting Language
Lexical variable

A lexical variable, which is defined by preceding the name of a variable with the keyword my (see the tip on page 527), is defined only within the block or file it appears
in. Other languages refer to a lexical variable as a local variable. Because Perl 4 used
the keyword local with a different meaning, Perl 5 uses the keyword lexical in its
place. When programming using bash, variables that are not exported (page 468) are
shell variables and are local to the program they are used in.

List

A list is a series of zero or more scalars. The following list has three elements—two
numbers and a string:
(2, 4, 'Zach')

Array

An array is a variable that holds a list of elements in a defined order. In the following
line of code, @a is an array. See page 529 for more information about array variables.
@a = (2, 4, 'Zach')

Compound
statement

A compound statement is a statement made up of other statements. For example, the
if compound statement (page 534) incorporates an if statement that normally
includes other statements within the block it controls.

Running a Perl Program
There are several ways you can run a program written in Perl. The –e option enables
you to enter a program on the command line:
$ perl -e 'print "Hi there.\n"'
Hi there.

The –e option is a good choice for testing Perl syntax and running brief, one-shot programs. This option requires that the Perl program appear as a single argument on the
command line. The program must immediately follow this option—it is an argument
to this option. An easy way to write this type of program is to enclose the program
within single quotation marks.
Because Perl is a member of the class of utilities that take input from a file or standard
input (page 138), you can give the command perl and enter the program terminated
by CONTROL-D (end of file). Perl reads the program from standard input:
$ perl
print "Hi there.\n";
CONTROL-D

Hi there.

The preceding techniques are useful for quick, one-off command-line programs but
are not helpful for running more complex programs. Most of the time, a Perl program
is stored in a text file. Although not required, the file typically has a filename extension of .pl. Following is the same simple program used in the previous examples
stored in a file:

Introduction to Perl

523

$ cat simple.pl
print "Hi there.\n";

You can run this program by specifying its name as an argument to Perl:
$ perl simple.pl
Hi there.

Most commonly and similarly to most shell scripts, the file containing the Perl program is executable. In the following example, chmod (page 285) makes the simple2.pl
file executable. As explained on page 287, the #! at the start of the first line of the
file instructs the shell to pass the rest of the file to /usr/bin/perl for execution.
$ chmod 755 simple2.pl
$ cat simple2.pl
#!/usr/bin/perl -w
print "Hi there.\n";
$ ./simple2.pl
Hi there.

In this example, the simple2.pl program is executed as ./simple2.pl because the working
directory is not in the user’s PATH (page 308). The –w option tells Perl to issue warning
messages when it identifies potential errors in the code.

Perl Version 5.10
All examples in this chapter were run under Perl 5.10. Give the following command
to see which version of Perl the local system is running:
$ perl -v
This is perl 5, version 14, subversion 2 (v5.14.2) built for i386-linux-thread-multi
...
use feature 'say'

The say function is a Perl 6 feature available in Perl 5.10. It works the same way print
does, except it adds a NEWLINE (\n) at the end of each line it outputs. Some versions of
Perl require you to tell Perl explicitly that you want to use say. The use function in
the following example tells Perl to enable say. Try running this program without the
use line to see if the local version of Perl requires it.
$ cat 5.10.pl
use feature 'say';
say 'Output by say.';
print 'Output by print.';
say 'End.'
$ perl 5.10.pl
Output by say.
Output by print.End.
$

524 Chapter 11 The Perl Scripting Language
Earlier versions
of Perl

If you are running an earlier version of Perl, you will need to replace say in the examples
in this chapter with print and terminate the print statement with a quoted \n:
$ cat
print
print
print

5.8.pl
'Output by print in place of say.', "\n";
'Output by print.';
'End.', "\n";

$ perl 5.8.pl
Output by print in place of say.
Output by print.End.

Syntax
This section describes the major components of a Perl program.
Statements

A Perl program comprises one or more statements, each terminated by a semicolon
(;). These statements are free-form with respect to whitespace (page 1101), except for
whitespace within quoted strings. Multiple statements can appear on a single line,
each terminated by a semicolon. The following programs are equivalent. The first
occupies two lines, the second only one; look at the differences in the spacing around
the equal and plus signs. See use feature 'say' (on the previous page) if these programs
complain about say not being available.
$ cat statement1.pl
$n=4;
say "Answer is ", $n + 2;
$ perl statement1.pl
Answer is 6
$ cat statement2.pl
$n = 4; say "Answer is ", $n+2;
$ perl statement2.pl
Answer is 6

Expressions

The syntax of Perl expressions frequently corresponds to the syntax of C expressions
but is not always the same. Perl expressions are covered in examples throughout this
chapter.

Quotation marks

All character strings must be enclosed within single or double quotation marks. Perl
differentiates between the two types of quotation marks in a manner similar to the
way the shell does (page 302): Double quotation marks allow Perl to interpolate
enclosed variables and interpret special characters such as \n (NEWLINE), whereas single
quotation marks do not. Table 11-1 lists some of Perl’s special characters.
The following example demonstrates how different types of quotation marks, and
the absence of quotation marks, affect Perl in converting scalars between numbers
and strings. The single quotation marks in the first print statement prevent Perl from
interpolating the $string variable and from interpreting the \n special character. The
leading \n in the second print statement forces the output of that statement to appear
on a new line.

Introduction to Perl
$ cat string1.pl
$string="5";

525

# $string declared as a string, but it will not matter

print '$string+5\n';

#
#
print "\n$string+5\n"; #
#
print $string+5, "\n"; #
#
#

Perl displays $string+5 literally because of
the single quotation marks
Perl interpolates the value of $string as a string
because of the double quotation marks
Lack of quotation marks causes Perl to interpret
$string as a numeric variable and to add 5;
the \n must appear between double quotation marks

$ perl string1.pl
$string+5\n
5+5
10
Slash

By default, regular expressions are delimited by slashes (/). The following example tests
whether the string hours contains the pattern our; see page 550 for more information
on regular expression delimiters in Perl.
$ perl -e 'if ("hours" =~ /our/) {say "yes";}'

The local version of Perl might require use feature 'say' (page 523) to work properly:
$ perl -e 'use feature "say"; if ("hours" =~ /our/) {say "yes";}'
Backslash

Within a string enclosed between double quotation marks, a backslash escapes
(quotes) another backslash. Thus Perl displays "\\n" as \n. Within a regular expression, Perl does not expand a metacharacter preceded by a backslash. See the
string1.pl program above.

Comments

As in the shell, a comment in Perl begins with a hashmark (#) and ends at the end of
the line (just before the NEWLINE character).

Special characters

Table 11-1 lists some of the characters that are special within strings in Perl. Perl
interpolates these characters when they appear between double quotation marks but
not when they appear between single quotation marks. Table 11-3 on page 552 lists
metacharacters, which are special within regular expressions.

Table 11-1

Some Perl special characters

Character

When within double quotation marks, interpolated as

\0xx (zero)

The ASCII character whose octal value is xx

\a

An alarm (bell or beep) character (ASCII 7)

\e

An ESCAPE character (ASCII 27)

\n

A NEWLINE character (ASCII 10)

\r

A RETURN character (ASCII 13)

\t

A TAB character (ASCII 9)

526 Chapter 11 The Perl Scripting Language

Variables
Like human languages, Perl distinguishes between singular and plural data. Strings and
numbers are singular; lists of strings or numbers are plural. Perl provides three types
of variables: scalar (singular), array (plural), and hash (plural; also called associative
arrays). Perl identifies each type of variable by a special character preceding its name.
The name of a scalar variable begins with a dollar sign ($), an array variable begins with
an at sign (@), and a hash variable begins with a percent sign (%). As opposed to the
way the shell identifies variables, Perl requires the leading character to appear each time
you reference a variable, including when you assign a value to the variable:
$ name="Zach" ; echo "$name"
Zach

(bash)

$ perl -e '$name="Zach" ; print "$name\n";'
Zach

(perl)

Variable names, which are case sensitive, can include letters, digits, and the underscore character (_). A Perl variable is a package variable (page 521) unless it is
preceded by the keyword my, in which case it is a lexical variable (page 522) that is
defined only within the block or file it appears in. See “Subroutines” on page 547 for
a discussion of the locality of Perl variables.

Lexical variables overshadow package variables
caution If a lexical variable and a package variable have the same name, within the block or file in which
the lexical variable is defined, the name refers to the lexical variable and not to the package variable.

A Perl variable comes into existence when you assign a value to it—you do not
need to define or initialize a variable, although it might make a program more
understandable to do so. Normally, Perl does not complain when you reference an
uninitialized variable:
$ cat variable1.pl
#!/usr/bin/perl
my $name = 'Sam';
print "Hello, $nam, how are you?\n"; # Typo, e left off of name
$ ./variable1.pl
Hello, , how are you?
use strict

Include use strict to cause Perl to require variables to be declared before being
assigned values. See the perldebtut man page for more information. When you
include use strict in the preceding program, Perl displays an error message:
$ cat variable1b.pl
#!/usr/bin/perl
use strict;
my $name = 'Sam';
print "Hello, $nam, how are you?\n"; # Typo, e left off of name

Variables 527
$ ./variable1b.pl
Global symbol "$nam" requires explicit package name at ./variable1b.pl line 4.
Execution of ./variable1b.pl aborted due to compilation errors.

Using my: lexical versus package variables
tip In variable1.pl, $name is declared to be lexical by preceding its name with the keyword my; its
name and value are known within the file variable1.pl only. Declaring a variable to be lexical limits
its scope to the block or file it is defined in. Although not necessary in this case, declaring variables
to be lexical is good practice. This habit becomes especially useful when you write longer programs, subroutines, and packages, where it is harder to keep variable names unique. Declaring all
variables to be lexical is mandatory when you write routines that will be used within code written
by others. This practice allows those who work with your routines to use whichever variable names
they like, without regard to which variable names you used in the code you wrote.
The shell and Perl scope variables differently. In the shell, if you do not export a variable to make it
an environment variable, it is a shell variable and is local to the routine it is used in (page 468). In
Perl, if you do not use my to declare a variable to be lexical, it is defined for the package it appears in.
–w and
use warnings

The –w option and the use warnings statement perform the same function: They
cause Perl to generate an error message when it detects a syntax error. In the following example, Perl displays two warnings. The first tells you that you have used
the variable named $nam once, on line 3, which probably indicates an error. This
message is helpful when you mistype the name of a variable. Under Perl 5.10, the
second warning specifies the name of the uninitialized variable. This warning
refers to the same problem as the first warning. Although it is not hard to figure
out which of the two variables is undefined in this simple program, doing so in a
complex program can take a lot of time.

$ cat variable1a.pl
#!/usr/bin/perl -w
my $name = 'Sam';
print "Hello, $nam, how are you?\n"; # Prints warning because of typo and -w
$ ./variable1a.pl
Name "main::nam" used only once: possible typo at ./variable1a.pl line 3.
Use of uninitialized value $nam in concatenation (.) or string at ./variable1a.pl line 3.
Hello, , how are you?

You can also use –w on the command line. If you use –e as well, make sure the
argument that follows this option is the program you want to execute (e.g., –e –w
does not work). See the tip on page 550.
$ perl -w -e 'my $name = "Sam"; print "Hello, $nam, how are you?\n"'
Name "main::nam" used only once: possible typo at -e line 1.
Use of uninitialized value $nam in concatenation (.) or string at -e line 1.
Hello, , how are you?
undef and defined

An undefined variable has the special value undef, which evaluates to zero (0) in a
numeric expression and expands to an empty string ("") when you print it. Use the
defined function to determine whether a variable has been defined. The following
example, which uses constructs explained later in this chapter, calls defined with an

528 Chapter 11 The Perl Scripting Language

argument of $name and negates the result with an exclamation point (!). The result
is that the print statement is executed if $name is not defined.
$ cat variable2.pl
#!/usr/bin/perl
if (!defined($name)) {
print "The variable '\$name' is not defined.\n"
};
$ ./variable2.pl
The variable '$name' is not defined.

Because the –w option causes Perl to warn you when you reference an undefined
variable, using this option would generate a warning.

Scalar Variables
A scalar variable has a name that begins with a dollar sign ($) and holds a single string
or number: It is a singular variable. Because Perl converts between the two when necessary, you can use strings and numbers interchangeably. Perl interprets scalar
variables as strings when it makes sense to interpret them as strings, and as numbers
when it makes sense to interpret them as numbers. Perl's judgment in these matters
is generally good.
The following example shows some uses of scalar variables. The first two lines of
code (lines 3 and 4) assign the string Sam to the scalar variable $name and the numbers 5 and 2 to the scalar variables $n1 and $n2, respectively. In this example,
multiple statements, each terminated with a semicolon (;), appear on a single line. See
use feature 'say' on page 523 if this program complains about say not being available.
$ cat scalars1.pl
#!/usr/bin/perl -w
$name = "Sam";
$n1 = 5; $n2 = 2;
say
say
say
say
say

"$name $n1 $n2";
"$n1 + $n2";
'$name $n1 $n2';
$n1 + $n2, " ", $n1
$name + $n1;

*

$n2;

$ ./scalars1.pl
Sam 5 2
5 + 2
$name $n1 $n2
7 10
Argument "Sam" isn't numeric in addition (+) at ./scalers1.pl line 11.
5
Double quotation
marks

The first say statement sends the string enclosed within double quotation marks to
standard output (the screen unless you redirect it). Within double quotation marks,

Variables 529

Perl expands variable names to the value of the named variable. Thus the first say
statement displays the values of three variables, separated from each other by SPACEs.
The second say statement includes a plus sign (+). Perl does not recognize operators
such as + within either type of quotation marks. Thus Perl displays the plus sign
between the values of the two variables.
Single quotation
marks

The third say statement sends the string enclosed within single quotation marks to
standard output. Within single quotation marks, Perl interprets all characters literally, so it displays the string exactly as it appears between the single quotation
marks.
In the fourth say statement, the operators are not quoted, and Perl performs the addition and multiplication as specified. Without the quoted SPACE, Perl would catenate the
two numbers (710). The last say statement attempts to add a string and a number;
the –w option causes Perl to display an error message before displaying 5. The 5
results from adding Sam, which Perl evaluates as 0 in a numerical context, to the
number 5 (0 + 5 = 5).

Array Variables
An array variable is an ordered container of scalars whose name begins with an at
sign (@) and whose first element is numbered zero (zero-based indexing). Because an
array can hold zero or more scalars, it is a plural variable. Arrays are ordered; hashes
(page 532) are unordered. In Perl, arrays grow as needed. If you reference an uninitialized element of an array, such as an element beyond the end of the array, Perl
returns undef.
The first statement in the following program assigns the values of two numbers
and a string to the array variable named @arrayvar. Because Perl uses zero-based
indexing, the first say statement displays the value of the second element of the
array (the element with the index 1). This statement specifies the variable
$arrayvar[1] as a scalar (singular) because it refers to a single value. The second
say statement specifies the variable @arrayvar[1,2] as a list (plural) because it refers
to multiple values (the elements with the indexes 1 and 2).
$ cat arrayvar1.pl
#!/usr/bin/perl -w
@arrayvar = (8, 18, "Sam");
say $arrayvar[1];
say "@arrayvar[1,2]";
$ perl arrayvar1.pl
18
18 Sam

The next example shows a couple of ways to determine the length of an array and
presents more information on using quotation marks within print statements. The
first assignment statement in arrayvar2.pl assigns values to the first six elements of
the @arrayvar2 array. When used in a scalar context, Perl evaluates the name of an

530 Chapter 11 The Perl Scripting Language

array as the length of the array. The second assignment statement assigns the number
of elements in @arrayvar2 to the scalar variable $num.
$ cat arrayvar2.pl
#!/usr/bin/perl -w
@arrayvar2 = ("apple", "bird", 44, "Tike", "metal", "pike");
$num = @arrayvar2;
print "Elements: ", $num, "\n";
print "Elements: $num\n";

# number of elements in array
# two equivalent print statements

print "Last: $#arrayvar2\n";

# index of last element in array

$ ./arrayvar2.pl
Elements: 6
Elements: 6
Last: 5

The first two print statements in arrayvar2.pl display the string Elements:, a SPACE, the
value of $num, and a NEWLINE, each using a different syntax. The first of these statements displays three values, using commas to separate them within the print
statement. The second print statement has one argument and demonstrates that Perl
expands a variable (replaces the variable with its value) when the variable is enclosed
within double quotation marks.
$#array

The final print statement in arrayvar2.pl shows that Perl evaluates the variable
$#array as the index of the last element in the array named array. Because Perl uses
zero-based indexing by default, this variable evaluates to one less than the number
of elements in the array.
The next example works with elements of an array and uses a dot (. ; the string
catenation operator). The first two lines assign values to four scalar variables.
The third line shows that you can assign values to array elements using scalar
variables, arithmetic, and catenated strings. The dot operator catenates strings,
so Perl evaluates $va . $vb as Sam catenated with uel—that is, as Samuel (see the
output of the last print statement).
$ cat arrayvar3.pl
#!/usr/bin/perl -w
$v1 = 5; $v2 = 8;
$va = "Sam"; $vb = "uel";
@arrayvar3 = ($v1, $v1 * 2, $v1
print $arrayvar3[2], "\n";
print @arrayvar3[2,4], "\n";
print @arrayvar3[2..4], "\n\n";

*

$v2, "Max", "Zach", $va . $vb);
# one element of an array is a scalar
# two elements of an array are a list
# a slice

Variables 531
print "@arrayvar3[2,4]", "\n";
# a list, elements separated by SPACEs
print "@arrayvar3[2..4]", "\n\n"; # a slice, elements separated by SPACEs
print "@arrayvar3\n";

# an array, elements separated by SPACEs

$ ./arrayvar3.pl
40
40Zach
40MaxZach
40 Zach
40 Max Zach
5 10 40 Max Zach Samuel

The first print statement in arrayvar3.pl displays the third element (the element with
an index of 2) of the @arrayvar3 array. This statement uses $ in place of @ because
it refers to a single element of the array. The subsequent print statements use @
because they refer to more than one element. Within the brackets that specify an
array subscript, two subscripts separated by a comma specify two elements of an
array. The second print statement, for example, displays the third and fifth elements
of the array.
Array slice

When you separate two elements of an array with two dots (..; the range operator),
Perl substitutes all elements between and including the two specified elements. A portion of an array comprising elements is called a slice. The third print statement in the
preceding example displays the elements with indexes 2, 3, and 4 (the third, fourth,
and fifth elements) as specified by 2..4. Perl puts no SPACEs between the elements it
displays.
Within a print statement, when you enclose an array variable, including its subscripts, within double quotation marks, Perl puts a SPACE between each of the
elements. The fourth and fifth print statements in the preceding example illustrate
this syntax. The last print statement displays the entire array, with elements separated
by SPACEs.

shift, push,
pop, and splice

The next example demonstrates several functions you can use to manipulate arrays.
The example uses the @colors array, which is initialized to a list of seven colors. The
shift function returns and removes the first element of an array, push adds an element
to the end of an array, and pop returns and removes the last element of an array. The
splice function replaces elements of an array with another array; in the example,
splice inserts the @ins array starting at index 1 (the second element), replacing two
elements of the array. See use feature 'say' on page 523 if this program complains
about say not being available. See the perlfunc man page for more information on the
functions described in this paragraph.

532 Chapter 11 The Perl Scripting Language
$ cat ./shift1.pl
#!/usr/bin/perl -w
@colors = ("red", "orange", "yellow", "green", "blue", "indigo", "violet");
say "
Display array: @colors";
say " Display and remove first element of array: ", shift (@colors);
say "
Display remaining elements of array: @colors";
push (@colors, "WHITE");
say "
Add element to end of array and display: @colors";
say "
say "

Display and remove last element of array: ", pop (@colors);
Display remaining elements of array: @colors";

@ins = ("GRAY", "FERN");
splice (@colors, 1, 2, @ins);
say "Replace second and third elements of array: @colors";
$ ./shift1.pl
Display array:
Display and remove first element of array:
Display remaining elements of array:
Add element to end of array and display:
Display and remove last element of array:
Display remaining elements of array:
Replace second and third elements of array:

red orange yellow green blue indigo violet
red
orange yellow green blue indigo violet
orange yellow green blue indigo violet WHITE
WHITE
orange yellow green blue indigo violet
orange GRAY FERN blue indigo violet

Hash Variables
A hash variable, sometimes called an associative array variable, is a plural data
structure that holds an array of key-value pairs. It uses strings as keys (indexes)
and is optimized to return a value quickly when given a key. Each key must be a
unique scalar. Hashes are unordered; arrays (page 529) are ordered. When you
assign a hash to a list, the key-value pairs are preserved, but their order is neither
alphabetical nor the order in which they were inserted into the hash; instead, the
order is effectively random.
Perl provides two syntaxes to assign values to a hash. The first uses a single assignment
statement for each key-value pair:
$ cat hash1.pl
#!/usr/bin/perl -w
$hashvar1{boat} = "tuna";
$hashvar1{"number five"} = 5;
$hashvar1{4} = "fish";
@arrayhash1 = %hashvar1;
say "@arrayhash1";
$ ./hash1.pl
boat tuna 4 fish number five 5

Control Structures

533

Within an assignment statement, the key is located within braces to the left of the
equal sign; the value is on the right side of the equal sign. As illustrated in the preceding example, keys and values can take on either numeric or string values. You do
not need to quote string keys unless they contain SPACEs. This example also shows that
you can display the keys and values held by a hash, each separated from the next by
a SPACE, by assigning the hash to an array variable and then printing that variable
enclosed within double quotation marks.
The next example shows the other way of assigning values to a hash and illustrates
how to use the keys and values functions to extract keys and values from a hash.
After assigning values to the %hash2 hash, hash2.pl calls the keys function with an
argument of %hash2 and assigns the resulting list of keys to the @array_keys array.
The program then uses the values function to assign values to the @array_values
array.
$ cat hash2.pl
#!/usr/bin/perl -w
%hash2 = (
boat => "tuna",
"number five" => 5,
4 => "fish",
);
@array_keys = keys(%hash2);
say " Keys: @array_keys";
@array_values = values(%hash2);
say "Values: @array_values";
$ ./hash2.pl
Keys: boat 4 number five
Values: tuna fish 5

Because Perl automatically quotes a single word that appears to the left of the =>
operator, you do not need quotation marks around boat in the third line of this program. However, removing the quotation marks from around number five would
generate an error because the string contains a SPACE.

Control Structures
Control flow statements alter the order of execution of statements within a Perl program. Starting on page 418, Chapter 10 discusses bash control structures in detail
and includes flow diagrams. Perl control structures perform the same functions as
their bash counterparts, although the two languages use different syntaxes. The
description of each control structure in this section references the discussion of the
same control structure under bash.

534 Chapter 11 The Perl Scripting Language

In this section, the bold italic words in the syntax description are the items you supply
to cause the structure to have the desired effect, the nonbold italic words are the keywords Perl uses to identify the control structure, and {...} represents a block
(page 521) of statements. Many of these structures use an expression, denoted as
expr, to control their execution. See if/unless (next) for an example and explanation
of a syntax description.

if/unless
The if and unless control structures are compound statements that have the following
syntax:
if (expr) {...}
unless (expr) {...}
These structures differ only in the sense of the test they perform. The if structure executes the block of statements if expr evaluates to true; unless executes the block of
statements unless expr evaluates to true (i.e., if expr is false).
The if appears in nonbold type because it is a keyword; it must appear exactly as
shown. The expr is an expression; Perl evaluates it and executes the block (page 521)
of statements represented by {...} if the expression evaluates as required by the control
structure.
File test operators

The expr in the following example, –r memo1, uses the –r file test operator to determine if a file named memo1 exists in the working directory and if the file is readable.
Although this operator tests only whether you have read permission for the file, the
file must exist for you to have read permission; thus it implicitly tests that the file is
present. (Perl uses the same file test operators as bash; see Table 10-1 on page 421.)
If this expression evaluates to true, Perl executes the block of statements (in this case
one statement) between the braces. If the expression evaluates to false, Perl skips the
block of statements. In either case, Perl then exits and returns control to the shell.
$ cat if1.pl
#!/usr/bin/perl -w
if (-r "memo1") {
say "The file 'memo1' exists and is readable.";
}
$ ./if1.pl
The file 'memo1' exists and is readable.

Following is the same program written using the postfix if syntax. Which syntax you
use depends on which part of the statement is more important to someone reading
the code.
$ cat if1a.pl
#!/usr/bin/perl -w
say "The file 'memo1' exists and is readable." if (-r "memo1");

Control Structures

535

The next example uses a print statement to display a prompt on standard output and
uses the statement $entry = <>; to read a line from standard input and assign the line
to the variable $entry. Reading from standard input, working with other files, and
use of the magic file handle (<>) for reading files specified on the command line are
covered on page 542.
Comparison
operators

Perl uses different operators to compare numbers from those it uses to compare
strings. Table 11-2 lists numeric and string comparison operators. In the following
example, the expression in the if statement uses the == numeric comparison operator
to compare the value the user entered and the number 28. This operator performs a
numeric comparison, so the user can enter 28, 28.0, or 00028 and in all cases the
result of the comparison will be true. Also, because the comparison is numeric, Perl
ignores both the whitespace around and the NEWLINE following the user’s entry. The –w
option causes Perl to issue a warning if the user enters a nonnumeric value and the
program uses that value in an arithmetic expression; without this option Perl silently
evaluates the expression as false.
$ cat if2.pl
#!/usr/bin/perl -w
print "Enter 28: ";
$entry = <>;
if ($entry == 28) {
# use == for a numeric comparison
print "Thank you for entering 28.\n";
}
print "End.\n";
$ ./if2.pl
Enter 28: 28.0
Thank you for entering 28.
End.

Table 11-2

Comparison operators

Numeric
operators

String
operators

Value returned based on the relationship between the
values preceding and following the operator

==

eq

True if equal

!=

ne

True if not equal

<

lt

True if less than

>

gt

True if greater than

<=

le

True if less than or equal

>=

ge

True if greater than or equal

<=>

cmp

0 if equal, 1 if greater than, –1 if less than

The next program is similar to the preceding one, except it tests for equality between
two strings. The chomp function (page 544) removes the trailing NEWLINE from the

536 Chapter 11 The Perl Scripting Language

user’s entry—without this function the strings in the comparison would never match.
The eq comparison operator compares strings. In this example the result of the string
comparison is true when the user enters the string five. Leading or trailing whitespace
will yield a result of false, as would the string 5, although none of these entries would
generate a warning because they are legitimate strings.
$ cat if2a.pl
#!/usr/bin/perl -w
print "Enter the word 'five': ";
$entry = <>;
chomp ($entry);
if ($entry eq "five") {
# use eq for a string comparison
print "Thank you for entering 'five'.\n";
}
print "End.\n";
$ ./if2a.pl
Enter the word 'five': five
Thank you for entering 'five'.
End.

if...else
The if...else control structure is a compound statement that is similar to the bash
if...then...else control structure (page 423). It implements a two-way branch using the
following syntax:
if (expr) {...} else {...}
die

The next program prompts the user for two different numbers and stores those
numbers in $num1 and $num2. If the user enters the same number twice, an if
structure executes a die function, which sends its argument to standard error and
aborts program execution.
If the user enters different numbers, the if...else structure reports which number is
larger. Because expr performs a numeric comparison, the program accepts numbers
that include decimal points.
$ cat ifelse.pl
#!/usr/bin/perl -w
print "Enter a number: ";
$num1 = <>;
print "Enter another, different number: ";
$num2 = <>;
if ($num1 == $num2) {
die ("Please enter two different numbers.\n");
}
if ($num1 > $num2) {
print "The first number is greater than the second number.\n";
}
else {
print "The first number is less than the second number.\n";
}

Control Structures

537

$ ./ifelse.pl
Enter a number: 8
Enter another, different number: 8
Please enter two different numbers.
$ ./ifelse.pl
Enter a number: 5.5
Enter another, different number: 5
The first number is greater than the second number.

if...elsif...else
Similar to the bash if...then...elif control structure (page 425), the Perl if...elsif...else
control structure is a compound statement that implements a nested set of if...else
structures using the following syntax:
if (expr) {...} elsif {...} ... else {...}
The next program implements the functionality of the preceding ifelse.pl program
using an if...elsif...else structure. A print statement replaces the die statement
because the last statement in the program displays the error message; the program
terminates after executing this statement anyway. You can use the STDERR handle
(page 542) to cause Perl to send this message to standard error instead of standard
output.
$ cat ifelsif.pl
#!/usr/bin/perl -w
print "Enter a number: ";
$num1 = <>;
print "Enter another, different
$num2 = <>;
if ($num1 > $num2) {
print "The first number
}
elsif ($num1 < $num2) {
print "The first number
}
else {
print "Please enter two
}

number: ";

is greater than the second number.\n";

is less than the second number.\n";

different numbers.\n";

foreach/for
The Perl foreach and for keywords are synonyms; you can replace one with the
other in any context. These structures are compound statements that have two
syntaxes. Some programmers use one syntax with foreach and the other syntax
with the for, although there is no need to do so. This book uses foreach with both
syntaxes.

538 Chapter 11 The Perl Scripting Language

foreach: Syntax 1
The first syntax for the foreach structure is similar to the shell’s for...in structure
(page 431):
foreach|for [var] (list) {...}
where list is a list of expressions or variables. Perl executes the block of statements
once for each item in list, sequentially assigning to var the value of one item in list
on each iteration, starting with the first item. If you do not specify var, Perl assigns
values to the $_ variable (page 542).
The following program demonstrates a simple foreach structure. On the first pass
through the loop, Perl assigns the string Mo to the variable $item and the say statement displays the value of this variable followed by a NEWLINE. On the second and third
passes through the loop, $item is assigned the value of Larry and Curly. When there
are no items left in the list, Perl continues with the statement following the foreach
structure. In this case, the program terminates. See use feature 'say' on page 523 if
this program complains about say not being available.
$ cat foreach.pl
foreach $item ("Mo", "Larry", "Curly") {
say "$item says hello.";
}
$ perl foreach.pl
Mo says hello.
Larry says hello.
Curly says hello.

Using $_ (page 542), you can write this program as follows:
$ cat foreacha.pl
foreach ("Mo", "Larry", "Curly") {
say "$_ says hello.";
}

Following is the program using an array:
$ cat foreachb.pl
@stooges = ("Mo", "Larry", "Curly");
foreach (@stooges) {
say "$_ says hello.";
}

Following is the program using the foreach postfix syntax:
$ cat foreachc.pl
@stooges = ("Mo", "Larry", "Curly");
say "$_ says hello." foreach @stooges;

The loop variable ($item and $_ in the preceding examples) references the elements
in the list within the parentheses. When you modify the loop variable, you modify
the element in the list. The uc function returns an upshifted version of its argument.

Control Structures

539

The next example shows that modifying the loop variable $stooge modifies the
@stooges array:
$ cat foreachd.pl
@stooges = ("Mo", "Larry", "Curly");
foreach $stooge (@stooges) {
$stooge = uc $stooge;
say "$stooge says hello.";
}
say "$stooges[1] is uppercase"
$ perl foreachd.pl
MO says hello.
LARRY says hello.
CURLY says hello.
LARRY is uppercase

See page 545 for an example that loops through command-line arguments.

last and next
Perl’s last and next statements allow you to interrupt a loop; they are analogous to
the Bourne Again Shell’s break and continue statements (page 441). The last statement transfers control to the statement following the block of statements controlled
by the loop structure, terminating execution of the loop. The next statement transfers
control to the end of the block of statements, which continues execution of the loop
with the next iteration.
In the following program, the if structure tests whether $item is equal to the string
two; if it is, the structure executes the next command, which skips the say statement
and continues with the next iteration of the loop. If you replaced next with last, Perl
would exit from the loop and not display three. See use feature 'say' on page 523 if
this program complains about say not being available.
$ cat foreach1.pl
foreach $item ("one", "two", "three") {
if ($item eq "two") {
next;
}
say "$item";
}
$ perl foreach1.pl
one
three

foreach: Syntax 2
The second syntax for the foreach structure is similar to the C for structure:
foreach|for (expr1; expr2; expr3) {...}
The expr1 initializes the foreach loop; Perl evaluates expr1 one time, before it executes the block of statements. The expr2 is the termination condition; Perl evaluates

540 Chapter 11 The Perl Scripting Language

it before each pass through the block of statements and executes the block of statements if expr2 evaluates as true. Perl evaluates expr3 after each pass through the
block of statements—it typically increments a variable that is part of expr2.
In the next example, the foreach2.pl program prompts for three numbers; displays
the first number; repeatedly increments this number by the third number, displaying
each result until the result would be greater than the second number; and quits. See
page 543 for a discussion of the magic file handle (<>).
$ cat foreach2.pl
#!/usr/bin/perl -w
print "Enter starting number: ";
$start = <>;
print "Enter ending number: ";
$end = <>;
print "Enter increment: ";
$incr = <>;
if ($start >= $end || $incr < 1) {
die ("The starting number must be less than the ending number\n",
"and the increment must be greater than zero.\n");
}
foreach ($count = $start+0; $count <= $end; $count += $incr) {
say "$count";
}
$ ./foreach2.pl
Enter starting number: 2
Enter ending number: 10
Enter increment: 3
2
5
8

After prompting for three numbers, the preceding program tests whether the starting
number is greater than or equal to the ending number or if the increment is less than
1. The || is a Boolean OR operator; the expression within the parentheses following
if evaluates to true if either the expression before or the expression after this operator
evaluates to true.
The foreach statement begins by assigning the value of $start+0 to $count. Adding 0
(zero) to the string $start forces Perl to work in a numeric context, removing the trailing NEWLINE when it performs the assignment. Without this fix, the program would
display an extra NEWLINE following the first number it displayed.

while/until
The while (page 435) and until (page 439) control structures are compound statements that implement conditional loops using the following syntax:

Control Structures

541

while (expr) {...}
until (expr) {...}
These structures differ only in the sense of their termination conditions. The while
structure repeatedly executes the block of statements while expr evaluates to true;
until continues until expr evaluates to true (i.e., while expr remains false).
The following example demonstrates one technique for reading and processing input
until there is no more input. Although this example shows input coming from the user
(standard input), the technique works the same way for input coming from a file (see
the example on page 544). The user enters CONTROL-D on a line by itself to signal the end
of file.
In this example, expr is $line = <>. This statement uses the magic file handle (<>;
page 543) to read one line from standard input and assigns the string it reads to the
$line variable. This statement evaluates to true as long as it reads data. When it
reaches the end of file, the statement evaluates to false. The while loop continues to
execute the block of statements (in this example, only one statement) as long as there
is data to read.
$ cat while1.pl
#!/usr/bin/perl -w
$count = 0;
while ($line = <>) {
print ++$count, ". $line";
}
print "\n$count lines entered.\n";
$ ./while1.pl
Good Morning.
1. Good Morning.
Today is Monday.
2. Today is Monday.
CONTROL-D

2 lines entered.

In the preceding example, $count keeps track of the number of lines the user enters.
Putting the ++ increment operator before a variable (++$count; called a preincrement
operator) increments the variable before Perl evaluates it. Alternatively, you could
initialize $count to 1 and increment it with $count++ (postincrement), but then in the
final print statement $count would equal one more than the number of lines entered.
$.

The $. variable keeps track of the number of lines of input a program has read. Using
$. you can rewrite the previous example as follows:
$ cat while1a.pl
#!/usr/bin/perl -w
while ($line = <>) {
print $., ". $line";
}
print "\n$. lines entered.\n";

542 Chapter 11 The Perl Scripting Language
$_

Frequently you can simplify Perl code by using the $_ variable. You can use $_ many
places in a Perl program—think of $_ as meaning it, the object of what you are doing.
It is the default operand for many operations. For example, the following section of
code processes a line using the $line variable. It reads a line into $line, removes any
trailing NEWLINE from $line using chomp (page 544), and checks whether a regular
expression matches $line.
while (my $line = <>) {
chomp $line;
if ($line =~ /regex/) ...
}

You can rewrite this code by using $_ to replace $line:
while (my $_ = <>) {
chomp $_;
if ($_ =~ /regex/) ...
}

Because $_ is the default operand in each of these instances, you can also omit $_
altogether:
while (<>) {
chomp;
if (/regex/) ...
}

# read into $_
# chomp $_
# if $_ matches regex

Working with Files
Opening a file and
assigning a handle

A handle is a name you can use in a Perl program to refer to a file or process that is
open for reading and/or writing. When you are working with the shell, handles are
referred to as file descriptors (page 452). As when you are working with the shell, the
kernel automatically opens handles for standard input (page 133), standard output
(page 133), and standard error (page 282) before it runs a program. The kernel closes
these descriptors after a program finishes running. The names for these handles are
STDIN, STDOUT, and STDERR, respectively. You must manually open handles to
read from or write to other files or processes. The syntax of an open statement is
open (file-handle, ['mode',] "file-ref");
where file-handle is the name of the handle or a variable you will use in the program
to refer to the file or process named by file-ref. If you omit mode or specify a mode
of <, Perl opens the file for input (reading). Specify mode as > to truncate and write
to a file or as >> to append to a file.
See page 558 for a discussion of reading from and writing to processes.

Writing to a file

The print function writes output to a file or process. The syntax of a print statement is

Working with Files 543

print [file-handle] "text";
where file-handle is the name of the handle you specified in an open statement and
text is the information you want to output. The file-handle can also be STDOUT or
STDERR, as explained earlier. Except when you send information to standard output, you must specify a handle in a print statement. Do not place a comma after filehandle. Also, do not enclose arguments to print within parentheses because doing so
can create problems.
Reading from a file

The following expression reads one line, including the
process associated with file-handle:

NEWLINE

(\n), from the file or


This expression is typically used in a statement such as
$line = ;

which reads into the variable $line one line from the file or process identified by the
handle IN.
Magic file
handle (<>)

To facilitate reading from files named on the command line or from standard input,
Perl provides the magic file handle. This book uses this file handle in most examples.
In place of the preceding line, you can use
$line = <>;

This file handle causes a Perl program to work like many Linux utilities: It reads from
standard input unless the program is called with one or more arguments, in which
case it reads from the files named by the arguments. See page 138 for an explanation
of how this feature works with cat.
The print statement in the first line in the next example includes the optional handle
STDOUT; the next print statement omits this handle; the final print statement uses
the STDERR file handle, which causes print’s output to go to standard error. The first
print statement prompts the user to enter something. The string that this statement
outputs is terminated with a SPACE, not a NEWLINE, so the user can enter information on
the same line as the prompt. The second line then uses a magic file handle to read one
line from standard input, which it assigns to $userline. Because of the magic file handle, if you call file1.pl with an argument that is a filename, it reads one line from that
file instead of from standard input. The command line that runs file1.pl uses 2> (see
“File descriptors” on page 282) to redirect standard error (the output of the third
print statement) to the file1.err file.
$ cat file1.pl
print STDOUT "Enter something: ";
$userline = <>;
print "1>>>$userline<<<\n";
chomp ($userline);
print "2>>>$userline<<<\n";
print STDERR "3. Error message.\n";

544 Chapter 11 The Perl Scripting Language
$ perl file1.pl 2> file1.err
Enter something: hi there
1>>>hi there
<<<
2>>>hi there<<<
$ cat file1.err
3. Error message.
chomp/chop

The two print statements following the user input in file1.pl display the value of
$userline immediately preceded by greater than signs (>) and followed by less than
signs (<). The first of these statements demonstrates that $userline includes a NEWLINE:
The less than signs following the string the user entered appear on the line following
the string. The chomp function removes a trailing NEWLINE, if it exists, from a string.
After chomp processes $userline, the print statement shows that this variable no longer contains a NEWLINE. (The chop function is similar to chomp, except it removes any
trailing character from a string.)
The next example shows how to read from a file. It uses an open statement to assign
the lexical file handle $infile to the file /usr/share/dict/words. Each iteration of the
while structure evaluates an expression that reads a line from the file represented by
$infile and assigns the line to $line. When while reaches the end of file, the expression
evaluates to false; control then passes out of the while structure. The block of one
statement displays the line as it was read from the file, including the NEWLINE. This program copies /usr/share/dict/words to standard output. A pipe symbol (|; page 141)
is then used to send the output through head (page 56), which displays the first four
lines of the file (the first line is blank).

$ cat file2.pl
open (my $infile, "/usr/share/dict/words") or die "Cannot open dictionary: $!\n";
while ($line = <$infile>) {
print $line;
}
$ perl file2.pl | head -4
A
A's
AOL
$!

The $! variable holds the last system error. In a numeric context, it holds the system
error number; in a string context, it holds the system error string. If the words file is
not present on the system, file2.pl displays the following message:
Cannot open dictionary: No such file or directory

If you do not have read permission for the file, the program displays this message:
Cannot open dictionary: Permission denied

Displaying the value of $! gives the user more information about what went wrong
than simply saying that the program could not open the file.

Working with Files 545

Always check for an error when opening a file
tip When a Perl program attempts to open a file and fails, the program does not display an error message unless it checks whether open returned an error. In file2.pl, the or operator in the open
statement causes Perl to execute die (page 536) if open fails. The die statement sends the message Cannot open the dictionary followed by the system error string to standard error and
terminates the program.
@ARGV

The @ARGV array holds the arguments from the command line Perl was called with.
When you call the following program with a list of filenames, it displays the first line
of each file. If the program cannot read a file, die (page 536) sends an error message
to standard error and quits. The foreach structure loops through the command-line
arguments, as represented by @ARGV, assigning each argument in turn to $filename.
The foreach block starts with an open statement. Perl executes the open statement
that precedes the OR Boolean operator (or) or, if that fails, Perl executes the statement following the or operator (die). The result is that Perl either opens the file
named by $filename and assigns IN as its handle or, if it cannot open that file, executes the die statement and quits. The print statement displays the name of the file
followed by a colon and the first line of the file. When it accepts $line =  as an
argument to print, Perl displays the value of $line following the assignment. After
reading a line from a file, the program closes the file.
$ cat file3.pl
foreach $filename (@ARGV) {
open (IN, $filename) or die "Cannot open file '$filename': $!\n";
print "$filename: ", $line = ;
close (IN);
}
$ perl file3.pl f1 f2 f3 f4
f1: First line of file f1.
f2: First line of file f2.
Cannot open file 'f3': No such file or directory

The next example is similar to the preceding one, except it takes advantage of several
Perl features that make the code simpler. It does not quit when it cannot read a file.
Instead, Perl displays an error message and continues. The first line of the program
uses my to declare $filename to be a lexical variable. Next, while uses the magic file
handle to open and read each line of each file named by the command-line arguments;
$ARGV holds the name of the file. When there are no more files to read, the while
condition [(<>)] is false, while transfers control outside the while block, and the program terminates. Perl takes care of all file opening and closing operations; you do not
have to write code to take care of these tasks. Perl also performs error checking.
The program displays the first line of each file named by a command-line argument.
Each time through the while block, while reads another line. When it finishes with
one file, it starts reading from the next file. Within the while block, if tests whether
it is processing a new file. If it is, the if block displays the name of the file and the
(first) line from the file and then assigns the new filename ($ARGV) to $filename.

546 Chapter 11 The Perl Scripting Language
$ cat file3a.pl
my $filename;
while (<>) {
if ($ARGV ne $filename) {
print "$ARGV: $_";
$filename = $ARGV;
}
}
$ perl file3a.pl f1 f2
f1: First line of file
f2: First line of file
Can't open f3: No such
f4: First line of file

f3 f4
f1.
f2.
file or directory at file3a.pl line 3, <> line 3.
f4.

Sort
reverse

The sort function returns elements of an array ordered numerically or alphabetically,
based on the locale (page 1078) environment. The reverse function is not related to
sort; it simply returns the elements of an array in reverse order.
The first two lines of the following program assign values to the @colors array and
display these values. Each of the next two pairs of lines uses sort to put the values in
the @colors array in order, assign the result to @scolors, and display @scolors. These
sorts put uppercase letters before lowercase letters. Observe the positions of Orange
and Violet, both of which begin with an uppercase letter, in the sorted output. The
first assignment statement in these two pairs of lines uses the full sort syntax, including the block {$a cmp $b} that tells Perl to use the cmp subroutine, which compares
strings, and to put the result in ascending order. When you omit the block in a sort
statement, as is the case in the second assignment statement, Perl also performs an
ascending textual sort.
$ cat sort3.pl
@colors = ("red", "Orange", "yellow", "green", "blue", "indigo", "Violet");
say "@colors";
@scolors = sort {$a cmp $b} @colors;
say "@scolors";

# ascending sort with
# an explicit block

@scolors = sort @colors;
say "@scolors";

# ascending sort with
# an implicit block

@scolors = sort {$b cmp $a} @colors;
say "@scolors";

# descending sort

@scolors = sort {lc($a) cmp lc($b)} @colors;
say "@scolors";

# ascending folded sort

Subroutines
$ perl sort3.pl
red Orange yellow green blue indigo
Orange Violet blue green indigo red
Orange Violet blue green indigo red
yellow red indigo green blue Violet
blue green indigo Orange red Violet

547

Violet
yellow
yellow
Orange
yellow

The third sort in the preceding example reverses the positions of $a and $b in the
block to specify a descending sort. The last sort converts the strings to lowercase
before comparing them, providing a sort wherein the uppercase letters are folded
into the lowercase letters. As a result, Orange and Violet appear in alphabetical
order.
To perform a numerical sort, specify the <=> subroutine in place of cmp. The following
example demonstrates ascending and descending numerical sorts:
$ cat sort4.pl
@numbers = (22, 188, 44, 2, 12);
print "@numbers\n";
@snumbers = sort {$a <=> $b} @numbers;
print "@snumbers\n";
@snumbers = sort {$b <=> $a} @numbers;
print "@snumbers\n";
$ perl sort4.pl
22 188 44 2 12
2 12 22 44 188
188 44 22 12 2

Subroutines
All variables are package variables (page 521) unless you use the my function to
define them to be lexical variables (page 522). Lexical variables defined in a subroutine are local to that subroutine.
The following program includes a main part and a subroutine named add(). This program uses the variables named $one, $two, and $ans, all of which are package
variables: They are available to both the main program and the subroutine. The call
to the subroutine does not pass values to the subroutine and the subroutine returns
no values. This setup is not typical: It demonstrates that all variables are package
variables unless you use my to declare them to be lexical variables.
The subroutine1.pl program assigns values to two variables and calls a subroutine.
The subroutine adds the values of the two variables and assigns the result to another
variable. The main part of the program displays the result.

548 Chapter 11 The Perl Scripting Language
$ cat subroutine1.pl
$one = 1;
$two = 2;
add();
print "Answer is $ans\n";
sub add {
$ans =$one + $two
}
$ perl subroutine1.pl
Answer is 3

The next example is similar to the previous one, except the subroutine takes advantage of a return statement to return a value to the main program. The program
assigns the value returned by the subroutine to the variable $ans and displays that
value. Again, all variables are package variables.
$ cat subroutine2.pl
$one = 1;
$two = 2;
$ans = add();
print "Answer is $ans\n";
sub add {
return ($one + $two)
}
$ perl subroutine2.pl
Answer is 3

Keeping variables local to a subroutine is important in many cases. The subroutine
in the next example changes the values of variables and insulates the calling program
from these changes by declaring and using lexical variables. This setup is more
typical.
@_

When you pass values in a call to a subroutine, Perl makes those values available in
the array named @_ in the subroutine. Although @_ is local to the subroutine, its elements are aliases for the parameters the subroutine was called with. Changing a value
in the @_ array changes the value of the underlying variable, which might not be what
you want. The next program avoids this pitfall by assigning the values passed to the
subroutine to lexical variables.
The subroutine3.pl program calls the addplusone() subroutine with two variables
as arguments and assigns the value returned by the subroutine to a variable. The
first statement in the subroutine declares two lexical variables and assigns to them
the values from the @_ array. The my function declares these variables to be lexical.
(See the tip on lexical and package variables on page 526.) Although you can use
my without assigning values to the declared variables, the syntax in the example is
more commonly used. The next two statements increment the lexical variables
$lcl_one and $lcl_two. The print statement displays the value of $lcl_one within the

Subroutines

549

subroutine. The return statement returns the sum of the two incremented, lexical
variables.
$ cat subroutine3.pl
$one = 1;
$two = 2;
$ans = addplusone($one, $two);
print "Answer is $ans\n";
print "Value of 'lcl_one' in main: $lcl_one\n";
print "Value of 'one' in main: $one\n";
sub addplusone {
my ($lcl_one, $lcl_two) = @_;
$lcl_one++;
$lcl_two++;
print "Value of 'lcl_one' in sub: $lcl_one\n";
return ($lcl_one + $lcl_two)
}
$ perl subroutine3.pl
Value of 'lcl_one' in sub: 2
Answer is 5
Value of 'lcl_one' in main:
Value of 'one' in main: 1

After displaying the result returned by the subroutine, the print statements in the
main program demonstrate that $lcl_one is not defined in the main program (it is
local to the subroutine) and that the value of $one has not changed.
The next example illustrates another way to work with parameters passed to a subroutine. This subroutine does not use variables other than the @_ array it was passed
and does not change the values of any elements of that array.
$ cat subroutine4.pl
$one = 1;
$two = 2;
$ans = addplusone($one, $two);
print "Answer is $ans\n";
sub addplusone {
return ($_[0] + $_[1] + 2);
}
$ perl subroutine4.pl
Answer is 5

The final example in this section presents a more typical Perl subroutine. The subroutine max() can be called with any number of numeric arguments and returns the
value of the largest argument. It uses the shift function to assign to $biggest the value
of the first argument the subroutine was called with and to shift the rest of the arguments. After using shift, argument number 2 becomes argument number 1 (8),
argument 3 becomes argument 2 (64), and argument 4 becomes argument 3 (2).
Next, foreach loops over the remaining arguments (@_). Each time through the

550 Chapter 11 The Perl Scripting Language

foreach block, Perl assigns to $_ the value of each of the arguments, in order. The
$biggest variable is assigned the value of $_ if $_ is bigger than $biggest. When max()
finishes going through its arguments, $biggest holds the maximum value, which
max() returns.
$ cat subroutine5.pl
$ans = max (16, 8, 64, 2);
print "Maximum value is $ans\n";
sub max {
my $biggest = shift; # Assign first and shift the rest of the arguments to max()
foreach (@_) {
# Loop through remaining arguments
$biggest = $_ if $_ > $biggest;
}
return ($biggest);
}
$ perl subroutine5.pl
Maximum value is 64

Regular Expressions
Appendix A defines and discusses regular expressions you can use in many Linux
utilities. All of the material in Appendix A applies to Perl, except as noted. In addition
to the facilities described in Appendix A, Perl offers regular expression features that
allow you to perform more complex string processing. This section reviews some of
the regular expressions covered in Appendix A and describes some of the additional
features of regular expressions available in Perl. It also introduces the syntax Perl uses
for working with regular expressions.

Syntax and the =~ Operator
The –l option

The Perl –l option applies chomp to each line of input and places \n at the end of each
line of output. The examples in this section use the Perl –l and –e (page 522) options.
Because the program must be specified as a single argument, the examples enclose the
Perl programs within single quotation marks. The shell interprets the quotation
marks and does not pass them to Perl.

Using other options with –e
tip When you use another option with –e, the program must immediately follow the –e on the command line. Like many other utilities, Perl allows you to combine options following a single hyphen;
if –e is one of the combined options, it must appear last in the list of options. Thus you can use
perl –l –e or perl –le but not perl –e –l or perl –el.
/ is the default
delimiter

By default, Perl delimits a regular expression with slashes (/). The first program uses
the =~ operator to search for the pattern ge in the string aged. You can think of the
=~ operator as meaning “contains.” Using different terminology, the =~ operator
determines whether the regular expression ge has a match in the string aged. The

Regular Expressions

551

regular expression in this example contains no special characters; the string ge is
part of the string aged. Thus the expression within the parentheses evaluates to true
and Perl executes the print statement.
$ perl -le 'if ("aged" =~ /ge/) {print "true";}'
true

You can achieve the same functionality by using a postfix if statement:
$ perl -le 'print "true" if "aged" =~ /ge/'
true
!~

The !~ operator works in the opposite sense from the =~ operator. The expression in
the next example evaluates to true because the regular expression xy does not match
any part of aged:
$ perl -le 'print "true" if ("aged" !~ /xy/)'
true

As explained on page 1013, a period within a regular expression matches any single
character, so the regular expression a..d matches the string aged:
$ perl -le 'print "true" if ("aged" =~ /a..d/)'
true

You can use a variable to hold a regular expression. The following syntax quotes
string as a regular expression:
qr/string/
The next example uses this syntax to assign the regular expression /a..d/ (including
the delimiters) to the variable $re and then uses that variable as the regular expression:
$ perl -le '$re = qr/a..d/; print "true" if ("aged" =~ $re)'
true

If you want to include the delimiter within a regular expression, you must quote it.
In the next example, the default delimiter, a slash (/), appears in the regular expression. To keep Perl from interpreting the / in /usr as the end of the regular expression,
the / that is part of the regular expression is quoted by preceding it with a backslash
(\). See page 1015 for more information on quoting characters in regular expressions.
$ perl -le 'print "true" if ("/usr/doc" =~ /\/usr/)'
true

Quoting several characters by preceding each one with a backslash can make a complex regular expression harder to read. Instead, you can precede a delimited regular
expression with m and use a paired set of characters, such as {}, as the delimiters. In
the following example, the caret (^) anchors the regular expression to the beginning
of the line (page 1014):
$ perl -le 'print "true" if ("/usr/doc" =~ m{^/usr})'
true

552 Chapter 11 The Perl Scripting Language

You can use the same syntax when assigning a regular expression to a variable:
$ perl -le '$pn = qr{^/usr}; print "true" if ("/usr/doc" =~ $pn)'
true
Replacement string
and assignment

Perl uses the syntax shown in the next example to substitute a string (the replacement
string) for a matched regular expression. The syntax is the same as that found in vim
and sed. In the second line of the example, an s before the regular expression instructs
Perl to substitute the string between the second and third slashes (worst; the replacement string) for a match of the regular expression between the first two slashes (best).
Implicit in this syntax is the notion that the substitution is made in the string held in
the variable on the left of the =~ operator.
$ cat re10a.pl
$stg = "This is the best!";
$stg =~ s/best/worst/;
print "$stg\n";
$ perl re10a.pl
This is the worst!

Table 11-3 lists some of the characters, called metacharacters, that are considered
special within Perl regular expressions. Give the command perldoc perlre for more
information.

Table 11-3

Some Perl regular expression metacharacters

Character

Matches

^ (caret)

Anchors a regular expression to the beginning of a line (page 1014)

$ (dollar sign)

Anchors a regular expression to the end of a line (page 1014)

(...)

Brackets a regular expression (page 554)

. (period)

Any single character except NEWLINE (\n; page 1013)

\\

A backslash (\)

\b

A word boundary (zero-width match)

\B

A nonword boundary ([^\b])

\d

A single decimal digit ([0–9])

\D

A single nondecimal digit ([^0–9] or [^\d])

\s (lowercase)

A single whitespace character SPACE, NEWLINE, RETURN, TAB, FORMFEED

\S (uppercase)

A single nonwhitespace character ([^\s])

\w (lowercase)

A single word character (a letter or digit; [a–zA–Z0–9])

\W (uppercase)

A single nonword character ([^\w])

Regular Expressions

553

Greedy Matches
By default Perl performs greedy matching, which means a regular expression matches
the longest string possible (page 1015). In the following example, the regular expression /{.*} / matches an opening brace followed by any string of characters, a closing
brace, and a SPACE ({remove me} might have two {keep me} ). Perl substitutes a null
string (//) for this match.
$ cat 5ha.pl
$string = "A line {remove me} might have two {keep me} pairs of
braces.";
$string =~ s/{.*} //;
print "$string\n";
$ perl 5ha.pl
A line pairs of braces.
Nongreedy matches

The next example shows the classic way of matching the shorter brace-enclosed string
from the previous example. This type of match is called nongreedy or parsimonious
matching. Here the regular expression matches
1. An opening brace followed by
2. A character belonging to the character class (page 1013) that includes all
characters except a closing brace ([^}]) followed by
3. Zero or more occurrences of the preceding character (*) followed by
4. A closing brace followed by
5. A SPACE
(A caret as the first character of a character class specifies the class of all characters
that do not match the following characters, so [^}] matches any character that is not
a closing brace.)
$ cat re5b.pl
$string = "A line {remove me} might have two {keep me} pairs of braces.";
$string =~ s/{[^}]*} //;
print "$string\n";
$ perl re5b.pl
A line might have two {keep me} pairs of braces.

Perl provides a shortcut that allows you to specify a nongreedy match. In the following example, the question mark in {.*?} causes the regular expression to match the
shortest string that starts with an opening brace followed by any string of characters
followed by a closing brace.
$ cat re5c.pl
$string = "A line {remove me} might have two {keep me} pairs of braces.";
$string =~ s/{.*?} //;
print "$string\n";
$ perl re5c.pl
A line might have two {keep me} pairs of braces.

554 Chapter 11 The Perl Scripting Language

Bracketing Expressions
As explained on page 1016, you can bracket parts of a regular expression and recall
those parts in the replacement string. Most Linux utilities use quoted parentheses
[i.e., \( and \)] to bracket a regular expression. In Perl regular expressions, parentheses are special characters. Perl omits the backslashes and uses unquoted parentheses
to bracket regular expressions. To specify a parenthesis as a regular character within
a regular expression in Perl, you must quote it (page 1015).
The next example uses unquoted parentheses in a regular expression to bracket part
of the expression. It then assigns the part of the string that the bracketed expression
matched to the variable that held the string in which Perl originally searched for the
regular expression.
First the program assigns the string My name is Sam to $stg. The next statement
looks for a match for the regular expression /My name is (.*)/ in the string held by
$stg. The part of the regular expression bracketed by parentheses matches Sam; the
$1 in the replacement string matches the first (and only in this case) matched bracketed portion of the regular expression. The result is that the string held in $stg is
replaced by the string Sam.
$ cat re11.pl
$stg = "My name is Sam";
$stg =~ s/My name is (.*)/$1/;
print "Matched: $stg\n";
$ perl re11.pl
Matched: Sam

The next example uses regular expressions to parse a string for numbers. Two variables are initialized to hold a string that contains two numbers. The third line of the
program uses a regular expression to isolate the first number in the string. The \D*
matches a string of zero or more characters that does not include a digit: The \D special character matches any single nondigit character. The trailing asterisk makes this
part of the regular expression perform a greedy match that does not include a digit
(it matches What is ). The bracketed regular expression \d+ matches a string of one
or more digits. The parentheses do not affect what the regular expression matches;
they allow the $1 in the replacement string to match what the bracketed regular
expression matched. The final .* matches the rest of the string. This line assigns the
value of the first number in the string to $string.
The next line is similar but assigns the second number in the string to $string2. The
print statements display the numbers and the result of subtracting the second number
from the first.
$ cat re8.pl
$string = "What is 488 minus 78?";
$string2 = $string;
$string =~ s/\D*(\d+).*/$1/;
$string2 =~ s/\D*\d+\D*(\d+).*/$1/;

CPAN Modules 555
print "$string\n";
print "$string2\n";
print $string - $string2, "\n";
$ perl re8.pl
488
78
410

The next few programs show some of the pitfalls of using unquoted parentheses in
regular expressions when you do not intend to bracket part of the regular expression.
The first of these programs attempts to match parentheses in a string with unquoted
parentheses in a regular expression, but fails. The regular expression ag(e matches the
same string as the regular expression age because the parenthesis is a special character; the regular expression does not match the string ag(ed).
$ perl -le 'if ("ag(ed)" =~ /ag(ed)/) {print "true";} else {print "false";}'
false

The regular expression in the next example quotes the parentheses by preceding each
with a backslash, causing Perl to interpret them as regular characters. The match is
successful.
$ perl -le 'if ("ag(ed)" =~ /ag\(ed\)/) {print "true";} else {print "false";}'
true

Next, Perl finds an unmatched parenthesis in a regular expression:
$ perl -le 'if ("ag(ed)" =~ /ag(e/) {print "true";} else {print "false";}'
Unmatched ( in regex; marked by <-- HERE in m/ag( <-- HERE e/ at -e line 1.

When you quote the parenthesis, all is well and Perl finds a match:
$ perl -le 'if ("ag(ed)" =~ /ag\(e/) {print "true";} else {print "false";}'
true

CPAN Modules
CPAN (Comprehensive Perl Archive Network) provides Perl documentation, FAQs,
modules (page 521), and scripts on its Web site (www.cpan.org). It holds more than
16,000 distributions (page 521) and provides links, mailing lists, and versions of Perl
compiled to run under various operating systems (ports of Perl). One way to locate
a module is to visit search.cpan.org and use the search box or click one of the classes
of modules listed on that page.
This section explains how to download a module from CPAN and how to install and
run the module. Perl provides a hierarchical namespace for modules, separating components of a name with double colons (::). The example in this section uses the
module named Timestamp::Simple, which you can read about and download from
search.cpan.org/dist/Timestamp-Simple. The timestamp is the date and time in the
format YYYYMMDDHHMMSS.

556 Chapter 11 The Perl Scripting Language

To use a Perl module, you first download the file that holds the module. For this
example, the search.cpan.org/~shoop/Timestamp-Simple-1.01/Simple.pm Web page
has a link on the right side labeled Download. Click this link and save the file to the
directory you want to work in. You do not need to work as a privileged user until
the last step of this procedure, when you install the module.
Most Perl modules come as compressed tar files (page 66). With the downloaded file
in the working directory, decompress the file:
$ tar xzvf Timestamp-Simple-1.01.tar.gz
Timestamp-Simple-1.01/
Timestamp-Simple-1.01/Simple.pm
Timestamp-Simple-1.01/Makefile.PL
Timestamp-Simple-1.01/README
Timestamp-Simple-1.01/test.pl
Timestamp-Simple-1.01/Changes
Timestamp-Simple-1.01/MANIFEST
Timestamp-Simple-1.01/ARTISTIC
Timestamp-Simple-1.01/GPL
Timestamp-Simple-1.01/META.yml

The README file in the newly created directory usually provides instructions for
building and installing the module. Most modules follow the same steps.
$ cd Timestamp-Simple-1.01
$ perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for Timestamp::Simple

If the module you are building depends on other modules that are not installed on
the local system, running perl Makefile.PL will display one or more warnings about
prerequisites that are not found. This step writes out the makefile even if modules are
missing. In this case the next step will fail, and you must build and install missing
modules before continuing.
The next step is to run make on the makefile you just created. After you run make,
run make test to be sure the module is working.
$ make
cp Simple.pm blib/lib/Timestamp/Simple.pm
Manifying blib/man3/Timestamp::Simple.3pm
$ make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-Iblib/lib" "-Iblib/arch" test.pl
1..1
# Running under perl version 5.100000 for linux
# Current time local: Fri Sep 3 18:20:41 2010
# Current time GMT:
Sat Sep 4 01:20:41 2010
# Using Test.pm version 1.25
ok 1
ok 2
ok 3

CPAN Modules 557

Finally, running with root privileges, install the module:
# make install
Installing /usr/local/share/perl/5.10.0/Timestamp/Simple.pm
Installing /usr/local/man/man3/Timestamp::Simple.3pm
Writing /usr/local/lib/perl/5.10.0/auto/Timestamp/Simple/.packlist
Appending installation info to /usr/local/lib/perl/5.10.0/perllocal.pod

Once you have installed a module, you can use perldoc to display the documentation
that tells you how to use the module. See page 519 for an example.
Some modules contain SYNOPSIS sections. If the module you installed includes such
a section, you can test the module by putting the code from the SYNOPSIS section
in a file and running it as a Perl program:
$ cat times.pl
use Timestamp::Simple qw(stamp);
print stamp, "\n";
$ perl times.pl
20100904182627

You can then incorporate the module in a Perl program. The following example uses
the timestamp module to generate a unique filename:
$ cat fn.pl
use Timestamp::Simple qw(stamp);
# Save timestamp in a variable
$ts = stamp, "\n";
# Strip off the year
$ts =~ s/....(.*)/\1/;
# Create a unique filename
$fn = "myfile." . $ts;
# Open, write to, and close the file
open (OUTFILE, '>', "$fn");
print OUTFILE "Hi there.\n";
close (OUTFILE);
$ perl fn.pl
$ ls myf*
myfile.0905183010
substr

You can use the substr function in place of the regular expression to strip off the year.
To do so, replace the line that starts with $ts =~ with the following line. Here, substr
takes on the value of the string $ts starting at position 4 and continuing to the end
of the string:
$ts = substr ($ts, 4);

558 Chapter 11 The Perl Scripting Language

Examples
This section provides some sample Perl programs. First try running these programs
as is, and then modify them to learn more about programming with Perl.
The first example runs under Linux and displays the list of groups that the user given
as an argument is a member of. Without an argument, it displays the list of groups
that the user running the program is a member of. In a Perl program, the %ENV hash
holds the environment variables from the shell that called Perl. The keys in this hash
are the names of environment variables; the values in this hash are the values of the
corresponding variables. The first line of the program assigns a username to $user.
The shift function (page 531) takes on the value of the first command-line argument
and shifts the rest of the arguments, if any remain. If the user runs the program with
an argument, that argument is assigned to $user. If no argument appears on the command line, shift fails and Perl executes the statement following the Boolean OR (||).
This statement extracts the value associated with the USER key in %ENV, which is
the name of the user running the program.
Accepting output
from a process

The third statement initializes the array @list. Although this statement is not
required, it is good practice to include it to make the code easier to read. The next
statement opens the $fh lexical handle. The trailing pipe symbol (|) in the file-ref
(page 542) portion of this open statement tells Perl to pass the command line preceding the pipe symbol to the shell for execution and to accept standard output from the
command when the program reads from the file handle. In this case the command
uses grep to filter the /etc/group file for lines containing the username held in $user.
(OS X does not use this file; see page 1042 for more information.) The die statement
displays an error message if Perl cannot open the handle.
$ cat groupfind.pl
$user = shift || $ENV{"USER"};
say "User $user belongs to these groups:";
@list = ();
open (my $fh, "grep $user /etc/group |") or die "Error: $!\n";
while ($group = <$fh>) {
chomp $group;
$group =~ s/(.*?):.*/$1/;
push @list, $group;
}
close $fh;
@slist = sort @list;
say "@slist";
$ perl groupfind.pl
User sam belongs to these groups:
adm admin audio cdrom dialout dip floppy kvm lpadmin ...

Examples 559

The while structure in groupfind.pl reads lines from standard output of grep and
terminates when grep finishes executing. The name of the group appears first
on each line in /etc/group, followed by a colon and other information, including the names of the users who belong to the group. Following is a line from
this file:
sam:x:1000:max,zach,helen

The line
$group =~ s/(.*?):.*/$1/;

uses a regular expression and substitution to remove everything except the name of
the group from each line. The regular expression .*: would perform a greedy match
of zero or more characters followed by a colon; putting a question mark after the
asterisk causes the expression to perform a nongreedy match (page 553). Putting
parentheses around the part of the expression that matches the string the program
needs to display enables Perl to use the string that the regular expression matches in
the replacement string. The final .* matches the rest of the line. Perl replaces the $1
in the replacement string with the string the bracketed portion of the regular expression (the part between the parentheses) matched and assigns this value (the name of
the group) to $group.
The chomp statement removes the trailing NEWLINE (the regular expression did not
match this character). The push statement adds the value of $group to the end of the
@list array. Without chomp, each group would appear on a line by itself in the output. After the while structure finishes processing input from grep, sort orders @list
and assigns the result to @slist. The final statement displays the sorted list of groups
the user belongs to.
opendir and readdir

The next example introduces the opendir and readdir functions. The opendir function opens a directory in a manner similar to the way open opens an ordinary file. It
takes two arguments: the name of the directory handle and the name of the directory
to open. The readdir function reads the name of a file from an open directory.
In the example, opendir opens the working directory (specified by .) using the $dir
lexical directory handle. If opendir fails, Perl executes the statement following the or
operator: die sends an error message to standard error and terminates the program.
With the directory opened, while loops through the files in the directory, assigning
the filename that readdir returns to the lexical variable $entry. An if statement executes print only for those files that are directories (–d). The print function displays
the name of the directory unless the directory is named . or ... When readdir has read
all files in the working directory, it returns false and control passes to the statement
following the while block. The closedir function closes the open directory and print
displays a NEWLINE following the list of directories the program displayed.

560 Chapter 11 The Perl Scripting Language
$ cat dirs2a.pl
#!/usr/bin/perl
print "The working directory contains these directories:\n";
opendir my $dir, '.' or die "Could not open directory: $!\n";
while (my $entry = readdir $dir) {
if (-d $entry) {
print $entry, ' ' unless ($entry eq '.' || $entry eq '..');
}
}
closedir $dir;
print "\n";
$ ./dirs2a.pl
The working directory contains these directories:
two one
split

The split function divides a string into substrings as specified by a delimiter. The syntax
of a call to split is
split (/re/, string);
where re is the delimiter, which is a regular expression (frequently a single regular
character), and string is the string that is to be divided. As the next example shows,
you can assign the list that split returns to an array variable.
The next program runs under Linux and lists the usernames of users with UIDs greater
than or equal to 100 listed in the /etc/passwd file. Because Mac OS X uses Open Directory in place of this file, you must modify it before it will run under OS X; see
page 1042. It uses a while structure to read lines from passwd into $user, and it uses
split to break the line into substrings separated by colons. The line that begins with
@row assigns each of these substrings to an element of the @row array. The expression
the if statement evaluates is true if the third substring (the UID) is greater than or equal
to 100. This expression uses the >= numeric comparison operator because it compares
two numbers; an alphabetic comparison would use the ge string comparison operator.
The print statement sends the UID number and the associated username to the
$sortout file handle. The open statement for this handle establishes a pipeline that
sends its output to sort –n. Because the sort utility (page 58) does not display any output until it finishes receiving all of the input, split3.pl does not display anything until
it closes the $sortout handle, which it does when it finishes reading the passwd file.
$ cat split3.pl
#!/usr/bin/perl -w
open ($pass, "/etc/passwd");
open ($sortout, "| sort -n");
while ($user = <$pass>) {
@row = split (/:/, $user);
if ($row[2] >= 100) {
print $sortout "$row[2] $row[0]\n";
}
}
close ($pass);
close ($sortout);

Chapter Summary

561

$ ./split3.pl
100 libuuid
101 syslog
102 klog
103 avahi-autoipd
104 pulse
...

The next example counts and displays the arguments it was called with, using
@ARGV (page 545). A foreach structure loops through the elements of the @ARGV
array, which holds the command-line arguments. The ++ preincrement operator
increments $count before it is displayed.
$ cat 10.pl
#!/usr/bin/perl -w
$count = 0;
$num = @ARGV;
print "You entered $num arguments on the command line:\n";
foreach $arg (@ARGV) {
print ++$count, ". $arg\n";
}
$ ./10.pl apple pear banana watermelon
You entered 4 arguments on the command line:
1. apple
2. pear
3. banana
4. watermelon

Chapter Summary
Perl was written by Larry Wall in 1987. Since that time Perl has grown in size and
functionality and is now a very popular language used for text processing, system
administration, software development, and general-purpose programming. One of
Perl’s biggest assets is its support by thousands of third-party modules, many of
which are stored in the CPAN repository.
The perldoc utility locates and displays local Perl documentation. It also allows you
to document a Perl program by displaying lines of pod (plain old documentation) that
you include in the program.
Perl provides three types of variables: scalar (singular variables that begin with a $),
array (plural variables that begin with an @), and hash (also called associative arrays;
plural variables that begin with a %). Array and hash variables both hold lists, but
arrays are ordered while hashes are unordered. Standard control flow statements
allow you to alter the order of execution of statements within a Perl program. In addition, Perl programs can take advantage of subroutines that can include variables local
to the subroutines (lexical variables).
Regular expressions are one of Perl’s strong points. In addition to the same facilities
that are available in many utilities, Perl offers regular expression features that allow
you to perform more complex string processing.

562 Chapter 11 The Perl Scripting Language

Exercises
1. What are two different ways to turn on warnings in Perl?
2. What is the difference between an array and a hash?
3. In each example, when would you use a hash and when would you use an
array?
a. Counting the number of occurrences of an IP address in a log file.
b. Generating a list of users who are over disk quota for use in a report.
4. Write a regular expression to match a quoted string, such as
He said, "Go get me the wrench," but I didn’t hear him.

5. Write a regular expression to match an IP address in a log file.
6. Many configuration files contain many comments, including commentedout default configuration directives. Write a program to remove these comments from a configuration file.

Advanced Exercises
7. Write a program that removes *~ and *.ico files from a directory hierarchy.
(Hint: Use the File::Find module.)
8. Describe a programming mistake that Perl’s warnings do not report on.
9. Write a Perl program that counts the number of files in the working directory and the number of bytes in those files, by filename extension.
10. Describe the difference between quoting strings using single quotation
marks and using double quotation marks.
11. Write a program that copies all files with a .ico filename extension in a
directory hierarchy to a directory named icons in your home directory.
(Hint: Use the File::Find and File::Copy modules.)
12. Write a program that analyzes Apache logs. Display the number of bytes
served by each path. Ignore unsuccessful page requests. If there are more
than ten paths, display the first ten only.
Following is a sample line from an Apache access log. The two numbers following the HTTP/1.1 are the response code and the byte count. A response
code of 200 means the request was successful. A byte count of – means no
data was transferred.
__DATA__
92.50.103.52 - - [19/Aug/2012:08:26:43 -0400] "GET /perl/automated-testing/next_active.gif
HTTP/1.1" 200 980 "http://example.com/perl/automated-testing/navigation_bar.htm"
"Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/3.0.0.6
(Fedora); Blazer/4.0"

12
The Python
Programming
Language

Chapter12
12

In This Chapter

Objectives

Invoking Python . . . . . . . . . . . . . . 564

After reading this chapter you should be able to:

Lists . . . . . . . . . . . . . . . . . . . . . . . . 569

 Give commands using the Python interactive shell

Dictionaries . . . . . . . . . . . . . . . . . . 573

 Write and run a Python program stored in a file

Control Structures . . . . . . . . . . . . . 574

 Demonstrate how to instantiate a list and how to
remove elements from and add elements to a list

Reading from and Writing
to Files . . . . . . . . . . . . . . . . . . . . 579
Pickle . . . . . . . . . . . . . . . . . . . . . . . 582

 Describe a dictionary and give examples of how it can
be used

Regular Expressions . . . . . . . . . . . 583

 Describe three Python control structures

Defining a Function. . . . . . . . . . . . 584
Using Libraries . . . . . . . . . . . . . . . 585

 Write a Python program that iterates through a list or
dictionary

Lambda Functions . . . . . . . . . . . . 589

 Read from and write to a file

List Comprehensions . . . . . . . . . . 590

 Demonstrate exception processing
 Preserve an object using pickle()
 Write a Python program that uses regular expressions
 Define a function and use it in a program

564 Chapter 12 The Python Programming Language

Introduction
Python is a friendly and flexible programming language in widespread use everywhere from Fortune 500 companies to large-scale open-source projects. Python is
an interpreted language: It translates code into bytecode (page 1059) at runtime
and executes the bytecode within the Python virtual machine. Contrast Python
with the C language, which is a compiled language. C differs from Python in that
the C compiler compiles C source code into architecture-specific machine code.
Python programs are not compiled; you run a Python program the same way you
run a bash or Perl script. Because Python programs are not compiled, they are portable between operating systems and architectures. In other words, the same
Python program will run on any system to which the Python virtual machine has
been ported.
Object oriented

While not required to use the language, Python supports the object-oriented (OO)
paradigm. It is possible to use Python with little or no understanding of objectoriented concepts and this chapter covers OO programming minimally while still
explaining Python’s important features.

Libraries

Python comes with hundreds of prewritten tools that are organized into logical
libraries. These libraries are accessible to Python programs, but not loaded into
memory at runtime because doing so would significantly increase startup times for
Python programs. Entire libraries (or just individual modules) are instead loaded
into memory when the program requests them.

Version

Python is available in two main development branches: Python 2.x and Python 3.x.
This chapter focuses on Python 2.x because the bulk of Python written today uses
2.x. The following commands show that two versions of Python are installed and
that the python command runs Python 2.7.3:
$ whereis python
python: /usr/bin/python /usr/bin/python2.7 /etc/python2.6 /etc/python ...
$ ls -l $(which python)
lrwxrwxrwx 1 root root 9 Apr 17 10:20 /usr/bin/python -> python2.7
$ python -V
Python 2.7.3

Invoking Python
This section discusses the methods you can use to run a Python program.
Interactive shell

Most of the examples in this chapter use the Python interactive shell because you can
use it to debug and execute code one line at a time and see the results immediately.
Although this shell is handy for testing, it is not a good choice for running longer,
more complex programs. You start a Python interactive shell by calling the python
utility (just as you would start a bash shell by calling bash). The primary Python

Introduction

565

prompt is >>>. When Python requires more input to complete a command, it displays
its secondary prompt (...).
$ python
Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

While you are using the Python interactive shell, you can give Python a command by
entering the command and pressing RETURN.
>>> print 'Good morning!'
Good morning!

Implied display
tip Within the Python interactive shell, Python displays output from any command line that does not
have an action. The output is similar to what print would display, although it might not be exactly
the same. The following examples show explicit print and implicit display actions:
>>> print 'Good morning!'
Good morning!
>>> 'Good morning!'
'Good morning!'
>>> print 2 + 2
4
>>> 2 + 2
4

Implied display allows you to display the value of a variable by typing its name:
>>> x = 'Hello'
>>> x
'Hello'

Implied display does not work unless you are running the Python interactive shell (i.e., Python
does not invoke an implicit display action when it is run from a file).
Program file

Most of the time a Python program is stored in a text file. Although not required, the
file typically has a filename extension of .py. Use chmod (page 741) to make the file executable. As explained on page 287, the #! at the start of the first line of the file instructs
the shell to pass the rest of the file to /usr/bin/python for execution.
$ chmod 755 gm.py
$ cat gm.py
#!/usr/bin/python
print 'Good morning!'
$ ./gm.py
Good morning!

566 Chapter 12 The Python Programming Language

You can also run a Python program by specifying the name of the program file as an
argument or as standard input to python.
$ python gm.py
Good morning!
$ python < gm.py
Good morning!
$ cat gm.py | python
Good morning!
$ echo "print 'Good morning! '" | python
Good morning!

Because the shell interprets an exclamation point immediately followed by a character
other than a SPACE as an event number (page 331), the final command includes a SPACE
after the exclamation point.
Command line

Using the python –c option, you can run a program from the shell command line. In the
preceding and following commands, the double quotation marks keep the shell from
removing the single quotation marks from the command line before passing it to Python.
$ python -c "print 'Good morning! '"
Good morning!

Single and double quotation marks are functionally equivalent
tip You can use single quotation marks and double quotation marks interchangeably in a Python
program and you can use one to quote the other. When Python displays quotation marks around
a string it uses single quotation marks.
>>> a = "hi"
>>> a
'hi'
>>> print "'hi'"
'hi'
>>> print '"hi"'
"hi"

More Information
Local python man
Python interactive
shell

Web

page, pydoc program

From the Python interactive shell give the command help() to use the Python help feature.
When you call it, this utility displays information to help you get started using it. Alternatively, you can give the command help('object') where object is the name of an object
you have instantiated or the name of a data structure or module such as list or pickle.
Python home page: www.python.org
Documentation: docs.python.org
Index of Python Enhancement Proposals (PEPs): www.python.org/dev/peps
PEP 8 Style Guide for Python Code: www.python.org/dev/peps/pep-0008
PyPI (Python Package Index): pypi.python.org

Introduction

567

Writing to Standard Output and Reading from
Standard Input
raw_input()

Python makes it easy to write to standard output and read from standard input, both of
which the shell (e.g., bash) connects to the terminal by default. The raw_input() function
writes to standard output and reads from standard input. It returns the value of the string
it reads after stripping the trailing control characters (RETURN-NEWLINE or NEWLINE).
In the following example, raw_input() displays its argument (Enter your name:) and
waits for the user. When the user types something and presses RETURN, Python assigns the
value returned by raw_input(), which is the string the user entered, to the variable my_in.

print

Within a print statement, a plus sign (+) catenates the strings on either side of it. The print
statement in the example displays Hello, the value of my_in, and an exclamation point.
>>> my_in = raw_input ('Enter your name: ')
Enter your name: Neo
>>> print 'Hello, ' + my_in + '!'
Hello, Neo!

Functions and Methods
Functions

Functions in Python, as in most programming languages, are key to improving code
readability, efficiency, and maintenance. Python has a number of builtin functions
and functions you can immediately import into a Python program. These functions
are available when you install Python. For example, the int() function returns the
integer (truncated) part of a floating-point number:
>>> int(8.999)
8

Additional downloadable libraries hold more functions. For more information refer
to “Using Libraries” on page 585. Table 12-1 lists a few of the most commonly used
functions.

Table 12-1

Commonly used functions

Function

What it does

exit()

Exits from a program

float()

Returns its argument as a floating-point number

help()

Displays help on the object specified by its argument; without an argument
opens interactive help (Python interactive shell only; page 566)

int()

Returns the integer (truncated) portion of its argument

len()

Returns the number of elements in a list or dictionary (page 571)

map()

Returns the list that results from applying its first argument (a function) to its
remaining arguments (page 589)

568 Chapter 12 The Python Programming Language

Table 12-1

Methods

Commonly used functions (continued)

Function

What it does

max()

Returns the maximum value from its argument, which can be a list or other
iterable (page 572) data structure (page 589)

open()

Opens the file in the mode specified by its arguments (page 579)

range()

Returns a list of integers between the two values specified by its arguments
(page 578)

raw_input()

Prompts with its argument and returns the string the user enters (page 567)

sorted()

Takes a list (or other iterable data structure) as an argument and returns the
same type of data structure with its elements in order

type()

Returns the type of its argument (e.g., int, file, method, function)

xrange()

Returns a list of integers between the two values specified by its arguments
[more efficient than range(); page 578]

Functions and methods are very similar. The difference is that functions stand on
their own while methods work on and are specific to objects. You will see the range()
function used by itself [range(args)] and the readall() method used as an object
method [f.readall(args), where f is the object (a file) readall() is reading from].
Table 12-2 on page 571 and Table 12-4 on page 579 list some methods.

Scalar Variables, Lists, and Dictionaries
This section discusses some of the Python builtin data types. Scalar data types
include number and string types. Compound data types include dictionary and list
types.

Scalar Variables
As in most programming languages, you declare and initialize a variable using an
equal sign. Python does not require the SPACEs around the equal sign, nor does it
require you to identify a scalar variable with a prefix (Perl and bash require a leading
dollar sign). As explained in the tip on page 565, you can use the print function to
display the value of a variable or you can just specify the variable name as a
command:
>>> lunch = 'Lunch time!'
>>> print lunch
Lunch time!
>>> lunch
'Lunch time!'

Python performs arithmetic as you might expect:

Scalar Variables, Lists, and Dictionaries 569
>>> n1 = 5
>>> n2 = 8
>>> n1 + n2
13
Floating-point
numbers

Whether Python performs floating-point or integer arithmetic depends on the values
it is given. If all of the numbers involved in a calculation are integers—that is, if none
of the numbers includes a decimal point—Python performs integer arithmetic and
will truncate answers that include a fractional part. If at least one of the numbers
involved in a calculation is a floating-point number (includes a decimal point),
Python performs floating-point arithmetic. Be careful when performing division: If
the answer could include a fraction, be sure to include a decimal point in one of the
numbers or explicitly specify one of the numbers as a floating-point number:
>>> 3/2
1
>>> 3/2.0
1.5
>>> float(3)/2
1.5

Lists
A Python list is an object that comprises one or more elements; it is similar to an array
in C or Java. Lists are ordered and use zero-based indexing (i.e., the first element of
a list is numbered zero). A list is called iterable because it can provide successive elements on each iteration through a looping control structure such as for; see page 572
for a discussion.
This section shows one way to instantiate (create) a list. The following commands
instantiate and display a list named a that holds four values:
>>> a = ['bb', 'dd', 'zz', 'rr']
>>> a
['bb', 'dd', 'zz', 'rr']
Indexes

You can access an element of a list by specifying its index (remember—the first element
of a list is numbered zero). The first of the following commands displays the value of
the third element of a; the next assigns the value of the first element of a to x and displays the value of x.
>>> a[2]
'zz'
>>> x = a[0]
>>> x
'bb'

When you specify a negative index, Python counts from the end of the array.
>>> a[-1]
'rr'
>>> a[-2]
'zz'

570 Chapter 12 The Python Programming Language
Replacing an
element

You can replace an element of a list by assigning a value to it.
>>> a[1] = 'qqqq'
>>> a
['bb', 'qqqq', 'zz', 'rr']

Slicing

The next examples show how to access a slice or portion of a list. The first example
displays elements 0 up to 2 of the list (elements 0 and 1):
>>> a[0:2]
['bb', 'dd']

If you omit the number that follows the colon, Python displays from the element with
the index specified before the colon through the end of the list. If you omit the number
before the colon, Python displays from the beginning of the list up to the element with
the number specified after the colon.
>>> a[2:]
['zz', 'rr']
>>> a[:2]
['bb', 'dd']

You can use negative numbers when slicing a list. The first of the following commands
displays element 1 up to the last element of the list (element –1); the second displays
from the next-to-last element of the list (element –2) through the end of the list.
>>> a[1:-1]
['dd', 'zz']
>>> a[-2:]
['zz', 'rr']
remove()

Following Python’s object-oriented paradigm, the list data type includes builtin
methods. The remove(x) method removes the first element of a list whose value is
x, and decreases the length of the list by 1. The following command removes the
first element of list a whose value is bb.
>>> a.remove('bb')
>>> a
['dd', 'zz', 'rr']

append()

The append(x) method appends an element whose value is x to the list, and increases
the length of the list by 1.
>>> a.append('mm')
>>> a
['dd', 'zz', 'rr', 'mm']

reverse()

The reverse() method does not take an argument. It is an efficient method that
reverses elements of a list in place, overwriting elements of the list with new values.
>>> a.reverse()
>>> a
['mm', 'rr', 'zz', 'dd']

sort()

The sort() method does not take an argument. It sorts the elements in a list in place,
overwriting elements of the list with new values.

Scalar Variables, Lists, and Dictionaries 571
>>> a.sort()
>>> a
['dd', 'mm', 'rr', 'zz']
sorted()

If you do not want to alter the contents of the list (or other iterable data structure)
you are sorting, use the sorted() function. This function returns the sorted list and
does not change the original list.
>>> b = sorted(a)
>>> a
['mm', 'rr', 'zz', 'dd']
>>> b
['dd', 'mm', 'rr', 'zz']

len()

The len() function returns the number of elements in a list or other iterable data
structure.
>>> len(a)
4

Table 12-2 lists some of the methods that work on lists. The command help(list)
displays a complete list of methods you can use with a list.

Table 12-2

list methods

Method

What it does

append(x )

Appends the value x to the list (previous page)

count(x )

Returns the number of times the value x occurs in the list

index(x )

Returns the index of the first occurrence of x in the list

remove(x )

Removes the first element of a list whose value is x (previous page)

reverse()

Reverses the order of elements in the list (previous page)

sort()

Sorts the list in place (previous page)

Working with Lists
Passing a list by
reference

Python passes all objects, including lists, by reference. That is, it passes an object by
passing a pointer to the object. When you assign one object to another, you are simply
creating another name for the object—you are not creating a new object. When you
change the object using either name, you can view the change using either name. In
the following example, names is instantiated as a list that holds the values sam, max,
and zach; copy is set equal to names, setting up another name for (reference to) the
same list. When the value of the first element of copy is changed, displaying names
shows that its first element has also changed.
>>> names = ['sam', 'max', 'zach']
>>> copy = names
>>> names
['sam', 'max', 'zach']
>>> copy[0] = 'helen'
>>> names
['helen', 'max', 'zach']

572 Chapter 12 The Python Programming Language
Copying a list

When you use the syntax b = a[:] to copy a list, each list remains independent of the
other. The next example is the same as the previous one, except copy2 points to a
different location than names because the list was copied, not passed by reference:
Look at the difference in the values of the first element (zero index) of both lists.
>>> names = ['sam', 'max', 'zach']
>>> copy2 = names[:]
>>> copy2[0] = 'helen'
>>> names
['sam', 'max', 'zach']
>>> copy2
['helen', 'max', 'zach']

Lists Are Iterable
An important feature of lists is that they are iterable, meaning a control structure such
as for (page 577) can loop (iterate) through each item in a list. In the following example,
the for control structure iterates over the list a, assigning one element of a to item each
time through the loop. The loop terminates after it has assigned each of the elements of
a to item. The comma at the end of the print statement replaces the NEWLINE print normally
outputs with a SPACE. You must indent lines within a control structure (called a logical
block; page 574).
>>> a
['bb', 'dd', 'zz', 'rr']
>>> for item in a:
...
print item,
...
bb dd zz rr

The next example returns the largest element in a list. In this example, the list is
embedded in the code; see page 588 for a similar program that uses random numbers.
The program initializes my_rand_list as a list holding ten numbers and largest as a
scalar with a value of –1. The for structure retrieves the elements of my_rand_list in
order, one each time through the loop. It assigns the value it retrieves to item. Within
the for structure, an if statement tests to see if item is larger than largest. If it is, the
program assigns the value of item to largest and displays the new value (so you can
see how the program is progressing). When control exits from the for structure, the
program displays a message and the largest number. Be careful when a logical block
(a subblock) appears within another logical block: You must indent the subblock one
level more than its superior logical block.
$ cat my_max.py
#!/usr/bin/python
my_rand_list = [5, 6, 4, 1, 7, 3, 2, 0, 9, 8]
largest = -1
for item in my_rand_list:
if (item > largest):
largest = item
print largest,
print
print 'largest number is ', largest

Scalar Variables, Lists, and Dictionaries 573
$ ./my_max.py
5 6 7 9
largest number is

9

See page 589 for an easier way to find the maximum value in a list.

Dictionaries
A Python dictionary holds unordered key–value pairs in which the keys must be
unique. Other languages refer to this type of data structure as an associative array,
hash, or hashmap. Like lists, dictionaries are iterable. A dictionary provides fast lookups. The class name for dictionary is dict; thus you must type help(dict) to display
the help page for dictionaries. Use the following syntax to instantiate a dictionary:
dict = { key1 : value1, key2 : value2, key3 : value3 ... }

Working with Dictionaries
The following example instantiates and displays a telephone extension dictionary
named ext. Because dictionaries are unordered, Python does not display a dictionary
in a specific order and usually does not display it in the order it was created.
>>> ext = {'sam': 44, 'max': 88, 'zach': 22}
>>> ext
{'max': 88, 'zach': 22, 'sam': 44}
keys() and values()

You can use the keys() and values() methods to display all keys or values held in a
dictionary:
>>> ext.keys()
['max', 'zach', 'sam']
>>> ext.values()
[88, 22, 44]

You can add a key–value pair:
>>> ext['helen'] = 92
>>> ext
{'max': 88, 'zach': 22, 'sam': 44, 'helen': 92}

If you assign a value to a key that is already in the dictionary, Python replaces the
value (keys must be unique):
>>> ext['max'] = 150
>>> ext
{'max': 150, 'zach': 22, 'sam': 44, 'helen': 92}

The following example shows how to remove a key–value pair from a dictionary:
>>> del ext['max']
>>> ext
{'zach': 22, 'sam': 44, 'helen': 92}

You can also query the dictionary. Python returns the value when you supply the key:
>>> ext['zach']
22

574 Chapter 12 The Python Programming Language
items()

The items() method returns key–value pairs as a list of tuples (pairs of values).
Because a dictionary is unordered, the order of the tuples returned by items() can vary
from run to run.
>>> ext.items()
[('zach', 22), ('sam', 44), ('helen', 92)]

Because dictionaries are iterable, you can loop through them using for.
>>> ext = {'sam': 44, 'max': 88, 'zach': 22}
>>> for i in ext:
...
print i
...
max
zach
sam

Using this syntax, the dictionary returns just keys; it is as though you wrote for i in
ext.keys(). If you want to loop through values, use for i in ext.values().
Keys and values can be of different types within a dictionary:
>>> dic = {500: 2, 'bbbb': 'BBBB', 1000: 'big'}
>>> dic
{1000: 'big', 'bbbb': 'BBBB', 500: 2}

Control Structures
Control flow statements alter the order of execution of statements within a program.
Starting on page 418, Chapter 10 discusses bash control structures in detail and
includes flow diagrams of their operation. Python control structures perform the same
functions as their bash counterparts, although the two languages use different syntaxes.
The description of each control structure in this section references the discussion of the
same control structure under bash.
In this section, the bold italic words in the syntax description are the items you supply
to cause the structure to have the desired effect; the nonbold italic words are the keywords Python uses to identify the control structure. Many of these structures use an
expression, denoted as expr, to control their execution. The examples in this chapter
delimit expr using parentheses for clarity and consistency; the parentheses are not
always required.
Indenting logical
blocks

In most programming languages, control structures are delimited by pairs of parentheses,
brackets, or braces; bash uses keywords (e.g., if...fi, do...done). Python uses a colon (:)
as the opening token for a control structure. While including SPACEs (or TABs) at the beginning of lines in a control structure is good practice in other languages, Python requires
these elements; they indicate a logical block, or section of code, that is part of a control
structure. The last indented line marks the end of the control structure; the change in
indent level is the closing token that matches the opening colon.

Control Structures

575

if
Similar to the bash if...then control structure (page 419), the Python if control structure
has the following syntax:
if expr:
...
As with all Python control structures, the control block, denoted by ..., must be indented.
In the following example, my_in != '' (if my_in is not an empty string) evaluates to
true if the user entered something before pressing RETURN. If expr evaluates to true,
Python executes the following indented print statement. Python executes any number
of indented statements that follow an if statement as part of the control structure. If
expr evaluates to false, Python skips any number of indented statements following
the if statement.
$ cat if1.py
#!/usr/bin/python
my_in = raw_input('Enter your name: ')
if (my_in != ''):
print 'Thank you, ' + my_in
print 'Program running, with or without your input.'
$ ./if1.py
Enter your name: Neo
Thank you, Neo.
Program running, with or without your input.

if...else
Similar to the bash if...then...else control structure (page 423), the if...else control
structure implements a two-way branch using the following syntax:
if expr:
...
else:
...
If expr evaluates to true, Python executes the statements in the if control block.
Otherwise, it executes the statements in the else control block. The following
example builds on the previous one, displaying an error message and exiting from
the program if the user does not enter something.
$ cat if2.py
#!/usr/bin/python
my_in = raw_input('Enter your name: ')
if (my_in != ''):
print 'Thank you, ' + my_in
else:
print 'Program requires input to continue.'
exit()
print 'Program running with your input.'

576 Chapter 12 The Python Programming Language
$ ./if2.py
Enter your name: Neo
Thank you, Neo
Program running with your input.
$ ./if2.py
Enter your name:
Program requires input to continue.

if...elif...else
Similar to the bash if...then...elif control structure (page 425), the Python if...elif...else
control structure implements a nested set of if...else structures using the following
syntax:
if (expr):
...
elif (expr)
...
else:
...
This control structure can include as many elif control blocks as necessary. In the following example, the if statement evaluates the Boolean expression following it within
parentheses and enters the indented logical block below the statement if the expression
evaluates to true. The if, elif, and else statements are part of one control structure and
Python will execute statements in only one of the indented logical blocks.
The if and elif statements are each followed by a Boolean expression. Python executes
each of their corresponding logical blocks only if their expression evaluates to true. If
none of the expressions evaluates to true, control falls through to the else logical block.
$ cat bignum.py
#!/usr/bin/python
input = raw_input('Please enter a number: ')
if (input == '1'):
print 'You entered one.'
elif (input == '2'):
print 'You entered two.'
elif (input == '3'):
print 'You entered three.'
else:
print 'You entered a big number...'
print 'End of program.'

In the preceding program, even though the user enters an integer/scalar value, Python
stores it as a string. Thus each of the comparisons checks whether this value is equal
to a string. You can use int() to convert a string to an integer. If you do so, you must
remove the quotation marks from around the values:
...
input = int(raw_input('Please enter a number: '))
if (input == 1):
print 'You entered one.'
...

Control Structures

577

while
The while control structure (page 435) evaluates a Boolean expression and continues execution while the expression evaluates to true. The following program,
run using the Python interactive shell, displays 0 through 9. As you enter the
command, Python displays its secondary prompt (.. .) when it requires more
input to complete a statement; you must still enter SPACE or TAB characters to indent
the logical block.
First, the program initializes count to 0. The first time through the loop, the while
expression evaluates to true and Python executes the indented statements that make
up the while control structure logical block. The comma at the end of the print
statement causes print to output a SPACE instead of a NEWLINE after each string, and the
count += 1 statement increments count each time through the loop. When control
reaches the bottom of the loop, Python returns control to the while statement,
where count is now 1. The loop continues while count is less than or equal to 10.
When count equals 11, the while statement evaluates to false and control passes to
the first statement after the logical block (the first statement that is not indented;
there is none in this example).
>>> count = 0
>>> while (count <= 10):
...
print count,
...
count += 1
...
0 1 2 3 4 5 6 7 8 9 10

Be careful not to create an infinite loop
tip It is easy to accidentally create an infinite loop using a while statement. Ensure there is a reachable
exit condition (e.g., a counter is compared to a finite value and the counter is incremented each
time through the loop).

for
The for control structure (page 433) assigns values from a list, string, or other iterable
data structure (page 572) to a loop index variable each time through the loop.
Lists are iterable

In the following example, lis is a list that holds the names of four types of animals.
The for statement iterates through the elements of lis in order, starting with turkey,
and assigning a value to nam each time it is called. The print statement in the logical
block displays the value of nam each time through the loop. Python exits from the
logical block when it runs out of elements in lis.
>>> lis = ['turkey', 'pony', 'dog', 'fox']
>>> for nam in lis:
...
print nam
...
turkey
pony
dog
fox

578 Chapter 12 The Python Programming Language
Strings are iterable

The next example demonstrates that strings are iterable. The string named string holds
My name is Sam. and the for statement iterates through string, assigning one character
to char each time through the loop. The print statement displays each character; the
comma causes print to put a SPACE after each character instead of a NEWLINE.
>>> string = 'My name is Sam.'
>>> for char in string:
...
print char,
...
M y
n a m e
i s
S a m .

range()

The range() function returns a list that holds the integers between the two values
specified as its arguments, including the first but excluding the last. An optional
third parameter defines a step value.
>>>
[1,
>>>
[0,

range(1,6)
2, 3, 4, 5]
range(0,10,3)
3, 6, 9]

The next example shows how to use range() in a for loop. In this example, range()
returns a list comprising 0, 3, 6, and 9. The for control structure loops over these
values, executing the indented statement each time through the loop.
>>> for cnt in range(0,10,3):
...
print cnt
...
0
3
6
9

optional
xrange()

The range() function is useful for generating short lists but, because it stores the list
it returns in memory, it takes up a lot of system resources when it generates longer
lists. In contrast, xrange() has a fixed memory footprint that is independent of the
length of the list it returns; as a consequence, it uses fewer system resources than
range() when working with long lists.
The two functions work differently. Whereas range() fills a list with values and stores
that list in memory, xrange() works only when you iterate through the values it
returns: The entire list is never stored in memory.
>>> range(1,11)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> xrange(1,11)
xrange(1, 11)
>>> for cnt in xrange(1,11):
...
print cnt,
...
1 2 3 4 5 6 7 8 9 10

Reading from and Writing to Files 579

Reading from and Writing to Files
Python allows you to work with files in many ways. This section explains how to read
from and write to text files and how to preserve an object in a file using pickle.

File Input and Output
open()

The open() function opens a file and returns a file object called a file handle; it can
open a file in one of several modes (Table 12-3). Opening a file in w (write) mode
truncates the file; use a (append) mode if you want to add to a file. The following
statement opens the file in Max’s home directory named test_file in read mode; the
file handle is named f.
f = open('/home/max/test_file', 'r')

Table 12-3

File modes

Mode

What it does

r

Read only

Error if file does not exist.

w

Write only

File is created if it does not exist. File is truncated if it exists.

r+

Read and write

a

Append

a+

Append and read

b

Binary

File is created if it does not exist.

File is created if it does not exist.
File is created if it does not exist.

Append to r or w to work with binary files.

Once the file is opened, you direct input and output using the file handle with one of
the methods listed in Table 12-4. When you are finished working with a file, use
close() to close it and free the resources the open file is using.

Table 12-4

File object methods

Method

Arguments

Returns or action

close()

None

Closes the file

isatty()

None

Returns true if the file is connected to a
terminal; false otherwise

read()

Maximum number of bytes to
read (optional)

Reads until EOF or specified maximum
number of bytes; returns file as a string

readline()

Maximum number of bytes to
read (optional)

Reads until NEWLINE or specified maximum
number of bytes; returns line as a string

readlines()

Maximum number of bytes to
read (optional)

Calls readline() repeatedly and returns a list
of lines (iterable)

write(str)

String to be written

Writes to the file

writelines(strs)

List of strings

Calls write() repeatedly, once with each item
in the list

580 Chapter 12 The Python Programming Language

The following example reads from /home/max/test_file, which holds three lines. It
opens this file in read mode and assigns the file handle f to the open file. It uses the
readlines() method, which reads the entire file into a list and returns that list.
Because the list is iterable, Python passes to the for control structure one line from
test_file each time through the loop. The for structure assigns the string value of this
line to ln, which print then displays. The strip() method removes whitespace and/or
a NEWLINE from the end of a line. Without strip(), print would output two NEWLINEs: the
one that terminates the line from the file and the one it automatically appends to
each line it outputs. After reading and displaying all lines from the file, the example
closes the file.
>>> f = open('/home/max/test_file', 'r')
>>> for ln in f.readlines():
...
print ln.strip()
...
This is the first line
and here is the second line
of this file.
>>> f.close()

The next example opens the same file in append mode and writes a line to it using
write(). The write() method does not append a NEWLINE to the line it outputs, so you
must terminate the string you write to the file with a \n.
>>> f = open('/home/max/test_file','a')
>>> f.write('Extra line!\n')
>>> f.close()

optional In the example that uses for, Python does not call the readlines() method each time
through the for loop. Instead, it reads the file into a list the first time readlines() is
called and then iterates over the list, setting ln to the value of the next line in the list
each time it is called subsequently. It is the same as if you had written
>>> f = open('/home/max/test_file', 'r')
>>> lines = f.readlines()
>>> for ln in lines:
...
print ln.strip()

It is more efficient to iterate over the file handle directly because this technique does
not store the file in memory.
>>> f = open('/home/max/test_file', 'r')
>>> for ln in f:
...
print ln.strip()

Exception Handling
An exception is an error condition that changes the normal flow of control in a
program. Although you can try to account for every problem your code will need

Reading from and Writing to Files 581

to deal with, it is not always possible to do so: Unknown circumstances might
arise. What if the file the previous programs opened does not exist? Python raises
an IOError (input/output error) number 2 and displays the message No such file
or directory.
>>> f = open('/home/max/test_file', 'r')
Traceback (most recent call last):
File "", line 1, in 
IOError: [Errno 2] No such file or directory: '/home/max/test_file'

Instead of allowing Python to display what might be an incomprehensible error message
and quit, a well-written program handles exceptions like this one gracefully. Good
exception handling can help you debug a program and can also provide a nontechnical
user with clear information about why the program failed.
The next example wraps the open statement that failed with an exception handler in
the form of a try...except control structure. This structure attempts to execute the try
block. If execution of the try block fails (has an exception), it executes the code in
the except block. If execution of the try block succeeds, it skips the except block.
Depending on the severity of the error, the except block should warn the user that
something might not be right or should display an error message and exit from the
program.
>>> try:
...
f = open('/home/max/test_file', 'r')
... except:
...
print "Error on opening the file."
...
Error on opening the file.

You can refine the type of error an except block handles. Previously, the open statement
returned an IOError. The following program tests for an IOError when it attempts to
open a file, displays an error message, and exits. If it does not encounter an IOError, it
continues normally.
$ cat except1.py
#!/usr/bin/python
try:
f = open('/home/max/test_file', 'r')
except IOError:
print "Cannot open file."
exit()
print "Processing file."
$ ./except1.py
Cannot open file.
$
$ touch test_file
$ ./except1.py
Processing file.

582 Chapter 12 The Python Programming Language

Pickle
The pickle module allows you to store an object in a file in a standard format for later
use by the same or a different program. The object you store can be any type of object
as long as it does not require an operating system resource such as a file handle or
network socket. The standard pickle filename extension is .p. For more information
visit wiki.python.org/moin/UsingPickle.

Never unpickle data received from an untrusted source
security The pickle module is not secure against maliciously constructed data. When you unpickle an
object, you are trusting the person who created it. Do not unpickle an object if you do not know
or trust its source.

This section discusses two pickle methods: dump(), which writes an object to disk,
and load(), which reads an object from disk. The syntax for these methods is
pickle.dump(objectname, open(filename, 'wb'))
pickle.load(objectname, open(filename, 'rb'))
It is critical that you open the file in binary mode (wb and rb). The load() method
returns the object; you can assign the object the same name as or a different name
from the original object. Before you can use pickle, you must import it (page 587).
In the following example, after importing pickle, pickle.dump() creates a file named
pres.p in wb (write binary) mode with a dump of the preserves list, and exit() leaves
the Python interactive shell:
>>> import pickle
>>> preserves = ['apple', 'cherry', 'blackberry', 'apricot']
>>> preserves
['apple', 'cherry', 'blackberry', 'apricot']
>>> pickle.dump(preserves, open('pres.p', 'wb'))
exit()

The next example calls the Python interactive shell and pickle.load() reads the pres.p
file in rb (read binary) mode. This method returns the object you originally saved
using dump(). You can give the object any name you like.
$ python
...
>>> import pickle
>>> jams = pickle.load(open('pres.p', 'rb'))
>>> jams
['apple', 'cherry', 'blackberry', 'apricot']

Regular Expressions

583

Regular Expressions
The Python re (regular expression) module handles regular expressions. Python regular
expressions follow the rules covered in Appendix A and are similar to Perl regular
expressions, which are covered starting on page 550. This section discusses a few of the
tools in the Python re library. You must give the command import re before you can
use the re methods. To display Python help on the re module, give the command help()
from the Python interactive shell and then type re.
findall()

One of the simplest re methods is findall(), which returns a list that holds matches
using the following syntax:
re.findall(regex, string)
where regex is the regular expression and string is the string you are looking for a match in.
The regex in the following example (hi) matches the three occurrences of hi in the
string (hi hi hi hello).
>>> import re
>>> a = re.findall('hi', 'hi hi hi hello')
>>> print a
['hi', 'hi', 'hi']

Because findall() returns a list (it is iterable), you can use it in a for statement. The
following example uses the period (.) special character, which, in a regular expression, matches any character. The regex (hi.) matches the three occurrences of hi
followed by any character in the string.
>>> for mat in re.findall('hi.', 'hit him hid hex'):
...
print mat,
...
hit him hid
search()

The search() re method uses the same syntax as findall() and looks through string for
a match to regex. Instead of returning a list if it finds a match, however, it returns a
MatchObject. Many re methods return a MatchObject (and not a list) when they find
a match for the regex in the string.
>>> a = re.search('hi.', 'bye hit him hex')
>>> print a
<_sre.SRE_Match object at 0xb7663a30>

bool()

The bool() function returns true or false based on its argument. Because you can test
the MatchObject directly, bool() is not used often in Python programming, but is
included here for its instructional value. A MatchObject evaluates to true because it
indicates a match. [Although findall() does not return a MatchObject, it does evaluate
to true when it finds a match.]
>>> bool(a)
True

584 Chapter 12 The Python Programming Language
group()

The group() method allows you to access the match a MatchObject holds.
>>> a.group(0)
'hit'

type()

When no match exists, search() returns a NoneType object [as shown by the type()
function], which evaluates to None or in a Boolean expression evaluates to false.
>>> a = re.search('xx.', 'bye hit him hex')
>>> type(a)

>>> print a
None
>>> bool(a)
False

Because a re method in a Boolean context evaluates to true or false, you can use a re
method as the expression an if statement evaluates. The next example uses search()
as the expression in an if statement; because there is a match, search() evaluates as
true, and Python executes the print statement.
>>> name = 'sam'
>>> if(re.search(name,'zach max sam helen')):
...
print 'The list includes ' + name
...
The list includes sam
match()

The match() method of the re object uses the same syntax as search(), but looks only
at the beginning of string for a match to regex.
>>> name = 'zach'
>>> if(re.match(name,'zach max sam helen')):
...
print 'The list includes ' + name
...
The list includes zach

Defining a Function
A Python function definition must be evaluated before the function is called, so it
generally appears in the code before the call to the function. The contents of the
function, as with other Python logical blocks, must be indented. The syntax of a
function definition is
def my_function(args):
...
Python passes lists and other data structures to a function by reference (page 571),
meaning that when a function modifies a data structure that was passed to it as an
argument, it modifies the original data structure. The following example demonstrates
this fact:

Using Libraries 585
>>>
...
...
...
>>>
>>>
>>>
[1,

def add_ab(my_list):
my_list.append('a')
my_list.append('b')
a = [1,2,3]
add_ab(a)
a
2, 3, 'a', 'b']

You can pass arguments to a function in three ways. Assume the function place_stuff
is defined as follows. The values assigned to the arguments in the function definition
are defaults.
>>> def place_stuff(x = 10, y = 20, z = 30):
...
return x, y, z
...

If you call the function and specify arguments, the function uses those arguments:
>>> place_stuff(1,2,3)
(1, 2, 3)

If you do not specify arguments, the function uses the defaults:
>>> place_stuff()
(10, 20, 30)

Alternatively, you can specify values for some or all of the arguments:
>>> place_stuff(z=100)
(10, 20, 100)

Using Libraries
This section discusses the Python standard library, nonstandard libraries, and Python
namespace, as well as how to import and use a function.

Standard Library
The Python standard library, which is usually included in packages installed with
Python, provides a wide range of facilities, including functions, constants, string
services, data types, file and directory access, cryptographic services, and file formats. Visit docs.python.org/library/index.html for a list of the contents of the
standard library.

Nonstandard Libraries
In some cases a module you want might be part of a library that is not included with
Python. You can usually find what you need in the Python Package Index (PyPI;
pypi.python.org), a repository of more than 22,000 Python packages.

586 Chapter 12 The Python Programming Language

You can find lists of modules for the distribution you are using by searching the Web
for distro package database, where distro is the name of the Linux distribution you
are using, and then searching one of the databases for python.

SciPy and NumPy Libraries
Two popular libraries are SciPy and NumPy. The SciPy (“Sigh Pie”; scipy.org) library
holds Python modules for mathematics, science, and engineering. It depends on
NumPy (numpy.scipy.org), a library of Python modules for scientific computing.
You must download and install the package that holds NumPy before you can use
any of its modules. Under Debian/Ubuntu/Mint and openSuSE, the package is named
python-numpy; under Fedora/RHEL, it is named numpy. If you are running OS X,
visit www.scipy.org/Installing_SciPy/Mac_OS_X. See Appendix C for instructions on
downloading and installing packages. Alternatively, you can obtain the libraries from
scipy.org or pypi.python.org. Once you import SciPy (import scipy), help(scipy) will
list the functions you can import individually.

Namespace
A namespace comprises a set of names (identifiers) in which all names are unique. For
example, the namespace for a program might include an object named planets. You
might instantiate planets as an integer:
>>> planets = 5
>>> type(planets)


Although it is not good programming practice, later on in the program you could
assign planets a string value. It would then be an object of type string.
>>> planets = 'solar system'
>>> type(planets)


You could make planets a function, a list, or another type of object. Regardless, there
would always be only one object named planets (identifiers in a namespace must be
unique).
When you import a module (including a function), Python can merge the namespace
of the module with the namespace of your program, creating a conflict. For example,
if you import the function named sample from the library named random and then
define a function with the same name, you will no longer be able to access the original
function:
>>>
>>>
[6,
>>>
...
...

from random import sample
sample(range(10), 10)
9, 0, 7, 3, 5, 2, 4, 1, 8]
def sample(a, b):
print 'Problem?'

Using Libraries 587
>>> sample(range(10), 10)
Problem?

The next section discusses different ways you can import objects from a library and
steps you can take to avoid the preceding problem.

Importing a Module
You can import a module in one of several ways. How you import the module
determines whether Python merges the namespace of the module with that of your
program.
The simplest thing to do is to import the whole module. In this case Python does
not merge the namespaces but allows you to refer to an object from the module by
prefixing its name with the name of the module. The following code imports the
random module. Using this syntax, the function named sample is not defined: You
must call it as random.sample.
>>> import random
>>> sample(range(10), 10)
Traceback (most recent call last):
File "", line 1, in 
NameError: name 'sample' is not defined
>>> random.sample(range(10), 10)
[1, 0, 6, 9, 8, 3, 2, 7, 5, 4]

This setup allows you to define your own function named sample. Because Python
has not merged the namespaces from your program and the module named random,
the two functions can coexist.
>>>
...
...
>>>
Not
>>>
[2,
Importing part of a
module

def sample(a, b):
print 'Not a problem.'
sample(1, 2)
a problem.
random.sample(range(10), 10)
9, 6, 5, 1, 3, 4, 0, 7, 8]

Another way to import a module is to specify the module and the object.
>>> from random import sample

When you import an object using this syntax, you import only the function named
sample; no other objects from that module will be available. This technique is very
efficient. However, Python merges the namespaces from your program and from
the object, which can give rise to the type of problem illustrated in the previous
section.
You can also use from module import *. This syntax imports all names from module
into the namespace of your program; it is generally not a good idea to use this
technique.

588 Chapter 12 The Python Programming Language

Example of Importing a Function
The my_max.py program on page 572 finds the largest element in a predefined list.
The following program works the same way, except at runtime it fills a list with
random numbers.
The following command imports the sample() function from the standard library
module named random. You can install an object from a nonstandard library, such
as NumPy, the same way.
from random import sample

After importing this function, the command help(sample) displays information about
sample(). The sample() function has the following syntax:
sample(list, number)
where list is a list holding the population, or values sample() can return, and number
is the number of random numbers in the list sample() returns.
>>>
[7,
>>>
[0,

sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 4)
1, 2, 5]
sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 6)
5, 4, 1, 3, 7]

The following program uses the range() function (page 578), which returns a list
holding the numbers from 0 through 1 less than its argument:
>>>
[0,
>>>
[0,

range(8)
1, 2, 3, 4, 5, 6, 7]
range(16)
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

When you combine the two functions, range() provides a list of values and sample()
selects values randomly from that list.
>>> sample(range(100),10)
[5, 32, 70, 93, 74, 29, 90, 7, 30, 11]

In the following program, sample generates a list of random numbers and for iterates
through them.
$ cat my_max2.py
#!/usr/bin/python
from random import sample
my_rand_list = sample(range(100), 10)
print 'Random list of numbers:', my_rand_list
largest = -1
for item in my_rand_list:
if (item > largest):
largest = item
print largest,
print
print 'largest number is ', largest

Lambda Functions 589
$ ./my_max2.py
random list of
67 99
largest number
$ ./my_max2.py
random list of
53 76
largest number
max()

numbers: [67, 40, 1, 29, 9, 49, 99, 95, 77, 51]
is

99

numbers: [53, 33, 76, 35, 71, 13, 75, 58, 74, 50]
is

76

The algorithm used in this example is not the most efficient way of finding the maximum
value in a list. It is more efficient to use the max() builtin function.
>>> from random import sample
>>> max(sample(range(100), 10))
96

optional

Lambda Functions
Python supports Lambda functions—functions that might not be bound to a name.
You might also see them referred to as anonymous functions. Lambda functions are
more restrictive than other functions because they can hold only a single expression.
In its most basic form, Lambda is another syntax for defining a function. In the following example, the object named a is a Lambda function and performs the same task
as the function named add_one:
>>> def add_one(x):
...
return x + 1
...
>>> type (add_one)

>>> add_one(2)
3
>>> a = lambda x: x + 1
>>> type(a)

>>> a(2)
3
map()

You can use the Lambda syntax to define a function inline as an argument to a function
such as map() that expects another function as an argument. The syntax of the map()
function is
map(func, seq1[, seq2, ...])
where func is a function that is applied to the sequence of arguments represented by
seq1 (and seq2 ...). Typically the sequences that are arguments to map() and the

590 Chapter 12 The Python Programming Language

object returned by map() are lists. The next example first defines a function named
times_two():
>>> def times_two(x):
...
return x * 2
...
>>> times_two(8)
16

Next, the map() function applies times_two() to a list:
>>> map(times_two, [1, 2, 3, 4])
[2, 4, 6, 8]

You can define an inline Lambda function as an argument to map(). In this example
the Lambda function is not bound to a name.
>>> map(lambda x: x * 2, [1, 2, 3, 4])
[2, 4, 6, 8]

List Comprehensions
List comprehensions apply functions to lists. For example, the following code, which
does not use a list comprehension, uses for to iterate over items in a list:
>>> my_list = []
>>> for x in range(10):
...
my_list.append(x + 10)
...
>>> my_list
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

You can use a list comprehension to perform the same task neatly and efficiently. The
syntax is similar, but a list comprehension is enclosed within square brackets and the
operation (x + 10) precedes the iteration [for x in range(10)].
>>> my_list = [x + 10 for x in range(10)]
>>> my_list
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

The results when using a for structure and a list comprehension are the same. The
next example uses a list comprehension to fill a list with powers of 2:
>>> potwo = [2**x for x in range(1, 13)]
>>> print potwo
[2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]

The next list comprehension fills a list with even numbers. The if clause returns values
only if the remainder after dividing a number by 2 is 0 (if x % 2 == 0).
>>> [x for x in range(1,11) if x % 2 == 0]
[2, 4, 6, 8, 10]

Chapter Summary

591

The final example shows nested list comprehensions. It nests for loops and uses x + y
to catenate the elements of both lists in all combinations.
>>> A = ['a', 'b', 'c']
>>> B = ['1', '2', '3']
>>> all = [x + y for x in A for y in B]
>>> print all
['a1', 'a2', 'a3', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3']

Chapter Summary
Python is an interpreted language: It translates code into bytecode at runtime and
executes the bytecode within the Python virtual machine. You can run a Python program from the Python interactive shell or from a file. The Python interactive shell is
handy for development because you can use it to debug and execute code one line at
a time and see the results immediately. Within the Python interactive shell, Python
displays output from any command line that does not have an action. From this shell
you can give the command help() to use the Python help feature or you can give the
command help('object') to display help on object.
Functions help improve code readability, efficiency, and expandability. Many
function are available when you first call Python (builtin functions) and many
more are found in libraries that you can download and/or import. Methods are
similar to functions except they work on objects whereas functions stand on their
own. Python allows you to define both normal functions and nameless functions
called Lambda functions.
Python enables you to read from and write to files in many ways. The open() function
opens a file and returns a file object called a file handle. Once the file is opened, you
direct input and output using the file handle. When you are finished working with a
file, it is good practice to close it. The pickle module allows you to store an object in
a file in a standard format for later use by the same or a different program.
A Python list is an object that comprises one or more elements; it is similar to an array
in C or Java. An important feature of lists is that they are iterable, meaning a control
structure such as for can loop (iterate) through each item in a list. A Python dictionary
holds unordered key–value pairs in which the keys are unique. Like lists, dictionaries
are iterable.
Python implements many control structures, including if...else, if...elif...else, while,
and for. Unlike most languages, Python requires SPACEs (or TABs) at the beginning of
lines in a control structure. The indented code marks a logical block, or section of
code, that is part of the control structure.
Python regular expressions are implemented by the Python re (regular expression)
module. You must give the command import re before you can use the re methods.

592 Chapter 12 The Python Programming Language

Exercises
1. What is meant by implied display? Is it available in the Python interactive
shell or from a program file? Provide a simple example of implied display.
2. Write and run a Python program that you store in a file. The program
should demonstrate how to prompt the user for input and display the string
the user entered.
3. Using the Python interactive shell, instantiate a list that holds three-letter
abbreviations for the first six months of the year and display the list.
4. Using the Python interactive shell, use a for control structure to iterate
through the elements of the list you instantiated in exercise 3 and display
each abbreviated name followed by a period on a line by itself. (Hint: The
period is a string.)
5. Using the Python interactive shell, put the elements of the list you instantiated
in exercise 3 in alphabetical order.
6. Instantiate a dictionary in which the keys are the months in the third quarter
of the year and the values are the number of days in the corresponding
month. Display the dictionary, the keys, and the values. Add the tenth
month of the year to the dictionary and display the value of that month only.
7. What does iterable mean? Name two builtin objects that are iterable. Which
control structure can you use to loop through an iterable object?
8. Write and demonstrate a Lambda function named stg() that appends .txt to
its argument. What happens when you call the function with an integer?

Advanced Exercises
9. Define a function named cents that returns its argument divided by 100 and
truncated to an integer. For example:
>>> cents(12345)
123

10. Define a function named cents2 that returns its argument divided by 100
exactly (and includes decimal places if necessary). Make sure your function
does not truncate the answer. For example:
>>> cents2(12345)
123.45

11. Create a list that has four elements. Make a copy of the list and change one
of the elements in the copy. Show that the same element in the original list
did not change.

Advanced Exercises

593

12. Why does the following assignment statement generate an error?
>>> x.y = 5
Traceback (most recent call last):
File "", line 1, in 
NameError: name 'x' is not defined

13. Call map() with two arguments:
1. A Lambda function that returns the square of the number it was called
with
2. A list holding the even numbers between 4 and 15; generate the list inline
using range()
14. Use a list comprehension to display the numbers from 1 through 30 inclusive
that are divisible by 3.
15. Write a function that takes an integer, val, as an argument. The function
asks the user to enter a number. If the number is greater than val, the
function displays Too high. and returns 1; if the number is less than val,
the function displays Too low. and returns –1; if the number equals val,
the function displays Got it! and returns 0. Call the function repeatedly
until the user enters the right number.
16. Rewrite exercise 15 to call the function with a random number between 0
and 10 inclusive. (Hint: The randint function in the random library returns
a random number between its two arguments inclusive.)
17. Write a function that counts the characters in a string the user inputs. Then
write a routine that calls the function and displays the following output.
$ ./count_letters.py
Enter some words: The rain in Spain
The string "The rain in Spain" has 17 characters in it.

18. Write a function that counts the vowels (aeiou) in a string the user inputs.
Make sure it counts upper- and lowercase vowels. Then write a routine that
calls the function and displays the following output.
$ ./count_vowels.py
Enter some words: Go East young man!
The string "Go East young man!" has 6 vowels in it.

19. Write a function that counts all the characters and the vowels in a string
the user inputs. Then write a routine that calls the function and displays
the following output.
$ ./count_all.py
Enter some words: The sun rises in the East and sets in the West.
13 letters in 47 are vowels.

13
The MySQL Database
Management System

Chapter13
13

In This Chapter

Objectives

Terminology. . . . . . . . . . . . . . . . . . 596

After reading this chapter you should be able to:

Installing a MySQL Server
and Client. . . . . . . . . . . . . . . . . . 599

 Explain what SQL is and how it relates to MySQL

Client Options . . . . . . . . . . . . . . . . 600

 Explain what a database, table, row, and column are
and the relationship between them

Setting Up MySQL . . . . . . . . . . . . . 601

 Install a MySQL server and client on the local system

Creating a Database . . . . . . . . . . . 603

 Set up MySQL, including a ~/.my.cnf file for a user

Adding a User . . . . . . . . . . . . . . . . 604

 Create a database and add a user

Examples . . . . . . . . . . . . . . . . . . . . 605

 Add data to and retrieve data from the database

Creating a Table . . . . . . . . . . . . . . 605

 Modify data in the database

Adding Data. . . . . . . . . . . . . . . . . . 607

 Add a second table and write joins to retrieve data
from both tables

Backing Up a Database . . . . . . . . 610
Modifying Data . . . . . . . . . . . . . . . 611

596 Chapter 13 The MySQL Database Management System

MySQL (My Structured Query Language) is an implementation of SQL. It is the
world’s most popular open-source relational database management system (RDBMS).
It is extremely fast and is used by some of the most frequently visited Web sites on the
Internet, including Google, Facebook, Twitter, Yahoo, YouTube, and Wikipedia.
Michael Widenius and David Axmark started development of MySQL in 1994.
Today the MySQL database is owned and supported by Oracle Corporation, which
acquired the former owner, Sun Microsystems, in 2010.
Many programming languages provide interfaces and bindings to MySQL, including
C, Python, PHP, and Perl. In addition, you can access a MySQL database using the
industry-standard Open Database Connectivity (ODBC) API. You can also call
MySQL from a shell script or the command line. MySQL is a core component of the
popular LAMP (Linux, Apache, MySQL, PHP/Perl/Python) open-source enterprise
software stack.

Notes
MySQL has a separate set of users from Linux: Users who have MySQL accounts
might not have Linux accounts, and vice versa. As installed, the name of the MySQL
administrator is root. Because the MySQL root user is not the same as the Linux root
user, it can (and should) have a different password.
MySQL does not automatically create a database when you create a MySQL user;
users and databases are not rigidly bound.
SQL is free form with respect to whitespace and

NEWLINEs.

Terminology
This section briefly describes some basic terms used when working with a relational
database. See Figure 13-1.
database
row

A structured set of persistent data comprising one or more tables.
An ordered set of columns in a table. Also record or tuple.

column

A set of one type of values, one per row in a table. Certain columns might be designated
as keys. Keys are indexed to speed up access to specific values in the column. Also field
or attribute.

join

Two (or more) rows, each from a different table, that are tied together by means of
the relationships between values in two (or more) columns. For example, two rows,
each from a different table, can be joined based on equal values in two columns.

relational database
management
system (RDBMS)

A database based on the relational model developed by E. F. Codd comprising tables
of data. Codd used the term relations for what SQL calls tables; thus the name of the
database.

Notes

597

Column
Row
name

hired

store

hourly

topsy

2012-11-01

4

1

max

2012-02-14

4

0

zach

2009-03-24

6

0

sam

2008-01-28

6

1

Figure 13-1
SQL

MySQL

A few rows from the people table in the maxdb database

Structured Query Language. An industry-standard language for creating, updating,
and querying relational databases. SQL is not part of the relational model but is often
associated with it.
A software brand or implementation of SQL.

MySQL is an implementation of SQL
tip This chapter is about SQL and MySQL. It explains how to set up a MySQL server and client. It also
shows examples of how to run SQL queries under MySQL.
table

A collection of rows in a relational database. Also relation.

Syntax and Conventions
An SQL program comprises one or more statements, each terminated by a semicolon
(;) or \g. Although keywords in statements are not case sensitive, this book shows
keywords in uppercase letters for clarity. Database and table names are case sensitive;
column names are not.
The following example shows a multiline SQL statement (query) that includes both
the primary interpreter prompt (mysql>) and the secondary interpreter prompt (–>).
SELECT, FROM, and WHERE are keywords. This statement displays the value of
the name column from the table named people in rows where the value in the store
column is 4.
mysql> SELECT
->
->

FROM
WHERE

name
people
store = 4;

598 Chapter 13 The MySQL Database Management System

Comments
You can specify a comment in one of three ways in an SQL program. These techniques
work in SQL command files and when you are working with MySQL interactively.
• A hash sign (#) marks the beginning of a comment that continues to the end
of the line (the NEWLINE character).
# The following line holds an SQL statement and a comment
USE maxdb;
# Use the maxdb database

• A double hyphen (––) marks the beginning of a comment that continues
to the end of the line (the NEWLINE character). The double hyphen must be
followed by whitespace (one or more SPACEs and/or TABs).
-- The following line holds an SQL statement and a comment
USE maxdb;
-- Use the maxdb database

• As in the C programming language, you can surround a (multiline) comment
with /* and */.
/* The line following this multiline
comment holds an SQL statement
and a comment */
USE maxdb;
/* Use the maxdb database */

Data types
When you create a table, you specify the name and data type of each column in the table.
Each data type is designed to hold a certain kind of data. For example, data type CHAR
holds strings of characters while DATE holds dates. The examples in this section use the
following data types, a small sampling of those available under MySQL.
• CHAR(n)—Stores a string of up to n characters where 0 <= n <= 255. You
must enclose strings within single or double quotation marks. When you store
a string in a type CHAR column, MySQL right-pads the string with SPACEs to
make an n-character string. When you retrieve a type CHAR string, MySQL
strips the trailing SPACEs. Occupies n bytes (CHAR has a fixed length).
• VARCHAR(n)—Stores a string of up to n characters where 0 <= n <=
65,535. You must enclose strings within single or double quotation marks.
MySQL does not pad strings stored in type VARCHAR columns. With a
string length of L, occupies L + 1 bytes if 0 <= L <= 255 and L + 2 bytes if
L > 255. (VARCHAR has a variable length.) See page 612 for an example.

VARCHARs might slow large queries
tip MySQL converts VARCHAR columns to CHAR for operations that generate temporary tables
(e.g., sorting, including ORDER BY and GROUP BY). Thus declaring large VARCHAR values
[e.g., VARCHAR(255)] for columns that require frequent sorting will result in very large temporary tables that tend to slow queries.

Installing a MySQL Server and Client

599

• INTEGER—Stores a 4-byte integer.
• DATE—Stores a date. MySQL sets a DATE variable to 0 if you specify an
illegal value. Occupies 3 bytes.
• BOOL—A synonym for TINYINT. A value of 0 evaluates as false and
1–255 evaluate as true. Occupies 1 byte. When you specify data type
BOOL, MySQL changes it to a TINYINT.

More Information
Home page: www.mysql.com
MySQL documentation: dev.mysql.com/doc
Introduction: dev.mysql.com/tech-resources/articles/mysql_intro.html
Command syntax: dev.mysql.com/doc/refman/5.6/en/sql-syntax.html
Data types: dev.mysql.com/doc/refman/5.6/en/data-types.html
Joins: www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
and mysqljoin.com
ODBC: dev.mysql.com/downloads/connector/odbc
Security: www.kitebird.com/articles/ins-sec.html (dated)
Backing up databases: webcheatsheet.com/SQL/mysql_backup_restore.php and
www.thegeekstuff.com/2008/09/backup-and-restore-mysql-database-usingmysqldump

Installing a MySQL Server and Client
This section briefly covers installing the MySQL client and server packages and starting the server running. The steps necessary to set up MySQL differ by distribution.

You must remove anonymous MySQL users
tip When you install the MySQL server, the MySQL database is set up to allow anonymous users to
log in and use MySQL. The examples in this chapter will not work unless you remove these users
from the MySQL database. See “Removing Anonymous Users” on page 602.

Fedora/RHEL (Red Hat Enterprise Linux)
Install the following packages:
• mysql
• mysql-server
Working as a privileged user under Fedora, give the following commands to always
start the MySQL daemon when the system enters multiuser mode and to start the
MySQL daemon immediately:
# systemctl enable mysqld.service
# systemctl start mysqld.service

600 Chapter 13 The MySQL Database Management System

Use these commands if you are running RHEL:
# chkconfig mysqld on
# service mysqld start

Debian/Ubuntu/Mint
Install the following packages:
• mysql-client
• mysql-server
When you install the mysql-server package, the dpkg postinst script asks you to
provide a password for the MySQL user named root. You will use this password
later in this chapter.

openSUSE
Install the following packages:
• mysql-cluster-client
• mysql-cluster
Working as a privileged user, give the following commands to always start the
MySQL daemon when the system enters multiuser mode and to start the MySQL
daemon immediately:
# systemctl enable mysql.service
# systemctl start mysql.service

The first command might display an error message but will work anyway.

OS X
See page 1051 for instructions on how to install MySQL under OS X.

Client Options
This section describes some of the options you can use on the MySQL client command
line. The options preceded by a single hyphen and those preceded by a double hyphen
are equivalent.
––disable-reconnect

Does not attempt to connect to the server again if the connection is dropped. See
––reconnect.
––host=hostname

–h hostname
Specifies the address of the MySQL server as hostname. Without this option
MySQL connects to the server on the local system (127.0.0.1).

Setting Up MySQL 601
––password[=passwd]

–p[passwd]
Specifies the MySQL password as passwd. For improved security, do not specify
the password on the command line. With this option and no password, MySQL
prompts for a password. By default MySQL does not use a password. The short
form of this option does not accept a SPACE between the –p and passwd.
Attempts to connect to the server again if the connection is dropped (default).
Disable this behavior using ––disable-reconnect.

––reconnect
––skip-column-names

Does not display column names in results.
––user=usr –u usr

Specifies the MySQL user as usr. Without this option, usr defaults to the
username of the user running the MySQL command.
––verbose

–v Increases the amount of information MySQL displays. Use this option multiple
times to increase verbosity. This option displays MySQL statements as they are
executed when running from a command file.

Setting Up MySQL
You must remove the anonymous users from the MySQL database before MySQL
will allow you to run commands working as yourself (or as Max if you follow the
examples in this section). You can work as the MySQL user named root without
removing the anonymous users. To run commands as the MySQL user named root
in cases where this user does not have a password, you can either not specify a password or else press RETURN when prompted for a password. In a production
environment, you can improve security of a MySQL database by giving a password
to the MySQL user named root.
Of the following three steps, only the second, “Removing Anonymous Users,” is
required. The first step, “Assigning a Password to the MySQL User Named root,”
makes a database more secure and is a good idea in a production environment.
You can use the third step, “Running the Secure Installation Script,” in place of
the first two steps. This script removes anonymous users, assigns a password to
the MySQL user named root, and takes other actions to make MySQL more
secure.

Assigning a Password to the MySQL User Named root
The following command, run as a nonprivileged user, assigns mysql-password as the
password for the MySQL user named root. If you assigned the MySQL user named
root a password when you installed MySQL, skip this step.
$ mysqladmin -u root password 'mysql-password'

602 Chapter 13 The MySQL Database Management System

Removing Anonymous Users
The following commands, which you can run as a nonprivileged user, remove the
anonymous users from the MySQL database, yielding a more secure system. The –u
option causes MySQL to run as the MySQL user named root. The –p option causes
MySQL to prompt for the password (for the MySQL user named root). In response
to the prompt, enter the password you assigned to the MySQL user named root.
$ mysql -u root -p
Enter password:
...
mysql> DELETE FROM
mysql.user
mysql>
WHERE
user='';
Query OK, 2 rows affected (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye

Running the Secure Installation Script
In place of the two preceding steps, you can run the following command, which allows
you to assign a password to the MySQL user named root, removes anonymous users,
disallows remote MySQL root logins, and removes the test database that ships with
MySQL.
$ mysql_secure_installation

~/.my.cnf: Configures a MySQL Client
You can use the ~/.my.cnf file to set MySQL client options. The following example
shows Max’s .my.cnf file. The [mysql] specifies the MySQL group. The user line is
useful if your Linux and MySQL usernames are different. Setting user to max when
Max’s Linux username is max is not necessary. The password line sets Max’s
MySQL password to mpasswd. With this setup, Max does not have to use –p on
the command line; MySQL logs him in automatically. The database line specifies
the name of the MySQL database you want to work with so you do not need a
USE statement at the beginning of a MySQL program or session. Do not add the
database line to this file until after you create the database (next) or you will not
be able to use MySQL.
$ cat /home/max/.my.cnf
[mysql]
user="max"
password="mpasswd"
database="maxdb"

Because this file can hold a password and other sensitive information, setting permissions
on this file so the user in whose home directory the file resides owns the file and only the
owner can read the file makes the MySQL data more secure.

Creating a Database

603

~/.mysql_history: Stores Your MySQL History
MySQL writes each of the statements it executes to a file named ~/.mysql_history.
Because MySQL can write passwords to this file, setting permissions on this file so
the user in whose home directory the file resides owns the file and only the owner can
read the file makes the MySQL data more secure. If you do not want to store your
MySQL history, delete this file and recreate it as a symbolic link to /dev/null.
$ rm ~/.mysql_history
$ ln -s /dev/null ~/.mysql_history

Creating a Database
If the MySQL username you add is the same as your Linux username, you will not
have to specify a username on the MySQL command line. In the following example,
Max works as the MySQL user named root (–u root). The –p option causes MySQL
to prompt for the password. In response to the Enter password prompt, Max supplies
the password for the MySQL user named root. If the MySQL user named root does
not have a password, press RETURN in response to the prompt.
CREATE DATABASE,
SHOW DATABASES

Max uses a CREATE DATABASE statement to create a database named maxdb and a
SHOW DATABASES statement to display the names of all databases. This command
shows the name of the maxdb database and several system databases.
$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 183
Server version: 5.5.20 MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the current input
statement.
mysql> CREATE DATABASE maxdb;
Query OK, 1 row affected (0.00 sec)
mysql> SHOW DATABASES;
+--------------------+
| Database
|
+--------------------+
| information_schema |
| maxdb
|
| mysql
|
| performance_schema |
| test
|
+--------------------+
5 rows in set (0.00 sec)

If you try to create a database that already exists, MySQL displays an error message.
mysql> CREATE DATABASE maxdb;
ERROR 1007 (HY000): Can't create database 'maxdb'; database exists

604 Chapter 13 The MySQL Database Management System
USE

You must tell MySQL the name of the database you want to work with. If you do
not give MySQL this information, you must prefix the names of tables with the name
of the database. For example, you would need to specify the people table in the
maxdb database as maxdb.people. When you specify the maxdb database in the
~/.my.cnf file (page 602) or with a USE statement, you can refer to the same table as
people. In the following example, Max uses a USE statement to specify maxdb as the
database he is working with:
$ mysql
mysql> USE maxdb;
Database changed

Adding a User
Before starting to work with the database, create a user so you do not have to work
as the MySQL user named root. You must work as the MySQL user named root to
create a MySQL user.
Continuing with his previous MySQL session, Max adds the MySQL user named
max with a password of mpasswd. The GRANT statement gives Max the permissions he needs to work (as the user named max) with the maxdb database. When
using the MySQL interpreter, Query OK indicates that the preceding statement was
syntactically correct. Within an SQL statement, you must surround all character and
date strings with quotation marks.
mysql> GRANT
ALL PRIVILEGES
->
ON
maxdb.* to 'max'
->
IDENTIFIED BY 'mpasswd'
->
WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT
user, password
->
FROM
mysql.user;
+------+-------------------------------------------+
| user | password
|
+------+-------------------------------------------+
| root | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
...
| max | *34432555DD6C778E7CB4A0EE4551425CE3AC0E16 |
+------+-------------------------------------------+
7 rows in set (0.00 sec)
mysql> quit
Bye

In the preceding example, after setting up the new user, Max uses a SELECT statement to query the user table of the mysql database and display the user and password

Examples 605

columns. The password column displays encrypted passwords. Two users now exist:
root and max. Max gives the command quit to exit from the MySQL interpreter.
Working as the MySQL user max, Max can now set up a simple database to keep
track of people. He does not need to use the –u option on the command line because
his Linux username and his MySQL username are the same.

Examples
This section follows Max as he works with MySQL. You must follow the steps
described under “Setting Up MySQL” on page 601 to remove anonymous users and
under “Adding a User” on page 604 to create the maxdb database and add the
MySQL user named max. If you do not follow these steps, you will not be able to
use MySQL as described in this section.

Logging In
You must log in to MySQL before you can use it interactively.
You can specify a MySQL username in the ~/.my.cnf file (page 602) or by using the
––user (–u) option on the command line. If you do not specify a user in one of these
ways, MySQL assumes your MySQL username is the same as your Linux username.
The examples in this section assume Max is logging in to MySQL as max (and has
not specified a username in the ~/.my.cnf file).
If a MySQL account has a password, you must specify that password to log in.
You can specify a password in the ~/.my.cnf file or by using the ––password (–p)
option on the command line. Without specifying his password in his ~/.my.cnf
file, Max would log in on the MySQL interactive interpreter using the following
command:
$ mysql -p
Enter password: mpasswd
...

With his password specified in the ~/.my.cnf file as shown on page 602, Max can log
in without the –p option:
$ mysql
...

Creating a Table
This section assumes Max has specified his password in his ~/.my.cnf file.
CREATE TABLE

Before you can work with a database, you must create a table to hold data. To get
started, Max uses the following CREATE TABLE statement to create a table named

606 Chapter 13 The MySQL Database Management System

people in the maxdb database. This table has four columns. The USE statement specifies the name of the database.
$ mysql
mysql> USE maxdb;
Database changed
mysql> CREATE TABLE
people (
->
name
CHAR(10),
->
hired
DATE,
->
store
INTEGER,
->
hourly BOOL
->
);
Query OK, 0 rows affected (0.04 sec)

SQL is free form with respect to whitespace and NEWLINEs. Max could have written the
preceding statement as follows:
mysql> CREATE TABLE people (name CHAR(10),
-> hired DATE, store INTEGER, hourly BOOL);
SHOW TABLES

After creating the table, Max uses a SHOW TABLES statement to display a list of
tables in the maxdb database.
mysql> SHOW TABLES;
+-----------------+
| Tables_in_maxdb |
+-----------------+
| people
|
+-----------------+
1 row in set (0.00 sec)

DESCRIBE

Next, Max uses a DESCRIBE statement to display a description of the table named
people.
mysql> DESCRIBE people;
+--------+------------+------+-----+---------+-------+
| Field | Type
| Null | Key | Default | Extra |
+--------+------------+------+-----+---------+-------+
| name
| char(10)
| YES |
| NULL
|
|
| hired | date
| YES |
| NULL
|
|
| store | int(11)
| YES |
| NULL
|
|
| hourly | tinyint(1) | YES |
| NULL
|
|
+--------+------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

Figure 13-1 on page 597 shows part of the people table after data has been entered
in it.
ALTER TABLE

Max decides that the hourly column should default to true. He uses an ALTER TABLE
statement to modify the table so he does not have to delete the table and create it again.
He then checks his work using a DESCRIBE statement; the hourly column now
defaults to 1, which evaluates as true.

Examples 607
mysql> ALTER TABLE
people
->
MODIFY hourly BOOL DEFAULT TRUE;
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> DESCRIBE people;
+--------+------------+------+-----+---------+-------+
| Field | Type
| Null | Key | Default | Extra |
+--------+------------+------+-----+---------+-------+
| name
| char(10)
| YES |
| NULL
|
|
| hired | date
| YES |
| NULL
|
|
| store | int(11)
| YES |
| NULL
|
|
| hourly | tinyint(1) | YES |
| 1
|
|
+--------+------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

Adding Data
This section describes several ways to enter information into a database.
INSERT INTO

Next, Max uses an INSERT INTO statement to try to add a row of data to the people
table. Following the first command, MySQL displays an error saying it does not know
about a column named topsy; Max forgot to put quotation marks around the string
topsy so MySQL parsed topsy as the name of a column. Max includes the quotation
marks in the second command.
mysql> INSERT INTO
people
->
VALUES ( topsy, '2012/11/01', 4, FALSE);
ERROR 1054 (42S22): Unknown column 'topsy' in 'field list'
mysql> INSERT INTO
people
->
VALUES ( 'topsy', '2012/11/01', 4, FALSE);
Query OK, 1 row affected (0.00 sec)

The preceding INSERT INTO statement did not specify which columns the values
were to be inserted into; it specified values for all four columns. The next INSERT
INTO statement specifies values to be added to the name and store columns; MySQL
sets the other columns in the new rows to their default values.
mysql> INSERT INTO
people (name, store)
->
VALUES ( 'percy', 2 ),
->
( 'bailey', 2 );
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> QUIT
Bye
LOAD DATA
LOCAL INFILE

The preceding examples showed how to work with MySQL interactively. The next
example shows how to run MySQL with statements in a command file. At the beginning of the preceding interactive session, Max gave the command USE maxdb so
MySQL knew which database the following commands worked with. When you

608 Chapter 13 The MySQL Database Management System

specify commands in a file, you must tell MySQL which database the commands
work with; each file of commands must start with a USE statement or MySQL will
return an error. Alternatively, you can specify the name of the database in your
~/.my.cnf file (page 602).
The following example adds three rows to the people table from a text file named
addfile. In addfile, each line holds data for a single row, with a single TAB separating
each column from the next. The \N specifies a null value. The file is not terminated
with a NEWLINE; if it were, MySQL would insert a row of NULLs. Unlike interactive
mode, when you run MySQL from a command file, it does not display a message if
everything worked properly. You can use the –v (verbose) option to cause MySQL to
display information about the commands it is executing. Multiple –v options display
more information.
To run SQL commands from a file, enter the command mysql and redirect input to
come from the command file. The following mysql command uses two –v options
so MySQL displays statements as it executes them and results of statements after
it executes them; input is redirected to come from load.
$ cat load
USE maxdb;
LOAD DATA LOCAL INFILE '/home/max/addfile'
INTO TABLE people;
$ cat addfile
max
\N
4
zach
09-03-24
sam
2008-01-28

0
6
6

0
1

$ mysql -vv < load
-------------LOAD DATA LOCAL INFILE '/home/max/addfile'
INTO TABLE people
-------------Query OK, 3 rows affected
Records: 3 Deleted: 0 Skipped: 0

Warnings: 0

Bye

Retrieving Data
SELECT

The SELECT statement queries the database and displays the data the query returns.
Within a SELECT statement, SQL interprets an asterisk (*) to mean all columns in
a table. The following interactive query displays all columns of all rows of the people
table in the maxdb database.
$ mysql
mysql> USE maxdb;
mysql> SELECT
->
FROM

*
people;

Examples 609
+--------+------------+-------+--------+
| name
| hired
| store | hourly |
+--------+------------+-------+--------+
| topsy | 2012-11-01 |
4 |
0 |
| percy | NULL
|
2 |
1 |
| bailey | NULL
|
2 |
1 |
| max
| NULL
|
4 |
0 |
| zach
| 2009-03-24 |
6 |
0 |
| sam
| 2008-01-28 |
6 |
1 |
+--------+------------+-------+--------+
6 rows in set (0.00 sec)

When you run queries from a command file, MySQL does not line up in columns the
data it outputs, but rather simply separates each column from the next using a TAB.
This minimal formatting allows you to redirect and format the output as you please.
Typically, if you want to redirect the data, you will not specify any –v options so
MySQL does not display any messages. The next command includes an ORDER BY
clause in the SELECT statement to sort the output. It uses the sel2 command file and
sends the output through tail to strip the header.
$ cat sel2
use maxdb;
*
SELECT
FROM
people
ORDER BY name;
$ mysql
bailey
max
percy
sam
topsy
zach
WHERE

< sel2 | tail -n +2
NULL
2
1
NULL
4
0
NULL
2
1
2008-01-28
6
2012-11-01
4
2009-03-24
6

1
0
0

The next example shows a SELECT statement with a WHERE clause. A WHERE
clause causes SELECT to return only those rows that match specified criteria. In
this example, the WHERE clause causes SELECT to return only rows in which the
value in the store column is 4. In addition, this SELECT statement returns a single
column (name). The result is a list of the names of the people who work in store
number 4.
mysql> SELECT
name
->
FROM
people
->
WHERE
store = 4;
+-------+
| name |
+-------+
| topsy |
| max
|
+-------+
2 rows in set (0.00 sec)

610 Chapter 13 The MySQL Database Management System

The next example shows the use of a relational operator in a WHERE clause. Here
the SELECT statement returns the names of people who work in stores with numbers
greater than 2.
mysql> SELECT
name
->
FROM
people
->
WHERE
store > 2;
+-------+
| name |
+-------+
| topsy |
| max
|
| zach |
| sam
|
+-------+
4 rows in set (0.00 sec)
LIKE

You can also use the LIKE operator in a WHERE clause. LIKE causes SELECT to
return rows in which a column contains a specified string. Within the string that follows LIKE, a percent sign (%) matches any string of zero or more characters while
an underscore (_) matches any single character. The following query returns rows in
which the name column contains an m.
mysql> SELECT
name,
->
store
->
FROM
people
->
WHERE
name LIKE '%m%';
+------+-------+
| name | store |
+------+-------+
| max |
4 |
| sam |
6 |
+------+-------+
2 rows in set (0.00 sec)

Backing Up a Database
mysqldump

The mysqldump utility can back up and restore a database. Backing up a database generates a file of SQL statements that create the tables and load the data. You can use
this file to restore the database from scratch. The next example shows how Max can
back up the maxdb database to a file named maxdb.bkup.sql.
$ mysqldump -u max -p maxdb > maxdb.bkup.sql
Enter password:

Be careful: The following restore procedure will overwrite an existing database.
Before you can restore a database, you must create the database as explained on
page 603. After creating the maxdb database, Max runs MySQL with input coming
from the file that mysqldump created. After giving the following command, the maxdb
database is in the same state it was in when Max backed it up.

Examples 611
$ mysql -u max -p maxdb < maxdb.bkup.sql
Enter password:

Modifying Data
DELETE FROM

The DELETE FROM statement removes one or more rows from a table. The next
example deletes the rows from the people table where the name column holds bailey
or percy.
mysql> DELETE FROM
people
->
WHERE
name='bailey'
->
OR name='percy';
Query OK, 2 rows affected (0.00 sec)

UPDATE

The UPDATE statement changes data in a table. The following example sets the
hourly column to TRUE (1) in the rows where the name column contains sam or
topsy. The MySQL messages show the query matched two rows (sam and topsy) but
changed only one row. It changed hourly in the row with topsy; in the row with sam,
hourly was already set to TRUE.
mysql> UPDATE
people
->
SET
hourly = TRUE
->
WHERE
name = 'sam' OR
->
name = 'topsy';
Query OK, 1 row affected (0.00 sec)
Rows matched: 2 Changed: 1 Warnings: 0

CURDATE()

The CURDATE() function returns today’s date. The next example sets the hired
column to today’s date in the row where name contains max.
mysql> UPDATE
people
->
SET
hired = CURDATE()
->
WHERE name = 'max';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

The next query shows the results of the preceding DELETE FROM and UPDATE
statements.
mysql> select * from people;
+-------+------------+-------+--------+
| name | hired
| store | hourly |
+-------+------------+-------+--------+
| topsy | 2012-11-01 |
4 |
1 |
| max
| 2012-02-12 |
4 |
0 |
| zach | 2009-03-24 |
6 |
0 |
| sam
| 2008-01-28 |
6 |
1 |
+-------+------------+-------+--------+
4 rows in set (0.00 sec)

612 Chapter 13 The MySQL Database Management System

Creating a Second Table
Next, the setup.stores SQL command file creates, populates, and displays the stores
table in the maxdb database.
$ mysql -vv < setup.stores
-------------CREATE TABLE
stores (
name VARCHAR(20),
number INTEGER,
city VARCHAR(20)
)
-------------Query OK, 0 rows affected
-------------INSERT INTO
VALUES

stores
( 'headquarters', 4, 'new york' ),
( 'midwest', 5, 'chicago' ),
( 'west coast', 6, 'san francisco' )

-------------Query OK, 3 rows affected
Records: 3 Duplicates: 0
-------------SELECT
FROM
-------------name
number
headquarters
midwest 5
west coast
3 rows in set
VARCHAR

Warnings: 0

*

stores

city
4
new york
chicago
6
san francisco

The stores table has columns named name, number, and city. The data type for the
name and city columns is VARCHAR. A VARCHAR column stores a variable-length
string without the padding required by a CHAR column. In this example, the name
and city columns can store up to 20 characters each.
The size of a VARCHAR is the sum of the length prefix plus the number of characters
in the string being stored. The length prefix for a column declared to hold fewer than
255 characters is 1 byte; larger columns have a 2-byte length prefix. In contrast, a
CHAR column always occupies the number of characters it was declared to be.
The number in parentheses in a VARCHAR declaration specifies the maximum number of characters any row in the column can hold. In the example, the city column is
declared to be VARCHAR(20) so a row in that column cannot hold more than 20
characters. The row with new york in the city column takes up 9 bytes of storage
(8 + 1). If the column had been declared to be CHAR(20), each row would occupy
20 bytes, regardless of the length of the string it held.

Examples 613

Joins
A join operation combines rows from/across two or more related tables in a database
according to some relationship between these tables. (Refer to the join utility
[page 843] for information on joining text files from the command line.) Take the
maxdb database, for example: The people table has columns that hold information
about employee name, hire date, store the employee works at, and whether the
employee is hourly (as opposed to salaried); the stores table has columns that hold
information about the store name, number, and city it is located in. With this setup,
you cannot determine which city an employee works in by querying either the people
or stores table alone; you must join the tables to get this information.
You join tables using a SELECT statement that specifies a column in each of the
tables and the relationship between those columns. The query that would satisfy
the example specifies the store column in the people table and the number column
in the stores table; the relationship between those columns is equality because
both of those columns hold the store number. When you equate like values in
these columns, you can display stores information for each name in the people
table (or people information for each [store] number in the store table).
The following query selects rows from the people and stores tables and joins the rows
where the value of store in the people table equals the value of number in the stores
table (store = number). This query uses an implicit join; it does not use the JOIN
keyword. The examples following this one use JOIN. Using this keyword makes no
difference in the results of the query, but you must use JOIN in certain types of
queries.
mysql> SELECT
*
->
FROM
people, stores
->
WHERE
store = number;
+-------+------------+-------+--------+--------------+--------+---------------+
| name | hired
| store | hourly | name
| number | city
|
+-------+------------+-------+--------+--------------+--------+---------------+
| topsy | 2012-11-01 |
4 |
1 | headquarters |
4 | new york
|
| max
| 2012-02-12 |
4 |
0 | headquarters |
4 | new york
|
| zach | 2009-03-24 |
6 |
0 | west coast
|
6 | san francisco |
| sam
| 2008-01-28 |
6 |
1 | west coast
|
6 | san francisco |
+-------+------------+-------+--------+--------------+--------+---------------+
4 rows in set (0.00 sec)
Inner join

The preceding query demonstrates an inner join. If you like, you can use the keywords INNER JOIN in place of JOIN in this type of query. In each returned row, the
value in people.store equals the value in stores.number. However, the stores table has
a row with the value of 5 in the number column but no row exists in the people table
with this value in the store column. As a result, these queries do not return a row with
people.store and stores.number equal to 5.

Table names

When working with a relational database, you can refer to a column in a specific
table as table_name.column_name (e.g., stores.name in the example database). While
you are working with a single table, a column name is sufficient to uniquely identify

614 Chapter 13 The MySQL Database Management System

the data you want to work with. When you work with several tables, you might need
to specify both a table name and a column name to uniquely identify certain data.
Specifically, if two columns have the same name in two tables that are being joined,
you must specify both the table and the column to disambiguate the columns. The
following query demonstrates the problem.
mysql> SELECT
*
->
FROM
people, stores
->
WHERE
name = 'max';
ERROR 1052 (23000): Column 'name' in where clause is ambiguous

In the preceding query, the column name is ambiguous: A column named name exists
in both the people and stores tables. MySQL cannot determine which table to use in
the WHERE clause. The next query solves this problem. It uses people.name in the
WHERE clause to specify that MySQL is to use the name column from the people
table.
mysql> SELECT
*
->
FROM
people, stores
->
WHERE
people.name = 'max';
+------+------------+-------+--------+--------------+--------+----------+
| name | hired
| store | hourly | name
| number | city
|
+------+------------+-------+--------+--------------+--------+----------+
| max | 2012-05-01 |
4 |
0 | headquarters |
4 | new york |
+------+------------+-------+--------+--------------+--------+----------+
1 row in set (0.01 sec)

Even though it might not be necessary, it can make code and comments clearer to
specify both a table and a column as in the following examples.
Table aliases

An alias for a table is another, usually shorter, name for the table. Aliases can make
a SELECT statement easier to read. You declare table aliases in the FROM clause of
a query. For example, the following clause declares p to be an alias for the people
table and s to be an alias for the stores table.
FROM

people p JOIN stores s

With these aliases in place, you can refer to the people table as p and the stores table
as s in the rest of the query.
The next example rewrites an earlier query using aliases and the JOIN keyword. In
place of a comma between the table (and alias) names, this syntax uses the keyword
JOIN. In place of WHERE it uses ON. Specifying p.store and s.number makes it clear
that the store column of the people table (p.store) is being joined with the number
column of the stores table (s.number). This query returns the same results as the preceding one.
*

SELECT
FROM
ON

people p JOIN stores s
p.store = s.number;

Examples 615
Outer join

An outer join can return rows in which the common column has a value that does
not exist in both tables. You can specify a LEFT OUTER JOIN (or just LEFT JOIN)
or a RIGHT OUTER JOIN (or just RIGHT JOIN). Left and right refer to the tables
specified on the left and right of the keyword JOIN, respectively.
The next example demonstrates a right outer join. The outer table is stores, which
has a value in the number column that does not have a match in the store column of
the people table. SQL inserts null values in the columns from people in the joined row
that does not have a match (stores.number = 5).

mysql> SELECT
*
->
FROM
people p RIGHT JOIN stores s
->
ON
p.store = s.number;
+-------+------------+-------+--------+--------------+--------+---------------+
| name | hired
| store | hourly | name
| number | city
|
+-------+------------+-------+--------+--------------+--------+---------------+
| topsy | 2012-11-01 |
4 |
1 | headquarters |
4 | new york
|
| max
| 2012-02-12 |
4 |
0 | headquarters |
4 | new york
|
| NULL | NULL
| NULL |
NULL | midwest
|
5 | chicago
|
| zach | 2009-03-24 |
6 |
0 | west coast
|
6 | san francisco |
| sam
| 2008-01-28 |
6 |
1 | west coast
|
6 | san francisco |
+-------+------------+-------+--------+--------------+--------+---------------+
5 rows in set (0.00 sec)

The following query performs an inner join to list the city that each person works in.
A right outer join would show the cities that have no one working in them. A left
outer join would show people who are not assigned to a store.
mysql> SELECT
p.name,
->
city
->
FROM
people p JOIN stores s
->
ON
p.store = s.number;
+-------+---------------+
| name | city
|
+-------+---------------+
| topsy | new york
|
| max
| new york
|
| zach | san francisco |
| sam
| san francisco |
+-------+---------------+
4 rows in set (0.00 sec)
Subqueries

A subquery is a SELECT statement within the WHERE clause of another SELECT
statement. The subquery returns a value or values that restrict the main query.
In the next example, the subquery returns the value of the number column from the
stores table where the city column has a value of new york. The main query returns
the value(s) from the name column of the people table where the value of the store
column is equal to the value returned by the subquery. The result is a query that
returns the names of people who work in New York.

616 Chapter 13 The MySQL Database Management System
$ cat sel4
use maxdb;
SELECT
FROM
WHERE
(SELECT

name
people
store =
FROM
WHERE

number
stores
city = 'new york'

);
$ mysql < sel4
name
topsy
max

The final example is a bash script that queries the MySQL database created in this
chapter. If you have not run the example commands it will not work. First the
script checks that the user knows the database password. Then it displays a list
of employees and asks which employee the user is interested in. It queries the
database for information about that employee and displays the name of the store
the employee works in and the city the store is located in.
$ cat employee_info
#! /bin/bash
#
# Script to display employee information
#
# Make sure user is authorized: Get database password
#
echo -n "Enter password for maxdb database: "
stty -echo
read pw
stty echo
echo
echo
# Check for valid password
#
mysql -u max -p$pw maxdb > /dev/null 2>&1 < /dev/null
if [ $? -ne 0 ]
then
echo "Bad password."
exit 1
fi
# Display list of employees for user to select from
#
echo "The following people are employed by the company:"
mysql -u max -p$pw --skip-column-names maxdb <<+
SELECT name FROM people ORDER BY name;
+

Exercises 617
echo
echo -n "Which employee would you like information about? "
read emp
# Query for store name
#
storename=$(mysql -u max -p$pw --skip-column-names maxdb <<+
SELECT stores.name FROM people, stores WHERE store = number AND people.name = "$emp";
+
)
# if null then user entered bad employee name
#
if [ "$storename" = "" ]
then
echo "Not a valid name."
exit 1
fi
# Query for city name
#
storecity=$(mysql -u max -p$pw --skip-column-names maxdb <<+
SELECT city FROM people, stores WHERE store = number AND people.name = "$emp";
+
)
# Display report
#
echo
echo $emp works at the $storename store in $storecity

Chapter Summary
System administrators are frequently called upon to set up and run MySQL databases. MySQL (My Structured Query Language) is the world’s most popular
open-source relational database management system (RDBMS). It is extremely
fast and is used by some of the most frequently visited Web sites on the Internet.
Many programming languages provide interfaces and bindings to MySQL,
including C, PHP, Python, and Perl. You can also call MySQL directly from a shell
script or use it in a pipeline on the command line. MySQL is a core component
of the popular LAMP (Linux, Apache, MySQL, PHP/Perl/Python) open-source
enterprise software stack.

Exercises
1. List two ways you can specify the MySQL password for a user to access
MySQL.

618 Chapter 13 The MySQL Database Management System

2. Using MySQL interactively, create a database named dbsam that the user
named sam can modify and grant privileges on. Set up Sam’s password to
be porcupine. The MySQL user named root has the password five22four.
3. What is a table? A row? A column?
4. Which commands would you use to set up a table in dbsam (created in exercise 2) named shoplist with the following columns of the specified types: day
[DATE], store [CHAR(20)], lettuce [SMALLINT], soupkind [CHAR(20)],
soupnum [INTEGER], and misc [VARCHAR(40)]?
5. Where can you find a list of MySQL commands you have previously run?
6. List two ways you can specify the name of a specific MySQL database to
work with.
7. What does a join do? When is a join useful?
8. Assume you are working with the people table in the maxdb database
described in this chapter. Write a query that lists the names of all the people
and their hire dates sorted by their names.
9. Assume you are working with the people and stores tables in the maxdb
database described in this chapter. Write a query that sets up the aliases q
and n for the people and stores tables, respectively. Have the query join the
tables using the store column in the people table with the number column
in the stores table. From left to right, have the query display the name of
the city the person works in, the name of the person, and the person’s hire
date. Sort the output by city name.

14
The AWK Pattern
Processing Language

Chapter14
14

In This Chapter

Objectives

Syntax . . . . . . . . . . . . . . . . . . . . . . 620

After reading this chapter you should be able to:

Options . . . . . . . . . . . . . . . . . . . . . 621

 Run a gawk program from the command line and from
a file

Patterns . . . . . . . . . . . . . . . . . . . . . 622

 Select lines from a file using gawk

Actions. . . . . . . . . . . . . . . . . . . . . . 623

 Write a report using gawk

Variables . . . . . . . . . . . . . . . . . . . . 623

 Summarize information in a file using gawk

Functions . . . . . . . . . . . . . . . . . . . . 624

 Write an interactive shell script that calls gawk

Arguments . . . . . . . . . . . . . . . . . . . 620

Associative Arrays. . . . . . . . . . . . . 626
Control Structures . . . . . . . . . . . . . 627
Examples . . . . . . . . . . . . . . . . . . . . 629
getline: Controlling Input . . . . . . . 646
Coprocess: Two-Way I/O. . . . . . . . 648
Getting Input from a Network . . . . 650

620 Chapter 14 The AWK Pattern Processing Language

AWK is a pattern-scanning and processing language that searches one or more files for
records (usually lines) that match specified patterns. It processes lines by performing
actions, such as writing the record to standard output or incrementing a counter, each
time it finds a match. Unlike procedural languages, AWK is data driven: You describe
the data you want to work with and tell AWK what to do with the data once it finds it.
You can use AWK to generate reports or filter text. It works equally well with numbers and text; when you mix the two, AWK usually comes up with the right answer.
The authors of AWK (Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan)
designed the language to be easy to use. To achieve this end they sacrificed execution
speed in the original implementation.
AWK takes many of its constructs from the C programming language. It includes the
following features:
• A flexible format
• Conditional execution
• Looping statements
• Numeric variables
• String variables
• Regular expressions
• Relational expressions
• C’s printf
• Coprocess execution (gawk only)
• Network data exchange (gawk only)

Syntax
A gawk command line has the following syntax:
gawk [options] [program] [file-list]
gawk [options] –f program-file [file-list]
The gawk utility takes its input from files you specify on the command line or from
standard input. An advanced command, getline, gives you more choices about where
input comes from and how gawk reads it (page 646). Using a coprocess, gawk can
interact with another program or exchange data over a network (page 648; not available under awk or mawk). Output from gawk goes to standard output.

Arguments
In the preceding syntax, program is a gawk program that you include on the command line. The program-file is the name of the file that holds a gawk program. Putting
the program on the command line allows you to write short gawk programs without

Options 621

having to create a separate program-file. To prevent the shell from interpreting the
gawk commands as shell commands, enclose the program within single quotation
marks. Putting a long or complex program in a file can reduce errors and retyping.
The file-list contains the pathnames of the ordinary files that gawk processes. These
files are the input files. When you do not specify a file-list, gawk takes input from
standard input or as specified by getline (page 646) or a coprocess (page 648).

AWK has many implementations
tip The AWK language was originally implemented under UNIX as the awk utility. Most Linux distributions provide gawk (the GNU implementation of awk) or mawk (a faster, stripped-down
version of awk). Mac OS X provides awk. This chapter describes gawk. All the examples in this
chapter work under awk and mawk except as noted; the exceptions make use of coprocesses
(page 648). You can easily install gawk on most Linux distributions. See page 1051 if you want
to install gawk on Mac OS X. For a complete list of gawk extensions, see GNU EXTENSIONS in
the gawk man page or see the gawk info page.

Options
Options preceded by a double hyphen (––) work under gawk only. They are not
available under awk and mawk.
––field-separator fs

–F fs
Uses fs as the value of the input field separator (FS variable; page 624).
––file program-file

–f program-file
Reads the gawk program from the file named program-file instead of the command
line. You can specify this option more than once on a command line. See page 633
for examples.
––help –W help

Summarizes how to use gawk (gawk only).
––lint –W lint

Warns about gawk constructs that might not be correct or portable (gawk only).
––posix –W posix

Runs a POSIX-compliant version of gawk. This option introduces some restrictions; see the gawk man page for details (gawk only).
––traditional –W traditional

Ignores the new GNU features in a gawk program, making the program conform
to UNIX awk (gawk only).
––assign var=value

–v var=value
Assigns value to the variable var. The assignment takes place prior to execution
of the gawk program and is available within the BEGIN pattern (page 623). You
can specify this option more than once on a command line.

622 Chapter 14 The AWK Pattern Processing Language

Notes
See the tip on the previous page for information on AWK implementations.
For convenience many Linux systems provide a link from /bin/awk to /bin/gawk or
/bin/mawk. As a result you can run the program using either name.

Language Basics
A gawk program (from program on the command line or from program-file) consists
of one or more lines containing a pattern and/or action in the following syntax:
pattern { action }
The pattern selects lines from the input. The gawk utility performs the action on all
lines that the pattern selects. The braces surrounding the action enable gawk to differentiate it from the pattern. If a program line does not contain a pattern, gawk
selects all lines in the input. If a program line does not contain an action, gawk copies
the selected lines to standard output.
To start, gawk compares the first line of input (from the file-list or standard input)
with each pattern in the program. If a pattern selects the line (if there is a match),
gawk takes the action associated with the pattern. If the line is not selected, gawk does
not take the action. When gawk has completed its comparisons for the first line of
input, it repeats the process for the next line of input. It continues this process of comparing subsequent lines of input until it has read all of the input.
If several patterns select the same line, gawk takes the actions associated with each
of the patterns in the order in which they appear in the program. It is possible for
gawk to send a single line from the input to standard output more than once.

Patterns
~ and !~

You can use a regular expression (Appendix A), enclosed within slashes, as a pattern.
The ~ operator tests whether a field or variable matches a regular expression (examples on page 631). The !~ operator tests for no match. You can perform both numeric
and string comparisons using the relational operators listed in Table 14-1. You can
combine any of the patterns using the Boolean operators || (OR) or && (AND).

Table 14-1

Relational operators

Relational operator Meaning
<

Less than

<=

Less than or equal to

==

Equal to

Language Basics 623

Table 14-1

Relational operators (continued)

Relational operator Meaning
!=

Not equal to

>=

Greater than or equal to

>

Greater than

BEGIN and END

Two unique patterns, BEGIN and END, execute commands before gawk starts processing the input and after it finishes processing the input. The gawk utility executes
the actions associated with the BEGIN pattern before, and with the END pattern
after, it processes all the input. See pages 633 and 634 for examples.

, (comma)

The comma is the range operator. If you separate two patterns with a comma on a single gawk program line, gawk selects a range of lines, beginning with the first line that
matches the first pattern. The last line selected by gawk is the next subsequent line that
matches the second pattern. If no line matches the second pattern, gawk selects every
line through the end of the input. After gawk finds the second pattern, it begins the
process again by looking for the first pattern again. See page 632 for examples.

Actions
The action portion of a gawk command causes gawk to take that action when it
matches a pattern. When you do not specify an action, gawk performs the default
action, which is the print command (explicitly represented as {print}). This action
copies the record (normally a line; see “Record separators” on the next page) from
the input to standard output.
When you follow a print command with arguments, gawk displays only the arguments you specify. These arguments can be variables or string constants. You can
send the output from a print command to a file (use > within the gawk program;
page 637), append it to a file (>>), or send it through a pipeline to the input of another
program (|). A coprocess (|&) is a two-way pipe that exchanges data with a program
running in the background (available under gawk only; page 648).
Unless you separate items in a print command with commas, gawk catenates them.
Commas cause gawk to separate the items with the output field separator (OFS, normally a SPACE; page 624).
You can include several actions on one line by separating them with semicolons.

Comments
The gawk utility disregards anything on a program line following a pound sign (#).
You can document a gawk program by preceding comments with this symbol.

Variables
Although you do not need to declare gawk variables prior to their use, you can assign
initial values to them if you like. Unassigned numeric variables are initialized to 0;

624 Chapter 14 The AWK Pattern Processing Language

string variables are initialized to the null string. In addition to supporting user variables, gawk maintains program variables. You can use both user and program
variables in the pattern and action portions of a gawk program. Table 14-2 lists a few
program variables.

Table 14-2

Variables

Variable

Meaning

$0

The current record (as a single variable)

$1–$n

Fields in the current record

FILENAME

Name of the current input file (null for standard input)

FS

Input field separator (default: SPACE or TAB; page 638)

NF

Number of fields in the current record (page 642)

NR

Record number of the current record (page 634)

OFS

Output field separator (default: SPACE; page 635)

ORS

Output record separator (default: NEWLINE; page 642)

RS

Input record separator (default: NEWLINE)

In addition to initializing variables within a program, you can use the ––assign (–v)
option to initialize variables on the command line. This feature is useful when the
value of a variable changes from one run of gawk to the next.
Record separators

By default the input and output record separators are NEWLINE characters. Thus gawk
takes each line of input to be a separate record and appends a NEWLINE to the end of
each output record. By default the input field separators are SPACE s and TABs; the
default output field separator is a SPACE. You can change the value of any of the separators at any time by assigning a new value to its associated variable either from
within the program or from the command line by using the ––assign (–v) option.

Functions
Table 14-3 lists a few of the functions gawk provides for manipulating numbers and strings.

Table 14-3

Functions

Function

Meaning

length(str )

Returns the number of characters in str; without an argument, returns the
number of characters in the current record (page 633)

int(num)

Returns the integer portion of num

index(str1,str2 )

Returns the index of str2 in str1 or 0 if str2 is not present

split(str,arr,del )

Places elements of str, delimited by del, in the array arr [1]...arr [n ]; returns the
number of elements in the array (page 644)

Language Basics 625

Table 14-3

Functions (continued)

sprintf(fmt,args)

Formats args according to fmt and returns the formatted string; mimics the C
programming language function of the same name; see also printf on
page 626

substr(str,pos,len) Returns the substring of str that begins at pos and is len characters long
tolower(str )

Returns a copy of str in which all uppercase letters are replaced with their
lowercase counterparts

toupper(str )

Returns a copy of str in which all lowercase letters are replaced with their
uppercase counterparts

Arithmetic Operators
The gawk arithmetic operators listed in Table 14-4 are from the C programming language.

Table 14-4

Arithmetic operators

Operator

Meaning

**

Raises the expression preceding the operator to the power of the expression
following it

*

Multiplies the expression preceding the operator by the expression following it

/

Divides the expression preceding the operator by the expression following it

%

Takes the remainder after dividing the expression preceding the operator by the
expression following it

+

Adds the expression preceding the operator to the expression following it

–

Subtracts the expression following the operator from the expression preceding
it

=

Assigns the value of the expression following the operator to the variable
preceding it

++

Increments the variable preceding the operator

––

Decrements the variable preceding the operator

+=

Adds the expression following the operator to the variable preceding it and
assigns the result to the variable preceding the operator

–=

Subtracts the expression following the operator from the variable preceding it
and assigns the result to the variable preceding the operator

*=

Multiplies the variable preceding the operator by the expression following it
and assigns the result to the variable preceding the operator

/=

Divides the variable preceding the operator by the expression following it and
assigns the result to the variable preceding the operator

%=

Assigns the remainder, after dividing the variable preceding the operator by the
expression following it, to the variable preceding the operator

626 Chapter 14 The AWK Pattern Processing Language

Associative Arrays
The associative array is one of gawk’s most powerful features. These arrays use strings
as indexes. Using an associative array, you can mimic a traditional array by using
numeric strings as indexes. In Perl, an associative array is called a hash (page 532).
You assign a value to an element of an associative array using the following syntax:
array[string] = value
where array is the name of the array, string is the index of the element of the array
you are assigning a value to, and value is the value you are assigning to that element.
Using the following syntax, you can use a for structure with an associative array:
for (elem in array) action
where elem is a variable that takes on the value of each element of the array as the
for structure loops through them, array is the name of the array, and action is the
action that gawk takes for each element in the array. You can use the elem variable
in this action.
See page 639 for example programs that use associative arrays.

printf
You can use the printf command in place of print to control the format of the output
gawk generates. See the printf utility on page 917 for more information on using printf.
(The gawk printf requires the commas shown in the following syntax whereas the
printf utility does not allow them.) A printf command has the following syntax:
printf "control-string", arg1, arg2, ..., argn
The control-string determines how printf formats arg1, arg2, ..., argn. These arguments can be variables or other expressions. Within the control-string you can use
\n to indicate a NEWLINE and \t to indicate a TAB. The control-string contains conversion
specifications, one for each argument. A conversion specification has the following
syntax:
%[–][x[.y]]conv
where – causes printf to left-justify the argument, x is the minimum field width, and
.y is the number of places to the right of a decimal point in a number. The conv indicates the type of numeric conversion and can be selected from the letters in
Table 14-5. See page 636 for example programs that use printf.

Table 14-5

Numeric conversion

conv

Type of conversion

d

Decimal

e

Exponential notation

f

Floating-point number

Language Basics 627

Table 14-5

Numeric conversion (continued)

conv

Type of conversion

g

Use f or e, whichever is shorter

o

Unsigned octal

s

String of characters

x

Unsigned hexadecimal

Control Structures
Control (flow) statements alter the order of execution of commands within a gawk program. This section details the if...else, while, and for control structures. In addition, the
break and continue statements work in conjunction with the control structures to alter
the order of execution of commands. See page 418 for more information on control
structures. You do not need to use braces around commands when you specify a single,
simple command.

if...else
The if...else control structure tests the status returned by the condition and transfers
control based on this status. The syntax of an if...else structure is shown below. The
else part is optional.
if (condition)
{commands}
[else
{commands}]
The simple if statement shown here does not use braces:
if ($5 <= 5000) print $0

Next is a gawk program that uses a simple if...else structure. Again, there are no braces.
$ cat if1
BEGIN
{
nam="sam"
if (nam == "max")
print "nam is max"
else
print "nam is not max, it is", nam
}
$ gawk -f if1
nam is not max, it is sam

while
The while structure loops through and executes the commands as long as the condition
is true. The syntax of a while structure is
while (condition)
{commands}

628 Chapter 14 The AWK Pattern Processing Language

The next gawk program uses a simple while structure to display powers of 2. This
example uses braces because the while loop contains more than one statement. This
program does not accept input; all processing takes place when gawk executes the
statements associated with the BEGIN pattern.
$ cat while1
BEGIN{
n = 1
while (n <= 5)
{
print "2^" n, 2**n
n++
}
}
$ gawk -f while1
1^2 2
2^2 4
3^2 8
4^2 16
5^2 32

for
The syntax of a for control structure is
for (init; condition; increment)
{commands}
A for structure starts by executing the init statement, which usually sets a counter to
0 or 1. It then loops through the commands as long as the condition remains true.
After each loop it executes the increment statement. The for1 gawk program does the
same thing as the preceding while1 program except that it uses a for statement, which
makes the program simpler:
$ cat for1
BEGIN
{
for (n=1; n <= 5; n++)
print "2^" n, 2**n
}
$ gawk -f for1
1^2 2
2^2 4
3^2 8
4^2 16
5^2 32

The gawk utility supports an alternative for syntax for working with associative
arrays:
for (var in array)
{commands}

Examples 629

This for structure loops through elements of the associative array named array,
assigning the value of the index of each element of array to var each time through
the loop. The following line of code (from the program on page 639) demonstrates
a for structure:
END

{for (name in manuf) print name, manuf[name]}

break
The break statement transfers control out of a for or while loop, terminating execution
of the innermost loop it appears in.

continue
The continue statement transfers control to the end of a for or while loop, causing
execution of the innermost loop it appears in to continue with the next iteration.

Examples
cars data file

Many of the examples in this section work with the cars data file. From left to right,
the columns in the file contain each car’s make, model, year of manufacture, mileage
in thousands of miles, and price. All whitespace in this file is composed of single TAB s
(the file does not contain any SPACEs).
$ cat cars
plym
fury
chevy
malibu
ford
mustang
volvo
s80
ford
thundbd
chevy
malibu
bmw
325i
honda
accord
ford
taurus
toyota rav4
chevy
impala
ford
explor

Missing pattern

1970
1999
1965
1998
2003
2000
1985
2001
2004
2002
1985
2003

73
60
45
102
15
50
115
30
10
180
85
25

2500
3000
10000
9850
10500
3500
450
6000
17000
750
1550
9500

A simple gawk program is
{ print }

This program consists of one program line that is an action. Because the pattern is
missing, gawk selects all lines of input. When used without any arguments the print
command displays each selected line in its entirety. This program copies the input to
standard output.
$ gawk '{ print
plym
fury
chevy
malibu
ford
mustang
volvo
s80
...

}' cars
1970
1999
1965
1998

73
60
45
102

2500
3000
10000
9850

630 Chapter 14 The AWK Pattern Processing Language
Missing action

The next program has a pattern but no explicit action. The slashes indicate that
chevy is a regular expression.
/chevy/

In this case gawk selects from the input just those lines that contain the string chevy.
When you do not specify an action, gawk assumes the action is print. The following
example copies to standard output all lines from the input that contain the string
chevy:
$ gawk '/chevy/' cars
chevy
malibu 1999
chevy
malibu 2000
chevy
impala 1985

60
50
85

3000
3500
1550

Single quotation
marks

Although neither gawk nor shell syntax requires single quotation marks on the command line, it is still a good idea to use them because they can prevent problems. If
the gawk program you create on the command line includes SPACE s or characters that
are special to the shell, you must quote them. Always enclosing the program in single
quotation marks is the easiest way to make sure you have quoted any characters that
need to be quoted.

Fields

The next example selects all lines from the file (it has no pattern). The braces
enclose the action; you must always use braces to delimit the action so gawk can
distinguish it from the pattern. This example displays the third field ($3), a SPACE (the
output field separator, indicated by the comma), and the first field ($1) of each
selected line:
$ gawk '{print $3, $1}' cars
1970 plym
1999 chevy
1965 ford
1998 volvo
...

The next example, which includes both a pattern and an action, selects all lines that
contain the string chevy and displays the third and first fields from the selected lines:
$ gawk '/chevy/ {print $3, $1}' cars
1999 chevy
2000 chevy
1985 chevy

In the following example, gawk selects lines that contain a match for the regular
expression h. Because there is no explicit action, gawk displays all the lines it selects.
$ gawk '/h/' cars
chevy
malibu 1999
ford
thundbd 2003
chevy
malibu 2000
honda
accord 2001
chevy
impala 1985

60
15
50
30
85

3000
10500
3500
6000
1550

Examples 631
~ (matches
operator)

The next pattern uses the matches operator (~) to select all lines that contain the letter
h in the first field:
$ gawk '$1 ~ /h/' cars
chevy
malibu 1999
chevy
malibu 2000
honda
accord 2001
chevy
impala 1985

60
50
30
85

3000
3500
6000
1550

The caret (^) in a regular expression forces a match at the beginning of the line
(page 1014) or, in this case, at the beginning of the first field:
$ gawk '$1 ~ /^h/' cars
honda
accord 2001
30

6000

Brackets surround a character class definition (page 1013). In the next example, gawk
selects lines that have a second field that begins with t or m and displays the third
and second fields, a dollar sign, and the fifth field. Because there is no comma
between the "$" and the $5, gawk does not put a SPACE between them in the output.
$ gawk '$2 ~ /^[tm]/ {print $3, $2, "$"
1999 malibu $3000
1965 mustang $10000
2003 thundbd $10500
2000 malibu $3500
2004 taurus $17000
Dollar signs

$5}' cars

The next example shows three roles a dollar sign can play in a gawk program. First,
a dollar sign followed by a number names a field. Second, within a regular expression
a dollar sign forces a match at the end of a line or field (5$). Third, within a string a
dollar sign represents itself.
$ gawk '$3 ~ /5$/ {print $3, $1, "$"
1965 ford $10000
1985 bmw $450
1985 chevy $1550

$5}' cars

In the next example, the equal-to relational operator (==) causes gawk to perform a
numeric comparison between the third field in each line and the number 1985. The
gawk command takes the default action, print, on each line where the comparison is
true.
$ gawk '$3 == 1985' cars
bmw
325i
1985
115
chevy
impala 1985
85

450
1550

The next example finds all cars priced at or less than $3,000.
$ gawk '$5 <= 3000' cars
plym
fury
1970
73
chevy
malibu 1999
60
bmw
325i
1985
115
toyota rav4
2002
180
chevy
impala 1985
85

2500
3000
450
750
1550

632 Chapter 14 The AWK Pattern Processing Language
Textual
comparisons

When you use double quotation marks, gawk performs textual comparisons by
using the ASCII (or other local) collating sequence as the basis of the comparison.
In the following example, gawk shows that the strings 450 and 750 fall in the range
that lies between the strings 2000 and 9000, which is probably not the intended
result.
$ gawk '"2000" <= $5 && $5 < "9000"' cars
plym
fury
1970
73
2500
chevy
malibu 1999
60
3000
chevy
malibu 2000
50
3500
bmw
325i
1985
115
450
honda
accord 2001
30
6000
toyota rav4
2002
180
750

When you need to perform a numeric comparison, do not use quotation marks. The
next example gives the intended result. It is the same as the previous example except
it omits the double quotation marks.
$ gawk '2000 <=
plym
fury
chevy
malibu
chevy
malibu
honda
accord

$5 && $5 < 9000' cars
1970
73
2500
1999
60
3000
2000
50
3500
2001
30
6000

, (range operator) The range operator ( , ) selects a group of lines. The first line it selects is the
one specified by the pattern before the comma. The last line is the one selected
by the pattern after the comma. If no line matches the pattern after the comma,
gawk selects every line through the end of the input. The next example selects
all lines, starting with the line that contains volvo and ending with the line that
contains bmw.
$ gawk '/volvo/ , /bmw/' cars
volvo
s80
1998
102
ford
thundbd 2003
15
chevy
malibu 2000
50
bmw
325i
1985
115

9850
10500
3500
450

After the range operator finds its first group of lines, it begins the process again, looking for a line that matches the pattern before the comma. In the following example,
gawk finds three groups of lines that fall between chevy and ford. Although the fifth
line of input contains ford, gawk does not select it because at the time it is processing
the fifth line, it is searching for chevy.
$ gawk '/chevy/ , /ford/' cars
chevy
malibu 1999
60
ford
mustang 1965
45
chevy
malibu 2000
50
bmw
325i
1985
115
honda
accord 2001
30

3000
10000
3500
450
6000

Examples 633
ford
chevy
ford

taurus
impala
explor

2004
1985
2003

10
85
25

17000
1550
9500

––file option

When you are writing a longer gawk program, it is convenient to put the program in
a file and reference the file on the command line. Use the –f (––file) option followed
by the name of the file containing the gawk program.

BEGIN

The following gawk program, which is stored in a file named pr_header, has two
actions and uses the BEGIN pattern. The gawk utility performs the action associated
with BEGIN before processing any lines of the data file: It displays a header. The second action, {print}, has no pattern part and displays all lines from the input.
$ cat pr_header
BEGIN
{print "Make
{print}

Model

$ gawk -f pr_header cars
Make
Model
Year
Miles
plym
fury
1970
73
chevy
malibu 1999
60
ford
mustang 1965
45
volvo
s80
1998
102
...

Year

Miles

Price"}

Price
2500
3000
10000
9850

The next example expands the action associated with the BEGIN pattern. In the
previous and the following examples, the whitespace in the headers is composed of
single TABs, so the titles line up with the columns of data.
$ cat pr_header2
BEGIN
{
print "Make
Model
Year
Miles
Price"
print "----------------------------------------"
}
{print}
$ gawk -f pr_header2 cars
Make
Model
Year
Miles
Price
---------------------------------------plym
fury
1970
73
2500
chevy
malibu 1999
60
3000
ford
mustang 1965
45
10000
volvo
s80
1998
102
9850
...
length function

When you call the length function without an argument, it returns the number of
characters in the current line, including field separators. The $0 variable always
contains the value of the current line. In the next example, gawk prepends the line
length to each line and then a pipeline sends the output from gawk to sort (the –n
option specifies a numeric sort; page 942). As a result, the lines of the cars file
appear in order of line length.

634 Chapter 14 The AWK Pattern Processing Language
$ gawk '{print length, $0}' cars | sort
21 bmw 325i
1985
115
450
22 plym fury
1970
73
2500
23 volvo
s80
1998
102
24 ford explor 2003
25
9500
24 toyota
rav4
2002
180
25 chevy
impala 1985
85
25 chevy
malibu 1999
60
25 chevy
malibu 2000
50
25 ford taurus 2004
10
17000
25 honda
accord 2001
30
26 ford mustang 1965
45
10000
26 ford thundbd 2003
15
10500

-n

9850
750
1550
3000
3500
6000

The formatting of this report depends on TAB s for horizontal alignment. The three
extra characters at the beginning of each line throw off the format of several lines; a
remedy for this situation is covered shortly.
NR (record number)

The NR variable contains the record (line) number of the current line. The following
pattern selects all lines that contain more than 24 characters. The action displays the
line number of each of the selected lines.
$ gawk 'length > 24 {print NR}' cars
2
3
5
6
8
9
11

You can combine the range operator (,) and the NR variable to display a group of
lines of a file based on their line numbers. The next example displays lines 2 through
4:
$ gawk 'NR == 2
chevy
malibu
ford
mustang
volvo
s80
END

, NR == 4' cars
1999
60
3000
1965
45
10000
1998
102
9850

The END pattern works in a manner similar to the BEGIN pattern, except gawk
takes the actions associated with this pattern after processing the last line of input.
The following report displays information only after it has processed all the input.
The NR variable retains its value after gawk finishes processing the data file, so an
action associated with an END pattern can use it.
$ gawk 'END {print NR, "cars for sale." }' cars
12 cars for sale.

The next example uses if control structures to expand the abbreviations used in some
of the first fields. As long as gawk does not change a record, it leaves the entire

Examples 635

record—including any separators—intact. Once it makes a change to a record, gawk
changes all separators in that record to the value of the output field separator. The
default output field separator is a SPACE.
$ cat separ_demo
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
print
}
$ gawk -f separ_demo cars
plymouth fury 1970 73 2500
chevrolet malibu 1999 60 3000
ford
mustang 1965
45
volvo
s80
1998
102
ford
thundbd 2003
15
chevrolet malibu 2000 50 3500
bmw
325i
1985
115
honda
accord 2001
30
ford
taurus 2004
10
toyota rav4
2002
180
chevrolet impala 1985 85 1550
ford
explor 2003
25
Stand-alone script

10000
9850
10500
450
6000
17000
750
9500

Instead of calling gawk from the command line with the –f option and the name of
the program you want to run, you can write a script that calls gawk with the commands you want to run. The next example is a stand-alone script that runs the same
program as the previous example. The #!/bin/gawk –f command (page 287) runs the
gawk utility directly. To execute it, you need both read and execute permission to the
file holding the script (page 285).
$ chmod u+rx separ_demo2
$ cat separ_demo2
#!/bin/gawk -f
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
print
}
$ ./separ_demo2 cars
plymouth fury 1970 73 2500
chevrolet malibu 1999 60 3000
ford
mustang 1965
45
...

OFS variable

10000

You can change the value of the output field separator by assigning a value to the OFS
variable. The following example assigns a TAB character to OFS, using the backslash
escape sequence \t. This fix improves the appearance of the report but does not line
up the columns properly.

636 Chapter 14 The AWK Pattern Processing Language
$ cat ofs_demo
BEGIN
{OFS = "\t"}
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
print
}
$ gawk -f ofs_demo cars
plymouth
fury
chevrolet
malibu
ford
mustang 1965
volvo
s80
1998
ford
thundbd 2003
chevrolet
malibu
bmw
325i
1985
honda
accord 2001
ford
taurus 2004
toyota rav4
2002
chevrolet
impala
ford
explor 2003
printf

1970
1999
45
102
15
2000
115
30
10
180
1985
25

73
60
10000
9850
10500
50
450
6000
17000
750
85
9500

2500
3000

3500

1550

You can use printf (page 626) to refine the output format. The following example
uses a backslash at the end of two program lines to quote the following NEWLINE. You
can use this technique to continue a long line over one or more lines without affecting
the outcome of the program.
$ cat printf_demo
BEGIN
{
print "
Miles"
print "Make
Model
Year
(000)
Price"
print \
"--------------------------------------------------"
}
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
printf "%-10s %-8s
%2d
%5d
$ %8.2f\n",\
$1, $2, $3, $4, $5
}
$ gawk -f printf_demo cars
Miles
Make
Model
Year
(000)
Price
-------------------------------------------------plymouth
fury
1970
73
$ 2500.00
chevrolet malibu
1999
60
$ 3000.00
ford
mustang
1965
45
$ 10000.00
volvo
s80
1998
102
$ 9850.00
ford
thundbd
2003
15
$ 10500.00
chevrolet malibu
2000
50
$ 3500.00
bmw
325i
1985
115
$
450.00

Examples 637
honda
ford
toyota
chevrolet
ford
Redirecting output

accord
taurus
rav4
impala
explor

2001
2004
2002
1985
2003

30
10
180
85
25

$ 6000.00
$ 17000.00
$
750.00
$ 1550.00
$ 9500.00

The next example creates two files: one with the lines that contain chevy and one with
the lines that contain ford.
$ cat redirect_out
/chevy/
{print > "chevfile"}
/ford/
{print > "fordfile"}
END
{print "done."}
$ gawk -f redirect_out cars
done.
$ cat chevfile
chevy
malibu
chevy
malibu
chevy
impala

1999
2000
1985

60
50
85

3000
3500
1550

The summary program produces a summary report on all cars and newer cars.
Although they are not required, the initializations at the beginning of the program
represent good programming practice; gawk automatically declares and initializes
variables as you use them. After reading all the input data, gawk computes and displays the averages.
$ cat summary
BEGIN
{
yearsum = 0 ; costsum = 0
newcostsum = 0 ; newcount = 0
}
{
yearsum += $3
costsum += $5
}
$3 > 2000 {newcostsum += $5 ; newcount ++}
END
{
printf "Average age of cars is %4.1f years\n",\
2006 - (yearsum/NR)
printf "Average cost of cars is $%7.2f\n",\
costsum/NR
printf "Average cost of newer cars is $%7.2f\n",\
newcostsum/newcount
}
$ gawk -f summary cars
Average age of cars is 13.1 years
Average cost of cars is $6216.67
Average cost of newer cars is $8750.00

638 Chapter 14 The AWK Pattern Processing Language

The following gawk command shows the format of a line from a Linux passwd file
that the next example uses:
$ gawk '/mark/ {print}' /etc/passwd
mark:x:107:100:ext 112:/home/mark:/bin/tcsh
FS variable

The next example demonstrates a technique for finding the largest number in a field.
Because it works with a Linux passwd file, which delimits fields with colons (:), the
example changes the input field separator (FS) before reading any data. It reads the
passwd file and determines the next available user ID number (field 3). The numbers
do not have to be in order in the passwd file for this program to work.
The pattern ($3 > saveit) causes gawk to select records that contain a user ID number
greater than any previous user ID number it has processed. Each time it selects a
record, gawk assigns the value of the new user ID number to the saveit variable. Then
gawk uses the new value of saveit to test the user IDs of all subsequent records. Finally
gawk adds 1 to the value of saveit and displays the result.
$ cat find_uid
BEGIN
{FS = ":"
saveit = 0}
$3 > saveit
{saveit = $3}
END
{print "Next available UID is " saveit + 1}
$ gawk -f find_uid /etc/passwd
Next available UID is 1092

The next example produces another report based on the cars file. This report uses
nested if...else control structures to substitute values based on the contents of the
price field. The program has no pattern part; it processes every record.
$ cat price_range
{
if
($5 <= 5000)
$5 = "inexpensive"
else if
(5000 < $5 && $5 < 10000) $5 = "please ask"
else if
(10000 <= $5)
$5 = "expensive"
#
printf "%-10s %-8s
%2d
%5d
%-12s\n",\
$1, $2, $3, $4, $5
}
$ gawk -f price_range cars
plym
fury
1970
chevy
malibu
1999
ford
mustang
1965
volvo
s80
1998
ford
thundbd
2003
chevy
malibu
2000
bmw
325i
1985
honda
accord
2001
ford
taurus
2004
toyota
rav4
2002
chevy
impala
1985
ford
explor
2003

73
60
45
102
15
50
115
30
10
180
85
25

inexpensive
inexpensive
expensive
please ask
expensive
inexpensive
inexpensive
please ask
expensive
inexpensive
inexpensive
please ask

Examples 639
Associative arrays

Next the manuf associative array uses the contents of the first field of each record in
the cars file as an index. The array consists of the elements manuf[plym],
manuf[chevy], manuf[ford], and so on. Each new element is initialized to 0 (zero) as
it is created. The ++ operator increments the variable it follows.

for structure

The action following the END pattern is a for structure, which loops through the elements of an associative array. A pipeline sends the output through sort to produce an
alphabetical list of cars and the quantities in stock. Because it is a shell script and not
a gawk program file, you must have both read and execute permission to the manuf
file to execute it as a command.
$ cat manuf
gawk ' {manuf[$1]++}
END
{for (name in manuf) print name, manuf[name]}
' cars |
sort
$ ./manuf
bmw 1
chevy 3
ford 4
honda 1
plym 1
toyota 1
volvo 1

The next program, manuf.sh, is a more general shell script that includes error checking. This script lists and counts the contents of a column in a file, with both the
column number and the name of the file specified on the command line.
The first action (the one that starts with {count) uses the shell variable $1 in the middle of the gawk program to specify an array index. Because of the way the single
quotation marks are paired, the $1 that appears to be within single quotation marks
is actually not quoted: The two quoted strings in the gawk program surround, but do
not include, the $1. Because the $1 is not quoted, and because this is a shell script,
the shell substitutes the value of the first command-line argument in place of $1
(page 459). As a result, the $1 is interpreted before the gawk command is invoked.
The leading dollar sign (the one before the first single quotation mark on that line)
causes gawk to interpret what the shell substitutes as a field number.
$ cat manuf.sh
if [ $# != 2 ]
then
echo "Usage: manuf.sh field file"
exit 1
fi
gawk < $2 '
{count[$'$1']++}
END
{for (item in count) printf "%-20s%-20s\n",\
item, count[item]}' |
sort

640 Chapter 14 The AWK Pattern Processing Language
$ ./manuf.sh
Usage: manuf.sh field file
$ ./manuf.sh 1 cars
bmw
1
chevy
3
ford
4
honda
1
plym
1
toyota
1
volvo
1
$ ./manuf.sh 3 cars
1965
1
1970
1
1985
2
1998
1
1999
1
2000
1
2001
1
2002
1
2003
2
2004
1

A way around the tricky use of quotation marks that allow parameter expansion
within the gawk program is to use the –v option on the command line to pass the field
number to gawk as a variable. This change makes it easier for someone else to read
and debug the script. You call the manuf2.sh script the same way you call manuf.sh:
$ cat manuf2.sh
if [ $# != 2 ]
then
echo "Usage: manuf.sh field file"
exit 1
fi
gawk -v "field=$1" < $2 '
{count[$field]++}
END
{for (item in count) printf "%-20s%-20s\n",\
item, count[item]}' |
sort

The word_usage script displays a word usage list for a file you specify on the command line. The tr utility (page 987) lists the words from standard input, one to a line.
The sort utility orders the file, putting the most frequently used words first. The script
sorts groups of words that are used the same number of times in alphabetical order.
$ cat word_usage
tr -cs 'a-zA-Z' '[\n*]' < $1 |
gawk
'
{count[$1]++}
END
{for (item in count) printf "%-15s%3s\n", item, count[item]}' |
sort -k 2nr -k 1f,2

Examples 641
$ ./word_usage textfile
the
42
file
29
fsck
27
system
22
you
22
to
21
it
17
SIZE
14
and
13
MODE
13
...

Following is a similar program in a different format. The style mimics that of a C
program and might be easier to read and work with for more complex gawk
programs.
$ cat word_count
tr -cs 'a-zA-Z' '[\n*]' < $1 |
gawk ' {
count[$1]++
}
END
{
for (item in count)
{
if (count[item] > 4)
{
printf "%-15s%3s\n", item, count[item]
}
}
} ' |
sort -k 2nr -k 1f

The tail utility displays the last ten lines of output, illustrating that words occurring
fewer than five times are not listed:
$ ./word_count textfile | tail
directories
5
if
5
information
5
INODE
5
more
5
no
5
on
5
response
5
this
5
will
5

The next example shows one way to put a date on a report. The first line of input to
the gawk program comes from date. The program reads this line as record number 1
(NR == 1), processes it accordingly, and processes all subsequent lines with the action
associated with the next pattern (NR > 1).

642 Chapter 14 The AWK Pattern Processing Language
$ cat report
if (test $# = 0) then
echo "You must supply a filename."
exit 1
fi
(date; cat $1) |
gawk '
NR == 1
{print "Report for", $1, $2, $3 ", " $6}
NR > 1
{print $5 "\t" $1}'
$ ./report cars
Report for Thu Jan 31, 2013
2500
plym
3000
chevy
10000
ford
9850
volvo
10500
ford
3500
chevy
450
bmw
6000
honda
17000
ford
750
toyota
1550
chevy
9500
ford

The next example sums each of the columns in a file you specify on the command
line; it takes its input from the numbers file. The program performs error checking,
reporting on and discarding rows that contain nonnumeric entries. It uses the next
command (with the comment skip bad records) to skip the rest of the commands for
the current record if the record contains a nonnumeric entry. At the end of the program, gawk displays a grand total for the file.
$ cat numbers
10
20
20
30
30
xyz
40
75
50
20
60
30
70
1134.7
80
75
90
176
100
1027.45
110
123
120
75
$ cat tally
gawk ' BEGIN

30.3
45.7
50
107.2
30.3
45.O
50
107.2
30.3
45.7
50
107.2

40.5
66.1
70
55.6
40.5
66.1
70
55.6
40.5
66.1
57a.5
55.6

{
ORS = ""
}

NR == 1 {
nfields = NF
}
{

# first record only
# set nfields to number of
# fields in the record (NF)

Examples 643
if ($0 ~ /[^0-9. \t]/)
#
{
#
print "\nRecord " NR " skipped:\n\t"
#
print $0 "\n"
next
#
}
else
{
for (count = 1; count <= nfields; count++)
{
printf "%10.2f", $count > "tally.out"
sum[count] += $count
gtotal += $count
}
print "\n" > "tally.out"
}
}

check each record to see if it contains
any characters that are not numbers,
periods, spaces, or TABs
skip bad records

# for good records loop through fields

END

{
# after processing last record
for (count = 1; count <= nfields; count++)
# print summary
{
print "
-------" > "tally.out"
}
print "\n" > "tally.out"
for (count = 1; count <= nfields; count++)
{
printf "%10.2f", sum[count] > "tally.out"
}
print "\n\n
Grand Total " gtotal "\n" > "tally.out"
} ' < numbers
$ ./tally
Record 3 skipped:
30
xyz

50

70

Record 6 skipped:
60
30

45.O

66.1

Record 11 skipped:
110
123

50

57a.5

$ cat tally.out
10.00
20.00
20.00
30.00
40.00
75.00
50.00
20.00
70.00
1134.70
80.00
75.00
90.00
176.00
100.00
1027.45
120.00
75.00
------------580.00
2633.15

30.30
45.70
107.20
30.30
50.00
107.20
30.30
45.70
107.20
------553.90

Grand Total 4257.55

40.50
66.10
55.60
40.50
70.00
55.60
40.50
66.10
55.60
------490.50

644 Chapter 14 The AWK Pattern Processing Language

The next example reads the passwd file, listing users who do not have passwords and
users who have duplicate user ID numbers. (The pwck utility [Linux only] performs
similar checks.) Because Mac OS X uses Open Directory (page 1042) and not the
passwd file, this example will not work under OS X.
$ cat /etc/passwd
bill::102:100:ext 123:/home/bill:/bin/bash
roy:x:104:100:ext 475:/home/roy:/bin/bash
tom:x:105:100:ext 476:/home/tom:/bin/bash
lynn:x:166:100:ext 500:/home/lynn:/bin/bash
mark:x:107:100:ext 112:/home/mark:/bin/bash
sales:x:108:100:ext 102:/m/market:/bin/bash
anne:x:109:100:ext 355:/home/anne:/bin/bash
toni::164:100:ext 357:/home/toni:/bin/bash
ginny:x:115:100:ext 109:/home/ginny:/bin/bash
chuck:x:116:100:ext 146:/home/chuck:/bin/bash
neil:x:164:100:ext 159:/home/neil:/bin/bash
rmi:x:118:100:ext 178:/home/rmi:/bin/bash
vern:x:119:100:ext 201:/home/vern:/bin/bash
bob:x:120:100:ext 227:/home/bob:/bin/bash
janet:x:122:100:ext 229:/home/janet:/bin/bash
maggie:x:124:100:ext 244:/home/maggie:/bin/bash
dan::126:100::/home/dan:/bin/bash
dave:x:108:100:ext 427:/home/dave:/bin/bash
mary:x:129:100:ext 303:/home/mary:/bin/bash
$ cat passwd_check
gawk < /etc/passwd '
BEGIN
{
uid[void] = ""
# tell gawk that uid is an array
}
{
# no pattern indicates process all records
dup = 0
# initialize duplicate flag
split($0, field, ":")
# split into fields delimited by ":"
if (field[2] == "")
# check for null password field
{
if (field[5] == "")
# check for null info field
{
print field[1] " has no password."
}
else
{
print field[1] " ("field[5]") has no password."
}
}
for (name in uid)
# loop through uid array
{
if (uid[name] == field[3])
# check for second use of UID
{
print field[1] " has the same UID as " name " : UID = " uid[name]
dup = 1
# set duplicate flag
}
}

Examples 645
if (!dup)

# same as if (dup == 0)
# assign UID and login name to uid array

{
uid[field[1]] = field[3]
}
}'
$ ./passwd_check
bill (ext 123) has no
toni (ext 357) has no
neil has the same UID
dan has no password.
dave has the same UID

password.
password.
as toni : UID = 164
as sales : UID = 108

The next example shows a complete interactive shell script that uses gawk to generate
a report on the cars file based on price ranges:
$ cat list_cars
trap 'rm -f $$.tem > /dev/null;echo $0 aborted.;exit 1' 1 2 15
read -p "Price range (for example, 5000 7500):" lowrange hirange
echo '
Miles
Make
Model
Year
(000)
Price
--------------------------------------------------' > $$.tem
gawk < cars '
$5 >= '$lowrange' && $5 <= '$hirange' {
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
printf "%-10s %-8s
%2d
%5d
$ %8.2f\n", $1, $2, $3, $4,
$5
}' | sort -n +5 >> $$.tem
cat $$.tem
rm $$.tem
$ ./list_cars
Price range (for example, 5000 7500):3000 8000
Miles
Make
Model
Year
(000)
Price
-------------------------------------------------chevrolet malibu
1999
60
$ 3000.00
chevrolet malibu
2000
50
$ 3500.00
honda
accord
2001
30
$ 6000.00
$ ./list_cars
Price range (for example, 5000 7500):0 2000
Miles
Make
Model
Year
(000)
Price
-------------------------------------------------bmw
325i
1985
115
$
450.00
toyota
rav4
2002
180
$
750.00
chevrolet impala
1985
85
$ 1550.00

646 Chapter 14 The AWK Pattern Processing Language

$ ./list_cars
Price range (for example, 5000 7500):15000 100000
Miles
Make
Model
Year
(000)
Price
-------------------------------------------------ford
taurus
2004
10
$ 17000.00

optional

Advanced gawk Programming
This section discusses some of the advanced features of AWK. It covers how to control input using the getline statement, how to use a coprocess to exchange
information between gawk and a program running in the background, and how to
use a coprocess to exchange data over a network. Coprocesses are available under
gawk only; they are not available under awk and mawk.

getline: Controlling Input
Using the getline statement gives you more control over the data gawk reads than
other methods of input do. When you provide a variable name as an argument to
getline, getline reads data into that variable. The BEGIN block of the g1 program
uses getline to read one line into the variable aa from standard input:
$ cat g1
BEGIN
{
getline aa
print aa
}
$ echo aaaa | gawk -f g1
aaaa

The next few examples use the alpha file:
$ cat alpha
aaaaaaaaa
bbbbbbbbb
ccccccccc
ddddddddd

Even when g1 is given more than one line of input, it processes only the first line:
$ gawk -f g1 < alpha
aaaaaaaaa

When getline is not given an argument, it reads input into $0 and modifies the field
variables ($1, $2, . . .):
$ gawk 'BEGIN {getline;print $1}' < alpha
aaaaaaaaa

Advanced gawk Programming

647

The g2 program uses a while loop in the BEGIN block to loop over the lines in standard input. The getline statement reads each line into holdme and print outputs each
value of holdme.
$ cat g2
BEGIN
{
while (getline holdme)
print holdme
}
$ gawk -f g2 < alpha
aaaaaaaaa
bbbbbbbbb
ccccccccc
ddddddddd

The g3 program demonstrates that gawk automatically reads each line of input into
$0 when it has statements in its body (and not just a BEGIN block). This program
outputs the record number (NR), the string $0:, and the value of $0 (the current
record) for each line of input.
$ cat g3
{print NR, "$0:", $0}
$
1
2
3
4

gawk -f g3 < alpha
$0: aaaaaaaaa
$0: bbbbbbbbb
$0: ccccccccc
$0: ddddddddd

Next g4 demonstrates that getline works independently of gawk’s automatic reads
and $0. When getline reads data into a variable, it does not modify either $0 or any
of the fields in the current record ($1, $2, . . .). The first statement in g4, which is
the same as the statement in g3, outputs the line that gawk has automatically read.
The getline statement reads the next line of input into the variable named aa. The
third statement outputs the record number, the string aa:, and the value of aa. The
output from g4 shows that getline processes records independently of gawk’s automatic reads.
$ cat g4
{
print NR, "$0:", $0
getline aa
print NR, "aa:", aa
}
$
1
2
3
4

gawk -f g4 < alpha
$0: aaaaaaaaa
aa: bbbbbbbbb
$0: ccccccccc
aa: ddddddddd

648 Chapter 14 The AWK Pattern Processing Language

The g5 program outputs each line of input except for those lines that begin with the
letter b. The first print statement outputs each line that gawk reads automatically.
Next the /^b/ pattern selects all lines that begin with b for special processing. The
action uses getline to read the next line of input into the variable hold, outputs the
string skip this line: followed by the value of hold, and outputs the value of $1. The
$1 holds the value of the first field of the record that gawk read automatically, not
the record read by getline. The final statement displays a string and the value of NR,
the current record number. Even though getline does not change $0 when it reads
data into a variable, gawk increments NR.
$ cat g5
# print all lines except those read with getline
{print "line #", NR, $0}
# if line begins with "b" process it specially
/^b/
{
# use getline to read the next line into variable named hold
getline hold
# print value of hold
print "skip this line:", hold
# $0 is not affected when getline reads data into a variable
# $1 still holds previous value
print "previous line began with:", $1
}
{
print ">>>> finished processing line #", NR
print ""
}
$ gawk -f g5 < alpha
line # 1 aaaaaaaaa
>>>> finished processing line # 1
line # 2 bbbbbbbbb
skip this line: ccccccccc
previous line began with: bbbbbbbbb
>>>> finished processing line # 3
line # 4 ddddddddd
>>>> finished processing line # 4

Coprocess: Two-Way I/O
A coprocess is a process that runs in parallel with another process. Starting with version 3.1, gawk can invoke a coprocess to exchange information directly with a
background process. A coprocess can be useful when you are working in a client/server environment, setting up an SQL (page 1095) front end/back end, or
exchanging data with a remote system over a network. The gawk syntax identifies a
coprocess by preceding the name of the program that starts the background process
with a |& operator.

Advanced gawk Programming

649

Only gawk supports coprocesses
tip The awk and mawk utilities do not support coprocesses. Only gawk supports coprocesses.
The coprocess command must be a filter (i.e., it reads from standard input and writes
to standard output) and must flush its output whenever it has a complete line rather
than accumulating lines for subsequent output. When a command is invoked as a
coprocess, it is connected via a two-way pipe to a gawk program so you can read from
and write to the coprocess.
to_upper

When used alone the tr utility (page 987) does not flush its output after each line. The
to_upper shell script is a wrapper for tr that does flush its output; this filter can be
run as a coprocess. For each line read, to_upper writes the line, translated to uppercase, to standard output. Remove the # before set –x if you want to_upper to display
debugging output.
$ cat to_upper
#!/bin/bash
#set -x
while read arg
do
echo "$arg" | tr '[a-z]' '[A-Z]'
done
$ echo abcdef | ./to_upper
ABCDEF

The g6 program invokes to_upper as a coprocess. This gawk program reads standard
input or a file specified on the command line, translates the input to uppercase, and
writes the translated data to standard output.
$ cat g6
{
print $0 |& "to_upper"
"to_upper" |& getline hold
print hold
}
$ gawk -f g6 < alpha
AAAAAAAAA
BBBBBBBBB
CCCCCCCCC
DDDDDDDDD

The g6 program has one compound statement, enclosed within braces, comprising
three statements. Because there is no pattern, gawk executes the compound statement
once for each line of input.
In the first statement, print $0 sends the current record to standard output. The |&
operator redirects standard output to the program named to_upper, which is running
as a coprocess. The quotation marks around the name of the program are required.
The second statement redirects standard output from to_upper to a getline statement,
which copies its standard input to the variable named hold. The third statement, print
hold, sends the contents of the hold variable to standard output.

650 Chapter 14 The AWK Pattern Processing Language

Getting Input from a Network
Building on the concept of a coprocess, gawk can exchange information with a process on another system via an IP network connection. When you specify one of the
special filenames that begins with /inet/, gawk processes the request using a network
connection. The syntax of these special filenames is
/inet/protocol/local-port/remote-host/remote-port
where protocol is usually tcp but can be udp, local-port is 0 (zero) if you want gawk
to pick a port (otherwise it is the number of the port you want to use), remote-host
is the IP address (page 1075) or fully qualified domain name (page 1070) of the
remote host, and remote-port is the port number on the remote host. Instead of a port
number in local-port and remote-port, you can specify a service name such as http
or ftp.
The g7 program reads the rfc-retrieval.txt file from the server at www.rfc-editor.org.
On www.rfc-editor.org the file is located at /rfc/rfc-retrieval.txt. The first statement
in g7 assigns the special filename to the server variable. The filename specifies a TCP
connection, allows the local system to select an appropriate port, and connects to
www.rfc-editor.org on port 80. You can use http in place of 80 to specify the standard
HTTP port.
The second statement uses a coprocess to send a GET request to the remote server.
This request includes the pathname of the file gawk is requesting. A while loop
uses a coprocess to redirect lines from the server to getline. Because getline has no
variable name as an argument, it saves its input in the current record buffer $0.
The final print statement sends each record to standard output. Experiment with
this script, replacing the final print statement with gawk statements that process
the file.
$ cat g7
BEGIN
{
# set variable named server
# to special networking filename
server = "/inet/tcp/0/www.rfc-editor.org/80"
# use coprocess to send GET request to remote server
print "GET /rfc/rfc-retrieval.txt \
HTTP/1.1\nHost:www.rfc-editor.org\n\n"|& server
# while loop uses coprocess to redirect
# output from server to getline
while (server |& getline)
print $0
}

Exercises 651
$ gawk -f g7
Where and how to get new RFCs
=============================
RFCs may be obtained via FTP or HTTP or email from many RFC
repositories.
The official repository for RFCs is:
http://www.rfc-editor.org/
...

Chapter Summary
AWK is a pattern-scanning and processing language that searches one or more files
for records (usually lines) that match specified patterns. It processes lines by performing actions, such as writing the record to standard output or incrementing a counter,
each time it finds a match. AWK has several implementations, including awk, gawk,
and mawk.
An AWK program consists of one or more lines containing a pattern and/or action
in the following syntax:
pattern { action }
The pattern selects lines from the input. An AWK program performs the action on
all lines that the pattern selects. If a program line does not contain a pattern, AWK
selects all lines in the input. If a program line does not contain an action, AWK copies
the selected lines to standard output.
An AWK program can use variables, functions, arithmetic operators, associative
arrays, control statements, and C’s printf statement. Advanced AWK programming
takes advantage of getline statements to fine-tune input, coprocesses to enable gawk
to exchange data with other programs (gawk only), and network connections to
exchange data with programs running on remote systems on a network (gawk only).

Exercises
1. Write a gawk program that numbers each line in a file and sends its output
to standard output.
2. Write a gawk program that displays the number of characters in the first
field followed by the first field and sends its output to standard output.

652 Chapter 14 The AWK Pattern Processing Language

3. Write a gawk program that uses the cars file (page 629), displays all cars
priced at more than $5,000, and sends its output to standard output.
4. Use gawk to determine how many lines in /etc/services contain the string
Mail. Verify your answer using grep.

Advanced Exercises
5. Experiment with pgawk (available only with gawk). What does it do? How
can it be useful?
6. Write a gawk (not awk or mawk) program named net_list that reads from the
rfc-retrieval.txt file on www.rfc-editor.org (see “Getting Input from a Network” on page 650) and displays a the last word on each line in all
uppercase letters.
7. Expand the net_list program developed in Exercise 6 to use to_upper
(page 649) as a coprocess to display the list of cars with only the make of
the cars in uppercase. The model and subsequent fields on each line should
appear as they do in the cars file.
8. How can you cause gawk (not awk or mawk) to neatly format—that is,
“pretty print”—a gawk program file? (Hint: See the gawk man page.)

15
The sed Editor

Chapter15
15

In This Chapter

Objectives

Syntax . . . . . . . . . . . . . . . . . . . . . . 654

After reading this chapter you should be able to:

Arguments . . . . . . . . . . . . . . . . . . . 654

 Use sed to edit a file, replacing words in the file

Options . . . . . . . . . . . . . . . . . . . . . 654
Editor Basics . . . . . . . . . . . . . . . . . 655

 Write a sed program that inserts or changes lines in a
file

Addresses . . . . . . . . . . . . . . . . . . . 655

 Change multiple character strings in a file using sed

Instructions . . . . . . . . . . . . . . . . . . 656

 Use sed as a filter to modify a file

Control Structures . . . . . . . . . . . . . 657
The Hold Space . . . . . . . . . . . . . . . 658
Examples . . . . . . . . . . . . . . . . . . . . 658

654 Chapter 15 The sed Editor

The sed (stream editor) utility is a batch (noninteractive) editor. It transforms an
input stream that can come from a file or standard input. It is frequently used as a
filter or in a pipeline. Because it makes only one pass through its input, sed is more
efficient than an interactive editor such as ed. Most Linux distributions provide GNU
sed; Mac OS X supplies BSD sed. This chapter applies to both versions.

Syntax
A sed command line has the following syntax:
sed [–n] program [file-list]
sed [–n] –f program-file [file-list]
The sed utility takes its input from files you specify on the command line or from
standard input. Output from sed goes to standard output.

Arguments
The program is a sed program included on the command line. The first syntax allows
you to write simple, short sed programs without creating a separate file to hold the
sed program. The program-file in the second syntax is the pathname of a file containing a sed program (see “Editor Basics”). The file-list contains pathnames of the
ordinary files that sed processes; these are the input files. When you do not specify a
file-list, sed takes its input from standard input.

Options
Options preceded by a double hyphen (– –) work under Linux (GNU sed) only.
Options named with a single letter and preceded by a single hyphen work under
Linux (GNU sed) and OS X (BSD sed).
––file program-file

–f program-file
Causes sed to read its program from the file named program-file instead of from
the command line. You can use this option more than once on the command
line.
Summarizes how to use sed. L

––help
––in-place[=suffix]

–i[suffix]
Edits files in place. Without this option sed sends its output to standard output. With this option sed replaces the file it is processing with its output. When
you specify a suffix, sed makes a backup of the original file. The backup has

Editor Basics

655

the original filename with suffix appended. You must include a period in suffix
if you want a period to appear between the original filename and suffix.
––quiet or
––silent

–n Causes sed not to copy lines to standard output except as specified by the Print
(p) instruction or flag.

Editor Basics
A sed program consists of one or more lines with the following syntax:
[address[,address]] instruction [argument-list]
The addresses are optional. If you omit the address, sed processes all lines of input.
The instruction is an editing instruction that modifies the text. The addresses select
the line(s) the instruction part of the command operates on. The number and kinds
of arguments in the argument-list depend on the instruction. If you want to put several sed commands on one line, separate the commands with semicolons (;).
The sed utility processes input as follows:
1. Reads one line of input from file-list or standard input.
2. Reads the first instruction from the program or program-file. If the
address(es) select the input line, acts on the input line as the instruction
specifies.
3. Reads the next instruction from the program or program-file. If the
address(es) select the input line, acts on the input line (possibly modified by
the previous instruction) as the instruction specifies.
4. Repeats step 3 until it has executed all instructions in the program or
program-file.
5. Starts over with step 1 if there is another line of input; otherwise, sed is
finished.

Addresses
A line number is an address that selects a line. As a special case, the line number $
represents the last line of input.
A regular expression (Appendix A) is an address that selects those lines containing a
string that the regular expression matches. Although slashes are often used to delimit
these regular expressions, sed permits you to use any character other than a backslash
or NEWLINE for this purpose.
Except as noted, zero, one, or two addresses (either line numbers or regular
expressions) can precede an instruction. If you do not specify an address, sed
selects all lines, causing the instruction to act on every line of input. Specifying one
address causes the instruction to act on each input line the address selects. Specifying two addresses causes the instruction to act on groups of lines. In this case
the first address selects the first line in the first group. The second address selects

656 Chapter 15 The sed Editor

the next subsequent line that it matches; this line is the last line in the first group.
If no match for the second address is found, the second address points to the end
of the file. After selecting the last line in a group, sed starts the selection process
over, looking for the next line the first address matches. This line is the first line
in the next group. The sed utility continues this process until it has finished going
through the entire file.

Instructions
Pattern space

The sed utility has two buffers. The following commands work with the Pattern
space, which initially holds the line of input that sed just read. The other buffer, the
Hold space, is discussed on page 658.

a (append) The Append instruction appends one or more lines to the currently selected

line. If you precede an Append instruction with two addresses, it appends to each line
that is selected by the addresses. If you do not precede an Append instruction with
an address, it appends to each input line. An Append instruction has the following
syntax:
[address[,address]] a\
text \
text \
...
text
You must end each line of appended text, except the last, with a backslash, which
quotes the following NEWLINE. The appended text concludes with a line that does not
end with a backslash. The sed utility always writes out appended text, regardless of
whether you use a –n flag on the command line. It even writes out the text if you
delete the line to which you are appending the text.
c (change) The Change instruction is similar to Append and Insert except it changes the

selected lines so that they contain the new text. When you specify an address range,
Change replaces the range of lines with a single occurrence of the new text.
d (delete) The Delete instruction causes sed not to write out the lines it selects and
not to finish processing the lines. After sed executes a Delete instruction, it reads

the next input line and then begins anew with the first instruction from the program
or program-file.
i (insert) The Insert instruction is identical to the Append instruction except it places

the new text before the selected line.
N (next without write) The Next (N) instruction reads the next input line and appends

it to the current line. An embedded NEWLINE separates the original line and the new line.
You can use the N command to remove NEWLINEs from a file. See the example on
page 663.

Editor Basics

657

n (next) The Next (n) instruction writes out the currently selected line if appropriate,

reads the next input line, and starts processing the new line with the next instruction
from the program or program-file.
p (print) The Print instruction writes the selected lines to standard output, writing the

lines immediately, and does not reflect the effects of subsequent instructions. This
instruction overrides the –n option on the command line.
q (quit) The Quit instruction causes sed to terminate immediately.
r file (read) The Read instruction reads the contents of the specified file and appends it to

the selected line. A single
instruction.

SPACE

and the name of the input file must follow a Read

s (substitute) The Substitute instruction in sed is similar to that in vim (page 186). It

has the following syntax:
[address[,address]] s/pattern/replacement-string/[g][p][w file]
The pattern is a regular expression (Appendix A) that traditionally is delimited by
a slash (/); you can use any character other than a SPACE or NEWLINE. The replacementstring starts immediately following the second delimiter and must be terminated by
the same delimiter. The final (third) delimiter is required. The replacement-string
can contain an ampersand (&), which sed replaces with the matched pattern. Unless
you use the g flag, the Substitute instruction replaces only the first occurrence of the
pattern on each selected line.
The g (global) flag causes the Substitute instruction to replace all nonoverlapping
occurrences of the pattern on the selected lines.
The p (print) flag causes sed to send all lines on which it makes substitutions to
standard output. This flag overrides the –n option on the command line.
The w (write) flag is similar to the p flag but sends its output to the file specified by
file. A single SPACE and the name of the output file must follow a w flag.
w file (write) This instruction is similar to the Print instruction except it sends the output

to the file specified by file. A single SPACE and the name of the output file must follow
a Write instruction.

Control Structures
! (NOT) Causes sed to apply the following instruction, located on the same line, to

each of the lines not selected by the address portion of the instruction. For example,
3!d deletes all lines except line 3 and $!p displays all lines except the last.
{ } (group instructions) When you enclose a group of instructions within a pair of

braces, a single address or address pair selects the lines on which the group of
instructions operates. Use semicolons (;) to separate multiple commands appearing
on a single line.

658 Chapter 15 The sed Editor
Branch instructions

The GNU sed info page identifies the branch instructions as “Commands for sed
gurus” and suggests that if you need them you might be better off writing your program in awk or Perl.

: label Identifies a location within a sed program. The label is useful as a target for the b

and t branch instructions.
b [label] Unconditionally transfers control to label. Without label, skips the rest of the

instructions for the current line of input and reads the next line of input.
t [label] Transfers control to label only if a Substitute instruction has been successful since the

most recent line of input was read (conditional branch). Without label, skips the rest
of the instructions for the current line of input and reads the next line of input.

The Hold Space
The commands reviewed up to this point work with the Pattern space, a buffer that
initially holds the line of input that sed just read. The Hold space can hold data while
you manipulate data in the Pattern space; it is a temporary buffer. Until you place
data in the Hold space, it is empty. This section discusses commands that move data
between the Pattern space and the Hold space.
g Copies the contents of the Hold space to the Pattern space. The original contents of

the Pattern space is lost.
G Appends a NEWLINE and the contents of the Hold space to the Pattern space.
h Copies the contents of the Pattern space to the Hold space. The original content of

the Hold space is lost.
H Appends a NEWLINE and the contents of the Pattern space to the Hold space.
x Exchanges the contents of the Pattern space and the Hold space.

Examples
lines data file

The following examples use the lines file for input:
$ cat lines
Line one.
The second line.
The third.
This is line four.
Five.
This is the sixth sentence.
This is line seven.
Eighth and last.

Unless you instruct it not to, sed sends all lines—selected or not—to standard output.
When you use the –n option on the command line, sed sends only certain lines, such
as those selected by a Print (p) instruction, to standard output.

Examples 659

The following command line displays all lines in the lines file that contain the word
line (all lowercase). In addition, because there is no –n option, sed displays all the
lines of input. As a result, sed displays the lines that contain the word line twice.
$ sed '/line/ p' lines
Line one.
The second line.
The second line.
The third.
This is line four.
This is line four.
Five.
This is the sixth sentence.
This is line seven.
This is line seven.
Eighth and last.

The preceding command uses the address /line/, a regular expression that is a simple
string. The sed utility selects each of the lines that contains a match for that pattern.
The Print (p) instruction displays each of the selected lines.
The following command uses the –n option, so sed displays only the selected lines:
$ sed -n '/line/ p' lines
The second line.
This is line four.
This is line seven.

In the next example, sed displays part of a file based on line numbers. The Print
instruction selects and displays lines 3 through 6.
$ sed -n '3,6 p' lines
The third.
This is line four.
Five.
This is the sixth sentence.

The next command line uses the Quit instruction to cause sed to display only the
beginning of a file. In this case sed displays the first five lines of lines just as a head
–5 lines command would.
$ sed '5 q' lines
Line one.
The second line.
The third.
This is line four.
Five.

program-file

When you need to give sed more complex or lengthy instructions, you can use a
program-file. The print3_6 program performs the same function as the command
line in the second preceding example. The –f option tells sed to read its program
from the file named following this option.

660 Chapter 15 The sed Editor
$ cat print3_6
3,6 p
$ sed -n -f print3_6 lines
The third.
This is line four.
Five.
This is the sixth sentence.
Append

The next program selects line 2 and uses an Append instruction to append a NEWLINE
and the text AFTER. to the selected line. Because the command line does not include
the –n option, sed copies all lines from the input file lines.
$ cat append_demo
2 a\
AFTER.
$ sed -f append_demo lines
Line one.
The second line.
AFTER.
The third.
This is line four.
Five.
This is the sixth sentence.
This is line seven.
Eighth and last.

Insert

The insert_demo program selects all lines containing the string This and inserts a
NEWLINE and the text BEFORE. before the selected lines.
$ cat insert_demo
/This/ i\
BEFORE.
$ sed -f insert_demo lines
Line one.
The second line.
The third.
BEFORE.
This is line four.
Five.
BEFORE.
This is the sixth sentence.
BEFORE.
This is line seven.
Eighth and last.

Change

The next example demonstrates a Change instruction with an address range. When
you specify a range of lines for a Change instruction, it does not change each line
within the range but rather changes the block of lines to a single occurrence of the
new text.

Examples 661
$ cat change_demo
2,4 c\
SED WILL INSERT THESE\
THREE LINES IN PLACE\
OF THE SELECTED LINES.
$ sed -f change_demo lines
Line one.
SED WILL INSERT THESE
THREE LINES IN PLACE
OF THE SELECTED LINES.
Five.
This is the sixth sentence.
This is line seven.
Eighth and last.
Substitute

The next example demonstrates a Substitute instruction. The sed utility selects all
lines because the instruction has no address. On each line subs_demo replaces the
first occurrence of line with sentence. The p flag displays each line where a substitution occurs. The command line calls sed with the –n option, so sed displays only the
lines the program explicitly specifies.
$ cat subs_demo
s/line/sentence/p
$ sed -n -f subs_demo lines
The second sentence.
This is sentence four.
This is sentence seven.

The next example is similar to the preceding one except that a w flag and filename
(temp) at the end of the Substitute instruction cause sed to create the file named temp.
The command line does not include the –n option, so it displays all lines in addition
to writing the changed lines to temp. The cat utility displays the contents of the file
temp. The word Line (starting with an uppercase L) is not changed.
$ cat write_demo1
s/line/sentence/w temp
$ sed -f write_demo1 lines
Line one.
The second sentence.
The third.
This is sentence four.
Five.
This is the sixth sentence.
This is sentence seven.
Eighth and last.
$ cat temp
The second sentence.
This is sentence four.
This is sentence seven.

662 Chapter 15 The sed Editor

The following bash script changes all occurrences of REPORT to report, FILE to file,
and PROCESS to process in a group of files. Because it is a shell script and not a sed
program file, you must have read and execute permission to the sub file to execute
it as a command (page 285). The for structure (page 433) loops through the list of
files on the command line. As it processes each file, the script displays each filename
before processing the file with sed. This program uses embedded sed commands that
span multiple lines. Because the NEWLINEs between the commands are quoted (they
appear between single quotation marks), sed accepts multiple commands on a single, extended command line (within the shell script). Each Substitute instruction
includes a g (global) flag to take care of the case where a string occurs more than
once on a line.
$ cat sub
for file
do
echo $file
mv $file $$.subhld
sed 's/REPORT/report/g
s/FILE/file/g
s/PROCESS/process/g' $$.subhld > $file
done
rm $$.subhld
$ sub file1 file2 file3
file1
file2
file3

In the next example, a Write instruction copies part of a file to another file (temp2).
The line numbers 2 and 4, separated by a comma, select the range of lines sed is to
copy. This program does not alter the lines.
$ cat write_demo2
2,4 w temp2
$ sed -n -f write_demo2 lines
$ cat temp2
The second line.
The third.
This is line four.

The program write_demo3 is similar to write_demo2 but precedes the Write instruction with the NOT operator (!), causing sed to write to the file those lines not selected
by the address.
$ cat write_demo3
2,4 !w temp3
$ sed -n -f write_demo3 lines

Examples 663
$ cat temp3
Line one.
Five.
This is the sixth sentence.
This is line seven.
Eighth and last.
Next (n)

The following example demonstrates the Next (n) instruction. When it processes the
selected line (line 3), sed immediately starts processing the next line without displaying line 3.
$ cat next_demo1
3 n
p
$ sed -n -f next_demo1 lines
Line one.
The second line.
This is line four.
Five.
This is the sixth sentence.
This is line seven.
Eighth and last.

The next example uses a textual address. The sixth line contains the string the, so the
Next (n) instruction causes sed not to display it.
$ cat next_demo2
/the/ n
p
$ sed -n -f next_demo2 lines
Line one.
The second line.
The third.
This is line four.
Five.
This is line seven.
Eighth and last.
Next (N)

The following example is similar to the preceding example except it uses the uppercase Next (N) instruction in place of the lowercase Next (n) instruction. Here the
Next (N) instruction appends the next line to the line that contains the string the. In
the lines file, sed appends line 7 to line 6 and embeds a NEWLINE between the two lines.
The Substitute command replaces the embedded NEWLINE with a SPACE. The Substitute
command does not affect other lines because they do not contain embedded NEWLINEs;
rather, they are terminated by NEWLINEs. See page 1042 for an example of the Next (N)
instruction in a sed script running under OS X.
$ cat Next_demo3
/the/ N
s/\n/ /
p

664 Chapter 15 The sed Editor
$ sed -n -f Next_demo3 lines
Line one.
The second line.
The third.
This is line four.
Five.
This is the sixth sentence. This is line seven.
Eighth and last.

The next set of examples uses the compound.in file to demonstrate how sed
instructions work together.
$ cat compound.in
1. The words on this
2. The words on this
3. The words on this
4. The words on this

page...
page...
page...
page...

The following example substitutes the string words with text on lines 1, 2, and 3 and
the string text with TEXT on lines 2, 3, and 4. The example also selects and deletes
line 3. The result is text on line 1, TEXT on line 2, no line 3, and words on line 4.
The sed utility makes two substitutions on lines 2 and 3: text for words and TEXT
for text. Then sed deletes line 3.
$ cat compound
1,3 s/words/text/
2,4 s/text/TEXT/
3 d
$ sed -f compound compound.in
1. The text on this page...
2. The TEXT on this page...
4. The words on this page...

The ordering of instructions within a sed program is critical. Both Substitute instructions are applied to the second line in the following example, as in the previous
example, but the order in which the substitutions occur changes the result.
$ cat compound2
2,4 s/text/TEXT/
1,3 s/words/text/
3 d
$ sed -f compound2 compound.in
1. The text on this page...
2. The text on this page...
4. The words on this page...

In the next example, compound3 appends two lines to line 2. The sed utility displays all lines from the file once because no –n option appears on the command
line. The Print instruction at the end of the program file displays line 3 an additional time.

Examples 665
$ cat compound3
2 a\
This is line 2a.\
This is line 2b.
3 p
$ sed -f compound3 compound.in
1. The words on this page...
2. The words on this page...
This is line 2a.
This is line 2b.
3. The words on this page...
3. The words on this page...
4. The words on this page...

The next example shows that sed always displays appended text. Here line 2 is
deleted but the Append instruction still displays the two lines that were appended to
it. Appended lines are displayed even if you use the –n option on the command line.
$ cat compound4
2 a\
This is line 2a.\
This is line 2b.
2 d
$ sed -f compound4 compound.in
1. The words on this page...
This is line 2a.
This is line 2b.
3. The words on this page...
4. The words on this page...

The next example uses a regular expression as the pattern. The regular expression in
the following instruction (^.) matches one character at the beginning of every line
that is not empty. The replacement string (between the second and third slashes) contains a backslash escape sequence that represents a TAB character (\t) followed by an
ampersand (&). The ampersand takes on the value of what the regular expression
matched.
$ sed 's/^./\t&/' lines
Line one.
The second line.
The third.
...

This type of substitution is useful for indenting a file to create a left margin. See
Appendix A for more information on regular expressions.
You can also use the simpler form s/^/\t/ to add TABs to the beginnings of lines. In
addition to placing TABs at the beginnings of lines with text on them, this instruction
places a TAB at the beginning of every empty line—something the preceding command
does not do.

666 Chapter 15 The sed Editor

You might want to put the preceding sed instruction into a shell script so you do not
have to remember it (and retype it) each time you want to indent a file. The chmod
utility gives you read and execute permission to the ind file.
$ cat ind
sed 's/^./\t&/' $*
$ chmod u+rx ind
$ ind lines
Line one.
The second line.
The third.
...
Stand-alone script

When you run the preceding shell script, it creates two processes: It calls a shell,
which in turn calls sed. You can eliminate the overhead associated with the shell process by putting the line #!/bin/sed –f (page 287) at the beginning of the script, which
runs the sed utility directly. You need read and execute permission to the file holding
the script.
$ cat ind2
#!/bin/sed -f
s/^./\t&/

In the following sed program, the regular expression (two SPACEs followed by *$)
matches one or more SPACEs at the end of a line. This program removes trailing SPACEs
at the ends of lines, which is useful for cleaning up files you created using vim.
$ cat cleanup
sed 's/ *$//' $*

The cleanup2 script runs the same sed command as cleanup but stands alone: It calls
sed directly with no intermediate shell.
$ cat cleanup2
#!/bin/sed -f
s/ *$//
Hold space

The next sed program makes use of the Hold space to exchange pairs of lines in
a file.
$
h
n
p
g
p

cat s1
# Copy Pattern space (line just read) to Hold space.
# Read the next line of input into Pattern space.
# Output Pattern space.
# Copy Hold space to Pattern space.
# Output Pattern space (which now holds the previous line).

$ sed -nf s1 lines
The second line.
Line one.
This is line four.

Examples 667
The third.
This is the sixth sentence.
Five.
Eighth and last.
This is line seven.

The commands in the s1 program process pairs of input lines. This program reads a
line and stores it; reads another line and displays it; and then retrieves the stored line
and displays it. After processing a pair of lines, the program starts over with the next
pair of lines.
The next sed program adds a blank line after each line in the input file (i.e., it doublespaces a file).
$ sed 'G' lines
Line one.
The second line.
The third.
This is line four.
$

The G instruction appends a NEWLINE and the contents of the Hold space to the Pattern
space. Unless you put something in the Hold space, it will be empty. Thus the G
instruction appends a NEWLINE to each line of input before sed displays the line(s) from
the Pattern space.
The s2 sed program reverses the order of the lines in a file just as the tac utility does.
$ cat s2
2,$G # On all but the first line, append a NEWLINE and the
# contents of the Hold space to the Pattern space.
h
# Copy the Pattern space to the Hold space.
$!d
# Delete all except the last line.
$ sed -f s2 lines
Eighth and last.
This is line seven.
This is the sixth sentence.
Five.
This is line four.
The third.
The second line.
Line one.

This program comprises three commands: 2,$G, h, and $!d. To understand this script
it is important to understand how the address of the last command works: The $ is
the address of the last line of input and the ! negates the address. The result is an

668 Chapter 15 The sed Editor

address that selects all lines except the last line of input. In the same fashion you could
replace the first command with 1!G: It would select all lines except the first line for
processing; the results would be the same.
Here is what happens as s2 processes the lines file:
1. The sed utility reads the first line of input (Line one.) into the Pattern space.
a. The 2,$G does not process the first line of input—because of its address
the G instruction starts processing at the second line.
b. The h copies Line one. from the Pattern space to the Hold space.
c. The $!d deletes the contents of the Pattern space. Because there is nothing
in the Pattern space, sed does not display anything.
2. The sed utility reads the second line of input (The second line.) into the
Pattern space.
a. The 2,$G adds what is in the Hold space (Line one.) to the Pattern space.
The Pattern space now contains The second line.NEWLINELine one.
b. The h copies what is in the Pattern space to the Hold space.
c. The $!d deletes the second line of input. Because it is deleted, sed does
not display it.
3. The sed utility reads the third line of input (The third.) into the Pattern
space.
a. The 2,$G adds what is in the Hold space (The second line.NEWLINELine
one.) to the Pattern space. The Pattern space now has The third.NEWLINE
The second line.NEWLINELine one.
b. The h copies what is in the Pattern space to the Hold space.
c. The $!d deletes the contents of the Pattern space. Because there is nothing
in the Pattern space, sed does not display anything.
...
8. The sed utility reads the eighth (last) line of input into the Pattern space.
a. The 2,$G adds what is in the Hold space to the Pattern space. The Pattern
space now contains all the lines from lines in reverse order.
b. The h copies what is in the Pattern space to the Hold space. This step is
not necessary for the last line of input but does not alter the program’s
output.
c. The $!d does not process the last line of input. Because of its address the
d instruction does not delete the last line.
d. The sed utility displays the contents of the Pattern space.

Exercises 669

Chapter Summary
The sed (stream editor) utility is a batch (noninteractive) editor. It takes its input from
files you specify on the command line or from standard input. Unless you redirect the
output from sed, it goes to standard output.
A sed program consists of one or more lines with the following syntax:
[address[,address]] instruction [argument-list]
The addresses are optional. If you omit the address, sed processes all lines of input.
The instruction is the editing instruction that modifies the text. The addresses select
the line(s) the instruction part of the command operates on. The number and kinds
of arguments in the argument-list depend on the instruction.
In addition to basic instructions, sed includes some powerful advanced instructions.
One set of these instructions allows sed programs to store data temporarily in a buffer called the Hold space. Other instructions provide unconditional and conditional
branching in sed programs.

Exercises
1. Write a sed command that copies a file to standard output, removing all
lines that begin with the word Today.
2. Write a sed command that copies only those lines of a file that begin with
the word Today to standard output.
3. Write a sed command that copies a file to standard output, removing all
blank lines (i.e., lines with no characters on them).
4. Write a sed program named ins that copies a file to standard output, changing all occurrences of cat to dog and preceding each modified line with a line
that says following line is modified:
5. Write a sed program named div that copies a file to standard output, copies
the first five lines to a file named first, and copies the rest of the file to a file
named last.
6. Write a sed command that copies a file to standard output, replacing a single SPACE as the first character on a line with a 0 (zero) only if the SPACE is
immediately followed by a number (0–9). For example:
abc
abc
85c
55b
000







abc
abc
085c
55b
0000

7. How can you use sed to triple-space (i.e., add two blank lines after each line
in) a file?

I

PART V
Secure Network Utilities
CHAPTER 16
The rsync Secure Copy Utility

673

CHAPTER 17
The OpenSSH Secure Communication Utilities

687

16
The rsync Secure
Copy Utility

Chapter16
16

In This Chapter

Objectives

Syntax . . . . . . . . . . . . . . . . . . . . . . 674

After reading this chapter you should be able to:

Arguments . . . . . . . . . . . . . . . . . . . 674

 Copy files and directories using rsync

Options . . . . . . . . . . . . . . . . . . . . . 675

 Explain why rsync is secure

Examples . . . . . . . . . . . . . . . . . . . . 677

 Back up files and directories to another system using

Removing Files . . . . . . . . . . . . . . . 678

rsync

Copying Files to and from a
Remote System . . . . . . . . . . . . . 680

 Explain the effect of including a trailing slash on the
name of the source directory

Mirroring a Directory . . . . . . . . . . . 681

 Use rsync options to delete files in the destination that
are not in the source, preserve modification times of
copied files, and perform a dry run so rsync takes no
action

Making Backups . . . . . . . . . . . . . . 681
Restoring a File . . . . . . . . . . . . . . . 684

674 Chapter 16 The rsync Secure Copy Utility

The rsync (remote synchronization) utility copies an ordinary file or directory hierarchy
locally or from the local system to or from another system on a network. By default, this
utility uses OpenSSH to transfer files and the same authentication mechanism as
OpenSSH; it therefore provides the same security as OpenSSH. The rsync utility prompts
for a password when it needs one. Alternatively, you can use the rsyncd daemon as a
transfer agent.

Syntax
An rsync command line has the following syntax:
rsync [options] [[user@]from-host:]source-file [[user@]to-host:][destination-file]
The rsync utility copies files, including directory hierarchies, on the local system or
between the local system and a remote system.

Arguments
The from-host is the name of the system you are copying files from; the to-host is the
name of the system you are copying files to. When you do not specify a host, rsync
assumes the local system. The user on either system defaults to the user who is giving
the command on the local system; you can specify a different user with user@. Unlike
scp, rsync does not permit copying between remote systems.
The source-file is the ordinary or directory file you are copying; the destinationfile is the resulting copy. You can specify files as relative or absolute pathnames.
On the local system, relative pathnames are relative to the working directory;
on a remote system, relative pathnames are relative to the specified or implicit
user’s home directory. When the source-file is a directory, you must use the
––recursive or ––archive option to copy its contents. When the destination-file is
a directory, each of the source files maintains its simple filename. If the sourcefile is a single file, you can omit destination-file; the copied file will have the
same simple filename as source-file (useful only when copying to or from a
remote system).

A trailing slash (/) on source-file is critical
caution When source-file is a directory, a trailing slash in source-file causes rsync to copy the contents
of the directory. The slash is equivalent to /*; it tells rsync to ignore the directory itself and
copy the files within the directory. Without a trailing slash, rsync copies the directory. See
page 678.

Options 675

Options
The Mac OS X version of rsync accepts long options
tip Options for rsync preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––acls –A Preserves ACLs (page 104) of copied files.
––archive

–a Copies files including dereferenced symbolic links, device files, and special files
recursively, preserving ownership, group, permissions, and modification times
associated with the files. Using this option is the same as specifying the
––devices, ––specials, ––group, ––links, ––owner, ––perms, ––recursive, and
– –times options. This option does not include the – –acls, – –hard-links, or
––xattrs options; you must specify these options in addition to ––archive if you
want to use them. See page 678 for an example.

––backup

–b Renames files that otherwise would be deleted or overwritten. By default, the
rsync utility renames files by appending a tilde (~) to existing filenames. See
––backup-dir=dir if you want rsync to put these files in a specified directory
instead of renaming them. See also ––link-dest=dir.
When used with the ––backup option, moves files that otherwise would be
deleted or overwritten to the directory named dir. After moving the older version of the file to dir, rsync copies the newer version of the file from source-file
to destination-file.

––backup-dir=dir

The directory named dir is located on the same system as destination-file. If dir
is a relative pathname, it is relative to destination-file.
––copy-unsafe-links

(partial dereference) For each file that is a symbolic link that refers to a file outside the source-file hierarchy, copies the file the link points to, not the symbolic
link itself. Without this option rsync copies all symbolic links, even if it does not
copy the file the link refers to. See page 115 for information on dereferencing
symbolic links.
–D Same as ––devices ––specials.
––delete

Deletes files in the destination-file that are not in the source-file. This option
can easily remove files you did not intend to remove; see the caution box on
page 679.

––devices

Copies device files (when working with root privileges only).

––dry-run

Runs rsync without writing to disk. With the ––verbose option, this option
reports on what rsync would have done had it been run without this option. Useful
with the ––delete option.

676 Chapter 16 The rsync Secure Copy Utility
––group

–g Preserves group associations of copied files.

––hard-links –H Preserves hard links of copied files.
––links

–l (lowercase “l”; no dereference) For each file that is a symbolic link, copies the symbolic link, not the file the link points to, even if the file the link points to is not in
the source-file. See page 115 for information on dereferencing symbolic links.

––link-dest=dir

If rsync would normally copy a file—that is, if the file exists in source-file but not
in destination-file or is changed in destination-file—rsync looks in the directory
named dir for the same file. If it finds an exact copy of the file in dir, rsync makes
a hard link from the file in dir to destination-file. If it does not find an exact
copy, rsync copies the file to destination-file. See page 682 for an example.
The directory named dir is located on the same system as destination-file. If dir
is a relative pathname, it is relative to destination-file.

––owner

–o Preserves the owner of copied files (when working with root privileges only).
–P Same as ––partial and ––progress.

––partial
––perms

Keeps partially copied files. By default rsync deletes partially copied files. See the
note on page 677.
–p Preserves the permissions of copied files.

––progress

Displays information about the progress of the transfer. Implies ––verbose.

––recursive

–r Recursively descends a directory specified in source-file and copies all files in the
directory hierarchy. See page 677 for an example.

––specials
––times

––update
––verbose

Copies special files.
–t Preserves the modification times of copied files. This option also speeds up copying files because it causes rsync not to copy a file that has the same modification
time and size in both the source-file and the destination-file. See page 678 for
an example.
–u Skips files that are newer in the destination-file than in the source-file.
–v Displays information about what rsync is doing. This option is useful with the
––dry-run option. See page 677 for an example.

––xattrs –X Preserves the extended attributes of copied files. This option is not available
with all compilations of rsync.
––compress

–z Compresses files while copying them. See “Compression” on the next page.

Notes
The rsync utility has many options. This chapter describes a few of them; see the rsync
man page for a complete list.
OpenSSH

By default, rsync copies files to and from a remote system using OpenSSH. The
remote system must be running an OpenSSH server. If you can use ssh to log in on

Examples 677

the remote system, you can use rsync to copy files to or from that system. If ssh
requires you to enter a password, rsync will require a password. See “Copying Files
to and from a Remote System” on page 680 for examples. See Chapter 17 for information on setting up and using OpenSSH.
rsyncd daemon

If you use a double colon (::) in place of a single colon (:) following the name of a
remote system, rsync connects to the rsyncd daemon on the remote system (it does
not use OpenSSH). See the rsync man page for more information.

Compression

The ––compress option causes rsync to compress files while copying them, which can
speed up a transfer. In some cases, when you specify this option in a setup with a fast
network connection and a slower CPU, compressing files can slow down the transfer.
You typically have this setup when you back up to a NAS (Network Attached Storage
device).

Partially copied files

The ––partial option causes rsync to keep partially copied files that result when a
transfer is interrupted. By default it deletes these files. Especially with large files,
having a partially copied file can speed up a subsequent transfer.

More Information
man page: rsync
rsync home page: www.samba.org/rsync

Backup information: www.mikerubel.org/computers/rsync_snapshots
Backup tools: www.rsnapshot.org, backuppc.sourceforge.net
File synchronization: alliance.seas.upenn.edu/~bcpierce

Examples
––recursive
––verbose

The first example shows rsync copying a directory using the – –recursive and
––verbose options. Both the source and destination directories are in the working
directory.
$ ls -l memos
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
$ rsync --recursive --verbose memos memos.copy
sending incremental file list
created directory memos.copy
memos/
memos/0514
memos/0516
sent 7656 bytes received 54 bytes 15420.00 bytes/sec
total size is 7501 speedup is 0.97
$ ls -l memos.copy
drwxr-xr-x. 2 max pubs 4096 05-21 14:32 memos

678 Chapter 16 The rsync Secure Copy Utility

In the preceding example, rsync copies the memos directory to the memos.copy
directory. As the following ls command shows, rsync changed the modification
times on the copied files to the time it made the copies:
$ ls -l memos.copy/memos
-rw-r--r--. 1 max pubs 1500 05-21 14:32 0514
-rw-r--r--. 1 max pubs 6001 05-21 14:32 0516

Using a Trailing Slash (/) on source-file
Whereas the previous example copied a directory to another directory, you might
want to copy the contents of a directory to another directory. A trailing slash (/) on
the source-file causes rsync to act as though you had specified a trailing /* and causes
rsync to copy the contents of the specified directory. A trailing slash on the
destination-file has no effect.
––times

The next example makes another copy of the memos directory, using ––times to preserve modification times of the copied files. It uses a trailing slash on memos to copy
the contents of the memos directory—not the directory itself—to memos.copy2.
$ rsync --recursive --verbose --times memos/ memos.copy2
sending incremental file list
created directory memos.copy2
./
0514
0516
sent 7642 bytes received 53 bytes 15390.00 bytes/sec
total size is 7501 speedup is 0.97
$ ls -l memos.copy2
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516

––archive

The ––archive option causes rsync to copy directories recursively, dereferencing symbolic links (copying the files the links point to, not the symbolic links themselves),
preserving modification times, ownership, group association of the copied files, and
more. This option does not preserve hard links; use the ––hard-links option for that
purpose. See page 675 for more information on ––archive. The following commands
perform the same functions as the previous one:
$ rsync --archive --verbose memos/ memos.copy2
$ rsync -av memos/ memos.copy2

Removing Files
––delete
––dry-run

The ––delete option causes rsync to delete from destination-file files that are not in
source-file. Together, the ––dry-run and ––verbose options report on what an rsync
command would do without the ––dry-run option, without rsync taking any action.
With the ––delete option, the ––dry-run and ––verbose options can help you avoid
removing files you did not intend to remove. This combination of options marks files

Examples 679
rsync would remove with the word deleting. The next example uses these options in
addition to the ––archive option.
$ ls -l memos
memos:
-rw-r--r--. 1
-rw-r--r--. 1
-rw-r--r--. 1

memos.copy3
max pubs 1500 05-14 14:24 0514
max pubs 6001 05-16 16:16 0516
max pubs 5911 05-18 12:02 0518

memos.copy3:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-21 14:36 notes
$ rsync --archive --verbose --delete --dry-run memos/ memos.copy3
sending incremental file list
./
deleting notes
0518
sent 83 bytes received 18 bytes 202.00 bytes/sec
total size is 13412 speedup is 132.79 (DRY RUN)

The rsync utility reports deleting notes, indicating which file it would remove if you
ran it without the ––dry-run option. It also reports it would copy the 0518 file.

Test to make sure ––delete is going to do what you think it will do
caution The ––delete option can easily delete an entire directory tree if you omit a needed slash (/) or
include an unneeded slash in source-file. Use ––delete with the ––dry-run and ––verbose options
to test an rsync command.

If you get tired of using the long versions of options, you can use the single-letter versions. The next rsync command is the same as the previous one (there is no short
version of the ––delete option):
$ rsync -avn --delete memos/ memos.copy3

The next example runs the same rsync command, omitting the ––dry-run option. The
ls command shows the results of the rsync command: The ––delete option causes
rsync to remove the notes file from the destination-file (memos.copy3) because it is
not in the source-file (memos). In addition, rsync copies the 0518 file.
$ rsync --archive --verbose --delete memos/ memos.copy3
sending incremental file list
./
deleting notes
0518
sent 6034 bytes received 34 bytes 12136.00 bytes/sec
total size is 13412 speedup is 2.21

680 Chapter 16 The rsync Secure Copy Utility
$ ls -l memos
memos:
-rw-r--r--. 1
-rw-r--r--. 1
-rw-r--r--. 1

memos.copy3
max pubs 1500 05-14 14:24 0514
max pubs 6001 05-16 16:16 0516
max pubs 5911 05-18 12:02 0518

memos.copy3:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-18 12:02 0518

Up to this point, the examples have copied files locally, in the working directory. To
copy files to other directories, replace the simple filenames with relative or absolute
pathnames. On the local system, relative pathnames are relative to the working directory; on a remote system, relative pathnames are relative to the user’s home directory.
For example, the following command copies the contents of the memos directory in
the working directory to the /backup directory on the local system:
$ rsync --archive --verbose --delete memos/ /backup

Copying Files to and from a Remote System
To copy files to or from a remote system, that system must be running an OpenSSH
server or another transport mechanism rsync can connect to. For more information
refer to “Notes” on page 676. To specify a file on a remote system, preface the filename
with the name of the remote system and a colon. Relative pathnames on the remote system are relative to the user’s home directory. Absolute pathnames are absolute (i.e., they
are relative to the root directory). See page 87 for more information on relative and
absolute pathnames.
In the next example, Max copies the contents of the memos directory in the working
directory on the local system to the holdfiles directory in his working directory on
the remote system named guava. The ssh utility runs an ls command on guava to
show the result of the rsync command. The rsync and ssh utilities do not request a
password because Max has set up OpenSSH-based utilities to log in automatically on
guava (page 702).
$ rsync --archive memos/ guava:holdfiles
$ ssh guava
-rw-r--r--.
-rw-r--r--.
-rw-r--r--.

'ls -l holdfiles'
1 max pubs 1500 05-14 14:24 0514
1 max pubs 6001 05-16 16:16 0516
1 max pubs 5911 05-18 12:02 0518

When copying from a remote system to the local system, place the name of the remote
system before source-file:
$ rsync --archive guava:holdfiles/ ~/memo.copy4
$ rsync --archive guava:holdfiles/ /home/max/memo.copy5

Both of these commands copy the contents of the holdfiles directory from Max’s
home directory on guava to his home directory on the local system. Under Mac OS
X, replace /home with /Users.

Examples 681

Mirroring a Directory
You can use rsync to maintain a copy of a directory. Because it is an exact copy, this
type of copy is called a mirror. The mirror directory must be on an OpenSSH server
(you must be able to connect to it using an OpenSSH utility such as ssh). If you want
to run this script using crontab (page 763), you must set up OpenSSH so you can log
in on the remote system automatically (without providing a password; page 702).
––compress
––update

The next example introduces the rsync – –compress and – –update options. The
––compress option causes rsync to compress files as it copies them, usually making
the transfer go more quickly. See the note on page 677. The ––update option keeps
rsync from overwriting a newer file with an older one.
As with all shell scripts, you must have read and execute access to the mirror script.
To make it easier to read, each option in this script appears on a line by itself. Each
line of each command except the last is terminated with a SPACE and a backslash (\).
The SPACE separates one option from the next; the backslash quotes the following
NEWLINE so the shell passes all arguments to rsync and does not interpret the NEWLINEs
as the end of the command.
$ cat mirror
rsync \
--archive \
--verbose \
--compress \
--update \
--delete \
~/mirrordir/ guava:mirrordir
$ ./mirror > mirror.out

The mirror command in the example redirects output to mirror.out for review. Remove
the ––verbose option if you do not want the command to produce any output except
for errors. The rsync command in mirror copies the mirrordir directory hierarchy from
the user’s home directory on the local system to the user’s home directory on the remote
(server) system. In this example the remote system is named guava. Because of the
––update option, rsync will not overwrite newer versions of files on the server with
older versions of the same files from the local system. Although this option is not
required if files on the server system are never changed manually, it can save you from
grief if you accidentally update files on or add files to the server system. The ––delete
option causes rsync to remove files on the server system that are not present on the local
system.

Making Backups
After performing an initial full backup, rsync is able to perform subsequent incremental backups efficiently with regard to running time and storage space. By definition,
an incremental backup stores only those files that have changed since the last backup;
these are the only files that rsync needs to copy. As the following example shows,

682 Chapter 16 The rsync Secure Copy Utility
rsync, without using extra disk space, can make each incremental backup appear to
be a full backup by creating hard links between the incremental backup and the
unchanged files in the initial full backup.
––link-dest=dir

The rsync ––link-dest=dir option makes backups easy and efficient. It presents the user
and/or system administrator with snapshots that appear to be full backups while taking
minimal extra space in addition to the initial backup. The dir directory is always located
on the machine holding the destination-file. If dir is a relative pathname, the pathname
is relative to the destination-file. See page 676 for a description of this option.
Following is a simple rsync command that uses the ––link-dest=dir option:
$ rsync --archive --link-dest=../backup source/ destination

When you run this command, rsync descends the source directory hierarchy, examining each file it finds. For each file in the source directory hierarchy, rsync looks in
the destination directory hierarchy to find an exact copy of the file.
• If it finds an exact copy of the file in the destination directory, rsync continues with the next file.
• If it does not find an exact copy of the file in the destination directory, rsync
looks in the backup directory to find an exact copy of the file.
• If it finds an exact copy of the file in the backup directory, rsync
makes a hard link from the file in the backup directory to the
destination directory.
• If it does not find an exact copy of the file in the backup directory,
rsync copies the file from the source directory to the destination
directory.
Next is a simple example showing how to use rsync to make full and incremental
backups using the ––link-dest=dir option. Although the backup files reside on the
local system, they could easily be located on a remote system.
As specified by the two arguments to rsync in the bkup script, rsync copies the memos
directory to the bu.0 directory. The – –link-dest=dir option causes rsync to check
whether each file it needs to copy exists in bu.1. If it does, rsync creates a link to the
bu.1 file instead of copying it.
The bkup script rotates three backup directories named bu.0, bu.1, and bu.2 and calls
rsync. The script removes bu.2, moves bu.1 to bu.2, and moves bu.0 to bu.1. The first
time you run the script, rsync copies all the files from memos because they do not exist
in bu.0 or bu.1.
$ cat bkup
rm -rf bu.2
mv bu.1 bu.2
mv bu.0 bu.1
rsync --archive --link-dest=../bu.1 memos/

bu.0

Examples 683

Before you run bkup for the first time, bu.0, bu.1, and bu.2 do not exist. Because of
the –f option, rm does not display an error message when it tries to remove the nonexistent bu.2 directory. Until bkup creates bu.0 and bu.1, mv displays error messages
saying there is No such file or directory.
In the following example, ls shows the bkup script and the contents of the memos
directory. After running bkup, ls shows the contents of memos and of the new bu.0
directory; bu.0 holds exact copies of the files in memos. The rsync utility created no
links because there were no files in bu.1: The directory did not exist.
$ ls -l *
-rwxr-xr-x. 1 max pubs

87 05-18 11:24 bkup

memos:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-18 12:02 0518
$ ./bkup
mv: cannot stat 'bu.1': No such file or directory
mv: cannot stat 'bu.0': No such file or directory
--link-dest arg does not exist: ../bu.1
$ ls -l *
-rwxr-xr-x. 1 max pubs

87 05-18 11:24 bkup

bu.0:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-18 12:02 0518
memos:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-18 12:02 0518

After working with the files in memos, ls shows 0518 has been removed and newfile
has been added:
$ ls -l memos
-rw-r--r--. 1 max pubs 1208 05-21 14:16 0514
-rw-r--r--. 1 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 7501 05-21 14:16 newfile

After running bkup again, bu.0 holds the same files as memos and bu.1 holds the files
that bu.0 held before running bkup. The 0516 file has not changed, so rsync, with
the ––link-dest=dir option, has not copied it but rather has made a link from the copy
in bu.1 to the copy in bu.0, as indicated by the 2 ls displays between the permissions
and max.

684 Chapter 16 The rsync Secure Copy Utility
$ ./bkup
mv: cannot stat 'bu.1': No such file or directory
$ ls -l bu.0 bu.1
bu.0:
-rw-r--r--. 1 max pubs 1208 05-21 14:16 0514
-rw-r--r--. 2 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 7501 05-21 14:16 newfile
bu.1:
-rw-r--r--. 1 max pubs 1500 05-14 14:24 0514
-rw-r--r--. 2 max pubs 6001 05-16 16:16 0516
-rw-r--r--. 1 max pubs 5911 05-18 12:02 0518

The beauty of this setup is that each incremental backup occupies only the space
needed to hold files that have changed. Files that have not changed are stored as links,
which take up minimal disk space. Yet users and the system administrator have access
to a directory that appears to hold a full backup.
You can run a backup script such as bkup once an hour, once a day, or as often as
you like. Storage space permitting, you can have as many backup directories as you
like. If rsync does not require a password, you can automate this process by using
crontab (page 763).

Restoring a File
To restore the most recent copy of a file, list all copies of the file and see which has
the most recent date:
$ ls -l bu.?/0514
-rw-r--r--. 1 max pubs 1208 05-21 14:16 bu.0/0514
-rw-r--r--. 1 max pubs 1500 05-14 14:24 bu.1/0514

Then copy that file to the directory you want to restore it into. Use cp with the –a
option to keep the same date on the copy of the file as appears on the original file:
$ cp -a bu.0/0514 ~max/memos

If two copies of (links to) a file show the same time and date it does not matter which
you restore from.

Chapter Summary
The rsync utility copies an ordinary file or directory hierarchy locally or between the
local system and a remote system on a network. By default, this utility uses openSSH
to transfer files and the same authentication mechanism as openSSH; therefore it provides the same security as openSSH. The rsync utility prompts for a password when
it needs one.

Exercises 685

Exercises
1. List three features of rsync.
2. Write an rsync command that copies the backmeup directory from your
home directory on the local system to the /tmp directory on guava, preserving file ownership, permissions, and modification times. Write a command
that will copy the same directory to your home directory on guava. Do not
assume the working directory on the local system is your home directory.
3. You are writing an rsync command that includes the ––delete option. Which
options would you use to test the command without copying or removing
any files?
4. What does the ––archive option do? Why is it useful?
5. When running a script such as bkup (page 682) to back up files on a remote
system, how could you rotate (rename) files on a remote system?
6. What effect does a trailing slash (/) on the source-file have?

17
The OpenSSH Secure
Communication
Utilities

In This Chapter
Introduction to OpenSSH . . . . . . . 688

Chapter17
17

Objectives
After reading this chapter you should be able to:

Tutorial: Using ssh and scp to
Connect to an OpenSSH
Server . . . . . . . . . . . . . . . . . . . . . 691

 Explain the need for encrypted services

Configuring OpenSSH Clients. . . . 692

 Copy files and directories to and from a remote system
securely

ssh: Logs In or Executes
Commands on a Remote
System . . . . . . . . . . . . . . . . . . . . 694
scp: Copies Files to and from a
Remote System . . . . . . . . . . . . . 698
sftp: A Secure FTP Client. . . . . . . . 700
Authorized Keys: Automatic
Login. . . . . . . . . . . . . . . . . . . . . . 702
ssh-agent: Holds Your
Private Keys . . . . . . . . . . . . . . . . 705
Forwarding X11 . . . . . . . . . . . . . . . 706
Port Forwarding . . . . . . . . . . . . . . . 707

 Log in on a remote OpenSSH server system using ssh

 Set up a client/server so you do not need to use a password to log in using ssh or scp
 Enable trusted X11 tunneling between the client and
the OpenSSH server
 Remove a known host record from the
~/.ssh/known_hosts file
 Enable trusted X11 forwarding
 List the uses of ssh tunneling (port forwarding)

688 Chapter 17 The OpenSSH Secure Communication Utilities

OpenSSH is a suite of secure network connectivity tools that replaces telnet/telnetd,
rcp, rsh/rshd, rlogin/rlogind, and ftp/ftpd. Unlike the tools they replace, OpenSSH tools
encrypt all traffic, including passwords. In this way they can thwart malicious users
who attempt to eavesdrop, hijack connections, and steal passwords.
This chapter assumes the local system can connect to an OpenSSH server and covers
the following OpenSSH tools:
• scp—Copies files to and from a remote system (pages 691 and 698)
• sftp—Copies files to and from a remote system (a secure replacement for ftp;
page 700)
• ssh—Runs a command on or logs in on a remote system (pages 691
and 694)
• ssh-add—Works with ssh-agent to specify the passphrase (page 705)
• ssh-agent—Holds your private keys (page 705)
• ssh-copy-id—Copies id_rsa.pub to a server so you do not need to use a password
when you connect to that server using an OpenSSH tool (page 703)
• ssh-keygen—Creates, manages, and converts RSA or DSA host/user
authentication keys (page 703)

Introduction to OpenSSH
Using public key encryption, OpenSSH provides two levels of authentication: server
and client/user. First the client verifies it is connected to the correct server. Then
OpenSSH encrypts communication between the systems. When a secure, encrypted
connection has been established, OpenSSH makes sure the user is authorized to log
in on or copy files to and from the server. After verifying the system and user,
OpenSSH allows a variety of services to be passed through the connection including
interactive shell sessions (ssh), remote command execution (ssh), file copying (scp),
FTP services (sftp), X11 client/server connections, and TCP/IP port tunneling.
SSH1 versus SSH2

SSH protocol version 2 (SSH2) is a complete rewrite of SSH protocol version 1
(SSH1) that offers improved security, performance, and portability. The two protocols are not compatible. Because SSH1 is being rapidly supplanted by SSH2 and
because SSH1 is vulnerable to a man-in-the-middle attack (page 1079), this chapter
does not discuss SSH1. Because version 2 is floating-point intensive, version 1 does
have a place on systems without FPUs (floating-point units or accelerators), such
as old 486SX systems. As installed, the OpenSSH tools supplied with most Linux
distributions support both protocols (some support SSH2 only); most servers can
communicate with clients using either protocol.

ssh

The ssh utility enables you to log in on a remote system over a network. You might
choose to use a remote system to access a special-purpose application or to take

Introduction to OpenSSH 689

advantage of a device that is available only on that system, or you might use a remote
system because you know it is faster or less busy than the local system. While traveling, many businesspeople use ssh on a laptop to log in on a system at company
headquarters.
X11 forwarding

When you turn on trusted X11 forwarding on an ssh client, it is a simple matter to
run a graphical program over an ssh connection to a server that has X11 forwarding
enabled: Run ssh from a terminal emulator running on an X11 server and give an
X11 command such as gnome-calculator; the graphical output appears on the local
display. For more information refer to “Forwarding X11” on page 706.

sshfs

See page 954 for information on using sshfs to mount an OpenSSH-accessible
directory on the local system without needing root privileges.

How OpenSSH Works
When OpenSSH starts, it first establishes an encrypted connection and then authenticates
the user. When these two tasks are complete, OpenSSH allows the two systems to send
information back and forth.
Keys

OpenSSH uses a host key pair and a session key to negotiate an encrypted session.
The host key pair is a set of public/private keys. The session key is a symmetric key
that the server and client share.
The first time an OpenSSH client connects with an OpenSSH server, OpenSSH
asks you to verify the client is connected to the correct server (see “First-time
authentication” on page 692). This verification helps prevent a man-in-the-middle
attack. After verification, the client makes a copy of the server’s public host key.
On subsequent connections, the client compares the key provided by the server
with the original key it stored. Although this test is not foolproof, the next one is
quite secure.
The client then generates a random key, which it encrypts with both the server’s public
host key and the session key. The client sends this encrypted key to the server. The
server, in turn, uses its private keys to decrypt the encrypted key. This process creates
the session key—a key that is known only to the client and the server and is used to
encrypt the rest of the session.

Files
OpenSSH clients rely on many files. Global files are kept in /etc/ssh and user files in
~/.ssh. In this section, the first word in the description of each file indicates whether
the client or the server uses the file.

rhost authentication is a security risk
security Although OpenSSH can get authentication information from /etc/hosts.equiv, /etc/shosts.equiv,
~/.rhosts, and ~/.shosts, this chapter does not cover the use of these files because they present
security risks. By default, most servers are set up to prevent their use.

690 Chapter 17 The OpenSSH Secure Communication Utilities

/etc/ssh: Global Files
Global files listed in this section appear in the /etc/ssh directory. They affect all users,
but a user can override them with files in her ~/.ssh directory.
moduli client and server

Contains key exchange information that OpenSSH uses to establish
a secure connection. Do not modify this file.

ssh_config client

The global OpenSSH configuration file (page 700). Entries here can be
overridden by entries in a user’s ~/.ssh/config file.

ssh_known_hosts

client Contains public RSA (by default) keys of hosts that users on the local system can
connect to. This file contains information similar to that found in ~/.ssh/known_hosts
but is set up by the administrator and is available to all users. This file should be owned
by root and should be readable by anyone but writable only by its owner (644
permissions).

~/.ssh: User Files
OpenSSH creates the ~/.ssh directory and the known_hosts file therein automatically
when a user connects to a remote system.
authorized_keys server

Enables a user to log in on or copy files to and from another system without
supplying a user login password (page 702). However, the user might need to supply
a passphrase, depending on how the key was set up. No one except the owner should
be able to write to this file.

config client

A user’s private OpenSSH configuration file (page 700). Entries here override
those in /etc/ssh/ssh_config.

environment server

Contains assignment statements that define environment variables on a
server when a user logs in using ssh.

id_dsa, client User authentication DSA keys generated by ssh-keygen (page 703). Both files
id_dsa.pub should be owned by the user in whose home directory they appear. The id_dsa.pub

public file should be readable by anyone but writable only by its owner (644 permissions). The id_dsa private file should not be readable or writable by anyone except
its owner (600 permissions).
id_rsa, client User authentication RSA keys generated by ssh-keygen (page 703). Both files
id_rsa.pub should be owned by the user in whose home directory they appear. The id_rsa.pub

public file should be readable by anyone but writable only by its owner (644 permissions). The id_rsa private file should not be readable or writable by anyone except
its owner (600 permissions).
known_hosts client

Contains public RSA keys (by default) of hosts the user has connected to.
OpenSSH automatically adds entries each time the user connects to a new server
(page 692). Refer to “HostKeyAlgorithms” (page 702) for information on using
DSA keys. If HashKnownHosts (page 701) is set to yes, the hostnames and addresses
in this file are hashed to improve security.

Running the ssh, scp, and sftp OpenSSH Clients

691

More Information
Local man
Web

Books

pages: ssh, scp, sftp, ssh-keygen, ssh-agent, ssh-add, ssh_config, sshd, sshd_config

OpenSSH home page: www.openssh.com
Search on ssh to find various HOWTOs and other documents: tldp.org
Implementing SSH: Strategies for Optimizing the Secure Shell by Dwivedi; John
Wiley & Sons (October 2003)
SSH, The Secure Shell: The Definitive Guide by Barrett, Silverman, and Byrnes;
O’Reilly Media (May 2005)

Running the ssh, scp, and sftp OpenSSH Clients
This section presents a brief tutorial on using ssh and scp and then covers setting
up and using the ssh, scp, and sftp clients. The openssh (and on some systems
openssh-clients) package is required to run these clients; this package is typically
installed by default.

Tutorial: Using ssh and scp to Connect to an
OpenSSH Server
The ssh and scp clients do not require setup beyond installing the requisite package,
although you can create and edit files that facilitate their use. To run a secure shell
on or securely copy a file to or from a remote system, the following criteria must be
met: The remote system must be running the OpenSSH daemon (sshd), you must
have an account on the remote system, and the server must positively identify itself
to the client.
ssh

The following example shows Zach using ssh to log in on the remote host named
plum, running who, and giving an exit command to return to the shell on the local
system. The who utility displays the IP address of the system Zach logged in from.
$ ssh zach@plum
zach@plum’s password:
[zach@plum ~]$ who am i
zach
pts/2
2012-12-03 11:43 (172.16.192.150)
[zach@plum ~]$ exit
logout
Connection to plum closed.

You can omit user@ (zach@ in the preceding example) from the command line if you
want to log in as yourself and you have the same username on both systems. The
first time you connect to a remote OpenSSH server, ssh or scp asks you to confirm
you are connected to the right system. Refer to “First-time authentication” on the
next page.

692 Chapter 17 The OpenSSH Secure Communication Utilities
scp

The following example uses scp to copy ty1 from the working directory on the local
system to Zach’s home directory on plum:
$ scp ty1 zach@plum:
zach@plum's password:
ty1

100%

162

0.2KB/s

00:00

Remember to follow the system name with a colon (:). If you omit the colon, scp
copies the file locally; in the example you would end up with a copy of ty1 named
zach@plum.

Configuring OpenSSH Clients
This section describes how to set up OpenSSH on the client side.

Recommended Settings
X11 forwarding

The configuration files provided with most distributions establish a mostly secure
system that might or might not meet your needs. One OpenSSH parameter you
might want to change is ForwardX11Trusted, which is typically set or defaults to
yes in the /etc/ssh/ssh_config configuration file (page 701). To increase security, and
in some cases reduce usability, set ForwardX11Trusted to no. See page 706 for more
information on X11 forwarding.

Server Authentication/Known Hosts
Two files list the hosts the local system has connected to and positively identified:
~/.ssh/known_hosts (user) and /etc/ssh/ssh_known_hosts (global). No one except
the owner (root in the case of the second file) should be able to write to either of these
files. No one except the owner should have any access to a ~/.ssh directory.
First-time
authentication

When you connect to an OpenSSH server for the first time, the OpenSSH client
prompts you to confirm that you are connected to the right system. This behavior
is controlled by StrictHostKeyChecking (page 702). This check can help prevent a
man-in-the-middle attack (page 1079):
The authenticity of host 'plum (172.16.192.151)' can't be established.
RSA key fingerprint is d1:9d:1b:5b:97:5c:80:e9:4b:41:9a:b7:bc:1a:ea:a1.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'plum,172.16.192.151' (RSA) to the list of
known hosts.

Before you respond to the preceding query, make sure you are logging in on the
correct system and not on an imposter. If you are not sure, a telephone call to
someone who logs in on that system locally can help verify that you are on the
intended system. When you answer yes (you must spell it out ), the client appends
the server’s public host key (the single line in the /etc/ssh/ssh_host_rsa_key.pub or
/etc/ssh/ssh_host_dsa_key.pub file on the server) to the user’s ~/.ssh/known_hosts file
on the local system, creating the ~/.ssh directory if necessary. So it can keep track of
which line in known_hosts applies to which server, OpenSSH prepends the name of the
server and the server’s IP address to the line.

Running the ssh, scp, and sftp OpenSSH Clients

693

When you subsequently use OpenSSH to connect to that server, the client verifies it
is connected to the correct server by comparing this key to the one supplied by the
server.
known_hosts

The ~/.ssh/known_hosts file uses one or two very long lines to identify each host it
keeps track of. Each line starts with the hostname and IP address of the system the
line corresponds to, followed by the type of encryption being used and the server’s
public host key. When HashKnownHosts (page 701) is set to yes, OpenSSH hashes
the system name and address to improve security. Because HashKnownHosts hashes
the hostname and IP address separately, OpenSSH puts two lines in known_hosts for
each host. The following line (it is one logical line that wraps onto several physical
lines) from known_hosts is used to connect to plum at 172.16.192.151 using RSA
(page 1091) encryption:
$ cat ~/.ssh/known_hosts
plum,172.16.192.151 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDbhLRVTfI
v9gy7oP+5T3HjZmrKt2q6ydyKmLlHNUjZFXM4hCdkJlpTfJ4wy260UAZBWvrBLP6N9k
...

You can use ssh-keygen with the –R option followed by the hostname to remove an
entry, even a hashed one. Alternatively, you can use a text editor to remove an entry.
The –F option to ssh-keygen displays a line in a known_hosts file that corresponds
to a specified system, even if the entry is hashed:
$ ssh-keygen -F plum
# Host plum found: line 1 type RSA
plum ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDbhLRVTfIv9gy7oP+5T3HjZmr
Kt2q6ydyKmLlHNUjZFXM4hCdkJlpTfJ4wy260UAZBWvrBLP6N9kYA+KRG1FyuwROUwXQ
...
ssh_known_hosts

OpenSSH automatically stores keys from servers it has connected to in user-private
files (~/.ssh/known_hosts). These files work only for the user whose directory they
appear in. Working with root privileges and using a text editor, you can copy
nonhashed lines from a user’s private list of known hosts to the public list in
/etc/ssh/ssh_known_hosts to make a server known globally on the local system.
The following example shows how Sam puts the hashed entry from his known_hosts
file into the global ssh_known_hosts file. First, working as himself, Sam sends the
output of ssh-keygen through tail to strip off the Host plum found line and redirects
the output to a file named tmp_known_hosts in his home directory. Next, working
with root privileges, Sam appends the contents of the file he just created to
/etc/ssh/ssh_known_hosts. This command creates this file if it does not exist. Finally
Sam returns to working as himself and removes the temporary file he created.
$ ssh-keygen -F plum | tail -1 > ~sam/tmp_known_hosts
# or sudo –i
$ su
Password:
# cat ~sam/tmp_known_hosts >> /etc/ssh/ssh_known_hosts
# exit
exit
$ rm ~/tmp_known_hosts

694 Chapter 17 The OpenSSH Secure Communication Utilities

If, after a remote system’s public key is stored in one of the known hosts files, the
remote system supplies a different fingerprint when the systems connect, OpenSSH
displays the following message and does not complete the connection:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@
WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
f1:6f:ea:87:bb:1b:df:cd:e3:45:24:60:d3:25:b1:0a.
Please contact your system administrator.
Add correct host key in /home/sam/.ssh/known_hosts to get rid of this message.
Offending key in /home/sam/.ssh/known_hosts:1
RSA host key for plum has changed and you have requested strict checking.
Host key verification failed.

If you see this message, you might be the subject of a man-in-the-middle attack
(page 1079). More likely, however, something on the remote system has changed, causing it to supply a new fingerprint. Check with the remote system’s administrator. If all
is well, remove the offending key from the specified file (the third line from the bottom
in the preceding message points to the line you need to remove) and try connecting
again. You can use ssh-keygen with the –R option followed by the name of a host to
remove a hashed entry. You will be subject to first-time authentication (page 692) again
as OpenSSH verifies you are connecting to the correct system. Follow the same steps
as when you initially connected to the remote host.

ssh: Logs In or Executes Commands on a Remote System
The syntax of an ssh command line is
ssh [options] [user@]host [command]
where host, the name of the OpenSSH server (the remote system) you want to connect
to, is the only required argument. The host can be a local system name, the FQDN
(page 1069) of a system on the Internet, or an IP address.
With the command ssh host, you log in on the remote system host with the same username you are using on the local system. The remote system displays a shell prompt, and
you can run commands on host. Enter the command exit to close the connection to host
and return to the local system’s prompt. Include user@ when you want to log in with a
username other than the one you are using on the local system. Depending on how the
server is set up, you might need to supply the password you use on the remote system.

Use screen after you connect to a remote system using ssh
security If you are concerned about getting disconnected during an ssh session, or if you want to be able
to continue a session from a different location, run the screen utility once you establish an ssh
connection. Then, if you get disconnected or if you detach your screen session and log out, the
session will keep running and you can reconnect when and from where you like. See page 931 for
more information on screen.

Running the ssh, scp, and sftp OpenSSH Clients

695

When you include command, ssh logs in on host, executes command, closes the
connection to host, and returns control to the local system. The remote system
never displays a shell prompt.
Opening a
remote shell

In the following example, Sam, who is logged in on guava, uses ssh to log in on plum,
gives a uname command that shows the name and type of the system, and uses exit
to close the connection to plum and return to the local system’s prompt:
[sam@guava ~]$ ssh plum
sam@plum’s password:
[sam@plum ~]$ uname -nm
plum i686
[sam@plum ~]$ exit
logout
Connection to plum closed.
[sam@guava ~]$

Running commands
remotely

The next example uses ls to list the files in the memos directory on the remote system
named plum. The example assumes the user running the command (Sam) has an
account on plum and that the memos directory is in Sam’s home directory on plum.
$ ssh plum ls memos
sam@plum’s password:
memo.0921
memo.draft
$

When you run ssh, standard output of the command run on the remote system is
passed to the local shell as though the command had been run on the local system.
As with all shell commands, you must quote special characters you do not want the
local shell to interpret.
In the next example, standard output of the ls command, which is run on the remote
system, is sent to ls.out on the local system:
$ ssh plum ls memos > ls.out
sam@plum’s password:
$ cat ls.out
memo.0921
memo.draft

In the preceding ssh command the redirect symbol (>) is not quoted, so it is interpreted
by the local shell and creates the ls.out file on the local system.
The next command is similar, but the redirect symbol is quoted, so the local shell does
not interpret it but passes it to the remote shell. The first command creates the ls.out2
file on the remote system, and the second command displays the file.
$ ssh plum
sam@plum’s
$ ssh plum
sam@plum’s
memo.0921
memo.draft

'ls memos > ls.out2'
password:
cat ls.out2
password:

696 Chapter 17 The OpenSSH Secure Communication Utilities

For the next example, assume the working directory on the local system holds a file
named memo.new. You cannot remember whether this file contains certain changes
or whether you made these changes to the file named memo.draft in the memos directory on plum. You could copy memo.draft to the local system and run diff (page 58)
on the two files, but then you would have three similar copies of the file spread across
two systems. If you are not careful about removing the old copies when you are done,
you might just become confused again in a few days. Instead of copying the file, you
can use ssh. This example shows that only the date line differs between the two files:
$ ssh plum
sam@plum’s
1c1
< Tue Dec
--> Mon Dec

cat memos/memo.draft | diff memo.new password:
4 12:06:22 PST 2012
3 14:27:06 PST 2012

In the preceding example, the output of the cat command on plum is sent through a
pipeline on the local system to diff (running on the local system), which compares the
local file memos.new to standard input (–). The following command line has the same
effect but causes diff to run on the remote system:
$ cat memo.new | ssh plum diff - memos/memo.draft
sam@plum’s password:
1c1
< Tue Dec 4 12:06:22 PST 2012
--> Mon Dec 3 14:27:06 PST 2012

Standard output from diff on the remote system is sent to the local shell, which displays
it on the screen (because it is not redirected).
Sam now decides to change the password for his sls login on plum:
$ ssh sls@plum passwd
sls@plum's password:
(current) UNIX password: por

Sam stops as soon as he sees passwd (running on plum) displaying his password: He
knows something is wrong. For passwd to work, it must run with a tty (terminal) so
it can turn off character echo (stty –echo); then it will not display passwords as the
user enters them. The –t option solves the problem by associating a pseudo-tty with
the process running passwd on the remote system:
$ ssh -t sls@plum passwd
sls@plum's password:
Changing password for user sls.
Changing password for sls.
(current) UNIX password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to plum closed.

Running the ssh, scp, and sftp OpenSSH Clients

697

The –t option is also useful when you are running a program that uses a characterbased/pseudographical interface.
Using tar

The following example uses tar (page 968) to create an archive file on a remote system
containing the contents of the working directory hierarchy on the local system. The
f – option causes tar to send its output to standard output. A pipeline sends the output
of tar running on the local system, via ssh, to dd (page 772) running on the remote
system.
$ cat buwd
#! /bin/bash
# Back up the working directory to the user's
# home directory on the remote system specified
# by $machine
# Remote system:
machine=plum
dir=$(basename $(pwd))
filename=$$.$dir.tar
echo Backing up $(pwd) to your home directory on $machine
tar -cf - . | ssh $machine "dd obs=256k of=$filename"
echo done. Name of file on $machine is $filename
$ ./buwd
Backing up /home/sam to your home directory on plum
10340+0 records in
20+1 records out
5294080 bytes (5.3 MB) copied, 0.243011 s, 21.8 MB/s
done. Name of file on plum is 26537.sam.tar

Options
This section describes some of the options you can use with ssh.
–C (compression) Enables compression (page 708). In the commercial version of ssh, –C

disables compression and +C enables compression.
–f (not foreground) Sends ssh to the background after asking for a password and before

executing the command. Useful when you want to run the command in the background
but must supply a password. Implies –n.
–L Forwards a port on the local system to a remote system. For more information refer

to “Tunneling/Port Forwarding” on page 706.
–l user (login) Attempts to log in as user.
–n (null) Redirects standard input to ssh to come from /dev/null. Required when running
ssh in the background (–f option).
–o option (option) Specifies option in the format used in configuration files (page 700).

698 Chapter 17 The OpenSSH Secure Communication Utilities
–p (port) Specifies the port on the remote host that the connection is made to. Using the

host declaration (page 701) in the configuration file, you can specify a different port
for each system you connect to.
–q (quiet) Suppresses warning and diagnostic messages.
–R Forwards a port on the remote system to the local client. For more information refer

to “Tunneling/Port Forwarding” on page 706.
–t (tty) Allocates a pseudo-tty (terminal) to the ssh process on the remote system. Without this option, when you run a command on a remote system, ssh does not allocate

a tty (terminal) to the process. Instead, it attaches standard input and standard output
of the remote process to the ssh session—which is normally, but not always, what you
want. This option forces ssh to allocate a tty on the remote system so programs that
require a tty will work.
–v (verbose) Displays debugging messages about the connection and transfer. Useful if

things are not going as expected. Specify this option up to three times to increase
verbosity.
–X (X11) Turns on nontrusted X11 forwarding. This option is not necessary if you turn

on nontrusted X11 forwarding in the configuration file. For more information refer
to “Forwarding X11” on page 706.
–x (X11) Turns off X11 forwarding.
–Y (X11trusted) Turns on trusted X11 forwarding. This option is not necessary if you

turn on trusted X11 forwarding in the configuration file. For more information refer
to “Forwarding X11” on page 706.

scp: Copies Files to and from a Remote System
The scp (secure copy) utility copies an ordinary or directory file from one system to
another over a network; both systems can be remote. This utility uses ssh to transfer
files and employs the same authentication mechanism as ssh; thus it provides the
same security as ssh. The scp utility asks for a password when one is required. The
syntax of an scp command is
scp [[user@]from-host:]source-file [[user@]to-host:][destination-file]
where from-host is the name of the system you are copying files from and to-host is
the system you are copying files to. The from-host and to-host arguments can be local
system names, FQDNs (page 1069) of systems on the Internet, or IP addresses. When
you do not specify a host, scp assumes the local system. The user on either system
defaults to the user on the local system who is giving the command; you can specify
a different user with user@.
The source-file is the file you are copying, and the destination-file is the resulting
copy. Make sure you have read permission for the file you are copying and write
permission for the directory you are copying it into. You can specify plain or
directory files as relative or absolute pathnames. (A relative pathname is relative

Running the ssh, scp, and sftp OpenSSH Clients

699

to the specified directory or to the implicit user’s home directory.) When the
source-file is a directory, you must use the –r option to copy its contents. When
the destination-file is a directory, each of the source files maintains its simple filename. When the destination-file is missing, scp assumes the user’s home directory
is the intended destination.
Suppose Sam has an alternate username, sls, on plum. In the following example, Sam
uses scp to copy memo.txt from the home directory of his sls account on plum to the
allmemos directory in the working directory on the local system. If allmemos were
not the name of a directory, memo.txt would be copied to a file named allmemos in
the working directory.
[sam@guava ~]$ scp sls@plum:memo.txt allmemos
sls@plum’s password:
memo.txt
100%

12KB

12.3KB/s

00:00

As the transfer progresses, the percentage and number of bytes transferred increase,
and the time remaining decreases.

rsync is much more versatile than scp
tip The rsync utility (page 673) is more configurable than scp and uses OpenSSH security by default.
It has many options; the most commonly used are –a and –v. The –a option causes rsync to copy
ordinary files and directories, preserving the ownership, group, permissions, and modification
times associated with the files. It usually does no harm to specify the –a option; it is frequently
helpful. The –v option causes rsync to list files as it copies them. For example, Sam could have
given the preceding command as
$ rsync -av sls@plum:memo.txt allmemos
sls@plum's password:
receiving incremental file list
memo.txt
sent 30 bytes received 87495 bytes 19450.00 bytes/sec
total size is 87395 speedup is 1.00

The rsync utility is also smarter than scp. If the destination file exists, rsync copies only those
parts of the source file that are different from the destination file. This feature can save a lot of
time when you are copying large files with few changes, such as when making backups. The
number following speedup in the output indicates how much rsync’s algorithm has speeded up
the process of copying a file. In the case of the preceding example, rsync did not speed up the
copy process because it needed to copy the whole file.

In the next example, Sam, while working from guava, copies the same file as in the
previous example to the directory named old in his home directory on speedy. For this
example to work, Sam must be able to use ssh to log in on speedy from plum without
using a password. For more information refer to “Authorized Keys: Automatic
Login” on page 702.
[sam@guava ~]$ scp sls@plum:memo.txt speedy:old
sls@plum's password:

700 Chapter 17 The OpenSSH Secure Communication Utilities

Options
This section describes some of the options you can use with scp.
–C (compression) Enables compression (page 708).
–o option (option) Specifies option in the format used in configuration files (discussed shortly).
–P port (port) Connects to port port on the remote host. This option is uppercase for scp and
lowercase for ssh.
–p (preserve) Preserves the modification and access times as well as the modes of the

original file.
–q (quiet) Does not display the progress information as scp copies a file.
–r (recursive) Recursively copies a directory hierarchy (dereferences symbolic links

[page 115]).
–v (verbose) Displays debugging messages about the connection and transfer. Useful if

things are not going as expected.

sftp: A Secure FTP Client
As part of OpenSSH, most Linux distributions provide sftp, a secure alternative to ftp
(page 819). Functionally the same as ftp, sftp maps ftp commands to OpenSSH commands. You can replace ftp with sftp when you are logging in on an OpenSSH server.
When you connect to a system using sftp, enter the command ? to display a list of
commands. Refer to the sftp man page for more information.
lftp

Most Linux distributions also offers lftp, which is more sophisticated than sftp and
supports sftp. The lftp utility provides a shell-like command syntax that has many features, including support for tab completion and the ability to run jobs in the
background. Place the following .lftprc file in your home directory to ensure lftp uses
OpenSSH to connect to a server:
$ cat ~/.lftprc
set default-protocol sftp

You can also use /etc/lftp.conf to configure lftp; see the lftp man page for more
information.

~/.ssh/config and /etc/ssh/ssh_config Configuration Files
It is rarely necessary to modify OpenSSH client configuration files. For a given user
there might be two configuration files: ~/.ssh/config (user) and /etc/ssh/ssh_config
(global). These files are read in this order and, for a given parameter, the first one
found is the one OpenSSH uses. A user can override a global parameter setting by
setting the same parameter in her user configuration file. Parameters given on the ssh
or scp command line take precedence over parameters set in either of these files.
A user’s ~/.ssh/config file should be owned by the user in whose home directory it
appears and should not be writable by anyone except the owner. This file is typically
set to mode 600 as there is no reason for anyone except its owner to be able to read it.

Running the ssh, scp, and sftp OpenSSH Clients

701

Lines in the configuration files contain declarations. Each of these declarations starts
with a keyword that is not case sensitive. Some keywords must be followed by
whitespace and one or more case-sensitive arguments. You can use the Host keyword
to cause declarations to apply to a specific system. A Host declaration applies to all
the lines between it and the next Host declaration.
Following are some of the keywords and arguments you can specify:
CheckHostIP yes | no

Identifies a remote system using the IP address in addition to a hostname from the
known_hosts file when set to yes. Set it to no to use a hostname only. Setting
CheckHostIP to yes can improve system security. Default is yes.
ForwardX11 yes | no

When set to yes, automatically forwards X11 connections over a secure channel in
nontrusted mode and sets the DISPLAY shell variable. Different Linux distributions
set this keyword (or allow it to default) to different values. Check ~/.ssh/config and
/etc/ssh/ssh_config to see which value it is set to or defaults to. Generally, if it is
allowed to default to a value, the keyword will appear set to the default value on a
commented-out line (a line that begins with a hashmark [#]). If ForwardX11Trusted
is also set to yes, the connections are made in trusted mode. Alternatively, you can use
–X on the command line to redirect X11 connections in nontrusted mode. For X11 forwarding to work, X11Forwarding must also be set to yes in the /etc/sshd_config file
on the server. For more information refer to “Forwarding X11” on page 706.
ForwardX11Trusted yes | no

Works in conjunction with ForwardX11, which must be set to yes for this keyword
to have any effect. When this keyword is set to yes and ForwardX11 is set to yes, this
keyword gives remote X11 clients full access to the original (server) X11 display.
Alternatively, you can use –Y on the command line to redirect X11 connections in
trusted mode. For X11 forwarding to work, X11Forwarding must also be set to yes
in the /etc/sshd_config file on the server. Default is no, but some Linux distributions
sets this keyword to yes. For more information refer to “Forwarding X11” on
page 706.
HashKnownHosts

Improves security by causing OpenSSH to hash hostnames and addresses in the
~/.ssh/known_hosts file when set to yes. When set to no, the hostnames and
addresses are written in cleartext. See page 693 for more information on the
known_hosts file.
Host hostnames Specifies that the following declarations, until the next Host declaration, apply only

to hosts that hostnames matches. The hostnames is a whitespace-separated list that
can include ? and * wildcards. A single * specifies all hosts. Without this keyword,
all declarations apply to all hosts.
HostbasedAuthentication yes | no

Tries rhosts authentication when set to yes. For a more secure system, set to no.
Default is no.

702 Chapter 17 The OpenSSH Secure Communication Utilities
HostKeyAlgorithms algorithms

The algorithms is a comma-separated list of algorithms the client uses in order of
preference. Choose algorithms from ssh-rsa or ssh-dss. Default is ssh-rsa,ssh-dss.
Port num Causes OpenSSH to connect to the remote system on port num. Default is 22.
StrictHostKeyChecking yes | no | ask

Determines whether and how OpenSSH adds host keys to a user’s known_hosts file.
Set this option to ask to ask whether to add a host key when connecting to a new
system, set it to no to add a host key automatically, and set it to yes to require host
keys to be added manually. The yes and ask arguments cause OpenSSH to refuse
to connect to a system whose host key has changed. For a more secure system, set
to yes or ask. Default is ask.
TCPKeepAlive yes | no

Periodically checks whether a connection is alive when set to yes. Checking causes
the ssh or scp connection to be dropped when the server crashes or the connection
dies for another reason, even if it is only temporary. This option tests the connection
at the transport (TCP) layer. Setting this parameter to no causes the client not to
check whether the connection is alive. Default is yes.
This declaration uses the TCP keepalive option, which is not encrypted and is susceptible to IP spoofing (page 1075). The ClientAliveInterval option on the server is
a nonspoofable alternative.
User name Specifies a username to use when logging in on a system. You can specify a system

with the Host declaration. This option means you do not have to enter a username
on the command line when you are using a username that differs from your username
on the local system.
VisualHostKey yes | no

Displays an ASCII art representation of the key of the remote system in addition to
displaying the hexadecimal representation of the key when set to yes. When set to no,
this declaration displays the hexadecimal key only. For more information refer to
“Randomart” on page 704. Default is no.

Authorized Keys: Automatic Login
You can configure OpenSSH so you do not have to enter a password each time you
connect to a server (remote system). To set up this feature, you need to generate a personal authentication key on the client (local system), place the public part of the key
on the server, and keep the private part of the key on the client. When you connect
to the server, it issues a challenge based on the public part of the key. OpenSSH then
uses the private part of the key to respond to this challenge. If the client provides the
appropriate response, the server logs you in.
The first step in setting up an automatic login is to generate your personal authentication keys. First check whether these authentication keys exist on the local system
(client) by looking in ~/.ssh for either id_dsa and id_dsa.pub or id_rsa and id_rsa.pub.
If one of these pairs of files is present, skip the next step (do not create a new key).

Running the ssh, scp, and sftp OpenSSH Clients
ssh-keygen

703

On the client, the ssh-keygen utility creates the public and private parts of an RSA
key. The key’s randomart image is a visual representation of the public key; see the
next page.
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/sam/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/sam/.ssh/id_rsa.
Your public key has been saved in /home/sam/.ssh/id_rsa.pub.
The key fingerprint is:
23:8f:99:2e:43:36:93:ed:c6:38:fe:4d:04:61:27:28 sam@guava
The key's randomart image is:
+--[ RSA 2048]----+
|
.+ .
|
| E .. +
|
|
. .
|
|
.
|
|
o. S
|
|
* .B .
|
|
o *+ o
|
|
=.+o
|
|
..*o .
|
+-----------------+

Replace rsa with dsa to generate DSA keys. In this example, the user pressed RETURN
in response to each query. You have the option of specifying a passphrase (10–30
characters is a good length) to encrypt the private part of the key. There is no way to
recover a lost passphrase. See the security tip on page 705 for more information
about the passphrase.
id_rsa and
id_rsa.pub

The ssh-keygen utility generates two keys: a private key or identification in ~/.ssh/id_rsa
and a public key in ~/.ssh/id_rsa.pub. No one except the owner should be able to
write to either of these files, and only the owner should be able to read from the private
key file.
You can display the local server’s RSA key fingerprint using ssh-keygen:

$ ssh-keygen -lf /etc/ssh/ssh_host_rsa_key.pub
2048 d1:9d:1b:5b:97:5c:80:e9:4b:41:9a:b7:bc:1a:ea:a1 /etc/ssh/ssh_host_rsa_key.pub (RSA)

You can also use ssh-keygen to display the fingerprint of the public key you just
created:
$ ssh-keygen -lf ~/.ssh/id_rsa.pub
2048 23:8f:99:2e:43:36:93:ed:c6:38:fe:4d:04:61:27:28 /home/sam/.ssh/id_rsa.pub (RSA)
ssh-copy-id

To log in on or copy files to and from another system without supplying a password, you must copy the ~/.ssh/id_rsa.pub from the client (local system) to a file
named ~/.ssh/authorized_keys on the server (remote system). The ssh-copy-id utility creates the ~/.ssh directory on the server if necessary, copies id_rsa.pub, and

704 Chapter 17 The OpenSSH Secure Communication Utilities

makes sure permissions are correct. The following example shows Sam setting up
automatic login on the system named plum:
$ ssh-copy-id sam@plum
sam@plum's password:
Now try logging into the machine, with "ssh 'sam@plum'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
$ ssh sam@plum
$

Sam must supply his password to copy the file to plum. After running ssh-copy-id,
Sam can log in on plum without providing a password. To make the server even more
secure, disable password authentication (see the adjacent tip). Automatic login will
fail if anyone other than the owner has permission to read from or write to the ~/.ssh
directory on the server.

Use a personal authentication key instead of a password
security Using a personal authentication key is more secure than using a password. When you turn
off password authentication on the server, brute-force authentication attacks become very
unlikely. You can disable password authentication by setting PasswordAuthentication to no in
/etc/ssh/sshd_config on the server.
Randomart

The randomart image of a system is an experimental OpenSSH ASCII visualization
of the system’s host key. This image is displayed by OpenSSH utilities, including ssh,
scp, and ssh-keygen. Its display is controlled by the VisualHostKey keyword in the
ssh_config file (page 702). With this keyword set to yes, OpenSSH displays a system’s
randomart image when you connect:
$ ssh plum
Host key fingerprint is 0b:90:12:02:97:fb:93:6e:d4:14:f8:e0:ac:f3:26:36
+--[ RSA 2048]----+
|+ o. .
|
| o..o..
|
| .+oo .
|
| ..o.o
|
|
o +. S
|
| o = .. .
|
|
= . .
|
| E =
|
| . =
|
+-----------------+
sam@plum's password:
...

The randomart image renders a system’s host key in a visual format that can be easier
to recall than a host key fingerprint (see the preceding example). Making it easier for
a user to detect a change in the fingerprint can mean that a user will be aware when
he is connecting to a system other than the one he intended to connect to.

Running the ssh, scp, and sftp OpenSSH Clients

705

ssh-agent: Holds Your Private Keys
Personal key encryption, passphrase, and ssh-agent
security The private part of your personal authentication key is kept in a file that only you can read. When
you set up automatic login on a remote system, any user who has access to your account on the
local system also has access to your account on the remote system because that user can read
the private part of your personal key. Thus, if a malicious user compromises your account or the
root account on the local system, that user then has access to your account on the remote system.
Encrypting the private part of your personal key protects the key and, therefore, restricts access to
the remote system should someone compromise your local account. However, if you encrypt your
personal key, you must supply the passphrase you used to encrypt the key each time you use the
key, negating the benefit of not having to type a password when logging in on the remote system.
You can use ssh-agent to remember the private part of your personal key for the duration of a
session; you supply your passphrase once, at the beginning of the session.

Storing private keys on a removable medium
security You can store private keys on a removable medium, such as a USB flash drive, and use your ~/.ssh
directory as the mount point for the filesystem stored on this drive. You might want to encrypt
these keys with a passphrase in case you lose the flash drive.

When you use ssh-keygen to generate a public/private key pair, you have the option of
specifying a passphrase. If you specify a passphrase, you must supply that passphrase
each time you use the key. As a result, when you set up a key pair to avoid specifying
your password when you log in on a remote system using ssh, you end up supplying
the passphrase instead.
The ssh-agent utility allows you to use a passphrase with your personal key while
entering the key one time, at the beginning of each session. When you log out,
ssh-agent forgets the key. On Linux distributions that are set up with ssh-agent
enabled for X sessions, it is very easy to use in a graphical environment. It is also
easy to set up this utility in a textual environment.
If you are running a textual session (not within a terminal emulator window), you
must give the following command to enable ssh-agent. If you are running a graphical
session, skip this step (ssh-agent is typically set up in the files in /etc/X11/xinitrc).
$ eval $(ssh-agent -s)
Agent pid 9882
ssh-add

Once ssh-agent is enabled, use ssh-add to specify the passphrase for a key pair:
$ ssh-add ~/.ssh/id_rsa
Enter passphrase for /home/sam/.ssh/id_rsa:
Identity added: /home/sam/.ssh/id_rsa (/home/sam/.ssh/id_rsa)

If you omit the argument to ssh-add , it adds a passphrase for each of the
~/.ssh/id_rsa, ~/.ssh/id_dsa, and ~/.ssh/identity files that exist.
With this setup, you can use ssh to work on a remote system without supplying a
login password and by supplying your passphrase one time per session.

706 Chapter 17 The OpenSSH Secure Communication Utilities

Tunneling/Port Forwarding
The ssh utility can forward a port (port forwarding; page 1086) through the
encrypted connection it establishes. Because the data sent across the forwarded port
uses the encrypted ssh connection as its data link layer, the term tunneling
(page 1099) is applied to this type of connection: “The connection is tunneled
through ssh.” You can secure protocols—including POP, X, IMAP, VNC, and
WWW—by tunneling them through ssh.

Forwarding X11
The ssh utility makes it easy to tunnel the X11 protocol. For X11 tunneling to work,
you must enable it on both the server and the client, and the client must be running
the X Window System. On the ssh server, enable X11 forwarding by setting the
X11Forwarding declaration to yes in the /etc/ssh/sshd_config file.

Trusted Client
On a client, enable trusted X11 forwarding by setting the ForwardX11 (page 701)
and ForwardX11Trusted (page 701) declarations to yes in the /etc/ssh/ssh_config or
~/.ssh/config file. Alternatively, you can specify the –Y option (page 698) on the command line to start the client in trusted mode.
When you enable trusted X11 forwarding on a client, the client connects as a trusted
client, which means the client trusts the server and is given full access to the X11 display.
With full access to the X11 display, in some situations a client might be able to modify
other clients of the X11 display. Make a trusted connection only when you trust the
remote system: You do not want someone tampering with your client.

Nontrusted Client
On a client, enable nontrusted X11 forwarding by setting the ForwardX11
(page 701) declaration to yes and the ForwardX11Trusted (page 701) declaration to
no in the /etc/ssh/ssh_config or ~/.ssh/config file. Alternatively, you can specify the
–X option (page 698) on the command line to start the client in nontrusted mode.
A nontrusted client is given limited access to the X11 display and cannot modify
other clients of the X11 display. Few clients work properly when they are run in nontrusted mode. If you are running an X11 client in nontrusted mode and encounter
problems, try running in trusted mode (assuming you trust the remote system).

Running ssh
With trusted X11 forwarding turned on, ssh tunnels the X11 protocol, setting the
DISPLAY environment variable on the system it connects to and forwarding the
required port. Typically you will be running from a GUI, which usually means that
you are using ssh in a terminal emulator window to connect to a remote system.

Tunneling/Port Forwarding 707

When you give an X11 command from an ssh prompt, OpenSSH creates a new secure
channel that carries the X11 data, and the graphical output from the X11 program
appears on the screen. Typically you will need to start the client in trusted mode.
[sam@guava ~]$ ssh plum
[sam@plum ~]$ echo $DISPLAY
localhost:10.0

By default, ssh uses X Window System display numbers 10 and higher (port numbers
6010 and higher) for forwarded X sessions. After you connect to a remote system
using ssh, you can give a command to run an X application. The application will then
run on the remote system with its display appearing on the local system, such that it
appears to run locally.

Port Forwarding
You can forward arbitrary ports using the –L and –R options. The –L option forwards a local port to a remote system, so a program that tries to connect to the
forwarded port on the local system transparently connects to the remote system. The
–R option does the reverse: It forwards remote ports to the local system. The –N
option, which prevents ssh from executing remote commands, is generally used with
–L and –R. When you specify –N, ssh works only as a private network to forward
ports. An ssh command line using the –L or –R option has the following syntax:
$ ssh –N –L | –R local-port:remote-host:remote-port target
where local-port is the number of the local port that is being forwarded to or from
remote-host, remote-host is the name or IP address of the system that local-port gets
forwarded to or from, remote-port is the number of the port on remote-host that is
being forwarded from or to the local system, and target is the name or IP address of
the system ssh connects to.
As an example, assume there is a POP mail client on the local system and that the
POP server is on a remote network, on a system named pophost. POP is not a secure
protocol; passwords are sent in cleartext each time the client connects to the server.
You can make it more secure by tunneling POP through ssh (POP-3 connects on port
110; port 1550 is an arbitrary port on the local system):
$ ssh -N -L 1550:pophost:110 pophost

After giving the preceding command, you can point the POP client at localhost:1550.
The connection between the client and the server will then be encrypted. (When you
set up an account on the POP client, specify the location of the server as localhost,
port 1550; details vary with different mail clients.)

Firewalls
In the preceding example, remote-host and target were the same system. However,
the system specified for port forwarding (remote-host) does not have to be the same
as the destination of the ssh connection (target). As an example, assume the POP

708 Chapter 17 The OpenSSH Secure Communication Utilities

server is behind a firewall and you cannot connect to it via ssh. If you can connect
to the firewall via the Internet using ssh, you can encrypt the part of the connection
over the Internet:
$ ssh -N -L 1550:pophost:110 firewall

Here remote-host (the system receiving the port forwarding) is pophost, and target
(the system that ssh connects to) is firewall.
You can also use ssh when you are behind a firewall (that is running sshd) and want
to forward a port into your system without modifying the firewall settings:
$ ssh -R 1678:localhost:80 firewall

The preceding command forwards connections from the outside to port 1678 on the
firewall to the local Web server. Forwarding connections in this manner allows you
to use a Web browser to connect to port 1678 on the firewall when you connect to
the Web server on the local system. This setup would be useful if you ran a Webmail
program on the local system because it would allow you to check your mail from
anywhere using an Internet connection.

Compression
Compression, which is enabled with the –C option, can speed up communication over
a low-bandwidth connection. This option is commonly used with port forwarding.
Compression can increase latency to an extent that might not be desirable for an X
session forwarded over a high-bandwidth connection.

Chapter Summary
OpenSSH is a suite of secure network connectivity tools that encrypts all traffic,
including passwords, thereby helping to thwart malicious users who might otherwise eavesdrop, hijack connections, and steal passwords. An OpenSSH server
accepts connections from clients including ssh (runs a command on or logs in on
another system), scp (copies files to and from another system), and sftp (securely
replaces ftp). Helper programs including ssh-keygen (creates, manages, and converts
authentication keys), ssh-agent (manages keys during a session), and ssh-add (works
with ssh-agent) create and manage authentication keys.
To ensure secure communications, when an OpenSSH client opens a connection, it
verifies that it is connected to the correct server. Then OpenSSH encrypts communication between the systems. Finally, OpenSSH makes sure the user is authorized to
log in on or copy files to and from the server. You can secure many protocols—
including POP, X, IMAP, VNC, and WWW—by tunneling them through ssh.
When it is properly set up, OpenSSH also enables secure X11 forwarding. With this
feature, you can run a graphical program securely on a remote system and have the
display appear on the local system.

Advanced Exercises

Exercises
1. What is the difference between the scp and sftp utilities?
2. How can you use ssh to find out who is logged in on a remote system?
3. How would you use scp to copy your ~/.bashrc file from the system named
plum to the local system?
4. How would you use ssh to run xterm on plum and show the display on the
local system?
5. Which problem can enabling compression present when you are using ssh
to run remote X applications on a local display?
6. When you try to connect to a remote system using an OpenSSH client and
you see a message warning you that the remote host identification has
changed, what has happened? What should you do?

Advanced Exercises
7. Which scp command would you use to copy your home directory from
plum to the local system?
8. Which single command could you give to log in as root on the remote system named plum, if plum has remote root logins disabled? Assume you are
running the command as a user without root privileges, you have not set up
an automatic login for ssh, and you have the same username on both
systems.
9. How could you use ssh to compare the contents of the ~/memos directories
on plum and the local system?
10. How would you use rsync with OpenSSH authentication to copy the
memos12 file from the working directory on the local system to your home
directory on plum? How would you copy the memos directory from the
working directory on the local system to your home directory on plum and
cause rsync to display each file as it copied the file?

709

VI
VI
VI

Command Reference
Command Reference

The following tables list the utilities covered in this part of the book grouped by
function and alphabetically within function. Although most of these are true utilities
(programs that are separate from the shells), some are built into the shells (shell builtins). The sample utility on page 719 shows the format of the description of each
utility in this part of the book.

Utilities That Display and Manipulate Files
aspell

Checks a file for spelling errors—page 721

bzip2

Compresses or decompresses files—page 732

cat

Joins and displays files—page 735

cmp

Compares two files—page 748

comm

Compares sorted files—page 750

cp

Copies files—page 754

cpio

Creates an archive, restores files from an archive, or copies a directory
hierarchy—page 758

cut

Selects characters or fields from input lines—page 766

dd

Converts and copies a file—page 772

diff

Displays the differences between two text files—page 777

ditto

Copies files and creates and unpacks archives—page 785 O

emacs

Editor—page 213

expand

Converts TABs to SPACEs—page 796

find

Finds files based on criteria—page 804

fmt

Formats text very simply—page 812

grep

Searches for a pattern in files—page 833

gzip

Compresses or decompresses files—page 838

head

Displays the beginning of a file—page 841

join

Joins lines from two files based on a common field—page 843

less

Displays text files, one screen at a time—page 852

ln

Makes a link to a file—page 856

lpr

Sends files to printers—page 858

714 Command Reference

ls

Displays information about one or more files—page 861

man

Displays documentation for utilities—page 875

mkdir

Creates a directory—page 886

mv

Renames or moves a file—page 890

nl

Numbers lines from a file—page 894

od

Dumps the contents of a file—page 897

open

Opens files, directories, and URLs—page 901 O

otool

Displays object, library, and executable files—page 903 O

paste

Joins corresponding lines from files—page 905

pax

Creates an archive, restores files from an archive, or copies a directory
hierarchy—page 907

plutil

Manipulates property list files—page 913 O

pr

Paginates files for printing—page 915

printf

Formats string and numeric data—page 917

rm

Removes a file (deletes a link)—page 926

rmdir

Removes directories—page 928

sed

Edits a file noninteractively—page 653

sort

Sorts and/or merges files—page 942

split

Divides a file into sections—page 951

strings

Displays strings of printable characters from files—page 959

tail

Displays the last part (tail) of a file—page 965

tar

Stores or retrieves files to/from an archive file—page 968

touch

Creates a file or changes a file’s access and/or modification time—page 985

unexpand

Converts SPACEs to TABs—page 796

uniq

Displays unique lines from a file—page 996

vim

Editor—page 159

wc

Displays the number of lines, words, and bytes in one or more
files—page 1000

Network Utilities
curlftps

Mounts a directory on an FTP server as a local directory—page 954

ftp

Transfers files over a network—page 819

rsync

Securely copies files and directory hierarchies over a network—page 673

Command Reference

715

scp

Securely copies one or more files to or from a remote system—pages 691
and 698

ssh

Securely runs a program or opens a shell on a remote system—pages 691
and 694

sshfs

Mounts a directory on an OpenSSH server as a local directory—page 954

telnet

Connects to a remote computer over a network—page 974

Utilities That Display and Alter Status
cd

Changes to another working directory—page 737

chgrp

Changes the group associated with a file—page 739

chmod

Changes the access mode (permissions) of a file—page 741

chown

Changes the owner of a file and/or the group the file is associated
with—page 746

date

Displays or sets the system time and date—page 769

df

Displays disk space usage—page 775

dmesg

Displays kernel messages—page 787

dscl

Displays and manages Directory Service information—page 788 O

du

Displays information on disk usage by directory hierarchy and/or
file—page 791

file

Displays the classification of a file—page 802

finger

Displays information about users—page 810

GetFileInfo

Displays file attributes—page 831 O

kill

Terminates a process by PID—page 846

killall

Terminates a process by name—page 848

nice

Changes the priority of a command—page 892

nohup

Runs a command that keeps running after you log out—page 896

ps

Displays process status—page 921

renice

Changes the priority of a process—page 925

SetFile

Sets file attributes—page 938 O

sleep

Creates a process that sleeps for a specified interval—page 940

stat

Displays information about files—page 957

stty

Displays or sets terminal parameters—page 960

716 Command Reference

sysctl

Displays and alters kernel variables at runtime—page 964

top

Dynamically displays process status—page 981

umask

Specifies the file-creation permissions mask—page 994

w

Displays information about local system users—page 998

which

Shows where in PATH a utility is located—page 1001

who

Displays information about logged-in users—page 1003

Utilities That Are Programming Tools
awk

Searches for and processes patterns in a file—page 619

configure

Configures source code automatically—page 752

gawk

Searches for and processes patterns in a file—page 619

gcc

Compiles C and C++ programs—page 826

make

Keeps a set of programs current—page 869

mawk

Searches for and processes patterns in a file—page 619

perl

Scripting language—page 517

python

Programming language—page 563

Miscellaneous Utilities
at

Executes commands at a specified time—page 725

busybox

Implements many standard utilities—page 729

cal

Displays a calendar—page 734

crontab

Maintains crontab files—page 763

diskutil

Checks, modifies, and repairs local volumes—page 782 O

echo

Displays a message—page 794

expr

Evaluates an expression—page 798

fsck

Checks and repairs a filesystem—page 814

launchctl

Controls the launchd daemon—page 850 O

mc

Manages files in a textual environment (aka Midnight Commander)—page 879

mkfs

Creates a filesystem on a device—page 887

Command Reference

717

screen

Manages several textual windows—page 931

tee

Copies standard input to standard output and one or more files—page 973

test

Evaluates an expression—page 978

tr

Replaces specified characters—page 987

tty

Displays the terminal pathname—page 990

tune2fs

Changes parameters on an ext2, ext3, or ext4 filesystem—page 991

xargs

Converts standard input to command lines—page 1005

Standard Multiplicative Suffixes
Some utilities allow you to use the suffixes listed in Table VI-1 following a byte count.
You can precede a multiplicative suffix with a number that is a multiplier. For example,
5K means 5 × 210. The absence of a multiplier indicates that the multiplicative suffix is
to be multiplied by 1. This text identifies utilities that accept these suffixes.

Table VI-1

Multiplicative suffixes

Suffix

Multiplicative value

Suffix

Multiplicative value

KB

1,000 (103)

PB

1015

K

1,024 (210)

P

250

MB

1,000,000 (106)

EB

1018

M

1,048,576 (220)

E

260

GB

1,000,000,000 (109)

ZB

1021

G

1,073,741,824 (230)

Z

270

TB

1012

YB

1024

T

240

Y

280

For example, the following command uses dd (page 772) to create a 2-megabyte (2
× 106 bytes) file of random values. It uses the MB multiplicative suffix following a
multiplier of 2 as part of the count argument. The ls utility shows the size of the
resulting file. It uses the –h (human-readable) option (see the tip on page 129) to
display a file size of 2.0M instead of the less readable 2000000 (bytes).
$ dd if=/dev/urandom of=randf bs=1 count=2MB
2000000+0 records in
2000000+0 records out
2000000 bytes (2.0 MB) copied, 20.5025 s, 97.5 kB/s
$ ls -lh randf
-rw-r--r--. 1 sam pubs 2.0M 04-10 15:42 randf

Give the command info coreutils Block size for more information.

718 Command Reference
BLOCKSIZE

Under Mac OS X, some utilities use the BLOCKSIZE environment variable to set a
default block size. You can set BLOCKSIZE to a value that is a number of bytes or
to a value that uses one of the K, M, or G suffixes. The text identifies utilities that
use BLOCKSIZE.

Common Options
Several GNU utilities share the options listed in Table VI-2. This text identifies the
utilities that accept these options.

Table VI-2

Common command-line options

Option

Effect

–

A single hyphen appearing in place of a filename instructs the utility to accept
input from standard input in place of a file.

––

A double hyphen marks the end of the options on a command line. You can
follow this option with an argument that begins with a hyphen. Without this
option the utility assumes an argument that begins with a hyphen is an option.

––help

Displays a help message for the utility. Some of these messages are quite long;
you can use a pipeline to send the output through less to display it one screen
at a time. For example, you could give the command ls ––help | less.
Alternatively, you can send the output through a pipeline to grep if you are
looking for specific information. For example, you could give the following
command to get information on the –d option to ls: ls ––help | grep –– –d.
See the preceding entry in this table for information on the double hyphen.

––version

Displays version information for the utility.

The sample Utility
The following description of the sample utility shows the format that this part of the
book uses to describe the utilities. These descriptions are similar to the man page
descriptions (pages 33 and 875); however, most users find the descriptions in this
book easier to read and understand. The descriptions emphasize the most useful features of the utilities and often leave out the more obscure features. For information
about the less commonly used features, refer to the man and info pages or call the utility with the ––help option, which works with many utilities.

sample O  OS X in an oval indicates this utility runs under Mac OS X only.
Brief description of what the utility does.
sample [options] arguments
Following the syntax line is a description of the utility. The syntax line shows how to
run the utility from the command line. Options and arguments enclosed in brackets
([]) are not required. Enter words that appear in this italic typeface as is. Words that
you must replace when you enter the command appear in this bold italic typeface.
Words listed as arguments to a command identify single arguments (for example,
source-file) or groups of similar arguments (for example, directory-list). A note here
indicates if the utility runs under Linux or OS X only. O

Arguments

This section describes the arguments you can use when you run the utility. The argument,
as shown in the preceding syntax line, is printed in this bold italic typeface.

Options

This section lists some of the options you can use with the command. Unless otherwise
specified, you must precede options with one or two hyphens. Most commands accept
a single hyphen before multiple options (page 128). Options in this section are ordered
alphabetically by short (single-hyphen) options. If an option has only a long version
(two hyphens), it is ordered by its long option. Following are some sample options:

––delimiter=dchar

–d dchar
This option includes an argument. The argument is set in a bold italic typeface
in both the heading and the description. You substitute another word (filename,
string of characters, or other value) for any arguments shown in this typeface.
Type characters that are in bold type (such as the ––delimiter and –d) as is.
––make-dirs –m This option has a long and a short version. You can use either option; they are

equivalent. This option description ends with Linux in a box, indicating it is
available under Linux only. Options not followed by Linux or OS X are available
under both operating systems. L
–t (table of contents) This simple option is preceded by a single hyphen and not
followed by arguments. It has no long version. The table of contents appearing
in parentheses at the beginning of the description is a cue, suggestive of what
the option letter stands for. This option description ends with OS X in a box,
indicating it is available under OS X only. Options not followed by Linux or
OS X are available under both operating systems. O

Discussion

This optional section describes how to use the utility and identifies any quirks it
might have.

sample O

sample O 719

720 sample O

Notes

This section contains miscellaneous notes—some important and others merely
interesting.

Examples

This section contains examples illustrating how to use the utility. This section is a
tutorial, so it takes a more casual tone than the preceding sections of the description.

aspell 721

Checks a file for spelling errors
aspell check [options] filename
aspell list [options] < filename
aspell config
aspell help
The aspell utility checks the spelling of words in a file against a standard dictionary. You can use aspell
interactively: It displays each misspelled word in context, together with a menu that gives you the choice
of accepting the word as is, choosing one of aspell’s suggested replacements for the word, inserting the
word into your personal dictionary, or replacing the word with one you enter. You can also use aspell in
batch mode so it reads from standard input and writes to standard output. The aspell utility is available
under Linux only. L

aspell is not like other utilities regarding its input
tip Unlike many utilities, aspell does not accept input from standard input when you do not specify
a filename on the command line. Instead, the action specifies where aspell gets its input.

Actions

You must specify exactly one action when you run aspell.

check

–c Runs aspell as an interactive spell checker. Input comes from a single file named
on the command line. Refer to “Discussion” on page 722.

config

Displays aspell’s configuration, both default and current values. Send the output
through a pipeline to less for easier viewing, or use grep to find the option you
are looking for (for example, aspell config | grep backup).

help

–? Displays an extensive page of help. Send the output through a pipeline to less
for easier viewing.

list

–l Runs aspell in batch mode (noninteractively) with input coming from standard
input and output going to standard output.

Arguments

The filename is the name of the file you want to check. The aspell utility accepts this
argument only when you use the check (–c) action. With the list (–l) action, input
must come from standard input.

Options

The aspell utility has many options. The more commonly used ones are listed in this
section; see the aspell man page for a complete list. Default values of many options
are determined when aspell is compiled (see the config action).
You can specify options on the command line, as the value of the ASPELL_CONF
environment variable, or in your personal configuration file (~/.aspell.conf). A user

aspell

aspell

722 aspell

working with root privileges can create a global configuration file (/etc/aspell.conf).
Put one option per line in a configuration file; separate options with a semicolon (;)
in ASPELL_CONF. Options on the command line override those in ASPELL_CONF,
which override those in your personal configuration file, which override those in the
global configuration file.
This section lists two types of options: Boolean and value. The Boolean options turn
a feature on (enable the feature) or off (disable the feature). Precede a Boolean option
with dont– to turn it off. For example, ––ignore-case turns the ignore-case feature on
and ––dont-ignore-case turns it off.
Value options assign a value to a feature. Follow the option with an equal sign and
a value—for example, ––ignore=4.
For all options in a configuration file or in the ASPELL_CONF variable, omit the
leading hyphens (ignore-case or dont-ignore-case).

aspell options and leading hyphens
caution The way you specify options differs depending on whether you are specifying them on the command line, using the ASPELL_CONF shell variable, or in a configuration file.
On the command line, prefix long options with two hyphens (for example, ––ignore-case or
––dont-ignore-case). In ASPELL_CONF and configuration files, omit the leading hyphens (for
example, ignore-case or dont-ignore-case).
––dont-backup
––ignore=n

Does not create a backup file named filename.bak (default is ––backup when
action is check).
Ignores words with n or fewer characters (default is 1).

––ignore-case

Ignores the case of letters in words being checked (default is ––dont-ignorecase).

––lang=cc

Specifies the two-letter language code (cc; page 317). The language code
defaults to the value of LC_MESSAGES (page 317).

––mode=mod

Specifies a filter to use. Select mod from url (default), none, sgml, and others.
The modes work as follows: url skips URLs, hostnames, and email addresses;
none turns off all filters; and sgml skips SGML, HTML, XHTML, and XML
commands.

––strip-accents

Removes accent marks from all the words in the dictionary before checking
words (default is ––dont-strip-accents).

Discussion

The aspell utility has two modes of operation: batch and interactive. You specify batch
mode by using the list or –l action. In batch mode, aspell accepts the document you want
to check for spelling errors as standard input and sends the list of potentially misspelled
words to standard output.

aspell 723

You specify interactive mode by using the check or –c action. In interactive mode,
aspell displays a screen with the potentially misspelled word highlighted in context
and a menu of choices. See “Examples” for an illustration. The menu includes various
commands (Table VI-3) as well as some suggestions of similar, correctly spelled words.
You either enter one of the numbers from the menu to select a suggested word to
replace the word in question or enter a letter to give a command.

Table VI-3

Notes

aspell commands

Command

Action

SPACE

Takes no action and goes on to the next misspelled word.

n

Replaces the misspelled word with suggested word number n.

a

Adds the “misspelled” word to your personal dictionary.

b

Aborts aspell; does not save changes.

i or I (letter “i”)

Ignores the misspelled word. I (uppercase “I”) ignores all occurrences of this
word; i ignores this occurrence only and has the same effect as SPACE.

l (lowercase “l”)

Changes the “misspelled” word to lowercase and adds it to your personal
dictionary.

r or R

Replaces the misspelled word with the word you enter at the bottom of the
screen. R replaces all occurrences of this word; r replaces this occurrence only.

x

Saves the file as corrected so far and exits from aspell.

For more information refer to the aspell man page, the aspell home page (aspell.net),
and the /usr/share/doc/aspell directory.
The aspell utility is not a foolproof way of finding spelling errors. It also does not
check for misused, properly spelled words (such as red instead of read).

Spelling from
emacs

You can make it easy to use aspell from emacs by adding the following line to your
~/.emacs file (page 257). This line causes emacs’ ispell functions to call aspell:
(setq-default ispell-program-name "aspell")

Spelling from vim

Similarly, you can make it easy to use aspell from vim by adding the following line to
your ~/.vimrc file (page 195):
map ^T :w!:!aspell check %:e! %

When you enter this line in ~/.vimrc using vim, enter the ^T as CONTROL-V CONTROL-T
(page 179). With this line in ~/.vimrc, while you are editing a file using vim, CONTROL-T
brings up aspell to spell check the file you are editing.

724 aspell

Examples

The following examples use aspell to correct the spelling in the memo.txt file:
$ cat memo.txt
Here's a document for teh aspell utilitey
to check. It obviosly needs proofing
quiet badly.

The first example uses aspell with the check action and no options. The appearance
of the screen for the first misspelled word, teh, is shown. At the bottom of the screen
is the menu of commands and suggested words. Each of the numbered words differs
slightly from the misspelled word:
$ aspell check memo.txt
Here's a document for teh aspell utilitey
to check. It obviosly needs proofing
quiet badly.
============================================================
1) the
6) th
2) Te
7) tea
3) tech
8) tee
4) Th
9) Ted
5) eh
0) tel
i) Ignore
I) Ignore all
r) Replace
R) Replace all
a) Add
l) Add Lower
b) Abort
x) Exit
============================================================
?

Enter one of the menu choices in response to the preceding display; aspell will do your
bidding and move the highlight to the next misspelled word (unless you choose to
abort or exit). In this case, entering 1 (one) would change teh to the in the file.
The next example uses the list action to display a list of misspelled words. The word
quiet is not in the list—it is not properly used but is properly spelled.
$ aspell list < memo.txt
teh
aspell
utilitey
obviosly

The last example also uses the list action. It shows a quick way to check the spelling
of a word or two using a single command. The user gives the aspell list command and
then enters seperate temperature into aspell’s standard input (the keyboard). After the
user presses RETURN and CONTROL-D (to indicate the EOF or end of file), aspell writes the
misspelled word to standard output (the screen):
$ aspell list
seperate temperatureRETURN
CONTROL-D

seperate

at

725

at
at [options] time [date | +increment]
atq
atrm job-list
batch [options] [time]
The at and batch utilities execute commands at a specified time. They accept commands from standard
input or, with the –f option, from a file. Commands are executed in the same environment as the at
or batch command. Unless redirected, standard output and standard error from commands are
emailed to the user who ran the at or batch command. A job is the group of commands that is executed
by one call to at. The batch utility differs from at in that it schedules jobs so they run when the CPU
load on the system is low.
The atq utility displays a list of queued at jobs; atrm cancels pending at jobs.

Arguments

The time is the time of day when at runs the job. You can specify the time as a one-,
two-, or four-digit number. One- and two-digit numbers specify an hour, and four-digit
numbers specify an hour and minute. You can also give the time in the form hh:mm.
The at utility assumes a 24-hour clock unless you place am or pm immediately after
the number, in which case it uses a 12-hour clock. You can also specify time as now,
midnight, noon, or teatime (4:00 PM).
The date is the day of the week or day of the month when at runs the job. When you
do not specify a day, at executes the job today if the hour you specify in time is greater
than the current hour. If the hour is less than the current hour, at executes the job
tomorrow.
You specify a day of the week by spelling it out or abbreviating it to three letters. You
can also use the words today and tomorrow. Use the name of a month followed by
the number of the day in the month to specify a date. You can follow the month and
day number with a year.
The increment is a number followed by one of the following (it accepts both plural
and singular): minutes, hours, days, or weeks. The at utility adds the increment to
time. You cannot specify an increment for a date.
When using atrm, job-list is a list of one or more at job numbers. You can list job
numbers by giving the command at –l or atq.

Options

The batch utility accepts options under OS X only. The at utility does not accept the
–c, –d, and –l options when you initiate a job using at; use these options to determine
the status of a job or to cancel a job only.

at

Executes commands at a specified time

726 at

–c job-list
(cat) Displays the environment and commands specified by the job numbers in
job-list.
–d job-list
(delete) Cancels jobs that you submitted using at. The job-list is a list of one or
more at job numbers to cancel. If you do not remember the job number, use the
–l option or run atq to list your jobs and their numbers. Using this option with
at has the same effect as running atrm. This option is deprecated under OS X;
use the –r option instead.
–f file
(file) Specifies that commands come from file instead of standard input. This
option is useful for long lists of commands or commands that are executed
repeatedly.
–l (list) Displays a list of your at jobs. Using this option with at has the same effect
as running atq.
–m (mail) Sends you email after a job is run, even when nothing is sent to standard
output or standard error. When a job generates output, at always emails it to
you, regardless of this option.
–r job-list
(remove) Same as the –d option. O

Notes

The at utility uses /bin/sh to execute commands. Under Linux, this file is typically a
link to bash or dash.
The shell saves the environment variables and the working directory at the time you
submit an at job so they are available when at executes commands.

at.allow and
at.deny

A user running with root privileges can always use at. The Linux /etc/at.allow (OS
X uses /var/at/at.allow) and Linux /etc/at.deny (OS X uses /var/at/at.deny) files,
which should be readable and writable by root only (600 permissions), control which
ordinary, local users can use at. When at.deny exists and is empty, all users can use
at. When at.deny does not exist, only those users listed in at.allow can use at. Users
listed in at.deny cannot use at unless they are also listed in at.allow.
Under Linux, jobs you submit using at are run by the atd daemon. This daemon stores
jobs in /var/spool/at or /var/spool/cron/atjobs and stores their output in
/var/spool/at/spool or /var/spool/cron/atspool. These files are set to mode 700 and
owned by the user named daemon or the user who ran the job.
Under Mac OS X, jobs you submit using at are run by atrun, which is called every 30
seconds by launchd. The atrun utility stores jobs in /var/at/jobs and stores their output in /var/at/spool, both of which are set to mode 700 and owned by the user named
daemon.

at

727

Under OS X 10.4 and above, the atrun daemon is disabled by default. Working with
root privileges, you can enable and disable atrun using the following commands:
# launchctl load -w /System/Library/LaunchDaemons/com.apple.atrun.plist
# launchctl unload -w /System/Library/LaunchDaemons/com.apple.atrun.plist

See launchctl (page 850) for more information.

Examples

You can use any of the following techniques to paginate and print long_file tomorrow
at 2:00 AM. The first example executes the command directly from the command line;
the last two examples use the pr_tonight file, which contains the necessary command,
and execute that command using at. Prompts and output from different versions of
at differ.
$ at 2am
at> pr long_file | lpr
at>CONTROL-D 
job 8 at Thu Apr 5 02:00:00 2012
$ cat pr_tonight
#!/bin/bash
pr long_file | lpr
$ at -f pr_tonight 2am
job 9 at Thu Apr 5 02:00:00 2012
$ at 2am < pr_tonight
job 10 at Thu Apr 5 02:00:00 2012

If you execute commands directly from the command line, you must indicate the end
of the commands by pressing CONTROL-D at the beginning of a line. After you press CONTROL-D, at displays a line that begins with job followed by the job number and the time
at will execute the job.
atq

If you run atq after the preceding commands, it displays a list of jobs in its queue:
$ atq
8
9
10

atrm

Thu Apr
Thu Apr
Thu Apr

5 02:00:00 2012 a sam
5 02:00:00 2012 a sam
5 02:00:00 2012 a sam

The following command removes job number 9 from the queue:
$ atrm 9
$ atq
8
Thu Apr
10
Thu Apr

5 02:00:00 2012 a sam
5 02:00:00 2012 a sam

The next example executes cmdfile at 3:30 PM (1530 hours) one week from today:
$ at -f cmdfile 1530 +1 week
job 12 at Wed Apr 11 15:30:00 2012

728 at

The next at command executes a job at 7:00 PM on Thursday. This job uses find to
create an intermediate file, redirects the output sent to standard error, and prints
the file.
$ at 7pm Thursday
at> find / -name "core" -print >report.out 2>report.err
at> lpr report.out
at>CONTROL-D 
job 13 at Thu Apr 5 19:00:00 2012

The final example shows some of the output generated by the –c option when at is
queried about the preceding job. Most of the lines show the environment; the last few
lines execute the commands as a Here document (page 450):
$ at -c 13
#!/bin/sh
# atrun uid=1000 gid=1400
# mail sam 0
umask 22
HOSTNAME=guava; export HOSTNAME
SHELL=/bin/bash; export SHELL
HISTSIZE=1000; export HISTSIZE
USER=sam; export USER
MAIL=/var/spool/mail/sam; export MAIL
PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/hom
/sam/.local/bin:/home/sam/bin; export PATH
PWD=/home/sam; export PWD
...
cd /home/sam || {
echo 'Execution directory inaccessible' >&2
exit 1
}
${SHELL:-/bin/sh} << 'marcinDELIMITER3b59900b'
find / -name "core" -print >report.out 2>report.err
lpr report.out
marcinDELIMITER3b59900b

busybox

729

Implements many standard utilities
busybox [applet] [arguments]
busybox ––list | ––list-full
applet [arguments]
The busybox utility incorporates the functionality of many standard Linux utilities, called applets
(page 1055), within a single utility.

Arguments

The busybox utility runs applet with optional arguments. When called without an
applet, it displays a usage message that lists the applets it incorporates. See “Notes”
for a discussion of typical usage.

Options

The busybox utility accepts two options, each of which displays a list of the applets
it incorporates. Most of the applets support a ––help option that displays a list of
options that applet supports.

––list
––list-full

Notes

Displays a list of applets you can run from busybox.
Displays a list of the absolute pathnames of applets you can run from busybox.
The busybox utility (busybox.net) combines tiny versions of approximately 200
Linux utilities into a single utility. It is called a multicall binary because you can call
it many different ways (you can call busybox as any of the utilities it incorporates). In
this context, the included utilities are called applets.
Because of the size of busybox, its applets have fewer options than the original GNU utilities. The utility was written to be small, use few resources, and be easily customized.
Because running a Linux utility requires several kilobytes of overhead, incorporating
many utilities in a single executable file can save disk space and system memory.
With its small size and completeness, busybox is used primarily in embedded systems
and as an emergency shell under Linux. When a Linux system cannot boot properly,
it will typically drop into busybox so you can repair the system. This utility runs in
several environments, including Linux, OS X, Android, and FreeBSD.
Although you can run busybox from a shell, it is typically run as a shell itself. In the
latter case, you run a busybox applet by typing the name of the applet and options
for that applet (you do not type the word busybox).
When busybox is run from a shell, each of the applets it supports is typically
linked to busybox so you can type the name of the applet and options for that

busybox

busybox

730 busybox

applet without typing the word busybox. With this setup, the /bin directory
might look like this:
$ ls -l /bin
lrwxrwxrwx
lrwxrwxrwx
lrwxrwxrwx
lrwxrwxrwx
lrwxrwxrwx
lrwxrwxrwx
-rwxr-xr-x
lrwxrwxrwx
-rwxr-xr-x
lrwxrwxrwx
...

1
1
1
1
1
1
2
1
1
1

admin
admin
admin
admin
admin
admin
admin
admin
admin
admin

administ
administ
administ
administ
administ
administ
administ
administ
administ
administ

7
7
7
7
7
2
451992
7
95264
7

Mar
Mar
Mar
Mar
Mar
Mar
Mar
Mar
Mar
Mar

1
1
1
1
1
1
1
1
1
1

16:34
16:34
16:34
16:34
16:34
16:34
16:18
16:34
16:19
16:34

[ -> busybox
addgroup -> busybox
adduser -> busybox
ash -> busybox
awk -> busybox
bash -> sh
busybox
bzcat -> busybox
bzip2
cat -> busybox

If you install busybox, you will have to enter busybox commands as you would any
other commands: beginning with the name of the utility you want to run (busybox).
If busybox has access to the system version of a utility, it will use that version before it
uses its internal version. You can use the busybox which (page 68) utility to determine
which version of a utility busybox will run. The following example shows that busybox
will use its internal version of ls but the system version of cat:
$ busybox which ls
$ busybox which cat
/bin/cat

The busybox utility is typically set up on an embedded system (e.g., a router) so the name
of each utility is a link to busybox. When configured in this manner, you can run the
busybox command by simply typing the name of the command you want to run. You
can see how this setup works by giving the following commands. The first command
links ls in the working directory to busybox [$(which busybox) uses command substitution to return the absolute pathname of the busybox utility]. The second command
executes busybox through the ls link, running the busybox version of the ls utility.
$ ln -s $(which busybox) ls
$ ./ls
...

Examples

When you call busybox without any arguments, it displays information about itself.
In the following output, busybox uses the term function in place of applet.
$ busybox
BusyBox v1.18.2 (2011-08-26 21:31:48 UTC) multi-call binary.
Copyright (C) 1998-2009 Erik Andersen, Rob Landley, Denys Vlasenko
and others. Licensed under GPLv2.
See source distribution for full notice.
Usage: busybox [function] [arguments]...
or: busybox --list[-full]
or: function [arguments]...

busybox

731

BusyBox is a multi-call binary that combines many common Unix
utilities into a single executable. Most people will create a
link to busybox for each function they wish to use and BusyBox
will act like whatever it was invoked as.
Currently defined functions:
[, [[, acpid, add-shell, addgroup, adduser, adjtimex, ar,
arp, arping, ash, awk, base64, basename, beep, blkid,
blockdev, bootchartd, brctl, bunzip2, bzcat, bzip2, cal, cat,
catv, chat, chattr, chgrp, chmod, chown, chpasswd, chpst,
chroot, chrt, chvt, cksum, clear, cmp, comm, cp, cpio, crond,
...

You can use the ––help option to display information about most busybox applets:
$ busybox ar --help
BusyBox v1.18.2 (2011-08-26 21:31:48 UTC) multi-call binary.
Usage: ar [-o] [-v] [-p] [-t] [-x] ARCHIVE FILES
Extract or list FILES from an ar archive
Options:
-o
-p
-t
-x
-v

Preserve original dates
Extract to stdout
List
Extract
Verbose

When busybox is installed as a stand-alone utility, a command must start with the
word busybox followed by the name of the applet you want busybox to run:
$ busybox ls -l
-rw-rw-r-1 sam
-rw-rw-r-1 sam

sam
sam

8445 Feb
16890 Feb

9 17:09 memo1
9 17:09 memo2

If you are running a busybox shell, you can just enter the same commands you would
use if you were running bash or tcsh. You can invoke a busybox shell by giving the
command busybox sh.
$ busybox sh
BusyBox v1.18.4 (Ubuntu 1:1.18.4-2ubuntu2) built-in shell (ash)
Enter 'help' for a list of built-in commands.
~ $ ls -l
-rw-rw-r--rw-rw-r--

1 sam
1 sam

sam
sam

8445 Feb
16890 Feb

9 17:09 memo1
9 17:09 memo2

732 bzip2

bzip2
bzip2

Compresses or decompresses files
bzip2 [options] [file-list]
bunzip2 [options] [file-list]
bzcat [options] [file-list]
bzip2recover [file]
The bzip2 utility compresses files, bunzip2 restores files compressed using bzip2, and bzcat displays files
compressed with bzip2.

Arguments

The file-list is a list of one or more ordinary files (no directories) that are to be compressed or decompressed. If file-list is empty or if the special option – is present, bzip2
reads from standard input. The ––stdout option causes bzip2 to write to standard
output.

Options

Under Linux, bzip2, bunzip2, and bzcat accept the common options described on
page 718.

The Mac OS X version of bzip2 accepts long options
tip Options for bzip2 preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––stdout

–c Writes the results of compression or decompression to standard output.

––decompress

–d Decompresses a file that was compressed using bzip2. This option with bzip2 is
equivalent to the bunzip2 command.

––fast or
––best

–n Sets the block size when compressing a file. The n is a digit from 1 to 9, where
1 (––fast) generates a block size of 100 kilobytes and 9 (––best) generates a
block size of 900 kilobytes. The default level is 9. The ––fast and ––best options
are provided for compatibility with gzip and do not necessarily yield the fastest
or best compression.

––force

–f Forces compression even if a file already exists, has multiple links, or comes
directly from a terminal. The option has a similar effect with bunzip2.

––keep

–k Does not delete input files while compressing or decompressing them.

––quiet

–q Suppresses warning messages; does display critical messages.

––test
––verbose

–t Verifies the integrity of a compressed file. Displays nothing if the file is OK.
–v For each file being compressed, displays the name of the file, the compression
ratio, the percentage of space saved, and the sizes of the decompressed and
compressed files.

bzip2 733

Discussion

The bzip2 and bunzip2 utilities work similarly to gzip and gunzip; see the discussion of
gzip (page 839) for more information. Normally bzip2 does not overwrite a file; you
must use ––force to overwrite a file during compression or decompression.

Notes

The bzip2 home page is bzip.org.
The bzip2 utility does a better job of compressing files than gzip does.
Use the ––bzip2 modifier with tar (page 969) to compress archive files using bzip2.
See page 66 for additional information on and examples of using tar to create and
unpack archives.

bzcat file-list Works like cat except it uses bunzip2 to decompress file-list as it copies files to standard

output.
bzip2recover Attempts to recover a damaged file that was compressed using bzip2.

Examples

In the following example, bzip2 compresses a file and gives the resulting file the same
name with a .bz2 filename extension. The –v option displays statistics about the
compression.
$ ls -l
-rw-r--r-- 1 sam sam 737414 04-03 19:05 bigfile
$ bzip2 -v bigfile
bigfile: 3.926:1, 2.037 bits/byte, 74.53% saved, 737414 in, 187806 out
$ ls -l
-rw-r--r-- 1 sam sam 187806 04-03 19:05 bigfile.bz2

Next touch creates a file with the same name as the original file; bunzip2 refuses to
overwrite the file in the process of decompressing bigfile.bz2. The ––force option
enables bunzip2 to overwrite the file.
$ touch bigfile
$ bunzip2 bigfile.bz2
bunzip2: Output file bigfile already exists.
$ bunzip2 --force bigfile.bz2
$ ls -l
-rw-r--r-- 1 sam sam 737414 04-03 19:05 bigfile

734 cal

cal
Displays a calendar

cal

cal [options] [[month] year]
The cal utility displays a calendar.

Arguments

Options

The arguments specify the month and year for which cal displays a calendar. The
month is a decimal integer from 1 to 12 and the year is a decimal integer. Without any
arguments, cal displays a calendar for the current month. When you specify a single
argument, cal displays a calendar for the year specified by the argument.
–j (Julian) Displays Julian days—a calendar that numbers the days consecutively
from January 1 (1) through December 31 (365 or 366).
–m (Monday) Makes Monday the first day of the week. Without this option, Sunday
is the first day of the week. L
–m n
(month) Displays a calendar for the nth month of the current year. O
–y (year) Displays a calendar for the current year. L
–3 (three months) Displays the previous, current, and next months. L

Notes

Do not abbreviate the year. The year 05 is not the same as 2005.
The ncal (new cal) utility displays a more compact calendar.

Examples

The following command displays a calendar for December 2012:
$ cal 12 2012
December 2012
Su Mo Tu We Th Fr Sa
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

Next is a Julian calendar for 1949:
$ cal -j 1949
1949

Su

Mo

2
9
16
23
30
...

3
10
17
24
31

January
Tu We Th

Fr

4
11
18
25

7
14
21
28

5
12
19
26

6
13
20
27

Sa
1
8
15
22
29

Su

Mo

37
44
51
58

38
45
52
59

February
Tu We Th
32 33 34
39 40 41
46 47 48
53 54 55

Fr
35
42
49
56

Sa
36
43
50
57

cat

735

cat
cat [options] [file-list]
The cat utility copies files to standard output. You can use cat to display the contents of one or more
text files on the screen.

Arguments

The file-list is a list of the pathnames of one or more files that cat processes. If you
do not specify an argument or if you specify a hyphen (–) in place of a filename, cat
reads from standard input.

Options

Under Linux, cat accepts the common options described on page 718. Options preceded
by a double hyphen (––) work under Linux only. Except as noted, options named with
a single letter and preceded by a single hyphen work under Linux and OS X.

––show-all

–A Same as –vET. L

––number-nonblank

–b Numbers all lines that are not blank as they are written to standard output.
–e (end) Same as –vE. L O
––show-ends
––number
––squeeze-blank

–E Marks the end of each line with a dollar sign. L
–n Numbers all lines as they are written to standard output.
–s Removes extra blank lines so there are never two or more blank lines in a row.
–t (tab) Same as –vT.

––show-tabs –T Displays TABs as ^I. L
––show-nonprinting

–v Displays CONTROL characters using the caret notation (^M) and displays characters
that have the high bit set (META characters) using the M- notation (page 223). This
option does not convert TABs and LINEFEEDs. Use –T (––show-tabs) if you want to
display TABs as ^I. LINEFEEDs cannot be displayed as anything but themselves; otherwise, the line could be too long.

Notes

See page 135 for a discussion of cat, standard input, and standard output.
Use the od utility (page 897) to display the contents of a file that does not contain text
(for example, an executable program file).
Use the tac utility to display lines of a text file in reverse order (Linux only). See the
tac info page for more information.
The name cat is derived from one of the functions of this utility, catenate, which
means to join together sequentially, or end to end.

cat

Joins and displays files

736 cat

Set noclobber to avoid overwriting a file
caution Despite cat’s warning message, the shell destroys the input file (letter) before invoking cat in the
following example:
$ cat memo letter > letter
cat: letter: input file is output file

You can prevent overwriting a file in this situation by setting the noclobber variable (pages 139
and 397).

Examples

The following command displays the contents of the memo text file on the terminal:
$ cat memo
...

The next example catenates three text files and redirects the output to the file
named all:
$ cat page1 letter memo > all

You can use cat to create short text files without using an editor. Enter the following
command line, type the text you want in the file, and press CONTROL-D on a line by itself:
$ cat > new_file
...

(text)
...
CONTROL-D

In this case cat takes input from standard input (the keyboard) and the shell redirects
standard output (a copy of the input) to the file you specify. The CONTROL-D indicates
the EOF (end of file) and causes cat to return control to the shell.
In the next example, a pipeline sends the output from who to standard input of cat.
The shell redirects cat’s output to the file named output; after the commands have
finished executing, output contains the contents of the header file, the output of
who, and the contents of footer. The hyphen on the command line causes cat to read
standard input after reading header and before reading footer.
$ who | cat header - footer > output

cd 737

cd
cd [options] [directory]
The cd builtin makes directory the working directory.

Arguments

The directory is the pathname of the directory you want to be the new working
directory. Without an argument, cd makes your home directory the working directory. Using a hyphen in place of directory changes to the previous working
directory.

Options

The following options are available under bash and dash only.
–L (no dereference) If directory is a symbolic link, cd makes the symbolic link the
working directory (default). See page 115 for information on dereferencing
symbolic links.
–P (dereference) If directory is a symbolic link, cd makes the directory the symbolic
link points to the working directory. See page 115 for information on dereferencing
symbolic links.

Notes

The cd command is a bash, dash, and tcsh builtin.
See page 92 for a discussion of cd.
Without an argument, cd makes your home directory the working directory; it uses the
value of the HOME (bash; page 307) or home (tcsh; page 392) variable to determine
the pathname of your home directory.
With an argument of a hyphen, cd makes the previous working directory the working
directory. It uses the value of the OLDPWD (bash) or owd (tcsh) variable to determine
the pathname of the previous working directory.
The CDPATH (bash; page 313) or cdpath (tcsh; page 392) variable contains a colonseparated list of directories that cd searches. Within this list, a null directory name
(::) or a period (:.:) represents the working directory. If CDPATH or cdpath is not set,
cd searches only the working directory for directory. If this variable is set and directory is not an absolute pathname (does not begin with a slash), cd searches the
directories in the list; if the search fails, cd searches the working directory. See
page 313 for a discussion of CDPATH.

cd

Changes to another working directory

738 cd

Examples

A cd command without an argument makes a user’s home directory the working directory. In the following example, cd makes Max’s home directory the working directory
and the pwd builtin verifies the change:
$ pwd
/home/max/literature
$ cd
$ pwd
/home/max

Under OS X, home directories are stored in /Users, not /home.
The next command uses an absolute pathname to make the /home/max/literature
directory the working directory:
$ cd /home/max/literature
$ pwd
/home/max/literature

Next, the cd utility uses a relative pathname to make a subdirectory of the current
working directory the new working directory:
$ cd memos
$ pwd
/home/max/literature/memos

Finally cd uses the .. reference to the parent of the working directory (page 92) to
make the parent of the current working directory the new working directory:
$ cd ..
$ pwd
/home/max/literature

chgrp 739

Changes the group associated with a file
chgrp [options] group file-list
chgrp [options] ––reference=rfile file-list

L

The chgrp utility changes the group associated with one or more files. The second syntax works under
Linux only.

Arguments

The group is the name or numeric group ID of the new group. The file-list is a list of
the pathnames of the files whose group association is to be changed. The rfile is the
pathname of a file whose group is to become the new group associated with file-list.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––changes

–c Displays a message for each file whose group is changed. L

––dereference

For each file that is a symbolic link, changes the group of the file the link points
to, not the symbolic link itself. Under Linux, this option is the default. See
page 115 for information on dereferencing symbolic links. L

––quiet or
––silent

–f Suppresses warning messages about files whose permissions prevent you from
changing their group IDs.

––no-dereference –h For each file that is a symbolic link, changes the group of the symbolic link,

not the file the link points to. See page 115 for information on dereferencing
symbolic links.
–H (partial dereference) For each file that is a symbolic link, changes the group
of the file the link points to, not the symbolic link itself. This option affects
files specified on the command line; it does not affect files found while
descending a directory hierarchy. This option treats files that are not symbolic
links normally and works with –R only. See page 115 for information on
dereferencing symbolic links.
–L (dereference) For each file that is a symbolic link, changes the group of the file
the link points to, not the symbolic link itself. This option affects all files, treats
files that are not symbolic links normally, and works with –R only. See page 115
for information on dereferencing symbolic links.
–P (no dereference) For each file that is a symbolic link, changes the group of the
symbolic link, not the file the link points to (default). This option affects all files,
treats files that are not symbolic links normally, and works with –R only. See
page 115 for information on dereferencing symbolic links.

chgrp

chgrp

740 chgrp
––recursive
––reference=rfile
––verbose

Notes

–R Recursively descends a directory specified in file-list and changes the group ID
on all files in the directory hierarchy.
Changes the group of the files in file-list to that of rfile. L
–v For each file, displays a message saying whether its group was retained or
changed.
Only the owner of a file or a user working with root privileges can change the group
association of a file.
Unless you are working with root privileges, you must belong to the specified group
to change the group ID of a file to that group.
See page 746 for information on how to use chown to change the group associated
with and/or the owner of a file.

Examples

See “Dereferencing Symbolic Links Using chgrp” on page 117 for examples that use
the –H, –L, and –P options.
The following command changes the group that the manuals file is associated with;
the new group is pubs:
$ chgrp pubs manuals

The next example uses the –v option to cause chgrp to report on each file it is
called with:
$ chgrp -v pubs *
changed group of 'mixture' to pubs
group of 'memo' retained as pubs

chmod 741

Changes the access mode (permissions) of a file
chmod [options] who operator permission file-list
chmod [options] mode file-list
chmod [options] ––reference=rfile file-list

symbolic
absolute
referential L

The chmod utility changes the ways in which a file can be accessed by the owner of the file, the group
the file is associated with, and/or all other users. You can specify the new access mode absolutely or
symbolically. Under Linux, you can also specify the mode referentially (third syntax). Under OS X,
you can use chmod to modify ACLs (page 1047).

Arguments

Arguments specify which files are to have their modes changed in which ways. The
rfile is the pathname of a file whose permissions are to become the new permissions
of the files in file-list.

Symbolic
You can specify multiple sets of symbolic modes (who operator permission) by separating each set from the next with a comma.
The chmod utility changes the access permission for the class of users specified by
who. The class of users is designated by one or more of the letters specified in the who
column of Table VI-4.

Table VI-4

Symbolic mode user class specification

who

User class

Meaning

u

User

Owner of the file

g

Group

Group the file is associated with

o

Other

All other users

a

All

Use in place of ugo

Table VI-5 lists the symbolic mode operators.

Table VI-5

Symbolic mode operators

operator

Meaning

+

Adds the permission for the specified user class

–

Removes the permission for the specified user class

=

Sets the permission for the specified user class; resets all other permissions
for that user class

chmod

chmod

742 chmod

The access permission is specified by one or more of the letters listed in Table VI-6.

Table VI-6

Symbolic mode permissions

permission

Meaning

r

Sets read permission

w

Sets write permission

x

Sets execute permission

s

Sets the user ID or group ID (depending on the who argument) to that of the
owner of the file while the file is being executed (for more information see
page 101)

t

Sets the sticky bit (only a user working with root privileges can set the sticky
bit, and it can be used only with u; see page 1095)

X

Makes the file executable only if it is a directory or if another user class has
execute permission

u

Sets the specified permissions to those of the owner

g

Sets the specified permissions to those of the group

o

Sets the specified permissions to those of others

Absolute
You can use an octal number to specify the access mode. Construct the number by
ORing the appropriate values from Table VI-7. To OR two or more octal numbers
from this table, just add them. (Refer to Table VI-8 on the next page for examples.)

Table VI-7

Absolute mode specifications

mode

Meaning

4000

Sets the user ID when the program is executed (page 101)

2000

Sets the group ID when the program is executed (page 101)

1000

Sticky bit (page 1095)

0400

Owner can read the file

0200

Owner can write to the file

0100

Owner can execute the file

0040

Group can read the file

0020

Group can write to the file

chmod 743

Table VI-7

Absolute mode specifications (continued)

mode

Meaning

0010

Group can execute the file

0004

Others can read the file

0002

Others can write to the file

0001

Others can execute the file

Table VI-8 lists some typical modes.

Table VI-8

Options

Examples of absolute mode specifications

Mode

Meaning

0777

Owner, group, and others can read, write, and execute the file

0755

Owner can read, write, and execute the file; group and others can read and
execute the file

0711

Owner can read, write, and execute the file; group and others can execute the
file

0644

Owner can read and write the file; group and others can read the file

0640

Owner can read and write the file, group can read the file, and others cannot
access the file

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––changes

–c Displays a message for each file whose permissions are changed. L

––quiet or
––silent

–f Suppresses warning messages about files whose ownership prevents chmod from
changing the permissions of the file.
–H (partial dereference) For each file that is a symbolic link, changes permissions of the
file the link points to, not the symbolic link itself. This option affects files specified
on the command line; it does not affect files found while descending a directory hierarchy. This option treats files that are not symbolic links normally and works with
–R only. See page 115 for information on dereferencing symbolic links. O
–L (dereference) For each file that is a symbolic link, changes permissions of the file
the link points to, not the symbolic link itself. This option affects all files, treats
files that are not symbolic links normally, and works with –R only. See page 115
for information on dereferencing symbolic links. O

744 chmod

–P (no dereference) For each file that is a symbolic link, changes permissions of the
symbolic link, not the file the link points to. This option affects all files, treats
files that are not symbolic links normally, and works with –R only. See page 115
for information on dereferencing symbolic links. O
––recursive
––reference=rfile
––verbose

Notes

–R Recursively descends a directory specified in file-list and changes the permissions on all files in the directory hierarchy.
Changes the permissions of the files in file-list to that of rfile. L
–v For each file, displays a message saying that its permissions were changed (even
if they were not changed) and specifying the permissions. Use ––changes to display messages only when permissions are actually changed.
Only the owner of a file or a user working with root privileges can change the access
mode, or permissions, of a file.
When you use symbolic arguments, you can omit the permission from the command
line when the operator is =. This omission takes away all permissions for the specified
user class. See the second example in the next section.
Under Linux, chmod never changes the permissions of symbolic links.
Under Linux, chmod dereferences symbolic links found on the command line. In other
words, chmod changes the permissions of files that symbolic links found on the command line point to; chmod does not affect files found while descending a directory
hierarchy. This behavior mimics the behavior of the Mac OS X –H option.
A big difference between absolute chmod commands and symbolic chmod commands is
that, when using a symbolic command, you use the + or – operators to modify existing
permissions of a file or you use the = operator to set permissions to a specified value.
When using absolute commands, you can only set permissions to a specified value.
See page 99 for another discussion of chmod.

Examples

See page 1047 for examples of using chmod to change ACLs under OS X.
The following examples show how to use the chmod utility to change the permissions
of the file named temp. The initial access mode of temp is shown by ls. See “Discussion” on page 864 for information about the ls display.
$ ls -l temp
-rw-rw-r-- 1 max pubs 57 07-12 16:47 temp

When you do not follow an equal sign with a permission, chmod removes all permissions for the specified user class. The following command removes all access
permissions for the group and all other users so only the owner has access to the file:
$ chmod go= temp
$ ls -l temp
-rw------- 1 max pubs 57 07-12 16:47 temp

chmod 745

The next command changes the access modes for all users (owner, group, and others)
to read and write. Now anyone can read from or write to the file.
$ chmod a=rw temp
$ ls -l temp
-rw-rw-rw- 1 max pubs 57 07-12 16:47 temp

Using an absolute argument, a=rw becomes 666. The next command performs the
same function as the preceding one:
$ chmod 666 temp

The next command removes write access permission for other users. As a result,
members of the pubs group can read from and write to the file, but other users can
only read from the file:
$ chmod o-w temp
$ ls -l temp
-rw-rw-r-- 1 max pubs 57 07-12 16:47 temp

The following command yields the same result, using an absolute argument:
$ chmod 664 temp

The next command adds execute access permission for all users:
$ chmod a+x temp
$ ls -l temp
-rwxrwxr-x 1 max pubs 57 07-12 16:47 temp

If temp is a shell script or other executable file, all users can now execute it. (The
operating system requires read and execute access to execute a shell script but only
execute access to execute a binary file.) The absolute command that yields the same
result is
$ chmod 775 temp

The final command uses symbolic arguments and the = operator to achieve the same
result as the preceding command. It sets permissions to read, write, and execute for
the owner and the group, and to read and execute for other users. A comma separates
the sets of symbolic modes.
$ chmod ug=rwx,o=rx temp

746 chown

chown
chown

Changes the owner of a file and/or the group the file is associated with
chown [options] owner file-list
chown [options] owner:group file-list
chown [options] owner: file-list
chown [options] :group file-list
chown [options] ––reference=rfile file-list

L

The chown utility changes the owner of a file and/or the group the file is associated with. Only a user
working with root privileges can change the owner of a file. Only a user working with root privileges
or the owner of a file who belongs to the new group can change the group a file is associated with.
The last syntax works under Linux only.

Arguments

The owner is the username or numeric user ID of the new owner. The group is the
group name or numeric group ID of the new group the file is to be associated with.
The file-list is a list of the pathnames of the files whose ownership and/or group
association you want to change. The rfile is the pathname of a file whose owner
and/or group association is to become the new owner and/or group association of
file-list. Table VI-9 shows the ways you can specify the new owner and/or group.

Table VI-9

Options
––changes

Specifying the new owner and/or group

Argument

Meaning

owner

The new owner of file-list; the group is not changed

owner:group

The new owner of and new group associated with file-list

owner:

The new owner of file-list; the group associated with file-list is changed to the
new owner’s login group

:group

The new group associated with file-list; the owner is not changed

Under Linux, chown accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named
with a single letter and preceded by a single hyphen work under Linux and OS X.
–c Displays a message for each file whose ownership/group is changed. L

––dereference

Changes the ownership/group of the files symbolic links point to, not the
symbolic links themselves. Under Linux, this option is the default. See
page 115 for information on dereferencing symbolic links. L

––quiet or
––silent

–f Suppresses error messages about files whose ownership and/or group association
chown cannot change.
–H (partial dereference) For each file that is a symbolic link, changes the owner
and/or group association of the file the link points to, not the symbolic link itself.

chown

747

This option affects files specified on the command line; it does not affect files
found while descending a directory hierarchy. This option treats files that are not
symbolic links normally and works with –R only. See page 115 for information
on dereferencing symbolic links.
––no-dereference –h For each file that is a symbolic link, changes the owner and/or group association

of the symbolic link, not the file the link points to. See page 115 for information
on dereferencing symbolic links.
–L (dereference) For each file that is a symbolic link, changes the owner and/or
group association of the file the link points to, not the symbolic link itself. This
option affects all files, treats files that are not symbolic links normally, and works
with –R only. See page 115 for information on dereferencing symbolic links.
–P (no dereference) For each file that is a symbolic link, changes the owner and/or
group association of the symbolic link, not the file the link points to. This option
affects all files, treats files that are not symbolic links normally, and works with
–R only. See page 115 for information on dereferencing symbolic links.
––recursive

–R When you include directories in the file-list, this option descends the directory
hierarchy, setting the specified owner and/or group association for all files in the
hierarchy.

––reference=rfile

Changes the owner and/or group association of the files in the file-list to that
of rfile. L

––verbose

–v For each file, displays a message saying whether its owner and/or group association
was retained or changed.

Notes

The chown utility clears setuid and setgid bits when it changes the owner of a file.

Examples

The following command changes the owner of the chapter1 file in the manuals
directory; the new owner is Sam:
# chown sam manuals/chapter1

The following command makes Max the owner of, and Max’s login group the group
associated with, all files in the /home/max/literature directory and in all its
subdirectories:
# chown -R max: /home/max/literature

Under Mac OS X, home directories are stored in /Users, not /home.
The next command changes the ownership of the files in literature to max and the
group associated with these files to pubs:
# chown max:pubs /home/max/literature/*

The final example changes the group association of the files in manuals to pubs without
altering their ownership. The owner of the files, who is executing this command, must
belong to the pubs group.
$ chown :pubs manuals/*

748 cmp

cmp
Compares two files

cmp

cmp [options] file1 [file2 [skip1 [skip2]]]
The cmp utility displays the differences between two files on a byte-by-byte basis. If the files are the
same, cmp is silent. If the files differ, cmp displays the byte and line number of the first difference.

Arguments

The file1 and file2 arguments are the pathnames of the files that cmp compares. If
file2 is omitted, cmp uses standard input instead. Using a hyphen (–) in place of file1
or file2 causes cmp to read standard input instead of that file.
The skip1 and skip2 arguments are decimal numbers indicating the number of bytes to
skip in each file before beginning the comparison. You can use the standard multiplicative
suffixes after skip1 and skip2; see Table VI-1 on page 717.

Options

Under Linux and OS X, cmp accepts the common options described on page 718.

The Mac OS X version of cmp accepts long options
tip Options for cmp preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––print–bytes

–b Displays more information, including filenames, byte and line numbers, and the
octal and ASCII values of the first differing byte.

––ignore–initial=n1[:n2]

–i n1[:n2]
Without n2, skips the first n1 bytes in both files before beginning the comparison.
With n1 and n2, skips the first n1 bytes in file1 and skips the first n2 bytes in file2
before beginning the comparison. You can follow n1 and/or n2 with one of the
multiplicative suffixes listed in Table VI-1 on page 717.
––verbose

–l (lowercase “l”) Instead of stopping at the first byte that differs, continues
comparing the two files and displays both the location and the value of each
byte that differs. Locations are displayed as decimal byte count offsets from
the beginning of the files; byte values are displayed in octal. The comparison
terminates when an EOF is encountered on either file.

––silent or
––quiet

–s Suppresses output from cmp; only sets the exit status (see “Notes”).

Notes

Byte and line numbering start at 1.
The cmp utility does not display a message if the files are identical; it only sets the
exit status. This utility returns an exit status of 0 if the files are the same and an exit
status of 1 if they are different. An exit status greater than 1 means an error occurred.
When you use skip1 (and skip2), the offset values cmp displays are based on the byte
where the comparison began.

cmp

749

Under Mac OS X, cmp compares data forks (page 1044) of a file only.
Unlike diff (page 777), cmp works with binary as well as ASCII files.

Examples

The examples use the files named a and b. These files have two differences. The first
difference is that the word lazy in file a is replaced by lasy in file b. The second difference is subtler: A TAB character appears just before the NEWLINE character in file b.
$ cat a
The quick brown fox jumped over the lazy dog.
$ cat b
The quick brown fox jumped over the lasy dog.TAB

The first example uses cmp without any options to compare the two files. The cmp
utility reports that the files are different and identifies the offset from the beginning
of the files where the first difference is found:
$ cmp a b
a b differ: char 39, line 1

You can display the octal ASCII values of the bytes and the characters at that location
by adding the –b (––print–bytes) option:
$ cmp --print-bytes a b
a b differ: char 39, line 1 is 172 z 163 s

The –l option displays all bytes that differ between the two files. Because this option
creates a lot of output if the files have many differences, you might want to redirect
the output to a file. The following example shows the two differences between files
a and b. The –b option displays the values for the bytes as well. Where file a has a
CONTROL–J (NEWLINE), file b has a CONTROL–I (TAB). The message saying that the EOF on file a
has been reached indicates that file b is longer than file a.
$ cmp -lb a b
39 172 z
163 s
46 12 ^J
11 ^I
cmp: EOF on a

In the next example, the ––ignore–initial option causes cmp to ignore 39 bytes,
skipping over the first difference in the files. The cmp utility now reports on the
second difference. The difference is put at character 7, which is the 46th character
in the original file b (7 characters past the ignored 39 characters).
$ cmp --ignore-initial=39 a b
a b differ: char 7, line 1

You can use skip1 and skip2 in place of the ––ignore–initial option used in the preceding
example:
$ cmp a b 39 39
a b differ: char 7, line 1

750 comm

comm
comm

Compares sorted files
comm [options] file1 file2
The comm utility displays a line-by-line comparison of two sorted files. The first of the three columns
it displays lists the lines found only in file1, the second column lists the lines found only in file2, and
the third lists the lines common to both files.

Arguments

The file1 and file2 arguments are pathnames of the files that comm compares. Using
a hyphen (–) in place of file1 or file2 causes comm to read standard input instead of
that file.

Options

You can combine the options. With no options, comm produces three-column output.
–1 Does not display column 1 (does not display lines found only in file1).
–2 Does not display column 2 (does not display lines found only in file2).
–3 Does not display column 3 (does not display lines found in both files).

Notes

If the files have not been sorted, comm will not work properly.
Lines in the second column are preceded by one
are preceded by two TAB s.

TAB,

and those in the third column

The exit status indicates whether comm completed normally (0) or abnormally (not 0).

Examples

The following examples use files named c and d. The files have already been sorted:
$ cat c
bbbbb
ccccc
ddddd
eeeee
fffff
$ cat d
aaaaa
ddddd
eeeee
ggggg
hhhhh

Refer to sort on page 942 for information on sorting files.
The following example calls comm without any options, so it displays three columns.
The first column lists those lines found only in file c, the second column lists those
found in d, and the third lists the lines found in both c and d:

comm 751
$ comm c d
aaaaa
bbbbb
ccccc
ddddd
eeeee
fffff
ggggg
hhhhh

The next example uses options to prevent comm from displaying columns 1 and 2.
The result is column 3, a list of the lines common to files c and d:
$ comm -12 c d
ddddd
eeeee

752 configure

configure
configure

Configures source code automatically
./configure [options]
The configure script is part of the GNU Configure and Build System. Software developers who supply
source code for their products face the problem of making it easy for relatively naive users to build and
install their software packages on a wide variety of machine architectures, operating systems, and system
software. To facilitate this process many software developers supply a shell script named configure with
their source code.
When you run configure, it determines the capabilities of the local system. The data collected by configure
is used to build the makefiles with which make (page 869) builds the executables and libraries. You can
adjust the behavior of configure by specifying command-line options and environment variables.

Options
The Mac OS X version of configure accepts long options
tip Options for configure preceded by a double hyphen (––) work under Mac OS X as well as under
Linux.
––disable-feature

Works in the same manner as ––enable-feature except it disables support for
feature.

––enable-feature

The feature is the name of a feature that can be supported by the software
being configured. For example, configuring the Z Shell source code with the
command configure ––enable-zsh-mem configures the source code to use the
special memory allocation routines provided with zsh instead of using the system memory allocation routines. Check the README file supplied with the
software distribution to see the choices available for feature.

––help

Displays a detailed list of all options available for use with configure. The contents
of this list depends on the software you are installing.

––prefix=directory

By default configure builds makefiles that install software in the /usr/local directory hierarchy (when you give the command make install). To install software
into a different directory, replace directory with the absolute pathname of the
desired directory.
––with-package

The package is the name of an optional package that can be included with the
software you are configuring. For example, if you configure the source code for
the Windows emulator wine with the command configure ––with-dll, the source
code is configured to build a shared library of Windows emulation support.
Check the README file supplied with the software you are installing to see the

configure 753

choices available for package. The command configure ––help usually displays
the choices available for package.

Discussion

The GNU Configure and Build System allows software developers to distribute
software that can configure itself to be built on a variety of systems. It builds a
shell script named configure, which prepares the software distribution to be built
and installed on a local system. The configure script searches the local system to
find the dependencies for the software distribution and constructs the appropriate
makefiles. Once you have run configure, you can build the software using a make
command and install the software using a make install command.
The configure script determines which C compiler to use (usually gcc) and specifies a
set of flags to pass to that compiler. You can set the environment CC and CFLAGS
variables to override these values. See the “Examples” section.

Notes

Each package that uses the GNU autoconfiguration utility provides its own custom
copy of configure, which the software developer created using the GNU autoconf utility
(www.gnu.org/software/autoconf). Read the README and INSTALL files that are
provided with the software you are installing for information about the available
options.
The configure scripts are self-contained and run correctly on a wide variety of systems.
You do not need any special system resources to use configure.
The configure utility will exit with an error message if a dependency is not installed.

Examples

The simplest way to call configure is to cd to the base directory for the software you
are installing and run the following command:
$ ./configure

The ./ is prepended to the command name to ensure you are running the configure
script supplied with the software you are installing. For example, to cause configure to
build makefiles that pass the flags –Wall and –O2 to gcc, give the following command
from bash:
$ CFLAGS="-Wall -O2" ./configure

If you are using tcsh, give the following command:
tcsh $ env CFLAGS="-Wall -O2" ./configure

754 cp

cp
Copies files

cp

cp [options] source-file destination-file
cp [options] source-file-list destination-directory
The cp utility copies one or more files. It can either make a copy of a single file (first syntax) or copy
one or more files to a directory (second syntax). With the –R option, cp can copy directory hierarchies.

Arguments

The source-file is the pathname of the file that cp makes a copy of. The destination-file
is the pathname cp assigns to the resulting copy of the file.
The source-file-list is a list of one or more pathnames of files that cp makes copies
of. The destination-directory is the pathname of the directory in which cp places the
copied files. With this syntax, cp gives each copied file the same simple filename as
its source-file.
The –R option enables cp to copy directory hierarchies recursively from the
source-file-list into the destination-directory.

Options

Under Linux, cp accepts the common options described on page 718. Options
preceded by a double hyphen (– –) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under
Linux and OS X.

––archive

–a Attempts to preserve the owner, group, permissions, access date, and modification
date of source file(s) while copying recursively without dereferencing symbolic
links. Same as –dpR.

––backup

–b If copying a file would remove or overwrite an existing file, this option makes
a backup copy of the file that would be overwritten. The backup copy has the
same name as the destination-file with a tilde (~) appended to it. When you use
both ––backup and ––force, cp makes a backup copy when you try to copy a
file over itself. For more backup options, search for Backup options in the core
utils info page. L
–d For each file that is a symbolic link, copies the symbolic link, not the file the
link points to. Also preserves hard links in destination-files that exist between
corresponding source-files. This option is equivalent to ––no-dereference and
––preserve=links. See page 115 for information on dereferencing symbolic
links. L

––force

–f When the destination-file exists but cannot be opened for writing, causes cp
to try to remove destination-file before copying source-file. This option is
useful when the user copying a file does not have write permission to an

cp 755

existing destination-file but does have write permission to the directory containing the destination-file. Use this option with –b to back up a destination
file before removing or overwriting it.
–H (partial dereference) For each file that is a symbolic link, copies the file the link
points to, not the symbolic link itself. This option affects files specified on the
command line; it does not affect files found while descending a directory hierarchy. This option treats files that are not symbolic links normally. Under
OS X, works with –R only. See page 115 for information on dereferencing
symbolic links.
––interactive

–i Prompts you whenever cp would overwrite a file. If you respond with a string
that starts with y or Y, cp copies the file. If you enter anything else, cp does not
copy the file.

––dereference

–L (dereference) For each file that is a symbolic link, copies the file the link points
to, not the symbolic link itself. This option affects all files and treats files that
are not symbolic links normally. Under OS X, works with –R only. See page 115
for information on dereferencing symbolic links.

––no-dereference –P (no dereference) For each file that is a symbolic link, copies the symbolic link,

not the file the link points to. This option affects all files and treats files that are
not symbolic links normally. Under OS X, works with –R only. See page 115 for
information on dereferencing symbolic links.
––preserve[=attr] –p Creates a destination-file with the same owner, group, permissions, access date,

modification date, and ACLs as the source-file. The –p option does not take an
argument.
Without attr, ––preserve works as described above. The attr is a commaseparated list that can include mode (permissions), ownership (owner and
group), timestamps (access and modification dates), links (hard links), and all
(all attributes).
––parents

Copies a relative pathname to a directory, creating directories as needed. See the
“Examples” section. L

––recursive –R or –r

Recursively copies directory hierarchies including ordinary files. Under Linux,
the ––no-dereference (–d) option is implied: With the –R, –r, or ––recursive
option, cp copies the links (not the files the links point to). The –r and
––recursive options are available under Linux only.
––update
––verbose

–u Copies only when the destination-file does not exist or when it is older than the
source-file (i.e., this option will not overwrite a newer destination file). L
–v Displays the name of each file as cp copies it.
–X Do not copy extended attributes (page 1044). O

756 cp

Notes

Under Linux, cp dereferences symbolic links unless you use one or more of the –R,
–r, ––recursive, –P, –d, or ––no-dereference options. As explained on the previous
page, under Linux the –H option dereferences only symbolic links listed on the
command line. Under Mac OS X, without the –R option, cp always dereferences
symbolic links; with the –R option, cp does not dereference symbolic links (–P is
the default) unless you specify –H or –L.
Many options are available for cp under Linux. See the coreutils info page for a
complete list.
If the destination-file exists before you execute a cp command, cp overwrites the file,
destroying its contents but leaving the access privileges, owner, and group associated
with the file as they were.
If the destination-file does not exist, cp uses the access privileges of the source-file.
The user who copies the file becomes the owner of the destination-file and the user’s
login group becomes the group associated with the destination-file.
Using the –p option (or ––preserve without an argument) causes cp to attempt to set
the owner, group, permissions, access date, and modification date to match those of
the source-file.
Unlike with the ln utility (page 856), the destination-file that cp creates is independent
of its source-file.
Under OS X version 10.4 and above, cp copies extended attributes (page 1044). The
–X option causes cp not to copy extended attributes.

Examples

The first command makes a copy of the file letter in the working directory. The name
of the copy is letter.sav.
$ cp letter letter.sav

The next command copies all files with a filename extension of .c to the archives
directory, which is a subdirectory of the working directory. Each copied file retains
its simple filename but has a new absolute pathname. The –p (––preserve) option
causes the copied files in archives to have the same owner, group, permissions, access
date, and modification date as the source files.
$ cp -p *.c archives

The next example copies memo from Sam’s home directory to the working directory:
$ cp ~sam/memo .

The next example runs under Linux and uses the ––parents option to copy the file
memo/thursday/max to the dir directory as dir/memo/thursday/max. The find utility
shows the newly created directory hierarchy.

cp 757
$ cp --parents memo/thursday/max dir
$ find dir
dir
dir/memo
dir/memo/thursday
dir/memo/thursday/max

The following command copies the files named memo and letter into another directory.
The copies have the same simple filenames as the source files (memo and letter) but
have different absolute pathnames. The absolute pathnames of the copied files are
/home/sam/memo and /home/sam/letter, respectively.
$ cp memo letter /home/sam

The final command demonstrates one use of the –f (––force) option. Max owns the
working directory and tries unsuccessfully to copy one over another file (me) that he
does not have write permission for. Because he has write permission to the directory
that holds me, Max can remove the file but cannot write to it. The –f (––force) option
unlinks, or removes, me and then copies one to the new file named me.
$ ls -ld
drwxrwxr-x
2 max max 4096 10-16 22:55 .
$ ls -l
-rw-r--r-1 root root 3555 10-16 22:54 me
-rw-rw-r-1 max max 1222 10-16 22:55 one
$ cp one me
cp: cannot create regular file 'me': Permission denied
$ cp -f one me
$ ls -l
-rw-r--r-1 max max 1222 10-16 22:58 me
-rw-rw-r-1 max max 1222 10-16 22:55 one

If Max had used the –b (––backup) option in addition to –f (––force), cp would have
created a backup of me named me~. Refer to “Directory Access Permissions” on
page 103 for more information.

758 cpio

cpio
Creates an archive, restores files from an archive, or copies a directory hierarchy

cpio

cpio ––create|–o [options]
cpio ––extract|–i [options] [pattern-list]
cpio ––pass-through|–p [options] destination-directory
The cpio utility has three modes of operation: Create (copy-out) mode places multiple files into a single
archive file, extract (copy-in) mode restores files from an archive, and pass-through (copy-pass) mode
copies a directory hierarchy. The archive file cpio creates can be saved on disk, tape, other removable
media, or a remote system.
Create mode reads a list of names of ordinary or directory files from standard input and writes the
resulting archive file to standard output. You can use this mode to create an archive. Extract mode
reads an archive from standard input and extracts files from that archive. You can restore all files from
the archive or only those files whose names match a pattern. Pass-through mode reads a list of names
of ordinary or directory files from standard input and copies the files to a specified directory.

Arguments

In create mode, cpio constructs an archive from the files named on standard input.
By default cpio in extract mode extracts all files found in the archive. You can choose
to extract files selectively by supplying a pattern-list. If the name of a file in the archive
matches one of the patterns in pattern-list, cpio extracts that file; otherwise, it ignores
the file. The patterns in a cpio pattern-list are similar to shell wildcards (page 148)
except that pattern-list match slashes (/) and a leading period (.) in a filename.
In pass-through mode you must supply the name of the destination-directory as an
argument to cpio.

Options

A major option specifies the mode in which cpio operates: create, extract, or
pass-through.

Major Options
You must include exactly one of these options. Options preceded by a double hyphen
(––) work under Linux only. Options named with a single letter and preceded by a
single hyphen work under Linux and OS X.
––extract

–i (copy-in mode) Reads the archive from standard input and extracts files. Without
a pattern-list, cpio extracts all files from the archive. With a pattern-list, cpio
extracts only files with names that match one of the patterns in pattern-list. The
following example extracts from the device mounted on /dev/sde1 only those files
whose names end in .c:
$ cpio -i \*.c < /dev/sde1

The backslash prevents the shell from expanding the
argument to cpio.

* before it passes the

cpio
––create

759

–o (copy-out mode) Constructs an archive from the files named on standard
input. These files, which can be ordinary or directory files, must each appear
on a separate line. The archive is written to standard output as it is built. The
find utility frequently generates the filenames that cpio uses. The following
command builds an archive of the /home filesystem and writes it to the device
mounted on /dev/sde1:
# find /home -depth -print | cpio -o > /dev/sde1

The –depth option causes find to search for files in a depth-first search, thereby
reducing the likelihood of permissions problems when you restore the files from
the archive. See the discussion of this option on page 761.
––pass-through

–p (copy-pass mode) Copies files from one place on the system to another. Instead
of constructing an archive file containing the files named on standard input, cpio
copies them to the destination-directory (the last argument on the cpio command
line). The effect is the same as if you had created an archive with copy-out mode
and then extracted the files with copy-in mode, except using pass-through mode
avoids creating an archive. The following example copies the files in the working
directory and all subdirectories into ~max/code:
$ find . -depth -print | cpio -pdm ~max/code

Other Options
The following options alter the behavior of cpio. These options work with one or
more of the preceding major options.
Except as noted, options preceded by a double hyphen (––) work under Linux only.
Except as noted, options named with a single letter and preceded by a single hyphen
work under Linux and OS X.
––reset-access-time

–a Resets the access times of source files after copying them so they have the same
access time after copying as they did before.
–B (block) Sets the block size to 5,120 bytes instead of the default 512 bytes. Under
Linux this option affects input and output block sizes; under OS X it affects only
output block sizes.
Sets the block size used for input and output to n 512-byte blocks. L

––block-size=n

–c (compatible) Writes header information in ASCII so older (incompatible) cpio
utilities on other systems can read the file. This option is rarely needed.
––make-directories

–d Creates directories as needed when copying files. For example, you need this
option when you are extracting files from an archive with a file list generated
by find with the –depth option. This option can be used only with the –i
(––extract) and –p (––pass-through) options.

760 cpio
––pattern-file=filename

–E filename
Reads pattern-list from filename, one pattern per line. Additionally, you can
specify pattern-list on the command line.
––file=archive

–F archive
Uses archive as the name of the archive file. In extract mode, reads from archive
instead of standard input. In create mode, writes to archive instead of standard
output. You can use this option to access a device on another system on a network;
see the –f (––file) option to tar (page 969) for more information.

––format fmt

In create mode, writes the archive in fmt format, as shown in Table VI-10. If you
do not specify a format, cpio writes a POSIX format file (odc in the table). O

Table VI-10

––nonmatching

––help
––dereference

––link

cpio archive formats

format

Description

cpio

The same as odc

newc

The format used for cpio archives under UNIX System V, release 4

odc

The historical POSIX portable octet-oriented cpio format (default)

pax

The POSIX pax format

ustar

The POSIX tar format

–f (flip) Reverses the sense of the test performed on pattern-list when extracting
files from an archive. Files are extracted from the archive only if they do not
match any of the patterns in the pattern-list.
Displays a list of options. L O
–L For each file that is a symbolic link, copies the file the link points to (not the symbolic link itself). This option treats files that are not symbolic links normally. See
page 115 for information on dereferencing symbolic links.
–l When possible, links files instead of copying them.

––preserve-modification-time

–m Preserves the modification times of files that are extracted from an archive.
Without this option the files show the time they were extracted. With this option
the created files show the time they had when they were copied into the archive.
––no-absolute-filenames

In extract mode, creates all filenames relative to the working directory—even
files that were archived using absolute pathnames. L
––quiet
––rename

Suppresses most messages. L O
–r Allows you to rename files as cpio copies them. When cpio prompts you with the
name of a file, you respond with the new name. The file is then copied with the new
name. If you press RETURN without entering a filename, cpio does not copy the file.

cpio

761

––list

–t (table of contents) Displays a table of contents of the archive. This option works
only with the –i (––extract) option, although no files are actually extracted from
the archive. With the –v (––verbose) option, it displays a detailed table of contents
in a format similar to that displayed by ls –l.

––unconditional

–u Overwrites existing files regardless of their modification times. Without this
option cpio will not overwrite a more recently modified file with an older one;
it displays a warning message.

––verbose

–v Lists files as they are processed. With the –t (––list) option, it displays a detailed
table of contents in a format similar to that displayed by ls –l.

Discussion

Without the –u (––unconditional) option, cpio will not overwrite a more recently
modified file with an older file.
You can use both ordinary and directory filenames as input when you create an
archive. If the name of an ordinary file appears in the input list before the name of
its parent directory, the ordinary file appears before its parent directory in the archive
as well. This order can lead to an avoidable error: When you extract files from the
archive, the child has nowhere to go in the file structure if its parent has not yet been
extracted.
Making sure that files appear after their parent directories in the archive is not
always a solution. One problem occurs if the –m (––preserve-modification-time)
option is used when extracting files. Because the modification time of a parent
directory is updated whenever a file is created within it, the original modification
time of the parent directory is lost when the first file is written to it.
The solution to this potential problem is to ensure that all files appear before their
parent directories when creating an archive and to create directories as needed when
extracting files from an archive. When you use this technique, directories are
extracted only after all files have been written to them and their modification times
are preserved.
With the –depth option, find generates a list of files, with all children appearing in
the list before their parent directories. If you use this list to create an archive, the
files are in the proper order. (Refer to the first example in the next section.) When
extracting files from an archive, the –d (––make-directories) option causes cpio to
create parent directories as needed and the –m (– –preserve-modification-time)
option does just what its name says. Using this combination of utilities and options
preserves directory modification times through a create/extract sequence.
This strategy also solves another potential problem. Sometimes a parent directory
might not have permissions set so that you can extract files into it. When cpio
automatically creates the directory with –d (– –make-directories), you can be
assured that you have write permission to the directory. When the directory is
extracted from the archive (after all the files are written into the directory), it is
extracted with its original permissions.

762 cpio

Examples

The first example creates an archive of the files in Sam’s home directory, writing the
archive to a USB flash drive mounted at /dev/sde1.
$ find /home/sam -depth -print | cpio -oB >/dev/sde1

The find utility produces the filenames that cpio uses to build the archive. The –depth
option causes all entries in a directory to be listed before listing the directory name
itself, making it possible for cpio to preserve the original modification times of directories (see the preceding “Discussion” section). Use the –d (––make-directories) and
–m (––preserve-modification-time) options when you extract files from this archive
(see the following examples). The –B option sets the block size to 5,120 bytes.
Under OS X, home directories are stored in /Users, not /home.
To check the contents of the archive file and display a detailed listing of the files it
contains, give the following command:
$ cpio -itv < /dev/sde1

The following command restores the files that formerly were in the memo subdirectory
of Sam’s home directory:
$ cpio -idm /home/sam/memo/\* < /dev/sde1

The –d (––make-directories) option ensures that any subdirectories that were in the
memo directory are re-created as needed. The –m (––preserve-modification-time)
option preserves the modification times of files and directories. The asterisk in the
regular expression is escaped to keep the shell from expanding it.
The next command is the same as the preceding command except it uses the Linux
– –no-absolute-filenames option to re-create the memo directory in the working
directory, which is named memocopy. The pattern does not start with the slash that
represents the root directory, allowing cpio to create the files with relative pathnames.
$ pwd
/home/sam/memocopy
$ cpio -idm --no-absolute-filenames home/sam/memo/\* < /dev/sde1

The final example uses the –f option to restore all files in the archive except those that
were formerly in the memo subdirectory:
$ cpio -ivmdf /home/sam/memo/\* < /dev/sde1

The –v option lists the extracted files as cpio processes the archive, verifying the
expected files have been extracted.

crontab

763

Maintains crontab files
crontab [–u user-name] filename
crontab [–u user-name] option
A crontab file associates periodic times (such as 14:00 on Wednesdays) with commands. The
cron/crond daemon executes each command at the specified time. When you are working as yourself,
the crontab utility installs, removes, lists, and allows you to edit your crontab file. A user working with
root privileges can work with any user’s crontab file.

Arguments

The first syntax copies the contents of filename (which contains crontab commands)
into the crontab file of the user who runs the command or that of username. When
the user does not have a crontab file, this process creates a new one; when the user
has a crontab file, this process overwrites the file. When you replace filename with a
hyphen (–), crontab reads commands from standard input.
The second syntax lists, removes, or allows you to edit the crontab file, depending
on which option you specify.

Options

Choose only one of the –e, –l, or –r options. A user working with root privileges can
use –u with one of these options.
–e (edit) Runs the text editor specified by the VISUAL or EDITOR environment
variable on the crontab file, enabling you to add, change, or delete entries. This
option installs the modified crontab file when you exit from the editor.
–l (list) Displays the contents of the crontab file.
–r (remove) Deletes the crontab file.
–u username
(user) Works on username’s crontab file. Only a user working with root privileges
can use this option.

Notes

This section covers the versions of crontab and crontab files that were written by Paul
Vixie—hence this version of cron is called Vixie cron. These versions are POSIX compliant and differ from an earlier version of Vixie cron as well as from the classic SVR3 syntax.
User crontab files are kept in the /var/spool/cron or /var/spool/cron/crontabs
directory. Each file is named with the username of the user who it belongs to.
The daemon named cron/crond reads the crontab files and runs the commands. If a
command line in a crontab file does not redirect its output, all output sent to standard
output and standard error is mailed to the user unless you set the MAILTO variable
within the crontab file to a different username.

crontab

crontab

764 crontab

Crontab files do not inherit variables set in startup files. For this reason, you might
want to put the assignment export BASH_ENV=~/.bashrc near the top of crontab
files you write.
Crontab directories

To make the system administrator’s job easier, the directories named /etc/cron.hourly,
/etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly hold crontab files that, on
most systems, are run by run-parts, which in turn are run by the /etc/crontab file. Each
of these directories contains files that execute system tasks at the interval named by
the directory. A user working with root privileges can add files to these directories
instead of adding lines to root’s crontab file. A typical /etc/crontab file looks like this:
$ cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root
02 4 * * * root
22 4 * * 0 root
42 4 1 * * root

run-parts /etc/cron.hourly
run-parts /etc/cron.daily
run-parts /etc/cron.weekly
run-parts /etc/cron.monthly

Each entry in a crontab file begins with five fields that specify when the command is
to be run (minute, hour, day of the month, month, and day of the week). An asterisk
appearing in place of a number is a wildcard representing all possible values. In the
day-of-the-week field, you can use either 7 or 0 to represent Sunday.
It is a good practice to start cron/crond jobs a variable number of minutes before or after
the hour, half-hour, or quarter-hour. When you start jobs at these times, it becomes less
likely that many processes will start at the same time, thereby potentially overloading the
system.
When cron/crond starts (usually when the system is booted), it reads all crontab files
into memory. Once a minute, the cron/crond daemon reviews all crontab entries it has
stored in memory, and runs whichever jobs are due to be run at that time.
Special time
specifications

You can use the special time specifications shown in Table VI-11 in place of the initial
five fields described above.

Table VI-11

crontab special time specifications

Specification

Meaning

Replaces

@reboot

Run when the system boots

@yearly

Run on January 1

0011*

@monthly

Run on the first day of each month

001**

@weekly

Run every Sunday

00**0

@daily

Run once a day

00***

@hourly

Run once an hour

0****

crontab
cron.allow,
cron.deny

Examples

765

By creating, editing, and removing the cron.allow and cron.deny files, a user working
with root privileges determines which users can run cron/crond jobs. Under Linux
these files are kept in the /etc directory; under Mac OS X they are kept in /var/at
(which has a symbolic link at /usr/lib/cron). When you create a cron.deny file with
no entries and no cron.allow file exists, everyone can use crontab. When the
cron.allow file exists, only users listed in that file can use crontab, regardless of the
presence and contents of cron.deny. Otherwise, you can list in the cron.allow file
those users who are allowed to use crontab and in cron.deny those users who are not
allowed to use it. (Listing a user in cron.deny is not strictly necessary because, if a
cron.allow file exists and the user is not listed in it, the user will not be able to use
crontab anyway.)
In the following example, Sam uses crontab –l to list the contents of his crontab file
(/var/spool/cron/sam). All the scripts that Sam runs are in his ~/bin directory. The first
line sets the MAILTO variable to max so Max gets the output from commands run
from Sam’s crontab file that is not redirected. The sat.job script runs every Saturday
(day 6) at 2:05 AM, twice.week runs at 12:02 AM on Sunday and Thursday (days 0 and
4), and twice.day runs twice a day, every day, at 10:05 AM and 4:05 PM.
$ who am i
sam
$ crontab -l
MAILTO=max
05 02 * * 6
00 02 * * 0,4
05 10,16 * * *

$HOME/bin/sat.job
$HOME/bin/twice.week
$HOME/bin/twice.day

To add an entry to your crontab file, run the crontab utility with the –e (edit) option.
Some Linux systems use a version of crontab that does not support the –e option. If
the local system runs such a version, you need to make a copy of your existing
crontab file, edit it, and then resubmit it, as in the following example. The –l (list)
option displays a copy of your crontab file.
$ crontab -l > newcron
$ vim newcron
...
$ crontab newcron

766 cut

cut
Selects characters or fields from input lines

cut

cut [options] [file-list]
The cut utility selects characters or fields from lines of input and writes them to standard output. Character and field numbering start with 1.

Arguments

The file-list is a list of ordinary files. If you do not specify an argument or if you specify
a hyphen (–) in place of a filename, cut reads from standard input.

Options

Under Linux, cut accepts the common options described on page 718. Options
preceded by a double hyphen (––) work under Linux only. Options named with
a single letter and preceded by a single hyphen work under Linux and OS X.

––characters=clist

–c clist
Selects the characters given by the column numbers in clist. The value of clist is
one or more comma-separated column numbers or column ranges. A range is
specified by two column numbers separated by a hyphen. A range of –n means
columns 1 through n; n– means columns n through the end of the line.
––delimiter=dchar

–d dchar
Specifies dchar as the input field delimiter. Also specifies dchar as the output
field delimiter unless you use the – –output-delimiter option. The default
delimiter is a TAB character. Quote dchar as necessary to protect it from shell
expansion.
––fields=flist

–f flist
Selects the fields specified in flist. The value of flist is one or more commaseparated field numbers or field ranges. A range is specified by two field numbers
separated by a hyphen. A range of –n means fields 1 through n; n– means fields
n through the last field. The field delimiter is a TAB character unless you use the
–d (––delimiter) option to change it.

––output-delimiter=ochar

Specifies ochar as the output field delimiter. The default delimiter is the TAB
character. You can specify a different delimiter by using the – –delimiter
option. Quote ochar as necessary to protect it from shell expansion.
--only-delimited

–s Copies only lines containing delimiters. Without this option, cut copies—but
does not modify—lines that do not contain delimiters. Works only with the –d
(––delimiter) option.

cut

767

Notes

Although limited in functionality, cut is easy to learn and use and is a good choice
when columns and fields can be selected without using pattern matching. Sometimes
cut is used with paste (page 905).

Examples

For the next two examples, assume that an ls –l command produces the following
output:
$ ls -l
total 2944
-rwxr-xr-x
-rw-rw-r--rw-rw-r--rw-rw-r--rw-rw-r--rw-rw-r--rw-rw-r--

1
1
1
1
1
1
1

zach
zach
zach
zach
zach
zach
zach

pubs
259 02-01 00:12 countout
pubs
9453 02-04 23:17 headers
pubs 1474828 01-14 14:15 memo
pubs 1474828 01-14 14:33 memos_save
pubs
7134 02-04 23:18 tmp1
pubs
4770 02-04 23:26 tmp2
pubs
13580 11-07 08:01 typescript

The following command outputs the permissions of the files in the working directory.
The cut utility with the –c option selects characters 2 through 10 from each input line.
The characters in this range are written to standard output.
$ ls -l | cut -c2-10
otal 2944
rwxr-xr-x
rw-rw-r-rw-rw-r-rw-rw-r-rw-rw-r-rw-rw-r-rw-rw-r--

The next command outputs the size and name of each file in the working directory.
The –f option selects the fifth and ninth fields from the input lines. The –d option tells
cut to use SPACEs, not TABs, as delimiters. The tr utility (page 987) with the –s option
changes sequences of more than one SPACE character into a single SPACE; otherwise, cut
counts the extra SPACE characters as separate fields.
$ ls -l | tr -s '

' ' ' | cut -f5,9 -d' '

259 countout
9453 headers
1474828 memo
1474828 memos_save
7134 tmp1
4770 tmp2
13580 typescript

The last example displays a list of full names as stored in the fifth field of the
/etc/passwd file. The –d option specifies that the colon character be used as the field
delimiter. Although this example works under Mac OS X, /etc/passwd does not
contain information about most users; see “Open Directory” on page 1042 for

768 cut

more information.
$ cat /etc/passwd
root:x:0:0:Root:/:/bin/sh
sam:x:401:50:Sam the Great:/home/sam:/bin/zsh
max:x:402:50:Max Wild:/home/max:/bin/bash
zach:x:504:500:Zach Brill:/home/zach:/bin/tcsh
hls:x:505:500:Helen Simpson:/home/hls:/bin/bash
$ cut -d: -f5 /etc/passwd
Root
Sam the Great
Max Wild
Zach Brill
Helen Simpson

date

769

date
date

Displays or sets the system time and date
date [options] [+format]
date [options] [newdate]
The date utility displays the time and date known to the system. A user working with root privileges
can use date to change the system clock.

Arguments

The +format argument specifies the format for the output of date. The format string,
which consists of field descriptors and text, follows a plus sign (+). The field descriptors
are preceded by percent signs, and date replaces each one with its value in the output.
Table VI-12 lists some of the field descriptors.

Table VI-12

Selected field descriptors

Descriptor

Meaning

%A

Unabbreviated weekday—Sunday to Saturday

%a

Abbreviated weekday—Sun to Sat

%B

Unabbreviated month—January to December

%b

Abbreviated month—Jan to Dec

%c

Date and time in default format used by date

%D

Date in mm/dd/yy format

%d

Day of the month—01 to 31

%H

Hour—00 to 23

%I

Hour—00 to 12

%j

Julian date (day of the year—001 to 366)

%M

Minutes—00 to 59

%m

Month of the year—01 to 12

%n

NEWLINE character

%P

AM

%r

Time in AM/PM notation

%S

Seconds—00 to 60 (the 60 accommodates leap seconds)

%s

Number of seconds since the beginning of January 1, 1970

or PM

770 date

Table VI-12

Selected field descriptors (continued)

Descriptor

Meaning

%T

Time in HH:MM:SS format

%t

TAB character

%w

Day of the week—0 to 6 (0 = Sunday)

%Y

Year in four-digit format (for example, 2014)

%y

Last two digits of the year—00 to 99

%Z

Time zone (for example, PDT)

By default date zero fills numeric fields. Placing an underscore (_) immediately following
the percent sign (%) for a field causes date to blank fill the field. Placing a hyphen (–)
following the percent sign causes date not to fill the field—that is, to left-justify the field.
The date utility assumes that, in a format string, any character that is not a percent sign,
an underscore or a hyphen following the percent sign, or a field descriptor is ordinary
text and copies it to standard output. You can use ordinary text to add punctuation to
the date and to add labels (for example, you can put the word DATE: in front of the
date). Surround the format argument with single quotation marks if it contains SPACE s
or other characters that have a special meaning to the shell.
Setting the
system clock

When a user working with root privileges specifies newdate, the system changes the
system clock to reflect the new date. The newdate argument has the syntax
nnddhhmm[[cc]yy][.ss]
where nn is the number of the month (01–12), dd is the day of the month (01–31),
hh is the hour based on a 24-hour clock (00–23), and mm is the minutes (00–59).
When you change the date, you must specify at least these fields.
The optional cc specifies the first two digits of the year (the value of the century minus
1), and yy specifies the last two digits of the year. You can specify yy or ccyy following
mm. When you do not specify a year, date assumes that the year has not changed.
You can specify the number of seconds past the start of the minute using .ss.

Options

Under Linux, date accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named
with a single letter and preceded by a single hyphen work under Linux and OS X.

––date=datestring

–d datestring
Displays the date specified by datestring, not the current date. According to the
date man page, “the datestring is a mostly free-format date string” such as 2pm
next thursday. See Date input formats in the date info page for details about the
syntax of datestring. This option does not change the system clock. L

date

771

––reference=file

–r file
Displays the modification date and time of file in place of the current date and
time. L

––utc or
––universal

–u Displays or sets the time and date using Universal Coordinated Time (UTC;
page 1100). UTC is also called Greenwich Mean Time (GMT).

Notes

If you set up a locale database, date uses that database to substitute terms appropriate
to your locale. For more information refer to “Locale” on page 316.

Examples

The first example shows how to set the date to 2:07:30
changing the year:

PM

on August 19 without

# date 08191407.30
Sun Aug 19 14:07:30 PDT 2012

The next example shows the format argument, which causes date to display the date
in a commonly used format:
$ date '+Today is %h %d, %Y'
Today is Aug 19, 2012

772 dd

dd
Converts and copies a file

dd

dd [arguments]
The dd (device-to-device copy) utility converts and copies a file. The primary use of dd is to copy files
to and from hard disk files and removable media. It can operate on hard disk partitions and create
block-for-block identical disk images. Often dd can handle the transfer of information to and from
other operating systems when other methods fail. Its rich set of arguments gives you precise control
over the characteristics of the transfer.

Arguments

Under Linux, dd accepts the common options described on page 718. By default dd
copies standard input to standard output.

bs=n

(block size) Reads and writes n bytes at a time. This argument overrides the ibs
and obs arguments.

cbs=n

(conversion block size) When performing data conversion during the copy,
converts n bytes at a time.

conv=type[,type...]

By applying conversion types in the order given on the command line, converts
the data being copied. The types must be separated by commas with no SPACEs.
Table VI-13 lists the types of conversions.
Restricts to numblocks the number of blocks of input that dd copies. The size
of each block is the number of bytes specified by the bs or ibs argument.

count=numblocks

(input block size) Reads n bytes at a time.

ibs=n

(input file) Reads from filename instead of from standard input. You can specify
a device name for filename to read from that device.

if=filename

(output block size) Writes n bytes at a time.

obs=n
of=filename

(output file) Writes to filename instead of to standard output. You can specify
a device name for filename to write to that device.

seek=numblocks

Skips numblocks blocks of output before writing any output. The size of each
block is the number of bytes specified by the bs or obs argument.

skip=numblocks

Skips numblocks blocks of input before starting to copy. The size of each block
is the number of bytes specified by the bs or ibs argument.

Table VI-13

Conversion types

type

Meaning

ascii

Converts EBCDIC-encoded characters to ASCII, allowing you to read tapes
written on IBM mainframe and similar computers.

dd 773

Table VI-13

Conversion types (continued)

type

Meaning

block

Each time dd reads a line of input (i.e., a sequence of characters terminated
by a NEWLINE), dd outputs a block of text and pads it with SPACEs until it is the
size given by the bs or obs argument. It then outputs the NEWLINE that ends the
line of output.

ebcdic

Converts ASCII-encoded characters to EBCDIC, allowing you to write tapes for
use on IBM mainframe and similar computers.

lcase

Converts uppercase letters to lowercase while copying data.

noerror

If a read error occurs, dd normally terminates. This conversion allows dd to
continue processing data and is useful when you are trying to recover data
from bad media.

notrunc

Does not truncate the output file before writing to it.

ucase

Converts lowercase letters to uppercase while copying data.

unblock

Performs the opposite of the block conversion.

Notes

Under Linux, you can use the standard multiplicative suffixes to make it easier to specify large block sizes. See Table VI-1 on page 717. Under Mac OS X, you can use some
of the standard multiplicative suffixes; however, OS X uses lowercase letters in place
of the uppercase letters shown in the table. In addition, under OS X, dd supports b
(block; multiply by 512) and w (word; multiply by the number of bytes in an integer).

Examples

You can use dd to create a file filled with pseudorandom bytes.
$ dd if=/dev/urandom of=randfile2 bs=1 count=100

The preceding command reads from the /dev/urandom file (an interface to the kernel’s
random number generator) and writes to the file named randfile. The block size is 1
and the count is 100, so randfile is 100 bytes long. For bytes that are more random,
you can read from /dev/random. See the urandom and random man pages for more
information. Under OS X, urandom and random behave identically.
Copying a partition

You can also use dd to make an exact copy of a disk partition. Be careful, however—
the following command wipes out anything that was on the /dev/sdb1 partition:
# dd if=/dev/sda1 of=/dev/sdb1

Backing up a
partition

The following command copies the partition named /dev/sda2 to a file named
boot.img. Under OS X, hdiutil might do a better job of copying a partition.
# dd if=/dev/sda2 of=boot.img
1024000+0 records in
1024000+0 records out
524288000 bytes (524 MB) copied, 14.4193 s, 36.4 MB/s

774 dd

Be careful if you copy the image file to a partition: You will overwrite the information
in the partition. Unmount the partition and reverse the if and of parameters to copy
the image file to the partition. Mount the partition once dd has copied the image file.
# umount /dev/sda2
# dd if=boot.img of=/dev/sda2
1024000+0 records in
1024000+0 records out
524288000 bytes (524 MB) copied, 15.7692 s, 33.2 MB/s
# mount /dev/sda2

You can compress a partition image file just as you can compress most files:
# ls -lh boot.img
-rw-r--r--. 1 root root 500M 04-03 15:27 boot.img
# bzip2 boot.img
# ls -lh boot.img.bz2
-rw-r--r--. 1 root root 97M 04-03 15:27 boot.img.bz2
Wiping a file

You can use a similar technique to wipe data from a file before deleting it, making it
almost impossible to recover data from the deleted file. You might want to wipe a
file for security reasons; wipe a file several times for added security.
In the following example, ls shows the size of the file named secret; dd, with a block
size of 1 and a count corresponding to the number of bytes in secret, then wipes the
file. The conv=notrunc argument ensures that dd writes over the data in the file and
not another place on the disk.
$ ls -l secret
-rw-rw-r-- 1 max max 2494 02-06 00:56 secret
$ dd if=/dev/urandom of=secret bs=1 count=2494 conv=notrunc
2494+0 records in
2494+0 records out
$ rm secret

You can also use the shred (Linux) or srm (OS X) utility to securely delete files.

df

775

df
df [options] [filesystem-list]
The df (disk free) utility reports on the total space and the free space on each mounted device.

Arguments

When you call df without an argument, it reports on the free space on each of the
devices mounted on the local system.
The filesystem-list is an optional list of one or more pathnames that specify the filesystems you want the report to cover. This argument works on Mac OS X and some
Linux systems. You can refer to a mounted filesystem by its device pathname or by
the pathname of the directory it is mounted on.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––all

–a Reports on filesystems with a size of 0 blocks, such as /dev/proc. Normally df
does not report on these filesystems.

––block-size=sz

–B sz
The sz specifies the units the report uses (the default is 1-kilobyte blocks). The
sz is a multiplicative suffix from Table VI-1 on page 717. See also the –H (––si)
and –h (––human-readable) options. L
–g (gigabyte) Displays sizes in 1-gigabyte blocks. O

––si –H Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as is

appropriate. Uses powers of 1,000.
––human-readable

–h Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as is
appropriate. Uses powers of 1,024.
––inodes

–i Reports the number of inodes (page 1074) that are used and free instead of
reporting on blocks.
–k (kilobyte) Displays sizes in 1-kilobyte blocks.

––local

–l Displays local filesystems only.
–m (megabyte) Displays sizes in 1-megabyte blocks. O

––type=fstype

–t fstype
Reports information only about the filesystems of type fstype, such as DOS or
NFS. Repeat this option to report on several types of filesystems. L
–T fstype
Reports information only about the filesystems of type fstype, such as DOS or
NFS. Separate multiple filesystem types with commas. O

df

Displays disk space usage

776 df
––exclude-type=fstype

–x fstype
Reports information only about the filesystems not of type fstype. L

Notes

Under Mac OS X, the df utility supports the BLOCKSIZE environment variable
(page 718) and ignores block sizes smaller than 512 bytes or larger than 1 gigabyte.
Under Mac OS X, the count of used and free inodes (–i option) is meaningless on
HFS+ filesystems. On these filesystems, new files can be created as long as free space
is available in the filesystem.

Examples

In the following example, df displays information about all mounted filesystems on
the local system:
$ df
Filesystem
/dev/sda12
/dev/sda1
/dev/sda8
/dev/sda9
/dev/sda10
/dev/sda5
/dev/sda7
/dev/sda6
zach:/c
zach:/d

1k-blocks
1517920
15522
1011928
1011928
1130540
4032092
1011928
2522048
2096160
2096450

Used Available Use% Mounted on
53264
1387548
4% /
4846
9875 33% /boot
110268
850256 11% /free1
30624
929900
3% /free2
78992
994120
7% /free3
1988080
1839188 52% /home
60
960464
0% /tmp
824084
1569848 34% /usr
1811392
284768 86% /zach_c
1935097
161353 92% /zach_d

Next df is called with the –l and –h options, generating a human-readable list of local
filesystems. The sizes in this listing are given in terms of megabytes and gigabytes.
$ df -lh
Filesystem
/dev/sda12
/dev/sda1
/dev/sda8
/dev/sda9
/dev/sda10
/dev/sda5
/dev/sda7
/dev/sda6

Size
1.4G
15M
988M
988M
1.1G
3.8G
988M
2.4G

Used Avail Use% Mounted on
52M 1.3G
4% /
4.7M 9.6M 33% /boot
108M 830M 11% /free1
30M 908M
3% /free2
77M 971M
7% /free3
1.9G 1.8G 52% /home
60k 938M
0% /tmp
805M 1.5G 34% /usr

The next example, which runs under Linux only, displays information about the
/free2 partition in megabyte units:
$ df -BM /free2
Filesystem
/dev/sda9

1M-blocks
988

Used Available Use% Mounted on
30
908
3% /free2

The final example, which runs under Linux only, displays information about NFS
filesystems in human-readable terms:
$ df -ht nfs
Filesystem
zach:/c
zach:/d

Size
2.0G
2.0G

Used Avail Use% Mounted on
1.7G 278M 86% /zach_c6
1.8G 157M 92% /zach_d

diff 777

diff
diff [options] file1 file2
diff [options] file1 directory
diff [options] directory file2
diff [options] directory1 directory2
The diff utility displays line-by-line differences between two text files. By default diff displays the differences
as instructions, which you can use to edit one of the files to make it the same as the other.

Arguments

The file1 and file2 are pathnames of ordinary text files that diff works on. When the
directory argument is used in place of file2, diff looks for a file in directory with the
same name as file1. It works similarly when directory replaces file1. When you specify
two directory arguments, diff compares the files in directory1 with the files that have
the same simple filenames in directory2.

Options

The diff utility accepts the common options described on page 718, with one exception:
When one of the arguments is a directory and the other is an ordinary file, you cannot
compare to standard input.

The Mac OS X version of diff accepts long options
tip Options for diff preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––ignore-blank-lines

–B Ignores differences that involve only blank lines.
––ignore-space-change

–b Ignores whitespace (SPACE s and TAB s) at the ends of lines and considers other
strings of whitespace to be equal.
––context[=lines] –C [lines]

Displays the sections of the two files that differ, including lines lines (the
default is 3) around each line that differs to show the context. Each line in file1
that is missing from file2 is preceded by a hyphen (–); each extra line in file2
is preceded by a plus sign (+); and lines that have different versions in the two
files are preceded by an exclamation point (!). When lines that differ are within
lines lines of each other, they are grouped together in the output.
––ed

–e Creates and sends to standard output a script for the ed editor, which will edit file1
to make it the same as file2. You must add w (Write) and q (Quit) instructions to
the end of the script if you plan to redirect input to ed from the script. When you
use ––ed, diff displays the changes in reverse order: Changes to the end of the file
are listed before changes to the top, preventing early changes from affecting later
changes when the script is used as input to ed. For example, if a line near the top
were deleted, subsequent line numbers in the script would be wrong.

diff

Displays the differences between two text files

778 diff

–i Ignores differences in case when comparing files.

––ignore-case

––new-file –N When comparing directories, when a file is present in one of the directories only,

considers it to be present and empty in the other directory.
––show-c-function

–p Shows which C function, bash control structure, Perl subroutine, and so forth
is affected by each change.
––brief

–q Does not display the differences between lines in the files. Instead, diff reports
only that the files differ.

––recursive

–r When using diff to compare the files in two directories, causes the comparisons
to descend through the directory hierarchies.

––unified[=lines] –U lines or –u

Uses the easier-to-read unified output format. See the discussion of diff on
page 58 for more detail and an example. The lines argument is the number of
lines of context; the default is three. The –u option does not take an argument
and provides three lines of context.
––width=n –W n

Sets the width of the columns that diff uses to display the output to n characters.
This option is useful with the ––side-by-side option. The sdiff utility (see the
“Notes” section) uses a lowercase w to perform the same function: –w n.
––ignore-all-space

–w (whitespace) Ignores whitespace when comparing lines.
––side-by-side

–y Displays the output in a side-by-side format. This option generates the same
output as sdiff. Use the –W (––width) option with this option.

Discussion

When you use diff without any options, it produces a series of lines containing Add
(a), Delete (d), and Change (c) instructions. Each of these lines is followed by the lines
from the file you need to add to, delete from, or change, respectively, to make the files
the same. A less than symbol (<) precedes lines from file1. A greater than symbol (>)
precedes lines from file2. The diff output appears in the format shown in Table VI-14.
A pair of line numbers separated by a comma represents a range of lines; a single line
number represents a single line.
The diff utility assumes you will convert file1 to file2. The line numbers to the left of
each of the a, c, or d instructions always pertain to file1; the line numbers to the right
of the instructions apply to file2. To convert file2 to file1, run diff again, reversing
the order of the arguments.

Notes

The sdiff utility is similar to diff but its output might be easier to read. The diff ––sideby-side option produces the same output as sdiff. See the “Examples” section and
refer to the diff and sdiff man and info pages for more information.
Use the diff3 utility to compare three files.
Use cmp (page 748) to compare nontext (binary) files.

diff 779

Table VI-14

Examples

diff output

Instruction

Meaning (to change file1 to file2)

line1 a line2,line3
> lines from file2

Append line2 through line3 from file2 after line1 in file1

line1,line2 d line3
< lines from file1

Delete line1 through line2 from file1

line1,line2 c line3,line4
< lines from file1
–––
> lines from file 2

Change line1 through line2 in file1 to line3 through line4 from
file2

The first example shows how diff displays the differences between two short, similar
files:
$ cat m
aaaaa
bbbbb
ccccc
$ cat n
aaaaa
ccccc
$ diff m n
2d1
< bbbbb

The difference between files m and n is that the second line of file m (bbbbb) is missing
from file n. The first line that diff displays (2d1) indicates that you need to delete the
second line from file1 (m) to make it the same as file2 (n). The next line diff displays
starts with a less than symbol (<), indicating that this line of text is from file1. In this
example, you do not need this information—all you need to know is the line number
so that you can delete the line.
The ––side-by-side option and the sdiff utility, both with the output width set to 30
characters, display the same output. In the output, a less than symbol points to the
extra line in file m; diff/sdiff leaves a blank line in file n where the extra line would go
to make the files the same.
$ diff --side-by-side --width=30 m n
aaaaa
aaaaa
bbbbb
<
ccccc
ccccc
$ sdiff -w 30 m n
aaaaa
aaaaa
bbbbb
<
ccccc
ccccc

The next example uses the same m file and a new file, p, to show diff issuing an a
(Append) instruction:

780 diff
$ cat p
aaaaa
bbbbb
rrrrr
ccccc
$ diff m p
2a3
> rrrrr

In the preceding example, diff issues the instruction 2a3 to indicate you must append a
line to file m, after line 2, to make it the same as file p. The second line diff displays indicates the line is from file p (the line begins with >, indicating file2). In this example, you
need the information on this line; the appended line must contain the text rrrrr.
The next example uses file m again, this time with file r, to show how diff indicates a
line that needs to be changed:
$ cat r
aaaaa
-q
ccccc
$ diff m r
2c2
< bbbbb
--> -q

The difference between the two files appears in line 2: File m contains bbbbb, and
file r contains –q. The diff utility displays 2c2 to indicate that you need to change line
2. After indicating a change is needed, diff shows you must change line 2 in file m
(bbbbb) to line 2 in file r (–q) to make the files the same. The three hyphens indicate
the end of the text in file m that needs to be changed and the beginning of the text in
file r that is to replace it.
Comparing the same files using the side-by-side and width options (–y and –W) yields
an easier-to-read result. The pipe symbol (|) indicates the line on one side must replace
the line on the other side to make the files the same:
$ diff -y -W 30 m r
aaaaa
aaaaa
bbbbb
| -q
ccccc
ccccc

The next examples compare the two files q and v:
$ cat q
Monday
Tuesday
Wednesday
Thursday
Saturday
Sunday

$ cat v
Monday
Wednesday
Thursday
Thursday
Friday
Saturday
Sundae

diff 781

Running in side-by-side mode, diff shows Tuesday is missing from file v, there is only one
Thursday in file q (there are two in file v), and Friday is missing from file q. The last line
is Sunday in file q and Sundae in file v: diff indicates these lines are different. You can
change file q to be the same as file v by removing Tuesday, adding one Thursday and
Friday, and substituting Sundae from file v for Sunday from file q. Alternatively, you can
change file v to be the same as file q by adding Tuesday, removing one Thursday and
Friday, and substituting Sunday from file q for Sundae from file v.
$ diff -y -W 30 q v
Monday
Monday
Tuesday
<
Wednesday
Wednesday
Thursday
Thursday
> Thursday
> Friday
Saturday
Saturday
Sunday
| Sundae
Context diff

With the ––context option (called a context diff), diff displays output that tells you
how to turn the first file into the second file. The top two lines identify the files and
show that q is represented by asterisks, whereas v is represented by hyphens. Following a row of asterisks that indicates the beginning of a hunk of text is a row of
asterisks with the numbers 1,6 in the middle. This line indicates that the instructions
in the first section tell you what to remove from or change in file q—namely, lines 1
through 6 (that is, all the lines of file q; in a longer file it would mark the first hunk).
The hyphen on the second subsequent line indicates you need to remove the line with
Tuesday. The line with an exclamation point indicates you need to replace the line
with Sunday with the corresponding line from file v. The row of hyphens with the
numbers 1,7 in the middle indicates that the next section tells you which lines from
file v—lines 1 through 7—you need to add or change in file q. You need to add a second line with Thursday and a line with Friday, and you need to change Sunday in file
q to Sundae (from file v).
$ diff --context q v
*** q Mon Aug 27 18:26:45 2012
--- v
Mon Aug 27 18:27:55 2012

***************
*** 1,6 ****
Monday
- Tuesday
Wednesday
Thursday
Saturday
! Sunday
--- 1,7 ---Monday
Wednesday
Thursday
+ Thursday
+ Friday
Saturday
! Sundae

782 diskutil O

diskutil O

diskutil O
Checks, modifies, and repairs local volumes
diskutil action [arguments]
The diskutil utility mounts, unmounts, and displays information about disks and partitions (volumes).
It can also format and repair filesystems and divide a disk into partitions. The diskutil utility is available
under OS X only. O

Arguments

The action specifies what diskutil is to do. Table VI-15 lists common actions along
with the argument each takes.

Table VI-15

diskutil actions

Action

Argument

Description

eraseVolume

type name device

Reformats device using the format type and the label name.
The name specifies the name of the volume; alphanumeric
names are the easiest to work with.
The filesystem type is typically HFS+, but can also be UFS
or MS-DOS. You can specify additional options as part of
the type. For example, a FAT32 filesystem (as used in
Windows 98 and above) would have a type of MS-DOS
FAT32. A journaled, case-sensitive, HFS+ filesystem would
have a type of Case-sensitive Journaled HFS+.

Notes

info

device

Displays information about device. Does not require
ownership of device.

list

[device]

Lists partitions on device. Without device lists partitions on
all devices. Does not require ownership of device.

mount

device

Mounts device.

mountDisk

device

Mounts all devices on the disk containing device.

reformat

device

Reformats device using its current name and format.

repairVolume

device

Repairs the filesystem on device.

unmount

device

Unmounts device.

unmountDisk

device

Unmounts all devices on the disk containing device.

verifyVolume

device

Verifies the filesystem on device. Does not require
ownership of device.

The diskutil utility provides access to the Disk Management framework, the support
code used by the Disk Utility application. It allows some choices that are not supported
from the graphical interface.

diskutil O 783

You must own device, or be working with root privileges, when you specify an action
that modifies or changes the state of a volume.
fsck

disktool

Examples

The diskutil verifyVolume and repairVolume actions are analogous to the fsck utility
on Linux systems. Under OS X, the fsck utility is deprecated except when the system
is in single-user mode.
Some of the functions performed by diskutil were handled by disktool in the past.
The first example displays a list of disk devices and volumes available on the local
system:
$ diskutil list
/dev/disk0
#:
type
0: Apple_partition_scheme
1:
Apple_partition_map
2:
Apple_HFS
3:
Apple_HFS
/dev/disk1
#:
type
0: Apple_partition_scheme
1:
Apple_partition_map
2:
Apple_Driver43
3:
Apple_Driver43
4:
Apple_Driver_ATA
5:
Apple_Driver_ATA
6:
Apple_FWDriver
7:
Apple_Driver_IOKit
8:
Apple_Patches
9:
Apple_HFS
10:
Apple_HFS

name

Eva01
Users
name

Spare
House

size
*152.7 GB
31.5 KB
30.7 GB
121.7 GB

identifier
disk0
disk0s1
disk0s3
disk0s5

size
*232.9 GB
31.5 KB
28.0 KB
28.0 KB
28.0 KB
28.0 KB
256.0 KB
256.0 KB
256.0 KB
48.8 GB
184.1 GB

identifier
disk1
disk1s1
disk1s2
disk1s3
disk1s4
disk1s5
disk1s6
disk1s7
disk1s8
disk1s9
disk1s10

The next example displays information about one of the mounted volumes:
$ diskutil info disk1s9
Device Node:
/dev/disk1s9
Device Identifier: disk1s9
Mount Point:
/Volumes/Spare
Volume Name:
Spare
File System:
Owners:
Partition Type:
Bootable:
Media Type:
Protocol:
SMART Status:
UUID:

HFS+
Enabled
Apple_HFS
Is bootable
Generic
FireWire
Not Supported
C77BB3DC-EFBB-30B0-B191-DE7E01D8A563

Total Size:
Free Space:

48.8 GB
48.8 GB

Read Only:
Ejectable:

No
Yes

784 diskutil O

The next example formats the partition at /dev/disk1s8 as an HFS+ Extended (HFSX)
filesystem and labels it Spare2. This command erases all data on the partition:
# diskutil eraseVolume 'Case-sensitive HFS+' Spare2 disk1s8
Started erase on disk disk1s10
Erasing
Mounting Disk
Finished erase on disk disk1s10

The final example shows the output of a successful verifyVolume operation:
$ diskutil verifyVolume disk1s9
Started verify/repair on volume disk1s9 Spare
Checking HFS Plus volume.
Checking Extents Overflow file.
Checking Catalog file.
Checking Catalog hierarchy.
Checking volume bitmap.
Checking volume information.
The volume Spare appears to be OK.
Mounting Disk
Verify/repair finished on volume disk1s9 Spare

ditto O
Copies files and creates and unpacks archives
ditto [options] source-file destination-file
ditto [options] source-file-list destination-directory
ditto –c [options] source-directory destination-archive
ditto –x [options] source-archive-list destination-directory
The ditto utility copies files and their ownership, timestamps, and other attributes, including extended
attributes (page 1044). It can copy to and from cpio and zip archive files, as well as copy ordinary files
and directories. The ditto utility is available under OS X only. O

Arguments

The source-file is the pathname of the file that ditto is to make a copy of. The
destination-file is the pathname that ditto assigns to the resulting copy of the file.
The source-file-list specifies one or more pathnames of files and directories that ditto
makes copies of. The destination-directory is the pathname of the directory that
ditto copies the files and directories into. When you specify a destination-directory,
ditto gives each of the copied files the same simple filename as its source-file.
The source-directory is a single directory that ditto copies into the destination-archive.
The resulting archive holds copies of the contents of source-directory, but not the
directory itself.
The source-archive-list specifies one or more pathnames of archives that ditto extracts
into destination-directory.
Using a hyphen (–) in place of a filename or a directory name causes ditto to read from
standard input or write to standard output instead of reading from or writing to that
file or directory.

Options

You cannot use the –c and –x options together.
–c (create archive) Creates an archive file.

––help

Displays a help message.
–k (pkzip) Uses the zip format, instead of the default cpio (page 758) format, to create
or extract archives. For more information on zip, see the tip on page 65.

––norsrc

(no resource) Ignores extended attributes. This option causes ditto to copy only
data forks (the default behavior under Mac OS X 10.3 and earlier).

––rsrc

(resource) Copies extended attributes, including resource forks (the default
behavior under Mac OS X 10.4 and above). Also –rsrc and –rsrcFork.
–V (very verbose) Sends a line to standard error for each file, symbolic link, and
device node copied by ditto.
–v (verbose) Sends a line to standard error for each directory copied by ditto.

ditto O

ditto O 785

786 ditto O

–X (exclude) Prevents ditto from searching directories in filesystems other than the
filesystems that hold the files it was explicitly told to copy.
–x (extract archive) Extracts files from an archive file.
–z (compress) Uses gzip (page 838) or gunzip to compress or decompress cpio
archives.

Notes

The ditto utility does not copy the locked attribute flag (page 1046). The utility also
does not copy ACLs.
By default ditto creates and reads archives (page 1055) in the cpio (page 758) format.
The ditto utility cannot list the contents of archive files; it can only create or extract
files from archives. Use pax or cpio to list the contents of cpio archives, and use unzip
with the –l option to list the contents of zip files.

Examples

The following examples show three ways to back up a user’s home directory, including
extended attributes (except as mentioned in “Notes”), while preserving timestamps and
permissions. The first example copies Zach’s home directory to the volume (filesystem)
named Backups; the copy is a new directory named zach.0228:
$ ditto /Users/zach /Volumes/Backups/zach.0228

The next example copies Zach’s home directory into a single cpio-format archive file
on the volume named Backups:
$ ditto -c /Users/zach /Volumes/Backups/zach.0228.cpio

The next example copies Zach’s home directory into a zip archive:
$ ditto -c -k /Users/zach /Volumes/Backups/zach.0228.zip

Each of the next three examples restores the corresponding backup archive into
Zach’s home directory, overwriting any files that are already there:
$ ditto /Volumes/Backups/zach.0228 /Users/zach
$ ditto -x /Volumes/Backups/zach.0228.cpio /Users/zach
$ ditto -x -k /Volumes/Backups/zach.0228.zip /Users/zach

The following example copies the Scripts directory to a directory named ScriptsBackups
on the remote host plum. It uses an argument of a hyphen in place of source-directory
locally to write to standard output and in place of destination-directory on the remote
system to read from standard input:
$ ditto -c Scripts - | ssh plum ditto -x - ScriptsBackups

The final example copies the local startup disk (the root filesystem) to the volume
named Backups.root. Because some of the files can be read only by root, the script
must be run by a user with root privileges. The –X option keeps ditto from trying to
copy other volumes (filesystems) that are mounted under /.
# ditto -X / /Volumes/Backups.root

dmesg

787

Displays kernel messages
dmesg [options]
The dmesg utility displays messages stored in the kernel ring buffer.

Options

–c Clears the kernel ring buffer after running dmesg. L
–M core
The core is the name of the (core dump) file to process (defaults to /dev/kmem).
O
–N kernel
The kernel is the pathname of a kernel file (defaults to /mach). If you are displaying
information about a core dump, kernel should be the kernel that was running at
the time the core file was created. O

Discussion

When the system boots, the kernel fills its ring buffer with messages regarding
hardware and module initialization. Messages in the kernel ring buffer are often
useful for diagnosing system problems.

Notes

Under Mac OS X, you must run this utility while working with root privileges.
As a ring buffer, the kernel message buffer keeps the most recent messages it receives,
discarding the oldest messages once it fills up. To save a list of kernel boot messages,
give the following command immediately after booting the system and logging in:
$ dmesg > dmesg.boot

This command saves the kernel messages in the dmesg.boot file. This list can be
educational and quite useful when you are having a problem with the boot process.
Under most Linux systems, after the system boots, the system records much of the
same information as dmesg displays in /var/log/messages or a similar file.

Examples

The following command displays kernel messages in the ring buffer with the string
serial in them, regardless of case:

$ dmesg | grep
[
1.304433]
[
1.329978]
[
1.354473]
[
1.411213]
[
1.411221]
...

-i serial
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: SerialNumber: 0000:02:03.0

dmesg

dmesg

788 dscl O

dscl O
dscl O

Displays and manages Directory Service information
dscl [options] [datasource [command]]
The dscl (Directory Service command line) utility enables you to work with Directory Service directory
nodes. When you call dscl without arguments, it runs interactively. The dscl utility is available under
OS X only. O

Arguments

The datasource is a node name or a Mac OS X Server host specified by a hostname
or IP address. A period (.) specifies the local domain.

Options

–p (prompt) Prompts for a password as needed.
–q (quiet) Does not prompt.
–u user
Authenticates as user.

Commands

Refer to the “Notes” section for definitions of some of the terms used here.
The hyphen (–) before a command is optional.

–list path [key] (also –ls) Lists subdirectories in path, one per line. If you specify key, this command

lists subdirectories that match key.
–read [path [key]]

(also –cat and .) Displays a directory, one property per line.
–readall [path [key]]

Displays properties with a given key.
–search path key value

Displays properties where key matches value.

Notes

When discussing Directory Service, the term directory refers to a collection of data
(a database), not to a filesystem directory. Each directory holds one or more properties. Each property comprises a key–value pair, where there might be more than
one value for a given key. In general, dscl displays a property with the key first,
followed by a colon, and then the value. If there is more than one value, the values
are separated by SPACEs. If a value contains SPACEs, dscl displays the value on the line
following the key.
Under Mac OS X and Mac OS X Server, Open Directory stores information for the
local system in key–value-formatted XML files in the /var/db/dslocal directory
hierarchy.

dscl O 789

The dscl utility is the command-line equivalent of NetInfo Manager (available on
versions of Mac OS X prior to 10.5) or of Workgroup Manager on Mac OS X Server.

Examples

The dscl –list command displays a list of top-level directories when you specify a path
of /:
$ dscl . -list /
AFPServer
AFPUserAliases
Aliases
AppleMetaRecord
Augments
Automount
...
SharePoints
SMBServer
Users
WebServer

The period as the first argument to dscl specifies the local domain as the data source.
The next command displays a list of Users directories:
$ dscl . -list /Users
_amavisd
_appowner
_appserver
_ard
...
_www
_xgridagent
_xgridcontroller
daemon
max
nobody
root

You can use the dscl –read command to display information about a specific user:
$ dscl . -read /Users/root
AppleMetaNodeLocation: /Local/Default
GeneratedUID: FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000
NFSHomeDirectory: /var/root
Password: *
PrimaryGroupID: 0
RealName:
System Administrator
RecordName: root
RecordType: dsRecTypeStandard:Users
UniqueID: 0
UserShell: /bin/sh

The following dscl –readall command lists all usernames and user IDs on the local
system. The command looks for the RecordName and UniqueID keys in the /Users

790 dscl O

directory and displays the associated values. The dscl utility separates multiple values
with SPACEs. See page 1042 for an example of a shell script that calls dscl with the
–readall command.
$ dscl . -readall /Users RecordName UniqueID
RecordName: _amavisd amavisd
UniqueID: 83
RecordName: _appowner appowner
UniqueID: 87
...
RecordName: daemon
UniqueID: 1
RecordName: sam
UniqueID: 501
RecordName: nobody
UniqueID: -2
RecordName: root
UniqueID: 0

The following example uses the dscl –search command to display all properties where
the key RecordName equals sam:
$ dscl . -search / RecordName sam
Users/sam
RecordName = (
sam
)

du 791

du
du [options] [path-list]
The du (disk usage) utility reports how much disk space is occupied by a directory hierarchy or a file.
By default du displays the number of 1,024-byte blocks occupied by the directory hierarchy or file.

Arguments

Without any arguments, du displays information about the working directory and its
subdirectories. The path-list specifies the directories and files du displays information
about.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.
Without any options, du displays the total storage used for each argument in path-list.
For directories, du displays this total after recursively listing the totals for each
subdirectory.

––all

–a Displays the space used by all ordinary files along with the total for each
directory.

––block-size=sz

–B sz
The sz argument specifies the units the report uses. It is a multiplicative suffix
from Table VI-1 on page 717. See also the –H (––si) and –h (––human-readable)
options. L

––total

–c Displays a grand total at the end of the output.

––dereference-args

–D (partial dereference) For each file that is a symbolic link, reports on the file the
link points to, not the symbolic link itself. This option affects files specified on
the command line; it does not affect files found while descending a directory
hierarchy. This option treats files that are not symbolic links normally. See
page 115 for information on dereferencing symbolic links. L
–d depth
Displays information for subdirectories to a level of depth directories. O
––si

(human readable) Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte)
blocks, as appropriate. Uses powers of 1,000. L
–H (partial dereference) For each file that is a symbolic link, reports on the file the
link points to, not the symbolic link itself. This option affects files specified on
the command line; it does not affect files found while descending a directory
hierarchy. This option treats files that are not symbolic links normally. See
page 115 for information on dereferencing symbolic links. O

du

Displays information on disk usage by directory hierarchy and/or file

792 du
––human-readable

–h Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as
appropriate. Uses powers of 1,024.
–k Displays sizes in 1-kilobyte blocks.
––dereference

–L For each file that is a symbolic link, reports on the file the link points to, not
the symbolic link itself. This option affects all files and treats files that are not
symbolic links normally. The default is –P (––no-dereference). See page 115 for
information on dereferencing symbolic links.
–m Displays sizes in 1-megabyte blocks.

––no-dereference –P For each file that is a symbolic link, reports on the symbolic link, not the file the

link points to. This option affects all files and treats files that are not symbolic
links normally. This behavior is the default. See page 115 for information on
dereferencing symbolic links.
––summarize

–s Displays only the total size for each directory or file you specify on the command
line; subdirectory totals are not displayed.

––one-file-system –x Reports only on files and directories on the same filesystem as that of the argument

being processed.

Examples

In the first example, du displays size information about subdirectories in the working
directory. The last line contains the total for the working directory and its
subdirectories.
$ du
26
4
47
4
12
105

./Postscript
./RCS
./XIcon
./Printer/RCS
./Printer
.

The total (105) is the number of blocks occupied by all plain files and directories
under the working directory. All files are counted, even though du displays only the
sizes of directories.
If you do not have read permission for a file or directory that du encounters, du
sends a warning to standard error and skips that file or directory. Next, using the
–s (summarize) option, du displays the total for each of the directories in /usr but
does not display information for subdirectories:
$ du -s
4
260292
10052
7772
1720468
105240
0

/usr/*
/usr/X11R6
/usr/bin
/usr/games
/usr/include
/usr/lib
/usr/lib32
/usr/lib64

du 793
du: cannot read directory `/usr/local/lost+found': Permission denied
...
130696 /usr/src

When you add the –c (total) option to the preceding example, du displays the same
listing with a total at the end:
$ du -sc /usr/*
4
/usr/X11R6
260292 /usr/bin
...
130696 /usr/src
3931436 total

The following example uses the –s (summarize), –h (human-readable), and –c (total)
options:
$ du -shc /usr/*
4.0K
/usr/X11R6
255M
/usr/bin
9.9M
/usr/games
7.6M
/usr/include
1.7G
/usr/lib
103M
/usr/lib32
...
128M
/usr/src
3.8G
total

The final example displays, in human-readable format, the total size of all files the
user can read in the /usr filesystem. Redirecting standard error to /dev/null discards
all warnings about files and directories that are unreadable.
$ du -hs /usr 2>/dev/null
3.8G
/usr

794 echo

echo
echo

Displays a message
echo [options] message
The echo utility copies its arguments, followed by a NEWLINE, to standard output. Both the Bourne Again
and TC Shells have their own echo builtin that works similarly to the echo utility.

Arguments

The message consists of one or more arguments, which can include quoted strings,
ambiguous file references, and shell variables. A SPACE separates each argument from the
next. The shell recognizes unquoted special characters in the arguments. For example,
the shell expands an asterisk into a list of filenames in the working directory.

Options

You can configure the tcsh echo builtin to treat backslash escape sequences and the
–n option in different ways. Refer to echo_style in the tcsh man page. The typical tcsh
configuration recognizes the –n option, enables backslash escape sequences, and
ignores the –E and –e options.
–E Suppresses the interpretation of backslash escape sequences such as \n. Available
with the bash builtin version of echo only.
–e Enables the interpretation of backslash escape sequences such as \n. Available
with the bash builtin version of echo only.
Gives a short summary of how to use echo. The summary includes a list of the
backslash escape sequences interpreted by echo. This option works only with
the echo utility; it does not work with the echo builtins. L

––help

–n Suppresses the NEWLINE terminating the message.

Notes

Suppressing the interpretation of backslash escape sequences is the default behavior
of the bash builtin version of echo and of the echo utility.
You can use echo to send messages to the screen from a shell script. See page 149 for
a discussion of how to use echo to display filenames using wildcard characters.
The echo utility and builtins provide an escape notation to represent certain nonprinting
characters in message (Table VI-16). You must use the –e option for these backslash
escape sequences to work with the echo utility and the bash echo builtin. Typically you
do not need the –e option with the tcsh echo builtin.

Table VI-16

Backslash escape sequences

Sequence

Meaning

\a

Bell

\c

Suppress trailing NEWLINE

echo 795

Table VI-16

Examples

Backslash escape sequences (continued)

Sequence

Meaning

\n

NEWLINE

\t

HORIZONTAL TAB

\v

VERTICAL TAB

\\

BACKSLASH

Following are some echo commands. These commands will work with the echo utility
(/bin/echo) and the bash and tcsh echo builtins, except for the last, which might not
need the –e option under tcsh.
$ echo "This command displays a string."
This command displays a string.
$ echo -n "This displayed string is not followed by a NEWLINE."
This displayed string is not followed by a NEWLINE.$ echo hi
hi
$ echo -e "This message contains\v a vertical tab."
This message contains
a vertical tab.
$

The following examples contain messages with the backslash escape sequence \c. In
the first example, the shell processes the arguments before calling echo. When the
shell sees the \c, it replaces the \c with the character c. The next three examples show
how to quote the \c so that the shell passes it to echo, which then does not append a
NEWLINE to the end of the message. The first four examples are run under bash and
require the –e option. The final example runs under tcsh, which might not need this
option.
$ echo -e There is a newline after this line.\c
There is a newline after this line.c
$ echo -e 'There is no newline after this line.\c'
There is no newline after this line.$
$ echo -e "There is no newline after this line.\c"
There is no newline after this line.$
$ echo -e There is no newline after this line.\\c
There is no newline after this line.$
$ tcsh
tcsh $ echo -e 'There is no newline after this line.\c'
There is no newline after this line.$

You can use the –n option in place of –e and \c.

796 expand/unexpand

expand/unexpand

expand/unexpand
Converts TABs to SPACEs and SPACEs to TABs
expand [option] [file-list]
unexpand [option] [file-list]
The expand utility converts TABs to SPACEs and the unexpand utility converts SPACEs to TABs.

Arguments

The expand utility reads files from file-list and converts all TABs on each line to SPACEs,
assuming TAB stops are eight spaces apart.
The unexpand utility reads files from file-list and converts all SPACEs at the beginning
of each line to TABs. It stops converting for a line when it reads the first character on
a line that is not a SPACE or a TAB.
If you do not specify a filename or if you specify a hyphen (–) in place of a filename,
expand/unexpand reads from standard input.

Options

The expand/unexpand utilities accept the common options described on page 718.

––all

–a On each line, converts all SPACEs and TABs, not just the initial ones (unexpand only).

––first-only

On each line, stops converting SPACEs after reading the first character that is not
a SPACE or a TAB (unexpand only). Overrides the ––all option.

––initial

–i On each line, stops converting TABs after reading the first character that is not a
SPACE or a TAB (expand only).

––tabs=num | list

–t num | list
Specifies num as the number of SPACEs per TAB stop. With list, specifies each TAB
stop as the number of characters from the left margin. By default TAB stops are
eight characters apart. With unexpand, implies the ––all option.

Examples

The following examples show how expand works. All blanks in the tabs.only file are
single TABs. The ––show-tabs option causes cat to display TABs as ^I and SPACEs as SPACEs.
$ cat tabs.only
>>
>>
>>

>>

x

$ cat --show-tabs tabs.only
>>^I>>^I>>^I>>^Ix

The expand ––tabs=2 option specifies two SPACEs per
specifies TAB stops at columns 20, 24, 30, and 36.

TAB

stop; ––tabs=20,24,30,36

expand/unexpand 797
$ expand --tabs=2 tabs.only | cat --show-tabs
>> >> >> >> x
$ expand --tabs=20,24,30,36 tabs.only | cat --show-tabs
>>
>> >>
>>
x

Next, unexpand converts each group of eight SPACEs to a TAB. All blanks in the spaces.only
file are multiple SPACEs. Because unexpand converts TABs the same way the terminal driver
does (TAB stops every eight characters), the output from cat and unexpand with the –a
option appear the same.
$ cat spaces.only
>>
$ unexpand -a spaces.only
>>

>>

>>

>> x

>>

>>

>> x

When you send the output of unexpand with the –a option through a pipeline to cat
with the ––show-tabs option, you can see where unexpand put the TABs.
$ unexpand -a spaces.only | cat --show-tabs
^I^I >>^I^I
>> >>^I>> x

798 expr

expr
Evaluates an expression

expr

expr expression
The expr utility evaluates an expression and sends the result to standard output. It evaluates character
strings that represent either numeric or nonnumeric values. Operators are used with the strings to form
expressions.

Arguments

The expression is composed of strings interspersed with operators. Each string and
operator constitute a distinct argument that you must separate from other arguments
with a SPACE. You must quote operators that have special meanings to the shell (for
example, the multiplication operator, *).
The following list of expr operators is given in order of decreasing precedence. Each
operator within a group of operators has the same precedence. You can change the
order of evaluation by using parentheses.

:

(comparison) Compares two strings, starting with the first character in each
string and ending with the last character in the second string. The second string
is a regular expression with an implied caret (^) as its first character. If expr
finds a match, it displays the number of characters in the second string. If
expr does not find a match, it displays a zero.

*

(multiplication)
(division)
(remainder)
Work only on strings that contain the numerals 0 through 9 and optionally a
leading minus sign. Convert strings to integer numbers, perform the specified
arithmetic operation on numbers, and convert the result back to a string before
sending it to standard output.

+
–

(addition)
(subtraction)
Function in the same manner as the preceding group of operators.

/
%

<
<=
= or ==
!=
>=
>

(less than)
(less than or equal to)
(equal to)
(not equal to)
(greater than or equal to)
(greater than)
Relational operators work on both numeric and nonnumeric arguments. If
one or both of the arguments are nonnumeric, the comparison is nonnumeric,
using the machine collating sequence (typically ASCII). If both arguments are
numeric, the comparison is numeric. The expr utility displays a 1 (one) if the
comparison is true and a 0 (zero) if it is false.

expr 799

Notes

&

(AND) Evaluates both of its arguments. If neither is 0 or a null string, expr displays
the value of the first argument. Otherwise, it displays a 0 (zero). You must quote
this operator.

|

(OR) Evaluates the first argument. If it is neither 0 nor a null string, expr displays
the value of the first argument. Otherwise, it displays the value of the second
argument. You must quote this operator.

The expr utility returns an exit status of 0 (zero) if the expression evaluates to anything
other than a null string or the number 0, a status of 1 if the expression is null or 0, and
a status of 2 if the expression is invalid.
Although expr and this discussion distinguish between numeric and nonnumeric
arguments, all arguments to expr are nonnumeric (character strings). When applicable, expr attempts to convert an argument to a number (for example, when using
the + operator). If a string contains characters other than 0 through 9 and optionally a leading minus sign, expr cannot convert it. Specifically, if a string contains a
plus sign or a decimal point, expr considers it to be nonnumeric. If both arguments
are numeric, the comparison is numeric. If one is nonnumeric, the comparison is
lexicographic.

Examples

In the following examples, expr evaluates constants. You can also use expr to evaluate
variables in a shell script. The fourth command displays an error message because of
the illegal decimal point in 5.3:
$ expr 17 + 40
57
$ expr 10 - 24
-14
$ expr -17 + 20
3
$ expr 5.3 + 4
expr: non-numeric argument

The multiplication (*), division (/), and remainder (%) operators provide additional
arithmetic power. You must quote the multiplication operator (precede it with a
backslash) so that the shell will not treat it as a special character (an ambiguous file
reference). You cannot put quotation marks around the entire expression because
each string and operator must be a separate argument.
$ expr 5 \* 4
20
$ expr 21 / 7
3
$ expr 23 % 7
2

800 expr

The next two examples show how parentheses change the order of evaluation. You
must quote each parenthesis and surround the backslash/parenthesis combination
with SPACEs:
$ expr 2 \* 3 + 4
10
$ expr 2 \* \( 3 + 4 \)
14

You can use relational operators to determine the relationship between numeric or
nonnumeric arguments. The following commands compare two strings to see if they
are equal; expr displays a 0 when the relationship is false and a 1 when it is true.
$ expr fred == sam
0
$ expr sam == sam
1

In the following examples, the relational operators, which must be quoted, establish
order between numeric or nonnumeric arguments. Again, if a relationship is true,
expr displays a 1.
$ expr fred \> sam
0
$ expr fred \< sam
1
$ expr 5 \< 7
1

The next command compares 5 with m. When one of the arguments expr is comparing
with a relational operator is nonnumeric, expr considers the other to be nonnumeric.
In this case, because m is nonnumeric, expr treats 5 as a nonnumeric argument. The
comparison is between the ASCII (on many systems) values of m and 5. The ASCII
value of m is 109 and that of 5 is 53, so expr evaluates the relationship as true.
$ expr 5 \< m
1

In the next example, the matching operator determines that the four characters in the
second string match the first four characters in the first string. The expr utility displays the number of matching characters (4).
$ expr abcdefghijkl : abcd
4

The & operator displays a 0 if one or both of its arguments are 0 or a null string;
otherwise, it displays the first argument.
$ expr '' \& book
0
$ expr magazine \& book
magazine

expr 801
$ expr 5 \& 0
0
$ expr 5 \& 6
5

The | operator displays the first argument if it is not 0 or a null string; otherwise, it
displays the second argument.
$ expr '' \| book
book
$ expr magazine \| book
magazine
$ expr 5 \| 0
5
$ expr 0 \| 5
5
$ expr 5 \| 6
5

802 file

file
Displays the classification of a file

file

file [option] file-list
The file utility classifies files according to their contents.

Arguments

The file-list is a list of the pathnames of one or more files that file classifies. You can
specify any kind of file, including ordinary, directory, and special files, in the file-list.

Options
The Mac OS X version of file accepts long options
tip Options for file preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––files-from=file

–f file
Takes the names of files to be examined from file rather than from file-list on
the command line. The names of the files must be listed one per line in file.

––no-dereference –h For each file that is a symbolic link, reports on the symbolic link, not the file the

link points to. This option treats files that are not symbolic links normally. This
behavior is the default on systems where the environment variable
POSIXLY_CORRECT is not defined (typical). See page 115 for information on
dereferencing symbolic links.
––help

Displays a help message.

––mime

–I Displays MIME (page 1080) type strings. O

––mime

–i Displays MIME (page 1080) type strings. L
–i (ignore) Does not display regular files. O

––dereference

–L For each file that is a symbolic link, reports on the file the link points to, not the
symbolic link itself. This option treats files that are not symbolic links normally.
This behavior is the default on systems where the environment variable
POSIXLY_CORRECT is defined. See page 115 for information on dereferencing
symbolic links.

––uncompress

–z (zip) Attempts to classify files within a compressed file.

Notes

The file utility can classify more than 5,000 file types. Some of the more common file
types found on Linux systems, as displayed by file, follow:

file 803
archive
ascii text
c program text
commands text
core file
cpio archive
data
directory
ELF 32-bit LSB executable
empty
English text
executable

The file utility uses a maximum of three tests in its attempt to classify a file: filesystem,
magic number, and language tests. When file identifies the type of a file, it ceases testing. The filesystem test examines the value returned by a stat() system call to see
whether the file is empty or a special file. The magic number (page 1079) test looks
for data in particular fixed formats near the beginning of the file. The language test,
if needed, determines whether the file is a text file, which encoding it uses, and which
language it is written in. Refer to the file man page for a more detailed description of
how file works. The results of file are not always correct.

Examples

Some examples of file identification follow:

/etc/Muttrc:
ASCII English text
/etc/Muttrc.d:
directory
/etc/adjtime:
ASCII text
/etc/aliases.db:
Berkeley DB (Hash, version 9, native byte-order)
/etc/at.deny:
writable, regular file, no read permission
/etc/bash_completion: UTF-8 Unicode English text, with very long lines
/etc/blkid.tab.old:
Non-ISO extended-ASCII text, with CR, LF line terminators
/etc/brltty.conf:
UTF-8 Unicode C++ program text
/etc/chatscripts:
setgid directory
/etc/magic:
magic text file for file(1) cmd
/etc/motd:
symbolic link to '/var/run/motd'
/etc/qemu-ifup:
POSIX shell script text executable
/usr/bin/4xml:
a python script text executable
/usr/bin/Xorg:
setuid executable, regular file, no read permission
/usr/bin/debconf:
a /usr/bin/perl -w script text executable
/usr/bin/locate:
symbolic link to '/etc/alternatives/locate'
/usr/share/man/man7/term.7.gz: gzip compressed data, from Unix, max compression

804 find

find
Finds files based on criteria

find

find [directory-list] [option] [expression]
The find utility selects files that are located in specified directory hierarchies and that meet specified criteria.

Arguments

The directory-list specifies the directory hierarchies that find is to search. When you
do not specify a directory-list, find searches the working directory hierarchy.
The option controls whether find dereferences symbolic links as it descends directory
hierarchies. By default find does not dereference symbolic links (it works with the
symbolic link, not the file the link points to). Under Mac OS X, you can use the –x
option to prevent find from searching directories in filesystems other than those specified in directory-list. Under Linux, the –xdev criterion performs the same function.
The expression contains criteria, as described in the “Criteria” section. The find utility
tests each of the files in each of the directories in the directory-list to see whether it
meets the criteria described by the expression. When you do not specify an expression,
the expression defaults to –print.
A SPACE separating two criteria is a Boolean AND operator: The file must meet both
criteria to be selected. A –or or –o separating the criteria is a Boolean OR operator:
The file must meet one or the other (or both) of the criteria to be selected.
You can negate any criterion by preceding it with an exclamation point. The find utility
evaluates criteria from left to right unless you group them using parentheses.
Within the expression you must quote special characters so the shell does not interpret
them but rather passes them to find. Special characters that are frequently used with find
include parentheses, brackets, question marks, and asterisks.
Each element within the expression is a separate argument. You must separate
arguments from each other with SPACEs. A SPACE must appear on both sides of each
parenthesis, exclamation point, criterion, or other element.

Options

–H (partial dereference) For each file that is a symbolic link, works with the file the
link points to, not the symbolic link itself. This option affects files specified on
the command line; it does not affect files found while descending a directory
hierarchy. This option treats files that are not symbolic links normally. See
page 115 for information on dereferencing symbolic links.
–L (dereference) For each file that is a symbolic link, works with the file the link
points to, not the symbolic link itself. This option affects all files and treats files
that are not symbolic links normally. See page 115 for information on dereferencing symbolic links.
–P (no dereference) For each file that is a symbolic link, works with the symbolic
link, not the file the link points to. This option affects all files and treats files

find 805

that are not symbolic links normally. This behavior is the default. See page 115
for information on dereferencing symbolic links.
–x Causes find not to search directories in filesystems other than the one(s) specified
by directory-list. O
––xdev

Criteria

Causes find not to search directories in filesystems other than the one(s) specified
by directory-list. L
You can use the following criteria within the expression. As used in this list, ±n is a decimal integer that can be expressed as +n (more than n), –n (fewer than n), or n (exactly n).
–anewer filename
(accessed newer) The file being evaluated meets this criterion if it was accessed
more recently than filename.
–atime ±n
(access time) The file being evaluated meets this criterion if it was last accessed
±n days ago. When you use this option, find changes the access times of directories
it searches.
–depth
The file being evaluated always meets this action criterion. It causes find to take
action on entries in a directory before it acts on the directory itself. When you
use find to send files to the cpio utility, the –depth criterion enables cpio to preserve the modification times of directories when you restore files (assuming you
use the ––preserve–modification–time option to cpio). See the “Discussion” and
“Examples” sections under cpio on pages 761 and 762.
–exec command \;
The file being evaluated meets this action criterion if the command returns a 0 (zero
[true]) exit status. You must terminate the command with a quoted semicolon. The
find utility replaces a pair of braces ({}) within the command with the name of the
file being evaluated. You can use the –exec action criterion at the end of a group of
other criteria to execute the command if the preceding criteria are met. Refer to the
following “Discussion” section for more information. See the section on xargs on
page 1005 for a more efficient way of doing what this option does.
–group name
The file being evaluated meets this criterion if it is associated with the group
named name. You can use a numeric group ID in place of name.
–inum n
The file being evaluated meets this criterion if its inode number is n.
–links ±n
The file being evaluated meets this criterion if it has ±n links.
–mtime ±n
(modify time) The file being evaluated meets this criterion if it was last modified
±n days ago.

806 find

–name filename
The file being evaluated meets this criterion if the pattern filename matches its
name. The filename can include wildcard characters (*, ?, and []) but these
characters must be quoted.
–newer filename
The file being evaluated meets this criterion if it was modified more recently
than filename.
–nogroup
The file being evaluated meets this criterion if it does not belong to a group
known on the local system.
–nouser
The file being evaluated meets this criterion if it does not belong to a user known
on the local system.
–ok command \;
This action criterion is the same as –exec except it displays each command to
be executed enclosed in angle brackets as a prompt and executes the command
only if it receives a response that starts with a y or Y from standard input.
–perm [±]mode
The file being evaluated meets this criterion if it has the access permissions given
by mode. If mode is preceded by a minus sign (–), the file access permissions
must include all the bits in mode. For example, if mode is 644, a file with 755
permissions will meet this criterion. If mode is preceded by a plus sign (+), the
file access permissions must include at least one of the bits in mode. If no plus
or minus sign precedes mode, the mode of the file must exactly match mode. You
may use either a symbolic or octal representation for mode (see chmod on
page 741).
–print
The file being evaluated always meets this action criterion. When evaluation of
the expression reaches this criterion, find displays the pathname of the file it is
evaluating. If –print is the only criterion in the expression, find displays the names
of all files in the directory-list. If this criterion appears with other criteria, find
displays the name only if the preceding criteria are met. If no action criteria
appear in the expression, –print is assumed. (Refer to the following “Discussion”
and “Notes” sections.)
–size ±n[c|k|M|G]
The file being evaluated meets this criterion if it is the size specified by ±n,
measured in 512-byte blocks. Follow n with the letter c to measure files in
characters, k to measure files in kilobytes, M to measure files in megabytes,
or G to measure files in gigabytes.
–type filetype
The file being evaluated meets this criterion if its file type is specified by filetype.
Select a filetype from the following list:

find 807

b
c
d
f
l
p
s

Block special file
Character special file
Directory file
Ordinary file
Symbolic link
FIFO (named pipe)
Socket

–user name
The file being evaluated meets this criterion if it belongs to the user with the
username name. You can use a numeric user ID in place of name.
–xdev
The file being evaluated always meets this action criterion. It prevents find from
searching directories in filesystems other than the one specified by directory-list.
Also –mount. L
–x The file being evaluated always meets this action criterion. It prevents find from
searching directories in filesystems other than the one specified by directory-list.
Also –mount. O

Discussion

Assume x and y are criteria. The following command line never tests whether the file
meets criterion y if it does not meet criterion x. Because the criteria are separated by
a SPACE (the Boolean AND operator), once find determines that criterion x is not met,
the file cannot meet the criteria so find does not continue testing. You can read the
expression as “(test to see whether) the file meets criterion x and [SPACE means and]
criterion y.”
$ find dir x y

The next command line tests the file against criterion y if criterion x is not met. The
file can still meet the criteria so find continues the evaluation. You can read the expression as “(test to see whether) the file meets criterion x or criterion y.” If the file meets
criterion x, find does not evaluate criterion y as there is no need to do so.
$ find dir x -or y
Action criteria

Certain “criteria” do not select files but rather cause find to take action. The action
is triggered when find evaluates one of these action criteria. Therefore, the position
of an action criterion on the command line—not the result of its evaluation—determines whether find takes the action.
The –print action criterion causes find to display the pathname of the file it is testing.
The following command line displays the names of all files in the dir directory (and
all its subdirectories), regardless of how many links they have:
$ find dir -print -links +1

The following command line displays the names of only those files in the dir directory
that have more than one link:
$ find dir -links +1 -print

808 find

This use of –print after the testing criteria is the default action criterion. The following
command line generates the same output as the preceding one:
$ find dir -links +1

Notes

You can use the –a (or –and) operator between criteria to improve clarity. This operator
is a Boolean AND operator, just as the SPACE is.
You might want to consider using pax (page 907) in place of cpio. OS X users might
want to use ditto (page 785).

Examples

The simplest find command has no arguments and lists the files in the working directory
and all subdirectories:
$ find
...

The following command finds and displays the pathnames of files in the working
directory and subdirectories that have filenames beginning with a. The command uses
a period to designate the working directory. To prevent the shell from interpreting the
a* as an ambiguous file reference, it is enclosed within single quotation marks.
$ find . -name 'a*' -print

If you omit the directory-list argument, find searches the working directory. The
next command performs the same function as the preceding one without explicitly
specifying the working directory or the –print criterion:
$ find -name 'a*'

The next command sends a list of selected filenames to the cpio utility, which writes
them to the device mounted on /dev/sde1. The first part of the command line ends
with a pipe symbol, so the shell expects another command to follow and displays a
secondary prompt (>) before accepting the rest of the command line. You can read
this find command as “find, in the root directory and all subdirectories ( /), ordinary
files (–type f) that have been modified within the past day (–mtime –1), with the
exception of files whose names are suffixed with .o (! –name '*.o').” (An object file
carries a .o suffix and usually does not need to be preserved because it can be
re-created from the corresponding source file.)
$ find / -type f -mtime -1 ! -name '*.o' -print |
> cpio -oB > /dev/sde1

The following command finds, displays the filenames of, and deletes the files named
core or junk in the working directory and its subdirectories:
$ find . \( -name core -o -name junk \) -print -exec rm {} \;
...

The parentheses and the semicolon following –exec are quoted so the shell does not
treat them as special characters. SPACE s separate the quoted parentheses from other
elements on the command line. Read this find command as “find, in the working
directory and subdirectories (.), files named core (–name core) or (–o) junk (–name

find 809

junk) [if a file meets these criteria, continue] and (SPACE ) print the name of the file
(–print) and (SPACE) delete the file (–exec rm {}).”
The following shell script uses find in conjunction with grep to identify files that
contain a particular string. This script enables you to look for a file when you
remember its contents but cannot remember its filename. The finder script locates
files in the working directory and subdirectories that contain the string specified
on the command line. The –type f criterion causes find to pass to grep only the
names of ordinary files, not directory files.
$ cat finder
find . -type f -exec grep -l "$1" {} \;
$ finder "Executive Meeting"
./january/memo.0102
./april/memo.0415

When called with the string Executive Meeting, finder locates two files containing that
string: ./january/memo.0102 and ./april/memo.0415. The period (.) in the pathnames
represents the working directory; january and april are subdirectories of the working
directory. The grep utility with the ––recursive option performs the same function as
the finder script.
The next command looks in two user directories for files that are larger than 100
blocks (–size +100) and have been accessed only more than five days ago—that is,
files that have not been accessed within the past five days (–atime +5). This find
command then asks whether you want to delete the file (–ok rm { }). You must
respond to each query with y (for yes) or n (for no). The rm command works only
if you have write and execute access permissions to the directory holding the file.
$ find /home/max /home/hls -size +100 -atime +5 -ok rm {} \;
< rm ... /home/max/notes >? y
< rm ... /home/max/letter >? n
...

In the next example, /home/sam/track/memos is a symbolic link to the directory
named /home/sam/memos. When you use the –L option, find dereferences (follows)
the symbolic link and searches that directory.
$ ls -l /home/sam/track
lrwxrwxrwx. 1 sam pubs
15 04-12 10:35 memos -> /home/sam/memos
-rw-r--r--. 1 sam pubs 12753 04-12 10:34 report
$ find /home/sam/track
/home/sam/track
/home/sam/track/memos
/home/sam/track/report
$ find -L /home/sam/track
/home/sam/track
/home/sam/track/memos
/home/sam/track/memos/memo.710
/home/sam/track/memos/memo.817
/home/sam/track/report

810 finger

finger
finger

Displays information about users
finger [options] [user-list]
The finger utility displays the usernames of users, together with their full names, terminal device
numbers, times they logged in, and other information. The options control how much information
finger displays, and the user-list specifies which users finger displays information about. The finger
utility can retrieve information from both local and remote systems.

Arguments

Without any arguments, finger provides a short (–s) report on users who are logged
in on the local system. When you specify a user-list, finger provides a long (–l) report
on each user in the user-list. Names in the user-list are not case sensitive.
If the name includes an at sign (@), the finger utility interprets the name following the
@ as the name of a remote host to contact over the network. If a username appears in
front of the @, finger provides information about that user on the remote system.

Options

–l (long) Displays detailed information (the default display when you specify user-list).
–m (match) If a user-list is specified, displays entries only for those users whose username matches one of the names in user-list. Without this option the user-list
names match usernames and full names.
–p (no plan, project, or pgpkey) Does not display the contents of .plan, .project,
and .pgpkey files for users. Because these files might contain backslash escape
sequences that can change the behavior of the screen, you might not wish to
view them. Normally the long listing displays the contents of these files if they
exist in the user’s home directory.
–s (short) Provides a short report for each user (the default display when you do
not specify user-list).

Discussion

The long report provided by the finger utility includes the user’s username, full name,
home directory location, and login shell, plus information about when the user last
logged in and how long it has been since the user last typed on the keyboard and read
her email. After extracting this information from system files, finger displays the contents
of the ~/.plan, ~/.project, and ~/.pgpkey files in the user’s home directory. It is up to each
user to create and maintain these files, which usually provide more information about the
user (such as telephone number, postal mail address, schedule, interests, and PGP key).
The short report generated by finger is similar to that provided by the w utility; it
includes the user’s username, his full name, the device number of the user’s terminal,
the amount of time that has elapsed since the user last typed on the terminal keyboard,

finger 811

the time the user logged in, and the location of the user’s terminal. If the user logged
in over the network, finger displays the name of the remote system.

Notes

Not all Linux distributions install finger by default.
When you specify a network address, the finger utility queries a standard network service that runs on the remote system. Although this service is supplied with most Linux
systems, some administrators choose not to run it (to minimize the load on their systems, eliminate possible security risks, or simply maintain privacy). If you try to use
finger to get information on someone at such a site, the result might be an error message or nothing at all. The remote system determines how much information to share
with the local system and in which format. As a result the report displayed for any
given system might differ from the examples shown in this section. See also “finger:
Lists Users on the System” on page 71.
A file named ~/.nofinger causes finger to deny the existence of the person in whose
home directory it appears. For this subterfuge to work, the finger query must originate
from a system other than the local host and the fingerd daemon must be able to see the
.nofinger file (generally the home directory must have its execute bit for other users set).

Examples

The first example displays information on the users logged in on the local system:
$ finger
Login
max
hls
sam

Name
Max Wild
Helen Simpson
Sam the Great

Tty
Idle
tty1
13:29
pts/1 13:29
pts/2

*

Login
Jun 25
Jun 25
Jun 26

Time Office Office Phone
21:03
21:02 (:0)
07:47 (plum.example.com)

The asterisk (*) in front of the name of Helen’s terminal (TTY) line indicates she has
blocked others from sending messages directly to her terminal (see mesg on page 75).
A long report displays the string messages off for users who have disabled messages.
The next two examples cause finger to contact the remote system named guava over
the network for information:
$ finger
[guava]
Login
max
roy

@guava
Name
Max Wild
Roy Wong

Tty
tty1
pts/2

Idle Login Time
Office
23:15 Jun 25 11:22
Jun 25 11:22

$ finger max@guava
[guava]
Login: max
Name: Max Wild
Directory: /home/max
Shell: /bin/zsh
On since Sat Jun 23 11:22 (PDT) on tty1, idle 23:22
Last login Sun Jun 24 06:20 (PDT) on ttyp2 from speedy
Mail last read Thu Jun 21 08:10 2012 (PDT)
Plan:
For appointments contact Sam the Great, x1963.

Office Phone

812 fmt

fmt
Formats text very simply

fmt

fmt [option] [file-list]
The fmt utility performs simple text formatting by attempting to make all nonblank lines nearly the
same length.

Arguments

The fmt utility reads the files in file-list and sends a formatted version of their contents
to standard output. If you do not specify a filename or if you specify a hyphen (–) in
place of a filename, fmt reads from standard input.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––split-only

–s Splits long lines but does not fill short lines. L
–s Replaces whitespace (multiple adjacent SPACE and/or TAB characters) with a single
SPACE. O

––tagged-paragraph

–t Indents all but the first line of each paragraph. L
–t n
Specifies n as the number of SPACEs per TAB stop. The default is eight. O
––uniform-spacing

–u Changes the formatted output so that one SPACE appears between words and two
SPACEs appear between sentences. L
––width=n –w n

Changes the output line length to n characters. Without this option, fmt keeps
output lines close to 75 characters wide. You can also specify this option as –n.

Notes

The fmt utility works by moving NEWLINE characters. The indention of lines, as well as
the spacing between words, is left intact.
You can use fmt to format text while you are using an editor, such as vim. For
example, you can format a paragraph with the vim editor in command mode by
positioning the cursor at the top of the paragraph and then entering !}fmt –60.
This command replaces the paragraph with the output generated by feeding it
through fmt, specifying a width of 60 characters. Type u immediately if you want
to undo the formatting.

fmt

Examples

813

The following example shows how fmt attempts to make all lines the same length. The
–w 50 option gives a target line length of 50 characters.
$ cat memo
One factor that is important to remember while administering the dietary
intake of Charcharodon carcharias is that there is, at least from
the point of view of the subject,
very little
differentiating the prepared morsels being proffered from your digits.
In other words, don't feed the sharks!
$ fmt -w 50 memo
One factor that is important to remember while
administering the dietary intake of Charcharodon
carcharias is that there is, at least from the
point of view of the subject, very little
differentiating the prepared morsels being
proffered from your digits.
In other words, don't feed the sharks!

The next example demonstrates the ––split-only option. Long lines are broken so
that none is longer than 50 characters; this option prevents fmt from filling short lines.
$ fmt -w 50 --split-only memo
One factor that is important to remember while
administering the dietary
intake of Charcharodon carcharias is that there
is, at least from
the point of view of the subject,
very little
differentiating the prepared morsels being
proffered from your digits.
In other words, don't feed the sharks!

814 fsck

fsck
Checks and repairs a filesystem

fsck

fsck [options] [filesystem-list]
The fsck utility verifies the integrity of a filesystem and reports on and optionally repairs problems it
finds. It is a front end for filesystem checkers, each of which is specific to a filesystem type. Although
fsck is present on a Macintosh, the diskutil front end is typically used to call it; see “Notes.” L

Arguments

Without the –A option and with no filesystem-list, fsck checks the filesystems listed in
the /etc/fstab file one at a time (serially). With the –A option and with no filesystemlist, fsck checks the filesystems listed in the /etc/fstab file in parallel if possible. See the
–s option for a discussion of checking filesystems in parallel.
The filesystem-list specifies the filesystems to be checked. It can either specify the
name of the device that holds the filesystem (e.g., /dev/sda2) or, if the filesystem
appears in /etc/fstab, specify the mount point (e.g., /usr) for the filesystem. The
filesystem-list can also specify the label for the filesystem from /etc/fstab (e.g.,
LABEL=home) or the UUID specifier (e.g., UUID=397df592-6e...).

Options

When you run fsck, you can specify both global options and options specific to the
filesystem type that fsck is checking (e.g., ext2/ext3/ext4, msdos, vfat). Global options
must precede type-specific options.

Global Options
–A (all) Processes all filesystems listed in the /etc/fstab file, in parallel if possible.
See the –s option for a discussion of checking filesystems in parallel. Do not
specify a filesystem-list when you use this option; you can specify filesystem
types to be checked with the –t option. Use this option with the –a, –p, or –n
option so fsck does not attempt to process filesystems in parallel interactively (in
which case you would have no way of responding to its multiple prompts).
–N (no) Assumes a no response to any questions that arise while processing a
filesystem. This option generates the messages you would normally see but
causes fsck to take no action.
–R (root-skip) With the –A option, does not check the root filesystem. This option
is useful when the system boots, because the root filesystem might be mounted
with read-write access.
–s (serial) Causes fsck to process filesystems one at a time. Without this option, fsck
processes multiple filesystems that reside on separate physical disk drives in parallel. Parallel processing enables fsck to process multiple filesystems more quickly.
This option is required if you want to process filesystems interactively. See the –a,
–p, or –N (or –n, on some filesystems) option to turn off interactive processing.
–T (title) Causes fsck not to display its title.

fsck 815

–t fstype
(filesystem type) A comma-separated list that specifies the filesystem type(s) to
process. With the –A option, fsck processes all the filesystems in /etc/fstab that
are of type fstype. Common filesystem types are ext2/ext3/ext4, msdos, and
vfat. You do not typically check remote NFS filesystems.
–V (verbose) Displays more output, including filesystem type-specific commands.

Filesystem Type-Specific Options
The following command lists the filesystem checking utilities available on the local
system. Files with the same inode numbers are linked (page 112).
$ ls -i /sbin/*fsck*
9961 /sbin/btrfsck
21955 /sbin/fsck.ext2
3452 /sbin/dosfsck
21955 /sbin/fsck.ext3
21955 /sbin/e2fsck
21955 /sbin/fsck.ext4
8471 /sbin/fsck
21955 /sbin/fsck.ext4dev
6489 /sbin/fsck.cramfs
8173 /sbin/fsck.msdos

8646 /sbin/fsck.ntfs
3502 /sbin/fsck.vfat
3804 /sbin/fsck.xfs

Review the man page or give the pathname of the filesystem checking utility to determine
which options the utility accepts:
$ /sbin/fsck.ext4
Usage: /sbin/fsck.ext4 [-panyrcdfvstDFSV] [-b superblock] [-B blocksize]
[-I inode_buffer_blocks] [-P process_inode_size]
[-l|-L bad_blocks_file] [-C fd] [-j ext-journal]
[-E extended-options] device
Emergency help:
-p
-n
-y
-c
-f
...

Automatic repair (no questions)
Make no changes to the filesystem
Assume "yes" to all questions
Check for bad blocks and add them to the badblock list
Force checking even if filesystem is marked clean

The following options apply to many filesystem types, including ext2/ext3/ext4:
–a (automatic) Same as the –p option; kept for backward compatibility.
–f (force) Forces fsck to check filesystems even if they are clean. A clean filesystem
is one that was just successfully checked with fsck or was successfully
unmounted and has not been mounted since then. Clean filesystems are skipped
by fsck, which greatly speeds up system booting under normal conditions. For
information on setting up periodic, automatic filesystem checking on
ext2/ext3/ext4 filesystems, see tune2fs on page 991.
–n (no) Same as the –N global option. Does not work on all filesystems.
–p (preen) Attempts to repair all minor inconsistencies it finds when processing a
filesystem. If any problems are not repaired, fsck terminates with a nonzero exit
status. This option runs fsck in batch mode; as a consequence, it does not ask
whether to correct each problem it finds. The –p option is commonly used with
the –A option when checking filesystems while booting Linux.

816 fsck

–r (interactive) Asks whether to correct or ignore each problem that is found. For
many filesystem types, this behavior is the default. This option is not available
on all filesystems.
–y (yes) Assumes a yes response to any questions that fsck asks while processing a
filesystem. Use this option with caution, as it gives fsck free rein to do what it
thinks is best to clean a filesystem.

Notes

Apple suggests using diskutil (page 782) in place of fsck unless you are working in
an environment where diskutil is not available (e.g., single-user mode). For more
information see support.apple.com/kb/TS1417.
You can run fsck from a live or rescue CD/DVD.
When a filesystem is consistent, fsck displays a report such as the following:
# fsck -f /dev/sdb1
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: 710/4153408 files (10.1% non-contiguous), 455813/8303589 blocks

Interactive mode

You can run fsck either interactively or in batch mode. For many filesystems, unless
you use one of the –a, –p, –y, or –n options, fsck runs in interactive mode. In interactive
mode, if fsck finds a problem with a filesystem, it reports the problem and allows you
to choose whether to repair or ignore it. If you repair a problem you might lose some
data; however, that is often the most reasonable alternative.
Although it is technically feasible to repair files that are damaged and that fsck says you
should remove, this action is rarely practical. The best insurance against significant loss
of data is to make frequent backups.

Order of checking

The fsck utility looks at the sixth column in the /etc/fstab file to determine if, and
in which order, it should check filesystems. A 0 (zero) in this position indicates the
filesystem should not be checked. A 1 (one) indicates it should be checked first; this
status is usually reserved for the root filesystem. A 2 (two) indicates the filesystem
should be checked after those marked with a 1.

fsck is a front end

Similar to mkfs (page 887), fsck is a front end that calls other utilities to handle various
types of filesystems. For example, fsck calls e2fsck to check the widely used
ext2/ext3/ext4 filesystems. Refer to the e2fsck man page for more information. Other
utilities that fsck calls are typically named fsck.type, where type is the filesystem type.
By splitting fsck in this manner, filesystem developers can provide programs to check
their filesystems without affecting the development of other filesystems or changing
how system administrators use fsck.

fsck 817
Boot time

Run fsck on filesystems that are unmounted or are mounted readonly. When Linux
is booting, the root filesystem is first mounted readonly to allow it to be processed
by fsck. If fsck finds no problems with the root filesystem, it is then remounted (using
the remount option to the mount utility) read-write and fsck is typically run with the
–A, –R, and –p options.

lost+found

When it encounters a file that has lost its link to its filename, fsck asks whether to
reconnect it. If you choose to reconnect it, fsck puts the file in a directory named
lost+found in the root directory of the filesystem in which it found the file. The reconnected file is given its inode number as a name. For fsck to restore files in this way, a
lost+found directory must be present in the root directory of each filesystem. For
example, if a system uses the /, /usr, and /home filesystems, you should have these
three lost+found directories: /lost+found, /usr/lost+found, and /home/lost+found.
Each lost+found directory must have unused entries in which fsck can store the inode
numbers for files that have lost their links. When you create an ext2/ext3/ext4 filesystem, mkfs (page 887) creates a lost+found directory with the required unused
entries. Alternatively, you can use the mklost+found utility to create this directory in
ext2/ext3/ext4 filesystems if needed. On other types of filesystems, you can create the
unused entries by adding many files to the directory and then removing them: Use
touch (page 985) to create 500 entries in the lost+found directory and then use rm to
delete them.

Messages

Table VI-17 lists fsck’s common messages. In general fsck suggests the most logical
way of dealing with a problem in the file structure. Unless you have information that
suggests another response, respond to the prompts with yes. Use the system backup
tapes or disks to restore data that is lost as a result of this process.

Table VI-17

Common fsck messages

Phase (message)

What fsck checks

Phase 1 - Checking inodes,
blocks, and sizes

Checks inode information.

Phase 2 - Checking directory
structure

Looks for directories that point to bad inodes that fsck found in
Phase 1.

Phase 3 - Checking directory
connectivity

Looks for unreferenced directories and a nonexistent or full
lost+found directory.

Phase 4 - Checking reference
counts

Checks for unreferenced files, a nonexistent or full lost+found
directory, bad link counts, bad blocks, duplicated blocks, and
incorrect inode counts.

Phase 5 - Checking group
summary information

Checks whether the free list and other filesystem structures are
OK. If any problems are found with the free list, Phase 6 is run.

Phase 6 - Salvage free list

If Phase 5 found any problems with the free list, Phase 6 fixes
them.

818 fsck

Cleanup
Once it has repaired the filesystem, fsck informs you about the status of the filesystem.
The fsck utility displays the following message after it repairs a filesystem:
*****File System Was Modified*****

On ext2/ext3/ext4 filesystems, fsck displays the following message when it has finished
checking a filesystem:
filesys: used/maximum files (percent non-contiguous), used/maximum blocks
This message tells you how many files and disk blocks are in use as well as how many
files and disk blocks the filesystem can hold. The percent non-contiguous tells you
how fragmented the disk is.

ftp

819

ftp
ftp [options] [remote-system]
The ftp utility is a user interface to the standard File Transfer Protocol (FTP), which transfers files
between systems that can communicate over a network. To establish an FTP connection, you must
have access to an account (personal, guest, or anonymous) on the remote system.

Use FTP only to download public information
security FTP is not a secure protocol. The ftp utility sends your password over the network as cleartext,
which is not a secure practice. You can use sftp (page 700) as a secure replacement for ftp if the
server is running OpenSSH. You can also use scp (page 698) for many FTP functions other than
allowing anonymous users to download information. Because scp uses an encrypted connection,
user passwords and data cannot be sniffed.

Arguments
Options

The remote-system is the name or IP address of the server, running an FTP daemon
(e.g., ftpd, vsftpd, or sshd), you want to exchange files with.
–i (interactive) Turns off prompts during file transfers with mget and mput. See
also prompt (next page).
–n (no automatic login) Disables automatic logins.
–p (passive mode) Starts ftp in passive mode (page 821).
–v (verbose) Tells you more about how ftp is working. Displays responses from the
remote-system and reports transfer times and speeds.

Commands

The ftp utility is interactive. After you start it, ftp prompts you to enter commands to
set parameters and transfer files. Following are some of the commands you can use
in response to the ftp> prompt.

![command] Escapes to (spawns) a shell on the local system; use CONTROL-D or exit to return to ftp when

you are finished using the local shell. Follow the exclamation point with a command
to execute that command only; ftp returns to the ftp> prompt when the command completes executing. Because the shell that ftp spawns with this command is a child of the
shell that is running ftp, no changes you make in this shell are preserved when you
return to ftp. Specifically, when you want to copy files to a local directory other than
the directory that you started ftp from, you need to use the ftp lcd command to change
the local working directory: Issuing a cd command in the spawned shell will not make
the change you desire. See page 824 for an example.
ascii Sets the file transfer type to ASCII. This command allows you to transfer text files

from systems that end lines with a RETURN/LINEFEED combination and automatically strip
off the RETURN. Such a transfer is useful when the remote computer is a DOS or MS
Windows machine. The cr command must be ON for ascii to work.

ftp

Transfers files over a network

820 ftp
binary Sets the file transfer type to binary. This command allows you to transfer correctly

files that contain non-ASCII (unprintable) characters. It also works for ASCII files
that do not require changes to the ends of lines.
bye Closes the connection to a remote system and terminates ftp. Same as quit.
cd remote-directory

Changes to the working directory named remote-directory on the remote system.
close Closes the connection with the remote system without exiting from ftp.
cr (carriage return) Toggles RETURN stripping when you retrieve files in ASCII mode. See ascii.
dir [directory [file]]

Displays a listing of the directory named directory from the remote system. When
you do not specify directory, the working directory is displayed. When you specify a
file, the listing is saved on the local system in a file named file.
get remote-file [local-file]

Copies remote-file to the local system under the name local-file. Without local-file, ftp
uses remote-file as the filename on the local system. The remote-file and local-file can
be pathnames.
glob Toggles filename expansion for the mget and mput commands and displays the current

state (Globbing on or Globbing off).
help Displays a list of commands recognized by the ftp utility on the local system.
lcd [local_directory]

(local change directory) Changes the working directory on the local system to
local_directory. Without an argument, this command changes the working directory
on the local system to your home directory (just as cd does without an argument).
ls [directory [file]]

Similar to dir but produces a more concise listing on some remote systems.
mget remote-file-list

(multiple get) Unlike the get command, allows you to retrieve multiple files from the
remote system. You can name the remote files literally or use wildcards (see glob).
See also prompt.
mput local-file-list

(multiple put) The mput command allows you to copy multiple files from the local
system to the remote system. You can name the local files literally or use wildcards
(see glob). See also prompt.
open Interactively specifies the name of the remote system. This command is useful if you

did not specify a remote system on the command line or if the attempt to connect to
the remote system failed.
passive Toggles between the active (PORT—the default) and passive (PASV) transfer modes

and displays the transfer mode. See “Passive versus active connections” under the
“Notes” section.
prompt When using mget or mput to receive or send multiple files, ftp asks for verification

(by default) before transferring each file. This command toggles that behavior and
displays the current state (Interactive mode off or Interactive mode on).

ftp

821

put local-file [remote-file]

Copies local-file to the remote system under the name remote-file. Without remote-file,
ftp uses local-file as the filename on the remote system. The remote-file and local-file
can be pathnames.
pwd Causes ftp to display the pathname of the remote working directory. Use !pwd to

display the name of the local working directory.
quit Closes the connection to a remote system and terminates ftp. Same as bye.
reget remote-file Attempts to resume an aborted transfer. This command is similar to get, but instead
of overwriting an existing local file, ftp appends new data to it. Not all servers support

reget.
user [username] If the ftp utility did not log you in automatically, you can specify your account name
as username. If you omit username, ftp prompts for a username.

Notes

A Linux or Mac OS X system running ftp can exchange files with any of the many
operating systems that support the FTP protocol. Many sites offer archives of free
information on an FTP server, although many of these FTP sites are merely alternatives to an easier-to-access Web site (for example, ftp://ftp.ibiblio.org/pub/Linux and
http://www.ibiblio.org/software/linux). Most browsers can connect to and download
files from FTP servers.
The ftp utility makes no assumptions about filesystem naming or structure because
you can use ftp to exchange files with non-UNIX/Linux systems (whose filename
conventions might be different).
See page 954 for information on using curlftpfs to mount an FTP directory on the local
system without needing root privileges.

Anonymous FTP

Many systems—most notably those from which you can download free software—allow you to log in as anonymous. Most systems that support anonymous
logins accept the name ftp as an easier-to-spell and quicker-to-enter synonym for
anonymous. An anonymous user is usually restricted to a portion of a filesystem set
aside to hold files shared with remote users. When you log in as an anonymous user,
the server prompts you to enter a password. Although any password might be
accepted, by convention you are expected to supply your email address. Many systems that permit anonymous access store interesting files in the pub directory.

Passive versus
active connections

A client can ask an FTP server to establish either a PASV (passive—the default) or
PORT (active) connection for data transfer. Some servers are limited to one type of
connection. The difference between passive and active FTP connections lies in
whether the client or server initiates the data connection. In passive mode, the client
initiates the data connection to the server (on port 20 by default); in active mode, the
server initiates the data connection (there is no default port). Neither type of connection is inherently more secure. Passive connections are more common because a client
behind a NAT (page 1082) firewall can connect to a passive server and because it is
simpler to program a scalable passive server.

Automatic login

You can store server-specific FTP username and password information so you do not
have to enter it each time you visit an FTP site. Each line of the ~/.netrc file identifies

822 ftp

a server. When you connect to an FTP server, ftp reads ~/.netrc to determine whether
you have an automatic login set up for that server. The syntax of a line in ~/.netrc is
machine server login username password passwd
where server is the name of the server, username is your username, and passwd is
your password on server. Replace machine with default on the last line of the file to
specify a username and password for systems not listed in ~/.netrc. The default line
is useful for logging in on anonymous servers. A sample ~/.netrc file follows:
$ cat ~/.netrc
machine plum login max password mypassword
default login anonymous password max@example.com

To protect the account information in .netrc, make it readable by only the user in
whose home directory it appears. Refer to the netrc man page for more information.

Examples

Connect and log in

Following are two ftp sessions wherein Max transfers files from and to an FTP server
named plum. When Max gives the command ftp plum, the local ftp client connects to
the server, which asks for a username and a password. Because he is logged in on his
local system as max, ftp suggests that he log in on plum as max. To log in as max, Max
could just press RETURN. His username on plum is watson, however, so he types watson
in response to the Name (plum:max): prompt. Max responds to the Password:
prompt with his normal (remote) system password, and the FTP server greets him and
informs him that it is Using binary mode to transfer files. With ftp in binary mode,
Max can transfer ASCII and binary files.
$ ftp plum
Connected to plum (172.16.192.151).
220 (vsFTPd 2.3.4)
Name (plum:max): watson
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp>

After logging in, Max uses the ftp ls command to display the contents of his remote
working directory, which is his home directory on plum. Then he cds to the memos
directory and displays the files there.
ls and cd

ftp> ls
227 Entering Passive Mode (172,16,192,151,222,168)
150 Here comes the directory listing.
drwxr-xr-x
2 500
500
4096 Oct 10 23:52 expenses
drwxr-xr-x
2 500
500
4096 Oct 10 23:59 memos
drwxrwxr-x
22 500
500
4096 Oct 10 23:32 tech
226 Directory send OK.
ftp> cd memos
250 Directory successfully changed.

ftp
ftp> ls
227 Entering Passive Mode (172,16,192,151,226,0)
150 Here comes the directory listing.
-rw-r--r-1 500
500
4770 Oct 10
-rw-r--r-1 500
500
7134 Oct 10
-rw-r--r-1 500
500
9453 Oct 10
-rw-r--r-1 500
500
3466 Oct 10
-rw-r--r-1 500
500
1945 Oct 10
226 Directory send OK.

23:58
23:58
23:58
23:59
23:59

823

memo.0514
memo.0628
memo.0905
memo.0921
memo.1102

Next Max uses the ftp get command to copy memo.1102 from the server to the local
system. His use of binary mode ensures that he will get a good copy of the file
regardless of whether it is in binary or ASCII format. The server confirms the file
was copied successfully and notes the size of the file and the time it took to copy.
Max then copies the local file memo.1114 to the remote system. The file is copied
into his remote working directory, memos.
get and put

ftp> get memo.1102
local: memo.1102 remote: memo.1102
227 Entering Passive Mode (172,16,192,151,74,78)
150 Opening BINARY mode data connection for memo.1102 (1945 bytes).
226 Transfer complete.
1945 bytes received in 7.1e-05 secs (2.7e+04 Kbytes/sec)
ftp> put memo.1114
local: memo.1114 remote: memo.1114
227 Entering Passive Mode (172,16,192,151,214,181)
150 Ok to send data.
226 Transfer complete.
1945 bytes sent in 2.8e-05 secs (6.8e+04 Kbytes/sec)

After a while Max decides he wants to copy all the files in the memos directory on plum
to a new directory on the local system. He gives an ls command to make sure he is going
to copy the right files, but ftp has timed out. Instead of exiting from ftp and giving another
ftp command from the shell, Max gives ftp an open plum command to reconnect to the
server. After logging in, he uses the ftp cd command to change directories to memos on
the server.
Timeout and open

ftp> ls
No control connection for command: Success
Passive mode refused.
ftp> open plum
Connected to plum (172.16.192.151).
220 (vsFTPd 2.3.4)
...
ftp> cd memos
250 Directory successfully changed.

At this point, Max realizes he has not created the new directory to hold the files he
wants to download. Giving an ftp mkdir command would create a new directory on
the server, but Max wants a new directory on the local system. He uses an exclamation point (!) followed by a mkdir memos.hold command to invoke a shell and run
mkdir on the local system, thereby creating a directory named memos.hold in his
working directory on the local system. (You can display the name of your working

824 ftp

directory on the local system using !pwd.) Next, because he wants to copy files from
the server to the memos.hold directory on the local system, Max has to change his
working directory on the local system. Giving the command !cd memos.hold will not
accomplish what Max wants to do because the exclamation point spawns a new shell
on the local system and the cd command would be effective only in the new shell,
which is not the shell that ftp is running under. For this situation, ftp provides the lcd
(local cd) command, which changes the working directory for ftp and reports on the
new local working directory.
lcd (local cd)

ftp> !mkdir memos.hold
ftp> lcd memos.hold
Local directory now /home/max/memos.hold

Max uses the ftp mget (multiple get) command followed by the asterisk (*) wildcard
to copy all the files from the remote memos directory to the memos.hold directory on
the local system. When ftp prompts him for the first file, he realizes that he forgot to
turn off the prompts, so he responds with n and presses CONTROL-C to stop copying files
in response to the second prompt. The server checks whether he wants to continue
with his mget command.
Next Max gives the ftp prompt command, which toggles the prompt action (turns it
off if it is on and turns it on if it is off). Now when he gives an mget * command, ftp
copies all the files without prompting him.
After getting the files he wants, Max gives a quit command to close the connection
with the server, exit from ftp, and return to the local shell prompt.
mget and prompt

ftp> mget *
mget memo.0514? n
mget memo.0628? CONTROL-C
Continue with mget? n
ftp> prompt
Interactive mode off.
ftp> mget *
local: memo.0514 remote: memo.0514
227 Entering Passive Mode (172,16,192,151,153,231)
150 Opening BINARY mode data connection for memo.0514 (4770 bytes).
226 Transfer complete.
4770 bytes received in 8.8e-05 secs (5.3e+04 Kbytes/sec)
local: memo.0628 remote: memo.0628
227 Entering Passive Mode (172,16,192,151,20,35)
150 Opening BINARY mode data connection for memo.0628 (7134 bytes).
226 Transfer complete.
...
150 Opening BINARY mode data connection for memo.1114 (1945 bytes).
226 Transfer complete.
1945 bytes received in 3.9e-05 secs (4.9e+04 Kbytes/sec)
ftp> quit
221 Goodbye.

gawk 825

Searches for and processes patterns in a file
gawk [options] [program] [file-list]
gawk [options] –f program-file [file-list]
AWK is a pattern-scanning and processing language that searches one or more files for records (usually
lines) that match specified patterns. It processes lines by performing actions, such as writing the record
to standard output or incrementing a counter, each time it finds a match. As opposed to procedural
languages, AWK is data driven: You describe the data you want to work with and tell AWK what to
do with the data once it finds it.

See Chapter 14 for information on gawk
tip See Chapter 14 starting on page 619 for information on the awk, gawk, and mawk implementations
of the AWK language.

gawk

gawk

826 gcc

gcc
Compiles C and C++ programs

gcc

gcc [options] file-list [–larg]
g++ [options] file-list [–larg]
The Linux and Mac OS X operating systems use the GNU C compiler, gcc, to preprocess, compile,
assemble, and link C language source files. The same compiler with a different front end, g++, processes
C++ source code. The gcc and g++ compilers can also assemble and link assembly language source files,
link object files only, or build object files for use in shared libraries.
These compilers take input from files you specify on the command line. Unless you use the –o option,
they save the executable program in a file named a.out.
The gcc and g++ compilers are part of GCC, the GNU Compiler Collection, which includes front ends
for C, C++, Objective C, Fortran, Java, and Ada as well as libraries for these languages. Visit
gcc.gnu.org for more information.

gcc and g++
tip Although this section specifies the gcc compiler, most of the information applies to g++ as well.

Arguments

The file-list is a list of files gcc is to process.

Options

Without any options gcc accepts C language source files, assembly language files,
object files, and other files described in Table VI-18 on page 828. The gcc utility
preprocesses, compiles, assembles, and links these files as appropriate, producing
an executable file named a.out. When you create object files without linking them
to produce an executable file, gcc names each object file by adding the extension
.o to the basename of the corresponding source file. When you create an executable
file, gcc deletes the object files after linking.
Some of the most commonly used options are listed here. When certain filename
extensions are associated with an option, you can assume gcc adds the extension to
the basename of the source file.
–c (compile) Suppresses the linking step of compilation. Compiles and/or assembles
source code files and leaves the object code in files with the extension .o.
–Dname[=value]
Usually #define preprocessor directives are given in header, or include, files. You
can use this option to define symbolic names on the command line instead. For
example, –DLinux is equivalent to placing the line #define Linux in an include
file; –DMACH=i586 is the same as #define MACH i586.

gcc

827

–E (everything) For source code files, suppresses all steps of compilation except preprocessing and writes the result to standard output. By convention the extension
.i is used for preprocessed C source and .ii for preprocessed C++ source.
–fpic
Causes gcc to produce position-independent code, which is suitable for installing
in a shared library.
–fwritable-strings
By default the GNU C compiler places string constants into protected memory,
where they cannot be changed. Some (usually older) programs assume you can
modify string constants. This option changes the behavior of gcc so that string
constants can be modified.
–g (gdb) Embeds diagnostic information in the object files. This information is used
by symbolic debuggers, such as gdb. Although this option is necessary only if
you later use a debugger, it is a good practice to include it as a matter of course.
–Idirectory
Looks for include files in directory before looking in the standard locations.
Give this option multiple times to look in more than one directory.
–larg
(lowercase “l”) Searches the directories /lib and /usr/lib for a library file named
libarg.a. If the file is found, gcc then searches this library for any required functions. Replace arg with the name of the library you want to search. For example,
the –lm option normally links the standard math library libm.a. The position of
this option is significant: It generally needs to appear at the end of the command
line but can be repeated multiple times to search different libraries. Libraries are
searched in the order in which they appear on the command line. The linker uses
the library only to resolve undefined symbols from modules that precede the
library option on the command line. You can add other library paths to search
for libarg.a using the –L option.
–Ldirectory
Adds directory to the list of directories to search for libraries given with the –l
option. Directories that are added to the list with –L are searched before gcc
looks in the standard locations for libraries.
–o file
(output) Names the executable program that results from linking file instead of
a.out.
–On
(optimize) Attempts to improve (optimize) the object code produced by the compiler. The value of n might be 0, 1, 2, or 3 (or 06 if you are compiling code for
the Linux kernel). The default value of n is 1. Larger values of n result in better
optimization but might increase both the size of the object file and the time it

828 gcc

takes gcc to run. Specify –O0 to turn off optimization. Many related options
control precisely the types of optimizations attempted by gcc when you use –O.
Refer to the gcc info page for details.
–pedantic
The C language accepted by the GNU C compiler includes features that are not
part of the ANSI standard for the C language. Using this option forces gcc to reject
these language extensions and accept only standard C programming language
features.
–Q Displays the names of functions as gcc compiles them. This option also displays
statistics about each pass.
–S (suppress) Suppresses the assembling and linking steps of compilation on source
code files. The resulting assembly language files have .s filename extensions.
–traditional
Causes gcc to accept only C programming language features that existed in the
traditional Kernighan and Ritchie C programming language. With this option,
older programs written using the traditional C language (which existed before
the ANSI standard C language was defined) can be compiled correctly.
–Wall
Causes gcc to warn you about questionable code in the source code files. Many
related options control warning messages more precisely.

Notes

The preceding list of options represents only a small fraction of the full set of options
available with the GNU C compiler. See the gcc info page for a complete list.
Although the –o option is generally used to specify a filename in which to save object
code, this option also allows you to name files resulting from other compilation steps.
In the following example, the –o option causes the assembly language produced by
the gcc command to be stored in the file acode instead of pgm.s, the default:
$ gcc -S -o acode pgm.c

The lint utility found in many UNIX systems is not available on Linux or OS X.
However, the –Wall option performs many of the same checks and can be used in
place of lint.
Table VI-18 summarizes the conventions used by the C compiler for assigning filename
extensions.

Table VI-18

Filename extensions

Extension

Type of file

.a

Library of object modules

.c

C language source file

gcc

Table VI-18

OS X/clang

Examples

829

Filename extensions (continued)

Extension

Type of file

.C, .cc, or .cxx

C++ language source file

.i

Preprocessed C language source file

.ii

Preprocessed C++ language source file

.m

Objective C

.mm

Objective C++

.o

Object file

.s

Assembly language source file

.S

Assembly language source file that needs preprocessing

Apple has not shipped a new version of gcc in a long time. It uses the LLVM compiler
suite (www.llvm.org)—in particular, the clang front end for C/C++ and Objective C
(clang.llvm.org). The command-line arguments for clang are compatible with gcc.
Both clang and gcc are part of the optional Xcode package.
The first example compiles, assembles, and links a single C program, compute.c. The
executable output is saved in a.out. The gcc utility deletes the object file.
$ gcc compute.c

The next example compiles the same program using the C optimizer (–O option, level
2). It assembles and links the optimized code. The –o option causes gcc to store the
executable output in compute.
$ gcc -O2 -o compute compute.c

Next a C source file, an assembly language file, and an object file are compiled,
assembled, and linked. The executable output is stored in progo.
$ gcc -o progo procom.c profast.s proout.o

In the next example, gcc searches the standard math library found at /lib/libm.a
when it is linking the himath program and stores the executable output in a.out:
$ gcc himath.c -lm

In the following example, the C compiler compiles topo.c with options that check
the code for questionable source code practices (–Wall option) and violations of the
ANSI C standard (–pedantic option). The –g option embeds debugging support in
the executable file, which is saved in topo with the –o topo option. Full optimization is enabled with the –O3 option.

830 gcc

The warnings produced by the C compiler are sent to standard output. In this example
the first and last warnings result from the –pedantic option; the other warnings result
from the –Wall option.
$ gcc -Wall -g -O3 -pedantic -o topo topo.c
In file included from topo.c:2:
/usr/include/ctype.h:65: warning: comma at end of enumerator list
topo.c:13: warning: return-type defaults to 'int'
topo.c: In function 'main':
topo.c:14: warning: unused variable 'c'
topo.c: In function 'getline':
topo.c:44: warning: 'c' might be used uninitialized in this function

When compiling programs that rely on the X11 include files and libraries, you might
need to use the –I and –L options to tell gcc where to locate those include files and
libraries. The next example uses those options and instructs gcc to link the program
with the basic X11 library:
$ gcc -I/usr/X11R6/include plot.c -L/usr/X11R6/lib -lX11

GetFileInfo O 831

Displays file attributes
GetFileInfo [option] file
The GetFileInfo utility displays file attributes (page 1046), including the file’s type and creator code,
creation and last modification times, and attribute flags such as the invisible and locked flags. The
GetFileInfo utility is available under OS X only. O

Arguments

The file specifies a single file or a directory that GetFileInfo displays information
about.

Options

The options for GetFileInfo correspond to the options for SetFile (page 938).
Without an option, GetFileInfo reports on the metadata of file, indicating the flags
that are set, the file’s type and creator codes, and its creation and modification
dates. Missing data is omitted. When you specify an option, GetFileInfo displays the
information specified by that option only. This utility accepts a single option; it
silently ignores additional options.
–aflag
(attribute) Reports the status of the single attribute flag named flag. This
option displays 1 if flag is set and 0 if flag is not set. The flag must follow the
–a immediately, without any intervening SPACEs. See Table D-2 on page 1046
for a list of attribute flags.
–c (creator) Displays the creator code of file. If file is a directory and has no creator
code, this option displays an error message.
–d (date) Displays the creation date of file as mm/dd/yyyy hh:mm:ss, using a 24-hour
clock.
–m (modification) Displays the modification date of file as mm/dd/yyyy hh:mm:ss,
using a 24-hour clock.
–P (no dereference) For each file that is a symbolic link, displays information about
the symbolic link, not the file the link points to. This option affects all files and
treats files that are not symbolic links normally. See page 115 for information
on dereferencing symbolic links.
–t (type) Displays the type code of file. If file is a directory and has no type code,
this option displays an error message.

Discussion

Without an option, GetFileInfo displays flags as the string avbstclinmedz, with uppercase
letters denoting which flags are set. See page 1046 for a discussion of attribute flags.

GetFileInfo O

GetFileInfo O

832 GetFileInfo O

Notes

You can use the SetFile utility (page 938) to set file attributes. You can set Mac OS X
permissions and ownership (page 98) using chmod (page 741) or chown (page 746),
and you can display this information using ls (page 861) or stat (page 957).
Directories do not have type or creator codes, and they might not have all flags. The
GetFileInfo utility cannot read special files such as device files.

Examples

The first example shows the output from GetFileInfo when you call it without an
option.
$ GetFileInfo picture.jpg
file: "/private/tmp/picture.jpg"
type: "JPEG"
creator: "GKON"
attributes: avbstClinmedz
created: 07/18/2012 15:15:26
modified: 07/18/2012 15:15:26

The only uppercase letter on the attributes line is C, indicating that this flag is set.
The c flag tells the Finder to look for a custom icon for this file. See Table D-2 on
page 1046 for a list of flags.
The next example uses the –a flag to display the attribute flags for a file:
$ GetFileInfo -a /Applications/Games/Alchemy/Alchemy
avBstclInmedz

The output shows that the b and i flags are set.
The GetFileInfo utility can process only one file each time you call it. The following
multiline bash command uses a for loop (page 431) to display the creator codes of
multiple files. The echo command displays the name of the file being examined
because GetFileInfo does not always display the name of the file:
$ for i in *
> do echo -n "$i: "; GetFileInfo -c "$i"
> done
Desktop: Desktop is a directory and has no creator
Documents: Documents is a directory and has no creator
...
aa: ""
ab: ""
...

grep

833

grep
grep [options] pattern [file-list]
The grep utility searches one or more text files for the pattern, which can be a simple string or another
form of a regular expression. The grep utility takes various actions, specified by options, each time it
finds a line that contains a match for the pattern. This utility takes its input either from files you specify
on the command line or from standard input.

Arguments

The pattern is a regular expression, as defined in Appendix A. You must quote regular
expressions that contain special characters, SPACE s, or TABs. An easy way to quote these
characters is to enclose the entire expression within single quotation marks.
The file-list is a list of the pathnames of ordinary text files that grep searches. With
the –r option, file-list can contain directories; grep searches the files in these directory
hierarchies.

Options

Without any options grep sends lines that contain a match for pattern to standard
output. When you specify more than one file on the command line, grep precedes each
line it displays with the name of the file it came from, followed by a colon.

Major Options
You can use only one of the following three options at a time. Normally you do not
need to use any of these options, because grep defaults to –G, which is regular grep.
–E (extended) Interprets pattern as an extended regular expression (page 1019).
The command grep –E is the same as egrep. See the “Notes” section.
–F (fixed) Interprets pattern as a fixed string of characters. The command grep –F
is the same as fgrep.
–G (grep) Interprets pattern as a basic regular expression (default).

Other Options
The grep utility accepts the common options described on page 718.

The Mac OS X version of grep accepts long options
tip Options for grep preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––count

–c Displays only the number of lines that contain a match in each file.

––context=n –C n

Displays n lines of context around each matching line.

grep

Searches for a pattern in files

834 grep
––file=file

–f file
Reads file, which contains one pattern per line, and finds lines in the input that
match each of the patterns.

––no-filename

–h Does not display the filename at the beginning of each line when searching
multiple files.

––ignore-case

–i Causes lowercase letters in the pattern to match uppercase letters in the file, and
vice versa. Use this option when you are searching for a word that might appear
at the beginning of a sentence (that is, the word might or might not start with
an uppercase letter).

––files-with-matches

–l (lowercase “l”; list) Displays only the name of each file that contains one or
more matches. A filename is displayed only once, even if the file contains more
than one match.
––max-count=n –m n

Stops reading each file, or standard input, after displaying n lines containing
matches.
––line-number
––quiet or
––silent

–n Precedes each line by its line number in the file. The file does not need to contain
line numbers.
–q Does not write anything to standard output; only sets the exit code.

––recursive

–r or –R
Recursively descends directories in the file-list and processes files within these
directories.

––no-messages

–s (silent) Does not display an error message if a file in the file-list does not exist
or is not readable.

––invert-match

–v Causes lines not containing a match to satisfy the search. When you use this
option by itself, grep displays all lines that do not contain a match for the
pattern.

––word-regexp –w With this option, the pattern must match a whole word. This option is helpful

if you are searching for a specific word that might also appear as a substring of
another word in the file.
––line-regexp

Notes
egrep and fgrep

–x The pattern matches whole lines only.
The grep utility returns an exit status of 0 if it finds a match, 1 if it does not find a
match, and 2 if the file is not accessible or the grep command contains a syntax error.
Two utilities perform functions similar to that of grep. The egrep utility (same as grep
–E) allows you to use extended regular expressions (page 1019), which include a different set of special characters than basic regular expressions (page 1017). The fgrep

grep

835

utility (same as grep –F) is fast and compact but processes only simple strings, not
regular expressions.
GNU grep, which runs under Linux and Mac OS X, uses extended regular expressions
in place of regular expressions. Thus egrep is virtually the same as grep. Refer to the
grep info page for a minimal distinction.

Examples

The following examples assume the working directory contains three files: testa,
testb, and testc.
File testa

File testb

File testc

aaabb
bbbcc
ff-ff
cccdd
dddaa

aaaaa
bbbbb
ccccc
ddddd

AAAAA
BBBBB
CCCCC
DDDDD

The grep utility can search for a pattern that is a simple string of characters. The following
command line searches testa and displays each line that contains the string bb:
$ grep bb testa
aaabb
bbbcc

The –v option reverses the sense of the test. The following example displays the lines
in testa that do not contain the string bb:
$ grep -v bb testa
ff-ff
cccdd
dddaa

The –n option displays the line number of each displayed line:
$ grep -n bb testa
1:aaabb
2:bbbcc

The grep utility can search through more than one file. Here it searches through each
file in the working directory. The name of the file containing the string precedes each
line of output.
$ grep bb *
testa:aaabb
testa:bbbcc
testb:bbbbb

When you include the –w option in the search for the string bb, grep produces no
output because none of the files contains the string bb as a separate word:
$ grep -w bb *
$

836 grep

The search grep performs is case sensitive. Because the previous examples specified
lowercase bb, grep did not find the uppercase string BBBBB in testc. The –i option
causes both uppercase and lowercase letters to match either case of letter in the
pattern:
$ grep -i bb *
testa:aaabb
testa:bbbcc
testb:bbbbb
testc:BBBBB
$ grep -i BB *
testa:aaabb
testa:bbbcc
testb:bbbbb
testc:BBBBB

The –c option displays the number of lines in each file that contain a match:
$ grep -c bb *
testa:2
testb:1
testc:0

The –f option finds matches for each pattern in a file of patterns. In the next example,
gfile holds two patterns, one per line, and grep searches for matches to the patterns
in gfile:
$ cat gfile
aaa
bbb
$ grep -f gfile test*
testa:aaabb
testa:bbbcc
testb:aaaaa
testb:bbbbb

The following command line searches text2 and displays lines that contain a string
of characters starting with st, followed by zero or more characters (.* represents zero
or more characters in a regular expression—see Appendix A), and ending in ing:
$ grep 'st.*ing' text2
...

The ^ regular expression matches the beginning of a line and, by itself, matches every
line in a file. Together with the –n option, ^ displays each line in a file preceded by
its line number:
$ grep -n '^' testa
1:aaabb
2:bbbcc
3:ff-ff
4:cccdd
5:dddaa

grep

837

The next command line counts the number of times #include statements appear in C
source files in the working directory. The –h option causes grep to suppress the filenames
from its output. The input to sort consists of all lines from *.c that match #include. The
output from sort is an ordered list of lines that contains many duplicates. When uniq with
the –c option processes this sorted list, it outputs repeated lines only once, along with a
count of the number of repetitions of each repeated line in its input.
$
9
2
1
6
2
2
2
3

grep -h '#include' *.c | sort | uniq -c
#include "buff.h"
#include "poly.h"
#include "screen.h"
#include "window.h"
#include "x2.h"
#include "x3.h"
#include 
#include 

The final command calls the vim editor with a list of files in the working directory that
contain the string Sampson. The $(...) command substitution construct (page 361)
causes the shell to execute grep in place and supply vim with a list of filenames to edit:
$ vim $(grep -l 'Sampson' *)
...

The single quotation marks are not necessary in this example, but they are required
if the regular expression contains special characters or SPACEs. It is a good habit to
quote the pattern so the shell does not interpret special characters the pattern might
contain.

838 gzip

gzip
Compresses or decompresses files

gzip

gzip [options] [file-list]
gunzip [options] [file-list]
zcat [file-list]
The gzip utility compresses files, the gunzip utility restores files compressed with gzip, and the zcat utility
displays files compressed with gzip.

Arguments

The file-list is a list of the names of one or more files that are to be compressed or
decompressed. If a directory appears in file-list with no ––recursive option, gzip/gunzip
issues an error message and ignores the directory. With the – –recursive option,
gzip/gunzip recursively compresses/decompresses files within the directory hierarchy.
If file-list is empty or if the special option – (hyphen) is present, gzip reads from standard
input. The ––stdout option causes gzip and gunzip to write to standard output.
The information in this section also applies to gunzip, a link to gzip.

Options

The gzip, gunzip, and zcat utilities accept the common options described on page 718.

The Mac OS X versions of gzip, gunzip, and zcat accept long options
tip Options for gzip, gunzip, and zcat preceded by a double hyphen (––) work under Mac OS X as
well as under Linux.
––stdout

–c Writes the results of compression or decompression to standard output instead
of to filename.gz or filename, respectively.

––decompress or
––uncompress

–d Decompresses a file compressed with gzip. This option with gzip is equivalent to
the gunzip command.

––force

–f Overwrites an existing output file on compression/decompression.

––list

–l For each compressed file in file-list, displays the file’s compressed and decompressed sizes, the compression ratio, and the name of the file before
compression. Use this option with ––verbose to display additional information.

––fast or
––best

–n Controls the tradeoff between the speed of compression and the amount of
compression. The n argument is a digit from 1 to 9; level 1 is the fastest (least)
compression and level 9 is the best (slowest and most) compression. The
default level is 6. The options ––fast and ––best are synonyms for –1 and –9,
respectively.

––quiet

–q Suppresses warning messages.

gzip

839

––recursive

–r Recursively descends directories in file-list and compresses/decompresses files
within these directories.

––test

–t Verifies the integrity of a compressed file. This option displays nothing if the file
is OK.

––verbose

–v During compression, displays the name of the file, the name of the compressed
file, and the amount of compression as each file is processed.

Discussion

Compressing files reduces disk space requirements and shortens the time needed to
transmit files between systems. When gzip compresses a file, it adds the extension .gz
to the filename. For example, compressing the file fname creates the file fname.gz
and, unless you use the – –stdout (–c) option, deletes the original file. To restore
fname, use the command gunzip with the argument fname.gz.
Almost all files become much smaller when compressed with gzip. On rare occasions
a file will become larger, but only by a slight amount. The type of a file and its contents
(as well as the –n option) determine how much smaller a file becomes; text files are
often reduced by 60 to 70 percent.
The attributes of a file, such as its owner, permissions, and modification and access
times, are left intact when gzip compresses and gunzip decompresses a file.
If the compressed version of a file already exists, gzip reports that fact and asks for
your confirmation before overwriting the existing file. If a file has multiple links to
it, gzip issues an error message and exits. The ––force option overrides the default
behavior in both of these situations.

Notes

The bzip2 utility (page 732) compresses files more efficiently than does gzip.
Without the ––stdout (–c) option, gzip removes the files in file-list.
In addition to the gzip format, gunzip recognizes several other compression formats,
enabling gunzip to decompress files compressed with compress.
To see an example of a file that becomes larger when compressed with gzip, compare
the size of a file that has been compressed once with the same file compressed with
gzip again. Because gzip complains when you give it an argument with the extension
.gz, you need to rename the file before compressing it a second time.
The tar utility with the –z modifier (page 970) calls gzip.
You can catenate files by catenating their gzip’d versions. In the following example, gzip
first compresses the file named aa and, by means of the ––stdout option, sends the output to cc.gzip, and then compresses bb appending the output to cc.gzip. The final
command shows zcat decompressing cc.gzip, which contains the contents of both files.
$ gzip --stdout aa > cc.gzip
$ gzip --stdout bb >> cc.gzip

840 gzip
$ zcat cc.gzip
This is file aa.
This is file bb.

The following related utilities display and manipulate compressed files. None of these
utilities changes the files it works on.
zcat file-list Works like cat except it uses gunzip to decompress file-list as it copies files to standard

output.
zdiff [options] file1 [file2]

Works like diff (page 777) except file1 and file2 are decompressed with gunzip as
needed. The zdiff utility accepts the same options as diff. If you omit file2, zdiff compares
file1 with the compressed version of file1.
zless file-list Works like less except that it uses gunzip to decompress file-list as it displays files.

Examples

In the first example, gzip compresses two files. Next gunzip decompresses one of the
files. When a file is compressed and decompressed, its size changes but its modification
time remains the same.
$ ls -l
-rw-rw-r-- 1 max group 33557 07-20 17:32 patch-2.0.7
-rw-rw-r-- 1 max group 143258 07-20 17:32 patch-2.0.8
$ gzip *
$ ls -l
-rw-rw-r-- 1 max group 9693 07-20 17:32 patch-2.0.7.gz
-rw-rw-r-- 1 max group 40426 07-20 17:32 patch-2.0.8.gz
$ gunzip patch-2.0.7.gz
$ ls -l
-rw-rw-r-- 1 max group 33557 07-20 17:32 patch-2.0.7
-rw-rw-r-- 1 max group 40426 07-20 17:32 patch-2.0.8.gz

In the next example, the files in Sam’s home directory are archived using cpio
(page 758). The archive is compressed with gzip before it is written to the device
mounted on /dev/sde1.
$ find ~sam -depth -print | cpio -oBm | gzip >/dev/sde1

head 841

Displays the beginning of a file
head [options] [file-list]
The head utility displays the beginning of a file. This utility takes its input either from one or more
files you specify on the command line or from standard input.

Arguments

The file-list is a list of the pathnames of the files that head displays. When you specify
more than one file, head displays the filename before displaying the first few lines of
each file. When you do not specify a file, head takes its input from standard input.

Options

Under Linux, head accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named
with a single letter and preceded by a single hyphen work under Linux and OS X.

––bytes=n[u]

–c n[u]
Displays the first n bytes (characters) of a file. Under Linux only, the u argument
is an optional multiplicative suffix as described on page 717, except that head uses
a lowercase k for kilobyte (1,024-byte blocks) and accepts b for 512-byte blocks.
If you include a multiplicative suffix, head counts by this unit instead of by bytes.

––lines=n

–n n
Displays the first n lines of a file. You can use –n to specify n lines without using
the lines keyword or the –n option. If you specify a negative value for n, head
displays all but the last n lines of the file.

––quiet

–q Suppresses header information when you specify more than one filename on the
command line. L

Notes

The head utility displays the first ten lines of a file by default.

Examples

The examples in this section are based on the following file:
$ cat eleven
line one
line two
line three
line four
line five
line six
line seven
line eight
line nine
line ten
line eleven

head

head

842 head

Without any arguments head displays the first ten lines of a file:
$ head eleven
line one
line two
line three
line four
line five
line six
line seven
line eight
line nine
line ten

The next example displays the first three lines (–n 3) of the file:
$ head -n 3 eleven
line one
line two
line three

The following example is equivalent to the preceding one:
$ head -3 eleven
line one
line two
line three

The next example displays the first six characters (–c 6) in the file:
$ head -c 6 eleven
line o$

The final example displays all but the last seven lines of the file:
$ head -n -7 eleven
line one
line two
line three
line four

join 843

join
join [options] file1 file2
The join utility displays a single line for each pair of lines from file1 and file2 that have the same value
in a common field called the join field. Both files must be sorted on the join field or join will not display
the correct output.

Arguments

The join utility reads lines from file1 and file2 and for each pair of lines, compares
the specified join field from both lines. If you do not specify a join field, join takes the
first field as the join field. If the join fields are the same, join copies the join field and
the rest of the lines from both files to standard output. You can specify a hyphen (–)
in place of either filename (but not both) to cause join to read from standard input.

Options

The join utility accepts the ––help and ––version options described on page 718.
–1 field
Specifies field number field as the join field in file1. The first field on a line is
field number 1.
–2 field
Specifies field number field as the join field in file2. The first field on a line is
field number 1.
–a 1 | 2
Displays lines from file1 (if you specify 1) or file2 (if you specify 2) whose join
field does not match the join field from the other file. Also displays normal
output of join (lines with join fields that match). See also the –v option.

––ignore-case

–i Matches uppercase letters to lowercase letters, and vice versa.
–j field
Specifies field number field as the join field in both file1 and file2. The first field
on a line is field number 1.
–t char
Specifies char as the input and output field separator and causes join to include
blanks (SPACEs and/or TABs) as part of the fields.
–v 1 | 2
Displays lines from file1 (if you specify 1) or file2 (if you specify 2) whose join
field does not match the join field from the other file. Suppresses the normal
output of join (lines with join fields that match). See also the –a option.

––check-order

Makes sure file1 and file2 are both sorted on the join field and displays an error
message if they are not. Default is ––nocheck-order.

join

Joins lines from two files based on a common field

844 join

Notes

The concept of a join comes from relational databases; see page 613 for information
on joins under SQL.
By default join does the following.
• Uses the first field on each line as the common field it joins on.
• Uses one or more blanks (SPACEs and/or TABs) as field separators and ignores
leading blanks. The –t option causes join to include blanks as part of the
fields and to use the specified character as the input and output field
separator.
• Separates output fields with a single SPACE.
• Outputs for each pair of joined lines the common join field followed by the
remaining fields from file1 and then the remaining fields from file2.
• Does not check if input files are sorted on the common field that is the basis
for the join. See the ––check-order option.

Examples

The examples in this section use the following files:
$ cat one
9999 first line file one.
aaaa second line file one.
cccc third line file one.
$ cat two
aaaa FIRST line file two.
bbbb SECOND line file two.
cccc THIRD line file two.

The first example shows the simplest use of join. The files named one and two are
joined based, by default, on the first field in each line of both files. Both files are in
sorted order based on the join field. The join fields on two pairs of lines match and
join displays those lines.
$ join one two
aaaa second line file one. FIRST line file two.
cccc third line file one. THIRD line file two.

You can use the ––check-order option to see if both files are properly sorted. In the
following example, sort (page 942) with the –r option sorts one in reverse alphabetical order and sends the output through a pipeline to join. The shell replaces the –
argument to join with the standard input to join, which comes from the pipeline; join
displays an error message.
$ sort -r one | join --check-order - two
join: file 1 is not in sorted order

Next the –a option with an argument of 1 causes join to display, in addition to its
normal output, lines from the first file (one) that do not have a matching join field.

join 845
$ join -a 1 one two
9999 first line file one.
aaaa second line file one. FIRST line file two.
cccc third line file one. THIRD line file two.

Use –v in place of –a to prevent join from displaying those lines it normally displays
(those that have a matching join field).
$ join -v 1 one two
9999 first line file one.

The final example uses onea as the first file and specifies the third field of the first
file (–1 3) as the match field. The second file (two) uses the default (first) field for
matching.
$ cat onea
first line aaaa file one.
second line 1111 file one.
third line cccc file one.
$ join -1 3 onea two
aaaa first line file one. FIRST line file two.
cccc third line file one. THIRD line file two.

846 kill

kill
Terminates a process by PID

kill

kill [option] PID-list
kill –l [signal-name | signal-number]
The kill utility sends a signal to one or more processes. Typically this signal terminates the processes.
For kill to work, the processes must belong to the user executing kill. However, a user working with
root privileges can terminate any process. The –l (lowercase “l”) option lists information about
signals.

Arguments
Options

The PID-list is a list of process identification (PID) numbers of processes that kill is
to terminate.
–l (list) Without an argument, displays a list of signals. With an argument of a
signal-name, displays the corresponding signal-number. With an argument of
a signal-number, displays the corresponding signal-name.
–signal-name | –signal-number
Sends the signal specified by signal-name or signal-number to PID-list. You can
specify a signal-name preceded by SIG or not (e.g., SIGKILL or KILL). Without
this option, kill sends a software termination signal (SIGTERM; signal number
15).

Notes

See also killall on page 848 and “kill: Aborting a Background Job” on page 147.
Table 10-5 on page 483 lists some signals. The command kill –l displays a complete
list of signal numbers and names.
In addition to the kill utility, a kill builtin is available in the Bourne Again and TC
Shells. The builtins work similarly to the utility described here. Give the command
/bin/kill to use the kill utility and the command kill to use the builtin. It does not
usually matter which version you use.
The shell displays the PID number of a background process when you initiate the process. You can also use the ps utility (page 921) to determine PID numbers.
If the software termination signal does not terminate a process, try sending a KILL
signal (signal number 9). A process can choose to ignore any signal except KILL.

root: Do not run kill with arguments of –9 0 or KILL 0
caution If you run the command kill –9 0 while you are working with root privileges, you will bring the
system down.

kill

847

The kill utility/builtin accepts job identifiers in place of the PID-list. Job identifiers
consist of a percent sign (%) followed by either a job number or a string that uniquely
identifies the job.
To terminate all processes that the current login process initiated and have the
operating system log you out, give the command kill –9 0.

Examples

The first example shows a command line executing the file compute as a background
process and kill terminating that process:
$ compute &
[2] 259
$ kill 259
$ RETURN
[2]+ Terminated

compute

The next example shows the ps utility determining the PID number of the background
process running a program named xprog and the kill utility terminating xprog with the
TERM signal:
$ ps
PID TTY
7525 pts/1
14668 pts/1
14699 pts/1

TIME CMD
00:00:00 bash
00:00:00 xprog
00:00:00 ps

$ kill -TERM 14668
$

The final example shows kill terminating a background process using a job number.
As explained on page 148, the jobs builtin lists the numbers of all jobs controlled by
the terminal the command is given from.
$ sleep 60 &
[1] 24280
$ kill %1
$ RETURN
[1]+ Terminated
$

sleep 60

848 killall

killall
killall

Terminates a process by name
killall [option] name-list
The killall utility sends a signal to one or more processes executing specified commands. Typically this
signal terminates the processes. For killall to work, the processes must belong to the user executing killall.
However, a user working with root privileges can terminate any process.

Arguments

The name-list is a SPACE-separated list of names of programs that are to receive signals.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––interactive
––list
––quiet

–i Prompts for confirmation before killing a process. L
–l Displays a list of signals (but kill –l displays a better list). With this option killall
does not accept a name-list.
–q Does not display a message if killall fails to terminate a process. L
–signal-name | –signal-number
Sends the signal specified by signal-name or signal-number to name-list. You
can specify a signal-name preceded by SIG or not (e.g., SIGKILL or KILL).
Without this option, kill sends a software termination signal (SIGTERM; signal
number 15).

Notes

See also kill on page 846.
Table 10-5 on page 483 lists some signals. The command kill –l displays a complete
list of signal numbers and names.
If the software termination signal does not terminate the process, try sending a KILL
signal (signal number 9). A process can choose to ignore any signal except KILL.
You can use ps (page 921) to determine the name of the program you want to
terminate.

Examples

You can give the following commands to experiment with killall:
$ sleep 60 &
[1] 23274
$ sleep 50 &
[2] 23275
$ sleep 40 &
[3] 23276

killall
$ sleep 120 &
[4] 23277
$ killall sleep
$ RETURN
[1]
Terminated
[2]
Terminated
[3]- Terminated
[4]+ Terminated

sleep
sleep
sleep
sleep

849

60
50
40
120

The next command, run by a user with root privileges, terminates all instances of the
Firefox browser:
# killall firefox

850 launchctl O

launchctl O
launchctl O

Controls the launchd daemon
launchctl [command [options] [arguments]]
The launchctl utility controls the launchd daemon. The launchctl utility is available under OS X only. O

Arguments

The command is the command that launchctl sends to launchd. Table VI-19 lists some
of the commands and the options and arguments each command accepts. Without a
command, launchctl reads commands, options, and arguments from standard input,
one set per line. Without a command, when standard input comes from the keyboard,
launchctl runs interactively.

Table VI-19

Option

launchctl commands

Command

Argument

Description

help

None

Displays a help message

list

None

Lists jobs loaded into launchd

load [–w]

Job configuration file

Loads the job named by the argument

shutdown

None

Prepares for shutdown by removing all jobs

start

Job name

Starts the job named by the argument

stop

Job name

Stops the job named by the argument

unload [–w]

Job configuration file

Unloads the job named by the argument

Only the load and unload commands take an option.
–w (write) When loading a file, removes the Disabled key and saves the modified
configuration file. When unloading a file, adds the Disabled key and saves the
modified configuration file.

Discussion

The launchctl utility is the user interface to launchd, which manages system daemons
and background tasks (called jobs). Each job is described by a job configuration file,
which is a property list file in the format defined by the launchd.plist man page.
For security reasons, users not working with root privileges cannot communicate
with the system’s primary launchd process, PID 1. When such a user loads jobs, OS
X creates a new instance of launchd for that user. When all its jobs are unloaded, that
instance of launchd quits.

Notes

The launchctl utility and launchd daemon were introduced in Mac OS X version 10.4.
Under version 10.3 and earlier, system jobs were managed by init, xinetd, and cron.

launchctl O 851

Examples

The first example, which is run by a user with root privileges, uses the list command
to list launchd jobs running on the local system:
# launchctl list
PID
Status Label
51479
0x109490.launchctl
50515
0x10a780.bash
50514
0x10a680.sshd
50511
0x108d20.sshd
22
0x108bc0.securityd
0
com.apple.launchctl.StandardIO
37057
[0x0-0x4e84e8].com.apple.ScreenSaver.Engine
27860
0x10a4e0.DiskManagementTo
27859
[0x0-0x3a23a2].com.apple.SoftwareUpdate
...

The next example enables the ntalk service. Looking at the ntalk.plist file before and
after the launchctl command is executed shows that launchctl has modified the file by
removing the Disabled key.
# cat /System/Library/LaunchDaemons/ntalk.plist
...

Disabled

Label
com.apple.ntalkd
...
# launchctl load -w /System/Library/LaunchDaemons/ntalk.plist
# cat /System/Library/LaunchDaemons/ntalk.plist
...

Label
com.apple.ntalkd
...

Without any arguments, launchctl prompts for commands on standard input. Give
a quit command or press CONTROL-D to exit from launchctl. In the last example, a user
running with root privileges causes launchctl to display a list of jobs and then to stop
the job that would launch airportd:
# launchctl
launchd% list
PID
Status Label
8659
0x10ba10.cron
1
0x10c760.launchd
...
0
com.apple.airport.updateprefs
0
com.apple.airportd
0
com.apple.AirPort.wps
0
0x100670.dashboardadvisoryd
0
com.apple.launchctl.System
launchd% stop com.apple.airportd
launchd% quit

852 less

less
Displays text files, one screen at a time

less

less [options] [file-list]
The less utility displays text files, one screen at a time.

Arguments

The file-list is the list of files you want to view. If there is no file-list, less reads from
standard input.

Options

The less utility accepts the common options described on page 718.

The Mac OS X version of less accepts long options
tip Options for less preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––clear-screen

–c Repaints the screen from the top line down instead of scrolling.

––QUIT-AT-EOF –E (exit) Normally less requires you to enter q to terminate. This option exits
automatically the first time less reads the end of file.
––quit-at-eof

–e (exit) Similar to –E, except that less exits automatically the second time it reads
the end of file.

––quit-if-one-screen

–F Displays the file and quits if the file can be displayed on a single screen.
–i Causes a search for a string of lowercase letters to match both uppercase and
lowercase letters. This option is ignored if you specify a pattern that includes
uppercase letters.

––ignore-case

––IGNORE-CASE

–I Causes a search for a string of letters of any case to match both uppercase and
lowercase letters, regardless of the case of the search pattern.
––long-prompt

–m Each prompt reports the percentage of the file less has displayed. It reports
byte numbers when less reads from standard input because less has no way of
determining the size of the input file.

––LINE-NUMBERS

–N Displays a line number at the beginning of each line.
––prompt=prompt

–Pprompt
Changes the short prompt string (the prompt that appears at the bottom of each
screen of output) to prompt. Enclose prompt in quotation marks if it contains
SPACEs. The less utility replaces special symbols in prompt with other values when
it displays the prompt. For example, less displays the current filename in place of
%f. See the less info page for a list of these special symbols and descriptions of
other prompts. Custom prompts are useful if you are running less from within

less 853

another program and want to give instructions or information to the person using
the program. The default prompt is the name of the file displayed in reverse video.
––squeeze-blank-lines

–s Displays multiple, adjacent blank lines as a single blank line. When you use less to
display text that has been formatted for printing with blank space at the top and
bottom of each page, this option shortens these headers and footers to a single line.
––tabs=n

–xn
Sets tab stops n characters apart. The default is eight characters.

––window=n –[z]n

Sets the scrolling size to n lines. The default is the height of the display in lines.
Each time you move forward or backward a page, you move n lines. The z part
of the option maintains compatibility with more and can be omitted.
+command
Any command you can give less while it is running can also be given as an option
by preceding it with a plus sign (+) on the command line. See the “Commands”
section. A command preceded by a plus sign on the command line is executed as
soon as less starts and applies to the first file only.
++command
Similar to +command except that command is applied to every file in file-list,
not just the first file.

Notes

The phrase “less is more” explains the origin of the name of this utility. The more utility is the original Berkeley UNIX pager (also available under Linux). The less utility
is similar to more but includes many enhancements. (Under OS X, less and more are
copies of the same file.) After displaying a screen of text, less displays a prompt and
waits for you to enter a command. You can skip forward and backward in the file,
invoke an editor, search for a pattern, or perform a number of other tasks.
See the v command in the next section for information on how you can edit the file
you are viewing.
You can set the options to less either from the command line when you call less or
by setting the LESS environment variable. For example, the following bash command
causes less to run with the –x4 and –s options:
$ export LESS="-x4 -s"

Normally you would set LESS in ~/.bash_profile if you are using bash or in ~/.login if
you are using tcsh. Once you have set the LESS variable, less is invoked with the specified options each time you call it. Any options you give on the command line override
the settings in the LESS variable. The LESS variable is used both when you call less from
the command line and when less is invoked by another program, such as man. To specify
less as the pager to use with man and other programs, set the environment variable
PAGER to less; under bash you can add the following line to ~/.bash_profile:
export PAGER=less

854 less

Commands

Whenever less pauses, you can enter any of a large number of commands. This section
describes some commonly used commands. Refer to the less info page and the less
––help command for more information. The optional numeric argument n defaults to
1, except as noted. You do not need to follow these commands with a RETURN.

nb or nCONTROL-B (backward) Scrolls backward n lines. The default value of n is the height of the screen

in lines.
nd or nCONTROL-D (down) Scrolls forward n lines. The default value of n is one-half the height of the

screen in lines. When you specify n, it becomes the new default value for this
command.
F (forward) Scrolls forward. If the end of the input is reached, this command waits for
more input and then continues scrolling. This command allows you to use less in a
manner similar to tail –f (page 965), except that less paginates the output as it

appears.
ng (go) Goes to line number n. This command might not work if the file is read from

standard input and you have moved too far into the file. The default value of n is 1.
h or H (help) Displays a summary of all available commands. The summary is displayed
using less, as the list of commands is quite long.
nRETURN or nj (jump) Scrolls forward n lines. The default value of n is 1.
q or :q Terminates less.
nu or nCONTROL-U (up) Scrolls backward n lines. The default value of n is one-half the height of the

screen in lines. When you specify n, it becomes the default value for this command.
v Brings the current file into an editor with the cursor on the current line. The less utility

uses the editor specified by the EDITOR environment variable. If EDITOR is not set,
less uses vi (which is typically linked to vim).
nw Scrolls backward like nb, except that the value of n becomes the new default value

for this command.
ny or nk Scrolls backward n lines. The default value of n is 1.
nz Displays the next n lines like nSPACE except that the value of n, if present, becomes the

new default value for the z and SPACE commands.
nSPACE Displays the next n lines. Pressing the SPACE bar by itself displays the next screen

of text.
/regular-expression

Skips forward in the file, looking for lines that contain a match for regular-expression.
If you begin regular-expression with an exclamation point (!), this command looks for
lines that do not contain a match for regular-expression. If regular-expression begins
with an asterisk (*), this command continues the search through file-list. (A normal
search stops at the end of the current file.) If regular-expression begins with an at sign
(@), this command begins the search at the beginning of file-list and continues to the
end of file-list.

less 855
?regular-expression

This command is similar to the preceding one except it searches backward through the
file (and file-list). An asterisk (*) as the first character in regular-expression causes
the search to continue backward through file-list to the beginning of the first file. An
at sign (@) causes the search to start with the last line of the last file in file-list and
progress toward the first line of the first file.
{ or ( or [ If one of these characters appears in the top line of the display, this command scrolls

forward to the matching right brace, parenthesis, or bracket. For example, typing {
causes less to move the cursor forward to the matching }.
} or ) or ] Similar to the preceding commands, these commands move the cursor backward to

the matching left brace, parenthesis, or bracket.
CONTROL-L

Redraws the screen. This command is useful if the text on the screen has become
garbled.

[n]:n Skips to the next file in file-list. If n is given, skips to the nth next file in file-list.
![command line] Executes command line under the shell specified by the SHELL environment variable
or under sh (usually linked to or a copy of bash or dash) by default. A percent sign (%)

in command line is replaced by the name of the current file. If you omit command line,
less starts an interactive shell.

Examples

The following example displays the file memo.txt. To see more of the file, the user
presses the SPACE bar in response to the less prompt at the lower-left corner of the
screen:
$ less memo.txt
...
memo.txt SPACE
...

In the next example, the user changes the prompt to a more meaningful message and
uses the –N option to display line numbers. The command line also instructs less to
skip forward to the first line containing the string procedure.
$ less -Ps"Press SPACE to continue, q to quit" -N +/procedure ncut.icn
28 procedure main(args)
29
local filelist, arg, fields, delim
30
31
filelist:=[]
...
45
# Check for real field list
46
#
47
if /fields then stop(“-fFIELD_LIST is required.")
48
49
# Process the files and output the fields
Press SPACE to continue, q to quit

856 ln

ln
Makes a link to a file

ln

ln [options] existing-file [new-link]
ln [options] existing-file-list directory
The ln utility creates hard or symbolic links to one or more files. You can create a symbolic link, but
not a hard link, to a directory.

Arguments

In the first syntax the existing-file is the pathname of the file you want to create a link
to. The new-link is the pathname of the new link. When you are creating a symbolic link,
the existing-file can be a directory. If you omit new-link, ln creates a link to existingfile in the working directory, using the same simple filename as existing-file.
In the second syntax the existing-file-list is a list of the pathnames of the ordinary
files you want to create links to. The ln utility establishes the new links in the
directory. The simple filenames of the entries in the directory are the same as the
simple filenames of the files in the existing-file-list.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.

––backup

–b If the ln utility will remove a file, this option makes a backup by appending a
tilde (~) to the filename. This option works only with ––force. L

––force

–f Normally ln does not create the link if new-link already exists. This option
removes new-link before creating the link. When you use ––force and ––backup
together (Linux only), ln makes a copy of new-link before removing it.

––interactive

–i If new-link already exists, this option prompts you before removing new-link.
If you enter y or yes, ln removes new-link before creating the link. If you answer
n or no, ln does not remove new-link and does not make a new link.

––symbolic

–s Creates a symbolic link. When you use this option, the existing-file and the
new-link might be directories and might reside on different filesystems. Refer to
“Symbolic Links” on page 112.

Notes

For more information refer to “Links” on page 109. The ls utility with the –l option
displays the number of hard links to a file (Figure 4-12; page 98).

Hard links

By default ln creates hard links. A hard link to a file is indistinguishable from the original file. All hard links to a file must be in the same filesystem. For more information
refer to “Hard Links” on page 110.

Symbolic links

You can also use ln to create symbolic links. Unlike a hard link, a symbolic link can
exist in a different filesystem from the linked-to file. Also, a symbolic link can point
to a directory. For more information refer to “Symbolic Links” on page 112.

ln 857

If new-link is the name of an existing file, ln does not create the link unless you use
the ––force option (Linux only) or answer yes when using the –i (– –interactive)
option.

Examples

The following command creates a link between memo2 in the literature subdirectory
of Zach’s home directory and the working directory. The file appears as memo2 (the
simple filename of the existing file) in the working directory:
$ ln ~zach/literature/memo2 .

You can omit the period that represents the working directory from the preceding
command. When you give a single argument to ln, it creates a link in the working
directory.
The next command creates a link to the same file. This time the file appears as
new_memo in the working directory:
$ ln ~zach/literature/memo2 new_memo

The following command creates a link that causes the file to appear in Sam’s home
directory:
$ ln ~zach/literature/memo2 ~sam/new_memo

You must have write and execute access permissions to the other user’s directory for
this command to work. If you own the file, you can use chmod to give the other user
write access permission to the file.
The next command creates a symbolic link to a directory. The ls –ld command shows
the link:
$ ln -s /usr/local/bin bin
$ ls -ld bin
lrwxrwxrwx 1 zach zach 14 Feb 10 13:26 bin -> /usr/local/bin

The final example attempts to create a symbolic link named memo1 to the file
memo2. Because the file memo1 exists, ln refuses to make the link. When you use the
–i (––interactive) option, ln asks whether you want to replace the existing memo1 file
with the symbolic link. If you enter y or yes, ln creates the link and the old memo1
disappears.
$ ls -l memo?
-rw-rw-r-1 zach group
-rw-rw-r-1 zach group
$ ln -s memo2 memo1
ln: memo1: File exists
$ ln -si memo2 memo1
ln: replace 'memo1'? y
$ ls -l memo?
lrwxrwxrwx
1 zach group
-rw-rw-r-1 zach group

224 07-31 14:48 memo1
753 07-31 14:49 memo2

5 07-31 14:49 memo1 -> memo2
753 07-31 14:49 memo2

Under Linux you can also use the ––force option to cause ln to overwrite a file.

858 lpr

lpr
Sends files to printers

lpr

lpr [options] [file-list]
lpq [options] [job-identifiers]
lprm [options] [job-identifiers]
The lpr utility places one or more files into a print queue, providing orderly access to printers for several users or processes. This utility can work with printers attached to remote systems. You can use
the lprm utility to remove files from the print queues and the lpq utility to check the status of files in
the queues. Refer to “Notes” later in this section.

Arguments

The file-list is a list of one or more filenames for lpr to print. Often these files are text
files, but many systems are configured so lpr can accept and properly print a variety
of file types, including PostScript and PDF files. Without a file-list, lpr accepts input
from standard input.
The job-identifiers is a list of job numbers or usernames. If you do not know the job
number, use lpq to display a list of print jobs.

Options

Some of the following options depend on which type of file is being printed as well
as on how the system is configured for printing.
–h (no header) Suppresses printing of the header (burst) page. This page is useful
for identifying the owner of the output in a multiuser setup, but printing it is a
waste of paper when this identification is not needed.
–l (lowercase “l”) Specifies that lpr should not preprocess (filter) the file being
printed. Use this option when the file is already formatted for the printer.
–P printer
Routes the print jobs to the queue for the printer named printer. If you do not
use this option, print jobs are routed to the default printer for the local system.
The acceptable values for printer are found in the Linux file /etc/printcap and
can be displayed by an lpstat –t command. These values vary from system to
system.
–r (remove) Deletes the files in file-list after calling lpr.
–# n
Prints n copies of each file. Depending on which shell you are using, you might
need to escape the # by preceding it with a backslash to keep the shell from
interpreting it as a special character.

lpr

859

Discussion

The lpr utility takes input either from files you specify on the command line or from
standard input; it adds these files to the print queue as print jobs. The utility assigns a
unique identification number to each print job. The lpq utility displays the job numbers
of the print jobs that lpr has set up; you can use the lprm utility to remove a job from
the print queue.

lpq

The lpq utility displays information about jobs in a print queue. When called without
any arguments, lpq lists all print jobs queued for the default printer. Use lpr’s –P
printer option with lpq to look at other print queues—even those for printers connected to remote systems. With the –l option, lpq displays more information about
each job. If you give a username as an argument, lpq displays only the printer jobs
belonging to that user.

lprm

One item displayed by lpq is the job number for each print job in the queue. To
remove a job from the print queue, give the job number as an argument to lprm.
Unless you are working with root privileges, you can remove only your own jobs.
Even a user working with root privileges might not be able to remove a job from a
queue for a remote printer. If you do not give any arguments to lprm, it removes the
active printer job (that is, the job that is now printing) from the queue, if you own
that job.

Notes

If you normally use a printer other than the system default printer, you can set up lpr
to use another printer as your personal default by assigning the name of this printer
to the environment variable PRINTER. For example, if you use bash, you can add
the following line to ~/.bash_profile to set your default printer to the printer named
ps:
export PRINTER=ps

LPD and LPR

Traditionally, UNIX had two printing systems: the BSD Line Printer Daemon (LPD)
and the System V Line Printer system (LPR). Linux adopted those systems at first, and
both UNIX and Linux have seen modifications to and replacements for these systems.
Today CUPS is the default printing system under Linux and OS X.

CUPS

CUPS is a cross-platform print server built around the Internet Printing Protocol
(IPP), which is based on HTTP. CUPS provides a number of printer drivers and can
print different types of files, including PostScript files. CUPS provides System V and
BSD command-line interfaces and, in addition to IPP, supports LPD/LPR, HTTP,
SMB, and JetDirect (socket) protocols, among others.
This section describes the LPD command-line interface that runs under CUPS and
also in native mode on older systems.

Examples

The first command sends the file named memo2 to the default printer:
$ lpr memo2

860 lpr

Next a pipeline sends the output of ls to the printer named deskjet:
$ ls | lpr -Pdeskjet

The next example paginates and sends the file memo to the printer:
$ pr -h "Today's memo" memo | lpr

The next example shows a number of print jobs queued for the default printer. Max
owns all the jobs, and the first one is being printed (it is active). Jobs 635 and 639
were created by sending input to lpr’s standard input; job 638 was created by giving
ncut.icn as an argument to the lpr command. The last column gives the size of each
print job.
$ lpq
deskjet is ready
Rank
Owner
active max
1st
max
2nd
max

and printing
Job Files
635 (stdin)
638 ncut.icn
639 (stdin)

Total Size
38128 bytes
3587 bytes
3960 bytes

The next command removes job 638 from the default print queue:
$ lprm 638

ls 861

ls
ls

Displays information about one or more files
ls [options] [file-list]
The ls utility displays information about one or more files. It lists the information alphabetically by
filename unless you use an option that changes the order.

Arguments

When you do not provide an argument, ls displays the names of the visible files (those
with filenames that do not begin with a period) in the working directory.
The file-list is a list of one or more pathnames of any ordinary, directory, or device
files. It can include ambiguous file references.
When the file-list includes a directory, ls displays the contents of the directory. It
displays the name of the directory only when needed to avoid ambiguity, such as
when the listing includes more than one directory. When you specify an ordinary
file, ls displays information about that one file.

Options

Options preceded by a double hyphen (––) work under Linux only. Except as noted,
options named with a single letter and preceded by a single hyphen work under Linux
and OS X.
The options determine the type of information ls displays, the manner in which it
displays the information, and the order in which the information is displayed.
When you do not use an option, ls displays a short list that contains just the names
of files, in alphabetical order.

––almost-all

–A The same as –a but does not list the . and .. directory entries.

––all

–a Includes hidden filenames (those filenames that begin with a period; page 86) in
the listing. Without this option ls does not list information about files with hidden
filenames unless you include the name of a hidden file in the file-list. The *
ambiguous file reference does not match a leading period in a filename, so you
must use this option or explicitly specify a filename (ambiguous or not) that begins
with a period to display files with hidden filenames.

––escape

–b Displays nonprinting characters in a filename, using backslash escape sequences
similar to those used in C language strings (Table VI-20). Other nonprinting
characters are displayed as a backslash followed by an octal number.

Table VI-20

Backslash escape sequences

Sequence

Meaning

\b

BACKSPACE

\n

NEWLINE

862 ls

Table VI-20

Backslash escape sequences (continued)

Sequence

Meaning

\r

RETURN

\t

HORIZONTAL TAB

\v

VERTICAL TAB

\\

BACKSLASH

––color[=when]

The ls utility can display various types of files in different colors but normally
does not use colors (the same result as when you specify when as none). If you
do not specify when or if you specify when as always, ls uses colors. When you
specify when as auto, ls uses colors only when the output goes to a screen. See
the “Notes” section for more information. L

––directory

–d Displays directories without displaying their contents. This option does not
dereference symbolic links; that is, for each file that is a symbolic link, this
option lists the symbolic link, not the file the link points to.
–e Displays ACLs (page 1048). O

––classify

–F Displays a slash (/) after each directory, an asterisk (*) after each executable file,
and an at sign (@) after a symbolic link.

––format=word

By default ls displays files sorted vertically. This option sorts files based on word:
across or horizontal (also –x), separated by commas (also –m), long (also –l), or
single-column (also –1). L

––dereference-command-line

–H (partial dereference) For each file that is a symbolic link, lists the file the link
points to, not the symbolic link itself. This option affects files specified on the
command line; it does not affect files found while descending a directory hierarchy. This option treats files that are not symbolic links normally. See page 115
for information on dereferencing symbolic links.
––human-readable

–h With the –l option, displays sizes in K (kilobyte), M (megabyte), and G (gigabyte)
blocks, as appropriate. This option works with the –l option only. It displays powers of 1,024. Under Mac OS X, it displays B (bytes) in addition to the preceding
suffixes. See also ––si.
––inode

–i Displays the inode number of each file. With the –l option, this option displays
the inode number in column 1 and shifts other items one column to the right.

––dereference

–L (dereference) For each file that is a symbolic link, lists the file the link points to,
not the symbolic link itself. This option affects all files and treats files that are
not symbolic links normally. See page 115 for information on dereferencing
symbolic links.

ls 863

–l (lowercase “l”) Lists more information about each file. This option does not
dereference symbolic links; that is, for each file that is a symbolic link, this
option lists the symbolic link, not the file the link points to. If standard output
for a directory listing is sent to the screen, this option displays the number of
blocks used by all files in the listing on a line before the listing. Use this option
with –h to make file sizes more readable. See the “Discussion” section for more
information.

––format=long

––format=commas

–m Displays a comma-separated list of files that fills the width of the screen.
–P (no dereference) For each file that is a symbolic link, lists the symbolic link, not
the file the link points to. This option affects all files and treats files that are
not symbolic links normally. See page 115 for information on dereferencing
symbolic links. O
––hide-control-chars

–q Displays nonprinting characters in a filename as question marks. When standard
output is sent to the screen, this behavior is the default. Without this option, when
standard output is sent to a filter or a file, nonprinting characters are output as
themselves.
––recursive
––reverse
––size

–R Recursively lists directory hierarchies.
–r Displays the list of filenames in reverse sorted order.
–s Displays the number of 1,024-byte (Linux) or 512-byte (OS X) blocks allocated
to the file. The size precedes the filename. With the –l option, this option displays
the size in column 1 and shifts other items one column to the right. If standard
output for a directory listing is sent to the screen, this option displays the number
of blocks used by all files in the listing on a line before the listing. You can include
the –h option to make the file sizes easier to read.
Under Mac OS X, you can use the BLOCKSIZE environment variable
(page 718) to change the size of the blocks this option reports on.

––si

––sort=time

With the –l option, displays sizes in K (kilobyte), M (megabyte), and G (gigabyte)
blocks, as appropriate. This option works with the –l option only. This option displays powers of 1,000. See also ––human-readable. L
–t Displays files sorted by the time they were last modified.

––sort=word

By default ls displays files in ASCII order. This option sorts the files based on
word: filename extension (–X; Linux only), none (–U; Linux only), file size (–S),
access time (–u), or modification time (–t). See ––time for an exception. L

––time=word

By default ls with the –l option displays the modification time of a file. Set
word to atime (–u) to display the access time or to ctime (–t) to display the
modification time. The list is sorted by word when you also give the
––sort=time option. L

864 ls

–u Displays files sorted by the time they were last accessed.

––sort=access

––format=extension

–X Displays files sorted by filename extension. Files with no filename extension are
listed first. L
–x Displays files sorted by lines (the default display is sorted by columns).

––format=across

––format=single-column

–1 (one) Displays one file per line. This type of display is the default when you
redirect the output from ls.

Discussion

The ls long listing (–l or – –format=long options) displays the columns shown in
Figure 4-12 on page 98. The first column, which contains 10 or 11 characters, is
divided as described in the following paragraphs. The character in the first position
describes the type of file, as shown in Table VI-21.

Table VI-21

First character in a long ls display

Character

Meaning

–

Ordinary

b

Block device

c

Character device

d

Directory

p

FIFO (named pipe)

l

Symbolic link

The next nine characters of the first column describe the access permissions associated with the file. These characters are divided into three sets of three characters each.
The first three characters represent the owner’s access permissions. If the owner has
read access permission to the file, r appears in the first character position. If the
owner is not permitted to read the file, a hyphen appears in this position. The next
two positions represent the owner’s write and execute access permissions. If w
appears in the second position, the owner is permitted to write to the file; if x appears
in the third position, the owner is permitted to execute the file. An s in the third position indicates the file has both setuid and execute permissions. An S in the third
position indicates setuid permission without execute permission. A hyphen indicates
that the owner does not have the access permission associated with the character
position.
In a similar manner the second set of three characters represents the access permissions
for the group the file is associated with. An s in the third position indicates the file has
setgid permission with execute permission, and an S indicates setgid permission with
no execute permission.

ls 865

The third set of three characters represents the access permissions for other users. A
t in the third position indicates that the file has the sticky bit (page 1095) set.
Refer to chmod on page 741 for information on changing access permissions.
If ACLs (page 104) are enabled and a listed file has an ACL, ls –l displays a plus sign
(+) following the third set of three characters.
Still referring to Figure 4-12 on page 98, the second column indicates the number of
hard links to the file. Refer to page 109 for more information on links.
The third and fourth columns display the name of the owner of the file and the name
of the group the file is associated with, respectively.
The fifth column indicates the size of the file in bytes or, if information about a device
file is being displayed, the major and minor device numbers. In the case of a directory,
this number is the size of the directory file, not the size of the files that are entries
within the directory. (Use du [page 791] to display the sum of the sizes of all files in
a directory.) Use the –h option to display the size of files in kilobytes, megabytes, or
gigabytes.
The last two columns display the date and time the file was last modified and the
filename.

Notes

By default ls does not dereference symbolic links: For each file that is a symbolic link,
ls lists the symbolic link, not the file the link points to. Use the –L or –H option to
dereference symbolic links. For more information refer to “Dereferencing Symbolic
Links” on page 115.
For other than long listings (displayed by the –l option), when standard output goes
to the screen, ls displays output in columns based on the width of the screen. When
you redirect standard output to a filter or file, ls displays a single column.
Refer to page 148 for examples of using ls with ambiguous file references.
Set the LANG locale variable to C if ls output is sorted in a way you would not
expect. See the tip titled “The C locale” on page 318 for more information.
With the ––color option, ls displays the filenames of various types of files in different
colors. By default executable files are green, directory files are blue, symbolic links
are cyan, archives and compressed files are red, and ordinary text files are black. The
manner in which ls colors the various file types is specified in the /etc/DIR_COLORS
file. If this file does not exist on the local system, ls will not color filenames. You can
modify /etc/DIR_COLORS to alter the default color/filetype mappings on a systemwide basis. For your personal use, you can copy /etc/DIR_COLORS to the
~/.dir_colors file in your home directory and modify it. For your login, ~/.dir_colors
overrides the systemwide colors established in /etc/DIR_COLORS. Refer to the
dir_colors and dircolors man pages for more information.

866 ls

Examples

See “Dereferencing Symbolic Links Using ls” on page 116 for examples that use the
–H and –L options.
The first example shows ls, without any options or arguments, listing the names of
the files in the working directory in alphabetical order. The list is sorted in columns
(vertically):
$ ls
bin calendar
c
execute

letters
shell

The next example shows the ls utility with the –x option, which sorts the files
horizontally:
$ ls -x
bin
execute

c
letters

calendar
shell

The –F option appends a slash (/) to files that are directories, an asterisk to files that
are executable, and an at sign (@) to files that are symbolic links:
$ ls -Fx
bin/
execute*

c/
letters/

calendar
shell@

Next the –l (long) option displays a long list. The files are still in alphabetical order:
$ ls -l
drwxr-xr-x
drwxr-xr-x
-rw-r--r--rwxr-xr-x
drwxr-xr-x
lrwxrwxrwx

2
2
1
1
2
1

sam
sam
sam
sam
sam
sam

pubs 4096 05-20 09:17 bin
pubs 4096 03-26 11:59 c
pubs 104 01-09 14:44 calendar
pubs
85 05-06 08:27 execute
pubs 4096 04-04 18:56 letters
sam
9 05-21 11:35 shell -> /bin/bash

The –a (all) option lists all files, including those with hidden names:
$ ls -a
.
..
.profile

bin
c
calendar

execute
letters
shell

Combining the –a and –l options displays a long listing of all files, including those with
hidden filenames (page 86), in the working directory. This list is still in alphabetical
order:
$ ls -al
drwxr-xr-x
drwxrwxrwx
-rw-r--r-drwxr-xr-x
drwxr-xr-x
-rw-r--r--rwxr-xr-x
drwxr-xr-x
lrwxrwxrwx

5
3
1
2
2
1
1
2
1

sam
sam
sam
sam
sam
sam
sam
sam
sam

sam
sam
sam
pubs
pubs
pubs
pubs
pubs
sam

4096
4096
160
4096
4096
104
85
4096
9

05-21
05-21
05-21
05-20
03-26
01-09
05-06
04-04
05-21

11:50
11:50
11:45
09:17
11:59
14:44
08:27
18:56
11:35

.
..
.profile
bin
c
calendar
execute
letters
shell -> /bin/bash

ls 867

When you add the –r (reverse) option to the command line, ls produces a list in
reverse alphabetical order:
$ ls -ral
lrwxrwxrwx
drwxr-xr-x
-rwxr-xr-x
-rw-r--r-drwxr-xr-x
drwxr-xr-x
-rw-r--r-drwxrwxrwx
drwxr-xr-x

1
2
1
1
2
2
1
3
5

sam
sam
sam
sam
sam
sam
sam
sam
sam

sam
pubs
pubs
pubs
pubs
pubs
sam
sam
sam

9
4096
85
104
4096
4096
160
4096
4096

05-21
04-04
05-06
01-09
03-26
05-20
05-21
05-21
05-21

11:35
18:56
08:27
14:44
11:59
09:17
11:45
11:50
11:50

shell -> /bin/bash
letters
execute
calendar
c
bin
.profile
..
.

Use the –t and –l options to list files so the most recently modified file appears at the
top of the list:
$ ls -tl
lrwxrwxrwx
drwxr-xr-x
-rwxr-xr-x
drwxr-xr-x
drwxr-xr-x
-rw-r--r--

1
2
1
2
2
1

sam
sam
sam
sam
sam
sam

sam
9 05-21 11:35 shell -> /bin/bash
pubs 4096 05-20 09:17 bin
pubs
85 05-06 08:27 execute
pubs 4096 04-04 18:56 letters
pubs 4096 03-26 11:59 c
pubs 104 01-09 14:44 calendar

Together the –r and –t options cause the file you modified least recently to appear at
the top of the list:
$ ls -trl
-rw-r--r-drwxr-xr-x
drwxr-xr-x
-rwxr-xr-x
drwxr-xr-x
lrwxrwxrwx

1
2
2
1
2
1

sam
sam
sam
sam
sam
sam

pubs 104 01-09 14:44 calendar
pubs 4096 03-26 11:59 c
pubs 4096 04-04 18:56 letters
pubs
85 05-06 08:27 execute
pubs 4096 05-20 09:17 bin
sam
9 05-21 11:35 shell -> /bin/bash

The next example shows ls with a directory filename as an argument. The ls utility
lists the contents of the directory in alphabetical order:
$ ls bin
c e lsdir

To display information about the directory file itself, use the –d (directory) option.
This option lists information about the directory only:
$ ls -dl bin
drwxr-xr-x 2 sam pubs 4096 05-20 09:17 bin

You can use the following command to display a list of all files that have hidden filenames
(filenames that start with a period) in your home directory. It is a convenient way to list
the startup (initialization) files in your home directory.

868 ls
$ ls -d ~/.*
/home/sam/.
/home/sam/..
/home/sam/.AbiSuite
/home/sam/.Azureus
/home/sam/.BitTornado
...

A plus sign (+) to the right of the permissions in a long listing denotes the presence
of an ACL for a file:
$ ls -l memo
-rw-r--r--+ 1 sam pubs 19 07-19 21:59 memo

Under OS X you can use the –le option to display an ACL:
$ ls -le memo
-rw-r--r-- + 1 sam pubs 19 07-19 21:59 memo
0: user:jenny allow read

See page 1048 for more examples of using ls under OS X to display ACLs.

make 869

Keeps a set of programs current
make [options] [target-files] [arguments]
The GNU make utility keeps a set of executable programs (or other files) current, based on differences
in the modification times of the programs and the source files that each program is dependent on.

Arguments

The target-files refer to targets on dependency lines in the makefile. When you do
not specify a target-file, make updates the target on the first dependency line in the
makefile. Command-line arguments of the form name=value set the variable name
to value inside the makefile. See the “Discussion” section for more information.

Options

If you do not use the –f option, make takes its input from a file named GNUmakefile,
makefile, or Makefile (in that order) in the working directory. In this section, this
input file is referred to as makefile. Many users prefer to use the name Makefile
because it shows up earlier in directory listings.

The Mac OS X version of make accepts long options
tip Options for make preceded by a double hyphen (––) work under Mac OS X as well as under Linux.
––directory=dir –C dir

Changes directories to dir before starting.
––debug

–d Displays information about how make decides what to do.

––file=file

–f file
(input file) Uses file as input instead of makefile.

––jobs[=n]

–j [n]
Runs up to n commands at the same time instead of the default of one command. Running multiple commands simultaneously is especially effective if you
are working on a multiprocessor system. If you omit n, make does not limit the
number of simultaneous jobs.

––keep-going

–k Continues with the next file from the list of target-files instead of quitting when
a construction command fails.

––just-print or
––dry-run

–n (no execution) Displays, but does not execute, the commands that make would
execute to bring the target-files up-to-date.

––silent or
––quiet
––touch

–s Does not display the names of the commands being executed.
–t Updates the modification times of target files but does not execute any construction
commands. Refer to touch on page 985.

make

make

870 make

Discussion

The make utility bases its actions on the modification times of the programs and the
source files that each program depends on. Each of the executable programs, or
target-files, depends on one or more prerequisite files. The relationships between
target-files and prerequisites are specified on dependency lines in a makefile. Construction commands follow the dependency line, specifying how make can update the
target-files. See page 872 for examples of makefiles.

Documentation

Refer to www.gnu.org/software/make/manual/make.html and to the make info page
for more information about make and makefiles.
Although the most common use of make is to build programs from source code, this
general-purpose build utility is suitable for a wide range of applications. Anywhere
you can define a set of dependencies to get from one state to another represents a
candidate for using make.
Much of make’s power derives from the features you can set up in a makefile. For
example, you can define variables using the same syntax found in the Bourne Again
Shell. Always define the variable SHELL in a makefile; set it to the pathname of the
shell you want to use when running construction commands. To define the variable
and assign it a value, place the following line near the top of a makefile:
SHELL=/bin/sh

Assigning the value /bin/sh to SHELL allows you to use a makefile on other computer
systems. On Linux systems, /bin/sh is generally linked to /bin/bash or /bin/dash.
Under Mac OS X, /bin/sh is a copy of bash that attempts to emulate the original
Bourne Shell. The make utility uses the value of the environment variable SHELL if
you do not set SHELL in a makefile. If SHELL does not hold the path of the shell
you intended to use and if you do not set SHELL in a makefile, the construction commands might fail.
Following is a list of additional make features.
• You can run specific construction commands silently by preceding them
with an at sign (@). For example, the following lines will display a short help
message when you run the command make help:
help:
@echo
@echo
@echo
@echo
@echo

"You can make the following:"
" "
"libbuf.a
-- the buffer library"
"Bufdisplay
-- display any-format buffer"
"Buf2ppm
-- convert buffer to pixmap"

Without the @s in the preceding example, make would display each of the
echo commands before executing it. This way of displaying a message
works because no file is named help in the working directory. As a result
make runs the construction commands in an attempt to build this file.
Because the construction commands display messages but do not build the
file help, you can run make help repeatedly with the same result.

make 871

• You can cause make to ignore the exit status of a command by preceding the
command with a hyphen (–). For example, the following line allows make
to continue regardless of whether the call to /bin/rm is successful (the call
to /bin/rm fails if libbuf.a does not exist):
-/bin/rm libbuf.a

• You can use special variables to refer to information that might change from
one use of make to the next. Such information might include files that need
updating, files that are newer than the target, and files that match a pattern.
For example, you can use the variable $? in a construction command to
identify all prerequisite files that are newer than the target file. This variable
allows you to print any files that have changed since the last time you
printed those files:
list:
.list
.list:
Makefile buf.h xtbuff_ad.h buff.c buf_print.c xtbuff.c
pr $? | lpr
date >.list

The target list depends on the source files that might be printed. The construction command pr $? | lpr prints only those source files that are newer
than the file .list. The line date > .list modifies the .list file so it is newer than
any of the source files. The next time you run the command make list, only
the files that have been changed are printed.
• You can include other makefiles as if they were part of the current makefile.
The following line causes make to read Make.config and treat the content
of that file as though it were part of the current makefile, allowing you to
put information common to more than one makefile in a single place:
include Make.config

Note

Under Mac OS X, the make utility is part of the optional Xcode package.

Examples

The first example causes make to bring the target-file named analysis up-to-date by
issuing three cc commands. It uses a makefile named GNUmakefile, makefile, or
Makefile in the working directory.
$ make analysis
cc -c analy.c
cc -c stats.c
cc -o analysis analy.o stats.o

The following example also updates analysis but uses a makefile named analysis.mk
in the working directory:
$ make -f analysis.mk analysis
'analysis' is up to date.

872 make

The next example lists the commands make would execute to bring the target-file
named credit up-to-date. Because of the –n option, make does not execute the
commands.
$ make -n credit
cc -c -O credit.c
cc -c -O accounts.c
cc -c -O terms.c
cc -o credit credit.c accounts.c terms.c

The next example uses the –t option to update the modification time of the target-file
named credit. After you use this option, make thinks that credit is up-to-date.
$ make -t credit
$ make credit
'credit' is up to date.
Example makefiles

Following is a very simple makefile named Makefile. This makefile compiles a program
named morning (the target file). The first line is a dependency line that shows morning
depends on morning.c. The next line is the construction line: It shows how to create
morning using the gcc C compiler. The construction line must be indented using a TAB,
not SPACEs.
$ cat Makefile
morning: morning.c
TAB gcc -o morning morning.c

When you give the command make, make compiles morning.c if it has been modified
more recently than morning.
The next example is a simple makefile for building a utility named ff. Because the cc
command needed to build ff is complex, using a makefile allows you to rebuild ff easily,
without having to remember and retype the cc command.
$ cat Makefile
# Build the ff command from the fastfind.c source
SHELL=/bin/sh
ff:
gcc -traditional -O2 -g -DBIG=5120 -o ff fastfind.c myClib.a
$ make ff
gcc -traditional -O2 -g -DBIG=5120 -o ff fastfind.c myClib.a

In the next example, a makefile keeps the file named compute up-to-date. The make
utility ignores comment lines (lines that begin with a hashmark [#]); the first three
lines of the following makefile are comment lines. The first dependency line shows
that compute depends on two object files: compute.o and calc.o. The corresponding
construction line gives the command make needs to produce compute. The second
dependency line shows that compute.o depends not only on its C source file but also
on the compute.h header file. The construction line for compute.o uses the C compiler optimizer (–O3 option). The third set of dependency and construction lines is

make 873

not required. In their absence, make infers that calc.o depends on calc.c and produces
the command line needed for the compilation.
$ cat Makefile
#
# Makefile for compute
#
compute: compute.o calc.o
gcc -o compute compute.o calc.o
compute.o: compute.c compute.h
gcc -c -O3 compute.c
calc.o: calc.c
gcc -c calc.c
clean:
rm

*.o *core* *~

There are no prerequisites for clean, the last target. This target is often used to remove
extraneous files that might be out-of-date or no longer needed, such as .o files.
The next example shows a much more sophisticated makefile that uses features not
discussed in this section. Refer to the sources cited under “Documentation” on
page 870 for information about these and other advanced features.
$ cat Makefile
###########################################################
## build and maintain the buffer library
###########################################################
SHELL=/bin/sh
###########################################################
## Flags and libraries for compiling. The XLDLIBS are needed
#
whenever you build a program using the library. The CCFLAGS
#
give maximum optimization.
CC=gcc
CCFLAGS=-O2 $(CFLAGS)
XLDLIBS= -lXaw3d -lXt -lXmu -lXext -lX11 -lm
BUFLIB=libbuf.a
###########################################################
## Miscellaneous
INCLUDES=buf.h
XINCLUDES=xtbuff_ad.h
OBJS=buff.o buf_print.o xtbuff.o
###########################################################
## Just a 'make' generates a help message
help:
Help
@echo "You can make the following:"
@echo " "
@echo " libbuf.a
-- the buffer library"
@echo " bufdisplay
-- display any-format buffer"
@echo " buf2ppm
-- convert buffer to pixmap"

874 make
###########################################################
## The main target is the library
libbuf.a: $(OBJS)
-/bin/rm libbuf.a
ar rv libbuf.a $(OBJS)
ranlib libbuf.a
###########################################################
## Secondary targets -- utilities built from the library
bufdisplay: bufdisplay.c libbuf.a
$(CC) $(CCFLAGS) bufdisplay.c -o bufdisplay $(BUFLIB) $(XLDLIBS)
buf2ppm: buf2ppm.c libbuf.a
$(CC) $(CCFLAGS) buf2ppm.c -o buf2ppm $(BUFLIB)

###########################################################
## Build the individual object units
buff.o: $(INCLUDES) buff.c
$(CC) -c $(CCFLAGS) buff.c
buf_print.o:$(INCLUDES) buf_print.c
$(CC) -c $(CCFLAGS) buf_print.c
xtbuff.o: $(INCLUDES) $(XINCLUDES) xtbuff.c
$(CC) -c $(CCFLAGS) xtbuff.c

The make utility can be used for tasks other than compiling code. As a final example,
assume you have a database that lists IP addresses and the corresponding hostnames
in two columns; also assume the database dumps these values to a file named
hosts.tab. You need to extract only the hostnames from this file and generate a Web
page named hosts.html containing these names. The following makefile is a simple
report writer:
$ cat makefile
#
SHELL=/bin/bash
#
hosts.html: hosts.tab
@echo "" > hosts.html
@awk '{print $$2, "
"}' hosts.tab >> hosts.html @echo "" >> hosts.html man 875 man man [options] [section] command man –k keyword The man (manual) utility provides online documentation for Linux and OS X utilities. In addition to utilities, documentation is available for many system commands and details that relate to Linux and OS X. Because many Linux and OS X utilities come from GNU, the GNU info utility (page 36) frequently provides more complete information about them. A one-line header is associated with each manual page. This header consists of a utility name, the section of the manual in which the command is found, and a brief description of what the utility does. These headers are stored in a database, enabling you to perform quick searches on keywords associated with each man page. Arguments The section argument tells man to limit its search to the specified section of the manual (see page 34 for a listing of manual sections). Without this argument man searches the sections in numerical order and displays the first man page it finds. In the second form of the man command, the –k option searches for the keyword in the database of man page headers; man displays a list of headers that contain the keyword. A man –k command performs the same function as apropos (page 35). Options Options preceded by a double hyphen (––) work under Linux only. Not all options preceded by a double hyphen work under all Linux distributions. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––all –a Displays man pages for all sections of the manual. Without this option man displays only the first page it finds. Use this option when you are not sure which section contains the desired information. –K keyword Searches for keyword in all man pages. This option can take a long time to run. It is not available under some Linux distributions. ––apropos –k keyword Displays manual page headers that contain the string keyword. You can scan this list for commands of interest. This option is equivalent to the apropos command (page 35). ––manpath=path –M path Searches the directories in path for man pages, where path is a colon-separated list of directories. See “Discussion.” ––troff –t Formats the page for printing on a PostScript printer. The output goes to standard output. man Displays documentation for utilities 876 man Discussion The manual pages are organized into sections, each pertaining to a separate aspect of the Linux system. Section 1 contains user-callable utilities and is the section most likely to be accessed by users who are not system administrators or programmers. Other sections of the manual describe system calls, library functions, and commands used by system administrators. See page 34 for a listing of the manual sections. Pager The man utility uses less (page 852) to display manual pages that fill more than one screen. To use another pager, set the environment variable PAGER to the pathname of that pager. For example, adding the following line to the ~/.bash_profile file sets up a bash user to use more instead of less: export PAGER=$(which more) This statement assigns the pathname of the more utility [$(which more) returns the absolute pathname of the more utility] to the environment variable PAGER. Under OS X, less and more are copies of the same file. Because of the way each is called, they work slightly differently. MANPATH You can tell man where to look for man pages by setting the environment variable MANPATH to a colon-separated list of directories. For example, bash users running Linux can add the following line to ~/.bash_profile to cause man to search the /usr/man, /usr/local/man, and /usr/local/share/man directories: export MANPATH=/usr/man:/usr/local/man:/usr/local/share/man Working as a privileged user, you can edit /etc/manpath.config or /etc/man.config (Linux) or /etc/man.conf (OS X) to further configure man. Refer to the man man page for more information. Notes See page 33 for another discussion of man. The argument to man does not have to be the name of a utility. For example, the command man ascii lists the ASCII characters and their various representations; the command man –k postscript lists man pages that pertain to PostScript. The man pages are commonly stored in an unformatted, compressed form. When you request a man page, it has to be decompressed and formatted before being displayed. To speed up subsequent requests for that man page, man attempts to save the formatted version of the page. Some utilities described in the manual pages have the same name as shell builtin commands. The behavior of the shell builtin might differ slightly from the behavior of the utility as described in the manual page. For information about shell builtins, see the man page for builtin or the man page for a specific shell. References to man pages frequently use section numbers in parentheses. For example, write(2) refers to the man page for write in section 2 of the manual (page 34). The first of the following commands uses the col utility to generate a simple text man page that does not include bold or underlined text. The second command generates a PostScript version of the man page. man 877 $ man ls | col -b > ls.txt $ man -t ls > ls.ps Under Linux you can use ps2pdf to convert the PostScript file to a PDF file. Examples The following example uses man to display the documentation for the write utility, which sends messages to another user’s terminal: $ man write WRITE(1) User Commands WRITE(1) NAME write - send a message to another user SYNOPSIS write user [ttyname] DESCRIPTION Write allows you to communicate with other users, by copying lines from your terminal to theirs. When you run the write command, the user you are gets a message of the form: writing to Message from yourname@yourhost on yourtty at hh:mm ... ... The next example displays the man page for another utility—the man utility itself, which is a good starting place for someone learning about the system: $ man man MAN(1) Manual pager utils MAN(1) NAME man - an interface to the on-line reference manuals SYNOPSIS man [-C file] [-d] [-D] [--warnings[=warnings]] [-R encoding] [-L locale] [-m system[,...]] [-M path] [-S list] [-e extension] [-i|-I] [--regex|--wildcard] [--names-only] [-a] [-u] [--no-subpages] [-P pager] [-r prompt] [-7] [-E encoding] ... DESCRIPTION man is the system's manual pager. Each page argument given to man is normally the name of a program, utility or function. The manual page associated with each of these arguments is ... You can also use the man utility to find the man pages that pertain to a topic. In the next example, man –k displays man page headers containing the string latex. The apropos utility functions similarly to man –k. 878 man $ man -k latex elatex (1) [latex] latex (1) mkindex (1) pdflatex (1) pod2latex (1) Pod::LaTeX (3pm) ... - structured text formatting and typesetting structured text formatting and typesetting script to process LaTeX index and glossary files PDF output from TeX convert pod documentation to latex format Convert Pod data to formatted Latex The search for the keyword entered with the –k option is not case sensitive. Although the keyword entered on the command line is all lowercase, it matches the last header, which contains the string LaTeX (uppercase and lowercase). The 3pm entry on the last line indicates the man page is from Section 3 (Subroutines) of the Linux System Manual and comes from the Perl Programmers Reference Guide (it is a Perl subroutine; see Chapter 11 for more information on the Perl programming language). mc 879 mc mc [options] [dirL [dirR]] Midnight Commander is a full-screen textual user shell that includes a comprehensive file manager; a simple editor; and FTP, SSH, and Samba clients. Arguments The dirL and dirR are the names of the directories Midnight Commander displays in the left and right panels, respectively. When called without arguments, Midnight Commander displays in both panels the working directory of the shell it was called from. Options This section describes a few of the many options Midnight Commander accepts. See the mc man page for a complete list. You can set many options from the Options menu on the menubar (page 882). ––stickcars ––nocolor ––color ––nomouse ––version Notes –a Disables the display of graphical characters for drawing lines. –b Displays Midnight Commander in black and white. –c Displays Midnight Commander in color if the device it is running on is capable of displaying color. –d Disables support for the mouse. –V Displays version and build information. Midnight Commander (www.midnight-commander.org) was written in 1994 by Miguel de Icaza. It has a comprehensive man page (mc). Current versions of Midnight Commander accept mouse input (which this section does not discuss). The Midnight Commander input lines approximate standard emacs commands and the Midnight Commander documentation uses the same key notation as emacs; see page 223 for more information. By default Midnight Commander uses its internal editor. Select OptionsConfiguration to display the Configure options window (Figure VI-5; page 883) and remove the tick from the check box labeled Use internal edit to use a different editor. When Midnight Commander does not use its internal editor, it uses the editor specified by the EDITOR environment variable. If EDITOR is not set, it uses vi (which is typically linked to vim). mc Manages files in a textual environment (aka Midnight Commander) 880 mc Menubar Selection bar Directory panels Shell command line Function key labels Figure VI-1 The Display The basic Midnight Commander screen As shown in Figure VI-1, the Midnight Commander screen is divided into four sections, the two largest occupied by the directory panels. The top line holds the menubar; if it is not visible, press F9. The next-to-bottom line holds the shell command line and the bottom line holds the function key labels. Function key labels change depending on context. The current directory panel holds the selection bar (the highlight that runs the width of the panel), which specifies the current file. Most commands work on the current file; some commands, such as those that copy or move a file, use the directory displayed in the second panel as the target. Moving the Cursor This section describes some of the ways you can move the cursor from the directory panel and the menubar. In the current directory panel • UP ARROW (or CONTROL-P) and DOWN ARROW (or CONTROL-N) keys move the selection bar up and down, respectively. • TAB moves the selection bar to the second panel and makes the second panel the current panel. • F1 displays the Help window (Figure VI-2). Exactly what F1 displays depends on the context in which it is used. • F2 displays the User menu (Figure VI-3). • F9 moves the cursor to the menubar (Figure VI-4; page 882). • F10 exits from Midnight Commander or closes a window if one is open. mc 881 Figure VI-2 The Help window (F1) • Typing a command enters the command on the shell command line. • Other function keys open windows as specified on the bottom line. With an entry on the menubar highlighted (after pressing F9) • RIGHT ARROW and LEFT ARROW keys move the cursor from menu to menu. • With no drop-down menu displayed, the DOWN ARROW or RETURN key opens the highlighted drop-down menu. Alternatively, you can type the initial letter of the menu to open it. Figure VI-3 The User menu (F2) 882 mc Figure VI-4 The menubar and the Left drop-down menu • With a drop-down menu displayed, UP ARROW and DOWN ARROW keys move the highlight from menu item to menu item and RETURN selects the highlighted item. Alternatively, you can type the highlighted letter in an item to select it. Commands This section describes giving commands by selecting menu items. You can also give commands by typing emacs-like commands at the shell command line. For example, CONTROL-X c (C-x c in emacs notation; page 223) runs chmod on the current file. See the mc man page for a complete list of Midnight Commander commands. Menubar The menubar, shown in Figure VI-4, holds five drop-down menus. • Left—Changes the display and organization of the left directory panel, allowing you to control which information about each file is listed, the order in which the listed files are displayed, and more. This menu also enables you to open an FTP, OpenSSH, or Samba site. • File—Allows you to view, edit, copy, move, link, rename, and delete the current file. Also allows you to change the mode (chmod), group, or owner of the current file. The function keys are shortcuts to many items on this menu. • Command—Enables you to find, compress, compare, and list files in many ways. • Options—Displays windows that allow you to configure Midnight Commander. One of the most useful is the Configure options window (Figure VI-5). • Right—Performs the same functions as the Left menu except it works on the right directory panel. mc 883 Figure VI-5 The Configure options window Tutorial This tutorial follows Sam as he works with Midnight Commander. Changing directories Sam starts Midnight Commander from his home directory by giving the command mc. Midnight Commander displays a screen similar to the one shown in Figure VI-1 on page 880, except it shows his home directory in both directory panels; the selection bar is at the top of the left panel, over the .. entry (page 92). Sam wants to display a list of the files in the memos subdirectory of his home directory, so he presses the DOWN ARROW key several times until the selection bar is over memos. Then he presses RETURN to display the contents of the memos directory in the left panel. Midnight Commander displays the name of the directory it is displaying (~/memos) at the upper-left corner of the panel. Viewing a file Next Sam wants to look at the contents of the memoD file in the memos directory; he uses the DOWN ARROW key to move the selection bar until it is over memoD and presses F3 (View). Because it is a long text file, Sam uses the SPACE bar to scroll through the file; Midnight Commander displays information about which part of the file it is displaying at the top of the screen. The function key labels on the last line change to tasks that are useful while viewing a file (e.g., edit, hex, save). When Sam is done viewing the file, he presses F10 (Quit) to close the view window and redisplay the panels. Copying a file Sam wants to copy memoD, but he is not sure which directory he wants to copy it to. He makes the right directory panel the active panel by pressing TAB; the selection bar moves to the right panel, which is still displaying his home directory. Sam moves the selection bar until it is over the name of the directory he wants to examine and presses RETURN. When he is satisfied that this directory is the one he wants to copy memoD to, he presses TAB again to move the selection bar back to the left panel, where 884 mc Figure VI-6 The Copy window it is once again over memoD. Next he presses F5 (Copy) to display the Copy window (Figure VI-6), and presses RETURN to copy the file. Using FTP Next, Sam wants to look at some files on the ftp.kernel.org FTP site, so he presses F9 (PullDn) to move the cursor to the menubar; Midnight Commander highlights Left on the menubar. Sam presses RETURN to display the Left drop-down menu. Using the DOWN ARROW key, he moves the highlight down to the FTP link and presses RETURN; Midnight Commander displays the FTP to machine window (Figure VI-7). Sam presses RETURN to display the top level of the ftp.kernel.org site in the left directory panel. Now he can work with the files at this site as though they were local, copying files to the local system as needed. Figure VI-7 The FTP to machine window mc 885 The hotlist Because Sam visits ftp.kernel.org frequently, he decides to add it to his hotlist. You can add any directory, local or remote, to your hotlist. With ftp.kernel.org displayed in the left panel, Sam enters CONTROL-\ to open the Directory hotlist window and presses A to add the current directory to the hotlist. Midnight Commander displays a small Add to hotlist window and Sam presses RETURN to confirm he wants to add the directory to his hotlist. Now ftp.kernel.org is in the hotlist; Sam presses F10 (Quit) to close the Directory hotlist window. Because Sam is done working with ftp.kernel.org, he types cd (followed by a RETURN); the command appears on the shell command line and the left panel displays his home directory. When Sam wants to display the files at ftp.kernel.org, he can give the command CONTROL-\ to display the Directory hotlist window, move the highlight until it is over ftp.kernel.org, and press RETURN. This tutorial covers a very small selection of Midnight Commander commands. Use the Midnight Commander help key (F1) and the mc man page to learn about other commands. 886 mkdir mkdir mkdir Creates a directory mkdir [options] directory-list The mkdir utility creates one or more directories. Arguments The directory-list is a list of pathnames of directories that mkdir creates. Options Under Linux, mkdir accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––mode=mode –m mode Sets the permission to mode. You can represent the mode absolutely using an octal number (Table VI-7 on page 742) or symbolically (Table VI-4 on page 741). ––parents –p Creates directories that do not exist in the path to the directory you wish to create. ––verbose –v Displays the name of each directory created. This option is helpful when used with the –p option. Notes You must have permission to write to and search (execute permission) the parent directory of the directory you are creating. The mkdir utility creates directories that contain the standard hidden entries (. and ..). Examples The following command creates the accounts directory as a subdirectory of the working directory and the prospective directory as a subdirectory of accounts: $ mkdir -p accounts/prospective Without changing working directories, the same user now creates another subdirectory within the accounts directory: $ mkdir accounts/existing Next the user changes the working directory to the accounts directory and creates one more subdirectory: $ cd accounts $ mkdir closed The last example shows the user creating another subdirectory. This time the ––mode option removes all access permissions for the group and others: $ mkdir -m go= accounts/past_due mkfs 887 Creates a filesystem on a device mkfs [options] device The mkfs utility creates a filesystem on a device such as a flash drive or a partition of a hard disk. It acts as a front end for programs that create filesystems, each specific to a filesystem type. The mkfs utility is available under Linux only. L mkfs destroys all data on a device caution Be careful when using mkfs: It destroys all data on a device or partition. Arguments The device is the name of the device that you want to create the filesystem on. If the device name is in /etc/fstab, you can use the mount point of the device instead of the device name (e.g., /home in place of /dev/sda2). Options When you run mkfs, you can specify both global options and options specific to the filesystem type that mkfs is creating (e.g., ext2, ext3, ext4, msdos, reiserfs). Global options must precede type-specific options. Global Options –t fstype (type) The fstype is the type of filesystem you want to create—for example, ext3, ext4, msdos, or reiserfs. The default filesystem varies. –v (verbose) Displays more output. Use –V for filesystem-specific information. Filesystem Type-Specific Options The options described in this section apply to many common filesystem types, including ext2/ext3/ext4. The following command lists the filesystem creation utilities available on the local system: $ ls /sbin/mkfs.* /sbin/mkfs.btrfs /sbin/mkfs.cramfs /sbin/mkfs.ext2 /sbin/mkfs.ext3 /sbin/mkfs.ext4 /sbin/mkfs.ext4dev /sbin/mkfs.msdos /sbin/mkfs.ntfs /sbin/mkfs.vfat /sbin/mkfs.xfs There is frequently a link to /sbin/mkfs.ext2 at /sbin/mke2fs. Review the man page or give the pathname of the filesystem creation utility to determine which options the utility accepts. $ /sbin/mkfs.ext4 Usage: mkfs.ext4 [-c|-l filename] [-b block-size] [-f fragment-size] [-i bytes-per-inode] [-I inode-size] [-J journal-options] [-G meta group size] [-N number-of-inodes] [-m reserved-blocks-percentage] [-o creator-os] [-g blocks-per-group] [-L volume-label] [-M last-mounted-directory] [-O feature[,...]] [-r fs-revision] [-E extended-option[,...]] [-T fs-type] [-U UUID] [-jnqvFKSV] device [blocks-count] mkfs mkfs 888 mkfs –b size (block) Specifies the size of blocks in bytes. On ext2, ext3, and ext4 filesystems, valid block sizes are 1,024, 2,048, and 4,096 bytes. –c (check) Checks for bad blocks on the device before creating a filesystem. Specify this option twice to perform a slow, destructive, read-write test. Discussion Before you can write to and read from a hard disk or other device in the usual fashion, there must be a filesystem on it. Typically a hard disk is divided into partitions (page 1085), each with a separate filesystem. A flash drive normally holds a single filesystem. Refer to Chapter 4 for more information on filesystems. Notes Under OS X, use diskutil (page 782) to create a filesystem. You can use tune2fs (page 991) with the –j option to change an existing ext2 filesystem into a journaling filesystem (page 1076) of type ext3. (See the “Examples” section.) You can also use tune2fs to change how often fsck (page 814) checks a filesystem. mkfs is a front end Examples Much like fsck, mkfs is a front end that calls other utilities to handle various types of filesystems. For example, mkfs calls mke2fs (which is typically linked to mkfs.ext2, mkfs.ext3, and mkfs.ext4) to create the widely used ext2/ext3/ext4 filesystems. Refer to the mke2fs man page for more information. Other utilities that mkfs calls are typically named mkfs.type, where type is the filesystem type. By splitting mkfs in this manner, filesystem developers can provide programs to create their filesystems without affecting the development of other filesystems or changing how system administrators use mkfs. In the following example, mkfs creates a default filesystem on the device at /dev/sda2: # mkfs /dev/sda2 mke2fs 1.41.14 (22-Dec-2010) Filesystem label= OS type: Linux Block size=1024 (log=0) Fragment size=1024 (log=0) Stride=0 blocks, Stripe width=0 blocks 128016 inodes, 512000 blocks 25600 blocks (5.00%) reserved for the super user First data block=1 Maximum filesystem blocks=67633152 63 block groups 8192 blocks per group, 8192 fragments per group 2032 inodes per group Superblock backups stored on blocks: 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409 Writing inode tables: done Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 22 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. mkfs 889 The next command writes a VFAT filesystem to a USB flash drive at /dev/sdb1: # mkfs -t vfat /dev/sdb1 mkfs.vfat 3.0.12 (29 Oct 2011) See page 993 for an example that uses tune2fs to convert an ext2 filesystem to an ext3 journaling filesystem. 890 mv mv Renames or moves a file mv mv [options] existing-file new-filename mv [options] existing-file-list directory mv [options] existing-directory new-directory The mv utility, which renames or moves one or more files, has three formats. The first renames a single file with a new filename that you supply. The second renames one or more files so that they appear in a specified directory. The third renames a directory. The mv utility physically moves (copies and deletes) the original file if it is not possible to rename it (that is, if you move the file from one filesystem to another). Arguments In the first form, the existing-file is a pathname of the ordinary file you want to rename. The new-filename is the new pathname of the file. In the second form, the existing-file-list is a list of the pathnames of the files you want to rename and the directory specifies the new parent directory for the files. The files you rename will have the same simple filenames as each of the files in the existingfile-list but new absolute pathnames. The third form renames the existing-directory with the new-directory name. This form works only when the new-directory does not exist. Options Under Linux, mv accepts the common options described on page 718. Options preceded by a double hyphen (– –) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––backup –b Makes a backup copy (by appending ~ to the filename) of any file that would be overwritten. L ––force –f Causes mv not to prompt you if a move would overwrite an existing file that you do not have write permission for. You must have write permission for the directory holding the existing file. ––interactive –i Prompts for confirmation if mv would overwrite a file. If your response begins with a y or Y, mv overwrites the file; otherwise, mv does not move the file. ––update –u If a move would overwrite an existing file—not a directory—this option causes mv to compare the modification times of the source and target files. If the target file has a more recent modification time (the target is newer than the source), mv does not replace it. L ––verbose –v Lists files as they are moved. mv 891 Notes When GNU mv copies a file from one filesystem to another, mv is implemented as cp (with the –a option) and rm: It first copies the existing-file to the new-file and then deletes the existing-file. If the new-file already exists, mv might delete it before copying. As with rm, you must have write and execute access permissions to the parent directory of the existing-file, but you do not need read or write access permission to the file itself. If the move would overwrite a file that you do not have write permission for, mv displays the file’s access permissions and waits for a response. If you enter y or Y, mv overwrites the file; otherwise, it does not move the file. If you use the –f option, mv does not prompt you for a response but simply overwrites the file. Although earlier versions of mv could move only ordinary files between filesystems, mv can now move any type of file, including directories and device files. Examples The first command renames letter, a file in the working directory, as letter.1201: $ mv letter letter.1201 The next command renames the file so it appears with the same simple filename in the user’s ~/archives directory: $ mv letter.1201 ~/archives The following command moves all files in the working directory whose names begin with memo so they appear in the /p04/backup directory: $ mv memo* /p04/backup Using the –u option prevents mv from replacing a newer file with an older one. After the mv –u command shown below is executed, the newer file (memo2) has not been overwritten. The mv command without the –u option overwrites the newer file (memo2’s modification time and size have changed to those of memo1). $ ls -l -rw-rw-r-- 1 sam sam -rw-rw-r-- 1 sam sam $ mv -u memo1 memo2 $ ls -l -rw-rw-r-- 1 sam sam -rw-rw-r-- 1 sam sam $ mv memo1 memo2 $ ls -l -rw-rw-r-- 1 sam sam 22 03-25 23:34 memo1 19 03-25 23:40 memo2 22 03-25 23:34 memo1 19 03-25 23:40 memo2 22 03-25 23:34 memo2 892 nice nice Changes the priority of a command nice nice [option] [command-line] The nice utility reports the priority of the shell or alters the priority of a command. An ordinary user can decrease the priority of a command, but only a user working with root privileges can increase the priority of a command. The nice builtin in the TC Shell uses a different syntax. Refer to the “Notes” section for more information. Arguments The command-line is the command line you want to execute at a different priority. Without any options or arguments, nice displays the priority of the shell running nice. Options Without an option, nice defaults to an adjustment of 10, lowering the priority of the command by 10—typically from 0 to 10. As you raise the priority value, the command runs at a lower priority. The option preceded by a double hyphen (––) works under Linux only. The option named with a single letter and preceded by a single hyphen works under Linux and OS X. ––adjustment=value –n value Changes the priority by the increment (or decrement) specified by value. The priorities range from –20 (the highest priority) to 19 (the lowest priority). A positive value lowers the priority, whereas a negative value raises the priority. Only a user working with root privileges can specify a negative value. When you specify a value outside this range, the priority is set to the limit of the range. Notes You can use renice (page 925) or top’s r command (page 983) to change the priority of a running process. Higher (more positive) priority values mean that the kernel schedules a job less often. Lower (more negative) values cause the job to be scheduled more often. When a user working with root privileges schedules a job to run at the highest priority, this change can affect the performance of the system for all other jobs, including the operating system itself. For this reason you should be careful when using nice with negative values. The TC Shell includes a nice builtin. Under tcsh, use the following syntax to change the priority at which command-line is run. The default priority is 4. You must include the plus sign for positive values. nice [±value] command line nice 893 The tcsh nice builtin works differently from the nice utility: When you use the builtin, nice –5 decrements the priority at which command-line is run. When you use the utility, nice –n –5 increments the priority at which command-line is run. Examples The following command executes find in the background at the lowest possible priority. The ps –l command displays the nice value of the command in the NI column: # nice -n 19 find / -name core -print > corefiles.out & [1] 422 # ps -l F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY 4 R 0 389 8657 0 80 0 - 4408 pts/4 4 D 0 422 389 28 99 19 - 1009 pts/4 0 R 0 433 389 0 80 0 - 1591 pts/4 TIME 00:00:00 00:00:04 00:00:00 The next command finds very large files and runs at a high priority (–15): # nice -n -15 find / -size +50000k CMD bash find ps 894 nl nl Numbers lines from a file nl nl [options] file-list The nl utility reads files and sequentially numbers some or all of the lines before sending them to standard output. It does not change the files it reads. Arguments The file-list is a list of the pathnames of one or more files that nl reads. If you do not specify an argument or if you specify a hyphen (–) in place of a filename, nl reads from standard input. Options Under Linux, nl accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––body-numbering=a | pRE –b a | pRE Specifies which lines nl numbers; use a to number all lines or p followed by a basic regular expression (Appendix A) to number lines that match that regular expression. By default nl numbers nonblank lines only. ––number-format=ln | rn | rz –n ln | rn | rz Specifies the line number style; use ln to left-justify line numbers without leading zeros (no fill), rn to right-justify line numbers without leading zeros (default), or rz to right-justify line numbers with leading zeros (zero fill). ––number-separator=stg –s stg Separates line numbers from the text using the characters in stg. Default separator is a TAB. ––number-width=num –w num Sets the width of the line number field to num characters. Default is 6 characters. Notes The nl utility has many options including ones that can start renumbering at the top of each page and ones that can number headers, footers, and the body of a document separately. Give the command info coreutils 'nl invocation' for more information. nl 895 Examples When called without options, nl numbers all nonblank lines. $ nl lines 1 He was not candid. He lacked a certain warmth so that you 2 always felt chilled and uncomfortable in his presence. With the –b (––body-numbering) option set to a, nl numbers all lines. $ nl -b a lines 1 He was not candid. He lacked a certain warmth so that you 2 3 always felt chilled and uncomfortable in his presence. With the –n (––number-format) option set to ln, nl left-justifies numbers. $ nl -n ln lines 1 He was not candid. He lacked a certain warmth so that you 2 always felt chilled and uncomfortable in his presence. You can combine options. The following example right-justifies and zero-fills (–n rz) within a field of three characters (–w 3) all lines (–b a) and separates the numbers from the lines using two SPACEs (–s ' '). $ nl -n rz -w 3 -b a -s ' ' lines 001 He was not candid. He lacked a certain warmth so that you 002 003 always felt chilled and uncomfortable in his presence. The final example numbers lines that contain the word and. Within a regular expression (Appendix A), \< forces a match to the beginning of a word and \> forces a match to the end of a word so the regular expression \ matches only the word and and does not match the string and within the word candid. $ nl -b p'\' lines He was not candid. He lacked a certain warmth so that you 1 always felt chilled and uncomfortable in his presence. 896 nohup nohup nohup Runs a command that keeps running after you log out nohup command line The nohup utility executes a command line such that the command keeps running after you log out. In other words, nohup causes a process to ignore a SIGHUP signal. Depending on how the local shell is configured, a process started without nohup and running in the background might be killed when you log out. The TC Shell includes a nohup builtin. Refer to the “Notes” section for more information. Arguments The command line is the command line you want to execute. Notes Under Linux, nohup accepts the common options described on page 718. If you do not redirect the output from a command you execute using nohup, both standard output and standard error are sent to the file named nohup.out in the working directory. If you do not have write permission for the working directory, nohup sends output to ~/nohup.out. Unlike the nohup utility, the TC Shell’s nohup builtin does not send output to nohup.out. Background jobs started from tcsh continue to run after you log out. Examples The following command executes find in the background, using nohup: $ nohup find / -name core -print > corefiles.out & [1] 14235 od 897 od od [options] [file-list] The od (octal dump) utility dumps the contents of a file. The dump is useful for viewing executable (object) files and text files with embedded nonprinting characters. This utility takes its input from the file you specify on the command line or from standard input. Arguments The file-list specifies the pathnames of the files that od displays. When you do not specify a file-list, od reads from standard input. Options The od utility accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––address-radix=base –A base Specifies the base used when displaying the offsets shown for positions in the file. By default offsets are given in octal. Possible values for base are d (decimal), o (octal), x (hexadecimal), and n (no offsets displayed). –j n Skips n bytes before displaying data. ––skip-bytes=n ––read-bytes=n –N n Reads a maximum of n bytes and then quits. –S n Outputs from the file only those bytes that contain runs of n or more printable ASCII characters that are terminated by a NULL byte. ––strings=n ––format=type[n] –t type[n] Specifies the output format for displaying data from a file. You can repeat this option with different format types to see the file in several different formats. Table VI-22 lists the possible values for type. By default od dumps a file as 2-byte octal numbers. You can specify the number of bytes od uses to compose each number by specifying a length indicator, n. You can specify a length indicator for all types except a and c. Table VI-24 (next page) lists the possible values of n. Table VI-22 Output formats type Type of output a Named character: displays nonprinting control characters using their official ASCII names—for example, FORMFEED is displayed as ff od Dumps the contents of a file 898 od Table VI-22 Output formats (continued) type Type of output c ASCII character: displays nonprinting control characters as backslash escape sequences (Table VI-23) or three-digit octal numbers d Signed decimal f Floating point o Octal (default) u Unsigned decimal x Hexadecimal Table VI-23 Output format type c backslash escape sequences Sequence Meaning \0 NULL \a BELL \b BACKSPACE \f FORMFEED \n NEWLINE \r RETURN \t HORIZONTAL TAB \v VERTICAL TAB Table VI-24 Length indicators n Number of bytes to use Integers (types d, o, u, and x) C (character) Uses single characters for each decimal value S (short integer) Uses 2 bytes I (integer) Uses 4 bytes L (long) Uses 4 bytes on 32-bit machines and 8 bytes on 64-bit machines od 899 Table VI-24 Length indicators (continued) n Number of bytes to use Floating point (type f) Notes F (float) Uses 4 bytes D (double) Uses 8 bytes L (long double) Typically uses 8 bytes To retain backward compatibility with older, non-POSIX versions of od, the od utility includes the options listed in Table VI-25 as shorthand versions of many of the preceding options. Table VI-25 Examples Shorthand format specifications Shorthand Equivalent specification –a –t a –b –t oC –c –t c –d –t u2 –f –t fF –h –t x2 –i –t d2 –l –t d4 –o –t o2 –x –t x2 The file ac, which is used in the following examples, contains all of the ASCII characters. In the first example, the bytes in this file are displayed as named characters. The first column shows the offset of the first byte on each line of output from the start of the file. The offsets are displayed as octal values. $ od -t 0000000 0000020 0000040 0000060 0000100 0000120 0000140 0000160 0000200 0000220 a ac nul soh stx etx eot enq ack bel bs ht nl vt ff dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc sp ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ del nul soh stx etx eot enq ack bel bs ht nl vt ff dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc cr so si fs gs rs us cr so si fs gs rs us 900 od 0000240 0000260 0000300 0000320 0000340 0000360 0000400 0000401 sp ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ del nl In the next example, the bytes are displayed as octal numbers, ASCII characters, or printing characters preceded by a backslash (refer to Table VI-23 on page 898): $ od -t 0000000 0000020 0000040 0000060 0000100 0000120 0000140 0000160 0000200 0000220 0000240 0000260 0000300 0000320 0000340 0000360 0000400 0000401 c ac \0 001 002 003 004 005 006 \a \b \t 020 021 022 023 024 025 026 027 030 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ 177 200 201 202 203 204 205 206 207 210 220 221 222 223 224 225 226 227 230 240 241 242 243 244 245 246 247 250 260 261 262 263 264 265 266 267 270 300 301 302 303 304 305 306 307 310 320 321 322 323 324 325 326 327 330 340 341 342 343 344 345 346 347 350 360 361 362 363 364 365 366 367 370 \n \n \v \f \r 016 017 031 032 033 034 035 036 037 211 231 251 271 311 331 351 371 212 232 252 272 312 332 352 372 213 233 253 273 313 333 353 373 214 234 254 274 314 334 354 374 215 235 255 275 315 335 355 375 216 236 256 276 316 336 356 376 217 237 257 277 317 337 357 377 The final example finds in the file /usr/bin/who all strings that are at least three characters long (the default) and are terminated by a null byte. See strings on page 959 for another way of displaying a similar list. The offset positions are given as decimal offsets. $ od -A ... 0035151 0035165 0035169 0035187 0035201 0035217 0035234 0035238 0035253 0035268 0035280 0035297 0035301 0035307 0035312 0035320 0035326 0035333 ... d -S 3 /usr/bin/who GNU coreutils en_ /usr/share/locale Michael Stone David MacKenzie Joseph Arceneaux who abdlmpqrstuwHT %Y-%m-%d %H:%M %b %e %H:%M extra operand %s all count dead heading login lookup message open O Opens files, directories, and URLs open [option] [file-list] The open utility opens one or more files, directories, or URLs. The open utility is available under OS X only. O Arguments The file-list specifies the pathnames of the files, directories, or URLs that open is to open. Options Without any options, open opens the files in file-list as though you had doubleclicked each of the files’ icons in the Finder. –a application Opens file-list using application. This option is equivalent to dragging file-list to the application’s icon in the Finder. –b bundle Opens file-list using the application with bundle identifier bundle. A bundle identifier is a string, registered with the system, that identifies an application that can open files. For example, the bundle identifier com.apple.TextEdit specifies the TextEdit editor. –e (edit) Opens file-list using the TextEdit application. –F (fresh) Does not attempt to restore application windows that were open the last time you launched the application. –f (file) Opens standard input as a file in the default text editor. This option does not accept file-list. –g (background) Allows the application to run in the background. –h (header) Finds and opens a header matching file-list using Xcode. Prompts if more than one header matches. –n (new) Opens a new instance of the application even if one is already open. –R (reveal) Shows the file in the Finder instead of opening the associated application. –t (text) Opens file-list using the default text editor (see the “Discussion” section). –W (wait) Does not display the shell prompt until you quit the application. Discussion Opening a file brings up the application associated with that file. For example, opening a disk image file mounts it. The open utility returns immediately, without waiting for the application to launch. open O open O 901 902 open O LaunchServices is a system framework that identifies applications that can open files. It maintains lists of available applications and user preferences about which application to use for each file type. LaunchServices also keeps track of the default text editor used by the –t and –f options. Notes When a file will be opened by a GUI application, you must run open from Terminal or another terminal emulator that is running under a GUI. Otherwise, the operation will fail. Examples The first example mounts the disk image file backups.dmg. The disk is mounted in /Volumes, using the name it was formatted with. $ ls /Volumes House Spare Lion $ open backups.dmg $ ls /Volumes Backups House Spare Lion The next command opens the file picture.jpg. You must run this and the following example from a textual window within a GUI (e.g., Terminal). The application selected depends on the file attributes. If the file’s type and creator code specify a particular application, open opens the file using that application. Otherwise, open uses the system’s default program for handling .jpg files. $ open picture.jpg The next example opens the /usr/bin directory in the Finder. The /usr directory is normally hidden from the Finder because its invisible file attribute flag (page 1046) is set. However, the open utility can open any file you can access from the shell, even if it is not normally accessible from the Finder. $ open /usr/bin The next command opens /usr/include/c++/4.2.1/tr1/stdio.h in Xcode: $ open -h stdio.h When more than one header matches the argument you give with the –h option, open prompts to determine which you want to open: $ open -h stdio stdio? [0] cancel [1] all [2] /usr/include/c++/4.2.1/cstdio [3] /usr/include/c++/4.2.1/ext/stdio_filebuf.h ... Which header(s) for "stdio"? otool O Displays object, library, and executable files otool options file-list The otool utility displays information about, or part of, object, library, and executable files. The otool utility is available under OS X only. O Arguments The file-list specifies the pathnames of files that otool is to display. Options You must use at least one of the –L, –M, –t, or –T options to specify which part of each file in file-list otool is to display. –L (libraries) Displays the names and version numbers of the shared libraries an object file uses. –M (module) Displays the module table of a shared library. –p name (print) Begins output at the symbol named name. This option requires the –t option and either the –v or –V option. –T (table of contents) Displays the table of contents of a shared library. –t (text) Displays the TEXT section of an object file. –V (very verbose) Displays even more data (than the –v option) symbolically. When displaying code, this option causes otool to display the names of called routines instead of their addresses. –v (verbose) Displays data symbolically. When displaying code, this option causes otool to display the names of instructions instead of numeric codes. Discussion The otool utility displays information about the contents and dependencies of object files. This information can be helpful when you are debugging a program. For example, when you are setting up a chroot jail, otool can report which libraries are needed to run a given program. Some options are useful only with certain types of object modules. For example, the –T option does not report anything for a typical executable file. Notes The otool utility is part of the Developer Tools optional install. An otool –L command performs a function similar to the ldd utility on systems using the ELF binary format. otool O otool O 903 904 otool O Examples The examples in this section use the compiled version of the following C program: $ cat myname.c #include int main(void) { printf("My name is Sam.\n"); return 0; } In the first example, otool displays the libraries the program is linked with: $ otool -L myname myname: /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0) In some cases, a library used by a program will depend on other libraries. You can use otool –L on a library to see whether it uses other libraries: $ otool -L /usr/lib/libmx.A.dylib /usr/lib/libSystem.B.dylib /usr/lib/libmx.A.dylib: /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0) /usr/lib/libSystem.B.dylib: /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0) /usr/lib/system/libmathCommon.A.dylib (compatibility version 1.0.0, current ... The next example disassembles the code for the main function. When compiling programs, the compiler sometimes modifies symbol names. The compiler gives functions, such as main, a leading underscore; thus the symbol name is _main. $ otool -Vt -p _main myname myname: (__TEXT,__text) section _main: 00002ac0 mfspr r0,lr 00002ac4 stmw r30,0xfff8(r1) 00002ac8 stw r0,0x8(r1) 00002acc stwu r1,0xffb0(r1) 00002ad0 or r30,r1,r1 00002ad4 bcl 20,31,0x2ad8 00002ad8 mfspr r31,lr 00002adc addis r2,r31,0x0 00002ae0 addi r3,r2,0x4b8 00002ae4 bl _printf$LDBLStub 00002ae8 li r0,0x0 00002aec or r3,r0,r0 00002af0 lwz r1,0x0(r1) 00002af4 lwz r0,0x8(r1) 00002af8 mtspr lr,r0 00002afc lmw r30,0xfff8(r1) 00002b00 blr ... paste 905 Joins corresponding lines from files paste [option] [file-list] The paste utility reads lines from the file-list and joins corresponding lines in its output. By default output lines are separated by a TAB character. Arguments The file-list is a list of ordinary files. When you specify a hyphen (–) instead of a filename, paste reads from standard input. Options Under Linux, paste accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––delimiter=dlist –d dlist The dlist is a list of characters used to separate output fields. If dlist contains a single character, paste uses that character instead of the default TAB character to separate fields. If dlist contains more than one character, the characters are used in turn to separate fields and are then reused from the beginning of the list as necessary. ––serial –s Processes one file at a time; pastes horizontally. See the “Examples” section. Notes The paste utility is often used to rearrange the columns of a table. A utility, such as cut, can place the desired columns in separate files, and then paste can join them in any order. Examples The following example uses the files fnames and acctinfo. You can create these files using cut (page 766) and the /etc/passwd file. The paste command puts the full-name field first, followed by the remaining user account information. A TAB character separates the two output fields. Although this example works under Mac OS X, /etc/passwd does not contain information about most users; see “Open Directory” on page 1042 for more information. $ cat fnames Sam the Great Max Wild Zach Brill Helen Simpson $ cat acctinfo sam:x:401:50:/home/sam:/bin/zsh max:x:402:50:/home/max:/bin/bash zach:x:504:500:/home/zach:/bin/tcsh hls:x:505:500:/home/hls:/bin/bash paste paste 906 paste $ paste fnames acctinfo Sam the Great sam:x:401:50:/home/sam:/bin/zsh Max Wild max:x:402:50:/home/max:/bin/bash Zach Brill zach:x:504:500:/home/zach:/bin/tcsh Helen Simpson hls:x:505:500:/home/hls:/bin/bash The next examples use the files p1, p2, p3, and p4. In the last example in this group, the –d option gives paste a list of characters to use to separate output fields: $ cat 1 one ONE $ cat 2 two TWO extra $ cat 3 three THREE $ cat 4 four FOUR p1 p2 p3 p4 $ paste 4 four FOUR p4 p3 p2 p1 3 2 three two THREE TWO extra 1 one ONE $ paste -d="+-=" p3 p2 p1 p4 3+2-1=4 three+two-one=four THREE+TWO-ONE=FOUR +extra-= The final example uses the ––serial option to paste the files one at a time: $ paste --serial p1 p2 p3 p4 1 one ONE 2 two TWO extra 3 three THREE 4 four FOUR pax 907 pax pax [options] [pattern-list] pax –w [options] [source-files] pax –r [options] [pattern-list] pax –rw [options] [source-files] destination-directory The pax utility has four modes of operation: List mode displays the list of files in an archive, create mode places multiple files into a single archive file, extract mode restores files from an archive, and copy mode copies a directory hierarchy. The archive files created by pax can be saved on disk, removable media, or a remote system. List mode (absence of a major option) reads an archive from standard input and displays a list of files stored in the archive. Create mode (–w for write) reads a list of ordinary or directory filenames from the command line or standard input and writes the resulting archive file to standard output. Extract mode (–r for read) reads an archive from standard input and extracts files from that archive; you can restore all files from the archive or only those files whose names match a pattern. Copy mode (–rw) reads a list of ordinary or directory filenames from the command line or standard input and copies the files to an existing directory. Arguments In create mode, pax reads the names of files that will be included in the archive from source-files on the command line. If source-files is not present, pax reads the names from standard input, one filename per line. By default, in extract mode pax extracts, and in list mode pax displays the names of, all files in the archive it reads from standard input. You can choose to extract or display the names of files selectively by supplying a pattern-list. If the name of a file in the archive matches a pattern in the pattern-list, pax extracts that file or displays that filename; otherwise, it ignores the file. The pax patterns are similar to shell wildcards (page 148) except they match slashes (/) and a leading period (.) in a filename. See the fnmatch man page for more information about pax patterns (OS X only). In copy mode, pax does not create destination-directory. Thus destination-directory must exist before you give pax a copy mode command. Options A major option specifies the mode in which pax operates: create, extract, or copy. Major Options Three options determine the mode in which pax operates. You must include zero or one of these options. Without a major option, pax runs in list mode. –r (read) Reads the archive from standard input and extracts files. Without a pattern-list on the command line, pax extracts all files from the archive. With a pax Creates an archive, restores files from an archive, or copies a directory hierarchy 908 pax pattern-list, pax extracts only those files whose names match one of the patterns in the pattern-list. The following example extracts from the root.pax archive file on an external drive only those files whose names end in .c: $ pax -r \*.c < /Volumes/Backups/root.pax The backslash prevents the shell from expanding the argument to pax. * before it passes the –rw (copy) Copies files named on the command line or standard input from one place on the system to another. Instead of constructing an archive file containing the files named on standard input, pax copies them to the destination-directory (the last argument on the pax command line). The effect is the same as if you had created an archive in create mode and then extracted the files in extract mode, except using copy mode avoids creating an archive. The following example copies the working directory hierarchy to the /Users/max/code directory. The code directory must exist before you give this command. Make sure that ~max/code is not a subdirectory of the working directory or you will perform a recursive copy. $ pax -rw . ~max/code –w (write) Constructs an archive from the files named on the command line or standard input. These files might be ordinary or directory files. When the files come from standard input, each must appear on a separate line. The archive is written to standard output. The find utility frequently generates the filenames that pax uses. The following command builds an archive of the /Users directory and writes it to the archive file named Users.pax on /Volumes/Backups: # find -d /Users -print | pax -w > /Volumes/Backups/Users.pax The –d option causes find to search for files in a depth-first manner, reducing the likelihood of permissions problems when you restore the files from the archive. See the discussion of this find option on page 761. Other Options The following options alter the behavior of pax. These options work with one or more of the major options. –c (complement) Reverses the sense of the test performed on pattern-list. Files are listed or put in the archive only if they do not match any of the patterns in pattern-list. –f archive (file) Uses archive as the name of the archive file. In list and extract modes, this option reads from archive instead of standard input. In create mode, it writes to archive instead of standard output. You can use this option to access a device on another system on a network; see the ––file option to tar (page 969) for more information. pax 909 –H (partial dereference) For each file that is a symbolic link, copies the file the link points to, not the symbolic link itself. This option affects files specified on the command line; it does not affect files found while descending a directory hierarchy. This option treats files that are not symbolic links normally. See page 115 for information on dereferencing symbolic links. –L (dereference) For each file that is a symbolic link, copies the file the link points to, not the symbolic link itself. This option affects all files and treats files that are not symbolic links normally. See page 115 for information on dereferencing symbolic links. –l (link) In copy mode, when possible, makes hard links instead of copying files. –P (no dereference) For each file that is a symbolic link, copies the symbolic link, not the file the link points to. This option affects all files and treats files that are not symbolic links normally. This behavior is the default for pax. See page 115 for information on dereferencing symbolic links. –p preserve-list Preserves or discards the file attributes specified by preserve-list. The preserve-list is a string of one or more letters as shown in Table VI-26. By default pax preserves file access and modification times but does not preserve ownership or file permissions. This option works in extract and copy modes only. Table VI-26 Preserve flags Letter Meaning a Discard the access time e Preserve everything m Discard the modification time o Preserve ownership p Preserve permissions –s subcmd Executes subcmd, a substitution command, on filenames while storing them in any of the modes. The subcmd has the following syntax: s/search-string/replacement-string/[gp] The pax utility replaces occurrences of the regular expression search-string with replacement-string. A trailing g indicates a global replacement; without it pax replaces only the first instance of search-string in each filename. A trailing p (for print) causes pax to display each substitution it makes. The subcmd is similar to vim’s search and replace feature described on page 188, except it lacks an address. –v (verbose) In list mode, displays output similar to that produced by ls –l. In other modes, displays a list of files being processed. 910 pax –X In copy and create modes, prevents pax from searching (descending into) directories in filesystems other than those holding source-files. –x format In create mode, writes the archive in format format, as shown in Table VI-27. If you do not specify a format, pax writes a (POSIX) tar format file (ustar in the table). –z (gzip) In create mode, compresses archives using gzip. In extract mode, decompresses archives using gunzip. Table VI-27 Discussion pax archive formats format Description cpio The format specified for cpio archives in POSIX sv4cpio The format used for cpio archives under UNIX System V, release 4 tar The historical Berkeley tar format ustar The POSIX tar format (default) The pax utility is a general replacement for tar, cpio, and other archive programs. In create and copy modes, pax processes specified directories recursively. In list and extract modes, if a pattern in the pattern-list matches a directory name, pax lists or extracts the files from the named directory. Notes There is no native pax format. Instead, the pax utility can read and write archives in a number of formats. By default it creates POSIX tar format archives. See Table VI-27 for the list of formats pax supports. The pax utility determines the format of an archive file; it does not allow you to specify an archive format (–x option) in list or extract mode. Under Mac OS X version 10.4 and above, pax copies extended attributes (page 1044). Examples In the first example, pax creates an archive named corres.0901.pax. This archive stores the contents of the corres directory. $ pax -w corres > corres.0901.pax The –w option puts pax in create mode, where the list of files to be put in an archive is supplied by command-line arguments (corres in the preceding example) or from standard input. The pax utility sends the archive to standard output. The preceding example redirects that output to a file named corres.0901.pax. pax 911 Next, without any options, pax displays a list of files in the archive it reads from standard input: $ pax < corres.0901.pax corres corres/hls corres/max corres/max/0823 corres/max/0828 corres/max/0901 corres/memo1 corres/notes The following command uses the –f option to name the input file and performs the same function as the preceding command: $ pax -f corres.0901.pax When pax reads an archive file, as in the previous examples, it determines the format of the file. You do not have to (nor are you allowed to) specify the format. The next example, run under Mac OS X, attempts to create an archive of the /etc directory hierarchy in the default tar format. In this case pax is run by a user with root privileges because some of the files in this hierarchy cannot be read by ordinary users. Because pax does not follow (dereference) symbolic links by default, and because under OS X /etc is a symbolic link, pax copies the link and not the directory the link points to. The first command creates the archive, the second command shows the link in the archive, and the third command uses the –v option to show that it is a link: # pax -w -f /tmp/etc.tar /etc # pax -f /tmp/etc.tar /etc # pax -v -f /tmp/etc.tar lrwxr-xr-x 1 root admin 0 May 21 01:48 /etc => private/etc pax: ustar vol 1, 1 files, 10240 bytes read, 0 bytes written. The –L option follows (dereferences) links. In the next example, pax creates the desired archive of /etc: # pax -wLf /tmp/etc.tar /etc # pax -f /tmp/etc.tar /etc /etc/6to4.conf /etc/AFP.conf /etc/afpovertcp.cfg /etc/aliases /etc/aliases.db /etc/amavisd.conf /etc/amavisd.conf.personal /etc/appletalk.cfg ... 912 pax The next example uses pax to create a backup of the memos directory, preserving ownership and file permissions. The destination directory must exist before you give pax a copy mode command. $ mkdir memos.0625 $ pax -rw -p e memos memos.0625 $ ls memos.0625 memos The preceding example copies the memos directory into the destination directory. You can use pax to make a copy of a directory in the working directory without putting it in a subdirectory. In the next example, the –s option causes pax to replace the name of the memos directory with the name . (the name of the working directory) as it writes files to the memos.0625 directory: $ pax -rw -p e -s /memos/./ memos memos.0625 The following example uses find to build a list of files that begin with the string memo and that are located in the working directory hierarchy. The pax utility writes these files to an archive in cpio format. The output from pax goes to standard output, which is sent through a pipeline to bzip2, which compresses it before writing it to the archive file. $ find . -type f -name "memo*" | pax -w -x cpio | bzip2 > memos.cpio.bz2 The final example extracts files from an archive, removing leading slashes (the ^ forces the match to the beginning of the line) to prevent overwriting existing system files. The replacement string uses ! as a separator so that a forward slash can be in one of the strings. $ pax -r -f archive.pax -s '!^/!!p' plutil O Manipulates property list files plutil [options] file-list The plutil utility converts property list files between formats and checks their syntax. The plutil utility is available under OS X only. O Arguments Options The file-list specifies the pathnames of one or more files that plutil is to manipulate. –convert format Converts file-list to format, which must be either xml1 or binary1. –e extension Gives output files a filename extension of extension. –help Displays a help message. –lint Checks the syntax of files (default). –o file (output) Names the converted file file. –s (silent) Does not display any messages for successfully converted files. Discussion The plutil utility converts files from the XML property list format to binary format, and vice versa. The plutil utility can read—but not write—the older plain-text property list format. Notes The plutil utility accepts a double hyphen (––) to mark the end of the options on the command line. For more information refer to “Common Options” on page 718. Examples The following example checks the syntax of the file named java.plist: $ plutil java.plist java.plist: OK The next example shows the output of plutil as it checks a damaged property list file: $ plutil broken.plist broken.plist: XML parser error: Encountered unexpected element at line 2 (plist can only include one object) Old-style plist parser error: Malformed data byte group at line 1; invalid hex plutil O plutil O 913 914 plutil O The next example converts the file StartupParameters.plist to binary format, overwriting the original file: $ plutil -convert binary1 StartupParameters.plist The final example converts the binary format property list file loginwindow.plist in the working directory to XML format and, because of the –o option, puts the converted file in /tmp/lw.p: $ plutil -convert xml1 -o /tmp/lw.p loginwindow.plist pr 915 pr pr Paginates files for printing pr [options] [file-list] The pr utility breaks files into pages, usually in preparation for printing. Each page has a header with the name of the file, date, time, and page number. The pr utility takes its input from files you specify on the command line or from standard input. The output from pr goes to standard output and is frequently redirected through a pipeline to a printer. Arguments The file-list is a list of the pathnames of text files that you want pr to paginate. When you omit the file-list, pr reads from standard input. Options Under Linux, pr accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. You can embed options within the file-list. An embedded option affects only those files following it on the command line. ––show-control-chars –c Displays control characters with a caret (^; for example, ^H). Displays other nonprinting characters as octal numbers preceded by a backslash. L ––columns=col –col Displays output in col columns with a default of one. This option might truncate lines and cannot be used with the –m (––merge) option. ––double-space –d Double-spaces the output. ––form-feed –F Uses a FORMFEED character to skip to the next page rather than filling the current page with NEWLINE characters. ––header=head –h head Replaces the filename at the top of each page with head. If head contains SPACEs, you must enclose it within quotation marks. ––length=lines –l lines Sets the page length to lines lines. The default is 66 lines. ––merge –m Displays all specified files simultaneously in multiple columns. This option can- not be used with the –col (––columns) option. ––number-lines=[c[num]] –n[c[num]] Numbers the lines of output. The pr utility appends the character c to the number to separate it from the contents of the file (the default is a TAB). The num specifies the number of digits in each line number (the default is 5). 916 pr ––indent=spaces –o spaces Indents the output by spaces characters (specifies the left margin). ––separator=c –s[c] Separates columns with the single character c (defaults to TAB when you omit c). By default pr uses TABs as separation characters to align columns unless you use the –w option, in which case nothing separates the columns. ––omit-header –t Causes pr not to display its five-line page header and trailer. The header that pr normally displays includes the name of the file, the date, time, and page number. The trailer is five blank lines. ––width=num –w num Sets the page width to num columns. This option is effective only with multicolumn output (the –m [––merge] or –col [––columns] option). ––pages=firstpage[:lastpage] +firstpage[:lastpage] Output begins with the page numbered firstpage and ends with lastpage. Without lastpage, pr outputs through the last page of the document. The short version of this option begins with a plus sign, not a hyphen. Mac OS X does not accept the lastpage argument; printing always continues through the end of the document. Notes When you use the –col (––columns) option to display the output in multiple columns, pr displays the same number of lines in each column (with the possible exception of the last column). Examples The first command shows pr paginating a file named memo and sending its output through a pipeline to lpr for printing: $ pr memo | lpr Next memo is sent to the printer again, this time with a special heading at the top of each page. The job is run in the background. $ pr -h 'MEMO RE: BOOK' memo | lpr & [1] 4904 Finally pr displays the memo file on the screen, without any header, starting with page 3: $ pr -t +3 memo ... printf 917 Formats string and numeric data printf format-string [data-list] The printf utility formats a list of arguments. Many shells (e.g., bash, tcsh, busybox) and some utilities (e.g., gawk) have their own printf builtins that work similarly to the printf utility. Arguments The printf utility reads data-list and sends it to standard output based on formatstring. The format-string features presented in this section are not complete; see a C printf() function reference for more details. However, the printf utility is not a complete implementation of the C printf() function, so not all features of that function work in the printf utility. The format-string holds three types of objects: characters, escape sequences, and format specifications. The printf utility copies characters from format-string to standard output without modification; it converts escape sequences before copying them. For each format specification, printf reads one element from data-list, formats that element according to the specification, and sends the result to standard output. Some of the more commonly used escape sequences are \n (NEWLINE), \t (TAB), \" ("), and \\ (\). Each format specification in format-string has the following syntax: % [flag][min-width][.precision][spec-letter] where % introduces the format specification. flag is – (left justify; printf right-justifies by default), 0 (pad with zeros; printf pads with SPACEs by default), or + (precede positive numbers with a plus sign; printf does not precede positive numbers with a plus sign by default). min-width is the minimum width of the number or string printf outputs. .precision specifies the number of digits to the right of the decimal (floating-point numbers), the number of digits (integer/decimal numbers), or the length of the string (strings; longer strings are truncated to this length). spec-letter is one of the values listed in Table VI-28. Table VI-28 printf spec-letters spec-letter Type of data Notes c Character Single character; use s for strings d Decimal Width defaults to width of number printf printf 918 printf Table VI-28 Option printf spec-letters (continued) spec-letter Type of data Notes e Floating point Exponential notation (e.g., 1.20e+01) f Floating point Includes decimal point; defaults to eight decimal places g Floating point Same as f, unless exponent is less than –4, then same as e o Octal Converts to octal s String Width defaults to width of string X Uppercase hexadecimal Converts to hexadecimal x Lowercase hexadecimal Converts to hexadecimal The printf utility has no options. The bash printf builtin accepts the following option: –v var Assigns the output of printf to the variable named var instead of sending it to standard output (bash printf builtin only). Notes The printf utility and the bash printf builtin do not accept commas following formatstring and between each of the elements of data-list. The gawk version of printf requires commas in these locations. If data-list has more elements than format-string specifies, printf reuses format-string as many times as necessary to process all elements in data-list. If data-list has fewer elements than format-string specifies, printf assumes zero (for numbers) or null (for characters and strings) values for the nonexistent elements. Examples The examples in this section use the bash printf builtin. Using the printf utility yields the same results. By itself, the %s format-string element causes printf to copy an item from data-list to standard output. It does not terminate its output with a NEWLINE. $ printf %s Hi Hi$ You can specify a NEWLINE (as the \n escape sequence) at the end of format-string to cause the shell prompt to appear on the line following the output. Because the backslash is a special shell character, you must quote it by preceding it with another backslash or by putting it within single or double quotation marks. printf 919 $ printf "%s\n" Hi Hi $ When you specify more than one element in the data-list but have only a single format specification in the format-string, printf reuses the format specification as many times as necessary: $ printf "%s\n" Hi how are you? Hi how are you? When you enclose the data-list within quotation marks, the shell passes the data-list to printf as a single argument; the data-list has a single element and it is processed in one pass by the single format specification. $ printf "%s\n" "Hi how are you?" Hi how are you? %c The c spec-letter processes a single character of a data-list element. It discards extra characters. Use s for processing strings. $ printf "%c\n" abcd a %d The d spec-letter processes a decimal number. By default the width occupied by the number is the number of characters in the number. In the next example, the text within the format-string (other than %d) consists of characters; printf copies these without modifying them. $ printf "Number is %d end\n" 1234 Number is 1234 end You can specify min-width to cause a number to occupy more space. The next example shows a four-digit number occupying ten SPACEs: $ printf "Number is %10d end\n" 1234 Number is 1234 end You can use the 0 flag to zero-fill the number or the – flag to left-justify it. $ printf "Number is %010d end\n" 1234 Number is 0000001234 end $ printf "Number is %-10d end\n" 1234 Number is 1234 end %f The f spec-letter processes a floating-point number so it appears as a number with a decimal point. The amt.left shell script calls printf with the f spec-letter, a min-width of four characters, and a precision of two decimal places. 920 printf This example assigns a value to the dols variable and places it in the environment of the amt.left script. When given a value of 102.442, printf displays a six-character value with two places to the right of the decimal point. $ cat amt.left printf "I have $%4.2f dollars left.\n" $dols $ dols=102.442 ./amt.left I have $102.44 dollars left. If you use e in place of f, printf displays a floating-point number with an exponent. $ printf "We have %e miles to go.\n" 4216.7829 We have 4.216783e+03 miles to go. %X The X spec-letter converts a decimal number to hexadecimal. $ cat hex read -p "Enter a number to convert to hex: " x printf "The converted value is %X\n" $x $ ./hex Enter a number to convert to hex: 255 The converted value is FF ps 921 ps ps Displays process status ps [options] [process-list] The ps utility displays status information about processes running on the local system. Arguments The process-list is a comma- or SPACE-separated list of PID numbers. When you specify a process-list, ps reports on just the processes in that list. Options Under Linux, the ps utility accepts three types of options, each preceded by a different prefix. You can intermix the options. See the ps man page for details. Two hyphens: One hyphen: No hyphens: GNU (long) options UNIX98 (short) options BSD options Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by no hyphen or a single hyphen work under Linux and OS X. –A (all) Reports on all processes. Also –e. –e (everything) Reports on all processes. Also –A. –f (full) Displays a listing with more columns of information. ––forest f Displays the process tree (no hyphen before the f). L –l (long) Produces a long listing showing more information about each process. See the “Discussion” section for a description of the columns this option displays. ––no-headers Omits the header. This option is useful if you are sending the output to another program. L ––user usernames –uusernames Reports on processes being run by usernames, a comma-separated list of the names or UIDs of one or more users on the local system. –w (wide) Without this option ps truncates output lines at the right side of the screen. This option extends the display to 132 columns; it wraps around one more line, if needed. Use this option twice to extend the display to an unlimited number of columns, which is the default when you redirect the output of ps. Discussion Without any options ps displays the statuses of all active processes controlled by your terminal or screen. Table VI-29 (next page) lists the heading and content of each of the four columns ps displays. 922 ps Table VI-29 Column headings I Heading Meaning PID The process identification number. TTY (terminal) The name of the terminal that controls the process. TIME The number of hours, minutes, and seconds the process has been running. CMD The command line the process was called with. The command is truncated to fit on one line. Use the –w option to see more of the command line. The columns that ps displays depend on your choice of options. Table VI-30 lists the headings and contents of the most common columns. Table VI-30 Column headings II The column titles differ, depending on the type of option you use. This table shows the headings for UNIX98 (one-hyphen) options. Heading Meaning %CPU The percentage of total CPU time the process is using. Because of the way Linux handles process accounting, this figure is approximate, and the total of %CPU values for all processes might exceed 100%. %MEM (memory) The percentage of RAM the process is using. COMMAND or CMD The command line the process was called with. The command is truncated to fit on one line. Use the –w option to see more of the command line. This column is always displayed last on a line. F (flags) The flags associated with the process. NI (nice) PID PPID The nice value of the process (page 892). The process identification number. (parent PID) The process identification number of the parent process. PRI (priority) The priority of the process. RSS (resident set size) The number of blocks of memory the process is using. SIZE or SZ The size, in blocks, of the core image of the process. ps 923 Table VI-30 Column headings II (continued) Heading Meaning STAT or S (status) The status of the process as specified by one or more letters from the following list: < D L N R S T W X Z High priority Sleeping and cannot be interrupted Pages locked in memory (real-time and custom I/O) Low priority Available for execution (in the run queue) Sleeping Either stopped or being traced Has no pages resident in RAM Dead Zombie process that is waiting for its child processes to terminate before it terminates STIME or START The date the process started. TIME The number of minutes and seconds that the process has been running. TTY (terminal) The name of the terminal controlling the process. USER or UID The username of the user who owns the process. WCHAN If the process is waiting for an event, the address of the kernel function that caused the process to wait. This value is 0 for processes that are not waiting or sleeping. (wait channel) Notes Use top (page 981) to display process status information dynamically. Examples The first example shows ps, without any options, displaying the user’s active processes. The first process is the shell (bash), and the second is the process executing the ps utility. $ ps PID TTY 2697 pts/0 3299 pts/0 TIME CMD 00:00:02 bash 00:00:00 ps With the –l (long) option, ps displays more information about the processes: $ ps -l F S UID 000 S 500 000 R 500 PID 2697 3300 PPID 2696 2697 C PRI 0 75 0 76 NI ADDR 0 0 - SZ WCHAN 639 wait4 744 - TTY pts/0 pts/0 TIME CMD 00:00:02 bash 00:00:00 ps 924 ps The –u option shows information about the specified user: $ ps -u root PID TTY 1 ? 2 ? 3 ? 4 ? 5 ? ... TIME 00:00:01 00:00:00 00:00:00 00:00:01 00:00:00 CMD init kthreadd migration/0 ksoftirqd/0 watchdog/0 The ––forest option causes ps to display what the man page describes as an “ASCII art process tree.” Processes that are children of other processes appear indented under their parents, making the process hierarchy, or tree, easier to see. $ ps -ef --forest UID PID PPID root 1 0 root 2 1 ... root 785 1 root 839 1 root 3305 839 max 3307 3305 max 3308 3307 max 3774 3308 ... root 1040 1 root 3351 1040 root 3402 3351 root 3416 3402 root 3764 3416 root 3773 3764 ps and kill C STIME TTY 0 Jul22 ? 0 Jul22 ? TIME CMD 00:00:03 init 00:00:00 [keventd] 0 0 0 0 0 0 Jul22 Jul22 Aug01 Aug01 Aug01 Aug01 ? ? ? ? pts/1 pts/1 00:00:00 /usr/sbin/apmd -p 10 -w 5 -W -P 00:00:01 /usr/sbin/sshd 00:00:00 \_ /usr/sbin/sshd 00:00:00 \_ /usr/sbin/sshd 00:00:00 \_ -bash 00:00:00 \_ ps -ef --forest 0 0 0 0 0 0 Jul22 Aug01 Aug01 Aug01 Aug01 Aug01 ? tty2 tty2 tty2 tty2 tty2 00:00:00 login -- root 00:00:00 \_ -bash 00:00:00 \_ make modules 00:00:00 \_ make -C drivers CFLA 00:00:00 \_ make -C scsi mod 00:00:00 \_ ld -m elf_i3 The next sequence of commands shows how to use ps to determine the PID number of a process running in the background and how to terminate that process using kill. In this case it is not necessary to use ps because the shell displays the PID number of the background processes. The ps utility verifies the PID number. The first command executes find in the background. The shell displays the job and PID numbers of the process, followed by a prompt. $ find ~ -name memo -print > memo.out & [1] 3343 Next ps confirms the PID number of the background task. If you did not already know this number, using ps would be the only way to obtain it. $ ps PID 3308 3343 3344 TTY pts/1 pts/1 pts/1 TIME 00:00:00 00:00:00 00:00:00 CMD bash find ps Finally kill (page 846) terminates the process: $ kill 3343 $ RETURN [1]+ Terminated $ find ~ -name memo -print >memo.out renice 925 renice renice Changes the priority of a process renice priority [option] process-list renice –n increment [option] process-list The renice utility alters the priority of a running process. An ordinary user can decrease the priority of a process that he owns. Only a user running with root privileges can increase the priority of a process or alter the priority of another user’s process. Arguments The process-list specifies the PID numbers of the processes that are to have their priorities altered. Each process has its priority set to priority or, using the second format, has its priority incremented by a value of increment (which can be negative). Options The options, which you can specify throughout the process-list, change the interpretation of the arguments that follow them on the command line. –p (process) Interprets the following arguments as process ID (PID) numbers (default). –u (user) Interprets the following arguments as usernames or user ID numbers. Notes The range of priorities is from –20 (the highest priority) to +20 (the lowest priority). Higher (more positive) priority values mean the kernel schedules a job less often. Lower (more negative) values cause the job to be scheduled more often. When a user running with root privileges schedules a job to run at the highest priority, this change can affect the performance of the system for all other jobs, including the operating system itself. For this reason you should be careful when using renice with negative values. See nice (page 892) if you want to start a process with a nondefault priority. Examples The first example decreases the priority of all tasks owned by Zach: $ renice -n 5 -u zach In the following example, a user running with root privileges uses ps to check the priority of the process running find. The NI (nice) column shows a value of 19 and the administrator decides to increase the priority by 5: # ps -l UID PID PPID CPU PRI NI 501 9705 9701 0 31 0 501 10548 9705 0 12 19 # renice -n -5 10548 VSZ 27792 27252 RSS WCHAN 856 516 - STAT Ss RN TT p1 p1 TIME COMMAND 0:00.15 -bash 0:00.62 find / 926 rm rm Removes a file (deletes a link) rm rm [options] file-list The rm utility removes hard and/or symbolic links to one or more files. When you remove the last hard link to a file, the file is deleted. Be careful when you use rm with wildcards caution Because this utility enables you to remove a large number of files with a single command, use rm cautiously, especially when you are working with ambiguous file references. If you have any doubts about the effect of an rm command with an ambiguous file reference, first use echo with the same file reference and evaluate the list of files the reference generates. Alternatively, you can use the rm –i (––interactive) option. Arguments The file-list is a list of the files whose links rm will remove. Removing the only hard link to a file deletes the file. Removing a symbolic link deletes the symbolic link only. Options Under Linux, rm accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––force –f Without asking for your consent, removes files for which you do not have write access permission. This option also suppresses informative messages if a file does not exist. ––interactive –i Asks before removing each file. If you use ––recursive with this option, rm also asks you before examining each directory. –P Overwrites the files three times before removing them. ––recursive ––verbose Notes –r Deletes the contents of the specified directory, including all its subdirectories, and the directory itself. Use this option with caution. –v Displays the name of each file as it is removed. To delete a file, you must have execute and write access permission to the parent directory of the file, but you do not need read or write access permission to the file itself. If you are running rm interactively (that is, if rm’s standard input is coming from the keyboard) and you do not have write access permission to the file, rm displays your access permission and waits for you to respond. If your response starts with a y or Y, rm deletes the file; otherwise, it takes no action. If standard input is not coming from a keyboard, rm deletes the file without querying you. rm 927 Refer to page 110 for information on hard links and page 112 for information on symbolic links. Page 115 includes a discussion about removing links. You can use rm with the –r option or rmdir (page 928) to remove an empty directory. When you want to remove a file that begins with a hyphen, you must prevent rm from interpreting the filename as an option. One way to do so is to give the special option –– (double hyphen) before the name of the file. This option tells rm that no more options follow: Any arguments that come after it are filenames, even if they look like options. Use shred to remove a file securely security Using rm does not securely delete a file—it is possible to recover a file that has been deleted using rm. Use the shred utility to delete files more securely. See the example “Wiping a file” on page 774 for another method of securely deleting files. Examples The following commands delete files both in the working directory and in another directory: $ rm memo $ rm letter memo1 memo2 $ rm /home/sam/temp The next example asks the user before removing each file in the working directory and its subdirectories: $ rm -ir * This command is useful for removing filenames that contain special characters, especially SPACEs, TABs, and NEWLINEs. (You should not create filenames containing these characters on purpose, but it might happen accidentally.) 928 rmdir rmdir rmdir Removes directories rmdir directory-list The rmdir utility deletes empty directories. Arguments The directory-list is a list of pathnames of empty directories that rmdir removes. Options Under Linux, rmdir accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––ignore-fail-on-non-empty Suppresses the message rmdir normally displays when it fails because a directory is not empty. With the – –parents option, rmdir does not quit when it finds a directory that is not empty. L ––parents –p Removes a hierarchy of empty directories. ––verbose –v Displays the names of directories as they are removed. L Notes Use the rm utility with the –r option if you need to remove directories that are not empty, together with their contents. Examples The following command deletes the empty literature directory from the working directory: $ rmdir literature The next command removes the letters directory, using an absolute pathname: $ rmdir /home/sam/letters The final command removes the letters, march, and 05 directories, assuming the directories are empty except for other directories named in the path: $ rmdir -p letters/march/05 rsync 929 Securely copies files and directory hierarchies over a network rsync [options] [[user@]from-host:]source-file [[user@]to-host:][destination-file] The rsync (remote synchronization) utility copies an ordinary file or directory hierarchy locally or from the local system to or from another system on a network. By default this utility uses OpenSSH to transfer files and the same authentication mechanism as OpenSSH; as a consequence, it provides the same security as OpenSSH. The rsync utility prompts for a password when it needs one. Alternatively, you can use the rsyncd daemon as a transfer agent. See Chapter 16 for information on rsync tip See Chapter 16 starting on page 673 for information on rsync. rsync rsync 930 scp scp Securely copies one or more files to or from a remote system scp scp [[user@]from-host:]source-file [[user@]to-host:][destination-file] The scp (secure copy) utility copies an ordinary or directory file from one system to another on a network. This utility uses OpenSSH to transfer files and the same authentication mechanism as OpenSSH; as a consequence, it provides the same security as OpenSSH. The scp utility prompts for a password when it needs one. See Chapter 17 for information on scp tip See the sections of Chapter 17 starting on pages 691 and 698 for information on the scp secure copy utility, one of the OpenSSH secure communication utilities. screen 931 Manages several textual windows screen [options] [program] The screen utility (screen package) is a full-screen textual window manager. A single session run on one physical or virtual terminal allows you to work in one of several windows, each of which typically runs a shell. It allows you to detach from and attach to a session and can help prevent you from losing data when a connection drops. Arguments The program is the name of the program screen runs initially in the windows it opens. If you do not specify program, screen runs the shell specified by the SHELL environment variable. If SHELL is not set, it runs sh, which is typically linked to bash or dash. Options This section describes a few of the many options screen accepts. See the screen man page for a complete list. When you call screen you can specify options to perform tasks. While you are working in a screen session you can use commands (next) to perform many of the same tasks. –d [pid.tty.host] (detach) Detaches the screen session specified by pid.tty.host. Does not start screen. Equivalent to typing CONTROL-A D from a screen session. –L Turns on logging for all windows in the session you are starting. –ls (list) Displays a list of screen sessions including identification strings (i.e., pid.tty.host) you can use to attach to a session. Does not start screen. –r [pid.tty.host] (resume) Attaches to the screen session specified by pid.tty.host or simply pid. You do not need to specify pid.tty.host if only a single screen session is running. Does not start a new screen session. –S [session] Specifies session as the name of the session you are starting. –t [title] Specifies title as the name of all windows in the session you are starting. If you do not specify a title or if you do not use this option, the title defaults to the name of the program the window is running, typically bash. Do not hold the SHIFT key while typing a CONTROL character tip This book uses the common convention of showing an uppercase letter following the CONTROL key. However, all control characters are entered as lowercase characters. Do not hold down the SHIFT key while entering a CONTROL character. In no case is a CONTROL character typed as an uppercase letter. screen screen 932 screen Commands Table VI-31 lists a few of the many commands screen accepts; see the screen man page for a complete list. You must give these commands from an active window (while you are running screen). By default all commands start with CONTROL-A. Table VI-31 screen commands Command What it does CONTROL-A ? (help) Displays a list of screen commands (key bindings). CONTROL-A " Displays a list of windows in the current session. You can select a new active window from this list. CONTROL-A 1, 2, ... 9 Makes window number 1, 2, ..., 9 the active window. CONTROL-A A (annotate) Prompts for a window title. Use the erase key (typically BACKSPACE) to back up over the current title before entering a new title. Notes CONTROL-A c (create) Opens a new window and makes that window the active window. CONTROL-A d Detaches from the screen session. CONTROL-A H Toggles logging for the active window. CONTROL-A m (message) Redisplays the most recent message. CONTROL-A N (number) Displays the title and number of the active window. CONTROL-A n (next) Makes the window with the next higher number the active window. The window with the highest number wraps to the window with the lowest number. CONTROL-A p (previous) Makes the window with the next lower number the active window. The window with the lowest number wraps to the window with the highest number. You might want to look at one of the other tools that does the same work as screen: tmux (tmux.sourceforge.net), byobu (launchpad.net/byobu), Terminator (GUI; software.jessies.org/terminator), or Guake (GUI; guake.org). Discussion Terminology Session When you run screen from the command line, you start a screen session. One session can have many windows. A session name has the format pid.tty.host. A session comprises one or more windows. You can attach to and detach from a session without interrupting the work you are doing in the windows. Also terminal. Window When used with reference to screen, the term window refers to a textual window, not a graphical one. Typically a screen window occupies the entire physical or virtual screen. Within a session, you can identify a window by its window number. Also screen. screen 933 Active window working with). The active window is the window screen is displaying (and you are How it works When you call screen without naming a program on the command line, it opens a window running a shell. This window looks identical to your login screen but it is not running a login shell and might display information on its status (last) line. From this screen window you can use screen commands to open additional windows and display other windows (switch to a different active window). Summary You call screen once to start a session and then open windows as needed. You can detach from and attach to a session at will. Logging By default screen log files are written to the user’s home directory and are named screenlog.no, where no is the number of the screen whose log the file holds. emacs If you are running emacs in a screen session, use CONTROL-A a in place of the normal CONTROL-A to move the cursor to the beginning of a line. Starting screen with the –L option turns logging on for all windows in the session; pressing CONTROL-A H from a screen window turns logging on for that (the active) window if logging is off and turns it off if it is on. When you turn logging on and a log file for the active window exists, screen appends to that file. ~/.screenrc The screen startup file is named .screenrc and is located in the user’s home directory. You can put many commands in this file; this section briefly discusses how to set up a window’s status line. The first line in the following file turns the status line on (it is on by default). The second line pertains to terminals that do not have a status line: It makes the last line of the terminal the status line. The third line specifies what screen is to display on the status line. $ cat ~/.screenrc hardstatus on hardstatus alwayslastline hardstatus string "%{Rb} Host %H %= Title %t %= Number %n %= Session %S %= %D %c " Within the hardstatus string, the % sign is the escape mechanism that tells screen the next character has a special meaning. Braces enclose multiple special characters; string displays any character not preceded by % as itself; %= causes screen to pad the status line with SPACEs. From left to right, %{Rb} causes screen to display the status line with a bright red foreground and a blue background; screen replaces %H with the hostname, %t with the title of the window, %n with the window number, %S with the name of the session, %D with the day of the week, and %c with the time based on a 24-hour clock. See STRING ESCAPES in the screen man page for a complete list of escapes you can use in the hardstatus string. Tutorial This tutorial follows Sam as he works with screen. Sam has some work to do on a remote system he connects to using ssh (Chapter 17). The connection Sam uses is not reliable and he does not want to lose his work if he gets disconnected. Also, it is late in the day and Sam wants to be able to disconnect from the remote system, go home, and reconnect without losing track of what he is doing. 934 screen Working from the system named guava, Sam uses ssh to connect to plum, the remote system he wants to work on: [sam@guava ~]$ ssh plum Last login: Sat Feb 23 11:48:33 2013 from 172.16.192.1 [sam@plum ~]$ Next he gives the command screen without any options or arguments. Sam’s ~/.screenrc on plum is the same as the one shown on the preceding page. The screen utility opens a new window and starts a shell that displays a prompt. Sam’s terminal looks the same as it did when he first logged in except for the status line at the bottom of the screen: Host plum Title bash Number 0 Session pts-1.plum Tue 10:59 The status line shows Sam is working on the system named plum in window number 0. Initially the title of the window is the name of the program running in the window; in this case it is bash. Sam wants to make the status line reflect the work he is doing in the window. He is compiling a program, and wants to change the title to Compile. He gives a CONTROL-A A command to change the window title; screen responds by prompting him on the message line (just above the status line): Set window's title to: bash Before he can enter a new title, Sam must use the erase key (typically BACKSPACE) to back up over bash. Then he enters Compile and presses RETURN. Now the status line looks like this: Host plum Title Compile Number 0 Session pts-1.plum Tue 11:01 Sam wants to keep track of the work he does in this window so he enters CONTROL-A H to turn logging on; screen confirms the creation of the log file on the status line: Creating logfile "screenlog.0". If there had been a file named screenlog.0, screen would have reported it was appending to the file. Sam wants to work on two more tasks: reading reports and writing a letter. He gives a CONTROL-A c command to open a new window; the status bar reports this window is number 1. Then he changes the title of the window to Report using CONTROL-A A as before. He repeats the process, opening window number 2 and giving it the title Letter. When he gives a CONTROL-A " (CONTROL-A followed by a double quotation mark) command, screen displays information about the three windows Sam has set up: Num Name Flags 0 Compile 1 Report 2 Letter ... Host plum $(L) $ $ Title Number - Session pts-1.plum Tue 11:14 screen 935 The output from this command shows the window numbers, titles (under the heading Name), and the flags associated with each window. The L flag on the Compile window indicates logging has been turned on for that window. On the status line, the window number is a hyphen, indicating this window is an informational window. Sam can select a different active window from this informational window by using the UP ARROW and DOWN ARROW keys to move the highlight between the listed windows and pressing RETURN when the highlight is over the window he wants to use. He highlights the Compile window and presses RETURN; screen closes the informational window and displays the Compile window. Sam gives the commands to start the compilation with the output from the compilation going to the screen. Next Sam wants to work on the letter, and, because he knows it is in window number 2, he uses a different technique to select the Letter window. He presses CONTROL-A 2 and screen displays window number 2. As he is working on the letter, he realizes he needs some information from a report. He is using window number 2 and he remembers the reports are in window number 1. He presses CONTROL-A p to display the previous window (the window with the next lowest number); screen displays window number 1, which is the Report window, and Sam starts looking for the information he needs. All of a sudden Sam sees his prompt from guava; the connection failed. Undaunted, he uses ssh to reconnect to plum and gives a screen –ls command to list screen sessions that are running on plum: [sam@guava ~]$ ssh plum Last login: Tue May 1 10:55:53 2012 from guava [sam@plum ~]$ screen -ls There is a screen on: 2041.pts-1.plum (Detached) 1 Socket in /var/run/screen/S-sam. All Sam needs to do is attach to the detached screen session and he will be back where he was with his work. Because only one screen session exists, he could simply give a screen –r command to attach to it; he chooses to give a command that includes the name of the session he wants to attach to: [sam@plum ~]$ screen -r 2041.pts-1.plum The screen utility displays the Report window, which still shows the same information as it did when Sam got disconnected. Sam works for a while longer, periodically checking on the compilation, and decides it is time to go home. This time Sam detaches from the screen session on purpose by giving a CONTROL-A d command. He sees the original command he used to start screen followed by a message from screen saying he has detached from the session and giving the session number. Then he logs out of plum, and guava displays a prompt. $ screen [detached from 2041.pts-1.plum] [sam@plum ~]$ exit logout Connection to plum closed. [sam@guava ~]$ 936 screen Sam goes home and eats dinner. When he is ready to work again, he uses ssh to log in on plum from his computer at home. He attaches to his screen session as he did after the connection failed and displays window 0, the Compile window, to see how the compilation is doing. The window holds the last lines of the compilation that show it was successful, so he types exit to close the window; screen displays the previous active window. Now CONTROL-A " would list two windows. After working on the letter for a while, Sam starts worrying about which commands he gave to compile his program—so he opens the log file from the old window 0 (screenlog.0) and checks. In that file he sees all the commands he gave after he turned logging on, together with the shell prompts and other output that appeared on the screen. All is well. Sam is done so he exits from each of the two remaining windows. Now plum displays its prompt preceded by the commands he gave before he started running screen. Sam logs off of plum and he is done. You do not have to run screen on a remote system. This utility can be useful on the local system when you want to work with several windows at the same time. If you are running several virtual machines or working with several local or remote systems, you can run one local screen session with several windows, each logged in on a different system. sed 937 sed sed [–n] program [file-list] sed [–n] –f program-file [file-list] The sed (stream editor) utility is a batch (noninteractive) editor. It transforms an input stream that can come from a file or standard input. It is frequently used as a filter or in a pipeline. Because it makes only one pass through its input, sed is more efficient than an interactive editor such as ed. Most Linux distributions provide GNU sed; Mac OS X supplies BSD sed. Chapter 15 applies to both versions. See Chapter 15 for information on sed tip See Chapter 15 starting on page 653 for information on sed. sed Edits a file noninteractively 938 SetFile O SetFile O SetFile O Sets file attributes SetFile [options] file-list The SetFile utility sets file attributes (page 1046), including the file’s type and creator codes, creation and last modification times, and attribute flags such as the invisible and locked flags. The SetFile utility is available under OS X only. O Arguments The file-list specifies the pathnames of one or more files that SetFile works on. Options The options for SetFile correspond to the options for GetFileInfo (page 831). –a flags (attribute) Sets the attribute flags specified by flags. An uppercase letter for a flag sets that flag and a lowercase letter unsets the flag. The values of unspecified flags are not changed. See Table D-2 on page 1046 or the SetFile man page for a list of attribute flags. –c creator Sets the creator code to creator. –d date Sets the creation date to date. The format of the date is mm/dd/[yy]yy [hh:mm:[:ss] [AM | PM]]. If you do not specify AM or PM, SetFile assumes a 24-hour clock. You must enclose a date string that contains SPACEs within quotation marks. –m date (modification) Sets the modification date to date. The format of date is the same as that used with the –d option. –P (no dereference) For each file that is a symbolic link, sets information about the symbolic link, not the file the link points to. This option affects all files and treats files that are not symbolic links normally. See page 115 for information on dereferencing symbolic links. By default SetFile dereferences symbolic links. –t type Sets the type code to type. Notes The SetFile utility is part of the optional Xcode package. The options to SetFile and the corresponding options to GetFileInfo have minor differences. For example, you can specify multiple attribute flags with the –a option to SetFile but only a single flag with GetFileInfo. Also, SetFile requires a SPACE between the –a option and the list of flags; GetFileInfo does not allow a SPACE there. SetFile O 939 Examples The first example sets the type and creator codes of the file named arch to SIT5 and SIT!, respectively, indicating that it is a StuffIt archive. The GetFileInfo utility displays these codes. $ SetFile -t SIT5 -c SIT! arch $ GetFileInfo -c arch "SIT!" $ GetFileInfo -t arch "SIT5" The next example marks the file named secret as invisible and locked. The file will not be visible in the Finder, and most OS X applications will be unable to overwrite it. $ SetFile -a VL secret The final example clears the invisible attribute flag from every file (but not files with hidden filenames [page 86]) in the working directory: $ SetFile -a v * 940 sleep sleep sleep Creates a process that sleeps for a specified interval sleep time sleep time-list L The sleep utility causes the process executing it to go to sleep for the time specified. Arguments By itself, the time denotes a number of seconds; the time can be an integer or a decimal fraction. Under Linux you can append a unit specification to time: s (seconds), m (minutes), h (hours), or d (days). Under Linux you can construct a time-list by including several times on the command line: The total time the process sleeps is the sum of these times. For example, if you specify 1h 30m 100s, the process will sleep for 91 minutes and 40 seconds. Examples You can use sleep from the command line to execute a command after a period of time. The following example executes in the background a process that reminds you to make a phone call in 20 minutes (1,200 seconds): $ (sleep 1200; echo "Remember to make call.") & [1] 4660 Alternatively, under Linux, you could give the following command to get the same reminder: $ (sleep 20m; echo "Remember to make call.") & [2] 4667 You can also use sleep within a shell script to execute a command at regular intervals. For example, the per shell script executes a program named update every 90 seconds: $ cat per #!/bin/bash while true do update sleep 90 done If you execute a shell script such as per in the background, you can terminate it only by using kill. The final shell script accepts the name of a file as an argument and waits for that file to appear on the disk. If the file does not exist, the script sleeps for 1 minute and 45 seconds before checking for the file again. sleep 941 $ cat wait_for_file #!/bin/bash if [ $# != 1 ]; then echo "Usage: wait_for_file filename" exit 1 fi while true do if [ -f "$1" ]; then echo "$1 is here now" exit 0 fi sleep 1m 45s done Under OS X, replace 1m 45s with 105. 942 sort sort Sorts and/or merges files sort sort [options] [file-list] The sort utility sorts and/or merges one or more text files. Arguments The file-list is a list of pathnames of one or more ordinary files that contain the text to be sorted. If the file-list is omitted, sort takes its input from standard input. Without the –o option, sort sends its output to standard output. This utility sorts and merges files unless you use the –m (merge only) or –c (check only) option. Options The sort utility orders the file using the collating sequence specified by the LC_COLLATE locale variable (page 317). Without a ––key option, sort orders a file based on full lines. Use ––key to specify sort fields within a line. You can follow a ––key option with additional options without a leading hyphen; see the “Discussion” section for more information. The Mac OS X version of sort accepts long options tip Options for sort preceded by a double hyphen (––) work under Mac OS X as well as under Linux. ––ignore-leading-blanks –b Blanks (TAB and SPACE characters) normally mark the beginning of fields in the input file. Without this option, sort considers leading blanks to be part of the field they precede. This option ignores leading blanks within a field, so sort does not consider these characters in sort comparisons. ––check –c Checks whether the file is properly sorted. The sort utility does not display anything if everything is in order. It displays a message if the file is not in sorted order and returns an exit status of 1. ––dictionary-order –d Ignores all characters that are not alphanumeric characters or blanks. For example, sor t does not consider punctuation with this option. The LC_CTYPE locale variable (page 317) affects the outcome when you specify this option. ––ignore-case –f (fold) Considers all lowercase letters to be uppercase letters. Use this option when you are sorting a file that contains both uppercase and lowercase letters. sort 943 ––ignore-nonprinting –i Ignores nonprinting characters. This option is overridden by the ––dictionaryorder option. The LC_CTYPE locale variable (page 317) affects the outcome when you specify this option. ––key=start[,stop] –k start[,stop] Specifies a sort field within a line. Without this option sort orders a file based on full lines. The sort field starts at the position on the line specified by start and ends at stop, or the end of the line if stop is omitted. The start and stop positions are in the format f[.c], where f is the field number and c is the optional character within the field. Numbering starts with 1. When c is omitted from start, it defaults to the first character in the field; when c is omitted from stop, it defaults to the last character in the field. See the “Discussion” section for further explanation of sort fields and the “Examples” section for illustrations of their use. ––merge –m Assumes that each of the multiple input files is in sorted order and merges them without verifying they are sorted. ––numeric-sort –n Sorts in arithmetic sequence; does not order lines or sort fields in the machine collating sequence. With this option, minus signs and decimal points take on their arithmetic meaning. ––output=filename –o filename Sends output to filename instead of standard output; filename can be the same as one of the names in the file-list. ––reverse –r Reverses the sense of the sort (for example, z precedes a). ––field-separator=x –t x Specifies x as the field separator. See the “Discussion” section for more information on field separators. ––unique Discussion –u Outputs repeated lines only once. When you use this option with ––check, sort displays a message if the same line appears more than once in the input file, even if the file is in sorted order. Without any options, sort bases its ordering on full lines. Sort order The way locale is set affects the way sort orders a file. For traditional ordering, set LC_ALL to C before running sort. Field In the following description, a field is a sequence of characters in a line of input. Without the ––field-separator option, fields are bounded by the empty string preceding a group of one or more blanks (TAB and SPACE characters). You cannot see the empty 944 sort Field Field Field Toni.Barnett....55020 Possible sort fields Figure VI-8 Fields and sort fields string that delimits the fields; it is an imaginary point between two fields. Fields are also bounded by the beginning and end of the line. The line shown in Figure VI-8 holds the fields Toni, SPACEBarnett, and SPACESPACESPACESPACE55020. These fields define sort fields. Sometimes fields and sort fields are the same. Sort field A sort field is a sequence of characters that sort uses to put lines in order. A sort field can contain all or part of one or more fields (Figure VI-8). The ––key option specifies pairs of pointers that define subsections of each line (sort fields) for comparison. See the ––key option (page 943) for details. Leading blanks The –b option causes sort to ignore leading blanks in a sort field. If you do not use this option, sort considers each leading blank to be a character in the sort field and includes it in the sort comparison. Options You can specify options that pertain only to a given sort field by immediately following the stop pointer (or the start pointer if there is no stop pointer) with one of the options b, d, f, i, n, or r. In this case you must not precede the option with a hyphen. Multiple sort fields When you specify more than one sort field, sort examines them in the order you specify them on the command line. If the first sort field of two lines is the same, sort examines the second sort field. If these are again the same, sort looks at the third field. This process continues for all the sort fields you specify. If all the sort fields are the same, sort examines the entire line. Examples The examples in this section demonstrate some of the features and uses of the sort utility. The examples assume LC_ALL is set to C and the following file named list is in the working directory: $ cat list Tom Winstrom Janet Dempsey Alice MacLeod David Mack Toni Barnett Jack Cooper Richard MacDonald 94201 94111 94114 94114 95020 94072 95510 This file contains a list of names and ZIP codes. Each line of the file contains three fields: the first name field, the last name field, and the ZIP code field. For the examples to work, the blanks in the file must be SPACE s, and not TABs. sort 945 The first example demonstrates sort without any options—the only argument is the name of the input file. In this case sort orders the file on a line-by-line basis. If the first characters on two lines are the same, sort looks at the second characters to determine the proper order. If the second characters are the same, sort looks at the third characters. This process continues until sort finds a character that differs between the lines. If the lines are identical, it does not matter which one sort puts first. In this example, sort needs to examine only the first three characters (at most) of each line. It displays a list that is in alphabetical order by first name. $ sort list Alice MacLeod David Mack Jack Cooper Janet Dempsey Richard MacDonald Tom Winstrom Toni Barnett 94114 94114 94072 94111 95510 94201 95020 You can instruct sort to skip any number of fields and characters on a line before beginning its comparison. Blanks normally mark the beginning of a field. The next example sorts the same list by last name, the second field: The ––key=2 argument instructs sort to begin its comparison with this field. Because there is no second pointer, the sort field extends to the end of the line. Now the list is almost in last name order, but there is a problem with Mac. $ sort --key=2 list Toni Barnett Jack Cooper Janet Dempsey Richard MacDonald Alice MacLeod David Mack Tom Winstrom 95020 94072 94111 95510 94114 94114 94201 In the preceding example, MacLeod comes before Mack. After finding that the sort fields of these two lines were the same through the third letter (Mac), sort put L before k because it arranges lines based on the value of LC_COLLATE, which is assumed to be set to C, which specifies that uppercase letters come before lowercase ones. The ––ignore-case option makes sort treat uppercase and lowercase letters as equals and fixes the problem with MacLeod and Mack: $ sort --ignore-case Toni Barnett Jack Cooper Janet Dempsey Richard MacDonald David Mack Alice MacLeod Tom Winstrom --key=2 list 95020 94072 94111 95510 94114 94114 94201 946 sort The next example attempts to sort list on the third field, the ZIP code. In this case sort does not put the numbers in order but rather puts the shortest name first in the sorted list and the longest name last. The ––key=3 argument instructs sort to begin its comparison with the third field, the ZIP code. A field starts with a blank and includes subsequent blanks. In the case of the list file, the blanks are SPACEs. The ASCII value of a SPACE character is less than that of any other printable character, so sort puts the ZIP code that is preceded by the most SPACEs first and the ZIP code that is preceded by the fewest SPACEs last. $ sort --key=3 list David Mack Jack Cooper Tom Winstrom Toni Barnett Janet Dempsey Alice MacLeod Richard MacDonald 94114 94072 94201 95020 94111 94114 95510 The –b (––ignore-leading-blanks) option causes sort to ignore leading SPACEs within a field. With this option, the ZIP codes come out in the proper order. When sort determines that MacLeod and Mack have the same ZIP codes, it compares the entire lines, putting Alice MacLeod before David Mack (because A comes before D). $ sort -b --key=3 list Jack Cooper 94072 Janet Dempsey 94111 Alice MacLeod 94114 David Mack 94114 Tom Winstrom 94201 Toni Barnett 95020 Richard MacDonald 95510 To sort alphabetically by last name when ZIP codes are the same, sort needs to make a second pass that sorts on the last name field. The next example shows how to make this second pass by specifying a second sort field and uses the –f (––ignore-case) option to keep the Mack/MacLeod problem from cropping up again: $ sort -b -f --key=3 Jack Cooper Janet Dempsey David Mack Alice MacLeod Tom Winstrom Toni Barnett Richard MacDonald --key=2 list 94072 94111 94114 94114 94201 95020 95510 The next example shows a sort command that skips not only fields but also characters. The –k 3.4 option (equivalent to ––key=3.4) causes sort to start its comparison with the fourth character of the third field. Because the command does not define an end to the sort field, it defaults to the end of the line. The sort field is the last two digits in the ZIP code. sort 947 $ sort -fb -k 3.4 list Tom Winstrom 94201 Richard MacDonald 95510 Janet Dempsey 94111 Alice MacLeod 94114 David Mack 94114 Toni Barnett 95020 Jack Cooper 94072 The problem of how to sort by last name within the last two digits of the ZIP code is solved by a second pass covering the last name field. The f option following the –k 2 affects the second pass, which orders lines by last name only. $ sort -b -k 3.4 -k 2f list Tom Winstrom 94201 Richard MacDonald 95510 Janet Dempsey 94111 David Mack 94114 Alice MacLeod 94114 Toni Barnett 95020 Jack Cooper 94072 The next set of examples uses the cars data file. All blanks in this file are TABs; it contains no SPACEs. From left to right, the columns in this file contain each car’s make, model, year of manufacture, mileage, and price: $ cat cars plym fury chevy malibu ford mustang volvo s80 ford thundbd chevy malibu bmw 325i honda accord ford taurus toyota rav4 chevy impala ford explor 1970 1999 1965 1998 2003 2000 1985 2001 2004 2002 1985 2003 73 60 45 102 15 50 115 30 10 180 85 25 2500 3000 10000 9850 10500 3500 450 6000 17000 750 1550 9500 Without any options sort displays a sorted copy of the file: $ sort cars bmw 325i chevy impala chevy malibu chevy malibu ford explor ford mustang ford taurus ford thundbd honda accord plym fury toyota rav4 volvo s80 1985 1985 1999 2000 2003 1965 2004 2003 2001 1970 2002 1998 115 85 60 50 25 45 10 15 30 73 180 102 450 1550 3000 3500 9500 10000 17000 10500 6000 2500 750 9850 948 sort The objective of the next example is to sort by manufacturer and by price within manufacturer. Unless you specify otherwise, a sort field extends to the end of the line. The –k 1 sort field specifier sorts from the beginning of the line. The command line instructs sort to sort on the entire line and then make a second pass, sorting on the fifth field all lines whose first-pass sort fields were the same (–k 5): $ sort -k 1 -k 5 cars bmw 325i 1985 chevy impala 1985 chevy malibu 1999 chevy malibu 2000 ford explor 2003 ford mustang 1965 ford taurus 2004 ford thundbd 2003 honda accord 2001 plym fury 1970 toyota rav4 2002 volvo s80 1998 115 85 60 50 25 45 10 15 30 73 180 102 450 1550 3000 3500 9500 10000 17000 10500 6000 2500 750 9850 Because no two lines are the same, sort makes only one pass, sorting on each entire line. (If two lines differed only in the fifth field, they would be sorted properly on the first pass anyway, so the second pass would be unnecessary.) Look at the lines containing taurus and thundbd. They are sorted by the second field rather than by the fifth field, demonstrating that sort never made a second pass and so never sorted on the fifth field. The next example forces the first-pass sort to stop at the end of the first field. The –k 1,1 option specifies a start pointer of the first character of the first field and a stop pointer of the last character of the first field. When you do not specify a character within a start pointer, it defaults to the first character; when you do not specify a character within a stop pointer, it defaults to the last character. Now taurus and thundbd are properly sorted by price. But look at explor: It is less expensive than the other Fords, but sort has it positioned as the most expensive. The sort utility put the list in ASCII collating sequence order, not in numeric order: 9500 comes after 10000 because 9 comes after 1. $ sort -k 1,1 -k 5 cars bmw 325i 1985 chevy impala 1985 chevy malibu 1999 chevy malibu 2000 ford mustang 1965 ford thundbd 2003 ford taurus 2004 ford explor 2003 honda accord 2001 plym fury 1970 toyota rav4 2002 volvo s80 1998 115 85 60 50 45 15 10 25 30 73 180 102 450 1550 3000 3500 10000 10500 17000 9500 6000 2500 750 9850 sort 949 The –n (numeric) option on the second pass puts the list in the proper order: $ sort -k 1,1 -k 5n cars bmw 325i 1985 115 chevy impala 1985 85 chevy malibu 1999 60 chevy malibu 2000 50 ford explor 2003 25 ford mustang 1965 45 ford thundbd 2003 15 ford taurus 2004 10 honda accord 2001 30 plym fury 1970 73 toyota rav4 2002 180 volvo s80 1998 102 450 1550 3000 3500 9500 10000 10500 17000 6000 2500 750 9850 The next example again demonstrates that, unless you instruct it otherwise, sort orders a file starting with the field you specify and continuing to the end of the line. It does not make a second pass unless two of the first sort fields are the same. Because there is no stop pointer on the first sort field specifier, the sort field for the first pass includes the third field through the end of the line. Although this example sorts the cars by years, it does not sort the cars by model within manufacturer within years (ford thundbd comes before ford explor, so these lines should be reversed). $ sort -k 3 -k 1 cars ford mustang 1965 plym fury 1970 bmw 325i 1985 chevy impala 1985 volvo s80 1998 chevy malibu 1999 chevy malibu 2000 honda accord 2001 toyota rav4 2002 ford thundbd 2003 ford explor 2003 ford taurus 2004 45 73 115 85 102 60 50 30 180 15 25 10 10000 2500 450 1550 9850 3000 3500 6000 750 10500 9500 17000 Specifying an end to the sort field for the first pass allows sort to perform its secondary sort properly: $ sort -k 3,3 -k 1 cars ford mustang 1965 plym fury 1970 bmw 325i 1985 chevy impala 1985 volvo s80 1998 chevy malibu 1999 chevy malibu 2000 honda accord 2001 toyota rav4 2002 ford explor 2003 ford thundbd 2003 ford taurus 2004 45 73 115 85 102 60 50 30 180 25 15 10 10000 2500 450 1550 9850 3000 3500 6000 750 9500 10500 17000 950 sort The next examples demonstrate important sorting techniques: putting a list in alphabetical order, merging uppercase and lowercase entries, and eliminating duplicates. The unsorted list follows: $ cat short Pear Pear apple pear Apple Following is a plain sort: $ sort short Apple Pear Pear apple pear The following folded sort is a good start, but it does not eliminate duplicates: $ sort -f short Apple apple Pear Pear pear The –u (unique) option eliminates duplicates but without the –f the uppercase entries come first: $ sort -u short Apple Pear apple pear When you attempt to use both –u and –f, some of the entries get lost: $ sort -uf short apple Pear Two passes is the answer. Both passes are unique sorts, and the first folds lowercase letters onto uppercase ones: $ sort -u -k 1f -k 1 short Apple apple Pear pear split 951 split split [options] [filename [prefix]] The split utility breaks its input into 1,000-line sections named xaa, xab, xac, and so on. The last section might be shorter. Options can change the sizes of the sections and lengths of the names. Arguments The filename is the pathname of the file that split processes. If you do not specify an argument or if you specify a hyphen (–) instead of the filename, split reads from standard input. The prefix is one or more characters that split uses to prefix the names of the files it creates; the default is x. Options Under Linux, split accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––suffix-length=len –a len Specifies that the filename suffix is len characters long (the default is 2). ––bytes=n[u] –b n[u] Breaks the input into files that are n bytes long. The u is an optional unit of measure that can be k (kilobyte or 1,024-byte blocks) or m (megabyte or 1,048,576-byte blocks). If you include the unit of measure, split counts by this unit instead of by bytes. Under Linux, u can be b (512-byte blocks) or any of the suffixes listed in Table VI-1 on page 717. ––numeric-suffixes –d Specifies numeric suffixes instead of alphabetic suffixes. ––lines=num –l num Breaks the input into files that are num lines long (the default is 1,000). Discussion By default split names the first file it creates xaa. The x is the default prefix. You can change the prefix by using the prefix argument on the command line. You can change the number of characters in each filename following the prefix by using the ––suffixlength option. Examples By default split breaks a file into 1,000-line sections with the names xaa, xab, xac, and so on. The wc utility with the –l option shows the number of lines in each file. The last file, xar, is smaller than the rest. split Divides a file into sections 952 split $ split /usr/share/dict/words $ wc -l * 1000 xaa 1000 xab 1000 xac ... 1000 xdt 569 xdu 98569 total The next example uses the prefix argument to specify a filename prefix of SEC and uses –c (––suffix-length) to change the number of letters in the filename suffix to 3: $ split $ ls SECaaa SECaab ... SECaaj -a 3 /usr/share/dict/words SEC SECaak SECaal SECaau SECaav SECabe SECabf SECabo SECabp SECaby SECabz SECaci SECacj SECacs SECact SECadc SECadd SECaat SECabd SECabn SECabx SECach SECacr SECadb SECadl SECadm SECadn ssh 953 ssh ssh [option] [user@]host [command-line] The ssh utility logs in on a remote system and starts a shell. Optionally, ssh executes a command on the remote system and logs out. The ssh utility provides secure, encrypted communication between two systems over an unsecure network. See Chapter 17 for information on ssh tip See the sections of Chapter 17 starting on pages 691 and 694 for information on the ssh utility, one of the OpenSSH secure communication utilities. ssh Securely runs a program or opens a shell on a remote system 954 sshfs/curlftpfs sshfs/curlftpfs sshfs/curlftpfs Mounts a directory on an OpenSSH or FTP server as a local directory sshfs [options] [user@]host:[remote-directory] mount-point curlftpfs [options] host mount-point fusermount –u mount-point The sshfs and curlftpfs utilities enable a nonprivileged user to mount a directory hierarchy on an OpenSSH or FTP host as a local directory. Arguments The user is the optional username sshfs logs in as. The user defaults to the username of the user running the command on the local system. The host is the server running the OpenSSH or FTP daemon. The remote-directory is the directory to be mounted and defaults to the home directory of the user sshfs logs in as. A relative pathname specifies a directory relative to the user’s home directory; an absolute pathname specifies the absolute pathname of a directory. The mount-point is the (empty) local directory where the remote directory is mounted. Unless ssh/ftp is set up to log in automatically, curlftpfs prompts for a username and sshfs and curlftpfs prompt for a password. See page 702 (ssh) and page 821 (ftp) for information on setting up automatic logins. Options The sshfs and curlftpfs utilities have many options; see their respective man pages for complete lists. –o allow_other Grants access to the mount-point to users other than the user who ran the sshfs/curlftpfs command, including a privileged user. By default only the user who ran the command (and not a privileged user) can access the mount-point regardless of directory permissions. You must uncomment the user_allow_other line in /etc/fuse.conf for this option to work (see “Notes,” following). –o allow_root Grants access to the mount-point to a privileged user. By default only the user who ran the sshfs/curlftpfs command can access the mount-point. You must uncomment the user_allow_other line in /etc/fuse.conf for this option to work (see “Notes,” following). –o debug –d Displays FUSE debugging information. Implies –f. –f Runs sshfs/curlftpfs in the foreground. –p port Connects to port port on the remote system (sshfs only). sshfs/curlftpfs 955 Discussion The sshfs and curlftpfs utilities are based on the FUSE (Filesystems in USErspace; fuse.sourceforge.net) kernel module. FUSE enables users to mount remote directories they have access to without working as a privileged user (which is required when giving a mount command) and without the remote system exporting the directory hierarchy (required when running NFS). The mounted directory can be in a location that is convenient to the user, typically in the user’s home directory hierarchy. If the user can mount the directory without entering a username and password, the process can be automated by putting commands in one of the user’s startup files. Notes You might need to install the sshfs or curlftpfs package before you can run sshfs, curlftpfs, or fusermount. See Appendix C for details. The sshfs and curlftpfs utilities depend on the FUSE kernel module. The following command checks whether this module is loaded: $ lsmod | grep fuse fuse 71167 3 The documentation recommends running sshfs and curlftpfs as a nonprivileged user. Without the –o allow_other option, only the user who gives the sshfs or curlftpfs command can access the mounted directory, regardless of permissions. Not even a privileged user can access it. You can use the –o allow_other option to give other users access to the mounted directory and –o allow_root to give a privileged user access. For these options to work, you must work as a privileged user to uncomment the user_allow_other line in the /etc/fuse.conf file. This line must appear exactly as shown with nothing else on the line. $ cat /etc/fuse.conf # mount_max = 1000 user_allow_other The mount_max line specifies the maximum number of simultaneous mounts. If you uncomment this line, the syntax must be exactly as shown. Examples sshfs To use sshfs to mount a remote directory, you must be able to use ssh to list that directory. Sam has set up ssh so he does not need a password to run commands on the remote system named plum. See page 702 for information on setting up ssh so it does not require a password. If ssh requires you to supply a password, sshfs will also require a password. $ ssh plum ls letter.0505 memos pix After Sam checks that he can use ssh to run commands on plum, he creates the sam.plum.fs directory and mounts his home directory from plum on that directory. 956 sshfs/curlftpfs An ls command shows that Sam’s files on plum are now available in the sam.plum.fs directory. $ mkdir sam.plum.fs $ sshfs plum: sam.plum.fs $ ls sam.plum.fs letter.0505 memos pix When he is done using the files, Sam unmounts the remote directory using fusermount with the –u option. $ fusermount -u sam.plum.fs $ ls sam.plum.fs Next, Sam mounts his memos directory from plum so anyone on his local system can access the files in that directory. First, working as a privileged user, he creates a directory under the root directory and gives himself ownership of that directory. # mkdir /sam.memos ; chown sam:sam /sam.memos # ls -ld /sam.memos drwxr-xr-x. 2 sam sam 4096 02-23 15:55 /sam.memos Again working with root privileges, he edits the /etc/fuse.conf file and uncomments the user_allow_other line so other users will be able to access the mounted directory. Then, working as himself, Sam uses sshfs with the –o allow_other option to mount his plum memos directory on the local system as /sam.memos. The df utility (page 775) shows the mounted directory. $ sshfs -o allow_other plum:memos /sam.memos $ ls /sam.memos 0602 0603 0604 $ df -h /sam.memos Filesystem Size plum:memos 146G Used Avail Use% Mounted on 2.3G 136G 2% /sam.memos Now any user on the system can access Sam’s memos directory from plum as /sam.memos on the local system. curlftpfs The next example shows how to mount an anonymous FTP server as a local directory. See page 821 for information about setting up a .netrc file so you do not have to supply a password on the command line. In the following example, Sam creates the kernel.org directory and then runs curlftpfs to mount the FTP directory at mirrors.kernel.org as kernel.org. He uses the –o user=ftp option to specify a username and enters his email address in response to the password prompt. Use fusermount –u to unmount the remote directory. $ mkdir kernel.org $ curlftpfs -o user=ftp mirrors.kernel.org kernel.org Enter host password for user 'ftp': $ ls -l kernel.org | head -4 dr-xr-xr-x. 18 root root 4096 01-14 19:13 archlinux drwxrwxr-x. 30 root root 4096 12-21 09:33 centos drwxrwxr-x. 12 root root 4096 2012-02-23 cpan stat 957 stat stat [options] [file-list] The stat utility displays information about files. Arguments The file-list specifies the pathnames of one or more files that stat displays information about. Without a file-list, stat displays information about standard input. Options Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. Without any options, stat displays all available information about each file it processes. ––format=fmt –c fmt Formats output using fmt. See the stat man page for more information. L –F (file type) Displays a slash (/) after each directory, an asterisk (*) after each executable file, an at sign (@) after a symbolic link, an equal sign (=) after a socket, and a pipe symbol (|) after a FIFO. O –f (filesystem) Displays filesystem information instead of file information. L –f fmt Formats output using fmt. The fmt string is similar to that used by printf (page 917). See the stat man page for more information. O ––dereference –L For each file that is a symbolic link, displays information about the file the link points to, not the symbolic link itself. This option treats files that are not symbolic links normally. See page 115 for information on dereferencing symbolic links. –l (long) Uses the same format as ls –l (page 98). O ––printf=fmt Formats output using fmt. The fmt string is similar to that used by printf (page 917). See the stat man page for more information. –q (quiet) Suppresses error messages. O –s (shell) Displays information in a format that can be used to initialize shell variables. O –x (Linux) Displays a more verbose format that is compatible with the version of stat found on Linux. O stat Displays information about files 958 stat Examples The Linux and Mac OS X versions of stat display different information. The examples show the output from the Linux version. The first example displays information about the /bin/bash file: $ stat /bin/bash File: '/bin/bash' Size: 893964 Blocks: 1752 IO Block: 4096 Device: fd01h/64769d Inode: 22183 Links: 1 Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( Context: system_u:object_r:shell_exec_t:s0 Access: 2012-05-02 03:12:22.065940944 -0700 Modify: 2012-03-13 08:53:35.000000000 -0700 Change: 2012-05-02 03:12:16.675941400 -0700 Birth: - regular file 0/ The next example displays information about the root filesystem: $ stat -f / File: "/" ID: 491003435dced81d Namelen: 255 Type: ext2/ext3 Block size: 4096 Fundamental block size: 4096 Blocks: Total: 13092026 Free: 12171215 Available: 12040183 Inodes: Total: 3276800 Free: 3177256 root) strings 959 Displays strings of printable characters from files strings [options] file-list The strings utility displays strings of printable characters from object and other nontext files. Arguments The file-list is a list of files that strings processes. Options Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––all ––print-file-name –a Processes whole files. Without this option strings processes only the initialized and loaded parts of an object file. –f Precedes each string with the name of the file that the string comes from. L ––bytes=min –min Displays strings of characters that are at least min characters long (the default is 4). Discussion The strings utility can help you determine the contents of nontext files. One notable application for strings is determining the owner of files in a lost+found directory. Examples The following example displays strings of four or more printable characters in the executable file for the man utility. If you did not know what this file was, these strings could help you determine that it was the man executable. $ strings /usr/bin/man ... --Man-- next: %s [ view (return) | skip (Ctrl-D) | quit (Ctrl-C) ] format: %d, save_cat: %d, found: %d cannot write to %s in catman mode creating temporary cat for %s can't write to temporary cat for %s can't create temporary cat for %s cat-saver exited with status %d found ultimate source file %s ... strings strings 960 stty stty Displays or sets terminal parameters stty stty [options] [arguments] Without any arguments, stty displays parameters that affect the operation of the terminal or terminal emulator. For a list of some of these parameters and an explanation of each, see the “Arguments” section. The arguments establish or change parameters. Options Under Linux, stty accepts the common options described on page 718. Options preceded by a double hyphen (– –) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––all –a Reports on all parameters. This option does not accept arguments. –F device Affects device. Without this option stty affects the device attached to standard input. You can change the characteristics of a device only if you own its device file or if you are working with root privileges. L ––file=device –f device Affects device. Performs the same function as –F. O ––save –g Generates a report of the current settings in a format you can use as arguments to another stty command. This option does not accept arguments. Arguments The arguments to stty specify which terminal parameters stty is to alter. To turn on each of the parameters that is preceded by an optional hyphen (indicated in the following list as [–]), specify the parameter without the hyphen. To turn it off, use the hyphen. Unless specified otherwise, this section describes the parameters in their on states. Special Keys and Characteristics columns n Sets the line width to n columns. ek (erase kill) Sets the erase and line kill keys to their default values. Many systems use DELETE and CONTROL-U, respectively, as the defaults. erase x Sets the erase key to x. To specify a control character, precede x with CONTROL-V (for example, use CONTROL-V CONTROL-H to indicate CONTROL-H) or use the notation ^h, where ^ is a caret (SHIFT-6 on most keyboards). intr x Sets the interrupt key to x. See erase x for conventions. kill x Sets the line kill key to x. See erase x for conventions. rows n Sets the number of screen rows to n. stty 961 sane Sets the terminal parameters to values that are usually acceptable. The sane argument is useful when several stty parameters have changed, making it difficult to use the terminal to run stty to set things right. If sane does not appear to work, try entering the following characters: CONTROL-J stty sane CONTROL-J susp x (suspend) Sets the suspend (terminal stop) key to x. See erase x for conventions. werase x (word erase) Sets the word erase key to x. See erase x for conventions. Modes of Data Transmission [–]cooked See raw. L cooked See sane. O [–]cstopb (stop bits) Selects two stop bits (–cstopb specifies one stop bit). [–]parenb (parity enable) Enables parity on input and output. When you specify –parenb, the system does not use or expect a parity bit when communicating with the terminal. [–]parodd (parity odd) Selects odd parity (–parodd selects even parity). [–]raw The normal state is –raw. When the system reads input in its raw form, it does not interpret the following special characters: erase (usually DELETE), line kill (usually interrupt execution (CONTROL-C), and EOF (CONTROL-D). In addition, the system does not use parity bits. Reflecting the humor that is typical of Linux’s heritage, under Linux you can also specify –raw as cooked. CONTROL-U), Treatment of Characters [–]echo Echoes characters as they are typed (full-duplex operation). If a terminal is half- duplex and displays two characters for each one it should display, turn the echo parameter off (–echo). Use –echo when the user is entering passwords. [–]echoe (echo erase) The normal setting is echoe, which causes the kernel to echo the charac- ter sequence BACKSPACE SPACE BACKSPACE when you use the erase key to delete a character. The effect is to move the cursor backward across the line, removing characters as you delete them. [–]echoke (echo kill erase) The normal setting is echoke. When you use the kill character to delete a line while this option is set, all characters back to the prompt are erased on the current line. When this option is negated, pressing the kill key moves the cursor to the beginning of the next line. [–]echoprt (echo print) The normal setting is –echoprt, which causes characters to disappear as you erase them. When you set echoprt, characters you erase are displayed between a backslash ( \) and a slash (/). For example, if you type the word sort and then erase it by pressing BACKSPACE four times, Linux displays sort\tros/ when echoprt is set. If you use the kill character to delete the entire line, having echoprt set causes the entire line to be displayed as if you had BACKSPACEd to the beginning of the line. 962 stty [–]lcase For uppercase-only terminals, translates all uppercase characters into lowercase as they are entered (also [–]LCASE). L [–]nl Accepts only a NEWLINE character as a line terminator. With –nl in effect, the system accepts a RETURN character from the terminal as a NEWLINE but sends a RETURN followed by a NEWLINE to the terminal in place of a NEWLINE. [–]tabs Transmits each TAB character to the terminal as a TAB character. When tabs is turned off (–tabs), the kernel translates each TAB character into the appropriate number of SPACEs and transmits them to the terminal (also [–]tab3). Job Control Parameters [–]tostop Stops background jobs if they attempt to send output to the terminal (–tostop allows background jobs to send output to the terminal). Notes The name stty is an abbreviation for set teletypewriter, or set tty (page 990), the first terminal UNIX was run on. Today stty is commonly thought of as meaning set terminal. The shells retain some control over standard input when you use them interactively. As a consequence, a number of the options available with stty appear to have no effect. For example, the command stty –echo appears to have no effect under tcsh: tcsh $ stty -echo tcsh $ date Tue May 28 16:53:01 PDT 2013 While stty –echo does work when you are using bash interactively, stty –echoe does not. However, you can still use these options to affect shell scripts and other utilities. $ cat testit #!/bin/bash stty -echo read -p "Enter a value: " a echo echo "You entered: $a" stty echo $ ./testit Enter a value: You entered: 77 The preceding example does not display the user’s response to the Enter a value: prompt. The value is retained by the a variable and is displayed by the echo "You entered: $a" statement. Examples The first example shows that stty without any arguments displays several terminal operation parameters. (The local system might display more or different parameters.) The character following the erase = is the erase key. A ^ preceding a character indicates stty 963 a CONTROL key. In the example the erase key is set to CONTROL-H. If stty does not display the erase character, it is set to its default value of DELETE. If it does not display a kill character, it is set to its default of ^U. $ stty speed 38400 baud; line = 0; erase = ^H; Next the ek argument returns the erase and line kill keys to their default values: $ stty ek The next display verifies the change. The stty utility does not display either the erase character or the line kill character, indicating both are set to their default values: $ stty speed 38400 baud; line = 0; The next example sets the erase key to CONTROL-H. The CONTROL-V quotes the CONTROL-H so the shell does not interpret it and passes it to stty unchanged: $ stty erase CONTROL–V CONTROL–H $ stty speed 38400 baud; line = 0; erase = ^H; Next stty sets the line kill key to CONTROL-X. This time the user enters a caret (^) followed by an x to represent CONTROL-X. You can use either a lowercase or uppercase letter. $ stty kill ^X $ stty speed 38400 baud; line = 0; erase = ^H; kill = ^X; Now stty changes the interrupt key to CONTROL-C: $ stty intr CONTROL–VCONTROL–C In the following example, stty turns off TABs so the appropriate number of SPACEs is sent to the terminal in place of a TAB. Use this command if a terminal does not automatically expand TABs. $ stty -tabs If you log in under Linux and everything appears on the terminal in uppercase letters, give the following command and then check the CAPS LOCK key. If it is set, turn it off. $ STTY -LCASE Turn on lcase if you are using a very old terminal that cannot display lowercase characters. Although no one usually changes the suspend key from its default of CONTROL-Z, you can. Give the following command to change the suspend key to CONTROL-T: $ stty susp ^T 964 sysctl sysctl sysctl Displays and alters kernel variables at runtime sysctl [options] [variable-list] sysctl [options] –w [var=value ... ] The sysctl utility displays and alters kernel variables, including kernel tuning parameters, at runtime. Arguments Options The variable-list is a list of kernel variables whose values sysctl displays. In the second syntax, each value is assigned to the variable named var. –a (all) Displays all kernel variables. –b (binary) Displays kernel variables as binary data without terminating O NEWLINEs. –n (no label) Displays variables without labels. Discussion The sysctl utility provides access to a number of kernel variables, including the maximum number of processes the kernel will run at one time, the filename used for core files, and the system security level. Some variables cannot be altered or can be altered only in certain ways. For example, you can never lower the security level. Examples The sysctl utility is commonly used for tuning the kernel. The process limits can be displayed by anyone, but can be altered only by a user who is working with root privileges. The following example shows a user displaying and changing the maximum number of threads: $ sysctl kernel.threads-max kernel.threads-max = 32015 ... # sysctl -w kernel.threads-max=20000 kernel.threads-max = 20000 tail 965 tail tail [options] [file-list] The tail utility displays the last part, or end, of a file. Arguments The file-list is a list of pathnames of the files that tail displays. When you specify more than one file, tail displays the filename of each file before displaying lines from that file. If you do not specify an argument or, under Linux, if you specify a hyphen (–) instead of a filename, tail reads from standard input. Options Under Linux, tail accepts the common options dcribed on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. –b [+]n Counts by 512-byte blocks instead of lines. The n argument is an integer that specifies the number of blocks. Thus the command tail –b 5 displays the last five blocks of a file. See the note about using a plus sign (+) in the next option. O ––bytes=[+]n[u] –c [+]n[u] Counts by bytes (characters) instead of lines. The n argument is an integer that specifies the number of bytes. Thus the command tail –c 5 displays the last five bytes of a file. Under Linux only, the u is an optional multiplicative suffix as described on page 717, except that tail uses a lowercase k for kilobyte (1,024byte blocks). If you include a multiplicative suffix, tail counts by this unit instead of by bytes. If you put a plus sign (+) in front of n, tail counts from the start of the file instead of the end. The tail utility still displays characters through the end of the file, even though it starts counting from the beginning. Thus tail –c +5 causes tail to display from the fifth character through the end of the file. ––follow –f After copying the last line of the file, tail enters an endless loop, waiting and copying additional lines from the file if the file grows. If you specify multiple files in file-list with this option, tail includes a new header each time it displays output from a different file so you know which file is being added to. This option is useful for tracking the progress of a process that is running in the background and sending its output to a file. The tail utility continues to wait indefinitely, so you must use the interrupt key to terminate it. See also the –s option. tail Displays the last part (tail) of a file 966 tail ––lines=[+]n[u] –n [+]n[u] Counts by lines (the default). The n argument is an integer that specifies the number of lines. Under Linux, the u is an optional unit of measure; see the –c (––bytes) option for an explanation of its use. Although it is not documented, you can use ±n to specify a number of lines without using this option. If you put a plus sign (+) in front of n, tail counts from the start of the file instead of the end. The tail utility still displays lines through the end of the file, even though it starts counting from the beginning. Thus tail –n +5 causes tail to display from the fifth line through the last line of the file. ––quiet –q Suppresses header information when you specify multiple files in file-list. L ––sleep-interval=n –s n When used with –f, causes tail to sleep for n seconds between checks for additional output. L Notes The tail utility displays the last ten lines of its input by default. Examples The examples are based on the eleven file: $ cat eleven line one line two line three line four line five line six line seven line eight line nine line ten line eleven First tail displays the last ten lines of the eleven file (no options): $ tail eleven line two line three line four line five line six line seven line eight line nine line ten line eleven Next it displays the last three lines (–n 3 or ––lines 3) of the file: tail 967 $ tail -n 3 eleven line nine line ten line eleven The following example displays the file starting at line 8 (+8): $ tail -n +8 eleven line eight line nine line ten line eleven The next example displays the last six characters in the file (–c 6 or ––bytes 6). Only five characters are evident (leven); the sixth is a NEWLINE. $ tail -c 6 eleven leven Monitor output The final example demonstrates the –f option. Here tail monitors the output of a make command, which is being sent to the file accounts.out: $ make accounts > accounts.out & $ tail -f accounts.out cc -c trans.c cc -c reports.c ... CONTROL-C $ In the preceding example, running tail with –f displays the same information as running make in the foreground and not redirecting its output to a file. However, using tail offers some advantages. First, the output of make is saved in a file. (The output would not be saved if you did not redirect its output.) Second, if you decide to do something else while make is running, you can kill tail and the screen will be free for you to use while make continues running in the background. When you are running a large job such as compiling a large program, you can use tail with the –f option to check on its progress periodically. 968 tar tar Stores or retrieves files to/from an archive file tar tar option [modifiers] [file-list] The tar (tape archive) utility creates, adds to, lists, and retrieves files from an archive file. Arguments The file-list is a list of pathnames of the files that tar archives or extracts. Options Use only one of the following options to indicate which action you want tar to take. You can alter the action of the option by following it with one or more modifiers. The Mac OS X version of tar accepts long options tip Options for tar preceded by a double hyphen (––) work under Mac OS X as well as under Linux. ––create –c Creates an archive. This option stores the files named in file-list in a new archive. If the archive already exists, tar destroys it before creating the new archive. If a file-list argument is a directory, tar copies the directory hierarchy into the archive. Without the ––file option, tar writes the archive to standard output. ––compare or ––diff –d Compares an archive with the corresponding disk files and reports on the differences. ––help Displays a list of options and modifiers, along with short descriptions of each. ––append –r Writes the files named in file-list to the end of the archive. This option leaves files that are already in the archive intact, so duplicate copies of files might appear in the archive. When tar extracts the files, the last version of a file read from the archive is the one that ends up on the disk. ––list –t (table of contents) Without a file-list, this option produces a table of contents listing all files in an archive. With a file-list, it displays the name of each file in the file-list each time it occurs in the archive. You can use this option with the ––verbose option to display detailed information about files in an archive. ––update –u Adds the files from file-list if they are not already in the archive or if they have been modified since they were last written to the archive. Because of the additional checking required, tar runs more slowly when you specify this option. ––extract –x Extracts file-list from the archive and writes it to the disk, overwriting existing files with the same names. Without a file-list, this option extracts all files from the archive. If the file-list includes a directory, tar extracts the directory hierarchy. The tar utility attempts to keep the owner, modification time, and access privileges the same as those of the original file. If tar reads the same file more than once, the last version read will appear on the disk when tar is finished. tar 969 Modifiers ––blocking-factor n –b n Uses n as the blocking factor for creating an archive. Use this option only when tar is creating an archive directly to removable media. (When tar reads an archive, it automatically determines the blocking factor.) The value of n is the number of 512-byte blocks to write as a single block on the removable medium. ––directory dir –C dir Changes the working directory to dir before processing. ––checkpoint Displays periodic messages. This option lets you know tar is running without forcing you to view all the messages displayed by ––verbose. ––exclude=file Does not process the file named file. If file is a directory, no files or directories in that directory hierarchy are processed. The file can be an ambiguous file reference; quote special characters as needed. ––file filename –f filename Uses filename as the name of the file the archive is created in or extracted from. The filename can be the name of an ordinary file or a device (e.g., a DVD or USB flash drive). You can use a hyphen (–) instead of the filename to refer to standard input when creating an archive and to standard output when extracting files from an archive. The following two commands are equivalent ways of creating a compressed archive of the files in the /home directory hierarchy on /dev/sde1: $ tar -zcf /dev/sde1 /home $ tar -cf - /home | gzip > /dev/sde1 ––dereference –h For each file that is a symbolic link, archives the file the link points to, not the symbolic link itself. See page 115 for information on dereferencing symbolic links. ––ignore-failed-read When creating an archive, tar normally quits with a nonzero exit status if any of the files in file-list is unreadable. This option causes tar to continue processing, skipping unreadable files. ––bzip2 –j Uses bzip2 (pages 64 and 732) to compress/decompress files when creating an archive and extracting files from an archive. ––tape–length n –L n Asks for a new medium after writing n *1,024 bytes to the current medium. This feature is useful when you are building archives that are too big to fit on a single USB flash drive, partition, DVD, or other storage device. ––touch –m Sets the modification time of the extracted files to the time of extraction. Without this option tar attempts to maintain the modification time of the original file. 970 tar ––one-file-system When a directory name appears in file-list while it is creating an archive, tar recursively processes the files and directories in the named directory hierarchy. With this option tar stays in the filesystem that contains the named directory and does not process directories in other filesystems. Under OS X, you can use –l (lowercase “l”) in place of ––one-file-system. Under Linux, –l is used for a different purpose. ––absolute-names –P The default behavior of tar is to force all pathnames to be relative by stripping leading slashes from them. This option disables this feature, so absolute pathnames remain absolute. ––sparse –S Linux allows you to create sparse files (large, mostly empty files). The empty sections of sparse files do not take up disk space. When tar extracts a sparse file from an archive, it normally expands the file to its full size. As a result, when you restore a sparse file from a tar backup, the file takes up its full space and might no longer fit in the same disk space as the original. This option causes tar to handle sparse files efficiently so they do not take up unnecessary space either in the archive or when they are extracted. ––verbose –v Lists each file as tar reads or writes it. When combined with the –t option, –v causes tar to display a more detailed listing of the files in the archive, showing their ownership, permissions, size, and other information. ––interactive –w Asks you for confirmation before reading or writing each file. Respond with y if you want tar to take the action. Any other response causes tar not to take the action. ––exclude-from filename –X filename Similar to the ––exclude option, except filename specifies a file that contains a list of files to exclude from processing. Each file listed in filename must appear on a separate line. ––compress or –Z Uses compress when creating an archive and uncompress when extracting files ––uncompress from an archive. ––gzip or ––gunzip Notes –z Uses gzip when creating an archive and gunzip when extracting files from an archive. This option also works to extract files from archives that have been compressed with the compress utility. The ––help option displays all tar options and modifiers; the ––usage option provides a brief summary of the same information. The info page on tar provides extensive information, including a tutorial for this utility. You can use ambiguous file references in file-list when you create an archive but not when you extract files from an archive. The name of a directory file within the file-list references the directory hierarchy (all files and directories within that directory). tar 971 The file that tar sends its output to by default is compilation specific; typically it goes to standard output. Use the –f option to specify a different filename or device to hold the archive. When you create an archive using a simple filename in file-list, the file appears in the working directory when you extract it. If you use a relative pathname when you create an archive, the file appears with that relative pathname, starting from the working directory when you extract it. If you use the –P option and an absolute pathname when you create an archive, and if you use the –P option when you extract files from the archive, tar extracts the file with the same pathname and might overwrite the original files. See page 697 for an example of using ssh with tar to create an archive file on a remote system of the contents of the working directory hierarchy. Leading hyphens The tar utility does not require leading hyphens on options and modifiers. However, it behaves slightly differently with regard to the order of options and modifiers it requires with and without the hyphen. You can specify one or more modifiers following an option. With a leading hyphen, the following tar command generates an error: $ tar -cbf 10 /dev/sde1 memos tar: f: Invalid blocking factor Try 'tar --help' or 'tar --usage' for more information. The error occurs because the –b modifier takes an argument but is not the last modifier in a group. The same command works correctly if you omit the leading hyphen. You must use leading hyphens if you separate options: $ tar -cb 10 -f /dev/sde1 memos Examples The following example makes a copy of the /home/max directory hierarchy on a USB flash drive mounted at /dev/sde1. The v modifier causes the command to list the files it writes to the device. This command erases anything that was already on the device. The message from tar explains that the default action is to store all pathnames as relative paths instead of absolute paths, thereby allowing you to extract the files into a different directory on the disk. $ tar -cvf /dev/sde1 /home/max tar: Removing leading '/' from member names. /home/max/ /home/max/.bash_history /home/max/.bash_profile ... In the next example, the same directory is saved on the device mounted on /dev/sde1 with a blocking factor of 100. Without the v modifier, tar does not display the list of files it is writing to the device. The command runs in the background and displays any messages after the shell issues a new prompt. 972 tar $ tar -cb 100 -f /dev/sde1 /home/max & [1] 4298 $ tar: Removing leading '/' from member names. The next command displays the table of contents of the archive on the device mounted on /dev/sde1: $ tar -tvf drwxrwxrwx -rw-r--r--rw-r--r-drwx------rw------... /dev/sde1 max/group max/group max/group max/group max/group 0 678 571 0 2799 Jun 30 21:39 2012 home/max/ Aug 6 14:12 2013 home/max/.bash_history Aug 6 14:06 2013 home/max/.bash_profile Nov 6 22:34 2013 home/max/mail/ Nov 6 22:34 2013 home/max/mail/sent-mail Next, Max creates a gzipped tar archive in /tmp/max.tgz. This approach is a popular way to bundle files that you want to transfer over a network or otherwise share with others. Ending a filename with .tgz is one convention for identifying gzipped tar archives. Another convention is to end the filename with .tar.gz. $ tar -czf /tmp/max.tgz literature The final command lists the files in the compressed archive max.tgz: $ tar -tzvf /tmp/max.tgz ... tee 973 tee tee Copies standard input to standard output and one or more files tee [options] file-list The tee utility copies standard input to standard output and to one or more files. Arguments The file-list is a list of the pathnames of files that receive output from tee. If a file in file-list does not exist, tee creates it. Options Options preceded by a double hyphen (––) work under Linux only. Options named with a single letter and preceded by a single hyphen work under Linux and OS X. Without any options, tee overwrites the output files if they exist and responds to interrupts. ––append –a Appends output to existing files rather than overwriting them. –i Causes tee not to respond to the SIGINT interrupt. O ––ignore-interrupts –i Causes tee not to respond to interrupts. L Examples In the following example, a pipeline sends the output from make to tee, which copies that information to standard output and to the file accounts.out. The copy that goes to standard output appears on the screen. The cat utility displays the copy that was sent to the file. $ make accounts | tee accounts.out cc -c trans.c cc -c reports.c ... $ cat accounts.out cc -c trans.c cc -c reports.c ... Refer to page 967 for a similar example that uses tail –f rather than tee. 974 telnet telnet telnet Connects to a remote computer over a network telnet [options] [remote-system] The telnet utility implements the TELNET protocol to connect to a remote computer over a network. telnet is not secure security The telnet utility is not secure. It sends your username and password over the network in cleartext, which is not a secure practice. Use ssh (pages 691 and 694) when it is available. Arguments Options The remote-system is the name or IP address of the remote system that telnet connects to. When you do not specify a remote-system, telnet works interactively and prompts you to enter one of the commands described in this section. –a Initiates automatic login (the default behavior under OS X). –e c (escape) Changes the escape character from CONTROL-] to the character c. –K Prevents automatic login. This option is available under OS X and some Linux distributions (it is the default behavior under Linux). –l username Attempts an automatic login on the remote system using username. If the remote system understands how to handle automatic login with telnet, it prompts for a password. Discussion After telnet connects to a remote system, you can put telnet into command mode by typing the escape character (usually CONTROL-]). The remote system typically reports the escape character it recognizes. To leave command mode, type RETURN on a line by itself. In command mode, telnet displays the telnet> prompt. You can use the following commands in command mode: ? (help) Displays a list of commands recognized by the telnet utility on the local system. close Closes the connection to the remote system. If you specified the name of a system on the command line when you started telnet, close has the same effect as quit: The telnet program quits, and the shell displays a prompt. If you used the open command instead of specifying a remote system on the command line, close returns telnet to command mode. logout Logs you out of the remote system; similar to close. telnet 975 open remote-computer If you did not specify a remote system on the command line or if the attempt to connect to the system failed, you can specify the name or IP address of a remote system interactively using this command. quit Quits the telnet session. z Suspends the telnet session. When you suspend a session, you return to the login shell on the local system. To resume the suspended telnet session, type fg (page 147) at a shell prompt. Notes Under Linux, telnet does not attempt to log in automatically. Under OS X, telnet attempts to log in automatically. When telnet attempts to log in automatically, it uses your username on the local system unless you specify a different name with the –l option. The telnet utility (telnet package), a user interface to the TELNET protocol, is older than ssh and is not secure. Nevertheless, it might work where ssh (page 694) is not available (there is more non-UNIX support for TELNET access than for ssh access). In addition, some legacy devices, such as terminal servers, facilities infrastructure, and network devices, still do not support ssh. telnet versus ssh When you connect to a remote UNIX or Linux system, telnet presents a textual login: prompt. Because telnet is designed to work with non-UNIX and non-Linux systems, it does not assume your remote username is the same as your local username (ssh does make this assumption). In some cases, telnet requires no login credentials. In addition, telnet allows you to configure special parameters, such as how RETURNs or interrupts are processed (ssh does not give you this option). When using telnet between UNIX and/or Linux systems, you rarely need to change any parameters. Examples In the following example, Sam connects to the remote system named plum. After running a few commands on plum, he types CONTROL-] to escape to command mode and types help to display a list of telnet commands. Next he uses the z command to suspend the telnet session so he can run a few commands on the local system. He then gives an fg command to the shell to resume using telnet; he has to press RETURN to display the prompt on plum. The logout command on the remote system ends the telnet session, and the local shell displays a prompt. [sam@guava ~]$ telnet plum Trying 172.16.192.151... Connected to plum. Escape character is '^]'. Fedora release 16 (Verne) Kernel 3.3.0-4.fc16.i686 on an i686 (1) login: sam Password: Last login: Tue Apr 10 10:28:19 from guava ... 976 telnet [sam@plum ~]$ CONTROL-] telnet> help Commands may be abbreviated. close logout display mode ... telnet> z Commands are: close current connection forcibly logout remote user and close the connection display operating parameters try to enter line or character mode ('mode ?' for more) [1]+ Stopped ... [sam@guava ~]$ fg telnet plum telnet plum RETURN [sam@plum ~]$ logout Connection closed by foreign host. [sam@guava ~]$ Using telnet to connect to other ports By default telnet connects to port 23, which is used for remote logins. However, you can use telnet to connect to other services by specifying a port number. In addition to standard services, many of the special remote services available on the Internet use unallocated port numbers. Unlike the port numbers for standard protocols, these port numbers can be picked arbitrarily by the administrator of the service. Although telnet is no longer commonly employed to log in on remote systems, it is still used extensively as a debugging tool by allowing you to communicate directly with a TCP server. Some standard protocols are simple enough that an experienced user can debug problems by connecting to a remote service directly using telnet. If you are having a problem with a network server, a good first step is to try to connect to it using telnet. If you use telnet to connect to port 25 on a host, you can interact with SMTP. In addition, port 110 connects to the POP protocol, port 80 connects with a WWW server, and port 143 connects to IMAP. All of these are ASCII protocols and are documented in RFCs (page 1090). You can read the RFCs or search the Web for examples of how to use them interactively. In the following example, a system administrator who is debugging a problem with email delivery uses telnet to connect to the SMTP port (port 25) on the server at example.com to see why it is bouncing mail from the spammer.com domain. The first line of output indicates which IP address telnet is trying to connect to. After telnet displays the Connected to smtpsrv.example.com message, the user emulates an SMTP dialog, following the standard SMTP protocol. The first line, which starts with helo, begins the session and identifies the local system. After the SMTP server identifies itself, the user enters a line that identifies the mail sender as user@spammer.com. The SMTP server’s response explains why the message is bouncing, so the user ends the session with quit. telnet 977 $ telnet smtpsrv 25 Trying 192.168.1.1... Connected to smtpsrv.example.com. Escape character is '^]'. helo example.com 220 smtpsrv.example.com ESMTP Sendmail 8.13.1/8.13.1; Wed, 2 May 2012 00:13:43 -0500 (CDT) 250 smtpsrv.example.com Hello desktop.example.com [192.168.1.97], pleased to meet you mail from:user@spammer.com 571 5.0.0 Domain banned for spamming quit 221 2.0.0 smtpsrv.example.com closing connection The telnet utility allows you to use any protocol you want, as long as you know it well enough to type commands manually. 978 test test Evaluates an expression test test expression [ expression ] The test utility evaluates an expression and returns a condition code indicating the expression is either true (0) or false (not 0). You can place brackets ([]) around the expression instead of using the word test (second syntax). Arguments The expression contains one or more criteria (see the following list) that test evaluates. A –a separating two criteria is a Boolean AND operator: Both criteria must be true for test to return a condition code of true. A –o is a Boolean OR operator. When –o separates two criteria, one or the other (or both) of the criteria must be true for test to return a condition code of true. You can negate any criterion by preceding it with an exclamation point (!). You can group criteria using parentheses. If there are no parentheses, –a takes precedence over –o, and test evaluates operators of equal precedence from left to right. Within the expression you must quote special characters, such as parentheses, so the shell does not interpret them but rather passes them to test unchanged. Because each element, such as a criterion, string, or variable within the expression, is a separate argument, you must separate each element from other elements using a SPACE. Table VI-32 lists the criteria you can use within the expression. Table VI-33 on page 980 lists test’s relational operators. Table VI-32 Criteria Criterion Meaning string True if string has a length greater than zero. L True if string is not a null string. O –n string True if string has a length greater than zero. –z string True if string has a length of zero. string1 = string2 True if string1 is equal to string2. string1 != string2 True if string1 is not equal to string2. int1 relop int2 True if integer int1 has the specified algebraic relationship to integer int2. The relop is a relational operator from Table VI-33 on page 980. As a special case, –l string, which returns the length of string, may be used for int1 or int2. test Table VI-32 979 Criteria (continued) Criterion Meaning file1 –ef file2 True if file1 and file2 have the same device and inode numbers. file1 –nt file2 True if file1 was modified after file2 (the modification time of file1 is newer than that of file2 ). file1 –ot file2 True if file1 was modified before file2 (the modification time of file1 is older than that of file2 ). –b filename True if the file named filename exists and is a block special file. –c filename True if the file named filename exists and is a character special file. –d filename True if the file named filename exists and is a directory. –e filename True if the file named filename exists. –f filename True if the file named filename exists and is an ordinary file. –g filename True if the file named filename exists and its setgid bit (page 101) is set. –G filename True if the file named filename exists and is associated with the group that is the primary group of the user running the command (same effective group ID). –k filename True if the file named filename exists and its sticky bit (page 1095) is set. –L filename True if the file named filename exists and is a symbolic link. –O filename True if the file named filename exists and is owned by the user running the command (same effective user ID). –p filename True if the file named filename exists and is a named pipe. –r filename True if the file named filename exists and the user running the command has read permission for it. –s filename True if the file named filename exists and contains information (has a size greater than 0 bytes). –t file-descriptor True if file-descriptor is open and associated with the screen or keyboard. The file-descriptor for standard input is 0, for standard output is 1, and for standard error is 2. See page 456 for more information. –u filename True if the file named filename exists and its setuid bit (page 101) is set. –w filename True if the file named filename exists and you have write permission for it. –x filename True if the file named filename exists and the user running test has execute/search permission for it. 980 test Table VI-33 Relational operators Relational operator Meaning –eq Equal to –ge Greater than or equal to –gt Greater than –le Less than or equal to –lt Less than –ne Not equal to Notes The test command is built into the Bourne Again and TC Shells. Examples See page 456 for examples that use the –t criterion to check whether a file descriptor from the process running test is associated with the terminal. The following examples demonstrate the use of the test utility in Bourne Again Shell scripts. Although test works from a command line, it is more commonly employed in shell scripts to test input or verify access to a file. The first example prompts the user, reads a line of input into a variable, and uses the synonym for test, [], to see whether the user entered yes: $ cat user_in read -p "Input yes or no: " user_input if [ "$user_input" = "yes" ] then echo "You input yes." fi The next example prompts for a filename and then uses the synonym for test, [], to see whether the user has read access permission (–r) for the file and (–a) whether the file contains information (–s): $ cat validate read -p "Enter filename: " filename if [ -r "$filename" -a -s "$filename" ] then echo "File $filename exists and contains information." echo "You have read access permission to the file." fi top 981 top top Dynamically displays process status top [options] The top utility displays information about the status of the local system, including information about current processes. Options Although top does not require the use of hyphens with options, it is a good idea to include them for clarity and consistency with other utilities. You can cause top to run as though you had specified any of the options by giving commands to the utility while it is running. See the “Discussion” section for more information. –ca Causes top to run in accumulative mode. In this mode, times and events are counted cumulatively since top started. See the top man page if you want to use the –c option to run top in another mode. O –d ss.tt (delay) Specifies ss.tt as the number of seconds and tenths of seconds of delay from one display update to the next. The default is 3 seconds. L –i Ignores idle and zombie processes (processes without a parent). L –n n (number) Specifies the number of iterations: top updates the display n times and exits. L –p n (PID) Monitors the process with a PID of n. You can use this option up to 20 times on a command line or specify n as a comma-separated list of up to 20 PID numbers. L –S (sum) Causes top to run in cumulative mode. In cumulative mode, the CPU times reported for processes include forked processes and CPU times accumulated by child processes that are now dead. L –s (secure) Runs top in secure mode, restricting the commands you can use while top is running to those that pose less of a security risk. L –s ss (seconds) Specifies ss as the number of seconds of delay from one display update to the next. The default is 1 second. O Discussion The first few lines top displays summarize the status of the local system. You can turn each of these lines on or off with the toggle switches (interactive command 982 top keys) specified in the following descriptions. The first line is the same as the output of the uptime utility (page 72) and shows the current time, the amount of time the local system has been running since it was last booted, the number of users logged in, and the load averages from the last 1, 5, and 15 minutes (toggle l [lowercase “l”]). The second line indicates the number of running processes (toggle t). Next is one or more lines, one for each CPU/core (also toggle t) followed by a line for memory (toggle m) and one for swap space (also toggle m). The rest of the display reports on individual processes, which are listed in order by descending CPU usage (i.e., the most CPU-intensive process is listed first). By default top displays the number of processes that fit on the screen. Table VI-34 describes the meanings of the fields displayed for each process. Table VI-34 Field names Name Meaning PID Process identification number USER Username of the owner of the process PR Priority of the process NI nice value (page 892) VIRT Number of kilobytes of virtual memory used by the process RES Number of kilobytes of physical (nonswapped) memory used by the process SHR Number of kilobytes of shared memory used by the process S Status of the process (see STAT on page 923) %CPU Percentage of the total CPU time the process is using %MEM Percentage of physical memory the process is using TIME[+] Total CPU time used by the process COMMAND Command line that started the process or name of the program (toggle with c) While top is running, you can issue the following commands to modify its behavior: h (help) Displays a summary of the commands you can use while top is running. L ? (help) Displays a summary of the commands you can use while top is running. O k (kill) Allows you to kill a process. Unless you are working with root privileges, you can kill only processes you own. When you use this command, top top 983 prompts you for the PID of the process and the signal to send to the process. You can enter either a signal number or a name. (See Table 10-5 on page 483 for a list of signals.) This command is disabled when you are working in secure mode. L n (number) When you give this command, top asks you to enter the number of processes you want it to display. If you enter 0 (the default), top shows as many processes as fit on the screen. q (quit) Terminates top. r (renice) Changes the priority of a running process (refer to renice on page 925). Unless you are working with root privileges, you can change the priority of only your own processes and only to lower the priority by entering a positive value. A user working with root privileges can enter a negative value, increasing the priority of the process. This command is disabled when you are working in secure mode. L S (sum) Toggles top between cumulative mode and regular mode. See the –S option for details. L s (seconds) Prompts for the number of seconds to delay between updates to the display (3 is the default). You may enter an integer, a fraction, or 0 (for continuous updates). Under Linux, this command is disabled when you are working in secure mode. W (write) Writes top’s current configuration to your personal configuration file (~/.toprc). L SPACE Refreshes the screen. Notes The Linux and OS X versions of top are very different. Although it applies to both versions, this coverage of top is oriented toward the Linux version. Refer to the OS X top man page for more information about the OS X version. The top utility is similar to ps but periodically updates the display, enabling you to monitor the behavior of the local system over time. This utility shows only as much of the command line for each process as fits on a line. If a process is swapped out, top replaces the command line with the name of the command in parentheses. Under Linux, the top utility uses the proc filesystem. When proc is not mounted, top does not work. Requesting continuous updates is almost always a mistake. The display is updated too quickly and the system load increases dramatically. 984 top Example The following display is the result of a typical execution of top on a system that has a CPU with four cores: top - 15:58:38 up 8 days, 5:25, 1 user, load average: 0.54, 0.70, 0.71 Tasks: 295 total, 1 running, 293 sleeping, 0 stopped, 1 zombie Cpu0 : 2.0%us, 2.3%sy, 0.0%ni, 95.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 3.5%us, 5.4%sy, 0.0%ni, 90.8%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu2 : 7.1%us, 1.0%sy, 0.0%ni, 91.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 5.4%us, 10.9%sy, 0.0%ni, 83.4%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st Mem: 16466476k total, 16275772k used, 190704k free, 370208k buffers Swap: 58589160k total, 108k used, 58589052k free, 12858064k cached PID 1530 942 3354 19166 19126 7867 7919 12234 21269 3617 4867 5223 21277 1 2 3 ... USER root zach zach zach zach zach zach zach zach zach zach zach root root root root PR 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 RT NI 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 VIRT 267m 3937m 790m 586m 1178m 2567m 2635m 2692m 19356 762m 463m 474m 0 23844 0 0 RES 175m 3.2g 86m 122m 658m 1.6g 2.1g 1.4g 1564 108m 40m 216m 0 2000 0 0 SHR 38m 3.0g 34m 28m 33m 1.5g 2.0g 1.4g 1064 21m 16m 22m 0 1220 0 0 S %CPU %MEM TIME+ COMMAND S 12 1.1 216:19.29 Xorg S 7 20.4 171:19.46 vmware-vmx S 7 0.5 321:18.92 kwin S 3 0.8 4:13.05 plugin-containe S 1 4.1 13:56.39 firefox S 1 10.0 18:15.98 vmware-vmx S 1 13.2 43:57.29 vmware-vmx S 1 9.1 14:09.21 vmware-vmx R 1 0.0 0:00.56 top S 0 0.7 9:59.94 plasma-desktop S 0 0.3 0:55.99 konsole S 0 1.3 21:35.01 vmware S 0 0.0 0:00.01 vmware-rtc S 0 0.0 0:01.21 init S 0 0.0 0:00.06 kthreadd S 0 0.0 0:00.02 migration/0 touch 985 Creates a file or changes a file’s access and/or modification time touch [options] file-list The touch utility changes the access and/or modification time of a file to the current time or a time you specify. You can also use touch to create a file. Arguments The file-list is a list of the pathnames of the files that touch will create or update. Options Under Linux, touch accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. Without any options, touch changes the access and modification times to the current time. When you do not specify the –c (––no-create) option, touch creates files that do not exist. –a Updates the access time only, leaving the modification time unchanged. ––no–create –c Does not create files that do not exist. ––date=datestring –d datestring Updates times using the date specified by datestring. Most familiar formats are permitted for datestring. Components of the date and time not included in datestring are assumed to be the current date and time. This option may not be used with –t. L –m Updates the modification time only, leaving the access time unchanged. ––reference=file –r file Updates times with the times of file. –t [[cc]yy]nnddhhmm[.ss] Changes times to the date specified by the argument. The nn argument is the number of the month (01–12), dd is the day of the month (01–31), hh is the hour based on a 24-hour clock (00–23), and mm is the minutes (00–59). You must specify at least these fields. You can specify the number of seconds past the start of the minute with .ss. The optional cc specifies the first two digits of the year (the value of the century minus 1), and yy specifies the last two digits of the year. When you do not specify a year, touch assumes the current year. When you do not specify cc, touch assumes 20 for yy in the range 0–68 and 19 for yy in the range 69–99. This option may not be used with –d. touch touch 986 touch Examples The first three commands show touch updating an existing file. The ls utility with the –l option displays the modification time of the file. The last three commands show touch creating a file. $ ls -l program.c -rw-r--r--. 1 sam pubs 17481 03-13 16:22 program.c $ touch program.c $ ls -l program.c -rw-r--r--. 1 sam pubs 17481 05-02 11:30 program.c $ ls -l read.c ls: cannot access read.c: No such file or directory $ touch read.c $ ls -l read.c -rw-r--r--. 1 sam pubs 0 05-02 11:31 read.c The first of the following ls commands displays the file modification times; the second ls (with the –lu options) displays the file access times: $ ls -l -rw-r--r--. 1 sam pubs 466 01-10 19:44 cases -rw-r--r--. 1 sam pubs 1398 04-18 04:24 excerpts $ ls -lu -rw-r--r--. 1 sam pubs 466 05-02 11:34 cases -rw-r--r--. 1 sam pubs 1398 05-02 11:34 excerpts The next example works on the two files shown above and demonstrates the use of the –a option to change the access time only and the –t option to specify a date for touch to use instead of the current date and time. After the touch command is executed, ls shows that the access times of the files cases and excerpts have been updated but the modification times remain the same. $ touch -at 02040608 cases excerpts $ ls -l -rw-r--r--. 1 sam pubs 466 01-10 19:44 cases -rw-r--r--. 1 sam pubs 1398 04-18 04:24 excerpts $ ls -lu -rw-r--r--. 1 sam pubs 466 02-04 06:08 cases -rw-r--r--. 1 sam pubs 1398 02-04 06:08 excerpts tr 987 tr Replaces specified characters tr tr [options] string1 [string2] The tr utility reads standard input and, for each input character, either maps it to an alternate character, deletes the character, or leaves the character as is. This utility reads from standard input and writes to standard output. Arguments The tr utility is typically used with two arguments, string1 and string2. The position of each character in the two strings is important: Each time tr finds a character from string1 in its input, it replaces that character with the corresponding character from string2. With one argument, string1, and the –d (––delete) option, tr deletes the characters specified in string1. The option –s (––squeeze-repeats) replaces multiple sequential occurrences of characters in string1 with single occurrences (for example, abbc becomes abc). Ranges A range of characters is similar in function to a character class within a regular expression (page 1013). GNU tr does not support ranges (character classes) enclosed within brackets. You can specify a range of characters by following the character that appears earlier in the collating sequence with a hyphen and the character that comes later in the collating sequence. For example, 1–6 expands to 123456. Although the range A–Z expands as you would expect in ASCII, this approach does not work when you use the EBCDIC collating sequence, as these characters are not sequential in EBCDIC. See “Character Classes” for a solution to this issue. Character Classes A tr character class is not the same as the character class described elsewhere in this book. (GNU documentation uses the term list operator for what this book calls a character class.) You specify a character class as '[:class:]', where class is one of the character classes from Table VI-35. You must specify a character class in string1 (and not string2) unless you are performing case conversion (see the “Examples” section) or you use the –d and –s options together. Table VI-35 Character classes Class Meaning alnum Letters and digits alpha Letters blank Whitespace 988 tr Table VI-35 Options Character classes (continued) Class Meaning cntrl CONTROL characters digit Digits graph Printable characters but not SPACEs lower Lowercase letters print Printable characters including SPACEs punct Punctuation characters space Horizontal or vertical whitespace upper Uppercase letters xdigit Hexadecimal digits Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––complement –c Complements string1, causing tr to match all characters except those in string1. ––delete –d Deletes characters that match those specified in string1. If you use this option with the –s (– –squeeze-repeats) option, you must specify both string1 and string2 (see “Notes”). ––help Summarizes how to use tr, including the special symbols you can use in string1 and string2. L ––squeeze-repeats –s Replaces multiple sequential occurrences of a character in string1 with a single occurrence of the character when you call tr with only one string argument. If you use both string1 and string2, the tr utility first translates the characters in string1 to those in string2; it then replaces multiple sequential occurrences of a character in string2 with a single occurrence of the character. ––truncate-set1 Notes –t Truncates string1 so it is the same length as string2 before processing input. L When string1 is longer than string2, the initial portion of string1 (equal in length to string2) is used in the translation. When string1 is shorter than string2, tr repeats the last character of string1 to extend string1 to the length of string2. In this case tr departs from the POSIX standard, which does not define a result. If you use the –d (––delete) and –s (––squeeze-repeats) options at the same time, tr first deletes the characters in string1 and then replaces multiple sequential occurrences of a character in string2 with a single occurrence of the character. tr 989 Examples You can use a hyphen to represent a range of characters in string1 or string2. The two command lines in the following example produce the same result: $ echo abcdef | tr xyzabc $ echo abcdef | tr xyzabc 'abcdef' 'xyzabc' 'a-f' 'x-za-c' The next example demonstrates a popular method for disguising text, often called ROT13 (rotate 13) because it replaces the first letter of the alphabet with the thirteenth, the second with the fourteenth, and so forth. The first line ends with a pipe symbol that implicitly continues the line (see the optional section on page 144) and causes bash to start the next line with a secondary prompt (page 311). $ echo The punchline of the joke is ... | > tr 'A-M N-Z a-m n-z' 'N-Z A-M n-z a-m' Gur chapuyvar bs gur wbxr vf ... To make the text intelligible again, reverse the order of the arguments to tr: $ echo Gur chapuyvar bs gur wbxr vf ... | > tr 'N-Z A-M n-z a-m' 'A-M N-Z a-m n-z' The punchline of the joke is ... The ––delete option causes tr to delete selected characters: $ echo If you can read this, you can spot the missing vowels! | > tr --delete 'aeiou' If y cn rd ths, y cn spt th mssng vwls! In the following example, tr replaces characters and reduces pairs of identical characters to single characters: $ echo tennessee | tr -s 'tnse' 'srne' serene The next example replaces each sequence of nonalphabetic characters (the complement of all the alphabetic characters as specified by the character class alpha) in the file draft1 with a single NEWLINE character. The output is a list of words, one per line. $ tr -c -s '[:alpha:]' '\n' < draft1 The next example uses character classes to upshift the string hi there: $ echo hi there | tr '[:lower:]' '[:upper:]' HI THERE 990 tty tty Displays the terminal pathname tty tty [option] The tty utility displays the pathname of standard input if it is a terminal and displays not a tty if it is not a terminal. The exit status is 0 if standard input is a terminal and 1 if it is not. Options ––silent or ––quiet Under Linux, tty accepts the common options described on page 718. The option preceded by a double hyphen (––) works under Linux only. The option named with a single letter and preceded by a single hyphen works under Linux and OS X. –s Causes tty not to print anything. The tty utility sets its exit status. See “Examples” for another way to determine if standard input is coming from a terminal. Notes The term tty is short for teletypewriter, the terminal device on which UNIX was first run. This command appears in UNIX, and Linux has kept it for the sake of consistency and tradition. Examples The following example illustrates the use of tty: $ tty /dev/pts/11 $ echo $? 0 $ tty < memo not a tty $ echo $? 1 You can use test (or [ ]; page 978) with the –t option in place of tty with the –s option to determine if file descriptor 0 (normally standard input) is associated with the terminal. $ $ 0 $ $ 1 [ -t 0 ] echo $? [ -t 0 ] < /dev/null echo $? See “Determining Whether a File Descriptor Is Associated with the Terminal” on page 456 for more information on using test with the –t option. tune2fs 991 Changes parameters on an ext2, ext3, or ext4 filesystem tune2fs [options] device The tune2fs utility displays and modifies filesystem parameters on ext2, ext3, and ext4 filesystems. This utility can also set up journaling on an ext2 filesystem, turning it into an ext3 filesystem. With typical filesystem permissions, tune2fs must be run by a user working with root privileges. The tune2fs utility is available under Linux only. L Arguments The device is the name of the device, such as /dev/sda8, that holds the filesystem whose parameters you want to display or modify. Options –C n (count) Sets the number of times the filesystem has been mounted without being checked to n. This option is useful for staggering filesystem checks (see “Discussion”) and for forcing a check the next time the system boots. –c n (max count) Sets the maximum number of times the filesystem can be mounted between filesystem checks to n. Set n to 0 (zero) to disregard this parameter. –e behavior (error) Specifies what the kernel will do when it detects an error. Set behavior to continue (continues execution), remount-ro (remounts the filesystem readonly), or panic (causes a kernel panic). Regardless of how you set this option, an error will cause fsck to check the filesystem the next time the system boots. –i n[u] (interval) Sets the maximum time between filesystem checks to n time periods. Without u or with u set to d, the time period is days. Set u to w to set the time period to weeks; use m for months. Set n to 0 (zero) to disregard this parameter. Because a filesystem check is forced only when the system is booted, the time specified by this option might be exceeded. –j (journal) Adds an ext3 journal to an ext2 filesystem. For more information on journaling filesystems, see page 1076. –l (list) Lists information about the filesystem. –T date (time) Sets the time the filesystem was last checked to date. The date is the time and date in the format yyyynndd[hh[mm]ss]]]. Here yyyy is the year, nn is the number of the month (01–12), and dd is the day of the month (01–31). You must specify at least these fields. The hh is the hour based on a 24-hour clock (00–23), mm is the minutes (00–59), and .ss is the number of seconds past the start of the minute. You can also specify date as now. tune2fs tune2fs 992 tune2fs Discussion Checking a large filesystem can take a long time. When all filesystem checks occur at the same time, the system might boot slowly. Use the –C and/or –T options to stagger filesystem checks so they do not all happen at the same time. Examples Following is the output of tune2fs run with the –l option on a typical ext3 filesystem: # /sbin/tune2fs -l /dev/sda1 tune2fs 1.41.14 (22-Dec-2010) Filesystem volume name: Last mounted on: Filesystem UUID: b6d9714e-ed5d-45b8-8023-716a669c16d8 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file Filesystem flags: signed_directory_hash Default mount options: (none) Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 624624 Block count: 2498099 Reserved block count: 124904 Free blocks: 1868063 Free inodes: 509355 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 609 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8112 Inode blocks per group: 507 Filesystem created: Mon Apr 22 09:41:43 2013 Last mount time: Tue May 7 03:54:59 2013 Last write time: Tue May 7 03:54:59 2013 Mount count: 4 Maximum mount count: 31 Last checked: Mon Apr 22 09:41:43 2013 Check interval: 15552000 (6 months) Next check after: Sat Oct 26 09:41:43 2012 Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 First orphan inode: 308701 Default directory hash: half_md4 Directory Hash Seed: bceae349-a46f-4d45-a8f1-a21b1ae8d2bd Journal backup: inode blocks tune2fs 993 Next the administrator uses tune2fs to convert an ext2 filesystem to an ext3 journaling filesystem: # /sbin/tune2fs -j /dev/sda5 tune2fs 1.41.14 (22-Dec-2010) Creating journal inode: done This filesystem will be automatically checked every 30 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. 994 umask umask umask Specifies the file-creation permissions mask umask [option] [mask] The umask builtin specifies the mask the system uses to set up access permissions when it creates a file. This builtin works slightly differently in each of the shells. Arguments The mask can be a three- or four-digit octal number (bash and tcsh) or a symbolic value (bash) such as you would use with chmod (page 741). Without any arguments, umask displays the file-creation permissions mask. Option Discussion –S (symbolic) Displays the file-creation permissions mask symbolically. A mask that you specify using symbolic values specifies the permissions that are allowed. A mask that you specify using octal numbers specifies the permissions that are not allowed; the digits correspond to the permissions for the owner of the file, members of the group the file is associated with, and everyone else. Because the resulting file creation mask specifies the permissions that are not allowed, the system uses binary arithmetic to subtract each of the three digits from 7 when you create a file. If the file is an ordinary file (and not a directory file), the system then removes execute permissions from the file. The result is three or four octal numbers that specify the access permissions for the file (the numbers you would use with chmod). An octal value of 1 (001 binary) represents execute permission, 2 (010 binary) write permission, and 4 (100 binary) read permission (for a file). You must use binary or octal arithmetic when performing permissions calculations. To calculate file permissions given a umask value, subtract the umask value from octal 777. For example, assume a umask value of 003: 777 –003 774 111 664 Starting permissions for calculation Subtract umask value Resulting permissions for a directory Remove execute permissions Resulting permissions for an ordinary file To calculate permissions for a directory, umask subtracts the umask value from 777: In the example where umask has a value of 003, octal 7 minus octal 0 equals octal 7 (for two positions). Octal 7 minus octal 3 equals octal 4—or using binary arithmetic, umask 995 111 – 011 = 100. The result is that the system gives permissions of 774 (rwxrwxr––) to a directory. To calculate permissions for an ordinary file, the system changes the execute bit (001 binary) to 0 for each position. If the execute bit is not set, the system does not change the execute bit. In the example, removing the execute bit from octal 7 yields octal 6 (removing 001 from 111 yields 110; two positions). Octal 4 remains octal 4 because the execute bit is not set (the 001 bit is not set in 100, so it remains 100). The result is that the system gives permissions of 664 (rw–rw–r––) to an ordinary file. Notes Most utilities and applications do not attempt to create files with execute permissions, regardless of the value of mask; they assume you do not want an executable file. As a result, when a utility or application such as touch creates a file, the system subtracts each of the digits in mask from 6. An exception occurs with mkdir, which does assume that you want the execute (access in the case of a directory) bit set. See the “Examples” section. The umask program is a builtin in bash and tcsh and generally goes in the initialization file for your shell (~/.bash_profile [bash] or ~/.login [tcsh]). Under bash, the argument u=rwx,go=r turns off all bits in the mask for the owner and turns off the read bit in the mask for groups and other users (the mask is 0033), causing those bits to be on in file permissions (744 or 644). Refer to chmod on page 741 for more information about symbolic permissions. Examples The following commands set the file-creation mask and display the mask and its effect when you create a file and a directory. The mask of 022, when subtracted from 777, gives permissions of 755 (rwxr–xr–x) for a directory. For an ordinary file, the system subtracts execute permissions from 755, yielding permissions of 644 (rw–r––r––). $ umask 022 $ umask 0022 $ touch afile $ mkdir adirectory $ ls -ld afile adirectory drwxr-xr-x. 2 sam pubs 4096 12-31 12:42 adirectory -rw-r--r--. 1 sam pubs 0 12-31 12:42 afile The next example sets the same mask using symbolic values. The –S option displays the mask symbolically: $ umask u=rwx,g=rx,o=rx $ umask 0022 $ umask -S u=rwx,g=rx,o=rx 996 uniq uniq Displays unique lines from a file uniq uniq [options] [input-file] [output-file] The uniq utility displays its input, removing all but one copy of successive repeated lines. If the file has been sorted (see sort on page 942), uniq ensures that no two lines it displays are the same. Arguments When you do not specify the input-file, uniq reads from standard input. When you do not specify the output-file, uniq writes to standard output. Options Under Linux, uniq accepts the common options described on page 718. Options preceded by a double hyphen (– –) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. A field is a sequence of characters bounded by SPACE s, TAB s, NEWLINE s, or a combination of these characters. ––count ––repeated –c Precedes each line with the number of occurrences of the line in the input file. –d Displays one copy of lines that are repeated; does not display lines that are not repeated. ––skip-fields=nfield –f nfield Ignores the first nfield blank-separated fields of each line. The uniq utility bases its comparison on the remainder of the line, including the leading blanks of the next field on the line (see the –s [––skip-chars] option). ––ignore-case –i Ignores case when comparing lines. ––skip-chars=nchar –s nchar Ignores the first nchar characters of each line. If you also use the –f (––skip-fields) option, uniq ignores the first nfield fields followed by nchar characters. You can use this option to skip over leading blanks of a field. ––unique –u Displays only lines that are not repeated. ––check-chars=nchar –w nchar Compares up to nchars characters on a line after honoring the –f (––skip-fields) and –s (––skip-chars) options. By default uniq compares the entire line. L Examples These examples assume the file named test in the working directory contains the following text: uniq 997 $ cat test boy took bat home boy took bat home girl took bat home dog brought hat home dog brought hat home dog brought hat home Without any options, uniq displays only one copy of successive repeated lines: $ uniq test boy took bat home girl took bat home dog brought hat home The –c (––count) option displays the number of consecutive occurrences of each line in the file: $ uniq -c test 2 boy took bat home 1 girl took bat home 3 dog brought hat home The –d (––repeated) option displays only lines that are consecutively repeated in the file: $ uniq -d test boy took bat home dog brought hat home The –u (––unique) option displays only lines that are not consecutively repeated in the file: $ uniq -u test girl took bat home In the next example, the –f (––skip-fields) option skips the first field in each line, causing the lines that begin with boy and the one that begins with girl to appear to be consecutive repeated lines. The uniq utility displays only one occurrence of these lines: $ uniq -f 1 test boy took bat home dog brought hat home The next example uses both the –f (––skip-fields) and –s (––skip-chars) arguments first to skip two fields and then to skip two characters. The two characters this command skips include the SPACE that separates the second and third fields and the first character of the third field. Ignoring these characters, all the lines appear to be consecutive repeated lines containing the string at home. The uniq utility displays only the first of these lines: $ uniq -f 2 -s 2 test boy took bat home 998 w w Displays information about local system users w w [options] [username] The w utility displays the names of users logged in on the local system, together with their terminal device numbers, the times they logged in, the commands they are running, and other information. Arguments Options The username restricts the display to information about that user. Under OS X you can specify several usernames separated by SPACEs. –f (from) Removes the FROM column. For users who are directly connected, this field contains a hyphen. L –h (no header) Suppresses the header line. –i (idle) Sorts output by idle time. O –s (short) Displays less information: username, terminal device, idle time, and command. L Discussion The first line w displays is the same as that displayed by uptime (page 72). This line includes the time of day, how long the system has been running (in days, hours, and minutes), how many users are logged in, and how busy the system is (load average). From left to right, the load averages indicate the number of processes that have been waiting to run in the past 1 minute, 5 minutes, and 15 minutes. The columns of information that w displays have the following headings: USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT The USER is the username of the user. The TTY is the device name for the line the user logged in on. The FROM is the system name a remote user is logged in from; it is a hyphen for a local user. The LOGIN@ gives the date and time the user logged in. The IDLE indicates how many minutes have elapsed since the user last used the keyboard. The JCPU is the CPU time used by all processes attached to the user’s tty, not including completed background jobs. The PCPU is the time used by the process named in the WHAT column. The WHAT is the command the user is running. Examples The first example shows the full list produced by the w utility: $ w 10:26am up 1 day, 55 min, USER TTY FROM max tty1 max tty2 - 6 users, load average: 0.15, 0.03, LOGIN@ IDLE JCPU PCPU Fri 9am 20:39m 0.22s 0.01s Fri 5pm 17:16m 0.07s 0.07s 0.01 WHAT vim td -bash w 999 root sam hls pts/1 pts/2 pts/3 potato Fri 4pm 14:28m Fri 5pm 3:23 10:07am 0.00s 0.20s 0.08s 0.08s 0.07s 0.08s 0.02s -bash /bin/bash w In the next example, the –s option produces an abbreviated listing: $ w -s 10:30am USER max max root sam hls up 1 day, 58 min, TTY FROM tty1 tty2 pts/1 pts/2 pts/3 potato 6 users, load average: 0.15, 0.03, 0.01 IDLE WHAT 20:43m vim td 17:19m -bash 14:31m -bash 0.20s vim memo.030125 0.00s w -s The final example requests information only about Max: $ w max 10:35am up 1 day, 1:04, USER TTY FROM max tty1 max tty2 - 6 users, load average: 0.06, 0.01, 0.00 LOGIN@ IDLE JCPU PCPU WHAT Fri 9am 20:48m 0.22s 0.01s vim td Fri 5pm 17:25m 0.07s 0.07s -bash 1000 wc wc Displays the number of lines, words, and bytes in one or more files wc wc [options] [file-list] The wc utility displays the number of lines, words, and bytes in one or more files. When you specify more than one file on the command line, wc displays totals for each file as well as totals for all files. Arguments The file-list is a list of the pathnames of one or more files that wc analyzes. When you omit file-list, wc takes its input from standard input. Options Under Linux, wc accepts the common options described on page 718. Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––bytes –c Displays only the number of bytes in the input. ––max-line-length –L Displays the length of the longest line in the input. L ––lines –l (lowercase “l”) Displays only the number of lines (that is, NEWLINE characters) in the input. ––chars –m Displays only the number of characters in the input. ––words –w Displays only the number of words in the input. Notes A word is a sequence of characters bounded by SPACE s, TAB s, NEWLINE s, or a combination of these characters. When you redirect its input, wc does not display the name of the file. Examples The following command analyzes the file named memo. The numbers in the output represent the number of lines, words, and bytes in the file. $ wc memo 5 31 146 memo The next command displays the number of lines and words in three files. The line at the bottom, with the word total in the right column, contains the sum of each column. $ wc -lw memo1 memo2 memo3 10 62 memo1 12 74 memo2 12 68 memo3 34 204 total which 1001 Shows where in PATH a utility is located which utility-list For each utility in utility-list, the which utility searches the directories in the PATH variable (page 308) and displays the absolute pathname of the first file it finds whose simple filename is the same as the utility. Arguments The utility-list is a list of one or more utilities (commands) that which searches for. For each utility, which searches the directories listed in the PATH environment variable, in order, and displays the full pathname of the first utility (executable file) it finds. If which does not locate a utility, it displays a message. Options Options preceded by a double hyphen (––) work under Linux only; not all of these options are available under all Linux distributions. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––all –a Displays all matching executable files in PATH, not just the first. ––read-alias –i Reads aliases from standard input and reports on matching aliases in addition to executable files in PATH (turn off with ––skip-alias). L ––read-functions Reads shell functions from standard input and reports on matching functions in addition to executable files in PATH (turn off with ––skip-functions). L ––show-dot Displays ./ in place of the absolute pathname when a directory in PATH starts with a period and a matching executable file is in that directory (turn off with ––skip-dot). L ––show-tilde Displays a tilde (~) in place of the absolute pathname of the user’s home directory where appropriate. This option is ignored when a user working with root privileges runs which. L ––tty-only Does not process more options (to the right of this option) if the process running which is not attached to a terminal. L Notes Some distributions define an alias for which such as the following: $ alias which alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde' If which is not behaving as you would expect, verify that you are not running an alias. The preceding alias causes which to be effective only when it is run interactively (––tty-only) and to display aliases, display the working directory as a period when appropriate, and display the name of the user’s home directory as a tilde. which which 1002 which The TC Shell includes a which builtin (see the tcsh man page) that works slightly differently from the which utility (/usr/bin/which). Without any options, the which utility does not locate aliases, functions, and shell builtins because these do not appear in PATH. In contrast, the tcsh which builtin locates aliases, functions, and shell builtins. Examples The first example quotes the first letter of the utility (\which) to prevent the shell from invoking the alias (page 344) for which: $ \which vim dir which /usr/bin/vim /bin/dir /usr/bin/which The next example, which works on some Linux systems only, is the same as the first but uses the alias for which (which it displays): $ which vim dir which alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde' /usr/bin/which /usr/bin/vim /usr/bin/dir The final example is the same as the preceding one except it is run from tcsh. The tcsh which builtin is used instead of the which utility: tcsh $ which vim dir which /usr/bin/vim /bin/dir which: shell built-in command. who 1003 who who Displays information about logged-in users who [options] who am i The who utility displays information about users who are logged in on the local system. This information includes each user’s username, terminal device, login time, and, if applicable, the hostname of the remote system the user is logged in from. Arguments When given two arguments (traditionally, am i), who displays information about the user giving the command. If applicable, the username is preceded by the hostname of the system the user is logged in from (e.g., plum!max). Options Under Linux, who accepts the common options described on page 718. Options preceded by a double hyphen (– –) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. ––all ––boot –a Displays a lot of information. L –b Displays the date and time the system was last booted. L ––heading –H Displays a header. ––login –l (lowercase “l”) Lists devices waiting for a user to log in. L ––count –q (quick) Lists the usernames only, followed by the number of users logged in on the system. ––mesg –T Appends after each user’s username a character that shows whether that user has messages enabled. A plus sign (+) means that messages are enabled, a hyphen (–) means that they are disabled, and a question mark (?) indicates that who cannot find the device. If messages are enabled, you can use write to communicate with the user. Refer to “mesg: Denies or Accepts Messages” on page 75. ––users –u Includes each user’s idle time in the display. If the user has typed on her terminal in the past minute, a period (.) appears in this field. If no input has occurred for more than a day, the word old appears. In addition, this option displays the PID number and comment fields. See the “Discussion” section. Discussion The line who displays has the following syntax: user [messages] line login-time [idle] [PID] comment 1004 who The user is the username of the user. The messages argument indicates whether messages are enabled or disabled (see the –T [––mesg] option). The line is the device name associated with the line the user is logged in on. The login-time is the date and time when the user logged in. The idle argument is the length of time since the terminal was last used (the idle time; see the –u [––users] option). The PID is the process identification number. The comment is the name of the remote system the user is logged in from (it is blank for local users). Notes The finger utility (pages 71 and 810) provides information similar to that given by who. Examples The following examples demonstrate the use of the who utility: $ who max sam zach tty2 pts/1 tty3 2012-05-01 10:42 (:0) 2012-05-01 10:39 (plum) 2012-05-01 10:43 (:1) $ who am i sam pts/1 2012-05-01 10:39 (plum) $ who -HTu NAME LINE max + tty2 sam + pts/1 zach - tty3 TIME IDLE 2012-05-01 10:42 00:08 2012-05-01 10:39 . 2012-05-01 10:43 00:08 PID 1825 1611 2259 COMMENT (:0) (plum) (:1) xargs 1005 Converts standard input to command lines xargs [options] [command] The xargs utility is a convenient, efficient way to convert standard output of one command into arguments for another command. This utility reads from standard input, keeps track of the maximum allowable length of a command line, and avoids exceeding that limit by repeating command as needed. Finally xargs executes the constructed command line(s). Arguments The command is the command line you want xargs to use as a base for the command it constructs. If you omit command, it defaults to echo. The xargs utility appends to command the arguments it receives from standard input. If any arguments should precede the arguments from standard input, you must include them as part of command. Options Options preceded by a double hyphen (––) work under Linux only. Except as noted, options named with a single letter and preceded by a single hyphen work under Linux and OS X. –I [marker] (replace) Allows you to place arguments from standard input anywhere within command. All occurrences of marker in command for xargs are replaced by the arguments generated from standard input of xargs. With this option, xargs executes command for each input line. The –l (––max-lines) option is ignored when you use this option. ––max-lines[=num] –l [num] (lowercase “l”) Executes command once for every num lines of input (num defaults to 1). L ––max-args=num –n num Executes command once for every num arguments in the input line. ––max-procs=num –P num Allows xargs to run up to maxprocs instances of command simultaneously. (The default is 1, which runs commands sequentially.) This option might improve the throughput if xargs is running on a multiprocessor system. ––interactive –p Prompts prior to each execution of command. xargs xargs 1006 xargs ––no-run-if-empty –r Causes xargs not to execute command if standard input is empty. Ordinarily xargs executes command at least once, even if standard input includes only blanks (SPACEs and TABs). L Discussion The xargs utility reads arguments to command from standard input, interpreting each whitespace-delimited string as a separate argument. It then constructs a command line from command and a series of arguments. When the maximum command line length would be exceeded by adding another argument, xargs runs the command line it has built. If there is more input, xargs repeats the process of building a command line and running it. This process continues until all input has been read. Notes The xargs utility is often used as an efficient alternative to the –exec option of find (page 805). If you call find with the –exec option to run a command, it runs the command once for each file it processes. Each execution of the command creates a new process, which can drain system resources when you are processing many files. By accumulating as many arguments as possible, xargs can greatly reduce the number of processes needed. The first example in the “Examples” section shows how to use xargs with find. Examples To locate and remove all files with names ending in .o from the working directory and its subdirectories, you can use the find –exec option: $ find . -name \*.o -exec rm --force {} \; This approach calls the rm utility once for each .o file that find locates. Each invocation of rm requires a new process. If a lot of .o files exist, the system must spend a significant amount of time creating, starting, and then cleaning up these processes. You can reduce the number of processes by allowing xargs to accumulate as many filenames as possible before calling rm: $ find . -name \*.o -print | xargs rm --force In the next example, the content of all *.txt files located by find is searched for lines containing the word login. All filenames that contain login are displayed by grep. $ find . -name \*.txt -print | xargs grep -w -l login The next example shows how to use the –I option to cause xargs to embed standard input within command instead of appending it to command. This option also causes command to be executed each time a NEWLINE character is encountered in standard input; –l (––max-lines) does not override this behavior. $ cat names Tom, Dick, and Harry xargs 1007 $ xargs echo "Hello," < names Hello, Tom, Dick, and Harry $ xargs -I xxx echo "Hello xxx. Join me for lunch?" < names Hello Tom,. Join me for lunch? Hello Dick,. Join me for lunch? Hello and Harry. Join me for lunch? The final example uses the same input file as the previous examples as well as the –n (––max-args) and –l (––max-lines) options: $ xargs -n 1 echo "Hi there" < names Hi there Tom, Hi there Dick, Hi there and Hi there Harry $ xargs -l 2 echo "Hi there" < names Hi there Tom, Dick, Hi there and Harry Regular Expressions In This Appendix Characters . . . . . . . . . . . . . . . . . . 1012 Delimiters . . . . . . . . . . . . . . . . . . 1012 Simple Strings. . . . . . . . . . . . . . . 1012 Special Characters . . . . . . . . . . . 1012 Rules . . . . . . . . . . . . . . . . . . . . . . 1015 Bracketing Expressions . . . . . . . 1016 The Replacement String . . . . . . . 1016 Extended Regular Expressions. . . 1017 A regular expression defines a set of one or more strings of characters. A simple string of characters is a regular expression that defines one string of characters: itself. A more complex regular expression uses letters, numbers, and special characters to define many different strings of characters. A regular expression is said to match any string it defines. AppendixA This appendix describes the regular expressions used by ed, vim, emacs, grep, mawk/gawk, sed, Perl, and many other utilities. Refer to page 550 for more information on Perl regular expressions. The regular expressions used in shell ambiguous file references are different and are described in “Filename Generation/Pathname Expansion” on page 148. 1012 Appendix A Regular Expressions Characters As used in this appendix, a character is any character except a NEWLINE. Most characters represent themselves within a regular expression. A special character, also called a metacharacter, is one that does not represent itself. If you need to use a special character to represent itself, you must quote it as explained on page 1015. Delimiters A character called a delimiter usually marks the beginning and end of a regular expression. The delimiter is always a special character for the regular expression it delimits (that is, it does not represent itself but marks the beginning and end of the expression). Although vim permits the use of other characters as a delimiter and grep does not use delimiters at all, the regular expressions in this appendix use a forward slash (/ ) as a delimiter. In some unambiguous cases, the second delimiter is not required. For example, you can sometimes omit the second delimiter when it would be followed immediately by RETURN. Simple Strings The most basic regular expression is a simple string that contains no special characters except the delimiters. A simple string matches only itself (Table A-1). In the examples in this appendix, the strings that are matched are underlined and look like this. Table A-1 Simple strings Regular expression Matches Examples /ring/ ring ring, spring, ringing, stringing /Thursday/ Thursday Thursday, Thursday’s /or not/ or not or not, poor nothing Special Characters You can use special characters within a regular expression to cause the regular expression to match more than one string. A regular expression that includes a spe- Special Characters 1013 cial character always matches the longest possible string, starting as far toward the beginning (left) of the line as possible. Periods A period (.) matches any character (Table A-2). Table A-2 Regular expression Periods Matches Examples / .alk/ All strings consisting of a SPACE followed by any character followed by alk will talk, might balk /.ing/ All strings consisting of any character preceding ing sing song, ping, before inglenook Brackets Brackets ( []) define a character class1 that matches any single character within the brackets (Table A-3). If the first character following the left bracket is a caret (^), the brackets define a character class that matches any single character not within the brackets. You can use a hyphen to indicate a range of characters. Within a characterclass definition, backslashes and asterisks (described in the following sections) lose their special meanings. A right bracket (appearing as a member of the character class) can appear only as the first character following the left bracket. A caret is special only if it is the first character following the left bracket. A dollar sign is special only if it is followed immediately by the right bracket. Table A-3 Regular expression Brackets Matches Examples /[bB]ill/ Member of the character class b and B followed by ill bill, Bill, billed /t[aeiou].k/ t followed by a lowercase vowel, any character, and a k talkative, stink, teak, tanker /# [6–9]/ # followed by a SPACE and a member of the character class 6 through 9 # 60, # 8:, get # 9 /[^a–zA–Z]/ Any character that is not a letter (ASCII character set only) 1, 7, @, ., }, Stop! 1. GNU documentation and POSIX call these List Operators and defines Character Class operators as expressions that match a predefined group of characters, such as all numbers (see Table VI-35 on page 987). 1014 Appendix A Regular Expressions Asterisks An asterisk can follow a regular expression that represents a single character (Table A-4). The asterisk represents zero or more occurrences of a match of the regular expression. An asterisk following a period matches any string of characters. (A period matches any character, and an asterisk matches zero or more occurrences of the preceding regular expression.) A character-class definition followed by an asterisk matches any string of characters that are members of the character class. Table A-4 Regular expression Asterisks Matches Examples /ab*c/ a followed by zero or more b’s followed by ac ac, abc, abbc, debbcaabbbc /ab.*c/ ab followed by zero or more characters followed by c abc, abxc, ab45c, xab 756.345 x cat /t.*ing/ t followed by zero or more characters followed by ing thing, ting, I thought of going /[a–zA–Z ]*/ A string composed only of letters and SPACEs 1. any string without numbers or punctuation! /(.*)/ As long a string as possible between ( and ) Get (this) and (that); /([^)]*)/ The shortest string possible that starts with ( and ends with ) (this), Get (this and that) Carets and Dollar Signs A regular expression that begins with a caret (^) can match a string only at the beginning of a line. In a similar manner, a dollar sign ($) at the end of a regular expression matches the end of a line. The caret and dollar sign are called anchors because they force (anchor) a match to the beginning or end of a line (Table A-5). Table A-5 Carets and dollar signs Regular expression Matches Examples /^T/ A T at the beginning of a line This line..., That Time..., In Time /^+[0–9]/ A plus sign followed by a digit at the beginning of a line +5 +45.72, +759 Keep this... /:$/ A colon that ends a line ...below: Rules 1015 Quoting Special Characters You can quote any special character (but not parentheses [except in Perl; page 554] or a digit) by preceding it with a backslash (Table A-6). Quoting a special character makes it represent itself. Table A-6 Quoted special characters Regular expression Matches Examples /end\./ All strings that contain end followed by a period The end., send., pretend.mail / \\/ A single backslash \ / \*/ An asterisk *.c, an asterisk (*) / \[5\]/ [5] it was five [5] /and\/or/ and/or and/or Rules The following rules govern the application of regular expressions. Longest Match Possible A regular expression always matches the longest possible string, starting as far toward the beginning (left end) of the line as possible. Perl calls this type of match a greedy match (page 553). For example, given the string This (rug) is not what it once was (a long time ago), is it? the expression /Th.*is/ matches This (rug) is not what it once was (a long time ago), is and /(.*)/ matches (rug) is not what it once was (a long time ago) However, /([^)]*)/ matches (rug) Given the string singing songs, singing more and more the expression /s.*ing/ matches singing songs, singing and /s.*ing song/ matches singing song 1016 Appendix A Regular Expressions Empty Regular Expressions Within some utilities, such as vim and less (but not grep), an empty regular expression represents the last regular expression you used. For example, suppose you give vim the following Substitute command: :s/mike/robert/ If you then want to make the same substitution again, you can use the following command: :s//robert/ Alternatively, you can use the following commands to search for the string mike and then make the substitution /mike/ :s//robert/ The empty regular expression (//) represents the last regular expression you used (/mike/). Bracketing Expressions You can use quoted parentheses, \( and \), to bracket a regular expression. (However, Perl uses unquoted parentheses to bracket regular expressions; page 554.) The string the bracketed regular expression matches can be recalled, as explained in “Quoted Digit” on the next page. A regular expression does not attempt to match quoted parentheses. Thus a regular expression enclosed within quoted parentheses matches what the same regular expression without the parentheses would match. The expression /\(rexp\)/ matches what /rexp/ would match; /a\(b*\)c/ matches what /ab*c/ would match. You can nest quoted parentheses. The bracketed expressions are identified only by the opening \( , so no ambiguity arises in identifying them. The expression /\([a–z]\([A–Z]*\)x\)/ consists of two bracketed expressions, one nested within the other. In the string 3 t dMNORx7 l u, the preceding regular expression matches dMNORx, with the first bracketed expression matching dMNORx and the second matching MNOR. The Replacement String The vim and sed editors use regular expressions as search strings within Substitute commands. You can use the ampersand (&) and quoted digits (\n) special characters to represent the matched strings within the corresponding replacement string. Extended Regular Expressions 1017 Ampersand Within a replacement string, an ampersand (&) takes on the value of the string that the search string (regular expression) matched. For example, the following vim Substitute command surrounds a string of one or more digits with NN. The ampersand in the replacement string matches whatever string of digits the regular expression (search string) matched: :s/[0-9][0-9]*/NN&NN/ Two character-class definitions are required because the regular expression [0–9]* matches zero or more occurrences of a digit, and any character string constitutes zero or more occurrences of a digit. Quoted Digit Within the search string, a bracketed regular expression, \(xxx\) [(xxx) in Perl], matches what the regular expression would have matched without the quoted parentheses, xxx. Within the replacement string, a quoted digit, \n, represents the string that the bracketed regular expression (portion of the search string) beginning with the nth \( matched. Perl accepts a quoted digit for this purpose, but the preferred style is to precede the digit with a dollar sign ($n; page 554). For example, you can take a list of people in the form last-name, first-name initial and put it in the form first-name initial last-name with the following vim command: :1,$s/\([^,]*\), \(.*\)/\2 \1/ This command addresses all the lines in the file (1,$). The Substitute command (s) uses a search string and a replacement string delimited by forward slashes. The first bracketed regular expression within the search string, \([^,]*\), matches what the same unbracketed regular expression, [^,]*, would match: zero or more characters not containing a comma (the last-name). Following the first bracketed regular expression are a comma and a SPACE that match themselves. The second bracketed expression, \(.*\), matches any string of characters (the first-name and initial). The replacement string consists of what the second bracketed regular expression matched ( \2), followed by a SPACE and what the first bracketed regular expression matched ( \1). Extended Regular Expressions This section covers patterns that use an extended set of special characters. These patterns are called full regular expressions or extended regular expressions. In addition 1018 Appendix A Regular Expressions to ordinary regular expressions, Perl and vim provide extended regular expressions. The three utilities egrep, grep when run with the –E option (similar to egrep), and mawk/gawk provide all the special characters included in ordinary regular expressions, except for \( and \), as well those included in extended regular expressions. Two of the additional special characters are the plus sign (+) and the question mark (?). They are similar to *, which matches zero or more occurrences of the previous character. The plus sign matches one or more occurrences of the previous character, whereas the question mark matches zero or one occurrence. You can use any one of the special characters *, +, and ? following parentheses, causing the special character to apply to the string surrounded by the parentheses. Unlike the parentheses in bracketed regular expressions, these parentheses are not quoted (Table A-7). Table A-7 Extended regular expressions Regular expression Matches Examples /ab+c/ a followed by one or more b’s followed by ac yabcw, abbc57 /ab?c/ a followed by zero or one b followed by c back, abcdef /(ab)+c/ One or more occurrences of the string ab followed by c zabcd, ababc! /(ab)?c/ Zero or one occurrence of the string ab followed by c xc, abcc In full regular expressions, the vertical bar (|) special character is a Boolean OR operator. Within vim, you must quote the vertical bar by preceding it with a backslash to make it special (\|). A vertical bar between two regular expressions causes a match with strings that match the first expression, the second expression, or both. You can use the vertical bar with parentheses to separate from the rest of the regular expression the two expressions that are being ORed (Table A-8). Table A-8 Full regular expressions Regular expression Meaning Examples /ab|ac/ Either ab or ac ab, ac, abac (abac is two matches of the regular expression) /^Exit|^Quit/ Lines that begin with Exit or Quit Exit, Quit, No Exit /(D|N)\. Jones/ D. Jones or N. Jones P.D. Jones, N. Jones Appendix Summary 1019 Appendix Summary A regular expression defines a set of one or more strings of characters. A regular expression is said to match any string it defines. In a regular expression, a special character is one that does not represent itself. Table A-9 lists special characters. Table A-9 Special characters Character Meaning . Matches any single character * Matches zero or more occurrences of a match of the preceding character ^ Forces a match to the beginning of a line $ A match to the end of a line \ Quotes special characters \< Forces a match to the beginning of a word \> Forces a match to the end of a word Table A-10 lists ways of representing character classes and bracketed regular expressions. Table A-10 Character classes and bracketed regular expressions Class Defines [xyz ] Defines a character class that matches x, y, or z [^ xyz ] Defines a character class that matches any character except x, y, or z [x–z ] Defines a character class that matches any character x through z inclusive \( xyz \) Matches what xyz matches (a bracketed regular expression; not Perl) ( xyz ) Matches what xyz matches (a bracketed regular expression; Perl only) In addition to the preceding special characters and strings (excluding quoted parentheses, except in vim), the characters in Table A-11 are special within full, or extended, regular expressions. Table A-11 Extended regular expressions Expression Matches + Matches one or more occurrences of the preceding character ? Matches zero or one occurrence of the preceding character 1020 Appendix A Regular Expressions Table A-11 Extended regular expressions (continued) Expression Matches (xyz )+ Matches one or more occurrences of what xyz matches (xyz )? Matches zero or one occurrence of what xyz matches (xyz )* Matches zero or more occurrences of what xyz matches xyz |abc Matches either what xyz or what abc matches (use \| in vim) (xy |ab)c Matches either what xyc or what abc matches (use \| in vim) Table A-12 lists characters that are special within a replacement string in sed and vim. Table A-12 Replacement strings String Represents & Represents what the regular expression (search string) matched \n A quoted number, n, represents what the n th bracketed regular expression in the search string matched $n A number preceded by a dollar sign, n, represents what the nth bracketed regular expression in the search string matched (Perl only) Help In This Appendix Solving a Problem . . . . . . . . . . . . 1022 Linux and OS X Newsgroups . . . 1023 Mailing Lists . . . . . . . . . . . . . . . . 1024 Specifying a Terminal . . . . . . . . . 1024 You need not be a user or system administrator in isolation. A large community of Linux and OS X experts is willing to assist you in learning about, helping you solve problems with, and getting the most out of a Linux or OS X system. Before you ask for help, however, make sure you have done everything you can to solve the problem yourself. No doubt, someone has experienced the same problem before you and the answer to your question exists somewhere on the Internet. Your job is to find it. This appendix lists resources and describes methods that can help you in that task. AppendixB B 1022 Appendix B Help Solving a Problem Following is a list of steps that can help you solve a problem without asking someone for help. Depending on your understanding of and experience with the hardware and software involved, these steps might lead to a solution. 1. Most Linux and OS X distributions come with extensive documentation. Read the documentation on the specific hardware or software you are having a problem with. If it is a GNU product, use info; otherwise, use man to find local information. Also look in /usr/share/doc for documentation on specific tools. For more information refer to “Where to Find Documentation” on page 33. 2. When the problem involves some type of error or other message, use a search engine, such as Google (www.google.com) or Google Groups (groups.google.com), to look up the message on the Internet. If the message is long, pick a unique part of the message to search for; 10 to 20 characters should be enough. Enclose the search string within double quotation marks. See “Using the Internet to Get Help” on page 41 for an example of this kind of search. 3. Check whether the Linux Documentation Project (www.tldp.org) has a HOWTO or mini-HOWTO on the subject in question. Search its site for keywords that relate directly to the product and problem. Read the FAQs. 4. GNU manuals are available at www.gnu.org/manual. In addition, you can visit the GNU home page (www.gnu.org) to obtain other documentation and GNU resources. Many of the GNU pages and resources are available in a variety of languages. 5. Use Google or Google Groups to search on keywords that relate directly to the product and problem. 6. When all else fails (or perhaps before you try anything else), examine the system logs in /var/log. First look at the end of the messages (Linux) or system.log (OS X) file using one of the following commands: # tail -20 /var/log/messages # tail -20 /var/log/system.log If messages or system.log contains nothing useful, run the following command. It displays the names of the log files in chronological order, with the most recently modified files appearing at the bottom of the list. $ ls -ltr /var/log Look at the files at the bottom of the list first. If the problem involves a network connection, review the secure or auth.log file (some systems Finding Linux and OS X Related Information 1023 might use a different name) on the local and remote systems. Also look at messages or system.log on the remote system. 7. The /var/spool directory contains subdirectories with useful information: cups holds the print queues; postfix, mail, or exim4 holds the user’s mail files; and so on. If you are unable to solve a problem yourself, a thoughtful question to an appropriate newsgroup (page 1023) or mailing list (page 1024) can elicit useful information. When you send or post a question, make sure you describe the problem and identify the local system carefully. Include the version numbers of the operating system and any software packages that relate to the problem. Describe the hardware, if appropriate. There is an etiquette to posting questions—see www.catb.org/~esr/faqs/smart-questions.html for a good paper by Eric S. Raymond and Rick Moen titled “How To Ask Questions the Smart Way.” The author’s home page ( www.sobell.com) contains corrections to this book, answers to selected chapter exercises, and pointers to other Linux and OS X sites. Finding Linux and OS X Related Information Linux and OS X distributions come with reference pages stored online. You can read these documents by using the man or info (page 36) utility. You can read man and info pages to get more information about specific topics while reading this book or to determine which features are available with Linux or OS X. To search for topics, use apropos (see page 35 or give the command man apropos). The Apple support site (www.apple.com/support) has many useful Mac OS X links. Linux and OS X Newsgroups One of the best ways of getting specific information is through a newsgroup. You can often find the answer to a question by reading postings to the newsgroup. Try using Google Groups (groups.google.com) to search through newsgroups to see whether your question has already been asked and answered. Alternatively, open a newsreader program and subscribe to appropriate newsgroups. If necessary, you can post a question for someone to answer. Before you do so, make sure you are posting to the correct group and that your question has not already been answered. The comp.os.linux.answers (Linux) and comp.sys.mac.system (OS X) newsgroups provide postings of solutions to common problems and periodic postings of the most upto-date versions of FAQ and HOWTO documents. The comp.os.linux.misc newsgroup has answers to miscellaneous Linux-related questions. Apple Support Communities (discussions.apple.com) has a list of online forums, including a section on Mac OS X and System Software. 1024 Appendix B Help Mailing Lists Subscribing to a mailing list allows you to participate in an electronic discussion. With most lists, you can send and receive email dedicated to a specific topic to and from a group of users. Moderated lists do not tend to stray as much as unmoderated lists, assuming the list has a good moderator. The disadvantage of a moderated list is that some discussions might be cut off when they get interesting if the moderator deems that the discussion has gone on for too long. Mailing lists described as bulletins are strictly unidirectional: You cannot post information to these lists but can only receive periodic bulletins. If you have the subscription address for a mailing list but are not sure how to subscribe, put the word help in the body and/or header of email you send to the address. You will usually receive instructions via return email. You can also use a search engine to search for mailing list linux or mailing list os x. You can find Red Hat and Fedora mailing lists at www.redhat.com/mailman/listinfo, Ubuntu mailing lists at www.ubuntu.com/support/community/mailinglists, and Debian mailing lists at www.debian.org/MailingLists/subscribe. Apple supports many mailing lists. Visit www.lists.apple.com/mailman/listinfo to display a list of Mac OS X mailing lists; click on the name of a list to display subscription information. Specifying a Terminal Because vim, emacs, and other textual and pseudographical programs take advantage of features specific to various kinds of terminals and terminal emulators, you must tell these programs the name of the terminal you are using or the terminal your terminal emulator is emulating. Most of the time the terminal name is set for you. If the terminal name is not specified or is not specified correctly, the characters on the screen will be garbled or, when you start a program, the program will ask which type of terminal you are using. Terminal names describe the functional characteristics of a terminal or terminal emulator to programs that require this information. Although terminal names are referred to as either Terminfo or Termcap names, the difference relates to the method each system uses to store the terminal characteristics internally—not to the manner in which you specify the name of a terminal. Terminal names that are often used with Linux terminal emulators and with graphical monitors while they are run in textual mode include ansi, linux, vt100, vt102, vt220, and xterm. When you are running a terminal emulator, you can specify the type of terminal you want to emulate. Set the emulator to either vt100 or vt220, and set TERM to the same value. When you log in, you might be prompted to identify the type of terminal you are using: TERM = (vt100) Specifying a Terminal 1025 You can respond to this prompt in one of two ways. First you can press RETURN to set your terminal type to the name in parentheses. If that name does not describe the terminal you are using, you can enter the correct name and then press RETURN. TERM = (vt100) ansi You might also receive the following prompt: TERM = (unknown) This prompt indicates that the system does not know which type of terminal you are using. If you plan to run programs that require this information, enter the name of the terminal or terminal emulator you are using before you press RETURN. TERM If you do not receive a prompt, you can give the following command to display the value of the TERM variable and check whether the terminal type has been set: $ echo $TERM If the system responds with the wrong name, a blank line, or an error message, set or change the terminal name. From the Bourne Again Shell (bash), enter a command similar to the following to set the TERM variable so the system knows which type of terminal you are using: export TERM=name Replace name with the terminal name for the terminal you are using, making sure you do not put a SPACE before or after the equal sign. If you always use the same type of terminal, you can place this command in your ~/.bashrc file (page 278), causing the shell to set the terminal type each time you log in. For example, give the following command to set your terminal name to vt100: $ export TERM=vt100 Use the following syntax under the TC Shell (tcsh): setenv TERM name Again replace name with the terminal name for the terminal you are using. Under tcsh you can place this command in your ~/.login file (page 372). For example, under tcsh you can give this command to set your terminal name to vt100: $ setenv TERM vt100 LANG For some programs to display information correctly, you might need to set the LANG variable (page 314). Frequently you can set this variable to C. Under bash use the command $ export LANG=C and under tcsh use $ setenv LANG C Keeping the System Up-to-Date In This Appendix Using yum . . . . . . . . . . . . . . . . . . 1028 Using apt-get . . . . . . . . . . . . . . . . 1034 BitTorrent. . . . . . . . . . . . . . . . . . . 1038 The apt-get and yum utilities both fill the same role: They install and update software packages. Both utilities compare the files in a repository (generally on the Internet) with those on the local system and update the files on the local system according to your instructions. Both utilities automatically install and update any additional files that a package depends on (dependencies). Most Linux distributions come with apt-get or yum. Debian-based systems such as Ubuntu and Mint are set up to use apt-get, which works with deb packages. Red Hat and Fedora use yum, which works with rpm packages. There are also versions of apt-get that work with rpm packages. On a Mac OS X system it is easiest to use the software update GUI to keep the system up-to-date. AppendixC C To facilitate the update process, apt-get and yum keep locally a list of packages that are held in each of the repositories they use. Any software you want to install or update must reside in a repository. When you give apt-get or yum a command to install a package, they look for the package in a local package list. If the package appears in the list, apt-get or yum fetches both that package and any packages the package you are installing depends on and installs the packages. 1028 Appendix C Keeping the System Up-to-Date The yum examples in this section are from a Fedora system and the apt-get examples are from an Ubuntu system. Although the files, input, and output on the local system might look different, how you use the tools—and the results—will be the same. In contrast to apt-get and yum, BitTorrent efficiently distributes large amounts of static data, such as installation ISO images. It does not examine files on the local system and does not check for dependencies. Using yum Early releases of Linux did not include a tool for managing updates. Although the rpm utility could install or upgrade individual software packages, it was up to the user to locate a package and any packages it depended on. When Terra Soft produced its Red Hat–based Linux distribution for the PowerPC, named Yellow Dog, the company created the Yellow Dog Updater to fill this gap. This program has since been ported to other architectures and distributions. The result is named Yellow Dog Updater, Modified (yum; yum.baseurl.org). rpm packages The yum utility works with rpm packages. When yum installs or upgrades a software package, it also installs or upgrades packages that the package depends on. Repositories The yum utility downloads package headers and packages from servers called repositories. Frequently, yum is set up to use copies of repositories kept on mirror sites. See “Configuring yum” on page 1032 for information on selecting a repository. Using yum to Install, Remove, and Update Packages Installing packages The behavior of yum depends on the options you specify. To install a new package together with the packages it depends on, while working with root privileges give the command yum install, followed by the name of the package. After yum determines what it needs to do, it asks for confirmation. The next example installs the tcsh package: # yum install tcsh Loaded plugins: langpacks, presto, refresh-packagekit Resolving Dependencies --> Running transaction check ---> Package tcsh.i686 0:6.17-15.fc16 will be installed --> Finished Dependency Resolution Dependencies Resolved =============================================================================== Package Arch Version Repository Size =============================================================================== Installing: tcsh i686 6.17-15.fc16 fedora 403 k Transaction Summary =============================================================================== Install 1 Package Total download size: 403 k Installed size: 403 k Using yum 1029 Is this ok [y/N]: y Downloading Packages: tcsh-6.17-15.fc16.i686.rpm Running Transaction Check Running Transaction Test Transaction Test Succeeded Running Transaction Installing : tcsh-6.17-15.fc16.i686 Verifying : tcsh-6.17-15.fc16.i686 | 403 kB 00:00 1/1 1/1 Installed: tcsh.i686 0:6.17-15.fc16 Complete! Removing packages You can also use yum to remove packages, using a similar syntax. The following example removes the tcsh package: # yum remove tcsh Loaded plugins: langpacks, presto, refresh-packagekit Resolving Dependencies --> Running transaction check ---> Package tcsh.i686 0:6.17-15.fc16 will be erased --> Finished Dependency Resolution Dependencies Resolved =============================================================================== Package Arch Version Repository Size =============================================================================== Removing: tcsh i686 6.17-15.fc16 @fedora 1.1 M Transaction Summary =============================================================================== Remove 1 Package Installed size: 1.1 M Is this ok [y/N]: y Downloading Packages: Running Transaction Check Running Transaction Test Transaction Test Succeeded Running Transaction Erasing : tcsh-6.17-15.fc16.i686 Verifying : tcsh-6.17-15.fc16.i686 1/1 1/1 Removed: tcsh.i686 0:6.17-15.fc16 Complete! Updating packages The update option, without additional parameters, updates all installed packages. It downloads summary files that hold information about package headers for installed packages, determines which packages need to be updated, prompts to continue, and downloads and installs the updated packages. Unlike the situation with apt-get, the yum upgrade command is very similar to yum update. In the following example, yum determines that two packages, at and firefox, need to be updated and checks dependencies. Once it has determined what it needs to do, yum advises you of the action(s) it will take, prompts with Is this ok [y/N], and, if you approve, downloads and installs the packages. 1030 Appendix C Keeping the System Up-to-Date # yum update Loaded plugins: langpacks, presto, refresh-packagekit Resolving Dependencies --> Running transaction check ---> Package at.i686 0:3.1.13-3.fc16 will be updated ---> Package at.i686 0:3.1.13-6.fc16 will be an update ---> Package bash.i686 0:4.2.10-4.fc16 will be updated ---> Package bash.i686 0:4.2.28-1.fc16 will be an update --> Finished Dependency Resolution Dependencies Resolved =============================================================================== Package Arch Version Repository Size =============================================================================== Updating: at i686 3.1.13-6.fc16 updates 61 k bash i686 4.2.28-1.fc16 updates 961 k Transaction Summary =============================================================================== Upgrade 2 Packages Total download size: 1.0 M Is this ok [y/N]: y Downloading Packages: Setting up and reading Presto delta metadata updates/prestodelta | 988 Processing delta metadata Download delta size: 457 k (1/2): at-3.1.13-3.fc16_3.1.13-6.fc16.i686.drpm | 52 (2/2): bash-4.2.10-4.fc16_4.2.28-1.fc16.i686.drpm | 405 Finishing rebuild of rpms, from deltarpms | 1.0 Presto reduced the update size by 56% (from 1.0 M to 457 k). Running Transaction Check Running Transaction Test Transaction Test Succeeded Running Transaction Updating : bash-4.2.28-1.fc16.i686 ... Verifying : at-3.1.13-3.fc16.i686 Updated: at.i686 0:3.1.13-6.fc16 kB 00:02 kB kB 00:00 00:00 MB 00:01 1/4 4/4 bash.i686 0:4.2.28-1.fc16 Complete! You can update individual packages by specifying the names of the packages on the command line following the word update. Other yum Commands Many yum commands and options are available. A few of the more useful commands are described here. The yum man page contains a complete list. check-update Lists packages installed on the local system that have updates available in the yum repositories. clean all Removes header files that yum uses for resolving dependencies. Also removes cached packages—yum does not automatically remove packages once they have been down- loaded and installed, unless you set keepcache (page 1032) to 0. clean metadata Removes the files yum uses to determine remote package availability. Using this option forces yum to download all metadata the next time you run it. Using yum 1031 list available Lists all packages that can be installed from the yum repositories. search word Searches for word in the package description, summary, packager, and name. yum Groups In addition to working with single packages, yum can work with groups of packages. The next example shows how to display a list of installed and available package groups: $ yum grouplist Loaded plugins: langpacks, presto, refresh-packagekit Installed Groups: Administration Tools Design Suite GNOME Desktop Environment ... Installed Language Groups: Arabic Support [ar] Armenian Support [hy] Assamese Support [as] ... Available Groups: Authoring and Publishing Base Books and Guides DNS Name Server ... Available Language Groups: Afrikaans Support [af] Akan Support [ak] Albanian Support [sq] ... Done The command yum groupinfo followed by the name of a group displays information about the group, including a description of the group and a list of mandatory, default, and optional packages. The next example displays information about the DNS Name Server group of packages. If the name of the package includes a SPACE, you must quote it. $ yum groupinfo "DNS Name Server" Loaded plugins: langpacks, presto, refresh-packagekit Group: DNS Name Server Description: This package group allows you to run a DNS name server (BIND) on the system. Default Packages: bind-chroot Optional Packages: bind dnsperf ldns nsd pdns pdns-recursor rbldnsd unbound 1032 Appendix C Keeping the System Up-to-Date To install a group of packages, give the command yum groupinstall followed by the name of the group. Downloading rpm Package Files Using yumdownloader The yumdownloader utility locates and downloads—but does not install—rpm package files. If this utility is not available on the local system, use yum to download and install the yum-utils package before you attempt to run yumdownloader. The following example downloads the samba rpm file to the working directory: $ yumdownloader samba Loaded plugins: langpacks, presto, refresh-packagekit samba-3.6.5-85.fc16.i686.rpm | 4.7 MB Downloading source files 00:23 You can use yumdownloader with the ––source option to download rpm source package files. The yumdownloader utility automatically enables the necessary source repositories. The following example downloads in the working directory the rpm file for the latest version of the kernel source code for the installed release: $ yumdownloader --source kernel Loaded plugins: langpacks, presto, refresh-packagekit Enabling updates-source repository updates-source/metalink | 14 kB updates-source | 3.3 kB updates-source/primary_db | 1.2 MB Enabling fedora-source repository fedora-source/metalink | 20 kB fedora-source | 3.1 kB fedora-source/primary_db | 3.5 MB kernel-3.4.2-1.fc16.src.rpm | 64 MB 00:00 00:00 00:01 00:00 00:00 00:06 02:09 Without the ––source option, yumdownloader downloads the executable kernel rpm file. Configuring yum yum.conf Most Linux distributions that use yum for updating files come with yum ready to use; you do not need to configure it. This section describes the yum configuration files for those users who want to modify them. The primary configuration file, /etc/yum.conf, holds global settings. The first example shows a typical yum.conf file: $ cat /etc/yum.conf [main] cachedir=/var/cache/yum/$basearch/$releasever keepcache=0 debuglevel=2 logfile=/var/log/yum.log exactarch=1 obsoletes=1 gpgcheck=1 Using yum 1033 plugins=1 installonly_limit=3 ... # PUT YOUR REPOS HERE OR IN separate files named file.repo # in /etc/yum.repos.d The section labeled [main] defines global configuration options. The cachedir specifies the directory in which yum stores downloaded packages, although with keepcache set to 0, yum does not store headers and packages after installing them. The amount of information logged is specified by debuglevel, with a value of 10 producing the most information. The logfile specifies where yum keeps its log. Setting exactarch to 1 causes yum to update packages only with packages of the same architecture, thereby preventing an i686 package from replacing an i386 package, for example. You can use retries to specify the number of times yum will try to retrieve a file before returning an error (the default is 6). Set this parameter to 0 to cause yum to continue trying forever. Setting obsoletes to 1 causes yum to replace obsolete packages when performing an update; it has no effect during an install. When gpgcheck is set to 1, yum checks the GPG (GNU Privacy Guard; GnuPG.org) signatures on packages it installs. This check verifies the authenticity of the packages. Setting plugins to 1 enables yum plugins, which extend yum functionality. (See wiki.linux.duke.edu/YumPlugins and yum in section 8 of the man pages for more information on yum plugins.) The installonly parameter specifies packages that yum is to install but never upgrade, such as the kernel. The installonly_limit specifies the number of versions of a given installonly package that are to be installed at one time. yum.repos.d Although the balance of the yum configuration information, which specifies the yum repositories, can appear in the yum.conf file, this information is frequently kept in the /etc/yum.repos.d directory. A parameter set in a repository section overrides the same parameter set in the [main] section. Following is a listing for a yum.repos.d directory on a Fedora system: $ ls /etc/yum.repos.d fedora.repo fedora-updates.repo fedora-updates-testing.repo Each of these files contains a header, such as [fedora], which provides a unique name for the repository. The name of the file is generally similar to the repository name, with the addition of a fedora- (or similar) prefix and a .repo filename extension. On a Fedora system, commonly used repositories include fedora (held in the fedora.repo file), which contains the packages present on the installation DVD; updates (held in the fedora-updates.repo file), which contains updated versions of the stable packages; and updates-testing (held in the fedora-updates-testing.repo file), which contains updates that are not ready for release. The last two repositories are not enabled; do not enable either of these repositories unless you are testing unstable packages. Never enable them on a production system. 1034 Appendix C Keeping the System Up-to-Date Each *.repo file can specify several repositories tip Each *.repo file includes specifications for several related repositories, which are usually disabled. For example, the fedora.repo file holds [fedora-debuginfo] and [fedora-source] in addition to [fedora]. You cannot download source files using yum. Instead, use yumdownloader (page 1032) for this task. The next example shows part of the fedora.repo file that specifies the parameters for the fedora repository: $ cat /etc/yum.repos.d/fedora.repo [fedora] name=Fedora $releasever - $basearch failovermethod=priority #baseurl=http://download.fedoraproject.org/pub/fedora/linux/releases/$releasever/Everything/$basearch/os/ mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch enabled=1 metadata_expire=7d gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch ... Repository specification Each repository specification contains the name of the repository enclosed in brackets ([]), a name, a failovermethod, a baseurl, and a mirrorlist. The name provides an informal name for the repository that yum displays. The failovermethod determines the order in which yum contacts an initial mirror site and additional mirror sites if the first one fails; priority selects the sites in the order in which they appear, while roundrobin selects sites randomly. The baseurl indicates the location of the main repository; it is normally commented out. The mirrorlist specifies the URL of a file that holds a list of baseurls, or mirrors of the main repository. The mirror list server uses geoip (geolocation; www.geoiptool.com) to attempt to return the nearest mirrors for yum to try. Only baseurl or mirrorlist can be enabled at one time. These definitions use two variables: yum sets $basearch to the architecture of the system and $releasever to the version of the release (such as 11 for Fedora 11). The repository described by the file is enabled (yum will use it) if enabled is set to 1 and is disabled if enabled is set to 0. As described earlier, gpgcheck determines whether yum checks GPG signatures on files it downloads. The gpgkey specifies the location of the GPG key. Refer to the yum.conf man page for more options. Using apt-get APT (Advanced Package Tool) is a collection of utilities that download, install, remove, upgrade, and report on software packages. APT utilities download packages and call dpkg utilities to manipulate the packages once they are on the local system. For more information refer to www.debian.org/doc/manuals/apt-howto. Using apt-get 1035 Updating the package list The primary APT command is apt-get; its arguments determine what the command does. Working with root privileges, give the command apt-get update to update the local package list: # apt-get update Get:1 http://extras.ubuntu.com precise Release.gpg [72 B] Get:2 http://security.ubuntu.com precise-security Release.gpg [198 B] Hit http://extras.ubuntu.com precise Release Get:3 http://security.ubuntu.com precise-security Release [49.6 kB] Hit http://extras.ubuntu.com precise/main Sources Get:4 http://us.archive.ubuntu.com precise Release.gpg [198 B] Hit http://extras.ubuntu.com precise/main i386 Packages Get:5 http://us.archive.ubuntu.com precise-updates Release.gpg [198 B] Get:6 http://us.archive.ubuntu.com precise-backports Release.gpg [198 B] Get:7 http://us.archive.ubuntu.com precise Release [49.6 kB] Get:8 http://security.ubuntu.com precise-security/main Sources [22.5 kB] Get:9 http://security.ubuntu.com precise-security/restricted Sources [14 B] ... Fetched 13.4 MB in 2min 20s (95.4 kB/s) Reading package lists... Done Check the dependency tree The apt-get utility does not tolerate a broken dependency tree. To check the status of the local dependency tree, give the command apt-get check: # apt-get check Reading package lists... Done Building dependency tree Reading state information... Done The easiest way to fix errors that apt-get reveals is to remove the offending packages and then reinstall them using apt-get. Using apt-get to Install, Remove, and Update Packages Installing packages The following command uses apt-get to install the zsh package: # apt-get install zsh Reading package lists... Done Building dependency tree Reading state information... Done Suggested packages: zsh-doc The following NEW packages will be installed: zsh 0 upgraded, 1 newly installed, 0 to remove and 2 not upgraded. Need to get 0 B/4,667 kB of archives. After this operation, 11.5 MB of additional disk space will be used. Selecting previously unselected package zsh. (Reading database ... 166307 files and directories currently installed.) Unpacking zsh (from .../zsh_4.3.17-1ubuntu1_i386.deb) ... Processing triggers for man-db ... Setting up zsh (4.3.17-1ubuntu1) ... update-alternatives: using /bin/zsh4 to provide /bin/zsh (zsh) in auto mode. update-alternatives: using /bin/zsh4 to provide /bin/rzsh (rzsh) in auto mode. update-alternatives: using /bin/zsh4 to provide /bin/ksh (ksh) in auto mode. 1036 Appendix C Keeping the System Up-to-Date Removing packages Remove a package the same way you install a package, substituting remove for install: # apt-get remove zsh Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be REMOVED: zsh 0 upgraded, 0 newly installed, 1 to remove and 2 not upgraded. After this operation, 11.5 MB disk space will be freed. Do you want to continue [Y/n]? y (Reading database ... 167467 files and directories currently installed.) Removing zsh ... Processing triggers for man-db ... To ensure you can later reinstall a package with the same configuration, the apt-get remove command does not remove configuration files from the /etc directory hierarchy. Although it is not recommended, you can use the purge command instead of remove to remove all the package files, including configuration files. Alternatively, you can move these files to an archive so you can restore them later if necessary. Using apt-get to Upgrade the System Two arguments cause apt-get to upgrade all packages on the system: upgrade upgrades all packages on the system that do not require new packages to be installed and dist-upgrade upgrades all packages on the system, installing new packages as needed; this argument will install a new version of the operating system if one is available. The following command updates all packages on the system that depend only on installed packages: # apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: eog libtiff4 2 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 906 kB of archives. After this operation, 20.5 kB disk space will be freed. Do you want to continue [Y/n]? y Get:1 http://us.archive.ubuntu.com/ubuntu/ precise-updates/main libtiff4 i386 3.9.5-2ubuntu1.1 [142 kB] Get:2 http://us.archive.ubuntu.com/ubuntu/ precise-updates/main eog i386 3.4.2-0ubuntu1 [763 kB] Fetched 906 kB in 2s (378 kB/s) (Reading database ... 167468 files and directories currently installed.) Preparing to replace libtiff4 3.9.5-2ubuntu1 (using .../libtiff4_3.9.5-2ubuntu1.1_i386.deb) ... Unpacking replacement libtiff4 ... Preparing to replace eog 3.4.1-0ubuntu1 (using .../eog_3.4.2-0ubuntu1_i386.deb) ... Unpacking replacement eog ... Processing triggers for libglib2.0-0 ... Processing triggers for man-db ... Processing triggers for gconf2 ... Processing triggers for hicolor-icon-theme ... Processing triggers for bamfdaemon ... Rebuilding /usr/share/applications/bamf.index... Processing triggers for desktop-file-utils ... Using apt-get 1037 Processing triggers for gnome-menus ... Setting up libtiff4 (3.9.5-2ubuntu1.1) ... Setting up eog (3.4.2-0ubuntu1) ... Processing triggers for libc-bin ... ldconfig deferred processing now taking place When apt-get asks if you want to continue, enter Y to upgrade the listed packages; otherwise, enter N. Packages that are not upgraded because they depend on packages that are not already installed are listed as kept back. Use dist-upgrade to upgrade all packages, including packages that depend on packages that are not installed. This command also installs dependencies. Other apt-get Commands autoclean Removes old archive files. check Checks for broken dependencies. clean Removes archive files. dist-upgrade Upgrades packages on the system, installing new packages as needed. If a new version of the operating system is available, this option upgrades to the new version. purge Removes a package and all its configuration files. source Downloads source files. update Retrieves new lists of packages. upgrade Upgrades all packages on the system that do not require new packages to be installed. Repositories Repositories hold collections of software packages and related information, including headers that describe each package and provide information on other packages the package depends on. Typically, a Linux distribution maintains repositories for each of its releases. Software package categories Software packages are frequently divided into several categories. Ubuntu uses the following categories: • main—Ubuntu-supported open-source software • universe—Community-maintained open-source software • multiverse—Software restricted by copyright or legal issues • restricted—Proprietary device drivers • backports—Packages from later releases of Ubuntu that are not available for an earlier release The apt-get utility selects packages from repositories it searches based on the categories specified in the sources.list file. 1038 Appendix C Keeping the System Up-to-Date sources.list: Specifies Repositories for apt-get to Search The /etc/apt/sources.list file specifies the repositories apt-get searches when you ask it to find or install a package. You must modify the sources.list file to enable apt-get to download software from nondefault repositories. Typically, you do not need to configure apt-get to install supported software. Each line in sources.list describes one repository and has the following syntax: type URI repository category-list where type is deb for packages of executable files and deb-src for packages of source files; URI is the location of the repository, usually cdrom or an Internet address that starts with http://; repository is the name of the repository apt-get is to search; and category-list is a SPACE-separated list of categories apt-get selects packages from. Comments begin with a pound sign (#) anywhere on a line and end at the end of the line. The following line from sources.list on an Ubuntu system causes apt-get to search the Precise archive located at us.archive.ubuntu.com/ubuntu for deb packages that contain executable files. It accepts packages that are categorized as main, restricted, and multiverse: deb http://us.archive.ubuntu.com/ubuntu/ precise main restricted multiverse Default repositories Replace deb with deb-src to search for packages of source files in the same manner. Use the apt-get source command to download source packages. The default sources.list file on an Ubuntu system includes repositories such as precise (Precise as originally released), precise-updates (major bug fixes after the release of Precise), precise-security (critical security-related updates), and precise-backports (newer, less-tested software that is not reviewed by the Ubuntu security team). Some repositories in sources.list might be commented out. Remove the leading pound sign (#) on the lines of the repositories you want to enable. After you modify sources.list, give the command apt-get update (page 1035) to update the local package indexes. The next line, which was added to sources.list, enables apt-get to search a third-party repository (but see the following security tip): deb http://download.skype.com/linux/repos/debian/ stable non-free In this case, the repository is named stable and the category is non-free. Although the code is compiled for Debian, it runs on Ubuntu, as is frequently the case. Use repositories you trust security There are many repositories of software packages. Be selective in which repositories you add to sources.list: When you add a repository, you are trusting the person who runs the repository not to put malicious software in packages you might download. In addition, unsupported packages might conflict with other packages or cause upgrades to fail. BitTorrent The easiest way to download a BitTorrent file is to click the torrent file object in a Web browser or in the Nautilus File Browser; this action opens a GUI. This section BitTorrent 1039 describes how BitTorrent works and explains how to download a BitTorrent file from the command line. The BitTorrent protocol implements a hybrid client/server and P2P (page 1180) file transfer mechanism. BitTorrent efficiently distributes large amounts of static data, such as the Fedora/RHEL installation ISO images. It can replace protocols such as anonymous FTP, where client authentication is not required. Each BitTorrent client that downloads a file provides additional bandwidth for uploading the file, thereby reducing the load on the initial source. In general, BitTorrent downloads proceed faster than FTP downloads. Unlike protocols such as FTP, BitTorrent groups multiple files into a single package: a BitTorrent file. Tracker, peer, seed, and swarm BitTorrent, like other P2P systems, does not use a dedicated server. Instead, the functions of a server are performed by the tracker, peers, and seeds. The tracker is a server that allows clients to communicate with each other. Each client—called a peer when it has downloaded part of the BitTorrent file and a seed after it has downloaded the entire BitTorrent file—acts as an additional source for the BitTorrent file. Peers and seeds are collectively called a swarm. As with a P2P network, a member of a swarm uploads to other clients the sections of the BitTorrent file it has already downloaded. There is nothing special about a seed: It can be removed at any time after the torrent is available for download from other seeds. The torrent The first step in downloading a BitTorrent file is to locate or acquire the torrent, a file with the filename extension of .torrent. A torrent contains pertinent information (metadata) about the BitTorrent file to be downloaded, such as its size and the location of the tracker. You can obtain a torrent by accessing its URI, or you can acquire it via the Web, an email attachment, or other means. The BitTorrent client can then connect to the tracker to learn the locations of other members of the swarm it can download the BitTorrent file from. Manners After you have downloaded a BitTorrent file (the local system has become a seed), it is good manners to allow the local BitTorrent client to continue to run so peers (clients that have not downloaded the entire BitTorrent file) can upload at least as much information as you have downloaded. Prerequisites If necessary, use yum (page 1028) or apt-get (page 1035) to install the rtorrent package. Using BitTorrent The rtorrent utility is a textual BitTorrent client that provides a pseudographical interface. When you have a torrent, enter a command such as the following, substituting the name of the torrent you want to download for the Fedora torrent in the example: $ rtorrent Fedora-17-i686-Live-Desktop.torrent A torrent can download one or more files; the torrent specifies the filename(s) for the downloaded file(s) and, in the case of a multifile torrent, a directory to hold the files. The torrent in the preceding command saves the BitTorrent files in the Fedora-17i686-Live-Desktop directory in the working directory. 1040 Appendix C Keeping the System Up-to-Date The following example shows rtorrent running. Depending on the speed of the Internet connection and the number of seeds, downloading a large BitTorrent file can take from hours to days. *** rTorrent 0.8.9/0.12.9 - guava:7739 *** [View: main] Fedora-17-i686-Live-Desktop 479.9 / 646.0 MB Rate: 0.0 / 1187.6 KB Uploaded: ... [Throttle off/off KB] [Rate 2.2/1193.5 KB] [Port: 6977] [U 0/0] You can abort the download by pressing CONTROL-Q. The download will automatically resume from where it left off when you download the same torrent to the same location again. Make sure you have enough room to download the torrent caution Some torrents are huge. Make sure the partition you are working in has enough room to hold the BitTorrent file you are downloading. Enter the command rtorrent –help or see the rtorrent man page for a list of options. Visit libtorrent.rakshasa.no/wiki/RTorrentUserGuide for more complete documentation. One of the most useful options is –upload_rate, which limits how much bandwidth in kilobytes per second the swarm can use while downloading the torrent from the local system (upstream bandwidth). By default there is no limit to the bandwidth the swarm can use. The following command prevents BitTorrent from using more than 100 kilobytes per second of upstream bandwidth: $ rtorrent -o upload_rate=100 Fedora-17-i686-Live-Desktop.torrent BitTorrent usually allows higher download rates for members of the swarm that upload more data, so it is to your advantage to increase this value if you have spare bandwidth. You need to leave enough free upstream bandwidth for the acknowledgment packets from your download to get through or the download will be very slow. The value assigned to max_uploads specifies the number of concurrent uploads that rtorrent will permit. By default there is no limit. If you are downloading over a very slow connection, try setting upload_rate=3 and max_uploads=2. The name of the file or directory BitTorrent saves a file or files in is specified by the torrent. You can specify a different file or directory name by using the directory=directory option. Mac OS X Notes In This Appendix Open Directory . . . . . . . . . . . . . . 1042 /etc/group. . . . . . . . . . . . . . . . . . 1042 Filesystems . . . . . . . . . . . . . . . . . 1043 Extended Attributes . . . . . . . . . . 1044 Activating the Terminal META Key. . . . . . . . . . . . . . . . . . 1049 Startup Files . . . . . . . . . . . . . . . . 1050 Remote Logins. . . . . . . . . . . . . . . 1050 Many Utilities Do Not Respect Apple Human Interface Guidelines . . . . . . . . . . . . . . . . 1050 Installing Xcode and MacPorts . . 1050 Mac OS X Implementation of Linux Features . . . . . . . . . . . . . 1051 This appendix is a brief guide to Mac OS X features that differ from those of Linux. See Chapter 1 for a history of UNIX, Linux, and OS X. AppendixD D 1042 Appendix D Mac OS X Notes Open Directory Open Directory replaced the monolithic NetInfo database in Mac OS X version 10.5. The ni* utilities, including nireport and nidump, were replaced by dscl (page 788). The work the lookupd daemon did is now done by the DirectoryService daemon. For information on the local system, Mac OS X now uses a hierarchy of small *.plist XML files called nodes, which are stored in the /var/db/dslocal hierarchy. Many of these files are human readable. On Mac OS X Server, a networkwide Open Directory is based on OpenLDAP, Kerberos, and the SASL-based Password Server. /etc/passwd OS X uses the /etc/passwd file only when it boots into single-user mode. Because OS X does not use the user information stored in the /etc/passwd file while it is in multiuser mode, examples in this book that use this file do not run under OS X. In most cases you must use dscl to extract information from the passwd database. As an example, the whos2 program (following) is a version of whos (page 434) that runs under OS X 10.5 and above. whos2 For each command-line argument, whos2 searches the passwd database. Inside the for loop, the dscl (page 788) –readall command lists all usernames and user IDs on the local system. The command looks for the RealName keys in the /Users directory and displays the associated values. The four-line sed (page 653) command deletes lines containing only a dash (/^–$/ d), finds lines containing only RealName: (/^RealName:$/), reads and appends the next line (N; page 656), and substitutes a semicolon for the NEWLINE (s/\n/; /). Finally grep (page 833) selects lines containing the ID the program was called with. $ cat whos2 #!/bin/bash if [ $# -eq 0 ] then echo "Usage: whos id..." 1>&2 exit 1 fi for id do dscl . -readall /Users RealName | sed '/^-$/ d /^RealName:$/N;s/\n// N s/\n/; /' | grep -i "$id" done /etc/group Groups (page 1070) allow users to share files or programs without allowing all system users access to them. This scheme is useful if several users are working with files that are not public. In a Linux system, the /etc/group file associates one or more usernames with each group (number). Mac OS X 10.5 and above rely on Open Directory (page 1042) to provide group information. OS X 10.4 and below use NetInfo for this task. Filesystems 1043 Filesystems Mac OS X supports several types of filesystems. The most commonly used is the default HFS+ (Hierarchical File System Plus). Introduced under Mac OS 8.1 to support larger disks, HFS+ is an enhanced version of the original HFS filesystem. When it was introduced in OS 3.0 in 1986, HFS stood in contrast to the then-standard MFS (Macintosh File System). Some applications will not run correctly under filesystems other than HFS+. HFS+ is different from Linux filesystems, but because Linux offers a standardized filesystem interface, the differences are generally transparent to the user. The most significant differences are the following: • HFS+ is case preserving but not case sensitive (below). • HFS+ allows a user working with root privileges to create hard links to directories. • HFS+ files have extended attributes (next page). OS X also supports Linux filesystems such as UFS (UNIX File System), which it inherited from Berkeley UNIX. Other supported filesystems include FAT16 and FAT32, which were originally used in DOS and Windows. These filesystems are typically used on removable media, such as digital camera storage cards. OS X also supports NTFS (Windows) and exFAT (USB flash drives), ISO9660 (CD-ROMs), and UDF (DVDs). Nondisk Filesystems Mac OS X supports filesystems that do not correspond to a physical volume such as a partition on a hard disk or a CD-ROM. A .dmg (disk image) file is one example (to mount a disk image file so you can access the files it holds, double-click it in the Finder). Another example is a virtual filesystem in which the filenames represent kernel functionality. For example, the /Network virtual filesystem holds a directory tree representing the local network; most network filesystem protocols use this filesystem. Also, you can use the Disk Utility to create encrypted or password-protected .iso (ISO9660; page 1076) image files you can mount. Finally you can use hdiutil to mount and manipulate disk images. Case Sensitivity The default OS X filesystem, HFS+, is, by default, case preserving but not case sensitive. Case preserving means the filesystem remembers which capitalization you used when you created a file and displays the filename with that capitalization, but accepts any capitalization to refer to the file. Thus, under HFS+, files named JANUARY, January, and january refer to the same file. You can set up an HFS+ filesystem to be case sensitive. 1044 Appendix D Mac OS X Notes /Volumes Startup disk Each physical hard disk in a Mac OS X system is typically divided into one or more logical sections (partitions or volumes). Each OS X system has a volume, called the startup disk, that the system boots from. By default, the startup disk is named Macintosh HD. When the system boots, Macintosh HD is mounted as the root directory (/). The root directory always has a subdirectory named Volumes. For historical reasons, every volume other than the startup disk is mounted in the /Volumes directory. For example, the pathname of a disk labeled MyPhotos is /Volumes/MyPhotos. To simplify access, /Volumes holds a symbolic link (page 112) to the startup disk (the root directory). Assuming that the startup disk is named Macintosh HD, /Volumes/Macintosh HD is a symbolic link to /. $ ls -ld '/Volumes/Macintosh HD' lrwxr-xr-x 1 root admin 1 Jul 12 19:03 /Volumes/Macintosh HD -> / The system automatically mounts all volumes in /Volumes. The desktop presented by the Finder contains an icon for each mounted disk and for files in the user’s Desktop directory, making the /Volumes directory the effective root (top level or /) for the Finder and other applications. The Finder presents the pre-UNIX Mac OS view of the filesystem. Extended Attributes Mac OS X files have extended attributes that include file forks (e.g., data, resource), file attributes, and access control lists (ACLs). Not all utilities recognize extended attributes. Resource forks and file attributes are native to the HFS+ filesystem. Mac OS X emulates resource forks and file attributes on other types of filesystems. These features are not found on Linux filesystems. Some utilities do not process extended attributes caution Some third-party programs and most utilities under Mac OS X 10.3 and below do not support extended attributes. Some utilities require options to process extended attributes. See also the tip “Redirection does not support resource forks” on the next page. File Forks Forks are segments of a single file, each holding different content. Mac OS has supported file forks since its inception. The most widely used are the data fork and the resource fork. Data forks The data fork is equivalent to a Linux file. It consists of an unstructured stream of bytes. Many files have only a data fork. Resource forks The resource fork holds a database that allows random access to resources, each of which has a type and identifier number. Modifying, adding, or deleting one resource has no effect on the other resources. Resource forks can store different types of Extended Attributes 1045 information—some critical and some merely useful. For example, a Mac OS graphics program might save a smaller copy of an image (a preview or thumbnail) in a resource fork. Also, text files created with the BBEdit text editor store display size and tab stop information in the file’s resource fork. Because this program is a text editor and not a word processor, this type of information cannot be stored in the data fork. Losing a resource fork that holds a thumbnail or display information is at most an inconvenience because, for example, the thumbnail can be regenerated from the original image. Other programs store more important data in the resource fork or create files that contain only a resource fork, which holds all the file’s information. In this case, losing the resource fork is just as bad as losing the data fork. It might even be worse: You might not notice a resource fork is missing because the data fork is still there. You might notice the loss only when you try to use the file. Linux utilities might not preserve resource forks A Linux filesystem associates each filename with a single stream of bytes. Forks do not fit this model. As a result, many Linux utilities do not process resource forks, but instead process only data forks. Most of the file utilities provided with OS X 10.4 and above support resource forks. Many third-party utilities do not support resource forks. Pipelines do not work with resource forks caution Pipelines work with the data fork of a file only; they do not work with resource forks. Redirection does not support resource forks caution When you redirect input to or output from a utility (page 135), only the information in the data fork is redirected. Information in the resource fork is not redirected. For example, the following command copies the data fork of song.ogg only: $ cat song.ogg > song.bak.ogg If you are unsure whether a program supports resource forks, test it before relying on this functionality. Make a backup copy using the Finder, ditto, or, under version 10.4 and above, cp. Then check whether the copied file works correctly. Table D-1 lists utilities that manipulate resource forks. These utilities are installed with the Developer Tools package. Consult the respective man pages for detailed descriptions. Table D-1 Utilities that manipulate resource forks Utility Function Rez Creates a resource fork from a resource description file DeRez Creates a resource description file from a resource fork RezWack Converts a file with forks to a single flat file that holds all the forks UnRezWack Converts a flat file that holds forks to a file with forks SplitForks Converts a file with forks to multiple files, each holding a fork 1046 Appendix D Mac OS X Notes File Attributes A file contains data. Information about the file is called metadata. Examples of metadata include ownership information and permissions for a file. Mac OS X stores more metadata than Linux stores. This section discusses file attributes, the metadata stored by Mac OS X. File attributes include the following: • Attribute flags • Type codes • Creator codes The same caveats apply to file attributes as apply to resource forks: Some utilities might not preserve them when copying or manipulating files, and many utilities do not recognize attribute flags. Loss of file attributes is particularly common when you move files to non-Macintosh systems. Attribute Flags Attribute flags (Table D-2) hold information that is distinct from Linux permissions. Two attribute flags are the invisible flag (which keeps a file from being displayed in file dialogs and in the Finder) and the locked flag (which keeps a file from being modified). Flags are generally ignored by command-line utilities and affect only GUI applications. The ls utility lists files that have the invisible flag set. See GetFileInfo on page 831 and SetFile on page 938 for information on displaying, setting, and clearing attribute flags. Table D-2 Attribute flags Flag Can the flag be set on a directory? Description a No Alias file b No Has bundle c Yes Custom icon l No Locked t No Stationery pad file v Yes Invisible Creator Codes and Type Codes Type codes and creator codes are 32-bit integers, generally displayed as 4-character words, that specify the type and creator of a file. The creator code specifies the application that created a document, not the user who created it. An application can Extended Attributes 1047 typically open documents that have the same creator code as the application, but cannot open documents with other creator codes. Creator codes Creator codes generally correspond to vendors or product lines. The operating system—and in particular the Finder—uses creator codes to group related files. For example, the AppleWorks application file and its document files have the creator code BOBO. In the file browser in an application’s open file dialog box, grayed-out files generally indicate files that have a different creator code from the application and that the application cannot open. The open utility (page 901) also looks at creator codes when it opens a file. Type codes Type codes indicate how a file is used. The type code APPL indicates an application—a program used to open other files. For example, an AppleWorks word processor document has the type code CWWP, a mnemonic for Claris Works Word Processor (AppleWorks used to be named Claris Works). While a few type codes, such as the application type, are standardized, vendors are free to invent new type codes for their programs. A single application might support several document types. For example, AppleWorks supports spreadsheet files (CWSS), word processor documents (CWWP), and drawings (CWGR). Similarly a graphics program will typically support many document types. Data files used by an application might also have the same creator code as the application, even though they cannot be opened as documents. For example, the dictionary used by a spell checker cannot be opened as a document but typically uses the same creator code as the spell checker. Filename extensions Filename extensions (page 85) can substitute for type and creator codes. For example, an AppleWorks word processor document is saved with an extension of .cwk. Also, if open cannot determine which application to use to open a file using the file’s creator code, it reverts to using the file’s filename extension. ACLs ACLs (access control lists) are discussed on page 104. This section discusses how to enable and work with ACLs under Mac OS X. chmod: Working with ACLs Under Mac OS X, you can use chmod (page 741) to create, modify, and delete ACL rules. A chmod command used for this purpose has the following syntax: chmod option[# n] "who allow|deny permission-list" file-list where option is +a (add rule), –a (remove rule), or =a (change rule); n is an optional rule number; who is a username, user ID number, group name, or group ID number; permission-list is one or more comma-separated file access permissions selected from read, write, append, and execute; and file-list is the list of files the rule is applied to. The quotation marks are required. In the first example, Sam adds a rule that grants Helen read and write access to the memo file: $ chmod +a "helen allow read,write" memo 1048 Appendix D Mac OS X Notes The chmod utility displays an error message if you forget the quotation marks: $ chmod +a max deny write memo chmod: Invalid entry format -- expected allow or deny ls –l An ls –l command displays a plus sign (+) following the permissions for a file that has an ACL (see also Figure 4-12 on page 98): $ ls -l memo -rw-r--r--+ 1 sam ls –le staff 1680 May 12 13:30 memo Under Mac OS X, the ls –e option displays ACL rules: $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:helen allow read,write For each rule, the –e option displays from left to right the rule number followed by a colon, user: or group: followed by the name of the user or group the rule pertains to, allow or deny depending on whether the rule grants or denies permissions, and a list of the permissions that are granted or denied. The kernel processes multiple rules in an ACL in rule number order but it does not necessarily number rules in the order you enter them. In general, rules denying permissions come before rules granting permissions. You can override the default order by assigning a number to a rule using the +a# n syntax. The following command bumps rule 0 from the previous example to rule 1 and replaces it with a rule that denies Max write access to memo: $ chmod +a# 0 "max deny write" memo $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:max deny write 1: user:helen allow read,write There are two ways to remove access rules. First, you can specify a rule by using its number: $ chmod -a# 1 memo $ ls -le memo -rw-r--r--+ 1 sam staff 0: user:max deny write 1680 May 12 13:30 memo Second, you can specify a rule by giving the string you used when you added it: $ chmod –a "max deny write" memo $ ls -le memo -rw-r--r-- 1 sam staff 1680 May 12 13:30 memo After you remove the last rule, memo does not have an ACL. (There is no + in the line ls –le displays.) When you specify an ACL operation that chmod cannot complete, it displays an error message: $ chmod -a# 0 memo chmod: No ACL present Activating the Terminal META Key 1049 In the next example, Sam restores Helen’s read and write permission to memo: $ chmod +a "helen allow read,write" memo $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:helen allow read,write Sam then removes the write permissions he just gave Helen. When you remove one of several access permissions from a rule, the other permissions remain unchanged: $ chmod -a "helen allow write" memo $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:helen allow read The next example shows that chmod inserts rules in an ACL in a default order. Even though Sam added the allow rule before the deny rule, the allow rule appears first. The rule controlling permission for Helen, which was added before either of the other rules, appears last. $ chmod +a "max allow read" memo $ chmod +a "max deny read" memo $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:max deny read 1: user:max allow read 2: user:helen allow read You can replace a rule using the =a syntax. In the following example, Sam changes rule 2 to grant Helen read and write permission to memo: $ chmod =a# 2 "helen allow read,write" memo $ ls -le memo -rw-r--r--+ 1 sam staff 1680 May 12 13:30 memo 0: user:max deny read 1: user:max allow read 2: user:helen allow read,write Activating the Terminal META Key Using the Mac OS X Terminal utility, you can make the OPTION (or ALT) key work as the META key. From the Terminal utility File menu, select Window Settings to display the Terminal Inspector window. This window provides a drop-down menu of properties you can change. Select Keyboard, check the box labeled Use option key as meta key, and click Use Settings as Defaults. This procedure causes the OPTION key (on Mac keyboards) or the ALT key (on PC keyboards) to function as the META key while you are using the Terminal utility. 1050 Appendix D Mac OS X Notes Startup Files Both Mac OS and application documentation refer to startup files as configuration files or preference files. Many Mac OS X applications store startup files in the Library and Library/Preferences subdirectories of a user’s home directory, which are created when an account is set up. Most of these files do not have invisible filenames. Use launchctl (page 850) to modify these files. Remote Logins By default, Mac OS X does not allow remote logins. You can enable remote logins via ssh by enabling remote login in the Services tab of the Sharing pane of the Preferences window. OS X does not support telnet logins. Many Utilities Do Not Respect Apple Human Interface Guidelines By default, rm under Mac OS X does not act according to the Apple Human Interface Guidelines, which state that an operation should either be reversible or ask for confirmation. In general, Mac OS X command-line utilities do not ask whether you are sure of what you are doing. Installing Xcode and MacPorts Xcode is free software supplied by Apple. See developer.apple.com/xcode for more information. To download and install Xcode, open the App Store, search for and click xcode, then follow the instructions for installation. Once you have installed Xcode, you might want to install Command Line Tools; you must install this package if you want to use MacPorts (next). To install Command Line Tools, select Xcode Preferences, click Downloads, and click Install adjacent to Command Line Tools. The MacPorts (www.macports.org) project is an “open-source community initiative to design an easy-to-use system for compiling, installing, and upgrading either commandline, X11, or Aqua based open-source software on the Mac OS X operating system.” MacPorts includes more than 14,000 tools; see www.macports.org/ports.php for a list. To work with MacPorts on a Macintosh, you must first install Xcode as explained above. You can then install MacPorts by visiting www.macports.org/install.php and following the instructions in the section titled Mac OS X Package (.pkg) Installer. Mac OS X Implementation of Linux Features Installing gawk 1051 Once you have installed MacPorts, you can install individual packages using the port utility. For example, you can install gawk using the following command: $ sudo port install gawk ---> Computing dependencies for gawk ---> Fetching archive for gawk ... ---> No broken files found. Installing MySQL Once you have installed MacPorts, you can install MySQL using the following command: $ sudo port install mysql51 ---> Dependencies to be installed: mysql_select zlib ---> Fetching archive for mysql_select ... ---> No broken files found. Alternatively, you can visit www.mysql.com/downloads/mysql (the MySQL Community Server), select Mac OS X as your platform, and download the appropriate DMG Archive or Compressed TAR Archive file. These files include a launchd startup item and a System Preferences pane for MySQL; see the MySQL documentation at dev.mysql.com/doc/refman/5.6/en/macosx-installation.html. Mac OS X Implementation of Linux Features Table D-3 explains how some Linux features are implemented under Mac OS X. Table D-3 Mac OS X implementation of Linux features Linux feature OS X implementation /bin/sh The /bin/sh file is a copy of bash (/bin/bash); it is not a link to /bin/bash as it is on most Linux systems. The original Bourne Shell does not exist under OS X. When you call bash using the command sh, bash tries to mimic the behavior of the original Bourne Shell as closely as possible. Core files By default, OS X does not save core files. When core files are saved, they are kept in /cores, not in the working directory. Developer tools The Developer Tools package is not installed by default. Development APIs Mac OS X uses two software development APIs: Cocoa and BSD UNIX. Dynamic linker ld.so The OS X dynamic linker is dyld, not ld.so. ELF and a.out binary The primary binary format under OS X is Mach-O, not ELF or a.out. formats 1052 Appendix D Mac OS X Notes Table D-3 Mac OS X implementation of Linux features (continued) Linux feature OS X implementation /etc/group Mac OS X uses Open Directory (page 1042), not /etc/group, to store group information. /etc/passwd Mac OS X uses Open Directory (page 1042), not /etc/passwd, to store user accounts. Filesystem structure /etc/fstab Instead of filesystems being mounted according to settings in /etc/fstab, filesystems are automatically mounted in the /Volumes directory (page 1044). finger By default, Mac OS X disables remote finger support. LD_LIBRARY_PATH The variable used to control the dynamic linker is DYLD_LIBRARY_PATH, not LD_LIBRARY_PATH. Shared libraries *.so OS X shared library files are named *.dylib, not *.so. They are typically distributed in .framework bundles that include resources and headers. System databases Some system databases, such as passwd and group, are stored by Open Directory (page 1042), not in the /etc directory (page 96). You can work with Open Directory databases using the dscl utility (page 788). vi editor As with many Linux distributions, when you call the vi editor, Mac OS X 10.3 and above run vim (page 159) because the file /usr/bin/vi is a link to /usr/bin/vim. Glossary All entries marked with FOLDOC are based on definitions in the Free On-Line Dictionary of Computing (foldoc.org), Denis Howe, editor. Used with permission. G lossary Glossary 1054 Glossary 10.0.0.0 See private address space on page 1086. 172.16.0.0 See private address space on page 1086. 192.168.0.0 See private address space on page 1086. 802.11 A family of specifications developed by IEEE for wireless LAN technology, including 802.11 (1–2 megabits per second), 802.11a (54 megabits per second), 802.11b (11 megabits per second), and 802.11g (54 megabits per second). absolute pathname A pathname that starts with the root directory (represented by / ). An absolute pathname locates a file without regard to the working directory. access In computer jargon, a verb meaning to use, read from, or write to. To access a file means to read from or write to the file. Access Control See ACL. List access permissions Permission to read from, write to, or execute a file. If you have write access permission to a file (usually just called write permission), you can write to the file. Also access privilege. ACL Access Control List. A system that performs a function similar to file permissions but with much finer-grain control. active window On a desktop, the window that receives the characters you type on the keyboard. Same as focus, desktop (page 1069). address mask See network mask on page 1082. alias A mechanism of a shell that enables you to define new commands. alphanumeric character One of the characters, either uppercase or lowercase, from A to Z and 0 to 9, inclusive. ambiguous file A reference to a file that does not necessarily specify any one file but can be used to reference specify a group of files. The shell expands an ambiguous file reference into a list of filenames. Special characters represent single characters (?), strings of zero or more characters (*), and character classes ([]) within ambiguous file references. An ambiguous file reference is a type of regular expression (page 1089). angle bracket A left angle bracket (<) and a right angle bracket (>). The shell uses < to redirect a command’s standard input to come from a file and > to redirect the standard output. The shell uses the characters << to signify the start of a Here document and >> to append output to a file. animate When referring to a window action, means that the action is slowed down so the user can view it. For example, when you minimize a window, it can disappear all at once (not animated) or it can slowly telescope into the panel so you can get a visual feel for what is happening (animated). Glossary 1055 anti-aliasing Adding gray pixels at the edge of a diagonal line to get rid of the jagged appearance and thereby make the line look smoother. Anti-aliasing sometimes makes type on a screen look better and sometimes worse; it works best on small and large fonts and is less effective on fonts from 8 to 15 points. See also subpixel hinting (page 1096). API Application program interface. The interface (calling conventions) by which an application program accesses an operating system and other services. An API is defined at the source code level and provides a level of abstraction between the application and the kernel (or other privileged utilities) to ensure the portability of the code.FOLDOC append To add something to the end of something else. To append text to a file means to add the text to the end of the file. The shell uses >> to append a command’s output to a file. applet A small program that runs within a larger program. Examples are Java applets that run in a browser and panel applets that run from a desktop panel. archive A file that contains a group of smaller, typically related, files. Also, to create such a file. The tar and cpio utilities can create and read archives. argument A number, letter, filename, or another string that gives some information to a command and is passed to the command when it is called. A command-line argument is anything on a command line following the command name that is passed to the command. An option is a kind of argument. arithmetic expression A group of numbers, operators, and parentheses that can be evaluated. When you evaluate an arithmetic expression, you end up with a number. The Bourne Again Shell uses the expr command to evaluate arithmetic expressions; the TC Shell uses @, and the Z Shell uses let. ARP Address Resolution Protocol. A method for finding a host’s MAC address (page 1078; also Ethernet address) from its IP address. ARP allows the IP address to be independent of the MAC address.FOLDOC array An arrangement of elements (numbers or strings of characters) in one or more dimensions. The Bourne Again, TC, and Z Shells and awk/mawk/gawk can store and process arrays. ASCII American Standard Code for Information Interchange. A code that uses seven bits to represent both graphic (letters, numbers, and punctuation) and CONTROL characters. You can represent textual information, including program source code and English text, in ASCII code. Because ASCII is a standard, it is frequently used when exchanging information between computers. See the file /usr/pub/ascii or give the command man ascii to see a list of ASCII codes. Extensions of the ASCII character set use eight bits. The seven-bit set is common; the eight-bit extensions are still coming into popular use. The eighth bit is sometimes referred to as the metabit. 1056 Glossary ASCII terminal A textual terminal. Contrast with graphical display (page 1070). ASP Application service provider. A company that provides applications over the Internet. asynchronous event An event that does not occur regularly or synchronously with another event. Linux system signals are asynchronous; they can occur at any time because they can be initiated by any number of nonregular events. attachment A file that is attached to, but is not part of, a piece of email. Attachments are frequently opened by programs (including your Internet browser) that are called by your mail program so you might not be aware that they are not an integral part of an email message. authentication The verification of the identity of a person or process. In a communication system, authentication verifies that a message really comes from its stated source, like the signature on a (paper) letter. The most common form of authentication is typing a user name (which might be widely known or easily guessable) and a corresponding password that is presumed to be known only to the individual being authenticated. Other methods of authentication on a Linux system include the /etc/passwd and /etc/shadow files, LDAP, biometrics, Kerberos 5, and SMB.FOLDOC automatic mounting A way of demand mounting directories from remote hosts without having them hard configured into /etc/fstab. Also called automounting. avoided An object, such as a panel, that should not normally be covered by another object, such as a window. back door A security hole deliberately left in place by the designers or maintainers of a system. The motivation for creating such holes is not always sinister; some operating systems, for example, come out of the box with privileged accounts intended for use by field service technicians or the vendor’s maintenance programmers. Ken Thompson’s 1983 Turing Award lecture to the ACM revealed the existence, in early UNIX versions, of a back door that might be the most fiendishly clever security hack of all time. The C compiler contained code that would recognize when the login command was being recompiled and would insert some code recognizing a password chosen by Thompson, giving him entry to the system whether or not an account had been created for him. Normally such a back door could be removed by removing it from the source code for the compiler and recompiling the compiler. But to recompile the compiler, you have to use the compiler, so Thompson arranged that the compiler would recognize when it was compiling a version of itself. It would insert into the recompiled compiler the code to insert into the recompiled login the code to allow Thompson entry, and, of course, the code to recognize itself and do the whole thing again the next time around. Having done this once, he was then able to recompile the compiler from the original sources; the hack perpetuated itself invisibly, leaving the back door in place and active but with no trace in the sources. Sometimes called a wormhole. Also trap door.FOLDOC Glossary 1057 background process A process that is not run in the foreground. Also called a detached process, a background process is initiated by a command line that ends with an ampersand (&) control operator. You do not have to wait for a background process to run to completion before giving the shell additional commands. If you have job control, you can move background processes to the foreground, and vice versa. basename The name of a file that, in contrast with a pathname, does not mention any of the directories containing the file (and therefore does not contain any slashes [/]). For example, hosts is the basename of /etc/hosts.FOLDOC baud The maximum information-carrying capacity of a communication channel in symbols (state transitions or level transitions) per second. It coincides with bits per second only for two-level modulation with no framing or stop bits. A symbol is a unique state of the communication channel, distinguishable by the receiver from all other possible states. For example, it might be one of two voltage levels on a wire for a direct digital connection, or it might be the phase or frequency of a carrier.FOLDOC Baud is often mistakenly used as a synonym for bits per second. baud rate Transmission speed. Usually used to measure terminal or modem speed. Common baud rates range from 110 to 38,400 baud. See baud. Berkeley UNIX One of the two major versions of the UNIX operating system. Berkeley UNIX was developed at the University of California at Berkeley by the Computer Systems Research Group and is often referred to as BSD (Berkeley Software Distribution). BIND Berkeley Internet Name Domain. An implementation of a DNS (page 1066) server developed and distributed by the University of California at Berkeley. BIOS Basic Input/Output System. On PCs, EEPROM-based (page 1067) system software that provides the lowest-level interface to peripheral devices and controls the first stage of the bootstrap (page 1058) process, which loads the operating system. The BIOS can be stored in different types of memory. The memory must be nonvolatile so that it remembers the system settings even when the system is turned off. Also BIOS ROM. bit The smallest piece of information a computer can handle. A bit is a binary digit: either 1 or 0 (on or off ). bit depth Same as color depth (page 1061). bit-mapped display A graphical display device in which each pixel on the screen is controlled by an underlying representation of zeros and ones. blank character Either a SPACE or a TAB character, also called whitespace (page 1101). In some contexts, NEWLINE s are considered blank characters. block A section of a disk or tape (usually 1,024 bytes long but shorter or longer on some systems) that is written at one time. 1058 Glossary block device A disk or tape drive. A block device stores information in blocks of characters and is represented by a block device (block special) file. Contrast with character device (page 1060). block number Disk and tape blocks are numbered so that Linux can keep track of the data on the device. blocking factor The number of logical blocks that make up a physical block on a tape or disk. When you write 1K logical blocks to a tape with a physical block size of 30K, the blocking factor is 30. Boolean The type of an expression with two possible values: true and false. Also, a variable of Boolean type or a function with Boolean arguments or result. The most common Boolean functions are AND, OR, and NOT.FOLDOC boot See bootstrap. boot loader A very small program that takes its place in the bootstrap process that brings a computer from off or reset to a fully functional state. bootstrap Derived from “Pull oneself up by one’s own bootstraps,” the incremental process of loading an operating system kernel into memory and starting it running without any outside assistance. Frequently shortened to boot. Bourne Again Shell bash. GNU’s command interpreter for UNIX, bash is a POSIX-compliant shell with full Bourne Shell syntax and some C Shell commands built in. The Bourne Again Shell supports emacs-style command-line editing, job control, functions, and online help.FOLDOC Bourne Shell sh. This UNIX command processor was developed by Steve Bourne at AT&T Bell Laboratories. brace A left brace ( {) and a right brace (}). Braces have special meanings to the shell. bracket A square bracket (page 1095) or an angle bracket (page 1054). branch In a tree structure, a branch connects nodes, leaves, and the root. The Linux filesystem hierarchy is often conceptualized as an upside-down tree. The branches connect files and directories. In a source code control system, such as SCCS or RCS, a branch occurs when a revision is made to a file and is not included in subsequent revisions to the file. bridge Typically a two-port device originally used for extending networks at layer 2 (data link) of the Internet Protocol model. broadcast A transmission to multiple, unspecified recipients. On Ethernet, a broadcast packet is a special type of multicast (page 1081) packet; it has a special address indicating that all devices that receive it should process it. Broadcast traffic exists at several layers of the network stack, including Ethernet and IP. Broadcast traffic has one source but indeterminate destinations (all hosts on the local network). Glossary 1059 broadcast address The last address on a subnet (usually 255), reserved as shorthand to mean all hosts. broadcast network A type of network, such as Ethernet, in which any system can transmit information at any time, and all systems receive every message. BSD See Berkeley UNIX on page 1057. buffer An area of memory that stores data until it can be used. When you write information to a file on a disk, Linux stores the information in a disk buffer until there is enough to write to the disk or until the disk is ready to receive the information. bug An unwanted and unintended program property, especially one that causes the program to malfunction.FOLDOC builtin (command) A command that is built into a shell. Each of the three major shells—the Bourne Again, TC, and Z Shells—has its own set of builtins. byte A component in the machine data hierarchy, usually larger than a bit and smaller than a word; now most often eight bits and the smallest addressable unit of storage. A byte typically holds one character.FOLDOC bytecode A binary file containing an executable program consisting of a sequence of (op code, data) pairs. A bytecode program is interpreted by a bytecode interpreter; Python uses the Python virtual machine. The advantage of bytecode is that it can be run on any processor for which there is a bytecode interpreter. Compiled code (machine code) can run only on the processor for which it was compiled.FOLDOC C programming language A modern systems language that has high-level features for efficient, modular programming as well as lower-level features that make it suitable for use as a systems programming language. It is machine independent so that carefully written C programs can be easily transported to run on different machines. Most of the Linux operating system is written in C, and Linux provides an ideal environment for programming in C. C Shell csh. The C Shell command processor was developed by Bill Joy for BSD UNIX. It was named for the C programming language because its programming constructs are similar to those of C. See shell on page 1092. cable modem A type of modem that allows you to access the Internet by using your cable television connection. cache Holding recently accessed data, a small, fast memory designed to speed up subsequent access to the same data. Most often applied to processor-memory access but also used for a local copy of data accessible over a network, from a hard disk, and so on.FOLDOC calling environment A list of variables and their values that is made available to a called program. Refer to “Executing a Command” on page 325. 1060 Glossary cascading stylesheet See CSS on page 1064. cascading windows An arrangement of windows such that they overlap, generally with at least part of the title bar visible. Opposite of tiled windows (page 1098). case sensitive Able to distinguish between uppercase and lowercase characters. Unless you set the ignorecase parameter, vim performs case-sensitive searches. The grep utility performs case-sensitive searches unless you use the –i option. catenate To join sequentially, or end to end. The Linux cat utility catenates files: It displays them one after the other. Also concatenate. chain loading The technique used by a boot loader to load unsupported operating systems. Used for loading such operating systems as DOS or Windows, it works by loading another boot loader. characterbased A program, utility, or interface that works only with ASCII (page 1055) characters. This set of characters includes some simple graphics, such as lines and corners, and can display colored characters. It cannot display true graphics. Contrast with GUI (page 1071). characterA terminal that displays only characters and very limited graphics. See characterbased terminal based. character class In a regular expression, a group of characters that defines which characters can occupy a single character position. A character-class definition is usually surrounded by square brackets. The character class defined by [abcr] represents a character position that can be occupied by a, b, c, or r. Also list operator. In GNU documentation and POSIX, used to refer to sets of characters with a common characteristic, denoted by the notation [:class:]; for example, [:upper:] denotes the set of uppercase letters. This book uses the term character class as explained under “Brackets” on page 1013. character device A terminal, printer, or modem. A character device stores or displays characters one at a time. A character device is represented by a character device (character special) file. Contrast with block device (page 1058). check box A GUI widget, usually the outline of a square box with an adjacent caption, that a user can click to display or remove a tick (page 1098). When the box holds a tick, the option described by the caption is on or true. Also tick box. checksum A computed value that depends on the contents of a block of data and is transmitted or stored along with the data to detect corruption of the data. The receiving system recomputes the checksum based on the received data and compares this value with the one sent with the data. If the two values are the same, the receiver has some confidence that the data was received correctly. Glossary 1061 The checksum might be 8, 16, or 32 bits, or some other size. It is computed by summing the bytes or words of the data block, ignoring overflow. The checksum might be negated so that the total of the data words plus the checksum is zero. Internet packets use a 32-bit checksum.FOLDOC child process A process that is created by another process, the parent process. Every process is a child process except for the first process, which is started when Linux begins execution. When you run a command from the shell, the shell spawns a child process to run the command. See process on page 1087. CIDR Classless Inter-Domain Routing. A scheme that allocates blocks of Internet addresses in a way that allows summarization into a smaller number of routing table entries. A CIDR block is a block of Internet addresses assigned to an ISP by the Internic.FOLDOC CIFS Common Internet File System. An Internet filesystem protocol based on SMB (page 1093). CIFS runs on top of TCP/IP, uses DNS, and is optimized to support slower dial-up Internet connections. SMB and CIFS are used interchangeably.FOLDOC CIPE Crypto IP Encapsulation (page 1067). This protocol (page 1087) tunnels (page 1099) IP packets within encrypted UDP (page 1099) packets, is lightweight and simple, and works over dynamic addresses, NAT (page 1082), and SOCKS (page 1094) proxies (page 1087). cipher (cypher) A cryptographic system that uses a key to transpose/substitute characters within a message, the key itself, or the message. ciphertext Text that is encrypted. Contrast with plaintext (page 1086). Classless Inter-Domain Routing See CIDR on page 1061. cleartext Text that is not encrypted. Also plaintext. Contrast with ciphertext. CLI Command-line interface. See also character-based (page 1060). Also textual interface. client A computer or program that requests one or more services from a server. CODEC Coder/decoder or compressor/decompressor. A hardware and/or software technology that codes and decodes data. MPEG is a popular CODEC for computer video. color depth The number of bits used to generate a pixel—usually 8, 16, 24, or 32. The color depth is directly related to the number of colors that can be generated. The number of colors that can be generated is 2 raised to the color-depth power. Thus a 24-bit video adapter can generate about 16.7 million colors. color quality See color depth. 1062 Glossary combo box A combination of a drop-down list (page 1067) and text box (page 1097). You can enter text in a combo box. Or, you can click a combo box, cause it to expand and display a static list of selections for you to choose from. command What you give the shell in response to a prompt. When you give the shell a command, it executes a utility, another program, a builtin command, or a shell script. Utilities are often referred to as commands. When you are using an interactive utility, such as vim or mail, you use commands that are appropriate to that utility. command line A line containing instructions and arguments that executes a command. This term usually refers to a line you enter in response to a shell prompt on a character-based terminal or terminal emulator. command substitution Replacing a command with its output. The shells perform command substitution when you enclose a command between $( and ) or between a pair of back ticks (‘‘), also called grave accent marks. component architecture A notion in object-oriented programming where “components” of a program are completely generic. Instead of having a specialized set of methods and fields, they have generic methods through which the component can advertise the functionality it supports to the system into which it is loaded. This strategy enables completely dynamic loading of objects. JavaBeans is an example of a component architecture.FOLDOC concatenate See catenate on page 1060. condition code See exit status on page 1068. connectionoriented protocol A type of transport layer data communication service that allows a host to send data in a continuous stream to another host. The transport service guarantees that all data will be delivered to the other end in the same order as sent and without duplication. Communication proceeds through three well-defined phases: connection establishment, data transfer, and connection release. The most common example is TCP (page 1097). Also called connection-based protocol and stream-oriented protocol. Contrast with connectionless protocol and datagram (page 1064).FOLDOC connectionless The data communication method in which communication occurs between hosts protocol with no previous setup. Packets sent between two hosts might take different routes. There is no guarantee that packets will arrive as transmitted or even that they will arrive at the destination at all. UDP (page 1099) is a connectionless protocol. Also called packet switching. Contrast with circuit switching and connection-oriented protocol.FOLDOC console The main system terminal, usually directly connected to the computer and the one that receives system error messages. Also system console and console terminal. console terminal See console. Glossary 1063 control character A character that is not a graphic character, such as a letter, number, or punctuation mark. Such characters are called control characters because they frequently act to control a peripheral device. RETURN and FORMFEED are control characters that control a terminal or printer. The word CONTROL is shown in this book in THIS FONT because it is a key that appears on most terminal keyboards. Control characters are represented by ASCII codes less than 32 (decimal). See also nonprinting character on page 1083. control operator A token that performs a control function. The Bourne Again Shell uses the following symbols as control operators: ||, &, &&, ;, ;;, (, ), |, |&, and RETURN. control structure A statement used to change the order of execution of commands in a shell script or other program. Each shell provides control structures (for example, if and while) as well as other commands that alter the order of execution (for example, exec). Also control flow commands. cookie Data stored on a client system by a server. The client system browser sends the cookie back to the server each time it accesses that server. For example, a catalog shopping service might store a cookie on your system when you place your first order. When you return to the site, it knows who you are and can supply your name and address for subsequent orders. You might consider cookies to be an invasion of privacy. CPU Central processing unit. The part of a computer that controls all the other parts. The CPU includes the control unit and the arithmetic and logic unit (ALU). The control unit fetches instructions from memory and decodes them to produce signals that control the other parts of the computer. These signals can cause data to be transferred between memory and ALU or peripherals to perform input or output. A CPU that is housed on a single chip is called a microprocessor. Also processor and central processor. cracker An individual who attempts to gain unauthorized access to a computer system. These individuals are often malicious and have many means at their disposal for breaking into a system. Contrast with hacker (page 1071).FOLDOC crash The system suddenly and unexpectedly stops or fails. Derived from the action of the hard disk heads on the surface of the disk when the air gap between the two collapses. cryptography The practice and study of encryption and decryption—encoding data so that only a specific individual or machine can decode it. A system for encrypting and decrypting data is a cryptosystem. Such systems usually rely on an algorithm for combining the original data (plaintext) with one or more keys—numbers or strings of characters known only to the sender and/or recipient. The resulting output is called ciphertext (page 1061). The security of a cryptosystem usually depends on the secrecy of keys rather than on the supposed secrecy of an algorithm. Because a strong cryptosystem has a large range of keys, it is not possible to try all of them. Ciphertext appears random to standard statistical tests and resists known methods for breaking codes.FOLDOC 1064 Glossary .cshrc file In your home directory, a file that the TC Shell executes each time you invoke a new TC Shell. You can use this file to establish variables and aliases. CSS Cascading stylesheet. Describes how documents are presented on screen and in print. Attaching a stylesheet to a structured document can affect the way it looks without adding new HTML (or other) tags and without giving up device independence. Also stylesheet. current (process, line, character, directory, event, etc.) The item that is immediately available, working, or being used. The current process is the program you are running, the current line or character is the one the cursor is on, and the current directory is the working directory. cursor A small lighted rectangle, underscore, or vertical bar that appears on a terminal screen and indicates where the next character will appear. Differs from the mouse pointer (page 1081). daemon A program that is not invoked explicitly but lies dormant, waiting for some condition(s) to occur. The perpetrator of the condition need not be aware that a daemon is lurking (although often a program will commit an action only because it knows that it will implicitly invoke a daemon). From the mythological meaning, later rationalized as the acronym Disk And Execution MONitor.FOLDOC data structure A particular format for storing, organizing, working with, and retrieving data. Frequently, data structures are designed to work with specific algorithms that facilitate these tasks. Common data structures include trees, files, records, tables, arrays, etc. datagram A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network. UDP (page 1099) uses datagrams; IP (page 1075) uses packets (page 1084). Packets are indivisible at the network layer; datagrams are not.FOLDOC See also frame (page 1070). dataless A computer, usually a workstation, that uses a local disk to boot a copy of the operating system and access system files but does not use a local disk to store user files. dbm A standard, simple database manager. Implemented as gdbm (GNU database manager), it uses hashes to speed searching. The most common versions of the dbm database are dbm, ndbm, and gdbm. DDoS attack Distributed denial of service attack. A DoS attack (page 1066) from many systems that do not belong to the perpetrator of the attack. debug To correct a program by removing its bugs (that is, errors). default Something that is selected without being explicitly specified. For example, when used without an argument, ls displays a list of the files in the working directory by default. Glossary 1065 delta A set of changes made to a file that has been encoded by the Source Code Control System (SCCS). denial of service See DoS attack on page 1066. dereference To access the thing to which a pointer points, that is, to follow the pointer. At first sight the word dereference might be thought to mean “to cause to stop referring,” but its meaning is well established in jargon. See page 115.FOLDOC When speaking of symbolic links, dereference means to follow the link rather than working with the reference to the link. For example, the –L or ––dereference option causes ls to list the entry that a symbolic link points to rather than the symbolic link (the reference) itself. desktop A collection of windows, toolbars, icons, and buttons, some or all of which appear on your display. A desktop comprises one or more workspaces (page 1102). desktop manager An icon- and menu-based user interface to system services that allows you to run applications and use the filesystem without using the system’s command-line interface. detached process See background process on page 1057. device A disk drive, printer, terminal, plotter, or other input/output unit that can be attached to the computer. Short for peripheral device. device driver Part of the Linux kernel that controls a device, such as a terminal, disk drive, or printer. device file A file that represents a device. Also special file. device filename The pathname of a device file. All Linux systems have two kinds of device files: block and character device files. Linux also has FIFOs (named pipes) and sockets. Device files are traditionally located in the /dev directory. device number See major device number (page 1079) and minor device number (page 1080). DHCP Dynamic Host Configuration Protocol. A protocol that dynamically allocates IP addresses to computers on a LAN.FOLDOC dialog box In a GUI, a special window, usually without a titlebar, that displays information. Some dialog boxes accept a response from the user. directory Short for directory file. A file that contains a list of other files. directory hierarchy A directory, called the root of the directory hierarchy, and all the directory and ordinary files below it (its children). directory service A structured repository of information on people and resources within an organization, facilitating management and communication.FOLDOC 1066 Glossary disk partition See partition on page 1085. diskless A computer, usually a workstation, that has no disk and must contact another computer (a server) to boot a copy of the operating system and access the necessary system files. distributed computing A style of computing in which tasks or services are performed by a network of cooperating systems, some of which might be specialized. DMZ Demilitarized zone. A host or small network that is a neutral zone between a LAN and the Internet. It can serve Web pages and other data to the Internet and allow local systems access to the Internet while preventing LAN access to unauthorized Internet users. Even if a DMZ is compromised, it holds no data that is private and none that cannot be easily reproduced. DNS Domain Name Service. A distributed service that manages the correspondence of full hostnames (those that include a domain name) to IP addresses and other system characteristics. DNS domain name See domain name. document object model See DOM. DOM Document Object Model. A platform-/language-independent interface that enables a program to update the content, structure, and style of a document dynamically. The changes can then be made part of the displayed document. Go to www.w3.org/DOM for more information. domain name A name associated with an organization, or part of an organization, to help identify systems uniquely. Technically, the part of the FQDN (page 1069) to the right of the leftmost period. Domain names are assigned hierarchically. The domain berkeley.edu refers to the University of California at Berkeley, for example; it is part of the toplevel edu (education) domain. Also DNS domain name. Different than NIS domain name (page 1083). Domain Name See DNS. Service door An evolving filesystem-based RPC (page 1091) mechanism. DoS attack Denial of service attack. An attack that attempts to make the target host or network unusable by flooding it with spurious traffic. DPMS Display Power Management Signaling. A standard that can extend the life of CRT monitors and conserve energy. DPMS supports four modes for a monitor: Normal, Standby (power supply on, monitor ready to come to display images almost instantly), Suspend (power supply off, monitor takes up to ten seconds to display an image), and Off. Glossary 1067 drag The motion part of drag-and-drop. drag-and-drop To move an object from one position or application to another within a GUI. To drag an object, the user clicks a mouse button (typically the left one) while the mouse pointer hovers (page 1073) over the object. Then, without releasing the mouse button, the user drags the object, which stays attached to the mouse pointer, to a different location. The user can then drop the object at the new location by releasing the mouse button. drop-down list A widget (page 1101) that displays a static list for a user to choose from. When the list is not active, it appears as text in a box, displaying the single selected entry. When a user clicks the box, a list appears; the user can move the mouse cursor to select an entry from the list. Different from a list box (page 1078). druid In role-playing games, a character that represents a magical user. Fedora/RHEL uses the term druid at the ends of names of programs that guide you through a task-driven chain of steps. Other operating systems call these types of programs wizards. DSA Digital Signature Algorithm. A public key cipher used to generate digital signatures. DSL Digital Subscriber Line/Loop. Provides high-speed digital communication over a specialized, conditioned telephone line. See also xDSL (page 1103). Dynamic Host See DHCP on page 1065. Configuration Protocol editor A utility, such as vim or emacs, that creates and modifies text files. EEPROM Electrically erasable, programmable, readonly memory. A PROM (page 1087) that can be written to. effective user ID The user ID that a process appears to have; usually the same as the user ID. For example, while you are running a setuid program, the effective user ID of the process running the program is that of the owner of the program. element One thing; usually a basic part of a group of things. An element of a numeric array is one of the numbers stored in the array. emoticon See smiley on page 1093. encapsulation See tunneling on page 1099. environment See calling environment on page 1059. environment A variable that was in the environment the shell was called with or a variable that variable (shell) has been marked for export via the environment. An environment variable is available to children of the shell; also called a global variable. As explained on page 305, you can use the export builtin, or the –x option to the declare builtin, to mark a variable for export via the environment. 1068 Glossary EOF End of file. EPROM Erasable programmable readonly memory. A PROM (page 1087) that can be written to by applying a higher than normal voltage. escape See quote on page 1088. Ethernet A type of LAN (page 1077) capable of transfer rates as high as 1,000 megabits per second. Ethernet address See MAC address on page 1078. event An occurrence, or happening, of significance to a task or program—for example, the completion of an asynchronous input/output operation, such as a keypress or mouse click.FOLDOC exabyte 260 bytes or about 1018 bytes. See also large number (page 1077). exit status The status returned by a process; either successful (usually 0) or unsuccessful (usually 1). exploit A security hole or an instance of taking advantage of a security hole.FOLDOC expression See logical expression (page 1078) and arithmetic expression (page 1055). extranet A network extension for a subset of users (such as students at a particular school or engineers working for the same company). An extranet limits access to private information even though it travels on the public Internet. failsafe session A session that allows you to log in on a minimal desktop in case your standard login does not work well enough to allow you to log in to fix a login problem. FDDI Fiber Distributed Data Interface. A type of LAN (page 1077) designed to transport data at the rate of 100 million bits per second over fiberoptic cable. file A collection of related information referred to with a filename and frequently stored on a disk. Text files typically contain memos, reports, messages, program source code, lists, or manuscripts. Binary or executable files contain utilities or programs that you can run. Refer to “Directory Files and Ordinary Files” on page 83. filename The name of a file. A filename refers to a file. filename completion Automatic completion of a filename after you specify a unique prefix. filename extension The part of a filename following a period. filename generation What occurs when the shell expands ambiguous file references. See ambiguous file reference on page 1054. filesystem A data structure (page 1064) that usually resides on part of a disk. All Linux systems have a root filesystem, and many have other filesystems. Each filesystem is composed Glossary 1069 of some number of blocks, depending on the size of the disk partition that has been assigned to the filesystem. Each filesystem has a control block, named the superblock, that contains information about the filesystem. The other blocks in a filesystem are inodes, which contain control information about individual files, and data blocks, which contain the information in the files. filling A variant of maximizing in which window edges are pushed out as far as they can go without overlapping another window. filter A command that can take its input from standard input and send its output to standard output. A filter transforms the input stream of data and sends it to standard output. A pipe symbol (|) usually connects a filter’s input to standard output of one command, and a second pipe symbol connects the filter’s output to standard input of another command. The grep and sort utilities are commonly used as filters. firewall A device for policy-based traffic management used to keep a network secure. A firewall can be implemented in a single router that filters out unwanted packets, or it can rely on a combination of routers, proxy servers, and other devices. Firewalls are widely used to give users access to the Internet in a secure fashion and to separate a company’s public WWW server from its internal network. They are also employed to keep internal network segments more secure. Recently the term has come to be defined more loosely to include a simple packet filter running on an endpoint machine. See also proxy server on page 1088. firmware Software built into a computer, often in ROM (page 1090). Might be used as part of the bootstrap (page 1058) procedure. focus, desktop On a desktop, the window that is active. The window with the desktop focus receives the characters you type on the keyboard. Same as active window (page 1054). footer The part of a format that goes at the bottom (or foot) of a page. Contrast with header (page 1071). foreground process When you run a command in the foreground, the shell waits for the command to finish before giving you another prompt. You must wait for a foreground process to run to completion before you can give the shell another command. If you have job control, you can move background processes to the foreground, and vice versa. See job control on page 1076. Contrast with background process (page 1057). fork To create a process. When one process creates another process, it forks a process. Also spawn. FQDN Fully qualified domain name. The full name of a system, consisting of its hostname and its domain name, including the top-level domain. Technically the name that gethostbyname(2) returns for the host named by gethostname(2). For example, speedy is a hostname and speedy.example.com is an FQDN. An FQDN is sufficient to determine a unique Internet address for a machine on the Internet.FOLDOC 1070 Glossary frame A data link layer packet that contains, in addition to data, the header and trailer information required by the physical medium. Network layer packets are encapsulated to become frames.FOLDOC See also datagram (page 1064) and packet (page 1084). free list In a filesystem, the list of blocks that are available for use. Information about the free list is kept in the superblock of the filesystem. free space The portion of a hard disk that is not within a partition. A new hard disk has no partitions and contains all free space. full duplex The ability to receive and transmit data simultaneously. A network switch (page 1083) is typically a full-duplex device. Contrast with half-duplex (page 1071). fully qualified domain name See FQDN. function See shell function on page 1092. gateway A generic term for a computer or a special device connected to more than one dissimilar type of network to pass data between them. Unlike a router, a gateway often must convert the information into a different format before passing it on. The historical usage of gateway to designate a router is deprecated. GCOS See GECOS. GECOS General Electric Comprehensive Operating System. For historical reasons, the user information field in the /etc/passwd file is called the GECOS field. Also GCOS. gibibyte Giga binary byte. A unit of storage equal to 230 bytes = 1,073,741,824 bytes = 1024 mebibytes (page 1079). Abbreviated as GiB. Contrast with gigabyte. gigabyte A unit of storage equal to 109 bytes. Sometimes used in place of gibibyte. Abbreviated as GB. See also large number on page 1077. global variable See environment variable (shell) on page 1067. (shell) glyph A symbol that communicates a specific piece of information nonverbally. A smiley (page 1093) is a glyph. GMT Greenwich Mean Time. See UTC on page 1100. graphical display A bitmapped monitor that can display graphical images. Contrast with ASCII terminal (page 1056). graphical user interface See GUI. group (of users) A collection of users. Groups are used as a basis for determining file access permissions. If you are not the owner of a file and you belong to the group the file is assigned to, you are subject to the group access permissions for the file. A user can simultaneously belong to several groups. Glossary 1071 group (of windows) A way to identify similar windows so they can be displayed and acted on similarly. Typically windows started by a given application belong to the same group. group ID A unique number that identifies a set of users. It is stored in the password and group databases (/etc/passwd and /etc/group files or their NIS equivalents). The group database associates group IDs with group names. Also GID. GUI Graphical user interface. A GUI provides a way to interact with a computer system by choosing items from menus or manipulating pictures drawn on a display screen instead of by typing command lines. Under Linux, the X Window System provides a graphical display and mouse/keyboard input. GNOME and KDE are two popular desktop managers that run under X. Contrast with character-based (page 1060). hacker A person who enjoys exploring the details of programmable systems and learning how to stretch their capabilities, as opposed to users, who prefer to learn only the minimum necessary. One who programs enthusiastically (even obsessively) or who enjoys programming rather than just theorizing about programming.FOLDOC Contrast with cracker (page 1063). half-duplex A half-duplex device can only receive or transmit at a given moment; it cannot do both. A hub (page 1073) is typically a half-duplex device. Contrast with full duplex (page 1070). hard link A directory entry that contains the filename and inode number for a file. The inode number identifies the location of control information for the file on the disk, which in turn identifies the location of the file’s contents on the disk. Every file has at least one hard link, which locates the file in a directory. When you remove the last hard link to a file, you can no longer access the file. See link (page 1077) and symbolic link (page 1097). hash A string that is generated from another string. See one-way hash function on page 1084. When used for security, a hash can prove, almost to a certainty, that a message has not been tampered with during transmission: The sender generates a hash of a message, encrypts the message and hash, and sends the encrypted message and hash to the recipient. The recipient decrypts the message and hash, generates a second hash from the message, and compares the hash that the sender generated to the new hash. When they are the same, the message has probably not been tampered with. Hashed versions of passwords can be used to authenticate users. A hash can also be used to create an index called a hash table. Also hash value. hash table An index created from hashes of the items to be indexed. The hash function makes it highly unlikely that two items will create the same hash. To look up an item in the index, create a hash of the item and search for the hash. Because the hash is typically shorter than the item, the search is more efficient. header When you are formatting a document, the header goes at the top, or head, of a page. In electronic mail the header identifies who sent the message, when it was sent, what the subject of the message is, and so forth. 1072 Glossary Here document A shell script that takes its input from the file that contains the script. hesiod The nameserver of project Athena. Hesiod is a name service library that is derived from BIND (page 1057) and leverages a DNS infrastructure. heterogeneous Consisting of different parts. A heterogeneous network includes systems produced by different manufacturers and/or running different operating systems. hexadecimal number A base 16 number. Hexadecimal (or hex) numbers are composed of the hexadecimal digits 0–9 and A–F. Computers use bits (page 1057) to represent data. A group of 4 bits represents 16 possible values, 0 through F. A hexadecimal digit provides a convenient way to represent a group of 4 bits. See Table G-1 below. hidden filename A filename that starts with a period. These filenames are called hidden because the ls utility does not normally list them. Use the –a option of ls to list all files, including those with hidden filenames. The shell does not expand a leading asterisk (*) in an ambiguous file reference to match files with hidden filenames. Also hidden file, invisible file. hierarchy An organization with a few things, or thing—one at the top—and with several things below each other thing. An inverted tree structure. Examples in computing include a file tree where each directory might contain files or other directories, a hierarchical network, and a class hierarchy in object-oriented programming.FOLDOC Refer to “The Hierarchical Filesystem” on page 82. Table G-1 Decimal, octal, and hexadecimal numbers Decimal Octal Hex 1 1 1 2 2 3 4 Decimal Octal Hex 17 21 11 2 18 22 12 3 3 19 23 13 4 4 20 24 14 5 5 5 21 25 15 6 6 6 31 37 1F 7 7 7 32 40 20 8 10 8 33 41 21 9 11 9 64 100 40 10 12 A 96 140 60 11 13 B 100 144 64 12 14 C 128 200 80 13 15 D 254 376 FE 14 16 E 255 377 FF 15 17 F 256 400 100 16 20 10 257 401 101 Glossary 1073 history A shell mechanism that enables you to modify and reexecute recent commands. home directory The directory that is the working directory when you first log in. The pathname of this directory is stored in the HOME shell variable. hover To leave the mouse pointer stationary for a moment over an object. In many cases hovering displays a tooltip (page 1098). HTML Hypertext Markup Language. A hypertext document format used on the World Wide Web. Tags, which are embedded in the text, consist of a less than sign (<), a directive, zero or more parameters, and a greater than sign (>). Matched pairs of directives, such as and , delimit text that is to appear in a special place or style.FOLDOC For more information on HTML, visit www.htmlhelp.com/faq/ html/all.html. HTTP Hypertext Transfer Protocol. The client/server TCP/IP protocol used on the World Wide Web for the exchange of HTML documents. hub A multiport repeater. A hub rebroadcasts all packets it receives on all ports. This term is frequently used to refer to small hubs and switches, regardless of the device’s intelligence. It is a generic term for a layer 2 shared-media networking device. Today the term hub is sometimes used to refer to small intelligent devices, although that was not its original meaning. Contrast with network switch (page 1083). hypertext A collection of documents/nodes containing (usually highlighted or underlined) cross-references or links, which, with the aid of an interactive browser program, allow the reader to move easily from one document to another.FOLDOC Hypertext Markup Language See HTML. Hypertext Transfer Protocol See HTTP. i18n Internationalization. An abbreviation of the word internationalization: the letter i followed by 18 letters (nternationalizatio) followed by the letter n. i/o device Input/output device. See device on page 1065. IANA Internet Assigned Numbers Authority. A group that maintains a database of all permanent, registered system services (www.iana.org). ICMP Internet Control Message Protocol. A type of network packet that carries only messages, no data. The most common ICMP packet is the echo request which is sent by the ping utility. icon In a GUI, a small picture representing a file, directory, action, program, and so on. When you click an icon, an action, such as opening a window and starting a program or displaying a directory or Web site, takes place. From miniature religious statues.FOLDOC 1074 Glossary iconify To change a window into an icon. Contrast with restore (page 1090). ignored window A state in which a window has no decoration and therefore no buttons or titlebar to control it with. indentation See indention. indention The blank space between the margin and the beginning of a line that is set in from the margin. inode A data structure (page 1064) that contains information about a file. An inode for a file contains the file’s length, the times the file was last accessed and modified, the time the inode was last modified, owner and group IDs, access privileges, number of links, and pointers to the data blocks that contain the file itself. Each directory entry associates a filename with an inode. Although a single file might have several filenames (one for each link), it has only one inode. input Information that is fed to a program from a terminal or other file. See standard input on page 1095. installation A computer at a specific location. Some aspects of the Linux system are installation dependent. Also site. interactive A program that allows ongoing dialog with the user. When you give commands in response to shell prompts, you are using the shell interactively. Also, when you give commands to utilities, such as vim and mail, you are using the utilities interactively. interface The meeting point of two subsystems. When two programs work together, their interface includes every aspect of either program that the other deals with. The user interface (page 1100) of a program includes every program aspect the user comes into contact with: the syntax and semantics involved in invoking the program, the input and output of the program, and its error and informational messages. The shell and each of the utilities and built-in commands have a user interface. International See ISO. Organization for Standardization internet A large network that encompasses other, smaller networks. Internet The largest internet in the world. The Internet (uppercase “I”) is a multilevel hierarchy composed of backbone networks (ARPANET, NSFNET, MILNET, and others), midlevel networks, and stub networks. These include commercial (.com or .co), university (.ac or .edu), research (.org or .net), and military (.mil) networks and span many different physical networks around the world with various protocols, including the Internet Protocol (IP). Outside the United States, country code domains are popular (.us, .es, .mx, .de, and so forth), although you will see them used within the United States as well. Glossary 1075 Internet Protocol See IP. Internet service provider See ISP. intranet An inhouse network designed to serve a group of people such as a corporation or school. The general public on the Internet does not have access to an intranet. invisible file See hidden filename on page 1072. IP Internet Protocol. The network layer for TCP/IP. IP is a best-effort, packet-switching, connectionless protocol (page 1062) that provides packet routing, fragmentation, and reassembly through the data link layer. IPv4 is slowly giving way to IPv6.FOLDOC IP address Internet Protocol address. A four-part address associated with a particular network connection for a system using the Internet Protocol (IP). A system that is attached to multiple networks that use the IP will have a different IP address for each network interface. IP multicast See multicast on page 1081. IP spoofing A technique used to gain unauthorized access to a computer. The would-be intruder sends messages to the target machine. These messages contain an IP address indicating that the messages are coming from a trusted host. The target machine responds to the messages, giving the intruder (privileged) access to the target. IPC Interprocess communication. A method to communicate specific information between programs. IPv4 IP version 4. See IP and IPv6. IPv6 IP version 6. The next generation of Internet Protocol, which provides a much larger address space (2128 bits versus 232 bits for IPv4) that is designed to accommodate the rapidly growing number of Internet addressable devices. IPv6 also has built-in autoconfiguration, enhanced security, better multicast support, and many other features. iSCSI Internet Small Computer System Interface. A network storage protocol that encapsulates SCSI data into TCP packets. You can use this protocol to connect a system to a storage array using an Ethernet connection. ISDN Integrated Services Digital Network. A set of communications standards that allows a single pair of digital or standard telephone wires to carry voice, data, and video at a rate of 64 kilobits per second. ISO International Organization for Standardization. A voluntary, nontreaty organization founded in 1946. It is responsible for creating international standards in many areas, including computers and communications. Its members are the national standards organizations of 89 countries, including the American National Standards Institute.FOLDOC 1076 Glossary ISO9660 The ISO standard defining a filesystem for CD-ROMs. ISP Internet service provider. Provides Internet access to its customers. job control A facility that enables you to move commands from the foreground to the background, and vice versa. Job control enables you to stop commands temporarily. journaling filesystem A filesystem that maintains a noncached log file, or journal, which records all transactions involving the filesystem. When a transaction is complete, it is marked as complete in the log file. The log file results in greatly reduced time spent recovering a filesystem after a crash, making it particularly valuable in systems where high availability is an issue. JPEG Joint Photographic Experts Group. This committee designed the standard image-compression algorithm. JPEG is intended for compressing either full-color or gray-scale digital images of natural, real-world scenes and does not work as well on nonrealistic images, such as cartoons or line drawings. Filename extensions: .jpg, .jpeg.FOLDOC justify To expand a line of type in the process of formatting text. A justified line has even margins. A line is justified by increasing the space between words and sometimes between letters on the line. Kerberos An MIT-developed security system that authenticates users and machines. It does not provide authorization to services or databases; it establishes identity at logon, which is used throughout the session. Once you are authenticated, you can open as many terminals, windows, services, or other network accesses as you like until your session expires. kernel The part of the operating system that allocates machine resources, including memory, disk space, and CPU (page 1063) cycles, to all other programs that run on a computer. The kernel includes the low-level hardware interfaces (drivers) and manages processes (page 1087), the means by which Linux executes programs. The kernel is the part of the Linux system that Linus Torvalds originally wrote (see the beginning of Chapter 1). kernelspace The part of memory (RAM) where the kernel resides. Code running in kernelspace has full access to hardware and all other processes in memory. See the KernelAnalysis-HOWTO. key binding A keyboard key is said to be bound to the action that results from pressing it. Typically keys are bound to the letters that appear on the keycaps: When you press A, an A appears on the screen. Key binding usually refers to what happens when you press a combination of keys, one of which is CONTROL, ALT, META, or SHIFT, or when you press a series of keys, the first of which is typically ESCAPE. keyboard A hardware input device consisting of a number of mechanical buttons (keys) that the user presses to input characters to a computer. By default a keyboard is connected to standard input of a shell.FOLDOC Glossary 1077 kilo- In the binary system, the prefix kilo- multiplies by 210 (i.e., 1,024). Kilobit and kilobyte are common uses of this prefix. Abbreviated as k. Korn Shell ksh. A command processor, developed by David Korn at AT&T Bell Laboratories, that is compatible with the Bourne Shell but includes many extensions. See also shell on page 1092. l10n Localization. An abbreviation of the word localization: the letter l followed by 10 letters (ocalizatio) followed by the letter n. LAN Local area network. A network that connects computers within a localized area (such as a single site, building, or department). large number Visit mathworld.wolfram.com/LargeNumber.html for a comprehensive list. LDAP Lightweight Directory Access Protocol. A simple protocol for accessing online directory services. LDAP is a lightweight alternative to the X.500 Directory Access Protocol (DAP). It can be used to access information about people, system users, network devices, email directories, and systems. In some cases, it can be used as an alternative for services such as NIS. Given a name, many mail clients can use LDAP to discover the corresponding email address. See directory service on page 1065. leaf In a tree structure, the end of a branch that cannot support other branches. When the Linux filesystem hierarchy is conceptualized as a tree, files that are not directories are leaves. See node on page 1083. least privilege, concept of Mistakes made by a user working with root privileges can be much more devastating than those made by an ordinary user. When you are working on the computer, especially when you are working as the system administrator, always perform any task using the least privilege possible. If you can perform a task logged in as an ordinary user, do so. If you must work with root privileges, do as much as you can as an ordinary user, log in as root or give an su or sudo command so you are working with root privileges, do as much of the task that has to be done with root privileges, and revert to being an ordinary user as soon as you can. Because you are more likely to make a mistake when you are rushing, this concept becomes more important when you have less time to apply it. Lightweight Directory Access Protocol See LDAP. link A pointer to a file. Two kinds of links exist: hard links (page 1071) and symbolic links (page 1097) also called soft links. A hard link associates a filename with a place on the disk where the content of the file is located. A symbolic link associates a filename with the pathname of a hard link to a file. Linux-PAM See PAM on page 1085. 1078 Glossary LinuxSee PAM on page 1085. Pluggable Authentication Modules list box A widget (page 1101) that displays a static list for a user to choose from. The list appears as multiple lines with a scrollbar (page 1092) if needed. The user can scroll the list and select an entry. Different from a drop-down list (page 1067). loadable kernel module See loadable module. loadable module A portion of the operating system that controls a special device and that can be loaded automatically into a running kernel as needed to access that device. local area network See LAN on page 1077. locale The language; date, time, and currency formats; character sets; and so forth that pertain to a geopolitical place or area. For example, en_US specifies English as spoken in the United States and dollars; en_UK specifies English as spoken in the United Kingdom and pounds. See the locale man page in section 5 of the system manual for more information. Also the locale utility. log in To gain access to a computer system by responding correctly to the login: and Password: prompts. Also log on, login. log out To end your session by exiting from your login shell. Also log off. logical expression A collection of strings separated by logical operators (>, >=, =, !=, <=, and <) that can be evaluated as true or false. Also Boolean (page 1058) expression. .login file A file in a user’s home directory that the TC Shell executes when you log in. You can use this file to set environment variables and to run commands that you want executed at the beginning of each session. login name See username on page 1100. login shell The shell you are using when you log in. The login shell can fork other processes that can run other shells, utilities, and programs. See page 278. .logout file A file in a user’s home directory that the TC Shell executes when you log out, assuming that the TC Shell is your login shell. You can put in the .logout file commands that you want run each time you log out. MAC address Media Access Control address. The unique hardware address of a device connected to a shared network medium. Each network adapter has a globally unique MAC address that it stores in ROM. MAC addresses are 6 bytes long, enabling 2566 (about 300 trillion) possible addresses or 65,536 addresses for each possible IPv4 address. A MAC address performs the same role for Ethernet that an IP address performs for TCP/IP: It provides a unique way to identify a host. Also called an Ethernet address. Glossary 1079 machine collating sequence The sequence in which the computer orders characters. The machine collating sequence affects the outcome of sorts and other procedures that put lists in alphabetical order. Many computers use ASCII codes so their machine collating sequences correspond to the ordering of the ASCII codes for characters. macro A single instruction that a program replaces by several (usually more complex) instructions. The C compiler recognizes macros, which are defined using a #define instruction to the preprocessor. magic number A magic number, which occurs in the first 512 bytes of a binary file, is a 1-, 2-, or 4-byte numeric value or character string that uniquely identifies the type of file (much like a DOS 3-character filename extension). See /usr/share/magic and the magic man page for more information. main memory Random access memory (RAM), an integral part of the computer. Although disk storage is sometimes referred to as memory, it is never referred to as main memory. major device number A number assigned to a class of devices, such as terminals, printers, or disk drives. Using the ls utility with the –l option to list the contents of the /dev directory displays the major and minor device numbers of many devices (as major, minor). MAN Metropolitan area network. A network that connects computers and LANs (page 1077) at multiple sites in a small regional area, such as a city. man-in-themiddle A security attack wherein the attacker interposes himself between two subjects. For example, if Max and Zach try to carry on a secure email exchange over a network, Max first sends Zach his public key. However, suppose Mr. X sits between Max and Zach on the network and intercepts Max’s public key. Mr. X then sends his public key to Zach. Zach then sends his public key to Max, but once again Mr. X intercepts it and substitutes his public key and sends that to Max. Without some kind of active protection (a piece of shared information), Mr. X, the man-in-the-middle, can decrypt all traffic between Max and Zach, reencrypt it, and send it on to the other party. masquerade To appear to come from one domain or IP address when actually coming from another. Said of a packet (iptables) or message (sendmail/exim4). See also NAT on page 1082. MD5 Message Digest 5. A one-way hash function (page 1084). The SHA1 (page 1092) algorithm has supplanted MD5 in many applications. MDA Mail delivery agent. One of the three components of a mail system; the other two are the MTA (page 1081) and MUA (page 1081). An MDA accepts inbound mail from an MTA and delivers it to a local user. mebibyte Mega binary byte. A unit of storage equal to 220 bytes = 1,048,576 bytes = 1,024 kibibytes. Abbreviated as MiB. Contrast with megabyte. megabyte A unit of storage equal to 106 bytes. Sometimes used in place of mebibyte. Abbreviated as MB. 1080 Glossary memory See RAM on page 1088. menu A list from which the user might select an operation to be performed. This selection is often made with a mouse or other pointing device under a GUI but might also be controlled from the keyboard. Very convenient for beginners, menus show which commands are available and facilitate experimenting with a new program, often reducing the need for user documentation. Experienced users usually prefer keyboard commands, especially for frequently used operations, because they are faster to use.FOLDOC merge To combine two ordered lists so that the resulting list is still in order. The sort utility can merge files. META key On the keyboard, a key that is labeled META or ALT. Use this key as you would the SHIFT key. While holding it down, press another key. The emacs editor makes extensive use of the META key. metacharacter A character that has a special meaning to the shell or another program in a particular context. Metacharacters are used in the ambiguous file references recognized by the shell and in the regular expressions recognized by several utilities. You must quote a metacharacter if you want to use it without invoking its special meaning. See regular character (page 1089) and special character (page 1094). metadata Data about data. In data processing, metadata is definitional data that provides information about, or documentation of, other data managed within an application or environment. For example, metadata can document data about data elements or attributes (name, size, data type, and so on), records or data structures (page 1064) (length, fields, columns, and so on), and data itself (where it is located, how it is associated, who owns it, and so on). Metadata can include descriptive information about the context, quality and condition, or characteristics of the data.FOLDOC metropolitan area network See MAN on page 1079. MIME Multipurpose Internet Mail Extension. Originally used to describe how specific types of files that were attached to email were to be handled. Today MIME types describe how a file is to be opened or worked with, based on its contents, determined by its magic number (page 1079), and filename extension. An example of a MIME type is image/jpeg: The MIME group is image and the MIME subtype is jpeg. Many MIME groups exist, including application, audio, image, inode, message, text, and video. minimize See iconify on page 1074. minor device number A number assigned to a specific device within a class of devices. See major device number on page 1079. Glossary 1081 modem Modulator/demodulator. A peripheral device that modulates digital data into analog data for transmission over a voice-grade telephone line. Another modem demodulates the data at the other end. module See loadable module on page 1078. mount To make a filesystem accessible to system users. When a filesystem is not mounted, you cannot read from or write to files it contains. mount point A directory that you mount a local or remote filesystem on. mouse A device you use to point to a particular location on a display screen, typically so you can choose a menu item, draw a line, or highlight some text. You control a pointer on the screen by sliding a mouse around on a flat surface; the position of the pointer moves relative to the movement of the mouse. You select items by pressing one or more buttons on the mouse. mouse pointer In a GUI, a marker that moves in correspondence with the mouse. It is usually a small black x with a white border or an arrow. Differs from the cursor (page 1064). mouseover The action of passing the mouse pointer over an object on the screen. MTA Mail transfer agent. One of the three components of a mail system; the other two are the MDA and MUA. An MTA accepts mail from users and MTAs. MUA Mail user agent. One of the three components of a mail system; the other two are the MDA (page 1079) and MTA. An MUA is an end-user mail program such as KMail, mutt, or Outlook. multiboot specification Specifies an interface between a boot loader and an operating system. With compliant boot loaders and operating systems, any boot loader should be able to load any operating system. The object of this specification is to ensure that different operating sy stems will wor k o n a s ingle machin e. For m ore in for mat ion, go t o odin-os.sourceforge.net/guides/multiboot.html. multicast A multicast packet has one source and multiple destinations. In multicast, source hosts register at a special address to transmit data. Destination hosts register at the same address to receive data. In contrast to broadcast (page 1058), which is LANbased, multicast traffic is designed to work across routed networks on a subscription basis. Multicast reduces network traffic by transmitting a packet one time, with the router at the end of the path breaking it apart as needed for multiple recipients. multitasking A computer system that allows a user to run more than one job at a time. A multitasking system, such as Linux, allows you to run a job in the background while running a job in the foreground. multiuser system A computer system that can be used by more than one person at a time. Linux is a multiuser operating system. Contrast with single-user system (page 1093). namespace A set of names (identifiers) in which all names are unique.FOLDOC 1082 Glossary NAT Network Address Translation. A scheme that enables a LAN to use one set of IP addresses internally and a different set externally. The internal set is for LAN (private) use. The external set is typically used on the Internet and is Internet unique. NAT provides some privacy by hiding internal IP addresses and allows multiple internal addresses to connect to the Internet through a single external IP address. See also masquerade on page 1079. NBT NetBIOS over TCP/IP. A protocol that supports NetBIOS services in a TCP/IP environment. Also NetBT. negative caching Storing the knowledge that something does not exist. A cache normally stores information about something that exists. A negative cache stores the information that something, such as a record, does not exist. NetBIOS Network Basic Input/Output System. An API (page 1055) for writing networkaware applications. netboot To boot a computer over the network (as opposed to booting from a local disk). netiquette The conventions of etiquette—that is, polite behavior—recognized on Usenet and in mailing lists, such as not (cross-)posting to inappropriate groups and refraining from commercial advertising outside the business groups. The most important rule of netiquette is “Think before you post.” If what you intend to post will not make a positive contribution to the newsgroup and be of interest to several readers, do not post it. Personal messages to one or two individuals should not be posted to newsgroups; use private email instead.FOLDOC network address The network portion (netid) of an IP address. For a class A network, it is the first byte, or segment, of the IP address; for a class B network, it is the first two bytes; and for a class C network, it is the first three bytes. In each case the balance of the IP address is the host address (hostid). Assigned network addresses are globally unique within the Internet. Also network number. Network Filesystem See NFS. Network Information Service See NIS. network mask A bit mask used to identify which bits in an IP address correspond to the network address and subnet portions of the address. Called a network mask because the network portion of the address is determined by the number of bits that are set in the mask. The network mask has ones in positions corresponding to the network and subnet numbers and zeros in the host number positions. Also subnet mask or mask. network number See network address. Glossary 1083 network segment A part of an Ethernet or other network on which all message traffic is common to all nodes; that is, it is broadcast from one node on the segment and received by all others. This commonality normally occurs because the segment is a single continuous conductor. Communication between nodes on different segments is via one or more routers.FOLDOC network switch A connecting device in networks. Switches are increasingly replacing shared media hubs in an effort to increase bandwidth. For example, a 16-port 10BaseT hub shares the total 10 megabits per second bandwidth with all 16 attached nodes. By replacing the hub with a switch, both sender and receiver can take advantage of the full 10 megabits per second capacity. Each port on the switch can give full bandwidth to a single server or client station or to a hub with several stations. Network switch refers to a device with intelligence. Contrast with hub (page 1073). Network Time See NTP on page 1083. Protocol NFS Network Filesystem. A remote filesystem designed by Sun Microsystems, available on computers from most UNIX system vendors. NIC Network interface card (or controller). An adapter circuit board installed in a computer to provide a physical connection to a network. Each NIC has a unique MAC address (page 1078).FOLDOC NIS Network Information Service. A distributed service built on a shared database to manage system-independent information (such as usernames and passwords). NIS domain name A name that describes a group of systems that share a set of NIS files. Different from domain name (page 1066). NNTP Network News Transfer Protocol. node In a tree structure, the end of a branch that can support other branches. When the Linux filesystem hierarchy is conceptualized as a tree, directories are nodes. See leaf on page 1077. nonprinting character See control character on page 1063. Also nonprintable character. nonvolatile storage A storage device whose content is preserved when its power is off. Also NVS and persistent storage. Some examples are CD-ROM, paper punch tape, hard disk, ROM (page 1090), PROM (page 1087), EPROM (page 1068), and EEPROM (page 1067). Contrast with RAM (page 1088). NTP Network Time Protocol. Built on top of TCP/IP, NTP maintains accurate local time by referring to known accurate clocks on the Internet. null string A string that could contain characters but does not. A string of zero length. octal number A base 8 number. Octal numbers are composed of the digits 0–7, inclusive. Refer to Table G-1 on page 1072. 1084 Glossary one-way hash function A one-way function that takes a variable-length message and produces a fixed-length hash. Given the hash, it is computationally infeasible to find a message with that hash; in fact, you cannot determine any usable information about a message with that hash. Also message digest function. See also hash (page 1071). open source A method and philosophy for software licensing and distribution designed to encourage use and improvement of software written by volunteers by ensuring that anyone can copy the source code and modify it freely. The term open source is now more widely used than the earlier term free software (promoted by the Free Software Foundation; www.fsf.org) but has broadly the same meaning—free of distribution restrictions, not necessarily free of charge. OpenSSH A free version of the SSH (secure shell) protocol suite that replaces TELNET, rlogin, and more with secure programs that encrypt all communication—even passwords—over a network. Refer to “The OpenSSH Secure Communication Utilities” on page 687. operating system A control program for a computer that allocates computer resources, schedules tasks, and provides the user with a way to access resources. option A command-line argument that modifies the effects of a command. Options are usually preceded by hyphens on the command line and traditionally have singlecharacter names (such as –h or –n). Some commands allow you to group options following a single hyphen (for example, –hn). GNU utilities frequently have two arguments that do the same thing: a single-character argument and a longer, more descriptive argument that is preceded by two hyphens (such as – –show-all and ––invert-match). ordinary file A file that is used to store a program, text, or other user data. See directory (page 1065) and device file (page 1065). output Information that a program sends to the terminal or another file. See standard output on page 1095. P2P Peer-to-Peer. A network that does not divide nodes into clients and servers. Each computer on a P2P network can fulfill the roles of client and server. In the context of a file-sharing network, this ability means that once a node has downloaded (part of) a file, it can act as a server. BitTorrent implements a P2P network. packet A unit of data sent across a network. Packet is a generic term used to describe a unit of data at any layer of the OSI protocol stack, but it is most correctly used to describe network or application layer data units (“application protocol data unit,” APDU).FOLDOC See also frame (page 1070) and datagram (page 1064). packet filtering A technique used to block network traffic based on specified criteria, such as the origin, destination, or type of each packet. See also firewall (page 1069). packet sniffer A program or device that monitors packets on a network. See sniff on page 1094. pager A utility that allows you to view a file one screen at a time (for example, less and more). Glossary 1085 paging The process by which virtual memory is maintained by the operating system. The content of process memory is moved (paged out) to the swap space (page 1096) as needed to make room for other processes. PAM Linux-PAM or Linux-Pluggable Authentication Modules. These modules allow a system administrator to determine how various applications authenticate users. parent process A process that forks other processes. See process (page 1087) and child process (page 1061). partition A section of a (hard) disk that has a name so you can address it separately from other sections. A disk partition can hold a filesystem or another structure, such as the swap area. Under DOS and Windows, partitions (and sometimes whole disks) are labeled C:, D:, and so on. Also disk partition and slice. passive FTP Allows FTP to work through a firewall by allowing the flow of data to be initiated and controlled by the client FTP program instead of the server. Also called PASV FTP because it uses the FTP PASV command. passphrase A string of words and characters that you type in to authenticate yourself. A passphrase differs from a password only in length. A password is usually short—6 to 10 characters. A passphrase is usually much longer—up to 100 characters or more. The greater length makes a passphrase harder to guess or reproduce than a password and therefore more secure.FOLDOC password To prevent unauthorized access to a user’s account, an arbitrary string of characters chosen by the user or system administrator and used to authenticate the user when attempting to log in.FOLDOC See also passphrase. PASV FTP See passive FTP. pathname A list of directories separated by slashes (/) and ending with the name of a file, which can be a directory. A pathname is used to trace a path through the file structure to locate or identify a file. pathname, last The part of a pathname following the final /, or the whole filename if there is no /. element of a A simple filename. Same as basename (page 1057). pathname element One of the filenames that forms a pathname. peripheral device See device on page 1065. persistent Data that is stored on nonvolatile media, such as a hard disk. phish An attempt to trick users into revealing or sharing private information, especially passwords or financial information. The most common form is email purporting to be from a bank or vendor that requests that a user fill out a form to “update” an account on a phoney Web site disguised to appear legitimate. Generally sent as spam (page 1094). 1086 Glossary physical device A tangible device, such as a disk drive, that is physically separate from other, similar devices. PID Process identification, usually followed by the word number. Linux assigns a unique PID number as each process is initiated. pipeline One or more simple commands. If a pipeline comprises two or more commands, the commands are connected such that standard output of one command is connected by a pipe symbol (|; a control operator) to standard input of the next. pixel The smallest element of a picture, typically a single dot on a display screen. plaintext Text that is not encrypted. Also cleartext. Contrast with ciphertext (page 1061). Pluggable See PAM on page 1085. Authentication Modules point-to-point link A connection limited to two endpoints, such as the connection between a pair of modems. port A logical channel or channel endpoint in a communications system. The TCP (page 1097) and UDP (page 1099) transport layer protocols used on Ethernet use port numbers to distinguish between different logical channels on the same network interface on the same computer. The /etc/services file (see the beginning of this file for more information) or the NIS (page 1083) services database specifies a unique port number for each application program. The number links incoming data to the correct service (program). Standard, well-known ports are used by everyone: Port 80 is used for HTTP (Web) traffic. Some protocols, such as TELNET and HTTP (which is a special form of TELNET), have default ports specified as mentioned earlier but can use other ports as well.FOLDOC port forwarding The process by which a network port on one computer is transparently connected to a port on another computer. If port X is forwarded from system A to system B, any data sent to port X on system A is sent to system B automatically. The connection can be between different ports on the two systems. See also tunneling (page 1099). portmapper A server that converts TCP/IP port numbers into RPC (page 1091) program numbers. printable character One of the graphic characters: a letter, number, or punctuation mark. Contrast with a nonprintable, or CONTROL, character. Also printing character. private address IANA (page 1073) has reserved three blocks of IP addresses for private internets space or LANs: Glossary 1087 10.0.0.0 - 10.255.255.255 172.16.0.0 - 172.31.255.255 192.168.0.0 - 192.168.255.255 You can use these addresses without coordinating with anyone outside of your LAN (you do not have to register the system name or address). Systems using these IP addresses cannot communicate directly with hosts using the global address space but must go through a gateway. Because private addresses have no global meaning, routing information is not stored by DNSs and most ISPs reject privately addressed packets. Make sure that your router is set up not to forward these packets onto the Internet. privileged port A port (page 1086) with a number less than 1024. On Linux and other UNIX-like systems, only a process running with root privileges can bind to a privileged port. Any user on Windows 98 and earlier Windows systems can bind to any port. Also reserved port. procedure A sequence of instructions for performing a particular task. Most programming languages, including machine languages, enable a programmer to define procedures that allow the procedure code to be called from multiple places. Also subroutine.FOLDOC process The execution of a command by Linux. See “Processes” on page 323. .profile file A startup file in a user’s home directory that the Bourne Again or Z Shell executes when you log in. The TC Shell executes .login instead. You can use the .profile file to run commands, set variables, and define functions. program A sequence of executable computer instructions contained in a file. Linux utilities, applications, and shell scripts are all programs. Whenever you run a command that is not built into a shell, you are executing a program. PROM Programmable readonly memory. A kind of nonvolatile storage. ROM (page 1090) that can be written to using a PROM programmer. prompt A cue from a program, usually displayed on the screen, indicating that it is waiting for input. The shell displays a prompt, as do some of the interactive utilities, such as mail. By default the Bourne Again and Z Shells use a dollar sign ($) as a prompt, and the TC Shell uses a percent sign (%). protocol A set of formal rules describing how to transmit data, especially across a network. Low-level protocols define the electrical and physical standards, bit and byte ordering, and transmission, error detection, and correction of the bit stream. High-level protocols deal with data formatting, including message syntax, terminal-to-computer dialog, character sets, and sequencing of messages.FOLDOC proxy A service that is authorized to act for a system while not being part of that system. See also proxy gateway and proxy server. 1088 Glossary proxy gateway A computer that separates clients (such as browsers) from the Internet, working as a trusted agent that accesses the Internet on their behalf. A proxy gateway passes a request for data from an Internet service, such as HTTP from a browser/client, to a remote server. The data that the server returns goes back through the proxy gateway to the requesting service. A proxy gateway should be transparent to the user. A proxy gateway often runs on a firewall (page 1069) system and acts as a barrier to malicious users. It hides the IP addresses of the local computers inside the firewall from Internet users outside the firewall. You can configure browsers, such as Mozilla/Firefox and Netscape, to use a different proxy gateway or to use no proxy for each URL access method including FTP, netnews, SNMP, HTTPS, and HTTP. See also proxy. proxy server A proxy gateway that usually includes a cache (page 1059) that holds frequently used Web pages so that the next request for that page is available locally (and therefore more quickly). The terms proxy server and proxy gateway are frequently interchanged so that the use of cache does not rest exclusively with the proxy server. See also proxy. Python A simple, high-level, interpreted, object-oriented, interactive language that bridges the gap between C and shell programming. Suitable for rapid prototyping or as an extension language for C applications, Python supports packages, modules, classes, user-defined exceptions, a good C interface, and dynamic loading of C modules. It has no arbitrary restrictions. See Chapter 12.FOLDOC quote When you quote a character, you take away any special meaning that it has in the current context. You can quote a character by preceding it with a backslash. When you are interacting with the shell, you can also quote a character by surrounding it with single quotation marks. For example, the command echo \* or echo '*' displays *. The command echo * displays a list of the files in the working directory. See ambiguous file reference (page 1054), metacharacter (page 1080), regular character (page 1089), regular expression (page 1089), and special character (page 1094). See also escape on page 1068. radio button In a GUI, one of a group of buttons similar to those used to select the station on a car radio. Radio buttons within a group are mutually exclusive; only one button can be selected at a time. RAID Redundant array of inexpensive/independent disks. Two or more (hard) disk drives used in combination to improve fault tolerance and performance. RAID can be implemented in hardware or software. RAM Random access memory. A kind of volatile storage. A data storage device for which the order of access to different locations does not affect the speed of access. Contrast with a hard disk or tape drive, which provides quicker access to sequential data because accessing a nonsequential location requires physical movement of the storage medium and/or read/write head rather than just electronic switching. Contrast with nonvolatile storage (page 1083). Also memory.FOLDOC Glossary 1089 RAM disk RAM that is made to look like a floppy diskette or hard disk. A RAM disk is frequently used as part of the boot (page 1058) process. RAS Remote access server. In a network, a computer that provides access to remote users via analog modem or ISDN connections. RAS includes the dial-up protocols and access control (authentication). It might be a regular fileserver with remote access software or a proprietary system, such as Shiva’s LANRover. The modems might be internal or external to the device. RDF Resource Description Framework. Being developed by W3C (the main standards body for the World Wide Web), a standard that specifies a mechanism for encoding and transferring metadata (page 1080). RDF does not specify what the metadata should or can be. It can integrate many kinds of applications and data, using XML as an interchange syntax. Examples of the data that can be integrated include library catalogs and worldwide directories; syndication and aggregation of news, software, and content; and collections of music and photographs. Visit www.w3.org/RDF for more information. redirection The process of directing standard input for a program to come from a file rather than from the keyboard. Also, directing standard output or standard error to go to a file rather than to the screen. reentrant Code that can have multiple simultaneous, interleaved, or nested invocations that do not interfere with one another. Noninterference is important for parallel processing, recursive programming, and interrupt handling. It is usually easy to arrange for multiple invocations (that is, calls to a subroutine) to share one copy of the code and any readonly data. For the code to be reentrant, however, each invocation must use its own copy of any modifiable data (or synchronized access to shared data). This goal is most often achieved by using a stack and allocating local variables in a new stack frame for each invocation. Alternatively, the caller might pass in a pointer to a block of memory that that invocation can use (usually for output), or the code might allocate some memory on a heap, especially if the data must survive after the routine returns. Reentrant code is often found in system software, such as operating systems and teleprocessing monitors. It is also a crucial component of multithreaded programs, where the term thread-safe is often used instead of reentrant.FOLDOC regular character A character that always represents itself in an ambiguous file reference or another type of regular expression. Contrast with special character. regular expression A string—composed of letters, numbers, and special symbols—that defines one or more strings. See Appendix A. relative pathname A pathname that starts from the working directory. Contrast with absolute pathname (page 1054). remote access server See RAS on page 1089. 1090 Glossary remote filesystem A filesystem on a remote computer that has been set up so that you can access (usually over a network) its files as though they were stored on your local computer’s disks. An example of a remote filesystem is NFS. remote procedure call See RPC on page 1091. resolver The TCP/IP library software that formats requests to be sent to the DNS (page 1066) for hostname-to-Internet address conversion.FOLDOC Resource Description Framework See RDF on page 1089. restore The process of turning an icon into a window. Contrast with iconify (page 1074). return code See exit status on page 1068. RFC Request for comments. Begun in 1969, one of a series of numbered Internet informational documents and standards widely followed by commercial software and freeware in the Internet and UNIX/Linux communities. Few RFCs are standards, but all Internet standards are recorded in RFCs. Perhaps the single most influential RFC has been RFC 822, the Internet electronic mail format standard. The RFCs are unusual in that they are floated by technical experts acting on their own initiative and reviewed by the Internet at large rather than being formally promulgated through an institution such as ANSI. For this reason they remain known as RFCs, even after they are adopted as standards. The RFC tradition of pragmatic, experiencedriven, after-the-fact standard writing done by individuals or small working groups has important advantages over the more formal, committee-driven process typical of ANSI or ISO. For a complete list of RFCs, go to www.rfc-editor.org.FOLDOC roam To move a computer between wireless access points (page 1102) on a wireless network without the user or applications being aware of the transition. Moving between access points typically results in some packet loss, although this loss is transparent to programs that use TCP. ROM Readonly memory. A kind of nonvolatile storage. A data storage device that is manufactured with fixed contents. In general, ROM describes any storage system whose contents cannot be altered, such as a phonograph record or printed book. When used in reference to electronics and computers, ROM describes semiconductor integrated circuit memories, of which several types exist, and CD-ROM. ROM is nonvolatile storage—it retains its contents even after power has been removed. ROM is often used to hold programs for embedded systems, as these usually have a fixed purpose. ROM is also used for storage of the BIOS (page 1057) in a computer. Contrast with RAM (page 1088).FOLDOC root directory The ancestor of all directories and the start of all absolute pathnames. The root directory has no name and is represented by / standing alone or at the left end of a pathname. Glossary 1091 root filesystem The filesystem that is available when the system is brought up in single-user/recovery mode. This filesystem is always represented by /. You cannot unmount or mount the root filesystem. You can remount root to change its mount options. root login Usually the username of Superuser (page 1096). root (user) Another name for Superuser (page 1096). root window Any place on the desktop not covered by a window, object, or panel. rootkit Software that provides a user with root privileges while hiding its presence. rotate When a file, such as a log file, gets indefinitely larger, you must keep it from taking up too much space on the disk. Because you might need to refer to the information in the log files in the near future, it is generally not a good idea to delete the contents of the file until it has aged. Instead you can periodically save the current log file under a new name and create a new, empty file as the current log file. You can keep a series of these files, renaming each as a new one is saved. You will then rotate the files. For example, you might remove xyzlog.4, xyzlog.3xyzlog.4, xyzlog.2xyzlog.3, xyzlog.1 xyzlog.2, xyzlogxyzlog.1, and create a new xyzlog file. By the time you remove xyzlog.4, it will not contain any information more recent than you want to remove. router A device (often a computer) that is connected to more than one similar type of network to pass data between them. See gateway on page 1070. RPC Remote procedure call. A call to a procedure (page 1087) that acts transparently across a network. The procedure itself is responsible for accessing and using the network. The RPC libraries make sure that network access is transparent to the application. RPC runs on top of TCP/IP or UDP/IP. RSA A public key encryption technology that is based on the lack of an efficient way to factor very large numbers. Because of this lack, it takes an extraordinary amount of computer processing time and power to deduce an RSA key. The RSA algorithm is the de facto standard for data sent over the Internet. run To execute a program. runlevel Before the introduction of the systemd/Upstart init daemon, runlevels specified the state of the system, including single-user/recovery and multiuser. Samba A free suite of programs that implement the Server Message Block (SMB) protocol. See SMB (page 1093). schema Within a GUI, a pattern that helps you see and interpret the information that is presented in a window, making it easier to understand new information that is presented using the same schema. scroll To move lines on a terminal or window up and down or left and right. 1092 Glossary scrollbar A widget (page 1101) found in graphical user interfaces that controls (scrolls) which part of a document is visible in the window. A window can have a horizontal scrollbar, a vertical scrollbar (more common), or both.FOLDOC server A powerful centralized computer (or program) designed to provide information to clients (smaller computers or programs) on request. session The lifetime of a process. For a desktop, it is the desktop session manager. For a character-based terminal, it is the user’s login shell process. In KDE, it is launched by kdeinit. A session might also be the sequence of events between when you start using a program, such as an editor, and when you finish. setgid When you execute a file that has setgid (set group ID) permission, the process executing the file takes on the privileges of the group the file belongs to. The ls utility shows setgid permission as an s in the group’s executable position. See also setuid. setuid When you execute a file that has setuid (set user ID) permission, the process executing the file takes on the privileges of the owner of the file. As an example, if you run a setuid program that removes all the files in a directory, you can remove files in any of the file owner’s directories, even if you do not normally have permission to do so. When the program is owned by root, you can remove files in any directory that a user working with root privileges can remove files from. The ls utility shows setuid permission as an s in the owner’s executable position. See also setgid. sexillion In the British system, 1036. In the American system, this number is named undecillion. See also large number (page 1077). SHA1 Secure Hash Algorithm 1. The SHA family is a set of cryptographic hash (page 1071) algorithms that were designed by the National Security Agency (NSA). The second member of this family is SHA1, a successor to MD5 (page 1079). See also cryptography on page 1063. SHA2 Secure Hash Algorithm 2. The third member of the SHA family (see SHA1), SHA2 is a set of four cryptographic hash functions named SHA-224, SHA-256, SHA-384, SHA-512 with digests that are 224, 256, 384, and 512 bits, respectively. share A filesystem hierarchy that is shared with another system using SMB (page 1093). Also Windows share (page 1102). shared network topology A network, such as Ethernet, in which each packet might be seen by systems other than its destination system. Shared means that the network bandwidth is shared by all users. shell A Linux system command processor. The three major shells are the Bourne Again Shell (page 1058), the TC Shell (page 1097), and the Z Shell (page 1103). shell function A series of commands that the shell stores for execution at a later time. Shell functions are like shell scripts but run more quickly because they are stored in the computer’s Glossary 1093 main memory rather than in files. Also, a shell function is run in the shell that calls it (unlike a shell script, which is typically run in a subshell). shell script An ASCII file containing shell commands. Also shell program. signal A very brief message that the UNIX system can send to a process, apart from the process’s standard input. Refer to “trap: Catches a Signal” on page 483. simple filename A single filename containing no slashes (/). A simple filename is the simplest form of pathname. Also the last element of a pathname. Also basename (page 1057). single-user system A computer system that only one person can use at a time. Contrast with multiuser system (page 1081). slider A widget (page 1101) that allows a user to set a value by dragging an indicator along a line. Many sliders allow the user also to click on the line to move the indicator. Differs from a scrollbar (page 1092) in that moving the indicator does not change other parts of the display. SMB Server Message Block. Developed in the early 1980s by Intel, Microsoft, and IBM, SMB is a client/server protocol that is the native method of file and printer sharing for Windows. In addition, SMB can share serial ports and communications abstractions, such as named pipes and mail slots. SMB is similar to a remote procedure call (RPC, page 1091) that has been customized for filesystem access. Also Microsoft Networking or CIFS (page 1061).FOLDOC SMP Symmetric multiprocessing. Two or more similar processors connected via a highbandwidth link and managed by one operating system, where each processor has equal access to I/O devices. The processors are treated more or less equally, with application programs able to run on any or all processors interchangeably, at the discretion of the operating system.FOLDOC smiley A character-based glyph (page 1070), typically used in email, that conveys an emotion. The characters :-) in a message portray a smiley face (look at it sideways). Because it can be difficult to tell when the writer of an electronic message is saying something in jest or in seriousness, email users often use :-) to indicate humor. The two original smileys, designed by Scott Fahlman, were :-) and :-(. Also emoticon, smileys, and smilies. For more information search on smiley on the Internet. smilies See smiley. SMTP Simple Mail Transfer Protocol. A protocol used to transfer electronic mail between computers. It is a server-to-server protocol, so other protocols are used to access the messages. The SMTP dialog usually happens in the background under the control of a message transport system such as sendmail/exim4.FOLDOC snap (windows) As you drag a window toward another window or edge of the workspace, it can move suddenly so that it is adjacent to the other window/edge. Thus the window snaps into position. 1094 Glossary sneakernet Using hand-carried magnetic media to transfer files between machines. sniff To monitor packets on a network. A system administrator can legitimately sniff packets and a malicious user can sniff packets to obtain information such as usernames and passwords. See also packet sniffer (page 1084). SOCKS A networking proxy protocol embodied in a SOCKS server, which performs the same functions as a proxy gateway (page 1088) or proxy server (page 1088). SOCKS works at the application level, requiring that an application be modified to work with the SOCKS protocol, whereas a proxy (page 1087) makes no demands on the application. SOCKSv4 does not support authentication or UDP proxy. SOCKSv5 supports a variety of authentication methods and UDP proxy. sort SPACE To put in a specified order, usually alphabetic or numeric. character A character that appears as the absence of a visible character. Even though you cannot see it, a SPACE is a printable character. It is represented by the ASCII code 32 (decimal). A SPACE character is considered a blank or whitespace (page 1101). spam Posting irrelevant or inappropriate messages to one or more Usenet newsgroups or mailing lists in deliberate or accidental violation of netiquette (page 1082). Also, sending large amounts of unsolicited email indiscriminately. This email usually promotes a product or service. Another common purpose of spam is to phish (page 1085). Spam is the electronic equivalent of junk mail. From the Monty Python “Spam” song.FOLDOC sparse file A file that is large but takes up little disk space. The data in a sparse file is not dense (thus its name). Examples of sparse files are core files and dbm files. spawn See fork on page 1069. special character A character that has a special meaning when it occurs in an ambiguous file reference or another type of regular expression, unless it is quoted. The special characters most commonly used with the shell are * and ?. Also metacharacter (page 1080) and wildcard. special file See device file on page 1065. spin box In a GUI, a type of text box (page 1097) that holds a number you can change by typing over it or using the up and down arrows at the end of the box. Also spinner. spinner See spin box. spoofing See IP spoofing on page 1075. spool To place items in a queue, each waiting its turn for some action. Often used when speaking about printers. Also used to describe the queue. Glossary 1095 SQL Structured Query Language. A language that provides a user interface to relational database management systems (RDBMS). SQL, the de facto standard, is also an ISO and ANSI standard and is often embedded in other programming languages.FOLDOC square bracket A left square bracket ([) or a right square bracket ( ]). These special characters define character classes in ambiguous file references and other regular expressions. standard error A file to which a program can send output. Usually only error messages are sent to this file. Unless you instruct the shell otherwise, it directs this output to the screen (that is, to the device file that represents the screen). standard input A file from which a program can receive input. Unless you instruct the shell otherwise, it directs this input so that it comes from the keyboard (that is, from the device file that represents the keyboard). standard output A file to which a program can send output. Unless you instruct the shell otherwise, it directs this output to the screen (that is, to the device file that represents the screen). startup file A file that the login shell runs when you log in. The Bourne Again and Z Shells run .profile, and the TC Shell runs .login. The TC Shell also runs .cshrc whenever a new TC Shell or a subshell is invoked. The Z Shell runs an analogous file whose name is identified by the ENV variable. status line The bottom (usually the twenty-fourth) line of the terminal. The vim editor uses the status line to display information about what is happening during an editing session. sticky bit Originally, an access permission bit that caused an executable program to remain on the swap area of the disk. Today, Linux and Mac OS X kernels do not use the sticky bit for this purpose but rather use it to control who can remove files from a directory. In this new capacity, the sticky bit is called the restricted deletion flag. If this bit is set on a directory, a file in the directory can be removed or renamed only by a user who is working with root privileges or by a user who has write permission for the directory and who owns the file or the directory. streaming tape A tape that moves at a constant speed past the read/write heads rather than speeding up and slowing down, which can slow the process of writing to or reading from the tape. A proper blocking factor helps ensure that the tape device will be kept streaming. streams See connection-oriented protocol on page 1062. string A sequence of characters. stylesheet See CSS on page 1064. subdirectory A directory that is located within another directory. Every directory except the root directory is a subdirectory. 1096 Glossary subnet Subnetwork. A portion of a network, which might be a physically independent network segment, that shares a network address with other portions of the network and is distinguished by a subnet number. A subnet is to a network as a network is to an internet.FOLDOC subnet address The subnet portion of an IP address. In a subnetted network, the host portion of an IP address is split into a subnet portion and a host portion using a network mask (also subnet mask). See also subnet number. subnet mask See network mask on page 1082. subnet number The subnet portion of an IP address. In a subnetted network, the host portion of an IP address is split into a subnet portion and a host portion using a network mask. Also subnet mask. See also subnet address. subpixel hinting Similar to anti-aliasing (page 1055) but takes advantage of colors to do the antialiasing. Particularly useful on LCD screens. subroutine See procedure on page 1087. subshell A shell that is forked as a duplicate of its parent shell. When you run an executable file that contains a shell script by using its filename on the command line, the shell forks a subshell to run the script. Also, commands surrounded with parentheses are run in a subshell. superblock A block that contains control information for a filesystem. The superblock contains housekeeping information, such as the number of inodes in the filesystem and free list information. superserver The extended Internet services daemon. Superuser A user working with root privileges. This user has access to anything any other system user has access to and more. The system administrator must be able to become Superuser (work with root privileges) to establish new accounts, change passwords, and perform other administrative tasks. The username of Superuser is usually root. Also root or root user. swap The operating system moving a process from main memory to a disk, or vice versa. Swapping a process to the disk allows another process to begin or continue execution. swap space An area of a disk (that is, a swap file) used to store the portion of a process’s memory that has been paged out. Under a virtual memory system, the amount of swap space—rather than the amount of physical memory—determines the maximum size of a single process and the maximum total size of all active processes. Also swap area or swapping area.FOLDOC switch See network switch on page 1083. Glossary 1097 symbolic link A directory entry that points to the pathname of another file. In most cases a symbolic link to a file can be used in the same ways a hard link can be used. Unlike a hard link, a symbolic link can span filesystems and can connect to a directory. system administrator The person responsible for the upkeep of the system. The system administrator has the ability to log in as root or use sudo to work with root privileges. See also Superuser. system console See console on page 1062. system mode The designation for the state of the system while it is doing system work. Some examples are making system calls, running NFS and autofs, processing network traffic, and performing kernel operations on behalf of the system. Contrast with user mode (page 1100). System V One of the two major versions of the UNIX system. TC Shell tcsh. An enhanced but completely compatible version of the BSD UNIX C shell, csh. TCP Transmission Control Protocol. The most common transport layer protocol used on the Internet. This connection-oriented protocol is built on top of IP (page 1075) and is nearly always seen in the combination TCP/IP (TCP over IP). TCP adds reliable communication, sequencing, and flow control and provides full-duplex, process-to-process connections. UDP (page 1099), although connectionless, is the other protocol that runs on top of IP.FOLDOC tera- In the binary system, the prefix tera- multiplies by 240 (1,099,511,627,776). Terabyte is a common use of this prefix. Abbreviated as T. See also large number on page 1077. termcap Terminal capability. On older systems, the /etc/termcap file contained a list of various types of terminals and their characteristics. System V replaced the function of this file with the terminfo system. terminal Differentiated from a workstation (page 1102) by its lack of intelligence, a terminal connects to a computer that runs Linux. A workstation runs Linux on itself. terminfo Terminal information. The /usr/lib/terminfo directory contains many subdirectories, each containing several files. Each of those files is named for and holds a summary of the functional characteristics of a particular terminal. Visually oriented textual programs, such as vim, use these files. An alternative to the termcap file. text box A GUI widget (page 1101) that allows a user to enter text. theme Defined as an implicit or recurrent idea, theme is used in a GUI to describe a look that is consistent for all elements of a desktop. Go to themes.freshmeat.net for examples. thicknet A type of coaxial cable (thick) used for an Ethernet network. Devices are attached to thicknet by tapping the cable at fixed points. 1098 Glossary thinnet A type of coaxial cable (thin) used for an Ethernet network. Thinnet cable is smaller in diameter and more flexible than thicknet cable. Each device is typically attached to two separate cable segments by using a T-shaped connector; one segment leads to the device ahead of it on the network and one to the device that follows it. thread-safe See reentrant on page 1089. thumb The movable button in the scrollbar (page 1092) that positions the image in the window. The size of the thumb reflects the amount of information in the buffer. Also bubble. tick A mark, usually in a check box (page 1060), that indicates a positive response. The mark can be a check mark (✔) or an x. Also check mark or check. TIFF Tagged Image File Format. A file format used for still-image bitmaps, stored in tagged fields. Application programs can use the tags to accept or ignore fields, depending on their capabilities.FOLDOC tiled windows An arrangement of windows such that no window overlaps another. The opposite of cascading windows (page 1060). time to live See TTL. toggle To switch between one of two positions. For example, the ftp glob command toggles the glob feature: Give the command once, and it turns the feature on or off; give the command again, and it sets the feature back to its original state. token A basic, grammatically indivisible unit of a language, such as a keyword, operator, or identifier.FOLDOC token ring A type of LAN (page 1077) in which computers are attached to a ring of cable. A token packet circulates continuously around the ring. A computer can transmit information only when it holds the token. tooltip A minicontext help system that a user activates by allowing the mouse pointer to hover (page 1073) over an object (such as those on a panel). transient window A dialog or other window that is displayed for only a short time. Transmission Control Protocol See TCP on page 1097. Trojan horse A program that does something destructive or disruptive to your system. Its action is not documented, and the system administrator would not approve of it if she were aware of it. The term Trojan horse was coined by MIT-hacker-turned-NSA-spook Dan Edwards. It refers to a malicious security-breaking program that is disguised as something benign, such as a directory lister, archive utility, game, or (in one notorious 1990 case on the Mac) a program to find and destroy viruses. Similar to back door (page 1056).FOLDOC Glossary 1099 TTL Time to live. 1. All DNS records specify how long they are good for—usually up to a week at most. This time is called the record’s time to live. When a DNS server or an application stores this record in cache (page 1059), it decrements the TTL value and removes the record from cache when the value reaches zero. A DNS server passes a cached record to another server with the current (decremented) TTL guaranteeing the proper TTL, no matter how many servers the record passes through. 2. In the IP header, a field that indicates how many more hops the packet should be allowed to make before being discarded or returned. TTY Teletypewriter. The terminal device that UNIX was first run from. Today TTY refers to the screen (or window, in the case of a terminal emulator), keyboard, and mouse that are connected to a computer. This term appears in UNIX, and Linux has kept the term for the sake of consistency and tradition. tunneling Encapsulation of protocol A within packets carried by protocol B, such that A treats B as though it were a data link layer. Tunneling is used to transfer data between administrative domains that use a protocol not supported by the internet connecting those domains. It can also be used to encrypt data sent over a public internet, as when you use ssh to tunnel a protocol over the Internet.FOLDOC See also VPN (page 1101) and port forwarding (page 1086). UDP User Datagram Protocol. The Internet standard transport layer protocol that provides simple but unreliable datagram services. UDP is a connectionless protocol (page 1062) that, like TCP (page 1097), is layered on top of IP (page 1075). Unlike TCP, UDP neither guarantees delivery nor requires a connection. As a result it is lightweight and efficient, but the application program must handle all error processing and retransmission. UDP is often used for sending time-sensitive data that is not particularly sensitive to minor loss, such as audio and video data.FOLDOC UID User ID. A number that the passwd database associates with a username. undecillion In the American system, 1036. In the British system, this number is named sexillion. See also large number (page 1077). unicast A packet sent from one host to another host. Unicast means one source and one destination. Unicode A character encoding standard that was designed to cover all major modern written languages with each character having exactly one encoding and being represented by a fixed number of bits. unmanaged window See ignored window on page 1074. 1100 Glossary URI Universal Resource Identifier. The generic set of all names and addresses that are short strings referring to objects (typically on the Internet). The most common kinds of URIs are URLs.FOLDOC URL Uniform (was Universal) Resource Locator. A standard way of specifying the location of an object, typically a Web page, on the Internet. URLs are a subset of URIs. usage message A message displayed by a command when you call the command using incorrect command-line arguments. User Datagram Protocol See UDP. User ID See UID. user interface See interface on page 1074. user mode The designation for the state of the system while it is doing user work, such as running a user program (but not the system calls made by the program). Contrast with system mode (page 1097). username The name you enter in response to the login: prompt. Other users use your username when they send you mail or write to you. Each username has a corresponding user ID, which is the numeric identifier for the user. Both the username and the user ID are stored in the passwd database (/etc/passwd or the NIS equivalent). Also login name. userspace The part of memory (RAM) where applications reside. Code running in userspace cannot access hardware directly and cannot access memory allocated to other applications. Also userland. See the KernelAnalysis-HOWTO. UTC Coordinated Universal Time. UTC is the equivalent to the mean solar time at the prime meridian (0 degrees longitude). Also called Zulu time (Z stands for longitude zero) and GMT (Greenwich Mean Time). UTF-8 An encoding that allows Unicode (previous page) characters to be represented using sequences of 8-bit bytes. utility A program included as a standard part of Linux. You typically invoke a utility either by giving a command in response to a shell prompt or by calling it from within a shell script. Utilities are often referred to as commands. Contrast with builtin (command) (page 1059). UUID Universally Unique Identifier. A 128-bit number that uniquely identifies an object on the Internet. Frequently used on Linux systems to identify an ext2, ext3, or ext4 disk partition. variable A name and an associated value. The shell allows you to create variables and use them in the interactive shell and in shell scripts. Also, the shell inherits variables when Glossary 1101 it is invoked (environment variables; page 468). Some shell variables establish characteristics of the shell; others have values that reflect different aspects of your ongoing interaction with the shell. viewport Same as workspace (page 1102). virtual console Additional consoles, or displays, that you can view on the system, or physical, console. virus A cracker (page 1063) program that searches out other programs and “infects” them by embedding a copy of itself in them, so that they become Trojan horses (page 1098). When these programs are executed, the embedded virus is executed as well, propagating the “infection,” usually without the user’s knowledge. By analogy with biological viruses.FOLDOC VLAN Virtual LAN. A logical grouping of two or more nodes that are not necessarily on the same physical network segment but that share the same network number. A VLAN is often associated with switched Ethernet.FOLDOC VPN Virtual private network. A private network that exists on a public network, such as the Internet. A VPN is a less expensive substitute for company-owned/leased lines and uses encryption to ensure privacy. A nice side effect is that you can send nonInternet protocols, such as AppleTalk, IPX, or NetBIOS (page 1082), over the VPN connection by tunneling (page 1099) them through the VPN IP stream. W2K Windows 2000 Professional or Server. W3C World Wide Web Consortium (www.w3.org). WAN Wide area network. A network that interconnects LANs (page 1077) and MANs (page 1079), spanning a large geographic area (typically states or countries). WAP Wireless access point. A bridge or router between wired and wireless networks. WAPs typically support some form of access control to prevent unauthorized clients from connecting to the network. Web ring A collection of Web sites that provide information on a single topic or group of related topics. Each home page that is part of the Web ring has a series of links that let you go from site to site. whitespace A collective name for SPACE s and/or TAB s and occasionally NEWLINE s. Also white space. wide area network See WAN. widget The basic objects of a graphical user interface. A button, combo box (page 1062), and scrollbar (page 1092) are examples of widgets. wildcard See metacharacter on page 1080. 1102 Glossary Wi-Fi Wireless Fidelity. A generic term that refers to any type of 802.11 (page 1054) wireless network. window On a display screen, a region that runs or is controlled by a particular program. window manager A program that controls how windows appear on a display screen and how you manipulate them. Windows share See share on page 1092. WINS Windows Internet Naming Service. The service responsible for mapping NetBIOS names to IP addresses. WINS has the same relationship to NetBIOS names that DNS has to Internet domain names. WINS server The program responsible for handling WINS requests. This program caches name information about hosts on a local network and resolves them to IP addresses. wireless access See WAP. point word A sequence of one or more nonblank characters separated from other words by TABs, SPACEs, or NEWLINEs. Used to refer to individual command-line arguments. In vim, a word is similar to a word in the English language—a string of one or more characters bounded by a punctuation mark, a numeral, a TAB, a SPACE, or a NEWLINE. Work buffer A location where vim stores text while it is being edited. The information in the Work buffer is not written to the file on the disk until you give the editor a command to write it. working directory The directory that you are associated with at any given time. The relative pathnames you use are relative to the working directory. Also current directory. workspace A subdivision of a desktop (page 1065) that occupies the entire display. See page 112. workstation A small computer, typically designed to fit in an office and be used by one person and usually equipped with a bit-mapped graphical display, keyboard, and mouse. Differentiated from a terminal (page 1097) by its intelligence. A workstation runs Linux on itself while a terminal connects to a computer that runs Linux. worm A program that propagates itself over a network, reproducing itself as it goes. Today the term has negative connotations, as it is assumed that only crackers (page 1063) write worms. Compare to virus (page 1101) and Trojan horse (page 1098). From Tapeworm in John Brunner’s novel, The Shockwave Rider, Ballantine Books, 1990 (via XEROX PARC).FOLDOC WYSIWYG What You See Is What You Get. A graphical application, such as a word processor, whose display is similar to its printed output. X server The X server is the part of the X Window System that runs the mouse, keyboard, and display. (The application program is the client.) Glossary 1103 X terminal A graphics terminal designed to run the X Window System. X Window System A design and set of tools for writing flexible, portable windowing applications, created jointly by researchers at MIT and several leading computer manufacturers. XDMCP X Display Manager Control Protocol. XDMCP allows the login server to accept requests from network displays. XDMCP is built into many X terminals. xDSL Different types of DSL (page 1067) are identified by a prefix, for example, ADSL, HDSL, SDSL, and VDSL. Xinerama An extension to X.org. Xinerama allows window managers and applications to use the two or more physical displays as one large virtual display. Refer to the XineramaHOWTO. XML Extensible Markup Language. A universal format for structured documents and data on the Web. Developed by W3C (page 1101), XML is a pared-down version of SGML. See www.w3.org/XML and www.w3.org/XML/1999/XML-in-10-points. XSM X Session Manager. This program allows you to create a session that includes certain applications. While the session is running, you can perform a checkpoint (saves the application state) or a shutdown (saves the state and exits from the session). When you log back in, you can load your session so that everything in your session is running just as it was when you logged off. Z Shell zsh. A shell (page 1092) that incorporates many of the features of the Bourne Again Shell (page 1058), Korn Shell (page 1077), and TC Shell (page 1097), as well as many original features. Zulu time See UTC on page 1100. File Tree Index A light page number such as 456 indicates a brief mention. /bin 96 echo 464 /boot 96 /dev 96, 134 kmem (OS X) 787 null 141, 438 random 773 tty 482 urandom 773 /etc 96 apt sources.list 1038 aspell.conf 722 at.allow 726 at.deny 726 bashrc 279 cron.allow 765 cron.daily 764 cron.deny 765 cron.hourly 764 cron.monthly 764 cron.weekly 764 crontab 764 csh.cshrc 372 csh.login 372 /etc, continued csh.logout 373 DIR_COLORS 865 fuse.conf 955, 956 group 558, 1042 issue 26 lftp.conf 700 localtime 322 man.conf 876 man.config 876 manpath.config 876 motd 27 opt 96 passwd 371, 1042 printcap 858 profile 278 profile.d 278 ssh 690 ssh_config 700 ssh_known_hosts 693 termcap 1097 timezone 322 X11 96 xinitrc 705 yum.conf 1032 yum.repos.d 1033 1106 File Tree Index /home 96 /lib 96, 827 modules 96 /lost+found 817 /mnt 96 /opt 96 /proc 97 /root 97 /run 97 /sbin 97 /sys 97 /tmp 97, 448 /Users (OS X) 97 /usr 97 bin 97 test 420 games 97 include 97 lib 97, 827 cron (OS X) 765 terminfo 1097 local 97 ascii 1055 sbin 97 share 97 doc 39, 97, 1022 sharelocales 319 info 97 locale 319 magic 1079 man 97 zoneinfo 321, 322 src 97 linux Documentation 39 /var 97 at (OS X) 726 at.allow (OS X) 726 at.deny (OS X) 726 cron.allow (OS X) 765 cron.deny (OS X) 765 jobs (OS X) 726 spool (OS X) 726 dslocal (OS X) 788 /var, continued log 97, 1022 auth.log 1022 messages 787, 1022 secure 1022 name 309 spool 97, 1023 at 726 spoolspool 726 cron 763 spoolatjobs 726 spoolatspool 726 spoolcrontabs 763 /Volumes (OS X) 1044 ~/ (a user’s home directory) .aspell 721 .bash_history 327 .bash_login 278 .bash_logout 279 .bash_profile 194, 195, 278, 327, 348 .bashrc 279 .cshrc 1064 .dir_colors 865 .emacs 214, 215, 257, 259 .inputrc 339 .lftprc 700 .login 372, 1078 .logout 373, 1078 .mysql_history 603 .netrc 822 .nofinger 811 .pgpkey 72, 810 .plan 72, 810 .profile 278, 1087 .project 72, 810 .ssh 690 authorized_keys 703 config 700 id_rsa 703 id_rsa.pub 703 known_hosts 693 .tcshrc 194, 195, 372 .toprc 983 .vimrc 194, 195 Utility Index A light page number such as 456 indicates a brief mention. Page numbers followed by the letter t refer to tables. Symbols : (null) 473, 485 . (dot) 281, 481 [[...]] 494 @ 386, 388, 390, 407 % 407 A alias 342, 377, 407 alloc 407 apropos 35 apt-get 1034–1038 ash see dash aspell 437, 721–724 at 725–728 atq 725–728 atrm 725–728 atrun (OS X) 726 autoconf 753 awk see gawk B basename 447 bash see bash in the Main index (page 1111) batch 725–728 bg 147, 296, 407 bind 340 bindkey 383, 384, 407 builtins 408, 491t bunzip2 65, 732–733 busybox 729–731 bzcat 65, 732–733 bzip2 64, 732–733 bzip2recover 65, 732–733 C cal 734 cat 52, 135, 137, 138, 425, 696, 735–736 cd 92, 114, 313, 408, 737–738 chdir 408 chgrp 117, 739–740 chmod 99–101, 101t, 285, 741–745 chmod (OS X) 1047–1049 chown 746–747 chsh 277, 371 cmp 748–749 col 876 comm 750–751 compress 65, 86, 839, 970, 970 1108 Utility Index configure 752–753 fgrep 833, 834 cp 53, 95, 127, 754–757 file 60, 322, 802–803 cpio 758–762, 805, 808, 840 filetest 400, 408 cron 763–765 find 380, 430, 804–809 crontab 763–765 finger 71, 73t, 810–811 curlftpfs 954–956 fmt 812–813 cut 359, 766–767 free 74 fromdos 63 D dash 277 date 62, 319, 320, 321, 373, 460, 769–771 dd 772–774 fsck 814–818 fsck (OS X) 783 ftp 819–824 fusermount 954–956 declare 305–306, 474 df 775–776 G diff 58, 777–781 g++ 826–830 diff3 778 dircolors 865 dirs 297, 408 disktool (OS X) 783 diskutil (OS X) 782–784 ditto (OS X) 785–786 dmesg 787 dos2mac 63 dos2unix 63 dscl (OS X) 788–790 du 791–793 E e2fsck 816 echo 61, 149, 371, 408, 445, 445t, 464, 794–795 ed 56, 160, 777 egrep 833, 834, 1018 elvis 160 emacs 214–268 gawk 435, 620–652, 1018 gcc 826–830 gdb 827 getfacl 105–109 GetFileInfo (OS X) 831–832 getopts 488–490 getty 323 glob 409 grep 56, 144, 438, 450, 519, 809, 833–837 gunzip 65, 838, 970 gzip 65, 838–840, 970 H hash 326 hashstat 409 head 56, 841–842 history 326, 327, 374, 409 hostname 53 env 348, 471 eval 348, 408, 487 ex 160, 169 exec 405, 408, 453, 480–483 exit 372, 397, 408, 422, 465 expand 796–797 I info 36–38, 153 ispell see aspell export 306, 468–470 J expr 798–801 jobs 30, 148, 295, 409 join 843–845 F false 292 K fc 328–330 kill 30, 147, 409, 484, 486, 846–847, 924 fg 147, 408 killall 848–849 Utility Index 1109 L P launchctl (OS X) 727, 850–851 ldd 903 less 34, 53, 144, 425, 852–855 let 360, 492 lftp 700 limit 409 ln 110, 113, 856–857 local 347, 476 locale 318–320 locate 70 log 409 login 323, 409 logout 372, 397, 409 lpq 55, 858–860 lpr 55, 142, 858–860 lprm 56, 858–860 lpstat 55 ls 52, 98, 116, 861–867 ls–F 397, 409 paste 905–906 pax 907–912 perldoc 519 pinfo 37 plutil (OS X) 913–914 popd 299, 409 port 1051 pr 915–916 printenv 409, 470 printf 917–920 ps 147, 288, 324, 464, 921–924 ps2pdf 877 pstree 324 pushd 298, 397, 410 pwck 644 pwd 86, 114 pwgen 45 M make 556, 869–874 man 33–35, 875–878 mawk see gawk mc 879–885 mesg 75 mingetty 323 mkdir 90–92, 886 mkfs 887–889 mklost+found 817 mlocate 70 more 53, 853 mount 105 mv 54, 94, 95, 890–891 mysqldump 610 R read 439, 477–479, 479t, 479–480 readlink 117 readonly 305, 306 rehash 326, 410 renice 925 repeat 410 rm 52, 115, 398, 926–927 rmdir 93, 928 rsync 674–684, 699 rtorrent 1039 run-parts 764 S sched 410 scp 692, 698–700 see also OpenSSH in the Main index (page 1111) N ncal 734 nice 409, 892–893 nl 894–895 nohup 409, 896 notify 409 O od 897–900 onintr 401, 409 open (OS X) 901–902 otool (OS X) 903–904 screen 694, 931–936 script 62 sdiff 778, 779 sed 654–669 seq 357, 433 set 350, 386, 390, 410, 427, 429, 430, 460, 472 setenv 386, 410, 469 setfacl 105–109 SetFile (OS X) 938–939 sftp 700 sh 277, 1058 shift 410, 425, 461 shopt 350 1110 Utility Index shred 927 sleep 465, 940–941 sort 58, 143, 364, 454, 942–950 source 281, 410 split 951–952 ssh 27, 691, 694–698 see also OpenSSH in the Main index (page 1111) ssh-add 705 ssh-agent 705 ssh-copy-id 703 sshfs 954–956 ssh-keygen 703–704 stat 957–958 stop 410 strings 959 stty 29, 960–963 U umask 411, 994–995 unalias 343, 345, 411 uname 695 uncompress 86, 970 unexpand 796–797 unhash 411 uniq 58, 996–997 unix2dos 63 unix2mac 63 unlimit 411 unset 304, 411 unsetenv 387, 411 updatedb 70 uptime 72, 982, 998 su 32 V sudo 33 vi 160 view 171 vile 160 vim 161–168 vimtutor 161 suspend 410 sysctl (OS X) 964 systemsetup (OS X) 322 T W tac 667, 735 tail 57, 380, 965–967 tar 66–68, 293, 968–972 tee 145, 973 telnet 974–975 Terminal (OS X) 1049 test 419–421, 421t, 422, 426, 429, 432, 436, 441, 456–457, 978–980 time 394, 395t, 410 w 73, 73t, 998–999 wait 411 wc 60, 359, 1000 whatis 36 where 411 whereis 69 which 68, 411, 876, 1001–1002 who 70, 73t, 1003–1004 write 74, 1003 todos 63 top 981–984 touch 94, 139, 733, 985–986 tput 440 tr 63, 142, 282, 767, 987–989 trap 440, 483–486 true 292, 485 tty 990 X xargs 1005–1007 Y yum 1028–1034 yumdownloader 1032 tune2fs 991–992, 993 type 477 Z typeset see declare zcat 65, 838–840 zdiff 840 zless 840 tzconfig 321 tzselect 321 Index An italic page number such as 123 indicates a definition. A light page number such as 456 indicates a brief mention. Page numbers followed by the letter t refer to tables. Only variables that must always appear with a leading dollar sign are indexed with a leading dollar sign. Other variables are indexed without a leading dollar sign. Symbols – – argument 129, 425, 718 ^ in regular expressions 1014 ^ quick substitution character 334 , (comma) operator 497 ; control operator 145, 290, 499 ;; control operator 442, 499 : (null) builtin 473, 485 :– substitutes default values for a variable 473 :? sends to standard error an error message for a variable 474 := assigns default values for a variable 473 ! (NOT) Boolean operator 499, 662 ! event reference 331 !! reexecutes the previous event 331 ? in extended regular expressions 1018 ? special character 148 . (dot) builtin 281, 481 . directory 92 . in regular expressions 1013 ./ executes a file in the working directory 286, 308 .. directory 92 .jpg filename extension 1076 ‘ ...‘ see command, substitution ( ) control operator 292 ((...)) see arithmetic, evaluation [] character class (regular expressions) 1013, 1060 [] special characters 151 [...] see test utility [[...]] builtin 494 {} around positional parameters 459 {} around variable names 304, 390 {} expansion 356, 432, 487 {} in array variables 474 {} in functions 346 @ builtin 386, 388, 390, 407 * in regular expressions 1014 * special character 149 / (root) directory 88, 96 \ escape character 50, 293, 302 \( in regular expressions 1016 \) in regular expressions 1016 \< in regular expressions 895 1112 Main Index \> in regular expressions 895 & (AND) bitwise operator 498 & control operator 145, 146, 290, 325, 499 & in replacement strings (regular expressions) 1017 && (AND) Boolean operator 403, 493, 498, 622 && control operator 145, 292, 457, 499 &> redirects standard output and standard error 146, 283 # comment 288, 428, 872 #! specifies a script shell 287, 428 % builtin 407 + in extended regular expressions 1018 < redirects standard input 137–138 << Here document 450–452 > redirects standard output 136–137 >& duplicates output file descriptor 283, 284, 422 >&2 duplicates file descriptor 284, 422 >> redirects and appends standard output 140 >| redirects output without clobber 140 >1 redirects standard output 282 | (OR) bitwise operator 498 | (OR) Boolean operator (extended regular expression) 1018 | control operator 142, 290, 435, 497, 499 | see pipeline |& control operator 142, 283, 499 |& shorthand for 2>&1 283 || (OR) Boolean operator 493, 498, 622 || control operator 146, 292, 499 ~– synonym for OLDPWD 358 ~ (tilde) completion (tcsh) 381 ~ (tilde) expansion 89, 307, 358, see also home directory ~ in directory stack manipulation 358 ~+ synonym for PWD 358 $ bash parameters 458–467 $ in regular expressions 1014 $ in variable names 302 $– parameter 466 $ prompt 28 $_ parameter 466 $! parameter 465 $? parameter 465 $? variable (tcsh) 391 $(...) see command, substitution $((...)) see arithmetic, expansion ${...} string length 495 ${} expands variables 472 $@ parameter 433, 435, 462 $* parameter 462 $# parameter 463 $# variable 391, 392 $#argv variable 392 $< reads user input (tcsh) 391 $$ parameter 448, 464 $$ variable (tcsh) 396 $0 parameter 458 $n parameters 459 Numerics 0< redirects standard input 282 1>&2 duplicates file descriptor 284, 422 10.0.0.0 (IP address) 1054 172.16.0.0 (IP address) 1054 192.168.0.0 (IP address) 1054 2> redirects standard error 282 802.11 wireless specification 1054 A –a (AND) Boolean operator 426, 493 .a filename extension 828 a.out file 826 aborting execution see kill builtin absolute mode specifications (chmod) 742t, 743t absolute pathnames 88, 132, 1054 access 1054 Access Control Lists see ACLs access permissions 98–109, 1054 change using chmod 99–101, 741–745 directory 103–104 display using ls 99 establish mask using umask 994–995 execute 285–286 setgid see setgid permissions setuid see setuid ACLs 104–109, 1054 access rules 105–109 default rules 108 effective rights mask 106 enabling 105 getfacl utility 105–109 OS X 1047–1049 setfacl utility 105–109 active window 1054 addbanner shell script 486 addition operators 496 Main Index address see the type of address you are looking for (e.g., MAC address, IP) or see the specific address (e.g., 192.168.0.0) address mask see network, mask administrator 32 algorithm 501 alias 342–345, 377–378, 1054 examples 344–345 quotation marks in 343 recursion 343 recursive plunge 345 special (tcsh) 377 substitution 354 alias builtin 342, 377, 407 alloc builtin 407 Almquist Shell see dash shell alphanumeric character 1054 ALT key 223 ambiguous file references 148, 397, 1054 AND (&) bitwise operator 498 AND (&&) Boolean operator 493, 498 AND (–a) Boolean operator 426, 493 angle bracket 1054 animate 1054 anonymous FTP 821 ANSI 11 ansi terminal name 1024 anti-aliasing 1055 API 1055 append 1055 append standard output using >> 140 Apple Human Interface Guidelines 1050 applet 1055 apropos utility 35 apt-get utility 1034–1038 archive 1055 files 63–68, see also backing up files shell file 451 arguments 127, 1055 command line 462 convert standard input into using xargs 1005–1007 testing 421, 429 argv variable 391 arithmetic evaluation (bash) 359, 436, 476, 492–493 expansion (bash) 359–361, 497 expression 1055 operators 495t, 495–499 perform using test 978–980 array 387, 522, 1055 ASCII 1055 EBCDIC conversion 772 man page 876 terminal 1056 ascii file 1055 ash see dash shell ASP (application service provider) 1056 .aspell file 721 aspell utility 437, 721–724 ASPELL_CONF variable 721 aspell.conf file 722 assembly language 10 assignment operators 497 associative arrays (gawk) 626, 639–644 asterisk special character 149 asynchronous event 1056 at directory 726 at directory (OS X) 765 at utility 725–728 at.allow file 726 at.deny file 726 AT&T Bell Laboratories 3, 276 atd daemon 726 atjobs directory 726 atq utility 725–728 atrm utility 725–728 atrun utility (OS X) 726 atspool directory 726 attachments 1056 attributes, extended (OS X) 1044 attributes, file (OS X) 1046 auth.log file 1022 authentication 1056 authentication, OpenSSH 688, 689, 692 authorized_keys file 703 autoconf utility 753 autocorrect variable 396 autolist variable 382, 392 autologout variable 392 automatic mounting 1056 avoided 1056 awk utility see gawk utility B B language 10 back door 1056 back ticks see command, substitution 1113 1114 Main Index background command, running in the 146 foreground versus 146 jobs 146–148 process 325, 1057 backing up files using cp 754 using cpio 758–762 using pax 907–912 using rsync 681 using tar 968–972 BACKSLASH escape character 50, 293, 302 BACKSLASH in replacement strings 1017 BACKSPACE key (erase character) 29 base operators 498 basename 88, 1057 basename utility 447 bash 276, 1058 see also alias; bash history; bash variables; command; command line; operators; shell scripts alias see alias archive 451 arguments 462 arithmetic evaluation 359, 436, 476, 492–493 arithmetic operators 495t, 495–499 arrays see bash variables, array background process 325 builtins see builtins calling program, name of 458 command line see command line command not found error message 132, 286 command substitution see command, substitution commands see command conditional expressions 493 control operator see control operator control structures see control structures debugging prompt 311, 431 directory stack 297–300 expressions 492–499 features 350–351, 351t features shared with tcsh 373 file descriptors 452, 452–455 functions 346–349, 454, 475–476 globbing 362 hash table 325 history see bash history inheritance 468 logical evaluation 493 menu 448 No such file or directory error message 308 operators see operators option flags, displaying 466 options, command line 349 pathname expansion 303 Permission denied error message 132, 285, 308 prompt (PS1) 309 prompt (PS2) 311 prompt (PS3) 311, 449 prompt (PS4) 311, 431 quiz shell script 508 quotation mark removal 364 recursion 501 redirection operators 284t sequence expression 357, 432 set, turns features on and off 350 shopt, turns features on and off 350 special characters 50, 315, 315t standard error see standard error standard input see standard input standard output see standard output startup files 278–281 step values 432 string operators 494t tcsh, features shared with 373 variables see bash variables word splitting 311 –x option 430, 502 bash history 326, 326–342 bind builtin 340 C Shell mechanism, classic 331–335 commands editing 330, 335–342 reexecuting 328–335 viewing 328–329 event 326 designators 331t modifiers 335t numbers 327, 332 reference using ! 331–335 expansion 354 history builtin 326, 327 INPUTRC variable 339 quick substitution 334 Readline Library 335–342 Readline variables 340t substitute modifier 334 variables 327, 327t word designators 332, 334t Main Index bash parameters 300, 300–314 see also bash variables $– 466 $_ 466 $! 465 $? 465 $@ 433, 435, 462 $* 462 $# 463 $$ 448, 464 $0 458 $n 459 parameter null or not set error message 474 positional 458, 458–463 special 463–466 substitution 302 bash variables 300, 300–314, 467–476 see also bash parameters @ subscript 474 * subscript 474 array 474 ASPELL_CONF 721 assigning values to 300 attributes 305, 305t, 305–306 attributes, listing 306 BASH_ENV 279 braces around 304 CDPATH 313, 737 COLUMNS 449 completion 339 default values, assigning 473 default values, substituting 473 EDITOR 330, 763, 854 ENV 279 environment 468 environment, display using printenv 470 error messages, sending to standard error 474 expansion 358 exported variable see bash variables, environment global variable see bash variables, environment HISTFILE 327 HISTFILESIZE 327 history 326, 327t HISTSIZE 327 HOME 307 IFS 311–313 INPUTRC 339 keyword 301, 307–313, 314t LANG 317, 1025 LC_ALL 317 LC_COLLATE 317 LC_CTYPE 317 LC_MESSAGES 317 LC_NUMERIC 317 LESS 853 LINES 449 local 468 MAIL 309 MAILCHECK 309 MAILPATH 309 MAILTO 763, 765 MANPATH 876 naming 300 noclobber 139–141, 736 null, expanding 472 OLDPWD 358, 737 OPTARG 488 OPTIND 488 PAGER 853, 876 parameter substitution 302 PATH 132, 279, 308–309, 447, 1001 POSIXLY_CORRECT 802 PRINTER 859 PS1 309, 310t PS2 294, 311 PS3 311, 449 PS4 311, 431 PWD 358 quoting 302 RANDOM 433, 507 Readline 340t readonly 305 removing 304 REPLY 449, 478 SHELL 855 shell 467 syntax 304 TERM 28, 43, 1025 TZ 320 unexport 470 unset using unset 304 unset, expanding 472 user created 300, 302–304 VISUAL 763 BASH_ENV variable 279 .bash_history file 327 .bash_login file 278 .bash_logout file 279 1115 1116 Main Index .bash_profile file 194, 195, 278–280, 327, 348 .bashrc file 279–280 bashrc file 279 batch utility 725–728 baud 1057 baud rate 1057 BCPL language 10 Bell Laboratories 3, 276 Berkeley UNIX 3, 370, 1057 bg builtin 147, 296, 407 /bin directory 96 bin directory 97 bind builtin 340 binding, key 1076 bindkey builtin 384, 407 bindkey utility (tcsh) 383 BIOS 1057 birthday shell script 450 bit 1057 bucket 141 depth see color depth -mapped display 1057 BitTorrent 1038–1040 bitwise operators 496, 498 blank characters 302, 1057, 1094 blanks 50 block device 1058 disk 1057 number 1058 Perl 521 special file 1058 blocking factor 1058 .bmp filename extension 86 Boolean operators 1058 ! (NOT) 499, 662 && (AND) 403, 493, 498, 622 | (OR; extended regular expression) 1018 || (OR) 493, 498, 622 –a (AND) 426, 493 control see && control operator and || control operator –o (OR) 493 boot 1058 loader 1058 netboot 1082 /boot directory 96 bootstrap 1058 Bourne, Steve 276, 1058 Bourne Again Shell see bash Bourne Shell (original) 276, 1058 brace 1058 brace expansion 356, 487 braces, variables and 304, 390 bracket 1058 bracket, character class 1013 branch 1058 break control structure 405, 441 bridge, network 1058 broadcast 1058 address 1059 network 1059 BSD see Berkeley UNIX buffer 1059 bug 1059 builtins 153, 476–491, 491t, 1059 : (null) 473, 485 . (dot) 281, 481 [] see builtins, test [[...]] 494 @ 386, 388, 390, 407 % 407 alias 342, 377, 407 alloc 407 bg 147, 296, 407 bind 340 bindkey 384, 407 builtins 408 cd 92, 114, 313, 408, 737–738 chdir 408 commands that are symbols 281t declare 305–306, 474 dirs 297, 379, 408 echo 61, 371, 408, 445, 445t, 464, 794–795 eval 348, 408, 487 exec 405, 408, 453, 480–483 executing 325 exit 44, 372, 397, 408, 422, 465 export 306, 468–470 fc 328–330 fg 147, 295, 408 filetest 400, 408 getopts 488–490 glob 409 hashstat 409 history 326, 327, 374, 409 jobs 30, 148, 295, 409 kill 30, 147, 409, 484, 486 Main Index let 360, 492 limit 409 list using info 153 local 347, 476 log 409 login 409 logout 372, 397, 409 ls–F 397, 409 nice 409, 892 nohup 409, 896 notify 409 : (null) 473, 485 onintr 401, 409 popd 299, 379, 409 printenv 409 printf 917–920 pushd 298, 379, 397, 410 pwd 86 read 439, 477–479, 479t, 479–480 readonly 305, 306 rehash 326, 410 repeat 410 sched 410 set 350, 386, 390, 410, 427, 430, 460, 472 setenv 386, 410, 469 shift 410, 461 shopt 350 source 281, 410 stop 410 suspend 410 symbols as commands 281t tcsh 407, 407t–411t test 419–421, 421t, 422, 426, 429, 432, 436, 441, 978–980 time 394, 395t, 410 tput 440 trap 440, 483–486 type 477 typeset see builtins, declare umask 411 unalias 343, 345, 377, 411 unhash 411 unlimit 411 unset 304, 411 unsetenv 387, 411 utilities versus 420 wait 411 where 411 which 411, 1002 1117 builtins builtin 408 bundle shell script 451 bunzip2 utility 65, 732–733 busybox utility 729–731 byte 1059 bytecode 1059 .bz2 filename extension 64, 86, 733 bzcat utility 65, 732–733 bzip2 utility 64, 732–733 bzip2recover utility 65, 732–733 C .C filename extension 829 .c filename extension 86, 828 C locale 318 C programming language 10, 1059 compiler 826 K&R 828 optimizer 827 C++ programming language 11 C89 programming language 11 cable modems 1059 cache 1059 cal utility 734 Cambridge culture 214 Carnegie Mellon University, see CMU cascading stylesheet see CSS cascading windows 1060 case control structure 442–444, 444t, 444–448 case-preserving filesystem (OS X) 1043 case-sensitive 1060 case-sensitivity, filename 85 cat utility 52, 135, 137, 138, 425, 696, 735–736 categories, software package 1037 catenate 52, 137, 735, 1060 cc see gcc and g++ utilities and C programming language .cc filename extension 829 cd builtin 92, 114, 313, 408, 737–738 CDPATH variable 313, 737 cdpath variable 313, 392, 737 chain loading 1060 character alphanumeric 1054 -based 1060 -based terminal 1060 blank 50, 302, 1057, 1094 class 151, 987, 1019t, 1060 control 1063 1118 Main Index character, continued device 1060 escaping 50, 293 list see character, class map 317 meta 1080, see also special characters nonprinting 1083 printable 1086 quoting 50, 293 regular 1089 special see special characters special file 1060 typeface conventions 25 charmap 317 CHARSET 317 chdir builtin 408 check see tick check box 1060 check mark see tick checksum 1060 chgrp utility 117, 739–740 child directories 83, 84 child processes 324, 1061 chkargs shell script 420, 422 chmod utility 99–101, 101t, 285, 741–745 chmod utility (OS X) 1047–1049 chown utility 746–747 chsh utility 277, 371 CIDR 1061 CIFS 1061 CIPE 1061 cipher 1061 ciphertext 1061 class, character 1060 Classless Inter-Domain Routing see CIDR cleanup shell script 666 cleartext 1061 CLI 24, 1061 client 1061 clock, system, set using date 770 cmp utility 748–749 CMU 2 Codd, E. F. 596 code, reentrant 1089 CODEC 1061 col utility 876 collating sequence, machine 1079 color depth 1061 color quality see color depth column 596 COLUMNS variable 449 combo box 1062 Comer, Doug 5 comm utility 750–751 comma operator 497 command 126, 1062 see also builtins; command line arguments 127 arguments, convert standard input into, using xargs 1005–1007 completion 338–339, 381 continuing 293 control flow see control structures editing/repeating 31 execute remotely using ssh 695–697 execute using exec 480–483 executing 325 execution environment 468 grouping 292 –h option 129 ––help option 128 human-readable option 129 location of, display using which 1001–1002 names 127 priority, change using nice 892–893 separating 289–293 simple 126, 130 substitution 361, 361–362, 379, 403, 429, 447, 837, 1062 command line 28, 126, 126–133, 1062 see also command; shell – – argument 129, 425, 718 arguments 127, 459 arguments, initialize using set 460 arguments, promote using shift 461 brace expansion 356, 487 continuation 144, 293, 499 control operators 145 editing 31, 335–342, 383 executing 132 expansion 354–364, 374–378 filters 144 interface 1061 lists 145 mistakes, correcting 29 options 127, 127–129, 349, 349t options, common 718t parse 130, 354 Main Index processing 130–132, 354 substitution see command line, expansion syntax 126 tokens 127, 354 whitespace on the 290 words 127, 354 command not found error message 132, 286 command_menu shell script 444 comments makefile 872 MySQL 598 Perl 525 shell scripts 288, 428 communication, interprocess 60 comparison operators 496 compiler, gcc and g++ 826–830 completion command 338–339 filename 1068 pathname 338 Readline 338 Readline commands 338–339 variable 339 component architecture 1062 compress utility 839, 970 compress/decompress files 63–68 bunzip2 utility 65 bzcat utility 65 bzip2 utility 64, 732–733 bzip2recover utility 65 compress utility 65, 86, 839, 970 gunzip utility 65, 838–840 gzip utility 65, 838–840, 970 OpenSSH 708 tar utility 970 uncompress utility 86 utilities 77t zcat utility 65, 838–840 zdiff utility 840 zless utility 840 computer, diskless 1066 computing, distributed 1066 concatenate see catenate condition code see exit status conditional evaluation operators 497 conditional expressions 493 config file (OpenSSH) 700 configuration file (OS X) see startup files Configure and Build System, GNU 752 1119 configure utility 752–753 connectionless protocol 1062 connection-oriented protocol 1062 console 1062 console, virtual 43, 1101 context diff 781 continuation, command line 144, 499 continue control structure 405, 441 control character 1063 CONTROL characters, display using cat 735 control flow see control structures CONTROL key 25 control operator 289, 289–294, 1063 ; 145, 290, 499 ;; 442, 499 ( ) 292 & 145, 290, 499 && 145, 292, 457, 499 | 142, 290, 435, 497, 499 |& 142, 283, 499 || 146, 292, 499 continuation, implicit command line 499 NEWLINE 290 short-circuiting 291 control structures 398–406, 418–452, 627–629, 657, 1063 AWK 627–629 break 405, 441 case 442–444, 444t, 444–448 continue 405, 441 for 433–435 for...in 431–434 foreach 403 goto 401 Here document 450–452 if 398 if...then 402, 419–422 if...then...elif 402, 425–430 if...then...else 402, 423–425 Perl 533–542 Python 574–578 sed 657 select 448–450 switch 406 until 439–441 while 405, 435–438 CONTROL-\ key (quit) 30 CONTROL-C key (interrupt) 30 CONTROL-D key (EOF) 43, 135, 372 1120 Main Index key (erase character) 29, 130 key (refresh screen) 170 CONTROL-Q key (Xon) 27 CONTROL-U key (line kill) 30, 130, 960, 963 CONTROL-V key (quote CONTROL keys) 51 CONTROL-W key (erase word) 30, 130, 170 CONTROL-X key (line kill) 30 CONTROL-Z key (suspend) 30, 147 convention, end line key 25 conventions, in this book 24–26, 224, 718, 931 cookie 1063 Coordinated Universal Time see UTC copy partition using dd 773 copyleft 6 correct variable 392 correcting typing mistakes 29 count shell script 436 count_down function 476 cp utility 53, 95, 127, 754–757 cp versus ln 111 CPAN 555 cpdir shell script 293 cpio utility 758–762, 805, 808, 840 CPU 1063 cracker 1063 crash 1063 creation date of files, display using ls 99 creator code (OS X) 1047 cron directory 763 cron utility 763–765 cron.allow file 765 cron.daily file 764 cron.deny file 765 cron.hourly file 764 cron.monthly file 764 cron.weekly file 764 crontab file 763 crontab file 764 crontab utility 763–765 crontabs directory 763 cryptography 1063 csh Shell 370, 1059 csh.cshrc file 372 csh.login file 372 csh.logout file 373 .cshrc file 372, 1064 CSRG (Computer Systems Research Group) 3 CSS 1064 CUPS print server 859 CONTROL-H CONTROL-L curlftpfs utility 954–956 current 1064 current directory see working directory cursor 1064 cut utility 359, 766–767 cwd variable 392 .cxx filename extension 829 cypher 1061 D daemons 1064 dash shell 14, 277 data sink 141 data structure 1064 database 596 see also MySQL 596 dbm 1064 gdbm 1064 locale 771 makewhatis 35 mandb 35 ndbm 1064 NIS 1083 SQL 1095 datagram, network 1064 dataless system 1064 date utility 62, 319, 320, 321, 373, 460, 769–771 dbm database 1064 dd utility 772–774 DDoS attack 1064 Debian Almquist Shell see dash shell debug 1064 bash prompt 311, 431 scp using –v 700 server using telnet 976 shell scripts using –e and –u 287 shell scripts using –x 430 shell scripts using xtrace 502 ssh using –v 698 DEBUG signal 484 declare builtin 305–306, 474 decompress files see compress/decompress files decrement operators 497 default 1064 DEL key (erase character) 29 delete character using BACKSPACE 29 delete line using CONTROL-U 30 delete word using CONTROL-W 30 Main Index delimiter, regular expression 1012 delta, SCCS 1065 denial of service see DoS attack; DDoS attack dereference 1065 dereference symbolic links 115–118 descriptors, file 282 desktop 17, 1065 focus 1069 manager 17, 1065 detached process see background, process /dev directory 96, 134 device 1065 block 1058 character 1060 copy files to/from using dd 772–774 drivers 1065 filename 1065 files 134, 1065 independence 15 major number 1079 minor number 1080 null 141 physical 1086 terminal 482 df utility 775–776 DHCP 1065 dialog box 1065 dial-up connection 43 dictionaries, Python 573–574 die, process 325 diff utility 58, 777–781 diff3 utility 778 .dir_colors file 865 DIR_COLORS file 865 dircolors utility 865 directory 13, 51, 83, 788, 1065, see also the File Tree index (page 1105) . 92 .. 92 / (root) 88, 96 ~ (home) see home directory access permissions 103–104 change using cd 92, 737–738 child 83, 84 create using mkdir 90–92, 886 current see working directory delete using rmdir 93 file 83, 1065 hierarchy 1065 1121 home see home directory links to 109–115 list using ls 52 make using mkdir 90–92 mount using curlftpfs 954–956 mount using sshfs 954–956 move using mv 95 parent 83, 84 pathname 83 remove using rmdir 93, 928 rename using mv 95 root (/) 83, 88, 96, 1090 service 1065 stack 297, 297–300, 358, 379, 409, 410 standard 95–97 subdirectories 83 tree 82 working see working directory Directory Service, display information using dscl (OS X) 788–790 dirs builtin 297, 379, 408 dirstack variable 392 disk block 1057 free space 1070 free space, display using df 775–776 partition see partition startup (OS X) 1044 usage, display using du 791–793 diskless system 1066 disktool utility (OS X) 783 diskutil utility (OS X) 782–784 display bit-mapped 1057 color depth 1061 graphical 1070 displaying see also displaying a file date using date 62 hidden filenames 150 machine name using hostname 53 text using echo 61 displaying a file beginning of, using head 56, 841–842 by screens, using less 852–855 dump using od 897–900 end of, using tail 57, 965–967 group, using ls 99 hidden, using ls 150 1122 Main Index displaying a file, continued links, number of using ls 98 number of lines/words/bytes in using wc 1000 owner of, using ls 98 size of, using ls 99 sorted, using sort 58 type of, using ls 99 unique lines in using uniq 996–997 using cat 52, 735–736 using less 53 using more 53 distributed computing 1066 distribution, Linux 6 distribution, Perl 521 ditto utility (OS X) 785–786 division operator 496 dmesg utility 787 .dmg filename extension (OS X) 1043 DMZ 1066 DNS 1066 doc directory 39, 97, 1022 Document Object Model see DOM Documentation file (kernel) 39 documentation, Linux 33–39 DOM 1066 Domain Name Service see DNS door 1066 DoS attack 1066 DOS files, convert from/to Linux format using unix2dos/dos2unix 63 dos2mac utility 63 dos2unix utility 63 double quotation marks see quotation marks DPMS 1066 drag 1067 drag-and-drop 1067 drivers, device 1065 drop-down list 1067 druid 1067 DSA (Digital Signature Algorithm) 1067 dscl utility (OS X) 788–790 DSL 1067 dslocal file (OS X) 788 du utility 791–793 dunique variable 396 duplex, full/half 961 Dynamic Host Configuration Protocol see DHCP E e2fsck utility 816 EBCDIC 772 echo builtin 61, 371, 408, 445, 445t, 464, 794–795 echo utility 149, 464, 794–795 echo variable 396 ed utility 56, 160, 777 EDITOR variable 330, 763, 854 editors 1067 command line 335–342 ed 56, 160, 777 emacs 214–268 ex 160, 169 Readline Library 335–342 vi 160, see also vim vim see vim war (vi vs. emacs) 215 Edwards, Dan 1098 EEPROM 1067 effective user ID 1067 egrep utility 833, 834, 1018 element 1067 ELF binary format 903 elvis utility 160 emacs 214–268 see also emacs commands *Tab Stops* buffer 250 ALT key 220 aspell, using with 723 back tick 246 backups 222 buffer *compilation* 246 *Tab Stops* 250 current 242 modification state 232 retrieving 222 saving 222 scrolling 225 working with 242 commands and key sequences 224 comment-column variable 253 comments 253 CONTROL key 223 current buffer 242 cursor and Point 225 customizing 256–260 delete 226 Main Index Echo Area 218 editing at the cursor 221 editing, language-sensitive 246–255 .emacs file 214, 215, 257, 257t, 259 escape character 236 ESCAPE key 220, 224 exiting from 218 explicit-shell-file-name variable 255 expressions, working with 250 filename extensions and 247 files 239 fill-column variable 249 flow control conflict 224 function, defining 251 getting started 216 help 229 help commands 230t human-language modes 247–252 indention 252–253 inserting text 218 keyboard translation table 256 keymaps 224, 256, 258 keys binding 215 keymaps 224, 258 META 223–224 META, emulating 220 notation 223, 223t remapping 258–259 sequences and commands 224 use 223 Kill Ring 226, 235, 236 killing 226, 235 language-sensitive editing 215, 246–255 Lisp and 231, 256 Major modes 246, 246–255 Mark 233 Mark and Region 233–234 Mark Ring 234 META key 223–224 META keys, emulating 220 Minibuffer 218 Mode Line 217 modeless 215 modes Auto Fill 249 C 250 Dired 254 human-language 247–252 Major 246, 246–255 Shell 255 special-purpose 254 Text 250 more information 260 moving the cursor by characters 220 lines 220 paragraphs 221 sentences 221 window position 221 words 220 numeric argument 225 paragraph-separate variable 248 paragraph-start variable 248 pathname completion 240 Point 225, 233 Point and cursor 225 Region 233 Region and Mark 233–234 remapping keys 258–259 repetition count 219 screen and 933 searches, types of 226 sentence-end variable 248 shell commands 245, 245t, 246 Shell mode 255 SHIFT key 223 smart completion 225 spell check using aspell 723 startup file see emacs, .emacs file TAB key 250 textual units 219 tutorial 216–222 variable, assigning a value to 257 variable, displaying the value of 257 variable, setting the default value of a 257 version-control variable 222 vi simulation 254 vim, versus 215 visiting a file 222, 239 windows 243 other window 244 working with 243–245 words, working with 247 X Window System and 216 yanking 235 1123 1124 Main Index emacs commands see also emacs auto fill 248 buffer 242t, 266t global 237 open in other window 244 replacement 237 search 237 working with 242 C mode 268t case conversion 249, 249t, 268t comments 254t CONTROL-_ (undo) 232 CONTROL-@ (set Mark) 233 CONTROL-A (move to beginning of line) 220 CONTROL-D (delete) 222 CONTROL-E (move to end of line) 220 CONTROL-H (help system) 229 CONTROL-K (kill to end of line) 226 CONTROL-L (refresh screen) 225 CONTROL-N (move down) 220 CONTROL-P (move up) 220 CONTROL-Q (escape) 236 CONTROL-R (backward incremental search) 227 CONTROL-R RETURN (backward nonincremental search) 228 CONTROL-S (forward incremental search) 227 CONTROL-S RETURN (forward nonincremental search) 228 CONTROL-SPACE (set Mark) 233 CONTROL-U (multiply) 225 CONTROL-V (scroll window forward) 225 CONTROL-X 0 (delete window) 245 CONTROL-X 1 (delete other windows) 245 CONTROL-X 4b (opens buffer in other window) 244 CONTROL-X 4f (opens file in other window) 244 CONTROL-X b (switch buffers) 243 CONTROL-X CONTROL-F (visiting another file) 222 CONTROL-X CONTROL-S (save buffer) 222 CONTROL-X CONTROL-X (exchange Mark and Point) 233 CONTROL-X o (select another window) 244 cursor, moving 261t cut and paste 235 Delete 221 deleting characters 218 deleting text 221, 262t erasing text 226 exiting 218 expressions 251t file 266t open in other window 244 saving 241, 242t visiting 239, 239t function definition 252t help 230t, 263t indention 252t kill 235t killing text 262t line 264t line-oriented operations 237t Mark and Region 233–234 Mark, set 233 META-! (run shell command) 245 META-{ (beginning of paragraph) 221 META-} (end of paragraph) 221 META-a (beginning of sentence) 221 META-b (move by words) 220 META-CONTROL-V (scroll another window) 244 META-d (kill word) 226 META-e (end of sentence) 221 META-f (move by words) 220 META-r (middle of window) 221 META-v (scroll window backward) 225 META-x (any command) 224 META-x compile (run shell command in background) 246 mode C 250, 268t selecting 247 Shell 255t, 268t Text 250 numeric argument 225 paragraph 248 paragraph fill 248 refresh screen 225 Region 264t and Mark 233–234 establish 233 operating on a 234t replacement 238t, 264t interactive and unconditional 237 prompt responses 238t query responses 265t search 263t buffer 237 for regular expressions 228t incremental 227 nonincremental 228 Main Index sentence 248 shell 245, 245t, 246, 267t Shell mode 255t, 268t special characters, inserting 236 tokens 251t undo 232t undoing changes 232–233 window 265t adjusting dimensions of 245 create by splitting 244 delete 245 manipulate 244 scroll another 244 scroll forward/backward 225 select another 244 working with 243–245 yank 235, 235t .emacs file 214, 215, 257, 257t, 259 emoticon 1067 empty regular expressions 1016 emulator, operating system 7 emulator, terminal 43 encapsulation see tunneling encryption OpenSSH 689 public key 688 RSA 1091 end line key 25 end of file see EOF ENTER key 25 env utility 348, 471 ENV variable 279 environment variable see variables, environment environment, command execution 468 EOF 1068 EOF signal, send using CONTROL-D 135 epoch 400 EPROM 1068 –eq relational operator 493 equality operators 496 erase key (CONTROL-H) 29, 130, 960, 962, 963 erase word key (CONTROL-W) 130, 961 ERR signal 484 error messages command not found 132, 286 No such file or directory 308 parameter null or not set 474 Permission denied 132, 285, 308 redirecting to standard error 284, 422 1125 send to standard error for a variable using :? 474 standard error see standard error error, standard see standard error errors, correcting typing mistakes 29 escape a character 50, see also quotation marks; quoting /etc directory 96 Ethernet address see MAC address Ethernet network 1068 eval builtin 348, 408, 487 event 1068 asynchronous 1056 bash history see bash history number 374 ex utility 160, 169 exabyte 1068 exception, Python 580 exec builtin 405, 408, 453, 480–483 exec() system call 323 execute access permission 98, 285–286 commands 132, 325 files in the working directory 286 shell scripts 288, 370 exit builtin 44, 372, 397, 408, 422, 465 EXIT signal 483 exit status 360, 394, 420, 422, 465, 1068 expand utility 796–797 expansion arithmetic (bash) 359–361, 497 command line 354–364 pathname 148 tilde 89 exploit 1068 exponentiation operator 496 export builtin 306, 468–470 exported variable see variables, environment expr utility 798–801 expressions 1068 arithmetic 1055 evaluate using test 978–980 logical 1078 regular see regular expression ext2/ext3/ext4 filesystem 85, 991–992 extended attributes (OS X) 1044 extended regular expressions see regular expression, extended Extensible Markup Language see XML extensions, filename see filename, extensions extranet 1068 1126 Main Index F Fahlman, Scott 1093 failsafe session 1068 false utility 292 FAT filesystem 85 fc builtin 328–330 FDDI 1068 fg builtin 147, 295, 408 fgrep utility 833, 834 FHS (Linux Filesystem Hierarchy Standard) 13, 95 Fiber Distributed Data Interface see FDDI fignore variable 392 file 12, 1068 see also displaying a file; filename access permissions see access permissions access time, change using touch 985–986 ambiguous references 148, 397 archive using ... see backing up files, using ... archiving 63–68 attributes (OS X) 1044, 1046 backup using ... see backing up files, using ... block special 1058 character special 1060 classify using file 802–803 compare using cmp 748–749 compare using comm 750–751 compare using diff 58, 777–781 compress see compress/decompress files configuration (OS X) see startup files contents, identify using file 60 convert from/to Linux/Windows format 63 copy over a network using ftp 819–824 copy over a network using rsync 680 copy to/from a device using dd 772–774 copy using cp 53, 94, 754–757 copy using ditto (OS X) 785 create using cat 137 create using touch 985–986 creation date, display using ls 99 creation permissions mask, establish using umask 994–995 crontab 763 data fork (OS X) 1044 descriptors 282, 452, 452–455, 456 divide using split 951–952 dump using od 897–900 duplicate lines in, remove using uniq 58 edit using emacs 216–222 edit using vim 161–168 file utility see file utility find using find 804–809 fork (OS X) 1044 group, display using ls 99 handle, Python 579 hidden 86, 1072 hidden, display using ls 861, 867 information about, display using ls 861–867 information about, display using stat 957–958 invisible see filename, hidden join lines from using join 843–845 lines, join from multiple using paste 905–906 links to 109–115 links, display number of using ls 98 Makefile 872 manager, mc 879–885 modification time, change using touch 985–986 move using mv 94, 890–891 names see filename number lines in using nl 894–895 order using sort 58 ordinary 83, 1084 owner, display using ls 98 paginate using pr 915–916 pathname 83 permissions see access permissions preference (OS X) see startup files print using lpr 55 property list (OS X) 913 reference, ambiguous 397, 1054 remove securely using shred 927 remove using rm 52, 926–927 rename using mv 94, 890–891 resource fork (OS X) 1044 rotate 1091 search for using locate 70 search for using mlocate 70 size, display using ls 99 size, displaying easily readable 129 sort using sort 58, 942–950 sparse 970, 1094 standard 95–97 startup 87, 278–281, 372, 1095 status and if control structure 399t, 400t strings in a, display using strings 959 tar 66 temporary 401, 448, 464 terminal 134 Main Index type code (OS X) 1047 type, display using file 802–803 type, display using ls 99, 864t utilities 76t wipe using dd 774 file utility 60, 322, 802–803 filec variable 396 filename 88, 1068 see also file / (root) 88 ambiguous references 148, 397 basename 88, 1057 case-sensitivity 25, 85 change using mv 54 characters allowed in 84 completion 380, 1068 conventions 25 device 1065 extensions 85, 86t, 1068 creator codes, and (OS X) 1047 emacs and 247 list of C compiler 828t extensions, remove using an event modifier 335 generation 14, 148–153, 362, 1068 hidden 86, 1072 length 84, 85, 448 reference, ambiguous 397 root directory (/) 88 simple 88, 132, 1093 substitution 378 temporary 448, 464 typeface 25 unique 448, 464 Windows 85 filesystem 82, 1068 case preserving (OS X) 1043 clean 815 create using mkfs 887–889 ext2/ext3/ext4 85, 991–992 FAT 85 free list 1070 HFS (OS X) 1043 HFS+ (OS X) 85, 1043, 1044 hierarchy 82 journaling 991, 1076 mount point 1081 nondisk (OS X) 1043 remote 1090 remount 105 repair using diskutil (OS X) 782–784 1127 repair using fsck 814–818 Standard, Linux (FSSTND) 95 structure 12 superblock 1096 tuning 991–992 UFS 85, 1043 Filesystem, Standard, Linux (FSSTND) 13 Filesystems Hierarchy Standard, Linux (FHS) 13 filetest builtin 400, 408 filling 1069 filters 16, 144, 1069 find utility 380, 430, 804–809 find_uid shell script 638 finger utility 71, 73t, 810–811 firewall 1069 firmware 1069 flash drive, write filesystem to using mkfs 889 fmt utility 812–813 focus, desktop 1069 folder see directory font, anti-aliasing 1055 footer 1069 for control structure 433–435 for...in control structure 431–434 foreach control structure 403 foreach_1 shell script 404 foreground 146, 1069 foreground, background versus 146 fork 323, 1069 file (OS X) 1044 resource (OS X) 1044 fork() system call 323, 325 format output using printf 917–920 format text using fmt 812–813 FQDN 1069 frame, network 1070 free list, filesystem 1070 free space, disk 1070 Free Standards Group (FSG) 96 free utility 74 fromdos utility 63 fsck utility 783, 814–818 FSG (Free Standards Group) 96 FSSTND (Linux Filesystem Standard) 13, 95 FTP mount directory using curlftpfs 954–956 PASV (passive) connection 1085 protocol 819 sftp client 700 tutorial 822–824 1128 Main Index ftp utility 819–824 full duplex 961, 1070 full regular expression see regular expression, extended fully qualified domain name see FQDN functions 1070 bash 346–349, 454, 475–476 count_down 476 makepath 502 mycp 454 Python 567, 567t shell 1092 FUSE 955 fuse.conf file 955, 956 fusermount utility 954–956 G games directory 97 gateway 1070 gawk utility 435, 620–652, 1018 actions 623 arithmetic operators 625t associative arrays 626, 639–644 BEGIN pattern 623, 633 comments 623 control structures 627–629 coprocess 648–650 dollar signs 631 END pattern 623, 634 fields 630 format 641 FS variable 638 functions 624, 624t getline statement 646–648 install under OS X 1051 length function 633 NR variable 634 numeric conversion 626t OFS variable 635 operators, arithmetic 625 options 621 patterns 622 printf function 626, 636 quotation marks 630 range operator (,) 623, 632 relational operators 622t sprintf function 625 syntax 620 variables 623, 624t gcc and g++ utilities 826–830, see also C programming language GCOS see GECOS gdb utility 827 gdbm database 1064 –ge relational operator 493 GECOS 1070 generic operating system 9 getfacl utility 105–109 GetFileInfo utility (OS X) 831–832 getopts builtin 488–490 getty utility 323 gibibyte 1070 GID 1071 gid variable 392 GID, change using chown 746–747 .gif filename extension 86 gigabyte 1070 glob builtin 409 global variable see environment variable globbing 148, 362, 379, 412 glyph 1070 GMT see UTC GNU Compiler Collection 826 Configure and Build System 752 emacs 214 General Public License (GPL) 6 GNUmakefile file 869 goto control structure 401 GPL (GNU General Public License) 6 graphical display 1070 grave accent see command, substitution grep utility 56, 144, 438, 450, 519, 809, 833–837 group about (OS X) 1042 access permission 98 change using chgrp 739 change using chown 746–747 display name using ls 99 ID 739 users 1070 windows 1071 group file 558 group file (OS X) 1042 –gt relational operator 493 guest (virtual machine) 7 GUI 1071 check see GUI, tick check box 1060 check mark see GUI, tick Main Index combo box 1062 dialog box 1065 drag 1067 drag-and-drop 1067 drop-down list 1067 list box see GUI, drop-down list radio button 1088 scrollbar 1092 slider 1093 spin box 1094 spinner see GUI, spin box text box 1097 thumb 1098 tick 1098 tick box see GUI, check box tooltip 1098 WYSIWYG 1102 X Window System 16 gunzip utility 65, 838–840, 970 .gz filename extension 65, 86, 839, 972 gzip utility 65, 838–840, 970 H –h option 38, 129 hacker 1071 half duplex 961, 1071 hang up signal 483 hard links see links, hard hash 1071 one-way 1084 SHA1 algorithm 1092 SHA2 algorithm 1092 table 325, 1071 hash utility 326 hashbang 287 hashstat builtin 409 head utility 56, 841–842 header, document 1071 help documentation 33–42 emacs 229, 230t error messages 41 GNU manuals 41, 1022 –h option 38 ––help option 38, 128 –help option 38 HOWTOs 41 info utility 36–38 Internet 41 Linux Documentation Project 41 1129 local 39 log files 1022 mailing lists 1024t man pages 33–35, 875–878 newsgroups 1023 obtaining 33–42, 1023–1024 problem solving 1022 system manuals 33–38 ––help option 38, 128 –help option 38 Here document control structure 450–452, 1072 hesiod 1072 heterogeneous 1072 hexadecimal number 1072 HFS+ filesystem (OS X) 85, 1043 hidden filenames 86, 1072 hidden filenames, display using ls 150 hierarchy 1072 hierarchy, filesystem 82 hinting, subpixel 1096 HISTFILE variable 327 histfile variable 372, 375, 392 HISTFILESIZE variable 327 histlit variable 376, 396 history 1073, see also bash history .history file 372, 375 history builtin 326, 327, 374, 409 history variable 375, 392 HISTSIZE variable 327 /home directory 96 home directory 51, 87, 307, 1073 ~, shorthand for 89, 307 startup files 87, 372 working directory versus 93 HOME variable 307, 392 home variable 392 host key, OpenSSH 689 host, virtual machine 7 hostname utility 53 hover 1073 HOWTOs 41 HTML 1073 HTTP protocol 1073 hub 1073 human-readable option 129 humor 6, 214, 961, 1093 hunks (diff) 59 HUP signal 483, 896 hypertext 1073 Hypertext Markup Language see HTML hypervisor 8 1130 Main Index I .i filename extension 827, 829 I/O device see device i18n 317, 1073 IANA (Internet Assigned Numbers Authority) 1073 ICMP packet 1073 icon 1073 iconify 1074 id_rsa file 703 id_rsa.pub file 703 if control structure 398 if...then control structure 402, 419–422 if...then...elif control structure 402, 425–430 if...then...else control structure 402, 423–425 IFS variable 311–313 ignored window 1074 ignoreeof variable 372, 397 .ii filename extension 827, 829 include directory 97 increment operators 497 indent shell script 666 indentation see indention indention 1074 inequality operator 496 info directory 97 info utility 36–38, 153 inheritance, bash 468 init daemon 323 inodes 1074 display using ls 112, 862 locate using find 430 input 1074 input, hide using stty 962 input, standard see standard input input/output device see device .inputrc file 339 INPUTRC variable 339 installation, computer 1074 INT signal 483 Integrated Services Digital Network see ISDN interactive 1074 interface 1074 graphical user see GUI pseudographical 28 user 1100 internal field separator see IFS variable International Organization for Standardization see ISO internationalization 316 Internet 1074 Assigned Numbers Authority see IANA netiquette 1082 Printing Protocol see IPP protocol Protocol see IP service provider see ISP URI 1100 URL 1100 internet (lowercase “i”) 1074 interprocess communication 16, 60 interrupt handling 401 interrupt key 30, 401, 960, 963 intranet 1075 invisible files see hidden filenames IP 1075 address 1075 multicast see multicast spoofing 1075 IPC 1075 IPP protocol 859 iptables, masquerade 1079 IPv6 1075 is_regfile shell script 421 isatty() system call 134 ISDN 1075 ISO 1075 .iso filename extension 1043 ISO9660 filesystem 1043, 1076 ISP 1076 ispell utility see aspell utility issue file 26 iterable 572, 577, 578 J job 146, 294 control 15, 146, 294–297, 378, 407, 408, 409, 410, 1076 jobs builtin 148, 295 number 146 number, determining using jobs 148 suspend using CONTROL-Z 147 jobs builtin 30, 409 jobs directory (OS X) 726 join 596 join utility 843–845 journaling filesystem 991, 1076 Joy, Bill 1059 Main Index JPEG 1076 .jpeg filename extension 86, 1076 .jpg filename extension 86, 1076 justify 1076 K K&R 11, 828 KDE desktop 17 Kerberos 1076 kernel 6, 1076 see also Linux command execution environment 468 display messages using dmesg 787 loadable module 1078 Mach 2 module 1078 programming interface 12 space 1076 kernelspace 1076 Kernighan & Ritchie 11 key, META (OS X) 1049 keyboard 1076 keyboard as standard input 135 Keychain password (OS X) 45 keys BACKSPACE (erase character) 29 binding 1076 CONTROL 25 CONTROL-\ (quit) 30 CONTROL-C (interrupt) 30 CONTROL-D (EOF) 43, 135, 372 CONTROL-H (erase character) 29, 130 CONTROL-L (refresh screen) 170 CONTROL-Q (Xon) 27 CONTROL-U (line kill) 30, 130, 960, 963 CONTROL-W (erase word) 30, 130, 170 CONTROL-X (line kill) 30 CONTROL-Z (suspend) 30, 147 DEL (erase character) 29 emacs notation 223, 223t end line 25 ENTER 25 erase 29, 960, 962, 963 interrupt 30, 401, 960, 963 line kill 30, 130, 960, 963 META 223–224, 1080 NEWLINE 25 RETURN 25, 130 1131 suspend 961, 963 typeface 25 keyword variables 301 keywords, search for using apropos 35 kill builtin 30, 147, 484, 486 kill line key (CONTROL-U) 30, 130, 960, 963 KILL signal 31, 484 kill utility 846–847, 924 killall utility 848–849 kilo- 1077 kmem file (OS X) 787 known_hosts file 693 Korn, David 277, 1077 Korn Shell 277, 1077 ksh shell 277, 1077 KVM 9 L l10n 317, 1077 LAMP stack 596 LAN 1077 LANG variable 317, 1025 language, procedural 500 large number 1077 launchctl utility (OS X) 727, 850–851 launchd daemon (OS X) 850–851 LC_ALL variable 317 LC_COLLATE variable 317 LC_CTYPE variable 317 LC_MESSAGES variable 317 LC_NUMERIC variable 317 LDAP 1077 ldd utility 903 –le relational operator 493 leaf 1077 least privilege 1077 less utility 34, 53, 144, 425, 852–855 LESS variable 853 let builtin 360, 492 lexical variable 522 lftp utility 700 lftp.conf file 700 .lftprc file 700 /lib directory 96 lib directory 97, 827 Lightweight Directory Access Protocol see LDAP limit builtin 409 line kill key (CONTROL-U) 30, 130, 960, 963 1132 Main Index line numbers, display using cat 735 line numbers, display using gawk 634 line numbers, display using nl 894 Line Printer Daemon see lpd daemon LINES variable 449 links 13, 109, 109–115, 1077 delete using rm 115 display using ls 98, 112 find using lnks 427 hard 110–112, 1071, 1077 hard versus symbolic 110, 112 hard, create using ln 110, 856–857 point-to-point 1086 remove using rm 115 soft see links, symbolic symbolic 112, 1077, 1097 cd and 114 create using ln 113, 856–857 dereference 115–118, 1065 versus hard 110, 112 symlinks see links, symbolic Linux see also kernel benefits 6–9 distribution 6 documentation 33–42 Documentation Project 41 FHS (Filesystem Hierarchy Standard) 13, 96 Foundation 96 FSSTND (Filesystem Standard) 13, 95 history 2–6 LSB (Linux Standard Base) 96 manual sections 34 newsgroups 1023 overview 11–18 PAM see PAM Pluggable Authentication Modules see PAM standards 6 UNIX heritage 3 linux terminal name 1024 Lisp programming language 214 list box see drop-down list list operator 987, see also character, class list, Perl 522 listjobs variable 397 listlinks variable 397 lists 145 lists, Python 569–573 ln utility 110, 113, 856–857 ln utility versus cp 111 lnks shell script 427 load average, display using uptime 72 load average, display using w 73, 998 loadable modules 1078 local area network see LAN local builtin 347, 476 local directory 97 local variables 325 locale 316–320, 1078 C 318 i18n 317 internationalization 316 l10n 317 LANG variable 317 LC_ variables, setting 319 LC_ALL variable 317 LC_COLLATE variable 317 LC_CTYPE variable 317 LC_MESSAGES variable 317 LC_NUMERIC variable 317 locale utility 318–320 locales directory 319 localization 316 setlocale() system call 317 locale database 771 locale file 319 locale utility 318–320 locales directory 319 locality, process 468 localization 316 localtime file 322 locate utility 70 locktty shell script 440 log builtin 409 log directory 97, 1022 log files, obtain help using 1022 logging in 1078 see also login automatically using OpenSSH 702–704 problems 42 remotely 43 remotely using ssh 691, 694–698 remotely using telnet 974–975 logging out 43, 1078 logical evaluation 493 expressions 1078 operators see Boolean operators Main Index login see also logging in last 27 name see usernames root 1091 shell 278, 1078 terminal/terminal emulator 26 .login file 372, 1078 login builtin 409 login utility 323 loginsh variable 397 .logout file 373, 1078 logout builtin 372, 397, 409 lost+found directory 817 lpd daemon 859 lpq utility 55, 858–860 LPR line printer system 859 lpr utility 55, 142, 858–860 lprm utility 56, 858–860 lpstat utility 55 ls utility 52, 98, 116, 861–867 LSB (Linux Standard Base) 96 ls–F builtin 397, 409 –lt relational operator 493 M .m filename extension 829 MAC address 1078 Mach kernel 2 machine collating sequence 1079 machine name, display using hostname 53 MacPorts 1050 macro 1079 magic file 1079 magic number 803, 1079 mail MAIL variable 309 mailbox 309 MAILCHECK variable 309 MAILPATH variable 309 MDA 1079 MTA 1081 MUA 1081 network addresses 76 utilities 76 MAIL variable 309 mail variable 392 mailbox 309 1133 MAILCHECK variable 309 MAILPATH variable 309 MAILTO variable 763, 765 main memory 1079 mainframe computer 9 major device number 1079 make utility 556, 869–874 Makefile file 869, 872 makefile file 869 makepath function 502 makewhatis database 35 MAN 1079 man directory 97 man utility 33–35, 875–878 man.conf file 876 man.config file 876 mandb database 35 man-in-the-middle 1079 MANPATH variable 876 manpath.config file 876 mask see network, mask masquerading, IP 1079 Massachusetts Institute of Technology see MIT mawk utility see gawk utility mc utility 879–885 McCarthy, John M. 214 MD5 1079 MDA 1079 mebibyte 1079 megabyte 1079 memory see also RAM free, allocating to buffers 74 main 1079 usage, display using free 74 menu 1080 bash 448 shell script 444 merge 1080 mesg utility 75 message deny using mesg 75 of the day, see motd file send using write 74 usage see usage messages Message Digest 5 see MD5 messages file 787, 1022 META characters, display using cat 735 META key 223–224, 1080 1134 Main Index key (OS X) 1049 metabit 1055 metacharacters 1080, see also special characters metadata 1080 creator code (OS X) 1047 file type code (OS X) 1047 OS X 1046 metropolitan area network 1079 microprocessor 10 Microsoft Windows see Windows ... Midnight Commander see mc utility MIME 1080 MIME type, display using file 802 mingetty utility 323 minicomputer 9 mini-HOWTOs 41 minimize window 1080 MINIX 5 minor device number 1080 mirror 681 mistakes, correct typing 29 MIT 16 MIT Lisp culture 214 mkdir utility 90–92, 886 mkfs utility 887–889 mklost+found utility 817 mlocate utility 70 .mm filename extension 829 /mnt directory 96 modem 1081 modem, cable 1059 module, Perl 521 modules directory 96 more utility 53, 853 motd file 27 mount 1081 automatic 1056 point 1081 remount option 105 mount utility 105 mouse 1081 mouseover 1081 pointer 1081 pointer, hover 1073 mouseover 1081 MTA 1081 MUA 1081 multiboot specification 1081 multicast 1081 META multiplication operator 496 multiplicative suffixes 717t Multipurpose Internet Mail Extension see MIME multitasking 12, 1081 multiuser 12, 1081 mv utility 54, 94, 95, 890–891 my 602 .my.cnf file 602 mycp function 454 MySQL 596–616 .my.cnf file 602 adding a user 604 adding data 607 column 596 comments 598 creating a database 603 creating a table 605 database 596 datatypes 598 install under OS X 1051 installing 599 joins 596, 613–616 logging in 605 mysql_history file 603 mysqldump utility 610 options 600 prompt 597 relational database management system 596 retrieving data 608 root password 601 row 596 SELECT statement 608 SQL 597 table 597 mysql_history file 603 mysqldump utility 610 N name, command 127 name, login see usernames namespace 586, 1081 NAT 1082 NBT 1082 ncal utility 734 ndbm database 1064 –ne relational operator 493 negation operator 496 NetBIOS 1082 Main Index netboot 1082 netiquette 1082 .netrc file 822 network address 1082 address space, private 1086 address, mail 76 boot 1082 bridge 1058 broadcast 1058, 1059 broadcast address 1059 datagram 1064 Ethernet 1068 extranet 1068 FDDI 1068 frame 1070 gateway 1070 hub 1073 ICMP packet 1073 internet (lowercase “i”) 1074 mask 1082 metropolitan area 1079 multicast 1081 number see network, address packet 1084 packet filtering 1084 packet sniffer 1084 port forwarding 1086 private address space 1086 privileged port 1087 router 1091 segment 1083 sniff 1094 subnet 1096 subnet addresses 1096 subnet masks 1096 subnet numbers 1096 switch 1083 token ring 1098 topology, shared 1092 tunneling 1099 UDP 1099 unicast 1099 VPN 1101 wide area see WAN Wi-Fi 1102 Network Address Translation see NAT Network Information Service see NIS Network Time Protocol see NTP 1135 control operator 290 key 25 NEWLINE, quote using a backslash 293 NEWLINE, sed, using to work with 663 newsgroups 1023 NFS 1083 NIC 1083 nice builtin 409, 892 nice utility 892–893 NIS 1083 NIS domain name 1083 nl utility 894–895 NNTP (Network News Transfer Protocol) 1083 No such file or directory error message 308 nobeep variable 397 noclobber variable 139–141, 397, 397t, 736 node 1083 node, Open Directory (OS X) 1042 .nofinger file 811 noglob variable 397 nohup builtin 409, 896 nohup utility 896 nonomatch variable 397 nonprinting character 1083 nonvolatile storage 1083 NOT (!) Boolean operator 499, 662 notify builtin 409 notify variable 397 NTP 1083 null device 141 null file 141, 438 null string 1083 number block 1058 gibibyte 1070 gigabyte 1070 hexadecimal 1072 kilo- 1077 large 1077 line, display using gawk 634 magic 803, 1079 mebibyte 1079 megabyte 1079 octal 1083 sexillion 1092 tera- 1097 undecillion 1099 numeric variable 386 NEWLINE NEWLINE 1136 Main Index O –o (OR) Boolean operator 493 .o filename extension 86, 829 octal number 1083 od utility 897–900 OLDPWD variable 358, 737 one-way hash 1084 onintr builtin 401, 409 open source 1084 open utility (OS X) 901–902 OpenSSH 688–689, 1084 authentication 688, 689, 692 authorized keys 702–704 authorized_keys file 703 automatic login 702–704 client, configuring 692–694 clients 691–702 compression 708 configuration files, client 700–702 configuration files, client and server 689–690 debugging 698, 700 encryption 689 firewall, working with 707 id_rsa file 703 id_rsa.pub file 703 keys 689 keys, personal, store using ssh-agent 705 known_hosts file 693 mount directory using sshfs 954–956 opening a remote shell 695 password vs. personal key authentication 704 port forwarding 706–707 protocol versions (1 and 2) 688 randomart image 704 recommended settings, client 692 rhost authentication 689 rsync utility 699 running commands remotely 695–697 security 688 ssh_known_hosts file 693 ssh-add utility 705 ssh-agent utility 705 ssh-copy-id utility 703 ssh-keygen utility 703–704 tunneling 706–707 tutorial: using ssh and scp to connect to an OpenSSH server 691 X11 forwarding 692, 701, 706 operand 127 operating system 1084 operating system, generic/proprietary 9 operators arithmetic (bash) 495t, 495–499 Boolean see Boolean operators list see character, class logical see Boolean operators postdecrement 389 postincrement 389 redirection (bash) 284, 284t relational 493 remainder 498 short-circuiting 291 string 494t ternary 498 /opt directory 96 opt directory 96 OPTARG variable 488 OPTIND variable 488 option flags, displaying 466 options 127, 1084 options, command line 127–129, 349 OR (|) Boolean operator (extended regular expression) 1018 OR (||) Boolean operator 493, 498 OR (–o) Boolean operator 493 OR bitwise operator (|) 498 ordinary file 83, 1084 OSDL (Open Source Development Labs) 96 other access permission 98 otool utility (OS X) 903–904 out shell script 425 output 1084 output, format using printf 917–920 output, monitor using tail 967 output, standard see standard output owd variable 393, 737 owner access permission 98 owner of a file, change using chown 746–747 owner of file, display using ls 98 P .p filename extension 582 P2P 1084 package variable, Perl 521 package, Perl 521 Main Index packet 1084 filtering 1084 sniffer 1084 page breaks 167 PAGER variable 853, 876 pagers 34, 53, 852–855, 1084 paging 1085 PAM 1085 parameter null or not set error message 474 parameters 300 see also bash parameters expansion (bash) 358 positional 458, 458–463 shell 300, 300–314 special 463–466 parent directories 83, 84 parent process 324, 1085 parentheses, group commands using 292 parse 130, 354 partition 1085 copy using dd 773 free space, display using df 775–776 PASC (Portable Application Standards Committee) 277 passive FTP see FTP, PASV passphrase 1085 passwd file 371, 1042 passwd_check shell script 644 passwords 1085 changing 44 generating using pwgen 45 hide using stty 962 Keychain (OS X) 45 paste utility 905–906 PASV FTP see FTP, PASV path, search 68 PATH variable 132, 279, 308–309, 393, 447, 1001 path variable 393 pathnames 83, 87, 94, 1085 ~ (tilde) in 89 absolute 88, 132, 1054 completion 338 elements 1085 expansion 148, 148–153, 303, 362 last element of 1085 relative 89, 132, 1089 pax utility 907–912 .pdf filename extension 86 peer, BitTorrent 1039 period special character 1013 peripheral device see device Perl 518–561 ::, use of 521, 555 . (dot) operator 530 .. (range) operator 531 @_ array 548 $_ variable 542 $! variable 544 $. variable 541 $#array variable 530 array 522 block 521 chomp function 535, 544 chop function 544 closedir function 559 comparison operators 535, 535t compound statement 522 CPAN 555 defined function 527 die function 536, 545 distribution 521 –e option 550 error checking 545 file handle, magic 543 file test operators 534 for control structure 537–540 foreach control structure 537–540 foreach statement, postfix syntax 538 greedy matching 553 handle 542 if control structure 534 if statement, postfix syntax 534, 551 if...else control structure 536 if...elsif...else control structure 537 keys function 533 –l option 550 last statement 539 lexical variable 522 list 522 magic file handle 543 metacharacters 552t module 521, 555 my function 547 namespace 521, 555 next statement 539 nongreedy matching 553 numeric operators 535t opendir function 559 1137 1138 Main Index Perl, continued operators comparison 535, 535t numeric 535t string 535t options, combining 550 package 521 package variable 521 parsimonious matching 553 pop function 531 postfix syntax 534, 538 push function 531 readdir function 559 regular expression metacharacters 552t regular expressions 550–555 replacement string 552 reverse function 546 say function 523 shift function 531, 549, 558 slice, array 531 sort function 546 special characters 525t splice function 531 split function 560 statement, compound 522 statements 524 string operators 535t subroutines 547–550 substr function 557 syntax 524 uc function 538 unless control structure 534 until control structure 540 use feature 'say' 523 use function 523 use strict statement 519 use warnings statement 519, 527 values function 533 –w option 527 while control structure 540 perldoc utility 519 Permission denied error message 132, 285, 308 permissions see access permissions persistent 1085 .pgpkey file 72, 810 philosophy, UNIX 214 phish 1085 physical device 1086 PID 1086 $! variable 465 $$ variable 396, 464 background process 146 display using ps 921–924 fg 276 number 1 323 numbers 324 process, display using ps 147 pinfo utility 37 pipe symbol (|) see also pipeline continuation, implicit 435, 499 extended regular expressions, in 1018 filters, in 144 lists, in 145 noclobber and 140 redirect output using 60 pipe see pipeline pipeline 16, 60, 141, 141–145, 1086 see also pipe symbol precedence of 497 pixel 1086 plaintext 1086 .plan file 72, 810 plus sign (+) in extended regular expressions 1018 plutil utility (OS X) 913–914 point-to-point link 1086 popd builtin 299, 379, 409 popd variable 397 port utility 1051 portable 10 portmapper 1086 ports 1086 connect to using telnet 976 forwarding 1086 forwarding using OpenSSH 706–707 positional parameters 458, 458–463 POSIX 6, 277 POSIXLY_CORRECT variable 802 postdecrement operators 389 postincrement operators 389 PPID see parent process pr utility 915–916 preference file (OS X) see startup files printable character 1086 printcap file 858 printenv builtin 409 printenv utility 470 Main Index printer control using lpq/lpr/lprm 858–860 page breaks 167 paginate file using pr 915–916 print files using lpr 55 server, CUPS 859 PRINTER variable 859 printf builtin 917–920 printf specification letters 917t printf utility 917–920 private address space 1086 privilege, least 1077 privileged ports 1087 problem solving 1022 /proc directory 97 procedural language 500 procedure 1087 process 132, 323, 323–326, 1087 background 325, 1057 child 324, 1061 die 325 foreground 1069 fork 323 ID see PID identification see PID kill using kill 30, 147, 486, 846–847 kill using killall 848–849 locality 468 parent 324, 1085 priority, change using nice 892–893 sleep 325 sleep utility 940–941 spawn see process, fork spontaneous 323 status, display using top 981–984 structure 323 substitution 364 wake up 325 zombie 981 .profile file 278, 1087 profile file 278 profile.d directory 278 program 1087 name of calling 458 terminating see kill builtin .project file 72, 810 PROM 1087 prompt variable 393, 393t prompt2 variable 394 prompt3 variable 394 prompts 1087 $ 25, 28 bash 309–311 MySQL 597 Python 565 representation 25 shell 25, 27 tcsh 370, 393–394 property list files (OS X) 913 proprietary operating systems 9 protocols 1087 connectionless 1062 connection-oriented 1062 FTP 819 TELNET 974 proxy 1087 proxy gateway 1088 proxy server 1088 .ps filename extension 86 ps utility 147, 288, 324, 464, 921–924 PS1 variable 309, 310t PS2 variable 294, 311 ps2pdf utility 877 PS3 variable 311, 449 PS4 variable 311, 431 pseudographical interface 28 pstree utility 324 pts directory 134 public key encryption 688 pushd builtin 298, 379, 397, 410 pushdsilent variable 397 pushdtohome variable 397 pwck utility 644 pwd builtin 86 pwd utility 114 PWD variable 358 pwgen utility 45 Python 564–591, 1088 + in a print statement 567 append() method, list 570 bool() function 583 control structures 574–578 dictionaries 573–574 exception handling 580 file input/output 579–582 file modes 579t file object methods 579t findall() method 583 1139 1140 Main Index Python, continued floating point variable 569 for control structure 577 function 567, 567t defining 584 importing 588 Lambda 589 group() method 584 if control structure 575 if...elif...else control structure 576 if...else control structure 575 implied display 565 indenting logical blocks 574 invoking 564 items() method, dictionary 574 iterable 572, 577, 578 keys() method, dictionary 573 Lambda function 589 len() function 571 libraries 585 list comprehensions 590 list methods 571t lists 569–573 logical blocks, indenting 574 map() function 589 match() method 584 MatchObject 583 methods 568 module, importing 587 namespace 586 NumPy library 586 open() function 579 pickle 582 print statement 567 prompt 565 quotation marks, single and double 566 range() function 578, 578t, 578 raw_input() function 567 readlines() method, file 580 regular expressions 583 remove() method, list 570 reverse() method, list 570 scalar variable 568 SciPy library 586 search() method 583 sort() method, list 570 sorted() function 571 standard input, reading from 567 standard output, writing to 567 strings 578 type() function 584 values() method, dictionary 573 version 564 while control structure 577 Q Qemu 9 quartz-wm window manager 17 question mark (?) in extended regular expressions 1018 QUIT signal 30, 484 quotation marks see also quoting around variables 302, 386 around whitespace 302 double 420 in aliases 343 in gawk 630 in pathname expansion 363 in Perl 524 in Python 566 removal of 364 single 50 quoting 1088 see also quotation marks characters 50 let arguments 360 NEWLINE characters using \ 293 parentheses in regular expressions 1016 shell variables using \ 302 special characters in regular expressions 1015, 1015t trap, arguments to 484 R radio button 1088 RAID 1088 RAM 1088 RAM disk 1089 random access memory see RAM random file 773 random number generator 773 RANDOM variable 433, 507 randomart image (OpenSSH) 704 RAS 1089 RDBMS see relational database management system RDF 1089 Main Index read access permission 98 read builtin 439, 477–479, 479t, 479–480 Readline completion commands 338–339 Readline Library command editor 335–342 Readline variables 340t readlink utility 117 readonly builtin 305, 306 readonly memory see ROM recursion, infinite (aliases) 345 redirection 15, 135, 1089, see also standard error; standard input; standard output redirection operators (bash) 284t reentrant code 1089 regular character 1089 regular expression 1011, 1011–1018, 1089 ^ 1014 \(...\) brackets expressions 1016 $ 1014 ampersand in replacement strings 1017 anchors 1014 asterisks 1014, 1014t brackets 1013, 1013t, 1016 carets and dollar signs 1014t character class 987, 1060 character classes and bracketed 1019t characters 1012 delimiters 1012 empty 1016 extended 1017, 1018t, 1019t pipe symbol (|) 1018 plus sign (+) 1018 question mark (?) 1018 summary 1019 full 1017, 1018t gawk 622 list operator see character, class longest match 1015 periods 1013, 1013t Perl 550–555 Python 583 quoted digits 1017 quoted parentheses 1016 quoting special characters 1015, 1015t replacement strings 1016, 1020t rules 1015 simple strings 1012, 1012t special characters 1012, 1012, 1015, 1019t summary 1019 vim 185 1141 rehash builtin 326, 410 relational database management system 596, see also MySQL relational operators 493 relative pathnames 89, 132, 1089 religious statue, miniature see icon remainder operators 496, 498 remote filesystem 1090 login 43 procedure call see RPC remount filesystem 105 ren shell script 404 renice utility 925 repeat builtin 410 replacement strings in regular expressions 1016 REPLY variable 449, 478 .repo filename extension 1033 repositories 1037, 1037–1038 request for comments see RFC resolver 1090 Resource Description Framework 1089 restore 1090 restricted deletion flag see sticky bit return code see exit status RETURN key 25, 50, 130 RFC 1090 rhost authentication, OpenSSH 689 Ritchie, Dennis 10 rm utility 52, 115, 398, 926–927 rmdir utility 93, 928 rmstar variable 398 roam 1090 ROM 1090 root see also root account; root privileges directory (/) 83, 88, 96, 1090 filesystem (/) 1091 window 1091 root account 32 root account, login 1091 /root directory 97 root privileges 32 rootkit 1091 ROT13 989 rotate files 1091 router 1091 row 596 RPC 1091 1142 Main Index RSA encryption 1091 rsync utility 674–684, 699 rsyncd daemon 677 rtorrent utility 1039 run 1091 /run directory 97 runlevel 1091 run-parts utility 764 S .S filename extension 829 .s filename extension 828, 829 safedit shell script 446 Samba 1091 NBT 1082 NetBIOS 1082 share 1092 SMB 1093 Windows shares 1092 WINS 1102 sandbox 8 savehist variable 375, 394 /sbin directory 97 sbin directory 97 sched builtin 410 schema 1091 scp utility 692, 698–700, see also OpenSSH screen 133 screen as standard output 135 screen commands 932t screen utility 694, 931–936 script utility 62 scripts, shell see shell scripts scroll 1091 scrollbar 1092 sdiff utility 778, 779 search for a pattern using grep 833–837 for files using locate 70 for files using mlocate 70 for inodes using find 430 for keywords using apropos 35 for strings using grep 56 for strings using vim 184, 188 for utilities using whereis 68 for utilities using which 68 path 68 secure file 1022 security access permissions 98–109 ACL 1054 authentication 1056 back door 1056 checksum 1060 cipher 1061 ciphertext 1061 cleartext 1061 cookie 1063 cracker 1063 cryptography 1063 cypher 1061 DDoS attack 1064 DoS attack 1066 file, wipe using dd 774 IP spoofing 1075 Kerberos 1076 Linux features 13 login, last 27 man-in-the-middle 1079 MD5 1079 MITM see security, man-in-the-middle OpenSSH 688 passphrase 1085 password 44, 1085 PATH variable 309 RSA encryption 1091 setgid files 102 setuid files 102 SHA1 hash algorithm 1092 SHA2 hash algorithm 1092 spoofing 1075 Trojan horse 1098 virus 1101 worm 1102 sed utility 654–669 addresses 655 control structures 657 Hold space 658, 666 instructions 656 options 654 Pattern space 656 syntax 654 seed, BitTorrent 1039 segment, network 1083 select control structure 448–450 sendmail masquerade 1079 seq utility 357, 433 Main Index sequence expression, bash 357, 432 Server Message Block protocol see Samba, SMB servers 1092 debug using telnet 976 X 1102 service, directory 1065 session 1092 failsafe 1068 key, OpenSSH 689 record using script 62 set builtin 350, 386, 390, 410, 427, 430, 460, 472 set group ID see setgid permissions set utility 429 setenv builtin 386, 410, 469 setfacl utility 105–109 SetFile utility (OS X) 938–939 setgid permissions 101–102, 1092 setlocale() system call 317 setuid 101–102, 1092 sexillion 1092 sftp utility 700 .sh filename extension 278 sh Shell 277, 1058 SHA1 hash algorithm 1092 SHA2 hash algorithm 1092 shar file 451 share 1092 share directory 97 shared network topology 1092 shebang 287 shell 14–15, 1092 see also bash; bash parameters; bash variables; command line; job control; shell scripts; tcsh; usage messages ~ (tilde) expansion 89 archive file 451 Bourne (original) 1058 busybox 729–731 csh 1059 dash 14, 277 Debian Almquist see shell, dash default, change using chsh 277 features 350–351 filename generation 148–153 functions 1092, see also bash, functions identifying 29 job control see job, control ksh 277 login 278, 1078 OpenSSH 695 options 350–351 parameters 300, 300–314 pathname expansion 148–153 prompt 25, 27 quoting special characters 302 sh 277, 1058 sleep 133 subshell 292 variables see bash variables shell scripts 284, 284–289, 417–508, 1093 see also bash, functions; usage messages addbanner 486 arguments, testing 429 arithmetic evaluation in using expr 798–801 bash 500–510 birthday 450 bundle 451 chkargs 420, 422 chmod, using to make executable 285–286 cleanup 666 command_menu 444 comments, begin using # 288, 428 count 436 count_down 476 cpdir 293 debug using –e and –u 287 debug using –x 430 debug using xtrace 502 executing 288, 370 exit status 420, 422 filename, unique 464 find_uid 638 foreach_1 404 Here document 450–452 indent 666 input, read using $< 391 input, read using read 477–480 is_ordfile 421 lnks 427 locale 318 locktty 440 makepath function 502 menu 444 out 425 passwd_check 644 positional parameters 458, 458–463 quiz 508 recursion 501 1143 1144 Main Index shell scripts, continued ren 404 safedit 446 shell, specify using #! 287, 428 sortmerg 454 special parameters 463–466 spell_check 437 temporary files 401, 448, 464 whos 434 word_count 641 word_usage 640 SHELL variable 855 shell variable 394 shift builtin 410, 461 shift utility 425 shlvl variable 394 shopt builtin 350 short-circuiting operators 291 shortcut see link shred utility 927 SIG* see signal name (e.g., HUP, KILL) signals 483, 483t, 1093 see also signal name (e.g., KILL) display list of using kill 487 send using kill 846–847 send using killall 848–849 simple command 126, 130 simple filenames 88, 132, 1093 single quotation marks see quotation marks single-user system 1093 sleep, shell 133 sleep utility 465, 940–941 sleep() system call 325 slice see partition slider 1093 SMB see Samba, SMB smiley (smilies, plural) 1093 SMTP 1093 snap, window 1093 sneakernet 1094 sniff 1094 SOCKS 1094 soft links see links, symbolic software termination signal 484 sort 1094 sort utility 58, 143, 364, 454, 942–950 sortmerg shell script 454 source builtin 281, 410 sources.list file 1038 SPACE 1094, see also whitespace SPACE bar 50 spam 1094 sparse file 970, 1094 spawn see fork special aliases 377t special characters 50, 148, 1012, 1094 ? 148 [] 151 * 149 bash 315t filename generation 148–153 pathname expansion 148–153 quoting 302 regular expressions 1011, 1019t special files 1060 special parameters 463–466 spell_check shell script 437 spelling correction (tcsh) 384–385 spelling correction see aspell utility spin box 1094 spinner see spin box split utility 951–952 spontaneous process 323 spoofing, IP 1075 spool 1094 spool directory 97, 726, 1023 SQL 597, 1095, see also MySQL square brackets 1095 square brackets, using in place of test 422 src directory 97 .ssh directory 690 ssh directory 690 ssh utility 27, 691, 694–698, see also OpenSSH ssh_config file 700 ssh_known_hosts file 693 ssh-add utility 705 ssh-agent utility 705 ssh-copy-id utility 703 sshfs utility 954–956 ssh-keygen utility 703–704 stack, directory 297, 297–300, 358, 409, 410 Stallman, Richard 3, 214, 215, 223, 231 standard error 133, 282, 1095 duplicate file descriptor using 1>&2 284, 422 error message, sending to 474 file descriptor 282, 452 redirect 282–284 Main Index redirect error messages to 284, 422 redirect using >& 379 redirect using 2> 282 redirect using exec 482 redirect while redirecting standard output 283 redirect with standard output using &> 146 redirect with standard output using >& 379 test if it is going to the screen 456 standard input 133, 1095 file descriptor 282, 452 keyboard as 135 redirect using < 137–138 redirect using 0< 282 redirect using exec 482 test if it is coming from the keyboard 456 standard output 133, 1095 copy to two places using tee 973 duplicate file descriptor using 2>&1 283 file descriptor 282, 452 overwriting a file, avoid by using noclobber 397 redirect and append using >> 140 redirect using > 136–137 redirect using 1> 282 redirect using exec 482 redirect using tee 145 redirect while redirecting standard error 283 redirect with standard error using &> 146 redirect with standard error using >& 379 screen as 135 test if it is going to the screen 456 standards directories and files 95–97 FHS (Linux Filesystem Hierarchy Standard) 95 FSG (Free Standards Group) 96 FSSTND (Linux Filesystem Standard) 95 Linux 6 LSB (Linux Standard Base) 96 option handling 490 POSIX 6, 277 startup disk (OS X) 1044 startup files 87, 278–281, 372, 1095 bash 278–281 BASH_ENV variable 279 .bash_login 278 .bash_logout 279 .bash_profile 194, 195, 278–280, 348 .bashrc 279–280 bashrc 279 csh.cshrc 372 csh.login 372 csh.logout 373 .cshrc 372, 1064 .emacs 214, 215, 257, 257t, 259 ENV variable 279 .history 372, 375 .inputrc 339 list using ls 867 .login 372, 1078 .logout 373, 1078 OS X 1050 .profile 278, 1087 profile 278 tcsh 372 .tcshrc 194, 195, 372 vim 194 .vimrc 194, 195 stat utility 957–958 stat() system call 803 statements, Perl 524 status line 1095 status variable 394 status, exit 1068 Steele, Guy 214 sticky bit 102, 1095 stop builtin 410 stopping a program see kill builtin streaming tape 1095 streams see connection-oriented protocol strings 1095 comparing 494 length (${...}) 495 null 1083 operators 494t pattern matching 494 search for using grep 56 within double quotation marks 302 strings utility 959 Stroustrup, Bjarne 11 Structured Query Language see SQL; MySQL stty utility 29, 960–963 stylesheet see CSS su utility 32 subdirectories 83, 1095 subnet 1096 address 1096 mask 1096 number 1096 subpixel hinting 1096 1145 1146 Main Index subroutine see procedure subshell 292, 1096 subtraction operator 496 sudo utility 33 superblock 1096 Superuser 32, 1096, see also root account; root privileges suspend builtin 410 suspend key (CONTROL-Z) 30, 147, 961, 963 SVID see System V Interface Definition swap 1096 swap space 1096 swap space, display usage using free 74 swarm, BitTorrent 1039 switch control structure 406 switch, network 1083 symbolic links see links, symbolic symbolic mode operators (chmod) 741t symbolic mode permissions (chmod) 742t symbolic mode user class specification (chmod) 741t symlinks see links, symbolic /sys directory 97 sysctl utility (OS X) 964 system see also system calls clock, set using date 770 dataless 1064 load average 998 mode 1097 single-user 1093 system calls 12 exec() 323 fork() 323, 325 isatty() 134 setlocale() 317 sleep() 325 stat() 803 System V 1097 System V Interface Definition 6 systemsetup utility (OS X) 322 T TAB key 50 table, database 597 table, hash 1071 TABs see also whitespace compress using unexpand 796–797 display using cat 735 expand using expand 796–797 tac utility 667, 735 tail utility 57, 380, 965–967 Tanenbaum, Andrew 5 tape, streaming 1095 tar file 66 tar utility 66–68, 293, 968–972 .tar.bz2 filename extension 66 .tar.gz filename extension 66, 86 .tar.Z filename extension 66 tarball 66 .tbz filename extension 66 TC Shell 1097, see also tcsh TCP 1097 tcsh 1097 see also tcsh variables aliases 377–378 arithmetic expressions 389 array 387 assignment statement 370 beepcmd alias 377 builtins 407, 407t–411t command completion 381 command line, editing 383 command substitution 379, 403, see also command, substitution control operators 289, 289–293 control structures see control structures cwdcmd alias 377 directory stack 409, 410 directory stack manipulation 379 epoch 400 event number 374 exiting from 372 expressions 389 filename completion 380 filename substitution 378 globbing 379 hash mechanism 409, 410 hash table 325 history 374–376 expansion 374, 375t variables 375t job control 378 nice builtin 892 numeric variable 386 periodic alias 377 positional parameters 391 Main Index postdecrement operators 389 postincrement operators 389 precmd alias 377 prompt 370 redirect, separate standard output from standard error 379 shell alias 377 special aliases 377, 377t spelling correction 384–385 startup files 372, 373 tilde completion 381 variable completion 381 which builtin 1002 word completion 380 tcsh variable 394 tcsh variables 385–398, 467–476 see also tcsh $? 391 $# 391, 392 $#argv 392 $< 391 $$ 396 argv 391 array of numeric 390 array of string 387 ASPELL_CONF 721 autocorrect 396 autolist 382, 392 autologout 392 braces around 390 cdpath 392, 737 correct 392 cwd 392 dirstack 392 dunique 396 echo 396 EDITOR 763, 854 environment 468 fignore 392 filec 396 gid 392 histfile 372, 375, 392 histlit 376, 396 history 375, 392 HOME 392 home 392 ignoreeof 372, 397 LESS 853 listjobs 397 listlinks 397 loginsh 397 mail 392 MAILTO 763, 765 MANPATH 876 naming 386 nobeep 397 noclobber 397, 397t, 736 noglob 397 nonomatch 397 notify 397 numeric 388 numeric, array of 390 owd 393, 737 PAGER 853, 876 PATH 393, 1001 path 393 popd 397 POSIXLY_CORRECT 802 PRINTER 859 prompt 393, 393t prompt2 394 prompt3 394 pushdsilent 397 pushdtohome 397 quoting 386 rmstar 398 savehist 375, 394 SHELL 855 shell 394 shell 467 shlvl 394 status 394 string 386 string, array of 387 substitution 386 tcsh 394 TERM 28, 373 time 394 tperiod 395 user 395 verbose 398 version 395 visiblebell 398 VISUAL 763 watch 395, 409 who 396, 396t .tcshrc file 194, 195, 372 tee utility 145, 973 1147 1148 Main Index teletypewriter 990, 1099 TELNET protocol 974 telnet utility 974–975 temporary file 401, 448, 464 tera- 1097 TERM signal 30, 484 TERM variable 28, 43, 373, 1025 Termcap 1024 termcap file 1097 terminal 1097 ASCII 1056 character-based 1060 device 482 display pathname of using tty 990 display/set parameters using stty 960–963 emulator 26, 43 files 134 interrupt signal 483 login 26 names 1024 specifying 1024 standard input 135 standard output 135 X 1103 Terminal utility (OS X) 1049 terminating execution see kill builtin termination signal, see TERM signal Terminfo 1024 terminfo file 1097 ternary operator 498 test builtin 419–421, 421t, 422, 426, 429, 432, 436, 441, 978–980 test utility 456–457, 978–980 text box 1097 text, format using fmt 812–813 .tgz filename extension 66, 86, 972 theme 1097 thicknet 1097 thinnet 1098 Thompson, Ken 10, 1056 thread safe see reentrant code thumb 1098 tick 1098 tick box see check box .tif filename extension 86, 1098 .tiff filename extension 86, 1098 tilde completion (tcsh) 381 tilde expansion 89, 307, 358 tildes in directory stack manipulation 358 tiled windows 1098 time localtime file 322 setting the system to UTC 320 setting the system using date 770 timezone file 322 to live see TTL TZ variable 320 tzconfig utility 321 tzselect utility 321 zoneinfo directory 321, 322 zones 320 time builtin 394, 395t, 410 time variable 394 timezone file 322 /tmp directory 97, 448 todos utility 63 toggle 1098 token ring network 1098 tokens 127, 354, 1098 tooltip 1098 top utility 981–984 .toprc file 983 .torrent filename extension 1039 torrent, BitTorrent 1039 Torvalds, Linus 2, 5, 6, 1076 touch utility 94, 139, 733, 985–986 tperiod variable 377, 395 tput builtin 440 tr utility 63, 142, 282, 767, 987–989 tracker, BitTorrent 1039 transient window 1098 Transmission Control Protocol see TCP trap builtin 440, 483–486 trap door see back door Trojan horse 1098 true utility 292, 485 TSTP signal 484 TTL 1099 tty file 482 tty utility 990 TTY see teletypewriter tune2fs utility 991–992, 993 tunneling 1099 tunneling using OpenSSH 706–707 tutorial emacs 216–222 ftp 822–824 mc 883 Main Index screen 933 ssh and scp 691 vim 161–168 .txt filename extension 86 type builtin 477 typeface conventions 24 typescript file 62 typeset builtin see declare builtin TZ variable 320 tzconfig utility 321 tzselect utility 321 U UDP (User Datagram Protocol) 1099 UFS filesystem 85 UID 1099 change using chown 746–747 effective 1067 umask builtin 411 umask utility 994–995 unalias builtin 343, 345, 377, 411 uname utility 695 unary operators 495 uncompress utility 970 undecillion 1099 unexpand utility 796–797 unexport a variable 470 unhash builtin 411 unicast packet 1099 unicode 1099 uniq utility 58, 996–997 UNIX Bourne Shell 276 Linux roots in 3 System V 3, 1097 System V Interface Definition 6 unix2dos utility 63 unix2mac utility 63 unlimit builtin 411 unmanaged window 1099 unset builtin 304, 411 unsetenv builtin 387, 411 until control structure 439–441 updatedb utility 70 uptime utility 72, 982, 998 uptime, display using w 73 urandom file 773 URI 1100 1149 URL 1100 usage messages 127, 422, 425, 428, 437, 1100 USB flash drive, write filesystem to using mkfs 889 user display information about using finger 71, 810–811 display information about using w 998–999 display information about using who 1003–1004 ID see UID input, hide using stty 962 interface 1100 mode 1100 name see usernames User Datagram Protocol see UDP user variable 395 usernames 1100 list using w 73 list using who 70 /Users directory (OS X) 97 userspace 1100 /usr directory 97 UTC 320, 1100 UTF-8 1100 utilities 1100, see also commands; the Utility index (page 1107); inside front and back covers builtin 153 builtins versus 420 busybox collection of 729–731 locate using whereis 68 locate using which 68 mail 76 names, typeface 24 UUID 1100 V /var directory 97 variables 1100 see also bash variables; tcsh variables bash 467–476 completion 339, 381 environment 468 Perl lexical 522 Perl package 521 Python 568 shell 467 verbose variable 398 version variable 395 vi utility 160, see also vim 1150 Main Index view utility 171 viewport see workspace vile utility 160 vim 160–210 see also vim commands; vim parameters ^ special character 185 . special character 185 [] special characters 186 @ symbol 170 * special character 185 \< special character 185 \> special character 185 ~ symbol 170 $ special character 185 abnormal termination 172 active links 169 address 194 aspell, using with 723 automatic matching 198 automatic write 196 back tick 199 .bash_profile file 194, 195 beginning-of-line indicator (^) 185 blank-delimited word 204, 204t buffer 191–193 General-Purpose 191, 192 Named 192 Numbered 192 Work 193 calling 206t case sensitivity 197, 1060 case, change using ~ 183 changing text 208t character 203 Command mode 163, 169, 179–183 compatibility, vi 199 copying text 191 correcting a mistake 167, 170 correcting text 166 crash recovery 173 current character 169 current line 169 cursor, moving 174 cursor, moving by Units of Measure 206t deleting text 166, 180, 208t editing files 200 emacs versus 215 ending a session 172 end-of-line indicator ($) 185 key 178 exit, emergency 162, 172 file locks 171 file size 171 flashing the screen 196 format text using fmt 812 General-Purpose buffer 191, 192 help system 165 history 160 incremental search 197 indention 196, 198 Input mode 163, 164, 169, 178–179 inserting text 167, 208t invisible characters 197 Last Line mode 164, 169 line 204 break in two 190 length 171 numbers 178, 197 wrap 197 wrap margin 197 wrapping a long 175 links, active 169 macros 200 markers 199 matching, automatic 198 miscellaneous commands 210t mode, displaying 198 modes of operation 169 moving text 190 moving the cursor 166 before/after a specific character 183 by characters 175 by lines 176 by paragraphs 177 by sentences 177 by words 176 to a specific character 175 within the screen 177 Named buffer 192 Normal mode see vim, Command mode Numbered buffers 192 page breaks 167 paragraphs 205t parameters 196t putting text 210t quitting 168, 172 –R option 171 –r option 173 ESCAPE Main Index recovering text 173 redrawing the screen 170 Repeat Factor 206 replace string 188 report on changes 198 RETURN key 176, 196 safedit script 446 screen 206 scrolling the display 177, 198 search and replace examples 188t any character indicator (.) 185 beginning-of-word indicator (\<) 185 character-class definition ([]) 186 commands 183–189, 209t end-of-word indicator (\>) 185 for a string 184–186, 186t incremental 184, 197 regular expression 185 special characters 185 string 184, 188 wrap scan 199 zero or more occurrences of a character (*) 185 sentences 204, 204t shortcuts 200 simulation in emacs 254 SPACE bar 175 special characters 179, 196 spell checking using aspell 723 start over 200 starting 161 startup files 194 .bash_profile 194 .tcshrc 194, 195 .vimrc 194, 195 status line 169, 170, 197 substitute addresses 187t Substitute command 183t swap file 173 .tcshrc file 194, 195 terminal, specifying 1024 text, adding 208t undoing changes 167 Units of Measure 203–206 vi compatibility 199 view utility 171 viewing different parts of the Work buffer 177 VIMINIT variable 195 .vimrc file 194, 195 vimtutor utility 161 window 171, 206 word, blank-delimited 204 words 203, 203t words, blank-delimited 204t Work buffer 168, 171, 177, 193, 207t write address 194 yanking text 210t vim commands see also vim; vim parameters ; (repeat Find command) 175, 183 :!!command 202 :!command 202 ? (search backward) 184 . (repeat last command) 190, 193 – (up a line) 176 ( (previous sentence) 177 ) (next sentence) 177 { (next paragraph) 177 } (previous paragraph) 177 / (search forward) 184 ~ (change case) 183 a/A (append) 178 :abbrev 201 advanced 210t Append 178 b/B (back word) 176 c/C (change) 181 cc (change line) 182 Change 181, 182t Command mode 169, 179–183 CONTROL-B (backward one screen) 177 CONTROL-D (foward half screen) 177 CONTROL-F (foward one screen) 177 CONTROL-G (display status) 190, 194 CONTROL-L (refresh screen) 170 CONTROL-T (tab) 196 CONTROL-U (backward half screen) 177 CONTROL-V (quote) 179 CONTROL-W (backward by words) 170 d (delete) 180 dd (delete line) 180 Delete 180, 180t, 191 Delete character 180 DOWN ARROW 176 :e! (edit same file) 200 e/E (forward word) 176 :e# (edit last file) 200 ESCAPE 178 1151 1152 Main Index vim commands, continued Exit 193 :f (file) 194 f/F (find) 175 File 194 G (goto) 178 Goto 178 h (move by characters) 175 H (top of screen) 177 i/I (insert) 178 Insert 178 j (down a line) 176 J (join lines) 190 Join 190 k (up line) 176 L (bottom of screen) 177 l (move by characters) 175 LEFT ARROW 175 M (middle of screen) 177 m (set marker) 199 :map (define macro) 200 :n (edit next file) 200 n/N (search next) 184 o/O (open line) 178 Open 178 p/P (put) 191 PAGE DOWN 177 PAGE UP 177 Put 191 :q (quit) 172 Quit 172 Quote 179 :r (read) 193 r/R (replace) 179 Read 193 Repeat 190, 193 Replace 179 RETURN 176, 196 :rew (rewind) 200 RIGHT ARROW 175 s/S (substitute) 182 :sh (shell) 201 shell commands 198, 201 Status 190 Substitute 182, 183–189 substitute for a string 186–189 t/T (find) 183 U (undo line) 179 u (undo) 179, 191 Undo 179, 191 176 :w (write) 172, 193 w/W (move by words) 176 Write 172, 193 x/X (delete character) 180 y/Y (yank) 191 Yank 191 yy (yank line) 191 ZZ (write and exit) 193 vim parameters see also vim; vim commands autoindent 196 autowrite 196 compatible 199 flash 196 ignorecase 197 incsearch 184, 197 laststatus 197 list 197 magic 196 number 197 report 198 scrolling the display 198 setting 194–195 shell 198 shell 201 shiftwidth 198 showmatch 198 showmode 198 wrap 197 wrapmargin 197 wrapscan 199 VIMINIT variable 195 .vimrc file 194, 195 virtual console 43, 1101 machines 7–9 private network see VPN virtualBox 9 viruses 1101 visiblebell variable 398 VISUAL variable 763 Vixie, Paul 763 VLAN 1101 VMM (virtual machine monitor) 8 VMs (virtual machines) 7–9 VMware 9 Volumes directory (OS X) 1044 VPN 1101 vt100/vt102/vt104/ terminal 1024 UP ARROW Main Index W w utility 73, 73t, 998–999 W2K 1101 W3C 1101 wait builtin 411 Wall, Larry 518 WAN 1101 WAP 1101 watch variable 395, 409 wc utility 60, 359, 1000 Web ring 1101 whatis utility 36 where builtin 411 whereis utility 69 which builtin 411, 1002 which utility 68, 876, 1001–1002 while control structure 405, 435–438 whitespace 50, 126, 1101 on the command line 290 quoting 302 who utility 70, 73t, 1003–1004 who variable 396, 396t whos shell script 434 whos2 shell script (OS X) 1042 wide area network see WAN widget 1101, see also GUI Wi-Fi 1102 wildcards 148, 1101, see also special characters window 1102 see also screens cascading 1060 ignored 1074 manager 17, 1102 manager, quartz-wm 17 manager, screen 931–936 minimize 1080 root 1091 scrollbar 1092 slider 1093 snap 1093 thumb 1098 tiled 1098 transient 1098 unmanaged 1099 Windows, convert files from/to Linux format 63 Windows, filename limitations 85 WINS 1102 wireless 802.11 specification 1054 wireless access point 1102 word_count shell script 641 word_usage shell script 640 words 30, 127, 1102 completion 380 count using wc 60 erase key (CONTROL-W) 30, 130, 961 on the command line 354 splitting 311, 362 Work buffer 1102 working directory 86, 1102 change to another using cd 92 executing a file in 286, 308 relative pathnames and 89 significance of 89 versus home directory 93 workspace 1102 workstation 9, 1102 World Wide Web Consortium 1101 worms 1102 write access permission 98 write utility 74, 1003 WYSIWYG 1102 X X server 1102 X terminal 1103 X Window System 16, 1103 X terminal 1103 X11 forwarding, OpenSSH 692, 701, 706 Xinerama 1103 X11 directory 96 xargs utility 1005–1007 Xcode 1050 XDMCP 1103 xDSL 1103 Xen 8 Xinerama 1103 xinitrc file 705 XINU 5 XML 1103 XSM 1103 xterm terminal name 1024 1153 1154 Main Index Y Z yum 1028–1034 .Z filename extension 65, 86 Z Shell 1103 zcat utility 65, 838–840 zdiff utility 840 zless utility 840 zombie processes 981 zoneinfo directory 321, 322 zsh shell 1103 zulu time see UTC configuration file 1032 install option 1028 remove option 1029 update option 1029 yum.conf file 1032 yum.repos.d directory 1033 yumdownloader utility 1032

Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.7
Linearized                      : No
Modify Date                     : 2017:10:31 12:47:21+01:00
Universal                       : The process that creates this PDF constitutes a trade secret of codeMantra, LLC and is protected by the copyright laws of the United States
Code Mantra                     : http://www.codemantra.com
Create Date                     : 2012:08:27 09:27:49+01:00
EBX PUBLISHER                   : Pearson Education Limited (US titles)
PXC Viewer Info                 : PDF-XChange Viewer;2.5.201.0;Jan 23 2012;21:08:47;D:20171031124721+01'00'
Page Count                      : 1169
Has XFA                         : No
XMP Toolkit                     : XMP Core 4.1.1
Producer                        : Acrobat Distiller 9.5.1 (Windows)
Format                          : application/pdf
Page Mode                       : UseOutlines
Page Layout                     : SinglePage
EXIF Metadata provided by EXIF.tools

Navigation menu