Prentice Hall A Practical Guide To Linux Commands Editors And Shell Programming 2nd



A Practical Guide to Linux Commands, Editors, and Shell Programming

SECOND EDITION

Mark G. Sobell

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco

New York • Toronto • Montreal • London • Munich • Paris • Madrid

Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales

(800) 382-3419

corpsales@pearsontechgroup.com

For sales outside the United States, please contact:

International Sales

international@pearson.com

Visit us on the Web: informit.com/ph

Library of Congress Cataloging-in-Publication Data

Sobell, Mark G.

A practical guide to Linux commands, editors, and shell programming /

Mark G. Sobell.—2nd ed.

p. cm.

Includes bibliographical references and index.

ISBN 978-0-13-136736-4 (pbk.)

1. Linux. 2. Operating systems (Computers) I. Title.

QA76.76.O63S59483 2009

005.4'32—dc22

2009038191

This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.

Rights and Contracts Department

501 Boylston Street, Suite 900

Boston, MA 02116

Fax: (617) 671-3447

ISBN-13: 978-0-13-136736-4

ISBN-10: 0-13-136736-6

Text printed in the United States at Edwards Brothers in Ann Arbor, Michigan.

First printing, October 2009

Contents

Preface xxxi

Chapter 1: Welcome to Linux and Mac OS X 1

The History of UNIX and GNU–Linux 2

The Heritage of Linux: UNIX 2

Fade to 1983 3

Next Scene, 1991 4

The Code Is Free 5

Have Fun! 5

What Is So Good About Linux? 6

Why Linux Is Popular with Hardware Companies and Developers 9

Linux Is Portable 9

The C Programming Language 10

Overview of Linux 11

Linux Has a Kernel Programming Interface 11

Linux Can Support Many Users 12

Linux Can Run Many Tasks 12

Linux Provides a Secure Hierarchical Filesystem 12

The Shell: Command Interpreter and Programming Language 13

A Large Collection of Useful Utilities 15

Interprocess Communication 15

System Administration 16

Additional Features of Linux 16

GUIs: Graphical User Interfaces 16

(Inter)Networking Utilities 17

Software Development 17

Chapter Summary 18

Exercises 18

PART I The Linux and Mac OS X Operating Systems 21

Chapter 2: Getting Started 23

Conventions Used in This Book 24

Logging In from a Terminal or Terminal Emulator 26

Working with the Shell 28

Which Shell Are You Running? 28

Correcting Mistakes 29

Repeating/Editing Command Lines 31

su/sudo: Curbing Your Power (root Privileges) 31

Where to Find Documentation 33

The --help Option 33

man: Displays the System Manual 33

apropos: Searches for a Keyword 35

info: Displays Information About Utilities 36

HOWTOs: Finding Out How Things Work 38

Getting Help with the System 38

More About Logging In 40

Using Virtual Consoles 40

What to Do If You Cannot Log In 41

Logging Out 41

Changing Your Password 41

Chapter Summary 43

Exercises 44

Advanced Exercises 44

Chapter 3: The Utilities 45

Special Characters 46

Basic Utilities 47

ls: Lists the Names of Files 47

cat: Displays a Text File 48

rm: Deletes a File 48

less Is more: Display a Text File One Screen at a Time 48

hostname: Displays the System Name 49

Working with Files 49

cp: Copies a File 49

mv: Changes the Name of a File 50

lpr: Prints a File 51

grep: Searches for a String 52

head: Displays the Beginning of a File 52

tail: Displays the End of a File 53

sort: Displays a File in Order 54

uniq: Removes Duplicate Lines from a File 54

diff: Compares Two Files 54

file: Identifies the Contents of a File 56

| (Pipe): Communicates Between Processes 56

Four More Utilities 57

echo: Displays Text 57

date: Displays the Time and Date 58

script: Records a Shell Session 58

todos/unix2dos: Converts Linux and Mac OS X Files to Windows Format 59

Compressing and Archiving Files 60

bzip2: Compresses a File 60

bunzip2 and bzcat: Decompress a File 61

gzip: Compresses a File 62

tar: Packs and Unpacks Archives 62

Locating Commands 65

which and whereis: Locate a Utility 65

slocate/locate: Searches for a File 66

Obtaining User and System Information 67

who: Lists Users on the System 67

finger: Lists Users on the System 68

w: Lists Users on the System 69

Communicating with Other Users 70

write: Sends a Message 70

mesg: Denies or Accepts Messages 71

Email 72

Chapter Summary 72

Exercises 75

Advanced Exercises 75

Chapter 4: The Filesystem 77

The Hierarchical Filesystem 78

Directory Files and Ordinary Files 78

Filenames 79

The Working Directory 82

Your Home Directory 82

Pathnames 83

Absolute Pathnames 83

Relative Pathnames 84

Working with Directories 85

mkdir: Creates a Directory 86

cd: Changes to Another Working Directory 87

rmdir: Deletes a Directory 88

Using Pathnames 89

mv, cp: Move or Copy Files 90

mv: Moves a Directory 90

Important Standard Directories and Files 91

Access Permissions 93

ls -l: Displays Permissions 93

chmod: Changes Access Permissions 94

Setuid and Setgid Permissions 96

Directory Access Permissions 98

ACLs: Access Control Lists 99

Enabling ACLs 100

Working with Access Rules 100

Setting Default Rules for a Directory 103

Links 104

Hard Links 106

Symbolic Links 108

rm: Removes a Link 110

Chapter Summary 111

Exercises 112

Advanced Exercises 114

Chapter 5: The Shell 117

The Command Line 118

Syntax 118

Processing the Command Line 120

Executing the Command Line 123

Editing the Command Line 123

Standard Input and Standard Output 123

The Screen as a File 124

The Keyboard and Screen as Standard Input and Standard Output 125

Redirection 126

Pipes 131

Running a Command in the Background 134

Filename Generation/Pathname Expansion 136

The ? Special Character 137

The * Special Character 138

The [] Special Characters 139

Builtins 141

Chapter Summary 142

Utilities and Builtins Introduced in This Chapter 142

Exercises 143

Advanced Exercises 144

PART II The Editors 147

Chapter 6: The vim Editor 149

History 150

Tutorial: Using vim to Create and Edit a File 151

Starting vim 151

Command and Input Modes 153

Entering Text 154

Getting Help 155

Ending the Editing Session 158

The compatible Parameter 158

Introduction to vim Features 158

Online Help 158

Terminology 159

Modes of Operation 159

The Display 160

Correcting Text as You Insert It 160

Work Buffer 161

Line Length and File Size 161

Windows 161

File Locks 161

Abnormal Termination of an Editing Session 162

Recovering Text After a Crash 163

Command Mode: Moving the Cursor 164

Moving the Cursor by Characters 165

Moving the Cursor to a Specific Character 165

Moving the Cursor by Words 166

Moving the Cursor by Lines 166

Moving the Cursor by Sentences and Paragraphs 167

Moving the Cursor Within the Screen 167

Viewing Different Parts of the Work Buffer 167

Input Mode 168

Inserting Text 168

Appending Text 168

Opening a Line for Text 168

Replacing Text 169

Quoting Special Characters in Input Mode 169

Command Mode: Deleting and Changing Text 169

Undoing Changes 169

Deleting Characters 170

Deleting Text 170

Changing Text 171

Replacing Text 172

Changing Case 173

Searching and Substituting 173

Searching for a Character 173

Searching for a String 174

Substituting One String for Another 176

Miscellaneous Commands 180

Join 180

Status 180

. (Period) 180

Copying, Moving, and Deleting Text 180

The General-Purpose Buffer 181

Named Buffers 182

Numbered Buffers 182

Reading and Writing Files 183

Reading Files 183

Writing Files 183

Identifying the Current File 184

Setting Parameters 184

Setting Parameters from Within vim 184

Setting Parameters in a Startup File 185

The .vimrc Startup File 185

Parameters 185

Advanced Editing Techniques 189

Using Markers 189

Editing Other Files 190

Macros and Shortcuts 190

Executing Shell Commands from Within vim 191

Units of Measure 193

Character 193

Word 193

Blank-Delimited Word 194

Line 194

Sentence 194

Paragraph 195

Screen (Window) 196

Repeat Factor 196

Chapter Summary 196

Exercises 201

Advanced Exercises 202

Chapter 7: The emacs Editor 205

History 206

Evolution 206

emacs Versus vim 207

Command-Line emacs Versus Graphical emacs 208

Tutorial: Getting Started with emacs 208

Starting emacs 208

Exiting 210

Inserting Text 210

Deleting Characters 210

Moving the Cursor 211

Editing at the Cursor Position 214

Saving and Retrieving the Buffer 214

The emacs GUI 215

Basic Editing Commands 216

Keys: Notation and Use 216

Key Sequences and Commands 217

META-x: Running a Command Without a Key Binding 217

Numeric Arguments 218

Point and the Cursor 218

Scrolling Through a Buffer 218

Erasing Text 219

Searching for Text 219

Using the Menubar from the Keyboard 221

Online Help 223

Advanced Editing 225

Undoing Changes 225

Point, Mark, and Region 226

Cut and Paste: Yanking Killed Text 228

Inserting Special Characters 230

Global Buffer Commands 230

Visiting and Saving Files 232

Buffers 235

Windows 236

Foreground Shell Commands 238

Background Shell Commands 239

Major Modes: Language-Sensitive Editing 239

Selecting a Major Mode 240

Human-Language Modes 240

C Mode 243

Customizing Indention 246

Comments 247

Special-Purpose Modes 247

Customizing emacs 249

The .emacs Startup File 250

Remapping Keys 251

A Sample .emacs File 253

More Information 254

Access to emacs 254

Chapter Summary 254

Exercises 262

Advanced Exercises 264

PART III The Shells 267

Chapter 8: The Bourne Again Shell 269

Background 270

Shell Basics 271

Startup Files 271

Commands That Are Symbols 275

Redirecting Standard Error 275

Writing a Simple Shell Script 278

Separating and Grouping Commands 281

Job Control 285

Manipulating the Directory Stack 288

Parameters and Variables 290

User-Created Variables 292

Variable Attributes 295

Keyword Variables 296

Special Characters 304

Processes 306

Process Structure 306

Process Identification 306

Executing a Command 308

History 308

Variables That Control History 308

Reexecuting and Editing Commands 310

The Readline Library 318

Aliases 324

Single Versus Double Quotation Marks in Aliases 325

Examples of Aliases 326

Functions 327

Controlling bash: Features and Options 330

Command-Line Options 330

Shell Features 330

Processing the Command Line 334

History Expansion 334

Alias Substitution 334

Parsing and Scanning the Command Line 334

Command-Line Expansion 335

Chapter Summary 343

Exercises 345

Advanced Exercises 347

Chapter 9: The TC Shell 349

Shell Scripts 350

Entering and Leaving the TC Shell 351

Startup Files 352

Features Common to the Bourne Again and TC Shells 353

Command-Line Expansion (Substitution) 354

Job Control 358

Filename Substitution 358

Manipulating the Directory Stack 359

Command Substitution 359

Redirecting Standard Error 359

Working with the Command Line 360

Word Completion 360

Editing the Command Line 363

Correcting Spelling 364

Variables 365

Variable Substitution 366

String Variables 366

Arrays of String Variables 367

Numeric Variables 368

Braces 370

Special Variable Forms 371

Shell Variables 371

Control Structures 378

if 378

goto 381

Interrupt Handling 381

if...then...else 382

foreach 383

while 385

break and continue 385

switch 386

Builtins 387

Chapter Summary 391

Exercises 392

Advanced Exercises 394

PART IV Programming Tools 395

Chapter 10: Programming the Bourne Again Shell 397

Control Structures 398

if...then 398

if...then...else 402

if...then...elif 405

for...in 411

for 412

while 414

until 418

break and continue 420

case 421

select 427

Here Document 429

File Descriptors 431

Parameters and Variables 434

Array Variables 434

Locality of Variables 436

Special Parameters 438

Positional Parameters 440

Expanding Null and Unset Variables 445

Builtin Commands 446

type: Displays Information About a Command 447

read: Accepts User Input 447

exec: Executes a Command or Redirects File Descriptors 450

trap: Catches a Signal 453

kill: Aborts a Process 456

getopts: Parses Options 456

A Partial List of Builtins 459

Expressions 460

Arithmetic Evaluation 460

Logical Evaluation (Conditional Expressions) 461

String Pattern Matching 462

Operators 463

Shell Programs 468

A Recursive Shell Script 469

The quiz Shell Script 472

Chapter Summary 478

Exercises 480

Advanced Exercises 482

Chapter 11: The Perl Scripting Language 485

Introduction to Perl 486

More Information 486

Help 487

perldoc 487

Terminology 489

Running a Perl Program 490

Syntax 491

Variables 493

Scalar Variables 495

Array Variables 497

Hash Variables 500

Control Structures 501

if/unless 501

if...else 503

if...elsif...else 504

foreach/for 505

last and next 506

while/until 508

Working with Files 510

Sort 513

Subroutines 515

Regular Expressions 517

Syntax and the =~ Operator 518

CPAN Modules 523

Examples 525

Chapter Summary 529

Exercises 529

Advanced Exercises 530

Chapter 12: The AWK Pattern Processing Language 531

Syntax 532

Arguments 532

Options 533

Notes 534

Language Basics 534

Patterns 534

Actions 535

Comments 535

Variables 535

Functions 536

Arithmetic Operators 537

Associative Arrays 538

printf 538

Control Structures 539

Examples 541

Advanced gawk Programming 558

getline: Controlling Input 558

Coprocess: Two-Way I/O 560

Getting Input from a Network 562

Chapter Summary 563

Exercises 563

Advanced Exercises 564

Chapter 13: The sed Editor 565

Syntax 566

Arguments 566

Options 566

Editor Basics 567

Addresses 567

Instructions 568

Control Structures 569

The Hold Space 570

Examples 570

Chapter Summary 581

Exercises 581

Chapter 14: The rsync Secure Copy Utility 583

Syntax 584

Arguments 584

Options 584

Notes 586

More Information 586

Examples 587

Using a Trailing Slash (/) on source-file 587

Removing Files 588

Copying Files to and from a Remote System 590

Mirroring a Directory 590

Making Backups 591

Chapter Summary 594

Exercises 594

PART V Command Reference 597

Standard Multiplicative Suffixes 602

Common Options 603

The sample Utility 604

sample Brief description of what the utility does 605

aspell Checks a file for spelling errors 607

at Executes commands at a specified time 611

bzip2 Compresses or decompresses files 615

cal Displays a calendar 617

cat Joins and displays files 618

cd Changes to another working directory 620

chgrp Changes the group associated with a file 622

chmod Changes the access mode (permissions) of a file 626

chown Changes the owner of a file and/or the group the file is associated with 631

cmp Compares two files 634

comm Compares sorted files 636

configure Configures source code automatically 638

cp Copies files 640

cpio Creates an archive, restores files from an archive, or copies a directory hierarchy 644

crontab Maintains crontab files 649

cut Selects characters or fields from input lines 652

date Displays or sets the system time and date 655

dd Converts and copies a file 658

df Displays disk space usage 661

diff Displays the differences between two text files 663

diskutil Checks, modifies, and repairs local volumes (OS X) 668

ditto Copies files and creates and unpacks archives (OS X) 671

dmesg Displays kernel messages 673

dscl Displays and manages Directory Service information (OS X) 674

du Displays information on disk usage by directory hierarchy and/or file 677

echo Displays a message 680

expr Evaluates an expression 682

file Displays the classification of a file 686

find Finds files based on criteria 688

finger Displays information about users 695

fmt Formats text very simply 697

fsck Checks and repairs a filesystem 699

ftp Transfers files over a network 704

gawk Searches for and processes patterns in a file 711

gcc Compiles C and C++ programs 712

GetFileInfo Displays file attributes (OS X) 717

grep Searches for a pattern in files 719

gzip Compresses or decompresses files 724

head Displays the beginning of a file 727

kill Terminates a process by PID 729

killall Terminates a process by name 731

launchctl Controls the launchd daemon (OS X) 733

less Displays text files, one screen at a time 735

ln Makes a link to a file 740

lpr Sends files to printers 742

ls Displays information about one or more files 745

make Keeps a set of programs current 753

man Displays documentation for commands 759

mkdir Creates a directory 763

mkfs Creates a filesystem on a device 764

Mtools Uses DOS-style commands on files and directories 767

mv Renames or moves a file 771

nice Changes the priority of a command 773

nohup Runs a command that keeps running after you log out 775

od Dumps the contents of a file 776

open Opens files, directories, and URLs (OS X) 780

otool Displays object, library, and executable files (OS X) 782

paste Joins corresponding lines from files 784

pax Creates an archive, restores files from an archive, or copies a directory hierarchy 786

plutil Manipulates property list files (OS X) 792

pr Paginates files for printing 794

ps Displays process status 796

rcp Copies one or more files to or from a remote system 800

renice Changes the priority of a process 802

rlogin Logs in on a remote system 803

rm Removes a file (deletes a link) 804

rmdir Removes directories 806

rsh Executes commands on a remote system 807

rsync Copies files and directory hierarchies securely over a network 809

scp Securely copies one or more files to or from a remote system 810

sed Edits a file noninteractively 812

SetFile Sets file attributes (OS X) 813

sleep Creates a process that sleeps for a specified interval 815

sort Sorts and/or merges files 817

split Divides a file into sections 826

ssh Securely executes commands on a remote system 828

stat Displays information about files 835

strings Displays strings of printable characters 837

stty Displays or sets terminal parameters 838

sysctl Displays and alters kernel variables (OS X) 842

tail Displays the last part (tail) of a file 843

tar Stores or retrieves files to/from an archive file 846

tee Copies standard input to standard output and one or more files 851

telnet Connects to a remote system over a network 852

test Evaluates an expression 854

top Dynamically displays process status 858

touch Creates a file or changes a file’s access and/or modification time 862

tr Replaces specified characters 864

tty Displays the terminal pathname 867

tune2fs Changes parameters on an ext2, ext3, or ext4 filesystem 868

umask Establishes the file-creation permissions mask 870

uniq Displays unique lines 872

w Displays information about system users 874

wc Displays the number of lines, words, and bytes 876

which Shows where in PATH a command is located 877

who Displays information about logged-in users 879

xargs Converts standard input to command lines 881

PART VI Appendixes 885

Appendix A: Regular Expressions 887

Characters 888

Delimiters 888

Simple Strings 888

Special Characters 888

Periods 889

Brackets 889

Asterisks 890

Carets and Dollar Signs 890

Quoting Special Characters 891

Rules 891

Longest Match Possible 891

Empty Regular Expressions 892

Bracketing Expressions 892

The Replacement String 892

Ampersand 893

Quoted Digit 893

Extended Regular Expressions 893

Appendix Summary 895

Appendix B: Help 897

Solving a Problem 898

The Apple Web Site 899

Finding Linux and OS X–Related Information 899

Documentation 900

Useful Linux and OS X Sites 901

Linux and OS X Newsgroups 902

Mailing Lists 903

Words 903

Software 904

Office Suites and Word Processors 906

Specifying a Terminal 906

Appendix C: Keeping the System Up-to-Date 909

Using yum 910

Using yum to Install, Remove, and Update Packages 910

Other yum Commands 912

yum Groups 913

Downloading rpm Package Files with yumdownloader 914

Configuring yum 914

Using apt-get 916

Using apt-get to Install, Remove, and Update Packages 917

Using apt-get to Upgrade the System 918

Other apt-get Commands 919

Repositories 919

sources.list: Specifies Repositories for apt-get to Search 920

BitTorrent 921

Prerequisites 921

Using BitTorrent 922

Appendix D: Mac OS X Notes 925

Open Directory 926

Filesystems 927

Nondisk Filesystems 927

Case Sensitivity 927

/Volumes 928

Carbon Pathnames 928

Extended Attributes 928

File Forks 929

File Attributes 931

ACLs 932

Activating the META Key 935

Startup Files 936

Remote Logins 936

Many Utilities Do Not Respect Apple Human Interface Guidelines 936

Mac OS X Implementation of Linux Features 936

Glossary 939

File Tree Index 989

Utility Index 991

Main Index 995

Welcome to Linux and Mac OS X

In This Chapter

The History of UNIX and GNU–Linux 2
What Is So Good About Linux? 6
Overview of Linux 11
Additional Features of Linux 16

An operating system is the low-level software that schedules tasks, allocates storage, and handles the interfaces to peripheral hardware, such as printers, disk drives, the screen, keyboard, and mouse. An operating system has two main parts: the kernel and the system programs. The kernel allocates machine resources, including memory, disk space, and CPU (page 949) cycles, to all other programs that run on the computer. The system programs include device drivers, libraries, utility programs, shells (command interpreters), configuration scripts and files, application programs, servers, and documentation. They perform higher-level housekeeping tasks, often acting as servers in a client/server relationship. Many of the libraries, servers, and utility programs were written by the GNU Project, which is discussed shortly.


Linux kernel

The Linux kernel was developed by Finnish undergraduate student Linus Torvalds, who used the Internet to make the source code immediately available to others for free. Torvalds released Linux version 0.01 in September 1991.

The new operating system came together through a lot of hard work. Programmers around the world were quick to extend the kernel and develop other tools, adding functionality to match that already found in both BSD UNIX and System V UNIX (SVR4) as well as new functionality. The name Linux is a combination of Linus and UNIX.

The Linux operating system, which was developed through the cooperation of many, many people around the world, is a product of the Internet and is a free (open source; page 969) operating system. In other words, all the source code is free. You are free to study it, redistribute it, and modify it. As a result, the code is available free of cost: there is no charge for the software, source, documentation, or support (via newsgroups, mailing lists, and other Internet resources). As the GNU Free Software Definition (www.gnu.org/philosophy/free-sw.html) puts it:

Free beer

“Free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.”

Mach kernel

OS X runs the Mach kernel, which was developed at Carnegie Mellon University (CMU) and is free software. CMU concluded its work on the project in 1994, although other groups have continued this line of research. Much of the Mac OS X software is open source: the Mac OS X kernel is based on Mach and FreeBSD code, utilities come from BSD and the GNU Project, and system programs come mostly from BSD code, although Apple has developed a number of new programs.

The History of UNIX and GNU–Linux

This section presents some background on the relationships between UNIX and Linux and between GNU and Linux. Go to www.levenez.com/unix for an impressive diagram of the history of UNIX.

The Heritage of Linux: UNIX

The UNIX system was developed by researchers who needed a set of modern computing tools to help them with their projects. The system allowed a group of people working together on a project to share selected data and programs while keeping other information private.

Universities and colleges played a major role in furthering the popularity of the UNIX operating system through the “four-year effect.” When the UNIX operating system became widely available in 1975, Bell Labs offered it to educational institutions at nominal cost. The schools, in turn, used it in their computer science programs, ensuring that computer science students became familiar with it. Because UNIX was such an advanced development system, the students became acclimated to a sophisticated programming environment. As these students graduated and went into industry, they expected to work in a similarly advanced environment. As more of them worked their way up the ladder in the commercial world, the UNIX operating system found its way into industry.

Tip: Linux, OS X, and UNIX

Linux and OS X are closely related to the UNIX operating system. This book describes Linux and OS X. To make reading easier, this book talks about Linux when it means OS X and Linux, and points out where OS X behaves differently from Linux. For the same reason, this chapter frequently uses the term Linux to describe both Linux and OS X features.

BSD (Berkeley) UNIX

In addition to introducing students to the UNIX operating system, the Computer Systems Research Group (CSRG) at the University of California at Berkeley made significant additions and changes to it. In fact, it made so many popular changes that one version of the system is called the Berkeley Software Distribution (BSD) of the UNIX system (or just Berkeley UNIX). The other major version is UNIX System V (SVR4), which descended from versions developed and maintained by AT&T and UNIX System Laboratories. Mac OS X inherits much more strongly from the BSD branch of the tree.

Fade to 1983

Richard Stallman (www.stallman.org) announced[1] the GNU Project for creating an operating system, both kernel and system programs, and presented the GNU Manifesto,[2] which begins as follows:

GNU, which stands for Gnu’s Not UNIX, is the name for the complete UNIX-compatible software system which I am writing so that I can give it away free to everyone who can use it.

Some years later, Stallman added a footnote to the preceding sentence when he realized that it was creating confusion:

The wording here was careless. The intention was that nobody would have to pay for *permission* to use the GNU system. But the words don’t make this clear, and people often interpret them as saying that copies of GNU should always be distributed at little or no charge. That was never the intent; later on, the manifesto mentions the possibility of companies providing the service of distribution for a profit. Subsequently I have learned to distinguish carefully between “free” in the sense of freedom and “free” in the sense of price. Free software is software that users have the freedom to distribute and change. Some users may obtain copies at no charge, while others pay to obtain copies -- and if the funds help support improving the software, so much the better. The important thing is that everyone who has a copy has the freedom to cooperate with others in using it.

1. www.gnu.org/gnu/initial-announcement.html

2. www.gnu.org/gnu/manifesto.html


In the manifesto, after explaining a little about the project and what has been accomplished so far, Stallman continues:

Why I Must Write GNU

I consider that the golden rule requires that if I like a program I must share it with other people who like it. Software sellers want to divide the users and conquer them, making each user agree not to share with others. I refuse to break solidarity with other users in this way. I cannot in good conscience sign a nondisclosure agreement or a software license agreement. For years I worked within the Artificial Intelligence Lab to resist such tendencies and other inhospitalities, but eventually they had gone too far: I could not remain in an institution where such things are done for me against my will.

So that I can continue to use computers without dishonor, I have decided to put together a sufficient body of free software so that I will be able to get along without any software that is not free. I have resigned from the AI Lab to deny MIT any legal excuse to prevent me from giving GNU away.

Next Scene, 1991

The GNU Project has moved well along toward its goal. Much of the GNU operating system, except for the kernel, is complete. Richard Stallman later writes:

By the early ’90s we had put together the whole system aside from the kernel (and we were also working on a kernel, the GNU Hurd,[3] which runs on top of Mach[4]). Developing this kernel has been a lot harder than we expected, and we are still working on finishing it.[5]

...[M]any believe that once Linus Torvalds finished writing the kernel, his friends looked around for other free software, and for no particular reason most everything necessary to make a UNIX-like system was already available.

What they found was no accident -- it was the GNU system. The available free software[6] added up to a complete system because the GNU Project had been working since 1984 to make one. The GNU Manifesto had set forth the goal of developing a free UNIX-like system, called GNU. The Initial Announcement of the GNU Project also outlines some of the original plans for the GNU system. By the time Linux was written, the [GNU] system was almost finished.[7]

3. www.gnu.org/software/hurd/hurd.html

4. www.gnu.org/software/hurd/gnumach.html

5. www.gnu.org/software/hurd/hurd-and-linux.html

6. www.gnu.org/philosophy/free-sw.html

7. www.gnu.org/gnu/linux-and-gnu.html


Today the GNU “operating system” runs on top of the FreeBSD (www.freebsd.org) and NetBSD (www.netbsd.org) kernels with complete Linux binary compatibility and on top of Hurd pre-releases and Darwin (developer.apple.com/opensource) without this compatibility.

The Code Is Free

The tradition of free software dates back to the days when UNIX was released to universities at nominal cost, which contributed to its portability and success. This tradition eventually died as UNIX was commercialized and manufacturers came to regard the source code as proprietary, making it effectively unavailable. Another problem with the commercial versions of UNIX related to their complexity. As each manufacturer tuned UNIX for a specific architecture, the operating system became less portable and too unwieldy for teaching and experimentation.

MINIX

Two professors created their own stripped-down UNIX look-alikes for educational

purposes: Doug Comer created XINU and Andrew Tanenbaum created MINIX.

Linus Torvalds created Linux to counteract the shortcomings in MINIX. Every time

there was a choice between code simplicity and efficiency/features, Tanenbaum

chose simplicity (to make it easy to teach with MINIX), which meant this system

lacked many features people wanted. Linux went in the opposite direction.

You can obtain Linux at no cost over the Internet. You can also obtain the GNU

code via the U.S. mail at a modest cost for materials and shipping. You can support

the Free Software Foundation (www.fsf.org) by buying the same (GNU) code in

higher-priced packages, and you can buy commercial packaged releases of Linux (called distributions; e.g., Ubuntu, Red Hat, openSUSE) that include installation instructions, software, and support.

GPL

Linux and GNU software are distributed under the terms of the GNU General Public License (GPL, www.gnu.org/licenses/licenses.html). The GPL says you have the

right to copy, modify, and redistribute the code covered by the agreement. When

you redistribute the code, however, you must also distribute the same license with

the code, thereby making the code and the license inseparable. If you get source

code off the Internet for an accounting program that is under the GPL and then

modify that code and redistribute an executable version of the program, you must

also distribute the modified source code and the GPL agreement with it. Because

this arrangement is the reverse of the way a normal copyright works (it gives rights

instead of limiting them), it has been termed a copyleft. (This paragraph is not a

legal interpretation of the GPL; it is intended merely to give you an idea of how it

works. Refer to the GPL itself when you want to make use of it.)

Have Fun!

Two key words for Linux are “Have Fun!” These words pop up in prompts and

documentation. The UNIX—now Linux—culture is steeped in humor that can be

seen throughout the system. For example, less is more—GNU has replaced the

UNIX paging utility named more with an improved utility named less. The utility to

view PostScript documents is named ghostscript, and one of several replacements for

the vi editor is named elvis. While machines with Intel processors have “Intel Inside”

logos on their outside, some Linux machines sport “Linux Inside” logos. And Tor-

valds himself has been seen wearing a T-shirt bearing a “Linus Inside” logo.

What Is So Good About Linux?

In recent years Linux has emerged as a powerful and innovative UNIX work-alike.

Its popularity has surpassed that of its UNIX predecessors. Although it mimics

UNIX in many ways, the Linux operating system departs from UNIX in several sig-

nificant ways: The Linux kernel is implemented independently of both BSD and Sys-

tem V, the continuing development of Linux is taking place through the combined

efforts of many capable individuals throughout the world, and Linux puts the power

of UNIX within easy reach of both business and personal computer users. Using the

Internet, today’s skilled programmers submit additions and improvements to the

operating system to Linus Torvalds, GNU, or one of the other authors of Linux.

Standards

In 1985, individuals from companies throughout the computer industry joined

together to develop the POSIX (Portable Operating System Interface for Computer

Environments) standard, which is based largely on the UNIX System V Interface

Definition (SVID) and other earlier standardization efforts. These efforts were

spurred by the U.S. government, which needed a standard computing environment to

minimize its training and procurement costs. Released in 1988, POSIX is a group of

IEEE standards that define the API (application program interface; page 941), shell,

and utility interfaces for an operating system. Although aimed at UNIX-like systems,

the standards can apply to any compatible operating system. Now that these stan-

dards have gained acceptance, software developers are able to develop applications

that run on all conforming versions of UNIX, Linux, and other operating systems.

Applications

A rich selection of applications is available for Linux, both free and commercial,

as well as a wide variety of tools: graphical, word processing, networking, security,

administration, Web server, and many others. Large software companies have

recently seen the benefit in supporting Linux and now have on-staff programmers

whose job it is to design and code the Linux kernel, GNU, KDE, or other software

that runs on Linux. For example, IBM (www.ibm.com/linux) is a major Linux supporter. Linux conforms ever more closely to POSIX standards, and some distributions and parts of others meet this standard. These developments indicate

that Linux is becoming mainstream and is respected as an attractive alternative to

other popular operating systems.

Peripherals

Another aspect of Linux that appeals to users is the amazing range of peripherals that is

supported and the speed with which support for new peripherals emerges. Linux often

supports a peripheral or interface card before any company does. Unfortunately

some types of peripherals—particularly proprietary graphics cards—lag in their

support because the manufacturers do not release specifications or source code for

drivers in a timely manner, if at all.

Software

Also important to users is the amount of software that is available: not just source

code (which needs to be compiled) but also prebuilt binaries that are easy to install

and ready to run. These programs include more than free software. Netscape, for

example, has been available for Linux from the start and included Java support

before it was available from many commercial vendors. Its sibling Mozilla/Thun-

derbird/Firefox is also a viable browser, mail client, and newsreader, performing

many other functions as well.

Platforms

Linux is not just for Intel-based platforms (which now include Apple computers): It

has been ported to and runs on the Power PC—including older Apple computers

(ppclinux), Compaq’s (née Digital Equipment Corporation) Alpha-based machines,

MIPS-based machines, Motorola’s 68K-based machines, various 64-bit systems,

and IBM’s S/390. Nor is Linux just for single-processor machines: As of version 2.0,

it runs on multiple-processor machines (SMPs; page 978). It also includes an O(1)

scheduler, which dramatically increases scalability on SMP systems.

Emulators

Linux supports programs, called emulators, that run code intended for other operating systems. By using emulators you can run some DOS, Windows, and Macintosh

programs under Linux. For example, Wine (www.winehq.com) is an open-source

implementation of the Windows API that runs on top of the X Window System and

UNIX/Linux.

Virtual machines

A virtual machine (VM or guest) appears to the user and to the software running on

it as a complete physical machine. It is, however, one of potentially many such VMs

running on a single physical machine (the host). The software that provides the vir-

tualization is called a virtual machine monitor (VMM) or hypervisor. Each VM can

run a different operating system from the other VMs. For example, on a single host

you could have VMs running Windows, Ubuntu 7.10, Ubuntu 9.04, and Fedora 10.

A multitasking operating system allows you to run many programs on a single

physical system. Similarly, a hypervisor allows you to run many operating systems

(VMs) on a single physical system.

VMs provide many advantages over single, dedicated machines:

• Isolation: Each VM is isolated from the other VMs running on the same host. Thus, if one VM crashes or is compromised, the others are not affected.

• Security: When a single server system running several servers is compromised, all servers are compromised. If each server is running on its own VM, only the compromised server is affected; other servers remain secure.

• Power consumption: Using VMs, a single powerful machine can replace many less powerful machines, thereby cutting power consumption.

• Development and support: Multiple VMs, each running a different version of an operating system and/or different operating systems, can facilitate development and support of software designed to run in many environments. With this organization you can easily test a product in different environments before releasing it. Similarly, when a user submits a bug, you can reproduce the bug in the same environment it occurred in.

• Servers: In some cases, different servers require different versions of system libraries. In this instance, you can run each server on its own VM, all on a single piece of hardware.

• Testing: Using VMs, you can experiment with cutting-edge releases of operating systems and applications without concern for the base (stable) system, all on a single machine.

• Networks: You can set up and test networks of systems on a single machine.

• Sandboxes: A VM presents a sandbox, an area (system) that you can work in without regard for the results of your work or for the need to clean up.

• Snapshots: You can take snapshots of a VM and return the VM to the state it was in when you took the snapshot simply by reloading the VM from the snapshot.

Xen

Xen, which was created at the University of Cambridge and is now being developed

in the open-source community, is an open-source VMM. Xen introduces minimal

performance overhead when compared with running each of the operating systems

natively.

This book does not cover the installation or use of Xen. For more information on

Xen, refer to the Xen home page at www.cl.cam.ac.uk/research/srg/netos/xen and to

wiki.xensource.com/xenwiki.

VMware

VMware, Inc. (www.vmware.com) offers VMware Server, a free, downloadable,

proprietary product you can install and run as an application under Linux.

VMware Server enables you to install several VMs, each running a different operat-

ing system, including Windows and Linux. VMware also offers a free VMware

player that enables you to run VMs you create with the VMware Server.

KVM

The Kernel-based Virtual Machine (KVM; kvm.qumranet.com and libvirt.org) is an open-source VMM that runs as part of the Linux kernel. It works only on systems whose CPU provides the Intel VT (VMX) or AMD SVM hardware virtualization extensions.

Qemu

Qemu (bellard.org/qemu), written by Fabrice Bellard, is an open-source VMM that

runs as a user application with no CPU requirements. It can run code written for a

different CPU than that of the host machine.

VirtualBox

VirtualBox (www.virtualbox.org) is an open-source VMM developed by Sun Microsystems.

Why Linux Is Popular with Hardware Companies and Developers

Two trends in the computer industry set the stage for the growing popularity of

UNIX and Linux. First, advances in hardware technology created the need for an

operating system that could take advantage of available hardware power. In the

mid-1970s, minicomputers began challenging the large mainframe computers

because, in many applications, minicomputers could perform the same functions

less expensively. More recently, powerful 64-bit processor chips, plentiful and inex-

pensive memory, and lower-priced hard disk storage have allowed hardware com-

panies to install multiuser operating systems on desktop computers.

Proprietary operating systems

Second, with the cost of hardware continually dropping, hardware manufacturers

could no longer afford to develop and support proprietary operating systems. A

proprietary operating system is one that is written and owned by the manufacturer

of the hardware (for example, DEC/Compaq owns VMS). Today’s manufacturers

need a generic operating system that they can easily adapt to their machines.

Generic operating systems

A generic operating system is written outside of the company manufacturing the

hardware and is sold (UNIX, OS X, Windows) or given (Linux) to the manufac-

turer. Linux is a generic operating system because it runs on different types of hard-

ware produced by different manufacturers. Of course, if manufacturers can pay

only for development and avoid per-unit costs (which they have to pay to Microsoft

for each copy of Windows they sell), they are much better off. In turn, software

developers need to keep the prices of their products down; they cannot afford to

create new versions of their products to run under many different proprietary oper-

ating systems. Like hardware manufacturers, software developers need a generic

operating system.

Although the UNIX system once met the needs of hardware companies and

researchers for a generic operating system, over time it has become more propri-

etary as manufacturers added support for their own specialized features and intro-

duced new software libraries and utilities. Linux emerged to serve both needs: It is a

generic operating system that takes advantage of available hardware.

Linux Is Portable

A portable operating system is one that can run on many different machines. More than 95 percent of the Linux operating system is written in the C programming language, and C itself is portable because its compiler is written in a higher-level, machine-independent language. (The C compiler is written in C.)

Because Linux is portable, it can be adapted (ported) to different machines and can

meet special requirements. For example, Linux is used in embedded computers,

such as the ones found in cellphones, PDAs, and the cable boxes on top of many

TVs. The file structure takes full advantage of large, fast hard disks. Equally impor-

tant, Linux was originally designed as a multiuser operating system—it was not

modified to serve several users as an afterthought. Sharing the computer’s power

among many users and giving them the ability to share data and programs are cen-

tral features of the system.

Because it is adaptable and takes advantage of available hardware, Linux runs on

many different microprocessor-based systems as well as mainframes. The popularity

of the microprocessor-based hardware drives Linux; these microcomputers are get-

ting faster all the time, at about the same price point. Linux on a fast microcom-

puter has become good enough to displace workstations on many desktops. This

widespread acceptance benefits both users, who do not like having to learn a new

operating system for each vendor’s hardware, and system administrators, who like

having a consistent software environment.

The advent of a standard operating system has given a boost to the development of

the software industry. Now software manufacturers can afford to make one version

of a product available on machines from different manufacturers.

The C Programming Language

Ken Thompson wrote the UNIX operating system in 1969 in PDP-7 assembly lan-

guage. Assembly language is machine dependent: Programs written in assembly lan-

guage work on only one machine or, at best, on one family of machines. For this

reason, the original UNIX operating system could not easily be transported to run

on other machines (it was not portable).

To make UNIX portable, Thompson developed the B programming language, a

machine-independent language, from the BCPL language. Dennis Ritchie developed

the C programming language by modifying B and, with Thompson, rewrote UNIX

in C in 1973. Originally, C was touted as a “portable assembler.” The revised oper-

ating system could be transported more easily to run on other machines.

That development marked the start of C. Its roots reveal some of the reasons why it

is such a powerful tool. C can be used to write machine-independent programs. A

programmer who designs a program to be portable can easily move it to any com-

puter that has a C compiler. C is also designed to compile into very efficient code.

With the advent of C, a programmer no longer had to resort to assembly language

to get code that would run well (that is, quickly), although carefully hand-tuned assembly language can still outperform the code a compiler generates from a high-level language.

C is a good systems language. You can write a compiler or an operating system in

C. It is a highly structured language but is not necessarily a high-level one. C allows a

programmer to manipulate bits and bytes, as is necessary when writing an operating

system. At the same time, it has high-level constructs that allow for efficient, modu-

lar programming.

In the late 1980s the American National Standards Institute (ANSI) defined a stan-

dard version of the C language, commonly referred to as ANSI C or C89 (for the

year the standard was published). Ten years later the C99 standard was published;

it is mostly supported by the GNU Project’s C compiler (named gcc). The original

version of the language is often referred to as Kernighan & Ritchie (or K&R) C,

named for the authors of the book that first described the C language.

Another researcher at Bell Labs, Bjarne Stroustrup, created an object-oriented pro-

gramming language named C++, which is built on the foundation of C. Because

object-oriented programming is desired by many employers today, C++ is preferred

over C in many environments. Another language of choice is Objective-C, which

was used to write the first Web browser. The GNU Project’s C compiler supports C,

C++, and Objective-C.

Overview of Linux

The Linux operating system has many unique and powerful features. Like other

operating systems, it is a control program for computers. But like UNIX, it is also a

well-thought-out family of utility programs (Figure 1-1) and a set of tools that

allow users to connect and use these utilities to build systems and applications.

Linux Has a Kernel Programming Interface

The Linux kernel—the heart of the Linux operating system—is responsible for allo-

cating the computer’s resources and scheduling user jobs so each one gets its fair

share of system resources, including access to the CPU; peripheral devices, such as

hard disk, DVD, and CD-ROM storage; printers; and tape drives. Programs inter-

act with the kernel through system calls, special functions with well-known names.

A programmer can use a single system call to interact with many kinds of devices.

For example, there is one write() system call, rather than many device-specific ones.

When a program issues a write() request, the kernel interprets the context and

passes the request to the appropriate device. This flexibility allows old utilities to

work with devices that did not exist when the utilities were written. It also makes it possible to move programs to new versions of the operating system without rewriting them (provided the new version recognizes the same system calls).

Figure 1-1 A layered view of the Linux operating system (figure labels: Compilers; Database Management Systems; Word Processors; Mail and Message Facilities; Shells; Linux Kernel; Hardware)
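As a rough illustration of the single write() interface described above, the following shell sketch sends the same output to a disk file, to the null device, and into a pipe. The scratch file name and the message are assumptions made up for this example; the only point is that the command producing the output does not change with the destination.

```shell
# One command, three kinds of destination. printf issues the same kind of
# write request each time; the kernel routes it to the right place.
msg="hello, kernel"
tmpfile=$(mktemp "${TMPDIR:-/tmp}/write.XXXXXX")   # scratch file (example name)

printf '%s\n' "$msg" > "$tmpfile"      # destination: a regular disk file
printf '%s\n' "$msg" > /dev/null       # destination: the null device
piped=$(printf '%s\n' "$msg" | cat)    # destination: a pipe to another program

from_file=$(cat "$tmpfile")
rm -f "$tmpfile"
echo "file: $from_file"
echo "pipe: $piped"
```

The program never names a device driver; it only writes, and the shell arranges where the output is connected.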

Linux Can Support Many Users

Depending on the hardware and the types of tasks the computer performs, a Linux

system can support from 1 to more than 1,000 users, each concurrently running a

different set of programs. The per-user cost of a computer that can be used by many

people at the same time is less than that of a computer that can be used by only a

single person at a time. It is less because one person cannot generally take advantage

of all the resources a computer has to offer. That is, no one can keep all the printers

going constantly, keep all the system memory in use, keep all the disks busy reading

and writing, keep the Internet connection in use, and keep all the terminals busy at

the same time. By contrast, a multiuser operating system allows many people to use

all of the system resources almost simultaneously. The use of costly resources can be

maximized and the cost per user can be minimized—the primary objectives of a

multiuser operating system.

Linux Can Run Many Tasks

Linux is a fully protected multitasking operating system, allowing each user to run

more than one job at a time. Processes can communicate with one another but

remain fully protected from one another, just as the kernel remains protected from

all processes. You can run several jobs in the background while giving all your

attention to the job being displayed on the screen, and you can switch back and

forth between jobs. If you are running the X Window System (page 16), you can

run different programs in different windows on the same screen and watch all of

them. This capability helps users be more productive.
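Running several jobs at once can be sketched from the shell as follows. This is a minimal example, assuming a scratch directory and two made-up jobs that merely sleep and write a log line; real jobs would be compilations, downloads, and the like.

```shell
workdir=$(mktemp -d "${TMPDIR:-/tmp}/tasks.XXXXXX")   # scratch area (example name)

# Start two jobs in the background; the trailing & returns control at once.
( sleep 1; echo "job A done" > "$workdir/a.log" ) &
pid_a=$!
( sleep 1; echo "job B done" > "$workdir/b.log" ) &
pid_b=$!

# The foreground stays free for other work while both jobs run.
echo "foreground work continues while jobs $pid_a and $pid_b run"

# wait blocks until the named background jobs have finished.
wait "$pid_a" "$pid_b"
result_a=$(cat "$workdir/a.log")
result_b=$(cat "$workdir/b.log")
echo "$result_a; $result_b"
rm -rf "$workdir"
```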

Linux Provides a Secure Hierarchical Filesystem

A file is a collection of information, such as text for a memo or report, an accumulation of sales figures, an image, a song, or an executable program. Each file is

stored under a unique identifier on a storage device, such as a hard disk. The Linux

filesystem provides a structure whereby files are arranged under directories, which

are like folders or boxes. Each directory has a name and can hold other files and

directories. Directories, in turn, are arranged under other directories, and so forth,

in a treelike organization. This structure helps users keep track of large numbers of

files by grouping related files in directories. Each user has one primary directory

and as many subdirectories as required (Figure 1-2).
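The treelike organization can be tried out in a scratch area. The sketch below is only an illustration: the directory and file names are assumptions chosen for the example, and the tree is built under a temporary directory rather than at the real root of the filesystem.

```shell
# Build a small directory tree in a scratch area (names are illustrative).
top=$(mktemp -d "${TMPDIR:-/tmp}/tree.XXXXXX")
mkdir -p "$top/home/max/notes" "$top/home/sam/bin" "$top/tmp" "$top/etc"

# Group related files in one directory.
touch "$top/home/max/notes/report" "$top/home/max/notes/log"

# List every directory in the tree, relative to its top.
dirs=$(cd "$top" && find . -type d | sort)
echo "$dirs"

# List the files grouped under one user's notes directory.
notes=$(ls "$top/home/max/notes")
echo "$notes"

rm -rf "$top"
```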

Standards

With the idea of making life easier for system administrators and software developers,

a group got together over the Internet and developed the Linux Filesystem Standard

(FSSTND), which has since evolved into the Linux Filesystem Hierarchy Standard

(FHS). Before this standard was adopted, key programs were located in different

places in different Linux distributions. Today you can sit down at a Linux system and

expect to find any given standard program at a consistent location (page 91).

Links

A link allows a given file to be accessed by means of two or more names. The alternative names can be located in the same directory as the original file or in another

directory. Links can make the same file appear in several users’ directories, enabling

those users to share the file easily. Windows uses the term shortcut in place of link

to describe this capability. Macintosh users will be more familiar with the term

alias. Under Linux, an alias is different from a link; it is a command macro feature

provided by the shell (page 324).
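A short sketch of links, using the standard ln utility: the file names are assumptions made up for this example. It creates a second name for a file with a hard link (ln) and a third with a symbolic link (ln -s), then shows that a change made through one name is visible through the others.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/links.XXXXXX")   # scratch area (example name)
echo "first line" > "$dir/memo"

ln "$dir/memo" "$dir/memo-shared"         # hard link: a second name for the same file
echo "second line" >> "$dir/memo-shared"  # append through the new name

via_original=$(cat "$dir/memo")           # the change is visible under the old name

ln -s "$dir/memo" "$dir/memo-sym"         # symbolic link: a name that points to a pathname
via_symlink=$(cat "$dir/memo-sym")

echo "$via_original"
rm -rf "$dir"
```

Both kinds of link give access to the same data; which one fits depends on whether the names must live on the same filesystem, a detail outside this overview.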

Security

Like most multiuser operating systems, Linux allows users to protect their data

from access by other users. It also allows users to share selected data and programs

with certain other users by means of a simple but effective protection scheme. This

level of security is provided by file access permissions, which limit the users who can

read from, write to, or execute a file. More recently, Linux has implemented Access

Control Lists (ACLs), which give users and administrators finer-grained control

over file access permissions.
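File access permissions can be inspected and changed from the shell. The following is a minimal sketch, assuming a made-up file named diary in a scratch directory; it uses chmod to set the permissions and reads them back from the first ten characters of an ls -l listing.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/perm.XXXXXX")   # scratch area (example name)
echo "private notes" > "$dir/diary"

chmod 600 "$dir/diary"     # owner may read and write; group and others get no access
private_mode=$(ls -l "$dir/diary" | cut -c1-10)

chmod 644 "$dir/diary"     # owner may read and write; group and others may only read
shared_mode=$(ls -l "$dir/diary" | cut -c1-10)

echo "private: $private_mode"
echo "shared:  $shared_mode"
rm -rf "$dir"
```

In each listing the first character is the file type, followed by three read/write/execute triples for the owner, the group, and everyone else.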

The Shell: Command Interpreter and

Programming Language

In a textual environment, the shell—the command interpreter—acts as an interface

between you and the operating system. When you enter a command on the screen,

the shell interprets the command and calls the program you want. A number of

shells are available for Linux. The four most popular shells are

• The Bourne Again Shell (bash), an enhanced version of the original Bourne Shell (the original UNIX shell).

• The Debian Almquist Shell (dash), a smaller, faster shell with fewer features than bash. Most startup shell scripts call dash in place of bash to speed the boot process.

• The TC Shell (tcsh), an enhanced version of the C Shell, developed as part of BSD UNIX.

• The Z Shell (zsh), which incorporates features from a number of shells, including the Korn Shell.

Figure 1-2 The Linux filesystem structure (tree diagram; node labels in the extracted text: home, tmp, etc; max, sam, hls; bin, notes; report, log)

Because different users may prefer different shells, multiuser systems can have sev-

eral different shells in use at any given time. The choice of shells demonstrates one

of the advantages of the Linux operating system: the ability to provide a customized

interface for each user.

Shell scripts

Besides performing its function of interpreting commands from a keyboard and

sending those commands to the operating system, the shell is a high-level program-

ming language. Shell commands can be arranged in a file for later execution (Linux

calls these files shell scripts; Windows calls them batch files). This flexibility allows

users to perform complex operations with relative ease, often by issuing short com-

mands, or to build with surprisingly little effort elaborate programs that perform

highly complex operations.
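A shell script can be as small as a few lines. The sketch below is illustrative only: the script name greet and its contents are assumptions for this example. It stores commands in a file and then runs the file by handing it to sh.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/script.XXXXXX")   # scratch area (example name)
script="$dir/greet"                                # hypothetical script name

# Store shell commands in a file for later execution.
cat > "$script" <<'EOF'
#!/bin/sh
# Greet each name given on the command line.
for name in "$@"; do
    echo "Hello, $name"
done
EOF

chmod +x "$script"                    # mark the file as executable
greeting=$(sh "$script" Max Sam)      # run the stored commands with two arguments
echo "$greeting"
rm -rf "$dir"
```

Once the file is executable and in a directory that permits execution, it can also be run by giving its pathname as a command, just like any other utility.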

Filename Generation

Wildcards and ambiguous file references

When you type commands to be processed by the shell, you can construct patterns

using characters that have special meanings to the shell. These characters are

called wildcard characters. The patterns, which are called ambiguous file refer-

ences, are a kind of shorthand: Rather than typing in complete filenames, you can

type patterns; the shell expands these patterns into matching filenames. An ambig-

uous file reference can save you the effort of typing in a long filename or a long

series of similar filenames. For example, the shell might expand the pattern mak*

to make-3.80.tar.gz. Patterns can also be useful when you know only part of a file-

name or cannot remember the exact spelling of a filename.
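The expansion can be observed with echo, which simply prints the arguments the shell hands to it. This sketch assumes a scratch directory holding three made-up files, one of them named as in the example above.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/glob.XXXXXX")   # scratch area (example name)
touch "$dir/make-3.80.tar.gz" "$dir/memo.txt" "$dir/notes.txt"

# The shell replaces each pattern with the matching filenames before echo runs.
matches_mak=$(cd "$dir" && echo mak*)       # * matches any string of characters
matches_txt=$(cd "$dir" && echo *.txt)      # every name ending in .txt
matches_one=$(cd "$dir" && echo mem?.txt)   # ? matches exactly one character

echo "$matches_mak"
echo "$matches_txt"
echo "$matches_one"
rm -rf "$dir"
```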

Completion

In conjunction with the Readline library, the shell performs command, filename,

pathname, and variable completion: You type a prefix and press TAB, and the shell

lists the items that begin with that prefix or completes the item if the prefix specifies

a unique item.

Device-Independent Input and Output

Redirection

Devices (such as a printer or a terminal) and disk files appear as files to Linux programs. When you give a command to the Linux operating system, you can instruct

it to send the output to any one of several devices or files. This diversion is called

output redirection.

Device independence

In a similar manner, a program’s input, which normally comes from a keyboard, can

be redirected so that it comes from a disk file instead. Input and output are device

independent; that is, they can be redirected to or from any appropriate device.

As an example, the cat utility normally displays the contents of a file on the screen.

When you run a cat command, you can easily cause its output to go to a disk file

instead of the screen.
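Redirection of cat's output, and of another utility's input, might look like the following sketch. The file names fruit and copy are assumptions for this example.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/redir.XXXXXX")   # scratch area (example name)
printf 'banana\napple\n' > "$dir/fruit"           # create a small input file

cat "$dir/fruit" > "$dir/copy"     # output redirection: cat writes to a disk file
cat "$dir/fruit" > /dev/null       # the same output sent to a device instead
sorted=$(sort < "$dir/copy")       # input redirection: sort reads the file as its input

copy=$(cat "$dir/copy")
echo "$sorted"
rm -rf "$dir"
```

Neither cat nor sort is told what kind of file or device it is connected to; the shell sets up the connection before the utility starts.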

Shell Functions

One of the most important features of the shell is that users can use it as a program-

ming language. Because the shell is an interpreter, it does not compile programs

written for it but rather interprets programs each time they are loaded from the

disk. Loading and interpreting programs can be time-consuming.

Many shells, including the Bourne Again Shell, support shell functions that the shell

holds in memory so it does not have to read them from the disk each time you exe-

cute them. The shell also keeps functions in an internal format so it does not have to

spend as much time interpreting them.
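A shell function is defined once and can then be called like a command. The sketch below is an assumption-laden example: the function name count_lines and the sample file are invented here to show the mechanism.

```shell
# Define a function once; the shell holds it in memory afterward.
count_lines() {
    # Print the number of lines in the file named by the first argument.
    awk 'END { print NR }' "$1"
}

sample=$(mktemp "${TMPDIR:-/tmp}/func.XXXXXX")   # scratch file (example name)
printf 'one\ntwo\nthree\n' > "$sample"

# Each call runs the in-memory definition; no script file is reread.
first_call=$(count_lines "$sample")
printf 'four\n' >> "$sample"
second_call=$(count_lines "$sample")

echo "$first_call then $second_call"
rm -f "$sample"
```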

Job Control

Job control is a shell feature that allows users to work on several jobs at once,

switching back and forth between them as desired. When you start a job, it is fre-

quently run in the foreground so it is connected to the terminal. Using job control,

you can move the job you are working with to the background and continue run-

ning it there while working on or observing another job in the foreground. If a

background job then needs your attention, you can move it to the foreground so it

is once again attached to the terminal. (The concept of job control originated with

BSD UNIX, where it appeared in the C Shell.)

A Large Collection of Useful Utilities

Linux includes a family of several hundred utility programs, often referred to as

commands. These utilities perform functions that are universally required by users.

The sort utility, for example, puts lists (or groups of lists) in alphabetical or numeri-

cal order and can be used to sort lists by part number, last name, city, ZIP code, tele-

phone number, age, size, cost, and so forth. The sort utility is an important

programming tool that is part of the standard Linux system. Other utilities allow

users to create, display, print, copy, search, and delete files as well as to edit, format,

and typeset text. The man (for manual) and info utilities provide online documenta-

tion for Linux.
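As a small illustration of sort, the sketch below orders a made-up inventory list first alphabetically and then numerically by its second field. The data and the name:quantity layout are assumptions for this example; the -t and -k options select the field separator and the sort key.

```shell
list=$(mktemp "${TMPDIR:-/tmp}/parts.XXXXXX")   # scratch file (example name)

# Each line is name:quantity (made-up inventory data).
printf 'widget:4\nbolt:30\ngear:12\n' > "$list"

by_name=$(sort "$list")                  # alphabetical order on the whole line
by_quantity=$(sort -t: -k2,2n "$list")   # numerical order on the second field

echo "$by_name"
echo "$by_quantity"
rm -f "$list"
```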

Interprocess Communication

Pipes and filters

Linux enables users to establish both pipes and filters on the command line. A pipe sends the output of one program to another program as input. A filter is a program that processes a stream of input data to yield a stream of output data.

A filter processes another program’s output, altering it as a result. The filter’s output

then becomes input to another program.

Pipes and filters frequently join utilities to perform a specific task. For example, you

can use a pipe to send the output of the sort utility to head (a filter that lists the first

ten lines of its input); you can then use another pipe to send the output of head to a

third utility, lpr, that sends the data to a printer. Thus, in one command line, you can

use three utilities together to sort and print part of a file.
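A pipeline along the lines just described can be sketched as follows. Because a printer may not be configured on the system at hand, this example replaces lpr with tr, another filter, as the final stage; the input words are made up.

```shell
# sort orders the lines, head keeps the first three, and tr (a filter)
# rewrites them in uppercase before they reach the final destination.
result=$(printf '%s\n' pear apple fig cherry banana |
         sort | head -n 3 | tr '[:lower:]' '[:upper:]')
echo "$result"
```

Each vertical bar connects the output of the utility on its left to the input of the utility on its right, so no intermediate files are needed.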

System Administration

On a Linux system the system administrator is frequently the owner and only user

of the system. This person has many responsibilities. The first responsibility may be

to set up the system, install the software, and possibly edit configuration files. Once

the system is up and running, the system administrator is responsible for download-

ing and installing software (including upgrading the operating system), backing up

and restoring files, and managing such system facilities as printers, terminals, serv-

ers, and a local network. The system administrator is also responsible for setting up

accounts for new users on a multiuser system, bringing the system up and down as

needed, monitoring the system, and taking care of any problems that arise.

Additional Features of Linux

The developers of Linux included features from BSD, System V, and Sun Microsys-

tems’ Solaris, as well as new features, in their operating system. Although most of

the tools found on UNIX exist for Linux, in some cases these tools have been

replaced by more modern counterparts. This section describes some of the popular

tools and features available under Linux.

GUIs: Graphical User Interfaces

X11

The X Window System (also called X or X11) was developed in part by researchers

at the Massachusetts Institute of Technology (MIT) and provides the foundation for

the GUIs available with Linux. Given a terminal or workstation screen that supports

X, a user can interact with the computer through multiple windows on the screen,

display graphical information, or use special-purpose applications to draw pictures,

monitor processes, or preview formatted output. X is an across-the-network proto-

col that allows a user to open a window on a workstation or computer system that is

remote from the CPU generating the window.

Aqua

Mac OS X comes with two graphical interfaces that can be used simultaneously.

Most Macintosh users are familiar with Aqua, the standard Mac OS X graphical

interface. Aqua is based on a rendering technology named Quartz and has a stan-

dard look and feel for applications. Mac OS X also supports X11, which also uses

Quartz.

Desktop manager Usually two layers run on top of X: a desktop manager and a window manager. A

desktop manager is a picture-oriented user interface that enables you to interact

with system programs by manipulating icons instead of typing the corresponding


commands to a shell. Most Linux distributions run the GNOME desktop manager

(www.gnome.org) by default, but they can also run KDE (www.kde.org) and a

number of other desktop managers. Mac OS X handles the desktop in Aqua, not in

X11, so there is no desktop manager under X11.

Window manager A window manager is a program that runs under the desktop manager and allows

you to open and close windows, run programs, and set up a mouse so it has differ-

ent effects depending on how and where you click. The window manager also gives

the screen its personality. Whereas Microsoft Windows allows you to change the

color of key elements in a window, a window manager under X allows you to cus-

tomize the overall look and feel of the screen: You can change the way a window

looks and works (by giving it different borders, buttons, and scrollbars), set up vir-

tual desktops, create menus, and more.

Several popular window managers run under X and Linux. Most Linux distribu-

tions provide both Metacity (the default under GNOME) and kwin (the default

under KDE). Other window managers, such as Sawfish and WindowMaker, are also

available.

Under Mac OS X, most windows are managed by a Quartz layer, which applies the

Apple Aqua look and feel. For X11 applications only, this task is performed by

quartz-wm, which mimics the Apple Aqua look and feel so X11 applications on the

Mac desktop have the same appearance as native Mac OS X applications.

(Inter)Networking Utilities

Linux network support includes many utilities that enable you to access remote sys-

tems over a variety of networks. In addition to sending email to users on other systems,

you can access files on disks mounted on other computers as if they were located on

the local system, make your files available to other systems in a similar manner, copy

files back and forth, run programs on remote systems while displaying the results on

the local system, and perform many other operations across local area networks

(LANs) and wide area networks (WANs), including the Internet.

Layered on top of this network access is a wide range of application programs that

extend the computer’s resources around the globe. You can carry on conversations

with people throughout the world, gather information on a wide variety of subjects,

and download new software over the Internet quickly and reliably.

Software Development

One of Linux’s most impressive strengths is its rich software development environ-

ment. Linux supports compilers and interpreters for many computer languages.

Besides C and C++, languages available for Linux include Ada, Fortran, Java, Lisp,

Pascal, Perl, and Python. The bison utility generates parsing code that makes it eas-

ier to write programs to build compilers (tools that parse files containing structured

information). The flex utility generates scanners (code that recognizes lexical pat-

terns in text). The make utility and the GNU Configure and Build System make it


easier to manage complex development projects. Source code management systems,

such as CVS, simplify version control. Several debuggers, including ups and gdb, can

help you track down and repair software defects. The GNU C compiler (gcc) works

with the gprof profiling utility to help programmers identify potential bottlenecks in

a program’s performance. The C compiler includes options to perform extensive

checking of C code, thereby making the code more portable and reducing debug-

ging time. Table B-4 on page 904 lists some sites you can download software from.

Under OS X, Apple’s Xcode development environment provides a unified graphical

front end to most of these tools as well as other options and features.

Chapter Summary

The Linux operating system grew out of the UNIX heritage to become a popular

alternative to traditional systems (that is, Windows) available for microcomputer

(PC) hardware. UNIX users will find a familiar environment in Linux. Distributions

of Linux contain the expected complement of UNIX utilities, contributed by pro-

grammers around the world, including the set of tools developed as part of the

GNU Project. The Linux community is committed to the continued development of

this system. Support for new microcomputer devices and features is added soon

after the hardware becomes available, and the tools available on Linux continue to

be refined. Given the many commercial software packages available to run on

Linux platforms and the many hardware manufacturers offering Linux on their sys-

tems, it is clear that the system has evolved well beyond its origin as an undergradu-

ate project to become an operating system of choice for academic, commercial,

professional, and personal use.

Exercises

1. What is free software? List three characteristics of free software.

2. Why is Linux popular? Why is it popular in academia?

3. What are multiuser systems? Why are they successful?

4. What is the Free Software Foundation/GNU? What is Linux? Which parts

of the Linux operating system did each provide? Who else has helped build

and refine this operating system?

5. In which language is Linux written? What does the language have to do

with the success of Linux?

6. What is a utility program?

7. What is a shell? How does it work with the kernel? With the user?


8. How can you use utility programs and a shell to create your own

applications?

9. Why is the Linux filesystem referred to as hierarchical?

10. What is the difference between a multiprocessor and a multiprocessing

system?

11. Give an example of when you would want to use a multiprocessing

system.

12. Approximately how many people wrote Linux? Why is this project

unique?

13. What are the key terms of the GNU General Public License?

PART I

The Linux and Mac OS X

Operating Systems

CHAPTER 2 Getting Started
CHAPTER 3 The Utilities
CHAPTER 4 The Filesystem
CHAPTER 5 The Shell

One way or another you are sitting in front of a screen that is

connected to a computer that is running Linux. You may be

working with a graphical user interface (GUI) or a textual

interface. This book is about the textual interface, also called

the command-line interface (CLI). If you are working with a

GUI, you will need to use a terminal emulator such as xterm,

Konsole, GNOME Terminal, Terminal (under Mac OS X), or a

virtual console (page 40) to follow the examples in this book.

This chapter starts with a discussion of the typographical con-

ventions this book uses, followed by a section about logging in

on the system. The next section introduces the shell and explains

how to fix mistakes on the command line. Next come a brief

reminder about the powers of working with root privileges and

suggestions about how to avoid making mistakes that will make

your system inoperable or hard to work with. The chapter con-

tinues with a discussion about where to find more information

about Linux. It concludes with additional information on log-

ging in, including how to change your password.

In This Chapter

Conventions Used in This Book
Logging In from a Terminal or Terminal Emulator
su/sudo: Curbing Your Power (root Privileges)
Where to Find Documentation
The --help Option
man: Displays the System Manual
info: Displays Information About Utilities
HOWTOs: Finding Out How Things Work
What to Do If You Cannot Log In
Changing Your Password

Getting Started


Be sure to read the warning about the dangers of misusing the powers of working with

root privileges on page 31. While heeding that warning, feel free to experiment with the

system: Give commands, create files, follow the examples in this book, and have fun.

Conventions Used in This Book

This book uses conventions to make its explanations shorter and clearer. The fol-

lowing paragraphs describe these conventions.

Mac OS X The term Mac OS X refers to both Mac OS X and Mac OS X Server. This book

points out important differences between the two.

Mac OS X versions References to Mac OS X refer to version 10.5 (Leopard). Because the book focuses

on the underlying operating system, which changes little from one release of OS X to

the next, the text will remain relevant through several future releases. The author’s

Web site (www.sobell.com) provides corrections and updates as appropriate.

Text and examples The text is set in this type, whereas examples are shown in a monospaced font (also

called a fixed-width font):

$ cat practice

This is a small file I created

with a text editor.

Items you enter Everything you enter at the keyboard is shown in a bold typeface. Within the text,

this bold typeface is used; within examples and screens, this one is used. In the pre-

vious example, the dollar sign ($) on the first line is a prompt that Linux displays, so

it is not bold; the remainder of the first line is entered by a user, so it is bold.

Utility names Names of utilities are printed in this sans serif typeface. This book references the

emacs text editor and the ls utility or ls command (or just ls) but instructs you to

enter ls -a on the command line. In this way the text distinguishes between utilities,

which are programs, and the instructions you give on the command line to invoke

the utilities.

Filenames Filenames appear in a bold typeface. Examples are memo5, letter.1283, and reports.

Filenames may include uppercase and lowercase letters; however, the popular Linux

ext2, ext3, and ext4 filesystems are case sensitive (page 945), so memo5, MEMO5,

and Memo5 refer to three different files.

The default Mac OS X filesystem, HFS+, is not case sensitive; under OS X, memo5,

MEMO5, and Memo5 refer to the same file. For more information refer to “Case

Sensitivity” on page 927.
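The difference in case sensitivity can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a writable temporary directory; the scratch directory name is hypothetical. On a case-sensitive filesystem the count should be 3, whereas on a case-insensitive one the three names should collapse into a single file.

```shell
# Create three names that differ only in case inside a scratch directory,
# then count how many distinct files result.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/casetest.XXXXXX")   # hypothetical scratch directory
touch "$scratch/memo5" "$scratch/MEMO5" "$scratch/Memo5"

# ls prints one name per line when its output goes to a pipe.
count=$(ls "$scratch" | wc -l | tr -d ' ')
echo "distinct files: $count"

# Clean up the scratch directory.
rm -r "$scratch"
```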

Character strings Within the text, characters and character strings are marked by putting them in a

bold typeface. This convention avoids the need for quotation marks or other delim-

iters before and after a string. An example is the following string, which is displayed

by the passwd utility: Sorry, passwords do not match.


Keys and characters This book uses SMALL CAPS for three kinds of items:

• Keyboard keys, such as the SPACE bar and the RETURN (see footnote 1), ESCAPE, and TAB keys.

• The characters that keys generate, such as the SPACEs generated by the SPACE

bar.

• Keyboard keys that you press with the CONTROL key, such as CONTROL-D. (Even

though D is shown as an uppercase letter, you do not have to press the SHIFT

key; enter CONTROL-D by holding the CONTROL key down and pressing d.)

Prompts and

RETURNs

Most examples include the shell prompt—the signal that Linux is waiting for a

command—as a dollar sign ($), a hash symbol or pound sign (#), or sometimes a per-

cent sign (%). The prompt does not appear in a bold typeface in this book because you

do not enter it. Do not type the prompt on the keyboard when you are experimenting

with examples from this book. If you do, the examples will not work.

Examples omit the RETURN keystroke that you must use to execute them. An example

of a command line is

$ vim memo.1204

To use this example as a model for running the vim text editor, give the command

vim memo.1204 and press the RETURN key. (Press ESCAPE ZZ to exit from vim; see

page 151 for a vim tutorial.) This method of entering commands makes the exam-

ples in the book correspond to what appears on the screen.

Definitions All glossary entries marked with FOLDOC are courtesy of Denis Howe, editor of the Free

On-Line Dictionary of Computing (foldoc.org), and are used with permission. This

site is an ongoing work containing definitions, anecdotes, and trivia.

optional Optional Information

Passages marked as optional appear in a gray box. This material is not central to

the ideas presented in the chapter but often involves more challenging concepts. A

good strategy when reading a chapter is to skip the optional sections and then

return to them when you are comfortable with the main ideas presented in the chap-

ter. This is an optional paragraph.

URLs (Web

addresses)

Web addresses, or URLs, have an implicit http:// prefix, unless ftp:// or https:// is

shown. You do not normally need to specify a prefix in a browser when the prefix is

http://, but you must use a prefix when you specify an FTP or secure HTTP site.

Thus you can specify a URL in a browser exactly as shown in this book.

1. Different keyboards use different keys to move the cursor (page 949) to the beginning of the next line.

This book always refers to the key that ends a line as the RETURN key. Your keyboard may have a RET, NEWLINE,

ENTER, RETURN, or other key. Use the corresponding key on your keyboard each time this book asks you to

press RETURN.


Tip, caution, and security boxes

The following boxes highlight information that may be helpful while you are using
or administrating a Linux system.

This is a tip box
tip A tip box may help you avoid repeating a common mistake or may point toward additional information.

This box warns you about something
caution A caution box warns you about a potential pitfall.

This box marks a security note
security A security box highlights a potential security issue. These notes are usually for system administrators,
but some apply to all users.

Logging In from a Terminal or Terminal Emulator

Above the login prompt on a terminal, terminal emulator, or other textual device,
many systems display a message called issue (stored in the /etc/issue file). This message
usually identifies the version of Linux running on the system, the name of the
system, and the device you are logging in on. A sample issue message follows:

Ubuntu 9.10 tiny tty1

The issue message is followed by a prompt to log in. Enter your username and
password in response to the system prompts. The following example shows Max
logging in on the system named tiny:

tiny login: max
Password:
Last login: Wed Mar 10 19:50:38 from dog
[max@tiny max]$

If you are using a terminal (page 983) and the screen does not display the login:
prompt, check whether the terminal is plugged in and turned on, and then press the
RETURN key a few times. If login: still does not appear, try pressing CONTROL-Q (Xon).

If you are using an Apple, PC, another Linux system, or a workstation (page 988),
open the program that runs ssh (secure; page 828), telnet (not secure; page 852), or
whichever communications/emulation software you use to log in on the system, and
give it the name or IP address (page 960) of the system you want to log in on. Log
in, making sure you enter your username and password as they were specified when
your account was set up; the routine that verifies the username and password is case
sensitive. Like most systems, Linux does not display your password when you enter
it. By default Mac OS X does not allow remote logins (page 936).

Did you log in last?
security As you are logging in to a textual environment, after you enter your username and password, the
system displays information about the last login on this account, showing when it took place and
where it originated. You can use this information to determine whether anyone has accessed the
account since you last used it. If someone has, perhaps an unauthorized user has learned your
password and logged in as you. In the interest of maintaining security, advise the system
administrator of any circumstances that make you suspicious and change your password.

Following is an example of logging in using ssh from a Linux system:

$ ssh max@tiny

max@tiny's password:

Permission denied, please try again.

max@tiny's password:

Last login: Wed Mar 10 21:21:49 2005 from dog

[max@tiny max]$

In the example Max mistyped his password, received an error message and another

prompt, and retyped the password correctly. If your username is the same on the

system you are logging in from and the system you are logging in on, you can omit

your username and the following at sign (@). In the example, Max would have

given the command ssh tiny.

After you log in, the shell prompt (or just prompt) appears, indicating you have suc-

cessfully logged in; it shows the system is ready for you to give a command. The

first shell prompt may be preceded by a short message called the message of the day,

or motd, which is stored in the /etc/motd file.
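Once you are logged in, you can look at the messages mentioned above yourself. The following is a small sketch, assuming a POSIX shell; show_login_messages is a hypothetical helper name, and either file may be absent on a given system, which the sketch reports rather than treating as an error.

```shell
# Print the pre-login (issue) and post-login (motd) messages, if present.
# show_login_messages is a hypothetical helper name, not a standard utility.
show_login_messages() {
    for f in /etc/issue /etc/motd; do
        if [ -r "$f" ]; then
            echo "== $f =="
            cat "$f"
        else
            echo "== $f: not present or not readable =="
        fi
    done
}

show_login_messages
```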

The usual prompt is a dollar sign ($). Do not be concerned if you have a different

prompt; the examples in this book will work regardless of what the prompt looks

like. In the previous example, the $ prompt (last line) is preceded by the username

(max), an at sign (@), the system name (tiny), and the name of the directory Max is

working in (max). For information on how to change the prompt, refer to page 299

(bash) or page 373 (tcsh).

telnet is not secure

security One of the reasons telnet is not secure is that it sends your username and password over the

network in cleartext (page 947) when you log in, allowing someone to capture your login infor-

mation and log in on your account. The ssh utility encrypts all information it sends over the

network and, if available, is a better choice than telnet. The ssh program has been imple-

mented on many operating systems, not just Linux. Many user interfaces to ssh include a ter-

minal emulator.

Make sure TERM is set correctly

tip The TERM shell variable establishes the pseudographical characteristics of a character-based ter-

minal or terminal emulator. Typically TERM is set for you—you do not have to set it manually. If

things on the screen do not look right, refer to “Specifying a Terminal” on page 906.
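To see what TERM is currently set to, you can display its value. This is a minimal sketch, assuming a POSIX shell; the variable name term_type is introduced only for illustration, and the fallback text covers the case in which TERM is not set at all (for example, when no terminal is attached).

```shell
# Show the terminal type the shell was given, or "unset" if TERM has no value.
term_type=${TERM:-unset}
echo "TERM is $term_type"
```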


Working with the Shell

Before the introduction of the graphical user interface (GUI), UNIX and then Linux

provided only a textual interface (also called a command-line interface or CLI).

Today, a textual interface is available when you log in from a terminal, a terminal

emulator, or a textual virtual console, or when you use ssh or telnet to log in on a

system.

When you log in and are working in a textual (nongraphical) environment, and

when you are using a terminal emulator window in a graphical environment, you

are using the shell as a command interpreter. The shell displays a prompt that indi-

cates it is ready for you to enter a command. In response to the prompt, you type a

command. The shell executes this command and then displays another prompt.

Advantages of

the textual interface

Although the concept may seem antiquated, a textual interface has a place in mod-

ern computing. In some cases an administrator may use a command-line tool either

because a graphical equivalent does not exist or because the graphical tool is not as

powerful or flexible as the textual one. When logging in using a dial-up line or net-

work connection, using a GUI may not be practical because of the slow connection

speed. Frequently, on a server system, a graphical interface may not even be

installed. The first reason for this omission is that a GUI consumes a lot of system

resources; on a server, those resources are better dedicated to the main task of the

server. Additionally, security mandates that a server system run as few tasks as pos-

sible because each additional task can make the system more vulnerable to attack.

Also, some users prefer a textual interface because it is cleaner and more efficient

than a GUI. Some simple tasks, such as changing file access permissions, can be

done more easily from the command line (see chmod on page 626).

Pseudographical

interface

Before the introduction of GUIs, resourceful programmers created textual inter-

faces that included graphical elements such as boxes, borders outlining rudimen-

tary windows, highlights, and, more recently, color. These textual interfaces,

called pseudographical interfaces, bridge the gap between textual and graphical

interfaces. An example of a modern utility that uses a pseudographical interface is

the dpkg-reconfigure utility, which reconfigures an installed software package.

This section explains how to identify the shell you are using and describes the key-

strokes you can use to correct mistakes on the command line. It covers how to abort

a running command and briefly discusses how to edit a command line. Several

chapters of this book are dedicated to shells: Chapter 5 introduces shells, Chapter 8

goes into more detail about the Bourne Again Shell with some coverage of the TC

Shell, Chapter 9 covers the TC Shell exclusively, and Chapter 10 discusses writing

programs (shell scripts) using the Bourne Again Shell.

Which Shell Are You Running?

This book discusses both the Bourne Again Shell (bash) and the TC Shell (tcsh). You

are probably running bash, but you may be running tcsh or another shell such as the


Z Shell (zsh). You can identify the shell you are running by using the ps utility

(page 796). Type ps in response to the shell prompt and press RETURN.

$ ps

  PID TTY          TIME CMD
 2402 pts/5    00:00:00 bash
 7174 pts/5    00:00:00 ps

This command shows that you are running two utilities or commands: bash and ps.

If you are running tcsh, ps will display tcsh instead of bash. If you are running a different shell, ps will display its name.
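If you want only the name of the shell, without the other processes, you can ask ps about the shell's own process. The following is a sketch, assuming a ps that accepts the POSIX -p and -o options; the variable name current_shell is introduced only for illustration. Some systems mark a login shell with a leading hyphen in the name.

```shell
# $$ expands to the process ID of the current shell.
# -p selects that one process; -o comm= prints only its command name,
# with the trailing = suppressing the column header.
current_shell=$(ps -p $$ -o comm=)
echo "current shell: $current_shell"
```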

Correcting Mistakes

This section explains how to correct typographical and other errors you may make

while you are logged in on a textual display. Because the shell and most other utili-

ties do not interpret the command line or other text until after you press RETURN, you

can readily correct typing mistakes before you press RETURN.

You can correct typing mistakes in several ways: erase one character at a time, back

up a word at a time, or back up to the beginning of the command line in one step.

After you press RETURN, it is too late to correct a mistake: You must either wait for the

command to run to completion or abort execution of the program (page 30).

Erasing a Character

While entering characters from the keyboard, you can back up and erase a mistake

by pressing the erase key once for each character you want to delete. The erase key

backs over as many characters as you wish. It does not, in general, back up past the

beginning of the line.

The default erase key is BACKSPACE. If this key does not work, try pressing DELETE or

CONTROL-H. If these keys do not work, give the following stty command (see footnote 2) to set the erase

and line kill (see “Deleting a Line” on the next page) keys to their default values:

$ stty ek

See page 838 for more information on stty.
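Before changing anything, it may help to see which keys are currently assigned. The following is a sketch, assuming a POSIX shell; show_tty_keys is a hypothetical helper name. Because stty reports on the terminal attached to standard input, the sketch first checks that standard input is a terminal.

```shell
# show_tty_keys is a hypothetical helper name used only for this sketch.
show_tty_keys() {
    if [ -t 0 ]; then
        # stty -a lists all settings; look for "erase = ..." and "kill = ...".
        stty -a
    else
        echo "standard input is not a terminal; nothing to display"
    fi
}

show_tty_keys
```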

Deleting a Word

You can delete a word you entered by pressing CONTROL-W. A word is any sequence of

characters that does not contain a SPACE or TAB. When you press CONTROL-W, the cursor

moves left to the beginning of the current word (as you are entering a word) or the

previous word (when you have just entered a SPACE or TAB), removing the word.

2. The command stty is an abbreviation for set teletypewriter, the first terminal that UNIX was run on.

Today stty is commonly thought of as meaning set terminal.


Deleting a Line

Any time before you press RETURN, you can delete the line you are entering by press-

ing the (line) kill key. When you press this key, the cursor moves to the left, erasing

characters as it goes, back to the beginning of the line. The default line kill key is

CONTROL-U. If this key does not work, try CONTROL-X. If these keys do not work, give the

stty command described under “Erasing a Character.”

Aborting Execution

Sometimes you may want to terminate a running program. For example, you may

want to stop a program that is performing a lengthy task such as displaying the con-

tents of a file that is several hundred pages long or copying a file that is not the one

you meant to copy.

To terminate a program from a textual display, press the interrupt key (CONTROL-C or

sometimes DELETE or DEL). If you would like to use another key as your interrupt key,

see “Special Keys and Characteristics” on page 838.

If this method does not terminate the program, try stopping the program with the

suspend key (typically CONTROL-Z), giving a jobs command to verify the number of the

job running the program, and using kill to abort the job. The job number is the num-

ber within the brackets at the left end of the line that jobs displays ([1] in the next

example). In the following example, the user gives kill the -TERM option to send

a termination signal to the job specified by the job number, which is preceded by a

percent sign (%1). You can omit -TERM from the command because kill sends a

termination signal by default.

$ bigjob

CONTROL-Z

[1]+ Stopped bigjob

$ jobs

[1]+ Stopped bigjob

$ kill -TERM %1

$ RETURN

[1]+ Killed bigjob

The kill command returns a prompt; press RETURN again to see the confirmation mes-

sage. If the terminal interrupt signal does not appear to work, wait ten seconds and

press RETURN. If there is no message indicating the job was killed, use the kill (-KILL)

signal. A running program cannot usually ignore a kill signal; it is almost sure to

abort the program. For more information refer to “Running a Command in the

Background” on page 134 and kill on page 729.
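The sequence described above (try a termination signal first, then fall back to a kill signal) can also be written as a short script. This is a sketch, assuming bash or another POSIX-style shell; sleep 300 is only a stand-in for a real long-running program, and the script uses the process ID in $! rather than a job number because job numbers are mainly an interactive convenience.

```shell
# Start a long-running stand-in for a real program.
sleep 300 &
pid=$!

# Ask politely first: a program can catch or ignore TERM.
kill -TERM "$pid"
sleep 1

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"     # KILL cannot be caught or ignored
fi

# Collect the exit status; a value above 128 means a signal ended the process.
wait "$pid"
status=$?
echo "exit status: $status"
```

Under bash, a process ended by a signal reports 128 plus the signal number as its exit status.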

CONTROL-Z suspends a program

tip Although it is not a way of correcting a mistake, you may press the suspend key (typically CONTROL-Z)

by mistake and wonder what happened. If the system displays a message containing the word

Stopped, you have just stopped your job using job control. Give the command fg to continue your

job in the foreground, and you should return to where you were before you pressed the suspend

key. For more information refer to “Moving a Job from the Foreground to the Background” on

page 135.


optional When you press the interrupt key, the Linux operating system sends a terminal

interrupt signal to the program you are running and to the shell. Exactly what effect

this signal has depends on the program.

A program may stop execution and exit immediately (most common), it may ignore

the signal, or it may receive the signal and take another action based on a custom

signal handler procedure.

The Bourne Again Shell has a custom signal handler procedure for the terminal

interrupt signal. The behavior of this signal handler depends on whether the shell is

waiting for the user to finish typing a command or waiting for a program to finish

executing. If the shell is waiting for the user, the signal handler discards the partial

line and displays a prompt. If the shell is waiting for a program, the signal handler

does nothing; the shell keeps waiting. When the program finishes executing, the

shell displays a prompt.
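A shell script can install its own handler for the terminal interrupt signal by using the trap builtin. The following is a minimal sketch, assuming bash or another POSIX-style shell; the variable name caught and the message text are illustrative, and kill -INT $$ stands in for pressing the interrupt key at a keyboard.

```shell
# Install a custom handler for the terminal interrupt signal (INT).
caught=no
trap 'caught=yes; echo "interrupt received"' INT

# Send INT to this shell itself; at a keyboard you would press CONTROL-C.
kill -INT $$

# The handler runs once the kill command completes.
echo "handler ran: $caught"

# Restore the default action for INT.
trap - INT
```

A shell that starts with INT already ignored cannot trap it, so the handler may not run in every environment.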

Repeating/Editing Command Lines

To repeat a previous command under bash or tcsh, press the UP ARROW key. Each time

you press this key, the shell displays an earlier command line. To reexecute the dis-

played command line, press RETURN. Press the DOWN ARROW key to browse through the

command lines in the other direction.

The RIGHT and LEFT ARROW keys move the cursor back and forth along the displayed

command line. At any point along the command line, you can add characters by

typing them. Use the erase key to remove characters from the command line. For

information about more complex command-line editing, see page 310 (bash) and

page 363 (tcsh).

su/sudo: Curbing Your Power (root Privileges)

UNIX and Linux systems have always had a privileged user named root. When you

are working as the root user (“working with root privileges”), you have extraordi-

nary systemwide powers. A user working with root privileges is sometimes referred

to as Superuser or administrator. When working with root privileges, you can read

from or write to any file on the system, execute programs that ordinary users cannot,

and more. On a multiuser system you may not be permitted to gain root privileges

and so may not be able to run certain programs. Nevertheless, someone—the system

administrator—can, and that person maintains the system.

Do not experiment while you are working with root privileges

caution Feel free to experiment when you are not working with root privileges. When you are working with

root privileges, do only what you have to do and make sure you know exactly what you are doing.

After you have completed the task at hand, revert to working as yourself. When working with root

privileges, you can damage the system to such an extent that you will need to reinstall Linux to

get it working again.


Under a conventional setup, you can gain root privileges in one of two ways. First

you can log in as the user named root; when you do so you are working with root

privileges until you log off. Alternatively, while you are working as yourself, you can

use the su (substitute user) utility to execute a single command with root privileges

or to gain root privileges temporarily so you can execute several commands. Logging

in as root and running su to gain root privileges require you to enter the root pass-

word. The following example shows how to use su to execute a single command.

$ ls -l /lost+found

ls: cannot open directory /lost+found: Permission denied

$ su -c 'ls -l /lost+found'

Password: Enter the root password

total 0

The first command in the preceding example shows that a user who does not have

root privileges is not permitted to list the files in the /lost+found directory: ls displays

an error message. The second command uses su with the -c (command)

option to execute the same command with root privileges. Single quotation marks

enclose the command to ensure the shell interprets the command properly. When

the command finishes executing (ls shows there are no files in the directory), the

user no longer has root privileges. Without any arguments, su spawns a new shell

running with root privileges. Typically the shell displays a hash or pound sign (#)

prompt when you are working with root privileges. Give an exit command to return

to the normal prompt and nonroot privileges.

$ su

Password: Enter the root password

# ls -l /lost+found

total 0

# exit

exit

Some distributions (e.g., Ubuntu) ship with the root account locked—there is no

root password—and rely on the sudo (www.sudo.ws) utility to allow users to gain

root privileges. The sudo utility requires you to enter your password (not the root

password) to gain root privileges. The following example allows the user to gain

root privileges to view the contents of the /lost+found directory.

$ sudo ls -l /lost+found

[sudo] password for sam: Enter your password

total 0

With an argument of -i, sudo spawns a new shell running with root privileges.

Typically the shell displays a hash or pound sign (#) prompt when you are working with

root privileges. Give an exit command to return to the normal prompt and nonroot

privileges.


$ sudo -i

[sudo] password for sam: Enter your password

# ls -l /lost+found

total 0

# exit

logout

Where to Find Documentation

Distributions of Linux do not typically come with hardcopy reference manuals.

However, its online documentation has always been one of Linux’s strengths. The

man (or manual) and info pages have been available via the man and info utilities

since early releases of the operating system. Not surprisingly, with the ongoing

growth of Linux and the Internet, the sources of documentation have expanded as

well. This section discusses some of the places you can look for information on

Linux. See Appendix B as well.

The --help Option

Most GNU utilities provide a --help option that displays information about the

utility. Non-GNU utilities may use a -h or -help option to display help information.

$ cat --help

Usage: cat [OPTION] [FILE]...

Concatenate FILE(s), or standard input, to standard output.

-A, --show-all equivalent to -vET

-b, --number-nonblank number nonblank output lines

-e equivalent to -vE

-E, --show-ends display $ at end of each line

...

If the information that --help displays runs off the screen, send the output through

the less pager (page 34) using a pipe (page 56):

$ ls --help | less
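When you are looking for one particular option, it can be quicker to filter the help text than to page through it. The following sketch assumes a POSIX shell and grep; help_grep is a hypothetical helper name, not a standard command.

```shell
# help_grep UTILITY PATTERN
# Show only the lines of UTILITY's --help output that mention PATTERN.
# help_grep is a hypothetical helper name, not a standard command.
help_grep() {
    # 2>&1: some utilities write their help text to standard error.
    "$1" --help 2>&1 | grep -i -e "$2"
}

# Example (assumes a utility that accepts --help, such as GNU ls):
#   help_grep ls sort
```

The search ignores case, so a pattern of sort matches both sort and Sort in the help text.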

man: Displays the System Manual

The man utility (page 759) displays manual (man) pages from the system documentation in a

textual environment. This documentation is helpful when you know which utility

you want to use but have forgotten exactly how to use it. You can also refer to the

man pages to get more information about specific topics or to determine which fea-

tures are available with Linux. You can search for topics covered by man pages

using the apropos utility (page 35). Because the descriptions in the system documen-

tation are often terse, they are most helpful if you already understand the basic

functions of a utility.

To find out more about a utility, give the command man, followed by the name of

the utility. Figure 2-1 shows man displaying information about itself; the user

entered a man man command.

less (pager) The man utility automatically sends its output through a pager—usually less, which

displays one screen at a time. When you access a manual page in this manner, less

displays a prompt [e.g., Manual page man(1) line 1] at the bottom of the screen

after it displays each screen of text and waits for you to request another screen of

text by pressing the SPACE bar. Pressing h (help) displays a list of less commands.

Pressing q (quit) stops less and causes the shell to display a prompt. For more infor-

mation refer to page 48.

Manual sections Based on the FHS (Filesystem Hierarchy Standard; page 91), the Linux system manual

and the man pages are divided into ten sections, where each section describes related tools:

1. User Commands

2. System Calls

3. Subroutines

4. Devices

5. File Formats

6. Games

7. Miscellaneous

8. System Administration

9. Kernel

10. New

Figure 2-1 The man utility displaying information about itself

Online man pages

tip The tldp.org/manpages/man.html site holds links to copies of most man pages. In addition to

presenting man pages in an easy-to-read HTML format, this site does not require you to install

a utility to read the man page.

This layout closely mimics the way the set of UNIX manuals has always been

divided. Unless you specify a manual section, man displays the earliest occurrence in

the manual of the word you specify on the command line. Most users find the infor-

mation they need in sections 1, 6, and 7; programmers and system administrators

frequently need to consult the other sections.

In some cases the manual contains entries for different tools with the same name.

For example, the following command displays the man page for the passwd utility

from section 1 of the system manual:

$ man passwd

To see the man page for the passwd file from section 5, enter this command:

$ man 5 passwd

The preceding command instructs man to look only in section 5 for the man page. In

documentation you may see this man page referred to as passwd(5). Use the -a

option (see the adjacent tip) to view all man pages for a given subject (press q and then RETURN

to display the next man page). For example, give the command man -a passwd to

view all man pages for passwd.

apropos: Searches for a Keyword

When you do not know the name of the command you need to carry out a particular

task, you can use apropos with a keyword to search for it. This utility searches for

the keyword in the short description line (the top line) of all man pages and displays

those that contain a match. The man utility, when called with the -k (keyword)

option, provides the same output as apropos.

The database apropos uses, named whatis, is not available on many Linux systems

when they are first installed, but is built automatically by cron (see crontab on

page 649 for a discussion of cron).

The following example shows the output of apropos when you call it with the who

keyword. The output includes the name of each command, the section of the man-

ual that contains it, and the brief description from the top of the man page. This list

includes the utility that you need (who) and identifies other, related tools that you

might find useful:

$ apropos who

at.allow (5) - determine who can submit jobs via at or batch

at.deny (5) - determine who can submit jobs via at or batch

from (1) - print names of those who have sent mail

w (1) - Show who is logged on and what they are doing.

w.procps (1) - Show who is logged on and what they are doing.

who (1) - show who is logged on

whoami (1) - print effective userid

Options

tip An option modifies the way a utility or command works. Options are usually specified as one or

more letters preceded by one or two hyphens. An option typically appears following the name of

the utility you are calling and a SPACE. Other arguments (page 941) to the command follow the

option and a SPACE. For more information refer to “Options” on page 119.

whatis The whatis utility is similar to apropos but finds only complete word matches for the

name of the utility:

$ whatis who

who (1) - show who is logged on
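Because whatis prints the section number in parentheses after each name, its output can be filtered to show which manual sections hold a page with a given name. The following is a sketch only: it assumes the name (section) - description layout shown above, which can differ between systems, and sections_for is a hypothetical helper name.

```shell
# sections_for NAME
# Read whatis-style lines ("name (section) - description") from standard
# input and print the section of every entry whose name is exactly NAME.
# sections_for is a hypothetical helper; the line format is an assumption.
sections_for() {
    awk -v name="$1" '$1 == name { gsub(/[()]/, "", $2); print $2 }'
}

# Typical use on a system with man pages installed:
#   whatis passwd | sections_for passwd
```

The exact comparison on the first field mirrors the way whatis matches complete names only, so an entry such as whoami is not reported when you ask for who.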

info: Displays Information About Utilities

The textual info utility is a menu-based hypertext system developed by the GNU

project (page 3) and distributed with Linux. The info utility can display documenta-

tion on many Linux shells, utilities, and programs developed by the GNU project.

See www.gnu.org/software/texinfo/manual/info for the info manual. Figure 2-2

shows the screen that info displays when you give the command info coreutils (the

coreutils software package holds the Linux core utilities).

Because the information on this screen is drawn from an editable file, your display

may differ from the screens shown in this section. When you see the initial info

screen, you can press any of the following keys or key combinations:

• ? lists info commands

• SPACE scrolls through the menu of items for which information is available

• m followed by the name of a menu displays that menu

• m followed by a SPACE displays a list of menus

• q or CONTROL-C quits info

Figure 2-2 The initial screen info coreutils displays

man and info display different information

tip The info utility displays more complete and up-to-date information on GNU utilities than does man.

When a man page displays abbreviated information on a utility that is covered by info, the man page

refers to info. The man utility frequently displays the only information available on non-GNU utilities.

When info displays information on non-GNU utilities, it is frequently a copy of the man page.

The notation info uses to describe keyboard keys may not be familiar to you. The

notation C-h is the same as CONTROL-H. Similarly M-x means hold down the META or ALT

key and press x. On some systems you need to press ESCAPE and then x to duplicate

the function of META-X. For more information refer to “Keys: Notation and Use” on

page 216. For Mac keyboards see “Activating the META key” on page 935.

After giving the command info coreutils, type /sleepRETURN to search for the string

sleep. When you type /, the cursor moves to the bottom line of the screen and dis-

plays Search for string [string]: where string is the last string you searched for. Press

RETURN to search for string or enter the string you want to search for. Typing sleep

displays sleep on that line, and pressing RETURN displays the next occurrence of sleep.

Next, type /RETURN (or /sleepRETURN) to search for the second occurrence of sleep as

shown in Figure 2-3. The asterisk at the left end of the line indicates that this entry

is a menu item. Following the asterisk is the name of the menu item, two colons,

and a description of the item.

Each menu item is a link to the info page that describes that item. To jump to that

page, use the ARROW keys to move the cursor to the line containing the menu item and

press RETURN. Alternatively, you can type the name of the menu item in a menu com-

mand to view the information. To display information on sleep, for example, you

can give the command m sleep, followed by RETURN. When you type m (for menu),

the cursor moves to the bottom line of the window (as it did when you typed /) and

displays Menu item:. Typing sleep displays sleep on that line, and pressing RETURN

displays information about the menu item you have chosen.

Figure 2-3 The screen info coreutils displays after you type /sleepRETURN twice

You may find pinfo easier to use than info

tip The pinfo utility is similar to info but is more intuitive if you are not familiar with the emacs editor.

This utility runs in a textual environment, as does info. When it is available, pinfo uses color to make

its interface easier to use. You may have to install pinfo before you can use it; see Appendix C.

Figure 2-4 shows the top node of information on sleep. A node groups a set of

information you can scroll through with the SPACE bar. To display the next node,

press n. Press p to display the previous node. (The sleep item has only one node.)

As you read through this book and learn about new utilities, you can use man or info

to find out more about those utilities. If you can print PostScript documents, you

can print a manual page by using the man utility with the -t option. For example,

man -t cat | lpr prints information about the cat utility. You can also use a Web

browser to display the documentation at www.tldp.org and print the desired infor-

mation from the browser.

HOWTOs: Finding Out How Things Work

A HOWTO document explains in detail how to do something related to

Linux—from setting up a specialized piece of hardware to performing a system

administration task to setting up specific networking software. Mini-HOWTOs

offer shorter explanations. As with Linux software, one person or a few people gen-

erally are responsible for writing and maintaining a HOWTO document, but many

people may contribute to it.

The Linux Documentation Project (LDP; page 40) site houses most HOWTO and

mini-HOWTO documents. Use a Web browser to visit www.tldp.org, click HOW-

TOs, and pick the index you want to use to find a HOWTO or mini-HOWTO. You

can also use the LDP search feature on its home page to find HOWTOs and other

documents.

Getting Help with the System

This section describes several places you can look for help in using a Linux system.

Figure 2-4 The info page on the sleep utility

Finding Help Locally

/usr/share/doc The /usr/share/doc and /usr/src/linux/Documentation (present only if you installed

the kernel source code) directories often contain more detailed and different infor-

mation about a utility than man or info provides. Frequently this information is

meant for people who will be compiling and modifying the utility, not just using it.

These directories hold thousands of files, each containing information on a separate

topic.

For example, the following commands display information on grep and info. The

second command uses zcat (page 618) to allow you to view a compressed file by

decompressing it and sending it through the less pager (page 34).

$ less /usr/share/doc/grep/README

$ zcat /usr/share/doc/info/README.gz | less
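Files under /usr/share/doc may or may not be compressed, depending on the distribution. A small wrapper can pick the right command from the filename. The following sketch assumes gzip is installed; view_doc is a hypothetical helper name.

```shell
# view_doc FILE
# Write FILE to standard output, decompressing it first when the name ends
# in .gz. view_doc is a hypothetical helper name.
view_doc() {
    case $1 in
        *.gz) gzip -dc -- "$1" ;;   # gzip -dc does what zcat does
        *)    cat -- "$1" ;;
    esac
}

# Example: view_doc /usr/share/doc/grep/README | less
```

Sending the output through less, as in the commented example, gives the same one-screen-at-a-time display as the commands above.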

Using the Internet to Get Help

The Internet provides many helpful sites related to Linux. Aside from sites that offer

various forms of documentation, you can enter an error message from a program

you are having a problem with in a search engine such as Google (www.google.com,

or its Linux-specific version at www.google.com/linux). Enclose the error message

within double quotation marks to improve the quality of the results. The search will

likely yield a post concerning your problem and suggestions about how to solve it.

See Figure 2-5.

GNU GNU manuals are available at www.gnu.org/manual. In addition, you can visit the

GNU home page (www.gnu.org) for more documentation and other GNU resources.

Many of the GNU pages and resources are available in a variety of languages.

Figure 2-5 Google reporting on an error message

The Linux Documentation Project

The Linux Documentation Project (www.tldp.org; Figure 2-6), which has been

around for almost as long as Linux, houses a complete collection of guides, HOW-

TOs, FAQs, man pages, and Linux magazines. The home page is available in

English, Portuguese, Spanish, Italian, Korean, and French. It is easy to use and sup-

ports local text searches. It also provides a complete set of links you can use to find

almost anything you want related to Linux (click Links in the Search box or go to

www.tldp.org/links). The links page includes sections on general information,

events, getting started, user groups, mailing lists, and newsgroups, with each section

containing many subsections.

More About Logging In

This section discusses how to use virtual consoles, what to do if you have a problem

logging in, and how to change your password.

Using Virtual Consoles

When running Linux on a personal computer, you will frequently work with the

display and keyboard attached to the computer. Using this physical console, you can

access as many as 63 virtual consoles (also called virtual terminals). Some are set up

to allow logins; others act as graphical displays. To switch between virtual consoles,

hold the CONTROL and ALT keys down and press the function key that corresponds to

the console you want to view. For example, CONTROL-ALT-F5 displays the fifth virtual

console. This book refers to the console you see when you press CONTROL-ALT-F1 as the

system console, or just console.

Figure 2-6 The Linux Documentation Project home page

Typically, six virtual consoles are active and have textual login sessions running.

When you want to use both textual and graphical interfaces, you can set up a tex-

tual session on one virtual console and a graphical session on another. No matter

which virtual console you start a graphical session from, the graphical session runs

on the first unused virtual console (typically number seven).

What to Do If You Cannot Log In

If you enter either your username or password incorrectly, the system displays an

error message after you enter both your username and your password. This message

indicates you have entered either the username or the password incorrectly or they

are not valid. It does not differentiate between an unacceptable username and an

unacceptable password—a strategy meant to discourage unauthorized people from

guessing names and passwords to gain access to the system.

Following are some common reasons why logins fail:

• The username and password are case sensitive. Make sure the CAPS LOCK key

is off and enter your username and password exactly as specified or as you

set them up.

• You are not logging in on the right machine. The login/password combina-

tion may not be valid if you are trying to log in on the wrong machine. On

a larger, networked system, you may have to specify the machine you want

to connect to before you can log in.

• Your username is not valid. The login/password combination may not be

valid if you have not been set up as a user. Check with the system

administrator.

Logging Out

To log out from a character-based interface, press CONTROL-D in response to the shell

prompt. This action sends the shell an EOF (end of file) signal. Alternatively, you

can give the command exit. Exiting from a shell does not end a graphical session; it

just exits from the shell you are working with. For example, exiting from the shell

that GNOME terminal provides closes the GNOME terminal window.
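You can observe the effect of end of file without logging out. In the following sketch, a second shell (sh is assumed to be available) reads its commands from a pipe instead of the keyboard; when the pipe runs dry the shell sees end of file and exits, just as a login shell does when you press CONTROL-D.

```shell
# The inner shell reads one command from the pipe, runs it, then reaches
# end of file and exits, which is what CONTROL-D triggers at a terminal.
printf 'echo "inner shell is running"\n' | sh
echo "inner shell exited with status $?"
```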

Changing Your Password

If someone else assigned you a password, it is a good idea to give yourself a new one.

For security reasons none of the passwords you enter is displayed by any utility.

Protect your password

security Do not allow someone to find out your password: Do not put your password in a file that is not

encrypted, allow someone to watch you type your password, or give your password to someone

you do not know (a system administrator never needs to know your password). You can always

write your password down and keep it in a safe, private place.

To change your password, give the command passwd. The first item passwd asks for is

your current (old) password. This password is verified to ensure that an unauthorized

user is not trying to alter your password. Then the system requests a new password.

After you enter your new password, the system asks you to retype it to make sure

you did not make a mistake when you entered it the first time. If the new password

is the same both times you enter it, your password is changed. If the passwords dif-

fer, it means that you made an error in one of them, and the system displays this

error message:

Sorry, passwords do not match

If your password is not long enough, the system displays the following message:

You must choose a longer password

When it is too simple, the system displays this message:

Bad: new password is too simple

If the system displays an error message, you need to start over. Press RETURN a few

times until the shell displays a prompt and run passwd again.

When you successfully change your password, you change the way you log in. If

you forget your password, someone working with root privileges can run passwd to

change it and tell you your new password.

Under OS X, the passwd utility changes your login password, but does not change

your Keychain password. The Keychain password is used by various graphical

applications. You can change the Keychain password using the Keychain Access

application.

Choose a password that is difficult to guess

security Do not use phone numbers, names of pets or kids, birthdays, words from a dictionary (not even

a foreign language), and so forth. Do not use permutations of these items or a l33t-speak variation

of a word: Modern dictionary crackers may also try these permutations.

Differentiate between important and less important passwords

security It is a good idea to differentiate between important and less important passwords. For example,

Web site passwords for blogs or download access are not very important; it is acceptable to use

the same password for these types of sites. However, your login, mail server, and bank account

Web site passwords are critical: Use different passwords for each and never use these passwords

for an unimportant Web site.

pwgen can help you pick a password

security The pwgen utility (which you may need to install; see Appendix C) generates a list of almost ran-

dom passwords. With a little imagination, you can pronounce, and therefore remember, some of

these passwords.

Secure passwords To be relatively secure, a password should contain a combination of numbers, upper-

case and lowercase letters, and punctuation characters and meet the following criteria:

• Must meet the system's minimum length, typically four to six characters

(depending on how the administrator sets up the system). Seven or eight

characters is a good compromise between length and security.

• Should not be a word in a dictionary of any language, no matter how

seemingly obscure.

• Should not be the name of a person, place, pet, or other thing that might

be discovered easily.

• Should contain at least two letters and one digit or punctuation character.

• Should not be your username, the reverse of your username, or your user-

name shifted by one or more characters.

Only the first item is mandatory. Avoid using control characters (such as CONTROL-H)

because they may have a special meaning to the system, making it impossible for

you to log in. If you are changing your password, the new password should differ

from the old one by at least three characters. Changing the case of a character does

not make it count as a different character.
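Some of the criteria above are mechanical enough to check with a short script. The following is a rough sketch, not the test that passwd itself applies: it covers only length, character mix, and the username rules, and it does not consult a dictionary. The names check_password, reverse, and MIN_LEN are hypothetical, and the minimum of six characters is an assumed site policy.

```shell
# check_password USERNAME CANDIDATE
# Print a reason and return 1 if CANDIDATE breaks one of the rules above;
# return 0 otherwise. MIN_LEN is an assumed site policy, not a system value.
MIN_LEN=6

reverse() {
    # Print the argument with its characters in reverse order.
    printf '%s\n' "$1" |
        awk '{ s = ""; for (i = length($0); i > 0; i--) s = s substr($0, i, 1); print s }'
}

check_password() {
    user=$1
    pw=$2

    if [ "${#pw}" -lt "$MIN_LEN" ]; then
        echo "too short"; return 1
    fi

    # Keep only the letters, then count them.
    letters=$(printf '%s' "$pw" | tr -cd 'A-Za-z')
    if [ "${#letters}" -lt 2 ]; then
        echo "needs at least two letters"; return 1
    fi

    # Keep only digits and punctuation, then count them.
    others=$(printf '%s' "$pw" | tr -cd '0-9[:punct:]')
    if [ "${#others}" -lt 1 ]; then
        echo "needs a digit or punctuation character"; return 1
    fi

    if [ "$pw" = "$user" ] || [ "$pw" = "$(reverse "$user")" ]; then
        echo "must not be the username or its reverse"; return 1
    fi

    return 0
}

for candidate in fido 'Gr8-t0ad!'; do
    if reason=$(check_password sam "$candidate"); then
        echo "$candidate: acceptable"
    else
        echo "$candidate: rejected ($reason)"
    fi
done
```

A password that passes these checks can still be weak; a name or dictionary word with a digit appended satisfies every test in the sketch.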

Chapter Summary

As with many operating systems, your access to a Linux system is authorized when

you log in. You enter your username in response to the login: prompt, followed by a

password. You can use passwd to change your password any time while you are

logged in. Choose a password that is difficult to guess and that conforms to the cri-

teria imposed by the system administrator.

The system administrator is responsible for maintaining the system. On a single-user

system, you are the system administrator. On a small, multiuser system, you or

another user will act as the system administrator, or this job may be shared. On a

large, multiuser system or network of systems, there is frequently a full-time system

administrator. When extra privileges are required to perform certain system tasks, the

system administrator gains root privileges by logging in as root, or running su or sudo.

On a multiuser system, several trusted users may be allowed to gain root privileges.

Do not work with root privileges as a matter of course. When you have to do some-

thing that requires root privileges, work with root privileges for only as long as you

need to; revert to working as yourself as soon as possible.

The man utility provides online documentation on system utilities. This utility is

helpful both to new Linux users and to experienced users who must often delve into

the system documentation for information on the fine points of a utility’s behavior.

The info utility helps the beginner and the expert alike. It includes documentation on

many Linux utilities.

Exercises

1. The following message is displayed when you attempt to log in with an

incorrect username or an incorrect password:

Login incorrect

This message does not indicate whether your username, your password, or

both are invalid. Why does it not tell you this information?

2. Give three examples of poor password choices. What is wrong with each?

Include one that is too short. Give the error message the system displays.

3. Is fido an acceptable password? Give several reasons why or why not.

4. What would you do if you could not log in?

5. Try to change your password to dog. What happens? Now change it to a

more secure password. What makes that password relatively secure?

Advanced Exercises

6. Change your login shell to tcsh without using root privileges.

7. How many man pages are in the Devices subsection of the system manual?

(Hint: Devices is a subsection of Special Files.)

8. The example on page 35 shows that man pages for passwd appear in sec-

tions 1 and 5 of the system manual. Explain how you can use man to

determine which sections of the system manual contain a manual page

with a given name.

9. How would you find out which Linux utilities create and work with

archive files?

Chapter 3: The Utilities

In This Chapter

Special Characters (page 46)
Basic Utilities (page 47)
less Is more: Display a Text File One Screen at a Time (page 48)
Working with Files (page 49)
lpr: Prints a File (page 51)
| (Pipe): Communicates Between Processes (page 56)
Compressing and Archiving Files (page 64)
Obtaining User and System Information (page 71)

When Linus Torvalds introduced Linux and for a long time thereafter, Linux did not have a graphical user interface (GUI): It ran using a textual interface, also referred to as a command-line interface (CLI). All the tools ran from a command line. Today the Linux GUI is important but many people, especially system administrators, run many command-line utilities. Command-line utilities are often faster, more powerful, or more complete than their GUI counterparts. Sometimes there is no GUI counterpart to a textual utility; some people just prefer the hands-on feeling of the command line.

When you work with a command-line interface, you are working with a shell (Chapters 5, 8, and 10). Before you start working with a shell, it is important that you understand something about the characters that are special to the shell, so this chapter starts with a discussion of special characters. The chapter then describes five basic utilities: ls, cat, rm, less, and hostname. It continues by describing several other file manipulation utilities as well as utilities that display who is logged in; that communicate with other users; that print, compress, and decompress files; and that pack and unpack archive files.

Special Characters

Special characters, which have a special meaning to the shell, are discussed in “File-

name Generation/Pathname Expansion” on page 136. These characters are men-

tioned here so you can avoid accidentally using them as regular characters until you

understand how the shell interprets them. For example, it is best to avoid using any of

the following characters in a filename (even though emacs and some other programs

do) because they make the file harder to reference on the command line:

& ; | * ? ' " ` [ ] ( ) $ < > { } # / \ ! ~

Whitespace Although not considered special characters, RETURN, SPACE, and TAB have special mean-

ings to the shell. RETURN usually ends a command line and initiates execution of a

command. The SPACE and TAB characters separate elements on the command line and

are collectively known as whitespace or blanks.
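A quick way to see whitespace acting as a separator is to count the arguments a command receives. The following sketch assumes a POSIX shell; count_args is a hypothetical function name.

```shell
# count_args: print the number of arguments it receives.
count_args() {
    echo "$#"
}

# Any run of SPACEs between words separates them; the run is not passed on.
count_args one     two      # two arguments
echo one     two            # echo rejoins its arguments with single SPACEs
```

The extra SPACEs never reach the command: the shell uses them to split the line and then discards them.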

Quoting special characters

If you need to use a character that has a special meaning to the shell as a regular

character, you can quote (or escape) it. When you quote a special character, you keep

the shell from giving it special meaning. The shell treats a quoted special character as

a regular character. However, a slash (/) is always a separator in a pathname, even

when you quote it.

Backslash To quote a character, precede it with a backslash (\). When two or more special

characters appear together, you must precede each with a backslash (for example,

you would enter ** as \*\*). You can quote a backslash just as you would quote

any other special character—by preceding it with a backslash (\\).

Single quotation marks

Another way of quoting special characters is to enclose them between single quota-

tion marks: '**'. You can quote many special and regular characters between a

pair of single quotation marks: 'This is a special character: >'. The regular charac-

ters are interpreted as usual, and the shell also interprets the special characters as

regular characters.
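The two quoting methods can be compared side by side. The following commands are a small sketch, assuming a POSIX shell; printf is used for the backslash because some versions of echo give backslashes a meaning of their own.

```shell
# Each command prints two literal asterisks; the shell does not expand them.
echo \*\*
echo '**'

# Inside single quotation marks the > is ordinary text, not a redirection.
echo 'This is a special character: >'

# A quoted backslash reaches the command as a single backslash.
printf '%s\n' \\
```

Without the quoting, the shell would try to match ** against filenames in the working directory and would treat > as a request to redirect output.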

The only way to quote the erase character (CONTROL-H), the line kill character

(CONTROL-U), and other control characters (try CONTROL-M) is by preceding each with a

CONTROL-V. Single quotation marks and backslashes do not work. Try the following:

$ echo 'xxxxxxCONTROL-U'

$ echo xxxxxxCONTROL-V CONTROL-U

optional Although you cannot see the CONTROL-U displayed by the second of the preceding pair

of commands, it is there. The following command sends the output of echo (page 57)

through a pipe (page 56) to od (octal display; page 776) to display CONTROL-U as octal

25 (025):

$ echo xxxxxxCONTROL-V CONTROL-U | od -c

0000000 x x x x x x 025 \n

0000010

The \n is the NEWLINE character that echo sends at the end of its output.
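If you want to reproduce this display without typing control characters, printf can generate CONTROL-U from its octal value. This is a sketch that assumes a POSIX printf and od; the exact column spacing of the od output can vary between systems.

```shell
# \025 is the octal escape for CONTROL-U, so no interactive typing is needed.
printf 'xxxxxx\025\n' | od -c
```

The output should resemble the od display shown above, with 025 standing in for the control character.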

Basic Utilities

One of the important advantages of Linux is that it comes with thousands of utilities

that perform myriad functions. You will use utilities whenever you work with Linux,

whether you use them directly by name from the command line or indirectly from a

menu or icon. The following sections discuss some of the most basic and important

utilities; these utilities are available from a textual interface. Some of the more impor-

tant utilities are also available from a GUI; others are available only from a GUI.

Folder/directory The term directory is used extensively in the next sections. A directory is a resource

that can hold files. On other operating systems, including Windows and Mac OS X,

and frequently when speaking about a Linux GUI, a directory is referred to as a

folder. That is a good analogy: A traditional manila folder holds files just as a direc-

tory does.

ls: Lists the Names of Files

Using the editor of your choice, create a small file named practice. (A tutorial on the

vim editor appears on page 151 and a tutorial on emacs appears on page 208.) After

exiting from the editor, you can use the ls (list) utility to display a list of the names

of the files in your home directory. In the first command in Figure 3-1, ls lists the

name of the practice file. (You may also see files that the system or a program cre-

ated automatically.) Subsequent commands in Figure 3-1 display the contents of the

file and remove the file. These commands are described next. Refer to page 745 or

give the command info coreutils 'ls invocation' for more information.

Run these utilities from a command line

tip This chapter describes command-line, or textual, utilities. You can experiment with these utilities

from a terminal, a terminal emulator within a GUI, or a virtual console (page 40).

In this chapter you work in your home directory

tip When you log in on the system, you are working in your home directory. In this chapter that is the

only directory you use: All the files you create in this chapter are in your home directory. Chapter 4

goes into detail about directories.

$ ls

practice

$ cat practice

This is a small file that I created

with a text editor.

$ rm practice

$ ls

$ cat practice

cat: practice: No such file or directory

Figure 3-1 Using ls, cat, and rm on the file named practice

cat: Displays a Text File

The cat utility displays the contents of a text file. The name of the command is

derived from catenate, which means to join together, one after the other. (Figure 5-8

on page 127 shows how to use cat to string together the contents of three files.)

A convenient way to display the contents of a file to the screen is by giving the com-

mand cat, followed by a SPACE and the name of the file. Figure 3-1 shows cat display-

ing the contents of practice. This figure shows the difference between the ls and cat

utilities: The ls utility displays the name of a file, whereas cat displays the contents

of a file. Refer to page 618 or give the command info coreutils 'cat invocation' for

more information.
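The name catenate makes more sense when you give cat more than one filename. The following sketch works in a scratch directory so that it does not touch your own files; it assumes mktemp is available, and the filenames part1 and part2 are made up for the illustration.

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)
printf 'first line\n'  > "$dir/part1"
printf 'second line\n' > "$dir/part2"

# With several filenames, cat writes their contents one after the other.
cat "$dir/part1" "$dir/part2"

rm -r "$dir"
```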

rm: Deletes a File

The rm (remove) utility deletes a file. Figure 3-1 shows rm deleting the file named

practice. After rm deletes the file, ls and cat show that practice is no longer in the

directory. The ls utility does not list its filename, and cat says that no such file

exists. Use rm carefully. Refer to page 804 or give the command info coreutils 'rm

invocation' for more information. If you are running Mac OS X, see “Many Utili-

ties Do Not Respect Apple Human Interface Guidelines” on page 936.

less Is more: Display a Text File One Screen at a Time

Pagers

When you want to view a file that is longer than one screen, you can use either the

less utility or the more utility. Each of these utilities pauses after displaying a screen

of text; press the SPACE bar to display the next screen of text. Because these utilities

show one page at a time, they are called pagers. Although less and more are very

similar, they have subtle differences. At the end of the file, for example, less displays

an END message and waits for you to press q before returning you to the shell. In

contrast, more returns you directly to the shell. While using either utility you can

press h to display a Help screen that lists commands you can use while paging

through a file. Give the commands less practice and more practice in place of the cat

command in Figure 3-1 to see how these commands work. Use the command less

/usr/share/dict/words instead if you want to experiment with a longer file. Refer to

page 735 or to the less and more man pages for more information on less and more.

A safer way of removing files

tip You can use the interactive form of rm to make sure that you delete only the file(s) you intend to

delete. When you follow rm with the –i option (see page 35 for a tip on options) and the name of

the file you want to delete, rm displays the name of the file and then waits for you to respond with

y (yes) before it deletes the file. It does not delete the file if you respond with a string that begins

with a character other than y.

$ rm -i toollist

rm: remove regular file 'toollist'? y

Optional: You can create an alias (page 324) for rm –i and put it in your startup file (page 82) so

rm always runs in interactive mode.
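The behavior described in this tip can be checked without typing a response at the prompt. The sketch below assumes a scratch directory created with mktemp and a throwaway copy of toollist (both are illustrative, not part of the original example), and pipes the response to rm -i on standard input.

```shell
# Work in a scratch directory so no real files are at risk.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
echo 'hammer' > toollist

# Answer n: rm -i reads the response from standard input and keeps the file.
echo n | rm -i toollist
if [ -e toollist ]; then after_n=kept; else after_n=deleted; fi

# Answer y: this time rm deletes the file.
echo y | rm -i toollist
if [ -e toollist ]; then after_y=kept; else after_y=deleted; fi

echo "after n: $after_n, after y: $after_y"
```

The prompt itself is written to standard error, so it still appears on the screen even though the response comes from a pipe.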

hostname: Displays the System Name

The hostname utility displays the name of the system you are working on. Use this util-

ity if you are not sure you are logged in on the right machine. Refer to the hostname

man page for more information.

$ hostname

dog

Working with Files

This section describes utilities that copy, move, print, search through, display, sort,

and compare files. If you are running Mac OS X, see “Resource forks” on page 929.

cp: Copies a File

The cp (copy) utility (Figure 3-2) makes a copy of a file. This utility can copy any

file, including text and executable program (binary) files. You can use cp to make a

backup copy of a file or a copy to experiment with.

The cp command line uses the following syntax to specify source and destination

files:

cp source-file destination-file

The source-file is the name of the file that cp will copy. The destination-file is the

name that cp assigns to the resulting (new) copy of the file.

Filename completion

tip After you enter one or more letters of a filename (following a command) on a command line,

press TAB and the Bourne Again Shell will complete as much of the filename as it can. When only

one filename starts with the characters you entered, the shell completes the filename and places

a SPACE after it. You can keep typing or you can press RETURN to execute the command at this

point. When the characters you entered do not uniquely identify a filename, the shell completes

what it can and waits for more input. When pressing TAB does not change the display, press TAB

again (Bourne Again Shell, page 320) or CONTROL-D (TC Shell, page 360) to display a list of pos-

sible completions.

$ ls

memo

$ cp memo memo.copy

$ ls

memo memo.copy

Figure 3-2 cp copies a file

The cp command line in Figure 3-2 copies the file named memo to memo.copy. The

period is part of the filename—just another character. The initial ls command shows

that memo is the only file in the directory. After the cp command, a second ls shows

two files in the directory, memo and memo.copy.

Sometimes it is useful to incorporate the date in the name of a copy of a file. The

following example includes the date January 30 (0130) in the copied file:

$ cp memo memo.0130

Although it has no significance to Linux, the date can help you find a version of a file

you created on a certain date. Including the date can also help you avoid overwriting

existing files by providing a unique filename each day. For more information refer to

“Filenames” on page 79.
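Rather than typing the date by hand, you can let the date utility (described later in this chapter) supply it. The following sketch assumes a scratch directory and a throwaway memo file; the variable name stamp is illustrative only.

```shell
# Work in a scratch directory with a throwaway memo file.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
echo 'Draft memo' > memo

# date +%m%d prints the month and day as four digits, such as 0130 for January 30.
stamp=$(date +%m%d)

# Copy memo to a name that ends with today's date.
cp memo "memo.$stamp"
ls
```

A copy made on a different day receives a different name, so it does not overwrite this one.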

Refer to page 640 or give the command info coreutils 'cp invocation' for more

information. Use scp (page 810), ftp (page 704), or rsync (page 583) when you need

to copy a file from one system to another on a common network.

mv: Changes the Name of a File

The mv (move) utility can rename a file without making a copy of it. The mv com-

mand line specifies an existing file and a new filename using the same syntax as cp:

mv existing-filename new-filename

The command line in Figure 3-3 changes the name of the file memo to memo.0130.

The initial ls command shows that memo is the only file in the directory. After you

give the mv command, memo.0130 is the only file in the directory. Compare this

result to that of the cp example in Figure 3-2.

The mv utility can be used for more than changing the name of a file. Refer to “mv,

cp: Move or Copy Files” on page 90. Also refer to page 771 or give the command

info coreutils 'mv invocation' for more information.
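To see the difference between cp and mv side by side, you can run both in an empty directory. This is a small sketch that assumes a scratch directory created with mktemp and reuses the filenames from Figures 3-2 and 3-3.

```shell
# Start with a single file in an otherwise empty scratch directory.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
echo 'Draft memo' > memo

cp memo memo.copy     # memo and memo.copy now both exist
mv memo memo.0130     # memo is gone; memo.0130 has taken its place

# The directory holds memo.0130 and memo.copy, but no file named memo.
ls
```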

cp can destroy a file

caution If the destination-file exists before you give a cp command, cp overwrites it. Because cp

overwrites (and destroys the contents of) an existing destination-file without warning, you must take

care not to cause cp to overwrite a file that you need. The cp –i (interactive) option prompts you

before it overwrites a file. See page 35 for a tip on options.

The following example assumes that the file named orange.2 exists before you give the cp com-

mand. The user answers y to overwrite the file:

$ cp -i orange orange.2

cp: overwrite 'orange.2'? y

mv can destroy a file

caution Just as cp can destroy a file, so can mv. Also like cp, mv has a –i (interactive) option. See the

caution box labeled “cp can destroy a file.”

lpr: Prints a File

The lpr (line printer) utility places one or more files in a print queue for printing.

Linux provides print queues so that only one job is printed on a given printer at a

time. A queue allows several people or jobs to send output simultaneously to a sin-

gle printer with the expected results. On systems that have access to more than one

printer, you can use lpstat –p to display a list of available printers. Use the –P option

to instruct lpr to place the file in the queue for a specific printer—even one that is

connected to another system on the network. The following command prints the file

named report:

$ lpr report

Because this command does not specify a printer, the output goes to the default

printer, which is the only printer when the system has just one.

The next command line prints the same file on the printer named mailroom:

$ lpr -P mailroom report

You can see which jobs are in the print queue by giving an lpstat –o command or by

using the lpq utility:

$ lpq

lp is ready and printing

Rank Owner Job Files Total Size

active max 86 (standard input) 954061 bytes

In this example, Max has one job that is being printed; no other jobs are in the

queue. You can use the job number (86 in this case) with the lprm utility to remove

the job from the print queue and stop it from printing:

$ lprm 86

You can send more than one file to the printer with a single command. The follow-

ing command line prints three files on the printer named laser1:

$ lpr -P laser1 05.txt 108.txt 12.txt

Refer to pages 132 and 742 or to the lpr man page for more information.

$ ls

memo

$ mv memo memo.0130

$ ls

memo.0130

Figure 3-3 mv renames a file

grep: Searches for a String

The grep utility (see footnote 1) searches through one or more files to see whether any contain a

specified string of characters. This utility does not change the file it searches but

simply displays each line that contains the string.

The grep command in Figure 3-4 searches through the file memo for lines that con-

tain the string credit and displays the single line that meets this criterion. If memo

contained such words as discredit, creditor, or accreditation, grep would have displayed

those lines as well because they contain the string it was searching for. The

–w (words) option causes grep to match only whole words. Although you do not

need to enclose the string you are searching for in single quotation marks, doing so

allows you to put SPACEs and special characters in the search string.

The grep utility can do much more than search for a simple string in a single file.

Refer to page 719 or to the grep man page for more information. See also

Appendix A, “Regular Expressions.”
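The effect of the -w option is easiest to see on a file that contains both the word and a longer word built from it. The sketch below assumes a throwaway three-line file named memo in a scratch directory; its contents are illustrative and differ from the memo in Figure 3-4.

```shell
# Build a small test file in a scratch directory.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' 'the issue of credit.' 'a creditor called' 'no match here' > memo

plain=$(grep 'credit' memo)      # matches both credit and creditor
whole=$(grep -w 'credit' memo)   # matches only the whole word credit

echo "$plain"
echo "---"
echo "$whole"
```

The period after credit does not prevent the whole-word match because a period is not a word character.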

head: Displays the Beginning of a File

By default the head utility displays the first ten lines of a file. You can use head to

help you remember what a particular file contains. For example, if you have a file

named months that lists the 12 months of the year in calendar order, one to a line,

then head displays Jan through Oct (Figure 3-5).

$ cat memo

Helen:

In our meeting on June 6 we

discussed the issue of credit.

Have you had any further thoughts

about it?

Max

$ grep 'credit' memo

discussed the issue of credit.

Figure 3-4 grep searches for a string

1. Originally the name grep was a play on a command from ed (an original UNIX editor, available on

Linux): g/re/p. In this command g stands for global, re is a regular expression delimited by slashes,

and p means print.

The head utility can display any number of lines, so you can use it to look at only the

first line of a file, at a full screen, or even more. To specify the number of lines

displayed, include a hyphen followed by the number of lines you want head to

display. For example, the following command displays only the first line of

months:

$ head -1 months

Jan

The head utility can also display parts of a file based on a count of blocks or charac-

ters rather than lines. Refer to page 727 or give the command info coreutils 'head

invocation' for more information.
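The following sketch shows the same request written with the -n option, which is the form that POSIX specifies, along with a count of bytes given with -c. It assumes a months file built with printf in a scratch directory; the -c option is accepted by GNU and BSD head but may be missing elsewhere.

```shell
# Build the months file in a scratch directory.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec > months

first_line=$(head -n 1 months)    # same result as head -1 months
first_bytes=$(head -c 7 months)   # first 7 bytes: Jan, a NEWLINE, and Feb

echo "$first_line"
echo "$first_bytes"
```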

tail: Displays the End of a File

The tail utility is similar to head but by default displays the last ten lines of a file.

Depending on how you invoke it, this utility can display fewer or more than ten

lines, use a count of blocks or characters rather than lines to display parts of a file,

and display lines being added to a file that is changing. The tail command in

Figure 3-5 displays the last five lines (Aug through Dec) of the months file.

You can monitor lines as they are added to the end of the growing file named logfile

with the following command:

$ tail -f logfile

Press the interrupt key (usually CONTROL-C) to stop tail and display the shell prompt.

Refer to page 843 or give the command info coreutils 'tail invocation' for more

information.
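As with head, the line count for tail can be written with the -n option. A plus sign in front of the number makes tail count from the beginning of the file instead of the end. The sketch below assumes the same months file, created here with printf in a scratch directory.

```shell
# Build the months file in a scratch directory.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec > months

last_five=$(tail -n 5 months)     # same result as tail -5 months
from_11=$(tail -n +11 months)     # everything from line 11 to the end

echo "$last_five"
echo "---"
echo "$from_11"
```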

$ head months

Jan

Feb

Mar

Apr

May

Jun

Jul

Aug

Sep

Oct

$ tail -5 months

Aug

Sep

Oct

Nov

Dec

Figure 3-5 head displays the first lines of a file; tail displays the last lines of a file

sort: Displays a File in Order

The sort utility displays the contents of a file in order by lines; it does not change the

original file. Figure 3-6 shows cat displaying the file named days, which contains the

name of each day of the week on a separate line in calendar order. The sort utility

then displays the file in alphabetical order.

The sort utility is useful for putting lists in order. The –u option generates a sorted

list in which each line is unique (no duplicates). The –n option puts a list of num-

bers in numerical order. Refer to page 817 or give the command info coreutils 'sort

invocation' for more information.
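The difference between the default order and numerical order shows up as soon as the numbers have different lengths. This sketch assumes a throwaway file named numbers in a scratch directory; setting LC_ALL=C is an added precaution so that the default ordering does not depend on the locale.

```shell
# A short list of numbers that includes a duplicate.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' 10 9 100 9 > numbers

as_text=$(LC_ALL=C sort numbers)   # character order: 10 100 9 9
numeric=$(sort -n numbers)         # numerical order: 9 9 10 100
unique=$(sort -n -u numbers)       # duplicates removed: 9 10 100

echo "$as_text"; echo "---"
echo "$numeric"; echo "---"
echo "$unique"
```

In every case the numbers file itself is unchanged.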

uniq: Removes Duplicate Lines from a File

The uniq (unique) utility displays a file, skipping adjacent duplicate lines, but does

not change the original file. If a file contains a list of names and has two successive

entries for the same person, uniq skips the duplicate line (Figure 3-7).

If a file is sorted before it is processed by uniq, this utility ensures that no two lines in the

file are the same. (Of course, sort can do that all by itself with the –u option.) Refer to

page 872 or give the command info coreutils 'uniq invocation' for more information.
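The word adjacent matters: uniq compares each line only with the line before it. The following sketch assumes a throwaway file named names in which the duplicates are not all next to each other, and an intermediate file named names.sorted; both names are illustrative.

```shell
# A list in which Mary and Joe each appear more than once.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' Mary Joe Mary Mary Joe > names

adjacent_only=$(uniq names)        # only the adjacent pair of Marys collapses

sort names > names.sorted          # sorting places all duplicates next to each other
sorted_first=$(uniq names.sorted)  # now every name appears once

echo "$adjacent_only"
echo "---"
echo "$sorted_first"
```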

diff: Compares Two Files

The diff (difference) utility compares two files and displays a list of the differences

between them. This utility does not change either file; it is useful when you want to com-

pare two versions of a letter or a report or two versions of the source code for a program.

$ cat days

Monday

Tuesday

Wednesday

Thursday

Friday

Saturday

Sunday

$ sort days

Friday

Monday

Saturday

Sunday

Thursday

Tuesday

Wednesday

Figure 3-6 sort displays the lines of a file in order

The diff utility with the –u (unified output format) option first displays two lines

indicating which of the files you are comparing will be denoted by a plus sign (+)

and which by a minus sign (–). In Figure 3-8, a minus sign indicates the colors.1 file;

a plus sign indicates the colors.2 file.

The diff –u command breaks long, multiline text into hunks. Each hunk is preceded

by a line starting and ending with two at signs (@@). This hunk identifier indicates

the starting line number and the number of lines from each file for this hunk. In

Figure 3-8, the -1,6 indicates that the hunk covers the section of the colors.1 file

(indicated by a minus sign) from the first line through the sixth line. The +1,5 then

indicates that the hunk covers colors.2 from the first line through the fifth line.

Following these header lines, diff –u displays each line of text with a leading minus

sign, a leading plus sign, or a SPACE. A leading minus sign indicates that the line

occurs only in the file denoted by the minus sign. A leading plus sign indicates that

the line occurs only in the file denoted by the plus sign. A line that begins with a

SPACE (neither a plus sign nor a minus sign) occurs in both files in the same location.

Refer to page 663 or to the diff info page for more information.
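You can reproduce the comparison in Figure 3-8 by creating the two files yourself. The sketch below assumes the file contents implied by the figure and builds them with printf in a scratch directory. It also records the exit status of diff, which is 1 when the files differ and 0 when they are identical.

```shell
# Recreate the two files compared in Figure 3-8.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf '%s\n' red green yellow pink purple orange > colors.1
printf '%s\n' red blue green yellow orange > colors.2

# A status of 1 from diff means "the files differ"; it is not an error.
output=$(diff -u colors.1 colors.2)
status=$?

echo "$output"
echo "exit status: $status"
```

The timestamps in the two header lines will differ from those in the figure because they reflect when the files were created.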

$ cat dups

Cathy

Fred

Joe

John

Mary

Mary

Paula

$ uniq dups

Cathy

Fred

Joe

John

Mary

Paula

Figure 3-7 uniq removes duplicate lines

$ diff -u colors.1 colors.2

--- colors.1 2009-07-29 16:41:11.000000000 -0700

+++ colors.2 2009-07-29 16:41:17.000000000 -0700

@@ -1,6 +1,5 @@

 red

+blue

 green

 yellow

-pink

-purple

 orange

Figure 3-8 diff displaying the unified output format

file: Identifies the Contents of a File

You can use the file utility to learn about the contents of a file without having to

open and examine the file yourself. In the following example, file reports that

letter_e.bz2 contains data that was compressed by the bzip2 utility (page 60):

$ file letter_e.bz2

letter_e.bz2: bzip2 compressed data, block size = 900k

Next file reports on two more files:

$ file memo zach.jpg

memo: ASCII text

zach.jpg: JPEG image data, ... resolution (DPI), 72 x 72

Refer to page 686 or to the file man page for more information.

| (Pipe): Communicates Between Processes

Because pipes are integral to the functioning of a Linux system, this chapter introduces

them for use in examples. Pipes are covered in detail beginning on page 131. If you are

running Mac OS X, see the tip “Pipes do not work with resource forks” on page 929.

A process is the execution of a command by Linux (page 306). Communication

between processes is one of the hallmarks of both UNIX and Linux. A pipe (written

as a vertical bar [|] on the command line and appearing as a solid or broken vertical

line on a keyboard) provides the simplest form of this kind of communication. Sim-

ply put, a pipe takes the output of one utility and sends that output as input to

another utility. Using UNIX/Linux terminology, a pipe takes standard output of one

process and redirects it to become standard input of another process. (For more

information refer to “Standard Input and Standard Output” on page 123.) Most of

what a process displays on the screen is sent to standard output. If you do not redi-

rect it, this output appears on the screen. Using a pipe, you can redirect standard

output so it becomes standard input of another utility. For example, a utility such as

head can take its input from a file whose name you specify on the command line fol-

lowing the word head, or it can take its input from standard input. The following

command line sorts the lines of the months file (Figure 3-5, page 53) and uses head

to display the first four months of the sorted list:

$ sort months | head -4

Apr

Aug

Dec

Feb

The next command line displays the number of files in a directory (assuming no file-

names include SPACEs). The wc (word count) utility with the –w (words) option displays

the number of words in its standard input or in a file you specify on the command line:

$ ls | wc -w
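The assumption about SPACEs matters because wc -w counts words, not filenames. The following sketch, which uses hypothetical filenames in a scratch directory, compares the word count with a line count. It relies on ls writing one filename per line when its output goes to a pipe instead of the screen.

```shell
# Three files, one of which has a SPACE in its name.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
touch memo 'memo 0714' practice

words=$(ls | wc -w)   # the SPACE splits one filename into two words
lines=$(ls | wc -l)   # one line per filename

echo "words: $words, lines: $lines"
```

Counting lines gives the expected result of three files here, although a filename that contains a NEWLINE would still be miscounted.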

You can use a pipe to send output of a program to the printer:

$ tail months | lpr

Four More Utilities

The echo and date utilities are two of the most frequently used members of the large

collection of Linux utilities. The script utility records part of a session in a file, and

todos or unix2dos makes a copy of a text file that can be read on a machine running

Windows or Mac OS X.

echo: Displays Text

The echo utility copies the characters you type on the command line after echo to

the screen. Figure 3-9 shows some examples. The last example shows how the shell

treats an unquoted asterisk (*) on the command line: It expands the asterisk into a

list of filenames in the directory.

The echo utility is a good tool for learning about the shell and other Linux utilities.

Some examples on page 138 use echo to illustrate how special characters, such as

the asterisk, work. Throughout Chapters 5, 8, and 10, echo helps explain how shell

variables work and how you can send messages from shell scripts to the screen.

Refer to page 680 or give the command info coreutils 'echo invocation' for more

information. See the bash and tcsh man pages for information about the versions of

echo that are built into those shells.

optional You can use echo to create a file by redirecting its output to a file:

$ echo 'My new file.' > myfile

$ cat myfile

My new file.

The greater than (>) sign tells the shell to send the output of echo to the file named

myfile instead of to the screen. For more information refer to “Redirecting Standard

Output” on page 126.

$ ls

memo memo.0714 practice

$ echo Hi

Hi

$ echo This is a sentence.

This is a sentence.

$ echo star: *

star: memo memo.0714 practice

Figure 3-9 echo copies the command line (but not the word echo) to the screen
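Quoting the asterisk prevents the shell from expanding it, so echo receives the character itself. This sketch assumes a scratch directory that holds the same three filenames as Figure 3-9.

```shell
# Recreate the directory contents shown in Figure 3-9.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
touch memo memo.0714 practice

unquoted=$(echo star: *)    # the shell replaces * with the filenames before echo runs
quoted=$(echo 'star: *')    # single quotation marks keep the asterisk literal

echo "$unquoted"
echo "$quoted"
```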

date: Displays the Time and Date

The date utility displays the current date and time:

$ date

Wed Mar 18 17:12:20 PDT 2009

The following example shows how you can choose the format and select the con-

tents of the output of date:

$ date +"%A %B %d"

Wednesday March 18

Refer to page 655 or give the command info coreutils 'date invocation' for more

information.
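A format string can combine any of the fields that date supports with ordinary text. The sketch below uses the %Y, %m, %d, %H, and %M fields to build a numeric date and a 24-hour time; the variable names are illustrative.

```shell
# Capture two differently formatted views of the current moment.
iso_date=$(date +%Y-%m-%d)   # year-month-day, such as 2009-03-18
clock=$(date +%H:%M)         # 24-hour time, such as 17:12

echo "Today is $iso_date; the time is $clock."
```

A date in this year-month-day form sorts correctly as text, which makes it convenient in filenames.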

script: Records a Shell Session

The script utility records all or part of a login session, including your input and the

system’s responses. This utility is useful only from character-based devices, such as

a terminal or a terminal emulator. It does capture a session with vim; however,

because vim uses control characters to position the cursor and display different

typefaces, such as bold, the output will be difficult to read and may not be useful.

When you cat a file that has captured a vim session, the session quickly passes

before your eyes.

By default script captures the session in a file named typescript. To specify a different

filename, follow the script command with a SPACE and the filename. To append to a

file, use the –a option after script but before the filename; otherwise script over-

writes an existing file. Following is a session being recorded by script:

$ script

Script started, file is typescript

$ whoami

sam

$ ls -l /bin | head -5

total 5024

-rwxr-xr-x 1 root root 2928 Sep 21 21:42 archdetect

-rwxr-xr-x 1 root root 1054 Apr 26 15:37 autopartition

-rwxr-xr-x 1 root root 7168 Sep 21 19:18 autopartition-loop

-rwxr-xr-x 1 root root 701008 Aug 27 02:41 bash

$ exit

exit

Script done, file is typescript

Use the exit command to terminate a script session. You can then view the file you

created using cat, less, more, or an editor. Following is the file that was created by

the preceding script command:

$ cat typescript

Script started on Thu Sep 24 20:54:59 2009

$ whoami

sam

$ ls -l /bin | head -5

total 5024

-rwxr-xr-x 1 root root 2928 Sep 21 21:42 archdetect

-rwxr-xr-x 1 root root 1054 Apr 26 15:37 autopartition

-rwxr-xr-x 1 root root 7168 Sep 21 19:18 autopartition-loop

-rwxr-xr-x 1 root root 701008 Aug 27 02:41 bash

$ exit

exit

Script done on Thu Sep 24 20:55:29 2009

If you will be editing the file with vim, emacs, or another editor, you can use

fromdos or dos2unix (both below) to eliminate from the typescript file the ^M

characters that appear at the ends of the lines. Refer to the script man page for

more information.

todos/unix2dos: Converts Linux and Mac OS X

Files to Windows Format

If you want to share a text file you created on a Linux system with someone on a

system running Windows or Mac OS X, you need to convert the file before the per-

son on the other system can read it easily. The todos (to DOS; part of the tofrodos

package) or unix2dos (UNIX to DOS; part of the unix2dos package) utility converts

a Linux text file so it can be read on a Windows or OS X system. Give either of the following

commands to convert a file named memo.txt (created with a text editor) to a DOS-format

file:

$ todos memo.txt

$ unix2dos memo.txt

You can now email the file as an attachment to someone on a Windows or OS X

system. Without any options, todos overwrites the original file. Use the –b (backup)

option to cause todos to make a copy of the file with a .bak filename extension

before modifying it. Use the –n (new) option to cause unix2dos to write the modified

file to a new file as specified by a second argument (unix2dos old new).

fromdos/dos2unix

You can use the fromdos (from DOS; part of the tofrodos package) or dos2unix

(DOS to UNIX; part of the dos2unix package) utility to convert Windows or OS X

files so they can be read on a Linux system. Give either of the following commands:

$ fromdos memo.txt

$ dos2unix memo.txt

See the todos and fromdos or unix2dos and dos2unix man pages for more information.

optional

tr

You can also use tr (translate; page 864) to change a Windows or OS X text file into

a Linux text file. In the following example, the –d (delete) option causes tr to

remove RETURNs (represented by \r) as it makes a copy of the file:

$ cat memo | tr -d '\r' > memo.txt

The greater than (>) symbol redirects the standard output of tr to the file named

memo.txt. For more information refer to “Redirecting Standard Output” on page 126.

Converting a file the other way without using todos or unix2dos is not as easy.
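One possible way to convert in the other direction without todos or unix2dos is to have awk add a RETURN to the end of each line. This is a sketch under that assumption, not the method this book describes elsewhere. The filenames memo.unix, memo.dos, and memo.back are hypothetical.

```shell
# Start with a two-line Linux text file in a scratch directory.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
printf 'line one\nline two\n' > memo.unix

# Linux to DOS: print each line followed by RETURN and NEWLINE.
awk '{ printf "%s\r\n", $0 }' memo.unix > memo.dos

# DOS to Linux: delete every RETURN, as tr does in the example above.
tr -d '\r' < memo.dos > memo.back

unix_size=$(wc -c < memo.unix)
dos_size=$(wc -c < memo.dos)
echo "unix: $unix_size bytes, dos: $dos_size bytes"
```

The DOS-format file is larger by one byte per line, and the round trip yields a file identical to the original.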

Compressing and Archiving Files

Large files use a lot of disk space and take longer than smaller files to transfer from

one system to another over a network. If you do not need to look at the contents of

a large file often, you may want to save it on a CD, DVD, or another medium and

remove it from the hard disk. If you have a continuing need for the file, retrieving a

copy from another medium may be inconvenient. To reduce the amount of disk

space a file occupies without removing the file, you can compress the file without

losing any of the information it holds. Similarly a single archive of several files

packed into a larger file is easier to manipulate, upload, download, and email than

multiple files. You may download compressed, archived files from the Internet. The

utilities described in this section compress and decompress files and pack and

unpack archives.

bzip2: Compresses a File

The bzip2 utility compresses a file by analyzing it and recoding it more efficiently.

The new version of the file looks completely different. In fact, because the new file

contains many nonprinting characters, you cannot view it directly. The bzip2 utility

works particularly well on files that contain a lot of repeated information, such as

text and image data, although most image data is already in a compressed format.

The following example shows a boring file. Each of the 8,000 lines of the letter_e

file contains 72 e’s and a NEWLINE character that marks the end of the line. The file

occupies more than half a megabyte of disk storage.

$ ls -l

-rw-rw-r-- 1 sam sam 584000 Mar 1 22:31 letter_e

The –l (long) option causes ls to display more information about a file. Here it

shows that letter_e is 584,000 bytes long. The –v (verbose) option causes bzip2 to

report how much it was able to reduce the size of the file. In this case, it shrank the

file by 99.99 percent:

$ bzip2 -v letter_e

letter_e: 11680.00:1, 0.001 bits/byte, 99.99% saved, 584000 in, 50 out.

$ ls -l

-rw-rw-r-- 1 sam sam 50 Mar 1 22:31 letter_e.bz2

.bz2 filename extension

Now the file is only 50 bytes long. The bzip2 utility also renamed the file, appending

.bz2 to its name. This naming convention reminds you that the file is compressed;

you would not want to display or print it, for example, without first decompressing

it. The bzip2 utility does not change the modification date associated with the file,

even though it completely changes the file’s contents.

In the following, more realistic example, the file zach.jpg contains a computer

graphics image:

$ ls -l

-rw-r--r-- 1 sam sam 33287 Mar 1 22:40 zach.jpg

The bzip2 utility can reduce the size of the file by only 28 percent because the image

is already in a compressed format:

$ bzip2 -v zach.jpg

zach.jpg: 1.391:1, 5.749 bits/byte, 28.13% saved, 33287 in, 23922 out.

$ ls -l

-rw-r--r-- 1 sam sam 23922 Mar 1 22:40 zach.jpg.bz2

Refer to page 615 or to the bzip2 man page for more information. See also

www.bzip.org, and the Bzip2 mini-HOWTO (see page 38 for instructions on

obtaining this document).

bunzip2 and bzcat: Decompress a File

You can use the bunzip2 utility to restore a file that has been compressed with bzip2:

$ bunzip2 letter_e.bz2

$ ls -l

-rw-rw-r-- 1 sam sam 584000 Mar 1 22:31 letter_e

$ bunzip2 zach.jpg.bz2

$ ls -l

-rw-r--r-- 1 sam sam 33287 Mar 1 22:40 zach.jpg

The bzcat utility displays a file that has been compressed with bzip2. The equivalent

of cat for .bz2 files, bzcat decompresses the compressed data and displays the

decompressed data. Like cat, bzcat does not change the source file. The pipe in the

following example redirects the output of bzcat so instead of being displayed on the

screen it becomes the input to head, which displays the first two lines of the file:

$ bzcat letter_e.bz2 | head -2

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

Keep the original file by using the –k option

tip The bzip2 and bunzip2 utilities remove the original file when they compress or

decompress a file. Use the –k (keep) option to keep the original file.

After you run bzcat, the contents of letter_e.bz2 are unchanged; the file is still stored

on the disk in compressed form.

bzip2recover

The bzip2recover utility supports limited data recovery from media errors. Give the

command bzip2recover followed by the name of the compressed, corrupted file

from which you want to recover data.

gzip: Compresses a File

gunzip and zcat

The gzip (GNU zip) utility is older and less efficient than bzip2. Its flags and operation

are very similar to those of bzip2. A file compressed by gzip is marked by a .gz

filename extension. Linux stores manual pages in gzip format to save disk space;

likewise, files you download from the Internet are frequently in gzip format. Use

gzip, gunzip, and zcat just as you would use bzip2, bunzip2, and bzcat, respectively.

Refer to page 724 or to the gzip man page for more information.
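The following sketch walks through the same compress, view, and decompress cycle with the gzip family. It assumes a scratch directory and rebuilds a file like letter_e using printf, tr, yes, and head. It also assumes the GNU version of zcat; on a system where zcat handles only .Z files, gunzip -c performs the same job.

```shell
# Build 8,000 lines of 72 e's, like the letter_e file described earlier.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
line=$(printf '%072d' 0 | tr 0 e)
yes "$line" | head -n 8000 > letter_e

before=$(wc -c < letter_e)
gzip letter_e                           # replaces letter_e with letter_e.gz
after=$(wc -c < letter_e.gz)

first=$(zcat letter_e.gz | head -n 1)   # view without decompressing the file on disk

gunzip letter_e.gz                      # restores letter_e and removes letter_e.gz
restored=$(wc -c < letter_e)

echo "before: $before, compressed: $after, restored: $restored"
```

The exact compressed size depends on the version of gzip, so no particular value is shown here.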

compress

The compress utility can also compress files, albeit not as well as gzip. This utility

marks a file it has compressed by adding .Z to its name.

tar: Packs and Unpacks Archives

The tar utility performs many functions. Its name is short for tape archive, as its

original function was to create and read archive and backup tapes. Today it is used

to create a single file (called a tar file, archive, or tarball) from multiple files or

directory hierarchies and to extract files from a tar file. The cpio utility (page 644)

performs a similar function.

In the following example, the first ls shows the sizes of the files g, b, and d. Next tar

uses the –c (create), –v (verbose), and –f (write to or read from a file) options to cre-

ate an archive named all.tar from these files. Each line of output displays the name

of the file tar is appending to the archive it is creating.

The tar utility adds overhead when it creates an archive. The next command shows

that the archive file all.tar occupies about 9,700 bytes, whereas the sum of the sizes

of the three files is about 6,000 bytes. This overhead is more appreciable on smaller

files, such as the ones in this example.

$ ls -l g b d

-rw-r--r-- 1 zach other 1178 Aug 20 14:16 b

-rw-r--r-- 1 zach zach 3783 Aug 20 14:17 d

-rw-r--r-- 1 zach zach 1302 Aug 20 14:16 g

$ tar -cvf all.tar g b d

g

b

d

gzip versus zip

tip Do not confuse gzip and gunzip with the zip and unzip utilities. These last two are used to pack

and unpack zip archives containing several files compressed into a single file that has been

imported from or is being exported to a system running Windows. The zip utility constructs a zip

archive, whereas unzip unpacks zip archives. The zip and unzip utilities are compatible with

PKZIP, a Windows program that compresses and archives files.

$ ls -l all.tar

-rw-r--r-- 1 zach zach 9728 Aug 20 14:17 all.tar

$ tar -tvf all.tar

-rw-r--r-- zach/zach 1302 2009-08-20 14:16 g

-rw-r--r-- zach/other 1178 2009-08-20 14:16 b

-rw-r--r-- zach/zach 3783 2009-08-20 14:17 d

The final command in the preceding example uses the –t option to display a table of

contents for the archive. Use –x instead of –t to extract files from a tar archive. Omit

the –v option if you want tar to do its work silently (see footnote 2).
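The create, list, and extract steps fit together as shown in the following sketch. It assumes three small throwaway files named g, b, and d in a scratch directory, plus a hypothetical directory named unpacked that receives the extracted copies. The -v option is omitted, so tar works silently.

```shell
# Three small files to archive.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
echo 'file g' > g
echo 'file b' > b
echo 'file d' > d

tar -cf all.tar g b d          # create the archive
contents=$(tar -tf all.tar)    # table of contents, one name per line
echo "$contents"

# Extract into a separate directory so the originals are not overwritten.
mkdir unpacked
(cd unpacked && tar -xf ../all.tar)
ls unpacked
```

Extracting in the same directory would overwrite g, b, and d with the copies stored in the archive.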

Compressed tar files

You can use bzip2, compress, or gzip to compress tar files, making them easier to

store and handle. Many files you download from the Internet will already be in one

of these formats. Files that have been processed by tar and compressed by bzip2 frequently

have a filename extension of .tar.bz2 or .tbz. Those processed by tar and gzip

have an extension of .tar.gz, .tgz, or .gz, whereas files processed by tar and compress

use .tar.Z as the extension.

You can unpack a tarred and gzipped file in two steps. (Follow the same procedure if

the file was compressed by bzip2, but use bunzip2 instead of gunzip.) The next exam-

ple shows how to unpack the GNU make utility after it has been downloaded

(ftp.gnu.org/pub/gnu/make/make-3.81.tar.gz):

$ ls -l mak*

-rw-r--r-- 1 sam sam 1564560 Mar 26 18:25 make-3.81.tar.gz

$ gunzip mak*

$ ls -l mak*

-rw-r--r-- 1 sam sam 6072320 Mar 26 18:25 make-3.81.tar

$ tar -xvf mak*

make-3.81/

make-3.81/config/

make-3.81/config/dospaths.m4

make-3.81/config/gettext.m4

make-3.81/config/iconv.m4

...

make-3.81/tests/run_make_tests

make-3.81/tests/run_make_tests.pl

make-3.81/tests/test_driver.pl

The first command lists the downloaded tarred and gzipped file: make-3.81.tar.gz
(about 1.5 megabytes). The asterisk (*) in the filename matches any characters in
any filenames (page 138), so ls displays a list of files whose names begin with mak;
in this case there is only one. Using an asterisk saves typing and can improve
accuracy with long filenames. The gunzip command decompresses the file and yields
make-3.81.tar (no .gz extension), which is about 6 megabytes. The tar command
creates the make-3.81 directory in the working directory and unpacks the files into it.

2. Although the original UNIX tar did not use a leading hyphen to indicate an option on the command
line, the GNU/Linux version accepts hyphens, but works as well without them. This book precedes tar
options with a hyphen for consistency with most other utilities.

$ ls -ld mak*

drwxr-xr-x 8 sam sam 4096 Mar 31 2006 make-3.81

-rw-r--r-- 1 sam sam 6072320 Mar 26 18:25 make-3.81.tar

$ ls -l make-3.81

total 2100

-rw-r--r-- 1 sam sam 53838 Mar 31 2006 ABOUT-NLS

-rw-r--r-- 1 sam sam 2810 Mar 19 2006 AUTHORS

-rw-r--r-- 1 sam sam 18043 Dec 10 1996 COPYING

-rw-r--r-- 1 sam sam 106390 Mar 31 2006 ChangeLog

...

-rw-r--r-- 1 sam sam 16907 Feb 11 2006 vmsjobs.c

-rw-r--r-- 1 sam sam 17397 Feb 11 2006 vpath.c

drwxr-xr-x 6 sam sam 4096 Mar 31 2006 w32

After tar extracts the files from the archive, the working directory contains two files

whose names start with mak: make-3.81.tar and make-3.81. The –d (directory)

option causes ls to display only file and directory names, not the contents of directo-

ries as it normally does. The final ls command shows the files and directories in the

make-3.81 directory. Refer to page 846 or to the tar man page for more information.

optional You can combine the gunzip and tar commands on one command line with a pipe

(|), which redirects the output of gunzip so that it becomes the input to tar:

$ gunzip -c make-3.81.tar.gz | tar -xvf -

The –c option causes gunzip to send its output through the pipe instead of creating a

file. The final hyphen (–) causes tar to read from standard input. Refer to “Pipes”

(page 131), gzip (pages 62 and 724), and tar (page 846) for more information about

how this command line works.
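The same pipe technique works in both directions. The following sketch, which uses a hypothetical src directory and scratch location, creates a compressed archive by piping tar into gzip and then unpacks it by piping gunzip into tar. In each tar command the hyphen after –f stands for standard output (when creating) or standard input (when extracting).

```shell
tmp=$(mktemp -d)
cd "$tmp"

# A tiny directory tree to archive (names are made up for this example).
mkdir src
printf 'data\n' > src/file.txt

# Create: tar writes the archive to standard output, gzip compresses it.
tar -cf - src | gzip > src.tar.gz

# Extract: gunzip -c decompresses to standard output, tar reads standard input.
mkdir out
cd out
gunzip -c ../src.tar.gz | tar -xf -
cd ..
```

No uncompressed .tar file is ever written to disk, which can matter when the archive is large.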

A simpler solution is to use the –z option to tar. This option causes tar to call gunzip
(or gzip when you are creating an archive) directly and simplifies the preceding
command line to

$ tar -xvzf make-3.81.tar.gz

In a similar manner, the –j option calls bzip2 or bunzip2.

tar: the –x option may extract a lot of files

caution Some tar archives contain many files. To list the files in the archive without unpacking them, run
tar with the –t option and the name of the tar file. In some cases you may want to create a new
directory (mkdir [page 86]), move the tar file into that directory, and expand it there. That way
the unpacked files will not mingle with existing files, and no confusion will occur. This strategy
also makes it easier to delete the extracted files. Depending on how they were created, some tar
files automatically create a new directory and put the files into it; the –t option indicates where tar
will place the files you extract.

tar: the –x option can overwrite files

caution The –x option to tar overwrites a file that has the same filename as a file you are extracting. Follow
the suggestion in the preceding caution box to avoid overwriting files.
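One way to follow the advice about listing an archive first and unpacking it in a directory of its own is sketched below. The project archive and the unpack directory are hypothetical names. Instead of moving the tar file into the new directory, this variation uses the –C option (supported by GNU tar and BSD tar) to tell tar where to place the extracted files.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Build a small gzip-compressed archive to practice on.
mkdir project
printf 'hello\n' > project/notes.txt
tar -czf project.tar.gz project

# Step 1: list the archive without unpacking it.
contents=$(tar -tzf project.tar.gz)

# Step 2: unpack inside a new, empty directory so nothing is overwritten.
mkdir unpack
tar -xzf project.tar.gz -C unpack
```

Because everything lands under unpack, removing that one directory removes all of the extracted files.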

Locating Commands

The whereis and slocate utilities can help you find a command whose name you have

forgotten or whose location you do not know. When multiple copies of a utility or

program are present, which tells you which copy you will run. The slocate utility

searches for files on the local system.

which and whereis: Locate a Utility

Search path When you give Linux a command, the shell searches a list of directories for a pro-

gram with that name and runs the first one it finds. This list of directories is called a

search path. For information on how to change the search path see page 298. If you

do not change the search path, the shell searches only a standard set of directories

and then stops searching. However, other directories on the system may also con-

tain useful utilities.

which The which utility locates utilities by displaying the full pathname of the file for the

utility. (Chapter 4 contains more information on pathnames and the structure of the

Linux filesystem.) The local system may include several utilities that have the same

name. When you type the name of a utility, the shell searches for the utility in your

search path and runs the first one it finds. You can find out which copy of the utility

the shell will run by using which. In the following example, which reports the loca-

tion of the tar utility:

$ which tar

/bin/tar

The which utility can be helpful when a utility seems to be working in unexpected

ways. By running which, you may discover that you are running a nonstandard

version of a tool or a different one from the one you expected. (“Important Stan-

dard Directories and Files” on page 91 provides a list of standard locations for

executable files.) For example, if tar is not working properly and you find that you

are running /usr/local/bin/tar instead of /bin/tar, you might suspect that the local

version is broken.
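To see how the search path determines which copy runs, it can help to walk the path by hand. The function below is a minimal sketch, not part of any standard utility; the name find_in_path is made up for this example. It prints every executable file with the given name, in search-path order, so the first line corresponds to the copy which reports.

```shell
# find_in_path NAME: print each executable named NAME found in the
# directories listed in PATH, in the order the shell searches them.
find_in_path() {
    local name=$1 dir
    local IFS=:                      # PATH entries are separated by colons
    for dir in $PATH; do
        [ -z "$dir" ] && dir=.       # an empty entry means the working directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}
```

Giving the command find_in_path tar on a system with both /usr/local/bin/tar and /bin/tar would list both, making a shadowed copy easy to spot.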

whereis The whereis utility searches for files related to a utility by looking in standard loca-

tions instead of using your search path. For example, you can find the locations for

files related to tar:

$ whereis tar

tar: /bin/tar /usr/include/tar.h /usr/share/man/man1/tar.1.gz

In this example whereis finds three references to tar: the tar utility file, a tar header

file, and the tar man page.


slocate/locate: Searches for a File

The slocate (secure locate) or locate utility searches for files on the local system:

$ slocate motd

/usr/share/app-install/icons/xmotd.xpm

/usr/share/app-install/desktop/motd-editor.desktop

/usr/share/app-install/desktop/xmotd.desktop

/usr/share/base-files/motd.md5sums

/usr/share/base-files/motd

...

You may need to install slocate or locate; see Appendix C. Before you can use slocate

or locate the updatedb utility must build or update the slocate database. Typically the

database is updated once a day by a cron script (see page 649 for information on

crontab and cron).

which versus whereis

tip Given the name of a utility, which looks through the directories in your search path (page 297),

in order, and locates the utility. If your search path includes more than one utility with the specified

name, which displays the name of only the first one (the one you would run).

The whereis utility looks through a list of standard directories and works independently of your

search path. Use whereis to locate a binary (executable) file, any manual pages, and source code

for a program you specify; whereis displays all the files it finds.

which, whereis, and builtin commands

caution Both the which and whereis utilities report only the names for utilities as they are found on the

disk; they do not report shell builtins (utilities that are built into a shell; see page 141). When you

use whereis to try to find where the echo command (which exists as both a utility program and

a shell builtin) is kept, you get the following result:

$ whereis echo

echo: /bin/echo /usr/share/man/man1/echo.1.gz

The whereis utility does not display the echo builtin. Even the which utility reports the wrong

information:

$ which echo

/bin/echo

Under bash you can use the type builtin (page 447) to determine whether a command is a builtin:

$ type echo

echo is a shell builtin
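In a script that must also run under shells other than bash, the POSIX command –v builtin offers a rough substitute for type. The sketch below is an illustration only, and the function name how_run is hypothetical. It relies on command –v printing a full pathname for a program found on disk and a bare name (or an alias definition) for anything the shell handles itself.

```shell
# how_run NAME: say whether NAME would run from a file on disk or be
# handled by the shell itself (builtin, function, keyword, or alias).
how_run() {
    resolved=$(command -v "$1") || { printf '%s: not found\n' "$1"; return 1; }
    case $resolved in
        /*) printf '%s: file %s\n' "$1" "$resolved" ;;
        *)  printf '%s: handled by the shell itself\n' "$1" ;;
    esac
}
```

Unlike which, this check is made by the shell that will run the command, so it takes builtins into account.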

If you are not on a network, skip the rest of this chapter

tip If you are the only user on a system that is not connected to a network, you may want to skip the

rest of this chapter. If you are not on a network but are set up to send and receive email, read

“Email” on page 72.


Obtaining User and System Information

This section covers utilities that provide information about who is using the system,

what those users are doing, and how the system is running.

To find out who is using the local system, you can employ one of several utilities

that vary in the details they provide and the options they support. The oldest utility,

who, produces a list of users who are logged in on the local system, the device each

person is using, and the time each person logged in.

The w and finger utilities show more detail, such as each user’s full name and the

command line each user is running. You can use the finger utility to retrieve infor-

mation about users on remote systems if the local system is attached to a network.

Table 3-1 on page 70 summarizes the output of these utilities.

who: Lists Users on the System

The who utility displays a list of users who are logged in on the local system. In

Figure 3-10 the first column of the who output shows that Sam, Max, and Zach are logged

in. (Max is logged in from two locations.) The second column shows the device that

each user’s terminal, workstation, or terminal emulator is connected to. The third col-

umn shows the date and time the user logged in. An optional fourth column shows (in

parentheses) the name of the system that a remote user logged in from.

The information that who displays is useful when you want to communicate with a

user on the local system. When the user is logged in, you can use write (page 70) to

establish communication immediately. If who does not list the user or if you do not

need to communicate immediately, you can send email to that person (page 72).

If the output of who scrolls off the screen, you can redirect the output through a

pipe (|, page 56) so that it becomes the input to less, which displays the output one

screen at a time. You can also use a pipe to redirect the output through grep to look

for a specific name.
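A pipe such as who | grep max matches max anywhere on a line, so it would also report a user named maxine. The following sketch shows one way to match the username column exactly. It uses awk, which this chapter has not introduced, and the function name logins_for is made up for the example.

```shell
# logins_for USER: read who-style lines on standard input and print the
# device (second column) of each session that belongs to exactly USER.
logins_for() {
    awk -v user="$1" '$1 == user { print $2 }'
}

# Typical use on a live system:
#   who | logins_for max
```

The function reads standard input, so it can be tested on saved who output as well as on the live command.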

If you need to find out which terminal you are using or what time you logged in,

you can use the command who am i:

$ who am i

max tty2 2009-07-25 16:42

$ who

sam tty4 2009-07-25 17:18

max tty2 2009-07-25 16:42

zach tty1 2009-07-25 16:39

max pts/4 2009-07-25 17:27 (coffee)

Figure 3-10 who lists who is logged in


finger: Lists Users on the System

You can use finger to display a list of users who are logged in on the local system. In

addition to usernames, finger supplies each user’s full name along with information

about which device the user’s terminal is connected to, how recently the user typed

something on the keyboard, when the user logged in, and available contact informa-

tion. If the user has logged in over the network, the name of the remote system is

shown as the user’s location. For example, in Figure 3-11 Max is logged in from the

remote system named coffee. The asterisks (*) in front of the device names in the

Tty column indicate that the user has blocked messages sent directly to his terminal

(refer to “mesg: Denies or Accepts Messages” on page 71).

You can also use finger to learn more about an individual by specifying a username on

the command line. In Figure 3-12, finger displays detailed information about Max.

Max is logged in and actively using one of his terminals (tty2); he has not used his

other terminal (pts/4) for 3 minutes and 7 seconds. You also learn from finger that if

you want to set up a meeting with Max, you should contact Zach at extension 1693.

.plan and .project Most of the information in Figure 3-12 was collected by finger from system files.

The information shown after the heading Plan:, however, was supplied by Max. The

finger utility searched for a file named .plan in Max’s home directory and displayed

its contents.

$ finger

max Max Wild *tty2 Jul 25 16:42

max Max Wild pts/4 3 Jul 25 17:27 (coffee)

sam Sam the Great *tty4 29 Jul 25 17:18

zach Zach Brill *tty1 1:07 Jul 25 16:39

Figure 3-11 finger I: lists who is logged in

finger can be a security risk

security On systems where security is a concern, the system administrator may disable finger. This utility

can reveal information that can help a malicious user break into a system.

Mac OS X disables remote finger support by default.

$ finger max

Directory: /home/max Shell: /bin/tcsh

On since Fri Jul 25 16:42 (PDT) on tty2 (messages off)

On since Fri Jul 25 17:27 (PDT) on pts/4 from coffee

3 minutes 7 seconds idle

New mail received Sat Jul 25 17:16 2009 (PDT)

Unread since Sat Jul 25 16:44 2009 (PDT)

Plan:

I will be at a conference in Hawaii all next week.

If you need to see me, contact Zach Brill, x1693.

Figure 3-12 finger II: lists details about one user


(Filenames that begin with a period, such as .plan, are not normally listed by ls and

are called hidden filenames [page 82].) You may find it helpful to create a .plan file

for yourself; it can contain any information you choose, such as your schedule,

interests, phone number, or address. In a similar manner, finger displays the contents

of the .project and .pgpkey files in your home directory. If Max had not been logged

in, finger would have reported only his user information, the last time he logged in,

the last time he read his email, and his plan.

You can also use finger to display a user’s username. For example, on a system with

a user named Helen Simpson, you might know that Helen’s last name is Simpson

but might not guess her username is hls. The finger utility, which is not case sensi-

tive, can search for information on Helen using her first or last name. The following

commands find the information you seek as well as information on other users

whose names are Helen or Simpson:

$ finger HELEN

...

$ finger simpson

...

See page 695 for more information about finger.

w: Lists Users on the System

The w utility displays a list of the users who are logged in on the local system. As

discussed in the section on who, the information that w displays is useful when you

want to communicate with someone at your installation.

The first column in Figure 3-13 shows that Max, Zach, and Sam are logged in. The

second column shows the name of the device file each user’s terminal is connected to.

The third column shows the system that a remote user is logged in from. The fourth

column shows the time each user logged in. The fifth column indicates how long

each user has been idle (how much time has elapsed since the user pressed a key on

the keyboard). The next two columns identify how much computer processor time

each user has used during this login session and on the task that user is running. The

last column shows the command each user is running.

The first line the w utility displays includes the time of day, the period of time the
computer has been running (in days, hours, and minutes), the number of users
logged in, and the load average (how busy the system is). The three load average
numbers represent the number of jobs waiting to run, averaged over the past 1, 5,
and 15 minutes. Use the uptime utility to display just this line. Table 3-1 compares
the w, who, and finger utilities.

$ w
17:47:35 up 1 day, 8:10, 4 users, load average: 0.34, 0.23, 0.26
USER     TTY      FROM     LOGIN@   IDLE    JCPU   PCPU   WHAT
sam      tty4     -        17:18    29:14m  0.20s  0.00s  vi memo
max      tty2     -        16:42    0.00s   0.20s  0.07s  w
zach     tty1     -        16:39    1:07    0.05s  0.00s  run_bdgt
max      pts/4    coffee   17:27    3:10m   0.24s  0.24s  -bash

Figure 3-13 The w utility

Communicating with Other Users

You can use the utilities discussed in this section to exchange messages and files

with other users either interactively or through email.

write: Sends a Message

The write utility sends a message to another user who is logged in. When you and

another user use write to send messages to each other, you establish two-way commu-

nication. Initially a write command displays a banner on the other user’s terminal,

saying that you are about to send a message (Figure 3-14).

The syntax of a write command line is

write username [terminal]

Table 3-1 Comparison of w, who, and finger

Information displayed                               w    who  finger
Username                                            x    x    x
Terminal-line identification (tty)                  x    x    x
Idle time                                           x         x
Program the user is executing                       x
Location the user logged in from                              x
CPU time used                                       x
Full name (or other information from /etc/passwd)             x
User-supplied vanity information                              x
System uptime and load average                      x

$ write max

Hi Max, are you there? o

Figure 3-14 The write utility I


The username is the username of the user you want to communicate with. The

terminal is an optional device name that is useful if the user is logged in more

than once. You can display the usernames and device names of all users who are

logged in on the local system by using who, w, or finger.

To establish two-way communication with another user, you and the other user

must each execute write, specifying the other’s username as the username. The write

utility then copies text, line by line, from one keyboard/display to the other

(Figure 3-15). Sometimes it helps to establish a convention, such as typing o (for

“over”) when you are ready for the other person to type and typing oo (for “over

and out”) when you are ready to end the conversation. When you want to stop

communicating with the other user, press CONTROL-D at the beginning of a line. Press-

ing CONTROL-D tells write to quit, displays EOF (end of file) on the other user’s terminal,

and returns you to the shell. The other user must do the same.

If the Message from banner appears on your screen and obscures something you are

working on, press CONTROL-L or CONTROL-R to refresh the screen and remove the banner.

Then you can clean up, exit from your work, and respond to the person who is

writing to you. You have to remember who is writing to you, however, because the

banner will no longer appear on the screen.

mesg: Denies or Accepts Messages

By default, messages to your screen are blocked. Give the following mesg command

to allow other users to send you messages:

$ mesg y

If Max had not given this command before Zach tried to send him a message, Zach

might have seen the following message:

$ write max

write: max has messages disabled

You can block messages by entering mesg n. Give the command mesg by itself to dis-

play is y (for “yes, messages are allowed”) or is n (for “no, messages are not allowed”).

If you have messages blocked and you write to another user, write displays the fol-

lowing message because, even if you are allowed to write to another user, the user

will not be able to respond to you:

$ write max

write: write: you have write permission turned off.

$ write max

Hi Max, are you there? o

Message from max@bravo.example.com on pts/0 at 16:23 ...

Yes Zach, I'm here. o

Figure 3-15 The write utility II

Email

Email enables you to communicate with users on the local system and, if the instal-

lation is part of a network, with other users on the network. If you are connected to

the Internet, you can communicate electronically with users around the world.

Email utilities differ from write in that email utilities can send a message when the

recipient is not logged in. In this case the email is stored until the recipient reads it.

These utilities can also send the same message to more than one user at a time.

Many email programs are available for Linux, including the original character-based

mailx program, Mozilla/Thunderbird, pine, mail through emacs, KMail, and evolution.

Another popular graphical email program is sylpheed (sylpheed.sraoss.jp/en).

Two programs are available that can make any email program easier to use and

more secure. The procmail program (www.procmail.org) creates and maintains

email servers and mailing lists; preprocesses email by sorting it into appropriate files

and directories; starts various programs depending on the characteristics of incoming
email; forwards email; and so on. The GNU Privacy Guard (GPG or GnuPG;

www.gnupg.org) encrypts and decrypts email and makes it almost impossible for an

unauthorized person to read.

Network addresses If the local system is part of a LAN, you can generally send email to and receive

email from users on other systems on the LAN by using their usernames. Someone

sending Max email on the Internet would need to specify his domain name

(page 952) along with his username. Use this address to send email to the author of

this book: mgs@sobell.com.

Chapter Summary

The utilities introduced in this chapter are a small but powerful subset of the many

utilities available on a typical Linux system. Because you will use them frequently

and because they are integral to the following chapters, it is important that you

become comfortable using them.

The utilities listed in Table 3-2 manipulate, display, compare, and print files.

Table 3-2 File utilities

Utility   Function
cp        Copies one or more files (page 49)
diff      Displays the differences between two files (page 54)
file      Displays information about the contents of a file (page 56)
grep      Searches file(s) for a string (page 52)
head      Displays the lines at the beginning of a file (page 52)
lpq       Displays a list of jobs in the print queue (page 51)
lpr       Places file(s) in the print queue (page 51)
lprm      Removes a job from the print queue (page 51)
mv        Renames a file or moves file(s) to another directory (page 50)
sort      Puts a file in order by lines (page 54)
tail      Displays the lines at the end of a file (page 53)
uniq      Displays the contents of a file, skipping adjacent duplicate lines (page 54)

To reduce the amount of disk space a file occupies, you can compress it with the bzip2
utility. Compression works especially well on files that contain patterns, as do most
text files, but reduces the size of almost all files. The inverse of bzip2, bunzip2,
restores a file to its original, decompressed form. Table 3-3 lists utilities that compress
and decompress files. The bzip2 utility is the most efficient of these.

Table 3-3 (De)compression utilities

Utility    Function
bunzip2    Returns a file compressed with bzip2 to its original size and format (page 61)
bzcat      Displays a file compressed with bzip2 (page 61)
bzip2      Compresses a file (page 60)
compress   Compresses a file (not as well as bzip2 or gzip; page 62)
gunzip     Returns a file compressed with gzip or compress to its original size and format (page 62)
gzip       Compresses a file (not as well as bzip2; page 62)
zcat       Displays a file compressed with gzip (page 62)

An archive is a file, frequently compressed, that contains a group of files. The tar
utility (Table 3-4) packs and unpacks archives. The filename extensions .tar.bz2,
.tar.gz, and .tgz identify compressed tar archive files and are often seen on software
packages obtained over the Internet.

Table 3-4 Archive utility

Utility   Function
tar       Creates or extracts files from an archive file (page 62)


The utilities listed in Table 3-5 determine the location of a utility on the local sys-

tem. For example, they can display the pathname of a utility or a list of C++ compil-

ers available on the local system.

Table 3-6 lists utilities that display information about other users. You can easily

learn a user’s full name, the user’s login status, the login shell of the user, and other

items of information maintained by the system.

The utilities shown in Table 3-7 can help you stay in touch with other users on the

local network.

Table 3-8 lists miscellaneous utilities.

Table 3-5 Location utilities

Utility Function

locate Searches for files on the local system (page 66)

whereis Displays the full pathnames of a utility, source code, or man page (page 65)

which Displays the full pathname of a command you can run (page 65)

Table 3-6 User and system information utilities

Utility Function

finger Displays detailed information about users, including their full names (page 68)

hostname Displays the name of the local system (page 49)

w Displays detailed information about users who are logged in on the local system (page 69)

who Displays information about users who are logged in on the local system (page 67)

Table 3-7 User communication utilities

Utility Function

mesg Permits or denies messages sent by write (page 71)

write Sends a message to another user who is logged in (page 70)

Table 3-8 Miscellaneous utilities

Utility Function

date Displays the current date and time (page 58)

echo Copies its arguments (page 941) to the screen (page 57)


Exercises

1. Which commands can you use to determine who is logged in on a specific

terminal?

2. How can you keep other users from using write to communicate with you?

Why would you want to?

3. What happens when you give the following commands if the file named

done already exists?

$ cp to_do done

$ mv to_do done

4. How can you display the following line on the screen you are working on?

Hi $2, I’m number * (guess)!

5. How can you find the phone number for Ace Electronics in a file named

phone that contains a list of names and phone numbers? Which command

can you use to display the entire file in alphabetical order? How can you

display the file without any adjacent duplicate lines? How can you display

the file without any duplicate lines?

6. What happens when you use diff to compare two binary files that are not

identical? (You can use gzip to create the binary files.) Explain why the diff

output for binary files is different from the diff output for ASCII files.

7. Create a .plan file in your home directory. Does finger display the contents

of your .plan file?

8. What is the result of giving the which utility the name of a command that

resides in a directory that is not in your search path?

9. Are any of the utilities discussed in this chapter located in more than one

directory on the local system? If so, which ones?

10. Experiment by calling the file utility with the names of files in /usr/bin.

How many different types of files are there?

11. Which command can you use to look at the first few lines of a file named

status.report? Which command can you use to look at the end of the file?

Advanced Exercises

12. Re-create the colors.1 and colors.2 files used in Figure 3-8 on page 55.

Test your files by running diff –u on them. Do you get the same results as

in the figure?


13. Try giving these two commands:

$ echo cat

$ cat echo

Explain the differences between the output of each command.

14. Repeat exercise 5 using the file phone.gz, a compressed version of the list

of names and phone numbers. Consider more than one approach to

answer each question, and explain how you made your choices.

15. Find existing files or create files that

a. gzip compresses by more than 80 percent.

b. gzip compresses by less than 10 percent.

c. Get larger when compressed with gzip.

d. Use ls –l to determine the sizes of the files in question. Can you charac-

terize the files in a, b, and c?

16. Older email programs were not able to handle binary files. Suppose that

you are emailing a file that has been compressed with gzip, which produces

a binary file, and the recipient is using an old email program. Refer to the

man page on uuencode, which converts a binary file to ASCII. Learn about

the utility and how to use it.

a. Convert a compressed file to ASCII using uuencode. Is the encoded file

larger or smaller than the compressed file? Explain. (If uuencode is not

on the local system, you can install it using yum or aptitude [both in

Appendix C]; it is part of the sharutils package.)

b. Would it ever make sense to use uuencode on a file before compressing

it? Explain.


A filesystem is a set of data structures (page 950) that usually

resides on part of a disk and that holds directories of files. Filesys-

tems store user and system data that are the basis of users’ work

on the system and the system’s existence. This chapter discusses

the organization and terminology of the Linux filesystem, defines

ordinary and directory files, and explains the rules for naming

them. It also shows how to create and delete directories, move

through the filesystem, and use absolute and relative pathnames

to access files in various directories. It includes a discussion of

important files and directories as well as file access permissions

and Access Control Lists (ACLs), which allow you to share

selected files with other users. It concludes with a discussion of

hard and symbolic links, which can make a single file appear in

more than one directory.

In addition to reading this chapter, you may want to refer to the

df, fsck, mkfs, and tune2fs utilities in Part V for more information

on filesystems. If you are running Mac OS X, see “Filesystems”

on page 927.

In This Chapter

The Hierarchical Filesystem . . . . . . . . 78
Directory Files and Ordinary Files . . . . 84
The Working Directory . . . . . . . . . . . 82
Your Home Directory . . . . . . . . . . . . 82
Pathnames . . . . . . . . . . . . . . . . . 83
Relative Pathnames . . . . . . . . . . . . . 84
Working with Directories . . . . . . . . . . 85
Access Permissions . . . . . . . . . . . . . 93
ACLs: Access Control Lists . . . . . . . . . 99
Hard Links . . . . . . . . . . . . . . . . . 106
Symbolic Links . . . . . . . . . . . . . . . 108

The Filesystem

78 Chapter 4 The Filesystem

The Hierarchical Filesystem

Family tree A hierarchical structure (page 957) frequently takes the shape of a pyramid. One

example of this type of structure is found by tracing a family’s lineage: A couple has

a child, who may in turn have several children, each of whom may have more chil-

dren. This hierarchical structure is called a family tree (Figure 4-1).

Directory tree Like the family tree it resembles, the Linux filesystem is called a tree. It consists of a

set of connected files. This structure allows you to organize files so you can easily

find any particular one. On a standard Linux system, each user starts with one

directory, to which the user can add subdirectories to any desired level. By creating

multiple levels of subdirectories, a user can expand the structure as needed.

Subdirectories Typically each subdirectory is dedicated to a single subject, such as a person,

project, or event. The subject dictates whether a subdirectory should be subdivided

further. For example, Figure 4-2 shows a secretary’s subdirectory named correspond.
This directory contains three subdirectories: business, memos, and personal.

The business directory contains files that store each letter the secretary types. If you

expect many letters to go to one client, as is the case with milk_co, you can dedicate

a subdirectory to that client.
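A directory layout like the secretary’s can be created with a few mkdir commands. The sketch below builds it under a scratch directory; the exact placement of milk_co and cheese_co beneath business is an assumption based on Figure 4-2, and the –p option (which creates any missing parent directories) is used for brevity.

```shell
# Build the tree under a scratch directory rather than a real home directory.
top=$(mktemp -d)

# -p creates missing parent directories and accepts ones that already exist.
mkdir -p "$top/correspond/business/milk_co" \
         "$top/correspond/business/cheese_co" \
         "$top/correspond/memos" \
         "$top/correspond/personal"

# Record the resulting hierarchy, one directory per line.
tree_listing=$(cd "$top" && find correspond -type d | sort)
```

Displaying the tree_listing variable with echo shows each directory with its full path below correspond, which makes the parent and child relationships easy to see.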

One major strength of the Linux filesystem is its ability to adapt to users’ needs.

You can take advantage of this strength by strategically organizing your files so they

are most convenient and useful for you.

Directory Files and Ordinary Files

Figure 4-1 A family tree (diagram with nodes Grandparent; Aunt, Mom, Uncle; Sister, Brother, Self; Daughter 1, Daughter 2; Grandchild 1, Grandchild 2)

Like a family tree, the tree representing the filesystem is usually pictured upside
down, with its root at the top. Figures 4-2 and 4-3 show that the tree “grows”
downward from the root, with paths connecting the root to each of the other files. At
the end of each path is either an ordinary file or a directory file. Special files, which
can also appear at the ends of paths, provide access to operating system features.

Ordinary files, or simply files, appear at the ends of paths that cannot support other

paths. Directory files, also referred to as directories or folders, are the points that

other paths can branch off from. (Figures 4-2 and 4-3 show some empty directo-

ries.) When you refer to the tree, up is toward the root and down is away from the

root. Directories directly connected by a path are called parents (closer to the root)

and children (farther from the root). A pathname is a series of names that trace a

path along branches from one file to another. See page 83 for more information

about pathnames.
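
As a small illustration of parents and children, a pathname can be split into the directory that holds a file and the file's own name. The pathname below is hypothetical, chosen only to match the secretary example; dirname and basename are standard utilities.

```shell
# Hypothetical pathname; it does not need to exist for this example.
path=/home/max/correspond/business/milk_co/letter_1

parent=$(dirname "$path")    # the parent directory, one step up toward the root
name=$(basename "$path")     # the last name on the path

echo "$parent"
echo "$name"
```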

Filenames

Every file has a filename. The maximum length of a filename varies with the type of

filesystem; Linux supports several types of filesystems. Although most of today’s

filesystems allow files with names up to 255 characters long, some filesystems

Figure 4-2 A secretary’s directories (labels: correspond; personal, memos, business; milk_co, cheese_co; letter_1, letter_2)

Figure 4-3 Directories and ordinary files (labels: Directory, Ordinary File)

more questions.

Table 7-10 Visiting files

Command    Result

CONTROL-X CONTROL-F    Prompts for a filename and reads its contents into a new buffer. Assigns the file’s simple filename as the buffer name. Other buffers are unaffected. It is common practice and often useful to have several files open simultaneously for editing.

CONTROL-X CONTROL-V    Prompts for a filename and replaces the current buffer with a buffer containing the contents of the requested file. The current buffer is destroyed.

Advanced Editing 233

To create a new file, simply call it up. An empty buffer is created and properly named so

you can eventually save it. The message (New File) appears in the Echo Area, reflecting

emacs’s understanding of the situation. If this new file grew out of a typographical

error, you can give the command CONTROL-X CONTROL-V and enter the correct name.

Pathname Completion

When you are prompted for the pathname of a file in the Minibuffer, you can type the

pathname followed by a RETURN. Alternatively, you can use pathname completion, which

is similar to bash filename completion (page 320), to help you enter a pathname.

While you are entering a pathname in the Minibuffer, press TAB and emacs will com-

plete the pathname as far as possible. If the completed pathname is satisfactory, press

RETURN. In some cases, emacs cannot complete a pathname. For example, a directory

in the pathname you entered may not exist or you may not have permission to read

it. If emacs cannot complete a pathname, it displays a message in the Echo Area. If

the characters following the rightmost slash (/) in the pathname you are typing

match more than one filename, when you press TAB emacs displays [Complete, but

not unique]. If you press TAB a second time, emacs opens a Pathname Completion List

window that displays a list of possible completions (Figure 7-10). You can open this

window manually while you are entering a pathname by typing a question mark (?).

Table 7-10 Visiting files (continued)

Command    Result

CONTROL-X 4 CONTROL-F    Prompts for a filename and reads its contents into a new buffer. Assigns the file’s simple filename as the buffer name. Creates a new window for this buffer and selects that window. The window selected before the command still displays the buffer it was showing before this operation, although the new window may cover up part of the old window.

Figure 7-10 A Pathname Completion List window

234 Chapter 7 The emacs Editor

With the Pathname Completion List window open, you can

• Cancel the selection by pressing CONTROL-G or ESCAPE ESCAPE ESCAPE. The display

returns to the state it was in before you opened the Pathname Completion

List window.

• Type more characters in the Minibuffer to finish the pathname. Press RETURN

to select the pathname; emacs closes the completion window.

• Type more characters in the Minibuffer to make the completion unambig-

uous and press TAB again.

• Press META-v or PAGE UP to move the cursor into the Pathname Completion List

window. Use the ARROW keys to move the cursor between selections. Press RETURN

to choose the selection the cursor is on. You can press CONTROL-G or ESCAPE ESCAPE

ESCAPE to back out of this window and return the cursor to the Minibuffer.

When you press RETURN, emacs closes the Pathname Completion List window, adds

the filename you selected to the end of the pathname you were typing, and moves

the cursor to the end of the pathname you were typing in the Minibuffer. You can

continue typing in the Minibuffer and perform more completions before you press

RETURN to accept the pathname. More information is available from the Completion

and Completion Commands menus of the Minibuffer menu of the emacs online

manual (see the tip on page 210).

Saving Files

You save a buffer by copying its contents back to the original file you called up.

Table 7-11 lists the relevant commands.

You can exit without first getting a warning

caution Clearing the modified flag (META-~) allows you to exit without saving a modified buffer with no

warning. Make sure you know what you are doing when you use META-~.

Did you modify a buffer by mistake?

caution When you give a CONTROL-X s command, you may discover files whose buffers were modified by

mistake as emacs tries to save the wrong changes back to the file. When emacs prompts you

to confirm the save, do not answer y if you are not sure. First exit from the CONTROL-X s dialog by

typing n to any saves you are not sure about. You then have several options:

• Save the suspicious buffer to a temporary file with CONTROL-X CONTROL-W and analyze it later.

• Undo the changes with a string of CONTROL-_ commands until the ** indicator disap-

pears from the buffer’s Mode Line.

• If you are sure that all the changes are wrong, use META-x revert-buffer to get a fresh

copy of the file.

• Kill the buffer outright. Because it is modified, emacs asks whether you are sure before

carrying out this command.

• Give the META-~ (tilde) command to clear the modified condition and ** indicator. A

subsequent CONTROL-X s then believes that the buffer does not need to be written.

Buffers

An emacs buffer is a storage object that you can edit. It often holds the contents of

a file but can also exist without being associated with a file. You can select only one

buffer at a time, designated as the current buffer. Most commands operate only on

the current buffer, even when windows show multiple buffers on the screen. For the

most part each buffer is its own world: It has its own name, its own modes, its own

file associations, its own modified state, and perhaps its own special key bindings.

You can use the commands shown in Table 7-12 to create, select, list, and manipu-

late buffers.

Table 7-11 Saving files

Command    Result

CONTROL-X CONTROL-S    This workhorse file-saving command saves the current buffer into its original file. If the current buffer is not modified, emacs displays the message (No changes need to be saved).

CONTROL-X s    For each modified buffer, you are asked whether you wish to save it. Answer y or n. This command is given automatically as you exit from emacs and allows you to save any buffers that have been modified but not yet written out. Give this command to save intermediate copies of your work.

META-x set-visited-file-name    Prompts for a filename and sets this name as the “original” name for the current buffer.

CONTROL-X CONTROL-W    Prompts for a filename, sets this name as the “original” name for the current buffer, and saves the current buffer into that file. Equivalent to META-x set-visited-file-name followed by CONTROL-X CONTROL-S.

META-~ (tilde)    Clears the modified flag from the current buffer. If you mistakenly type META-~ against a buffer with changes you want to keep, you need to make sure the modified condition and its ** indicator are turned back on before leaving emacs, or all the changes you made will be lost. One easy way to mark a buffer as modified is to insert a SPACE and then remove it using DELETE.

Table 7-12 Working with buffers

Command    Result

CONTROL-X b    Prompts for a buffer name and selects it. If the buffer you name does not exist, this command creates it.

CONTROL-X 4 b    Prompts for a buffer name and selects it in another window. The existing window is not disturbed, although the new window may overlap it.

Windows

An emacs window is a viewport that looks into a buffer. The emacs screen begins by

displaying a single window, but this screen space can later be divided among two or

more windows. On the screen the current window holds the cursor and views the

current buffer. For a tip on terminology, see “The screen and emacs windows” on

page 207.

CONTROL-X b buffer-name A window displays one buffer at a time. The command CONTROL-X b buffer-name

switches the buffer that the current window displays. Multiple windows can display

the same buffer with each window displaying a different part of the buffer. Any

change to a buffer is reflected in all windows displaying that buffer. Also, a buffer

can exist without a window open on it.

Table 7-12 Working with buffers (continued)

Command    Result

CONTROL-X CONTROL-B    Creates a buffer named *Buffer List* and displays it in another window. The existing window is not disturbed, although the new window may overlap it. The new buffer is not selected. In the *Buffer List* buffer, each buffer’s data is shown along with the name, size, mode(s), and original filename. A % appears for a readonly buffer, a * indicates a modified buffer, and . appears for the selected buffer.

META-x rename-buffer    Prompts for a new buffer name and gives this new name to the current buffer.

CONTROL-X CONTROL-Q    Toggles the current buffer’s readonly status and the associated %% Mode Line indicator. This command can prevent you from accidentally modifying a buffer or allow you to modify a buffer when visiting a readonly file.

META-x append-to-buffer    Prompts for a buffer name and appends Region to the end of that buffer.

META-x prepend-to-buffer    Prompts for a buffer name and prepends Region to the beginning of that buffer.

META-x copy-to-buffer    Prompts for a buffer name and deletes the contents of the buffer before copying Region to that buffer.

META-x insert-buffer    Prompts for a buffer name and inserts the contents of that buffer in the current buffer at Point.

CONTROL-X k    Prompts for a buffer name and deletes that buffer. If the buffer has been modified but not saved, emacs asks you to confirm the operation.

META-x kill-some-buffers    Goes through the list of buffers and offers the chance to delete each buffer. As with CONTROL-X k, emacs asks you to confirm the kill command if a modified buffer has not been saved.

Splitting a Window

One way to divide the screen is to split the starting window explicitly into two or

more pieces. The command CONTROL-X 2 splits the current window in two, with one

new window appearing above the other. A numeric argument is taken as the size of

the upper window in lines. The command CONTROL-X 3 splits the current window in

two, with the new windows being arranged side by side (Figure 7-11). A numeric

argument is taken as the number of columns to give the left window. For example,

CONTROL-U CONTROL-X 2 splits the current window in two; because of the special “times

4” interpretation of CONTROL-U standing alone, the upper window is given four lines

(barely enough to be useful).

Although these commands split the current window, both windows continue to

view the same buffer. You can select a new buffer in either or both new windows, or

you can scroll each window to show different portions of the same buffer.

Manipulating Windows

CONTROL-X o

META-CONTROL-V

You can use CONTROL-X o (lowercase “o”) to select the other window. If more than two

windows appear on the screen, a sequence of CONTROL-X o commands cycles through

them in top-to-bottom, left-to-right order. The META-CONTROL-V command scrolls the

other window. If more than two windows are visible, the command scrolls the win-

dow that CONTROL-X o would select next. You can use either a positive or negative

scrolling argument, just as with CONTROL-V scrolling in the current window.

Figure 7-11 Splitting a window horizontally

Other-Window Display

CONTROL-X 4b

CONTROL-X 4f In normal emacs operation, explicit window splitting is not nearly as common as

the implicit splitting done by the family of CONTROL-X 4 commands. The CONTROL-X 4b

command, for example, prompts for a buffer name and selects it in the other win-

dow. If no other window exists, this command begins with a half-and-half split that

arranges the windows one above the other. The CONTROL-X 4f command prompts for a

filename, calls the file up in the other window, and selects the other window. If no

other window exists, this command begins with a half-and-half split that arranges

the windows one above the other.

Adjusting and Deleting Windows

CONTROL-X 0

CONTROL-X 1 Windows may be destroyed when they get in the way. No data is lost in the window’s

associated buffer with this operation, and you can make another window whenever

you like. The CONTROL-X 0 (zero) command deletes the current window and gives its

space to its neighbors; CONTROL-X 1 deletes all windows except the current window.

META-x shrink-window

CONTROL-X ^

CONTROL-X }

CONTROL-X {

You can also adjust the dimensions of the current window at the expense of its

neighbors. To make a window shorter, give the command META-x shrink-window.

Press CONTROL-X ^ to increase the height of a window, CONTROL-X } to make the window

wider, and CONTROL-X { to make the window narrower. Each of these commands adds

or subtracts one line or column to or from the window, unless you precede the com-

mand with a numeric argument.

The emacs editor has its own guidelines for a window’s minimum useful size and

may destroy a window before you force one of its dimensions to zero. Although the

window may disappear, the buffer remains intact.

Foreground Shell Commands

The emacs editor can run a subshell (a shell that is a child of emacs—refer to “Exe-

cuting a Command” on page 308) to execute a single command line, optionally

with standard input coming from Region of the current buffer and optionally with

standard output replacing Region (Table 7-13). This process is analogous to execut-

ing a shell command from the vim editor and having the input come from the file

you are editing and the output go back to the same file (page 192). As with vim,

how well this process works depends in part on the capabilities of the shell.
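
What a filter does with Region can be pictured in ordinary shell terms: the text of Region becomes the command’s standard input, and whatever the command writes to standard output is what emacs displays or inserts. The following is a minimal sketch; the three-line sample text and the choice of sort as the filter are assumptions made for illustration.

```shell
# Hypothetical Region contents: three unsorted lines.
region=$(printf '%s\n' pear apple fig)

# The filter reads Region on standard input and writes its result to
# standard output, much as a command given to META-| would.
filtered=$(printf '%s\n' "$region" | sort)

printf '%s\n' "$filtered"
```

Any command that reads standard input and writes standard output can serve as the filter in the same way.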

Table 7-13 Foreground shell commands

Command    Result

META-! (exclamation point)    Prompts for a shell command, executes it, and displays the output

CONTROL-U META-! (exclamation point)    Prompts for a shell command, executes it, and inserts the output at Point

META-| (vertical bar)    Prompts for a shell command, gives Region as input, filters it through the command, and displays the output

The emacs editor can also start an interactive subshell that runs continuously in its

own buffer. See “Shell Mode” on page 248 for more information.

Background Shell Commands

The emacs editor can run processes in the background, with their output being fed

into a growing emacs buffer that does not have to remain in view. You can continue

editing while the background process runs and look at its output later. Any shell

command can be run in this way.

The growing output buffer is always named *compilation*. You can read it, copy

from it, or edit it in any way, without waiting for the background process to finish.

Most commonly this buffer is used to review the output of program compilation

and to correct any syntax errors found by the compiler.

META-x compile To run a process in the background, give the command META-x compile to prompt for

a shell command and begin executing it as a background process. The screen splits

in half to show the *compilation* buffer.

You can switch to the *compilation* buffer and watch the execution, if you wish.

To make the display scroll as you watch, position the cursor at the very end of the

text with a META-> command. If you are not interested in this display, just remove the

window with CONTROL-X 0 (zero) if you are in it or CONTROL-X 1 otherwise and keep work-

ing. You can switch back to the *compilation* buffer later with CONTROL-X b.

To kill the background process give the command META-x kill-compilation. The emacs

editor asks for confirmation and then kills the background process.

If standard format error messages appear in *compilation*, you can automatically

visit the line in the file where each error occurred. Give the command CONTROL-X ` (back

tick) to split the screen into two windows and visit the file and line of the next error

message. Scroll the *compilation* buffer until this error message appears at the top

of its window. Use CONTROL-U CONTROL-X ` to start over with the first error message and

visit that file and line.
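
To see why the format of an error message matters, consider how a filename and line number can be pulled out of one. The sketch below assumes the common “file:line: message” shape and a made-up message; the patterns emacs actually uses are more elaborate and are not reproduced here.

```shell
# Hypothetical compiler output line in the "file:line: message" shape.
msg='prog.c:12: error: expected ; before return'

file=${msg%%:*}        # text before the first colon: the filename
rest=${msg#*:}         # everything after the first colon
lineno=${rest%%:*}     # text before the next colon: the line number

printf 'visit %s at line %s\n' "$file" "$lineno"
```

A message that does not follow a recognizable layout gives the editor nothing to extract, which is why only standard format messages can be visited automatically.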

Major Modes: Language-Sensitive Editing

The emacs editor has a large collection of feature sets, each specific to a certain vari-

ety of text. The feature sets are called Major modes. A buffer can have only one

Major mode at a time.

Table 7-13 Foreground shell commands (continued)

Command    Result

CONTROL-U META-| (vertical bar)    Prompts for a shell command, gives Region as input, filters it through the command, deletes the old Region, and inserts the output in that position

A buffer’s Major mode is private to the buffer and does not affect editing in any

other buffer. If you switch to a new buffer having a different mode, rules for the

new mode take effect immediately. To avoid confusion, the name of a buffer’s

Major mode appears in the Mode Line of any window viewing that buffer

(Figure 7-1 on page 209).

The three classes of Major modes are used for the following tasks:

• Editing human languages (for example, text, nroff, TeX)

• Editing programming languages (for example, C, Fortran, Lisp)

• Special purposes (for example, shell, mail, dired, ftp)

In addition, one Major mode—Fundamental—does nothing special. A Major mode

usually sets up the following:

• Special commands unique to the mode, possibly with their own key bind-

ings. Whereas languages may have just a few special commands, special-

purpose modes may have dozens.

• Mode-specific character syntax and regular expressions defining word-

constituent characters, delimiters, comments, whitespace, and so on. This

setup conditions the behavior of commands oriented to syntactic units,

such as words, sentences, comments, or parenthesized expressions.

Selecting a Major Mode

META-x modename The emacs editor chooses and sets a mode when a file is called up by matching the

filename against a set of regular expression patterns describing the filename and

filename extension. The explicit command to enter a Major mode is META-x

modename. This command is used mostly to correct the Major mode when emacs

guesses wrong.

To have a file define its own mode, include the text -*-modename-*- somewhere

in the first nonblank line of the file, possibly inside a comment suitable for

the programming language the file is written in.
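
As an illustration, the commands below write a small C source file whose first line carries a mode specification inside a C comment. The scratch file from mktemp and the file’s contents are assumptions; because the scratch name has no .c extension, the first line is the only hint the file gives about its mode.

```shell
src=$(mktemp)          # hypothetical scratch file standing in for a C source file

# Put the mode specification inside a C comment on the first line.
printf '%s\n' '/* -*-c-*- */' 'int main(void) { return 0; }' > "$src"

# Show the line that emacs would examine.
head -n 1 "$src"
```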

Human-Language Modes

A human language is meant eventually to be used by humans, possibly after being

formatted by a text-formatting program. Human languages share many conven-

tions about the structure of words, sentences, and paragraphs. With regard to these

textual units, the major human language modes all behave in the same way.

Beyond this area of commonality, each mode offers additional functionality ori-

ented to a specific text formatter, such as TeX, LaTeX, or nroff/troff. Text-formatter

extensions are beyond the scope of this chapter; the focus here is on the commands

relating to human textual units.

Words

As mnemonic aids, the bindings for words are defined parallel to the character-oriented

bindings CONTROL-F, CONTROL-B, CONTROL-D, DELETE, and CONTROL-T.

META-f

META-b

Just as CONTROL-F and CONTROL-B move forward and backward over characters, so META-f

and META-b move forward and backward over words. They may start from a position

inside or outside the word to be traversed, but in all cases Point finishes just beyond

the word, adjacent to the last character skipped over. Both commands accept a

numeric argument specifying the number of words to be traversed.

META-d

META-DELETE

Just as CONTROL-D and DELETE delete characters forward and backward, so the keys META-d

and META-DELETE kill words forward and backward. They leave Point in exactly the

same finishing position as META-f and META-b do, but they kill the words they pass over.

They also accept a numeric argument.

META-t META-t transposes the word before Point with the word after Point.

Sentences

META-a

META-e

CONTROL-X DELETE

META-k

As mnemonic aids, three of the bindings for sentences are defined parallel to the

line-oriented bindings: CONTROL-A, CONTROL-E, and CONTROL-K. The META-a command moves

backward to the beginning of a sentence; META-e moves forward to the end of a sen-

tence. In addition, CONTROL-X DELETE kills backward to the beginning of a sentence; META-k

kills forward to the end of a sentence.

The emacs editor recognizes the ends of sentences by referring to a regular expression

that is kept in a variable named sentence-end. Briefly, emacs looks for the characters

., ?, or ! followed by two SPACEs or an end-of-line marker, possibly with close quotation

marks or close braces. Give the command CONTROL-H v sentence-end RETURN to display the

value of this variable.
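
The rule just described can be approximated with an extended regular expression and tried out with grep. The pattern below is a simplified stand-in written for this example, not the value emacs stores in sentence-end, and the three sample strings are likewise assumptions.

```shell
# Simplified stand-in for sentence-end (an assumption, not the emacs value):
# a terminator, optional closing quote or parenthesis, then two spaces
# or the end of the line.
pattern='[.?!][")]*(  |$)'

# Succeeds when the argument contains something that looks like a sentence end.
ends_sentence() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

for sample in 'It works.  Next one' 'See e.g. this' 'Really?'; do
    if ends_sentence "$sample"; then
        printf 'sentence end found: %s\n' "$sample"
    else
        printf 'no sentence end:    %s\n' "$sample"
    fi
done
```

The second sample shows why two SPACEs are required: a period followed by a single SPACE, as in an abbreviation, is not treated as the end of a sentence.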

The META-a and META-e commands leave Point adjacent to the first or last nonblank

character in the sentence. They accept a numeric argument specifying the number of

sentences to traverse; a negative argument runs them in reverse.

The META-k and CONTROL-X DELETE commands kill sentences forward and backward, in a

manner analogous to CONTROL-K line kill. They leave Point in the same position as META-a

and META-e do, but they kill the sentences they pass over. They also accept a numeric

argument. CONTROL-X DELETE is useful for quickly backing out of a half-finished sentence.

Paragraphs

META-{

META-}

META-h

The META-{ command moves backward to the most recent paragraph beginning; META-}

moves forward to the next paragraph ending. The META-h command marks the para-

graph the cursor is on as Region (that is, it puts Point at the beginning and Mark at

the end), or marks the next paragraph if the cursor is between paragraphs.

The META-} and META-{ commands leave Point at the beginning of a line, adjacent to the

first character or last character, respectively, of the paragraph. They accept a

numeric argument specifying the number of paragraphs to traverse and run in

reverse if given a negative argument.

In human-language modes, paragraphs are separated by blank lines and text-

formatter command lines, and an indented line starts a paragraph. Recognition is

based on the regular expressions stored in the variables paragraph-separate and

paragraph-start. A paragraph is composed of complete lines, including the final line

terminator. If a paragraph starts following one or more blank lines, the last blank

line before the paragraph belongs to the paragraph.

Fill

The emacs editor can fill a paragraph to fit a specified width, breaking lines and

rearranging them as necessary. It breaks lines between words and does not hyphen-

ate words. The emacs editor can fill automatically as you type or in response to an

explicit command.

META-x auto-fill-mode The META-x auto-fill-mode command toggles Auto Fill mode on and off. When this

mode is on, emacs automatically breaks lines when you press SPACE or RETURN and are

currently beyond the specified line width. This feature is useful when you are enter-

ing new text.

META-q

META-x fill-region Auto Fill mode does not automatically refill the entire paragraph you are currently

working on. If you add new text in the middle of a paragraph, Auto Fill mode breaks

the new text as you type but does not refill the complete paragraph. To refill a com-

plete paragraph or Region of paragraphs, use either META-q to refill the current para-

graph or META-x fill-region to refill each paragraph in Region (between Point and Mark).

You can change the filling width from its default value of 70 by setting the fill-

column variable. Give the command CONTROL-X f to set fill-column to the current cur-

sor position and the command CONTROL-U nnn CONTROL-X f to set fill-column to nnn,

where 0 is the left margin.
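
Outside emacs, the fmt utility performs a comparable kind of filling, which makes the effect of a fill width easy to see. This is a rough analogy rather than a description of how emacs fills; the width of 40 and the sample sentence are assumptions.

```shell
width=40   # hypothetical fill width for the example; emacs defaults to 70

text='The emacs editor can fill a paragraph to fit a specified width, breaking lines between words and never hyphenating them.'

# fmt rewraps the paragraph so that no line is longer than $width columns,
# breaking only between words.
filled=$(printf '%s\n' "$text" | fmt -w "$width")

printf '%s\n' "$filled"
```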

Case Conversion

The emacs editor can force words or Regions to all uppercase, all lowercase, or ini-

tial caps (the first letter of each word uppercase, the rest lowercase) characters.

Refer to Table 7-14.

Table 7-14 Case conversion

Command Result

META-l (lowercase “l”) Converts word to the right of Point to lowercase

META-u Converts word to the right of Point to uppercase

META-c Converts word to the right of Point to initial caps

CONTROL-X CONTROL-L Converts Region to lowercase

CONTROL-X CONTROL-U Converts Region to uppercase

The word-oriented conversions move Point over the word just converted (just as

META-f does), allowing you to walk through text and convert each word with META-l,

META-u, or META-c, or skip over words to be left alone with META-f. A positive numeric

argument converts that number of words to the right of Point, moving Point as it

goes. A negative numeric argument converts that number of words to the left of

Point but leaves Point stationary. This feature is useful for quickly changing the case

of words you have just typed. Table 7-15 shows some examples.

When the cursor (Point) is in the middle of a word, the case conversion commands

convert only the characters to the right of the cursor; with a negative argument, they

convert only the characters to the left of the cursor.
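
The three kinds of conversion can be mimicked on a single word with standard utilities, which may help make the terms concrete. This sketch only illustrates the results; it says nothing about how emacs performs the conversions, and the sample word is an assumption.

```shell
word=hello   # hypothetical sample word

# All uppercase and all lowercase.
upper=$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]')
lower=$(printf '%s' HELLO | tr '[:upper:]' '[:lower:]')

# Initial caps: uppercase the first character, keep the rest as is.
first=$(printf '%s' "$word" | cut -c1 | tr '[:lower:]' '[:upper:]')
rest=$(printf '%s' "$word" | cut -c2-)
caps=$first$rest

printf '%s %s %s\n' "$upper" "$lower" "$caps"
```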

Text Mode

META-x text-mode With very few exceptions, the commands for human-language textual units are

always turned on and available, even when the programming-language modes are

activated. Text mode adds very little to these basic commands but is still worth

turning on just to activate the TAB key function (next). Use the command

META-x text-mode to activate Text mode.

META-x edit-tab-stops In Text mode, pressing TAB runs the function tab-to-tab-stop. By default TAB stops are

set every eight columns. You can adjust them with META-x edit-tab-stops, which switches

to a special *Tab Stops* buffer. The current TAB stops are laid out in this buffer on a

scale for you to edit. The new stops are installed only if you type CONTROL-C CONTROL-C.

You can kill this buffer (CONTROL-X k) or switch away from it (CONTROL-X b) without chang-

ing the TAB stops.

The TAB stops you set with the META-x edit-tab-stops command affect only the interpre-

tation of TAB characters arriving from the keyboard. The emacs editor automatically

inserts enough spaces to reach the TAB stop. This command does not affect the inter-

pretation of TAB characters already in the buffer or the underlying file. If you edit the

TAB stops and then use them, when you print the file the hard copy will look the

same as the text on the screen.

C Mode

Programming languages are read by humans but are interpreted by machines.

Besides continuing to handle some of the human-language text units (for example,

words and sentences), the major programming-language modes address several

additional issues:

• Handling balanced expressions enclosed by parentheses, brackets, or

braces as textual units

• Handling comments as textual units

• Indention

Table 7-15 Examples of case conversion

Characters and commands    Result

HELLO META-– META-l (lowercase “l”)    hello

hello META-– META-u    HELLO

hello META-– META-c    Hello

The emacs editor includes Major modes to support C, Fortran, and several variants

of Lisp. In addition, many users have contributed modes for their favorite lan-

guages. In these modes the commands for human textual units are still available,

with occasional redefinitions. For example, a paragraph is bounded only by blank

lines and indention does not signal a paragraph start. In addition, each mode has

custom code to handle the language-specific conventions for balanced expressions,

comments, and indention. This chapter discusses only C mode.

Expressions

The emacs Major modes are limited to lexical analysis. They can recognize most

tokens (for example, symbols, strings, and numbers) and all matched sets of paren-

theses, brackets, and braces. This is enough for Lisp but not for C. The C mode

lacks a full-function syntax analyzer and is not prepared to recognize all of C’s pos-

sible expressions.1

Table 7-16 lists the emacs commands applicable to parenthesized expressions and

some tokens. By design the bindings run parallel to the CONTROL commands for char-

acters and the META commands for words. All of these commands accept a numeric

argument and run in reverse if that argument is negative.

1. In the emacs documentation the recurring term sexp refers to the Lisp term S-expression. Unfortunately,

it is sometimes used interchangeably with expression, even though the language might not be Lisp.

Table 7-16 Commands for expressions and tokens

Command    Result

CONTROL-META-f    Moves forward over an expression. The exact behavior depends on which character lies to the right of Point (or left of Point, depending on which direction you are moving Point).

• If the first nonwhitespace is an opening delimiter (parenthesis, bracket, or brace), Point is moved just past the matching closing delimiter.

• If the first nonwhitespace is a token, Point is moved just past the end of this token.

CONTROL-META-b    Moves backward over an expression.

CONTROL-META-k    Kills an expression forward. This command leaves Point at the same finishing position as CONTROL-META-f but kills the expression it traverses.

CONTROL-META-@    Sets Mark at the position CONTROL-META-f would move to but does not change Point. To see the marked Region clearly, give a pair of CONTROL-X CONTROL-X commands to exchange Point and Mark.

Function Definitions

In emacs a balanced expression at the outermost level is considered to be a func-

tion definition and is often called a defun, even though that term is specific to

Lisp. More generally it is understood to be a function definition in the language

at hand.

In C mode a function definition includes the return data type, the function name,

and the argument declarations appearing before the { character. Table 7-17 shows

the commands for operating on function definitions.

Indention

The emacs C mode has extensive logic to control the indention of C programs. You

can adjust this logic for many different styles of C indention (Table 7-18).

Function indention style

caution The emacs editor assumes an opening brace at the left margin is part of a function definition. This

heuristic speeds up the reverse scan for a definition’s leading edge. If your code has an indention

style that puts the opening brace elsewhere, you may get unexpected results.

Table 7-17 Function definition commands

Command    Result

CONTROL-META-a    Moves to the beginning of the most recent function definition. Use this command to scan backward through a buffer one function at a time.

CONTROL-META-e    Moves to the end of the next function definition. Use this command to scan forward through a buffer one function at a time.

CONTROL-META-h    Marks as Region the current function definition (or next function definition, if the cursor is between two functions). This command sets up an entire function definition for a Region-oriented operation such as kill.

Table 7-18 Indention commands

Command Result

TAB Adjusts the indention of the current line. TAB inserts or deletes

whitespace at the beginning of the line until the indention con-

forms to the current context and rules in effect. Point is not moved

unless it lies in the whitespace area; in that case it is moved to

the end of the whitespace. TAB inserts leading SPACEs; you can

press TAB with the cursor at any position on the line. If you want

to insert a TAB in the text, use META-i or CONTROL-Q TAB.

LINEFEED Shorthand for RETURN followed by TAB. The LINEFEED key is a con-

venience for entering new code, giving you an autoindent as

you begin each line.


Customizing Indention

Many styles of C programming have evolved, and emacs does its best to support

automatic indention for all of them. The indention coding was completely rewritten

for emacs version 19; it supports C, C++, Objective-C, and Java. The new emacs

syntactic analysis is much more precise and can classify each syntactic element of

each line of a program into a single syntactic category (out of about 50), such as

statement, string, or else-clause. Based on that analysis, emacs refers to the offset

table named c-offsets-alist to look up how much each line should be indented from

the preceding line.

To customize indention, you must change the offset table. Although you can define

a completely new offset table for each customized style, it is typically more conve-

nient to feed in a short list of exceptions to the standard rules. Each mainstream

style (GNU, K&R [Kernighan and Ritchie], BSD, and so on) has such an exception

list; all are collected in c-style-alist. Here is one entry from c-style-alist:

("gnu"
 (c-basic-offset . 2)
 (c-comment-only-line-offset . (0 . 0))
 (c-offsets-alist . ((statement-block-intro . +)
                     (knr-argdecl-intro . 5)
                     (substatement-open . +)
                     (label . 0)
                     (statement-case-open . +)
                     (statement-cont . +)
                     (arglist-intro . c-lineup-arglist-intro-after-paren)
                     (arglist-close . c-lineup-arglist)
                     ))
)

Constructing a custom style is beyond the scope of this book. If you are curious, the

long story is available in emacs info beginning at “Customizing C Indentation.” The

sample .emacs file given in this chapter (page 253) adds a very simple custom style

and arranges to use it on every .c file that is edited.
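As a rough illustration of feeding in a short list of exceptions, the following sketch derives a style from the stock "bsd" style. It assumes the CC Mode functions c-add-style and c-set-style; the style name my-bsd and the offsets chosen are hypothetical and are not the style used in the sample .emacs file.

```elisp
(require 'cc-mode)

;; "my-bsd" is a hypothetical style name; the offsets are illustrative only.
(c-add-style "my-bsd"
             '("bsd"                                        ;inherit from bsd
               (c-basic-offset . 4)                         ;one indention step
               (c-offsets-alist . ((substatement-open . 0)  ;brace under if/for
                                   (statement-cont . +))))) ;continuation lines

;; Select the style whenever a buffer enters C mode.
(add-hook 'c-mode-hook
          (lambda () (c-set-style "my-bsd")))
```

Only the entries that differ from the parent style need to be listed; everything else is inherited from "bsd".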

Table 7-18 Indention commands (continued)

Command Result

The next two commands indent multiple lines with a single command.

CONTROL-META-q Reindents all lines inside the next pair of matched braces.
CONTROL-META-q assumes the left brace is correctly indented and drives the
indention from there. If you need to adjust the left brace, type TAB just
to the left of the brace before giving this command. All lines up to the
matching brace are indented as if you had typed TAB on each one.

CONTROL-META-\ Reindents all lines in Region. Set Mark at one end of the
lines you want to reindent and move Point to the other end, and then give
the command. All lines in Region are indented as if you had typed TAB on
each one.


Comments

Each buffer has its own comment-column variable, which you can view with the

CONTROL-H v comment-column RETURN help command. Table 7-19 lists commands that

facilitate working with comments.
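If you prefer a different comment column in every new buffer, one possible approach is to set the default in the .emacs file. This is only a sketch; the value 40 is an arbitrary example.

```elisp
;; Default comment column for buffers that do not set their own value.
;; The value 40 is an arbitrary example.
(setq-default comment-column 40)
```

Because comment-column is private to each buffer, setq-default changes only the default; a buffer that has already set its own value keeps it.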

Special-Purpose Modes

The emacs editor includes a third family of Major modes that are not oriented

toward a particular language or toward ordinary editing. Instead, these modes

perform some special function. The following modes may define their own key

bindings and commands to accomplish that function:

• Rmail: reads, archives, and composes email

• Dired: moves around an ls –l display and operates on files

• VIP: simulates a complete vi environment

• VC: allows you to drive version-control systems (including RCS, CVS, and

Subversion) from within emacs

Table 7-19 Comment commands

Command Result

META-; Inserts a comment on the current line or aligns an existing com-

ment. This command’s behavior differs according to the situation.

• If no comment is on this line, META-; creates an empty

comment at the value of comment-column.

• If text already on this line overlaps the position of

comment-column, META-; creates an empty comment

one SPACE after the end of the text.

• If a comment is already on this line but not at the

current value of comment-column, META-; realigns the

comment at that column. If text is in the way, it places

the comment one SPACE after the end of the text.

Once an aligned (possibly empty) comment exists on the line,

Point moves to the start of the comment text.

CONTROL-X ; Sets comment-column to the column after Point. The left mar-

gin is column 0.

CONTROL-U – CONTROL-X ; Kills the comment on the current line.

CONTROL-U CONTROL-X ; Sets comment-column to the position of the first comment

found above this line and then executes a META-; command to

insert or align a comment on this line.


• GUD (Grand Unified Debugger): allows you to run and debug C (and

other) programs from within emacs

• Tramp: allows you to edit files on any remote system you can reach with

ftp or scp

• Shell: runs an interactive subshell from inside an emacs buffer

This book discusses only Shell mode.

Shell Mode

One-time shell commands and Region filtering were discussed earlier under “Fore-

ground Shell Commands” on page 238. Each emacs buffer in Shell mode has an

underlying interactive shell permanently associated with it. This shell takes its input

from the last line of the buffer and sends its output back to the buffer, advancing Point

as it goes. If you do not edit the buffer, it holds a record of the complete shell session.

The shell runs asynchronously, whether or not you have its buffer in view. The

emacs editor uses idle time to read the shell’s output and add it to the buffer.

META-x shell Type META-x shell to create a buffer named *shell* and start a subshell. If a buffer

named *shell* already exists, emacs just switches to that buffer. The shell that this

command runs is taken from one of the following sources:

• The Lisp variable explicit-shell-file-name

• The environment variable ESHELL

• The environment variable SHELL

To start a second shell, first give the command META-x rename-buffer to change the name

of the existing shell’s buffer, and then give the command META-x shell to start another

shell. You can create as many subshells and buffers as you like, all running in parallel.
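The shell selection and the second-shell procedure described above might be captured in the .emacs file along these lines. This is a sketch under stated assumptions: the path /bin/bash, the command name my-second-shell, and the buffer name *shell-2* are all hypothetical, and the optional buffer argument to shell is assumed to be available in your version of emacs.

```elisp
;; Assumption: /bin/bash exists on this system; substitute your own shell.
(setq explicit-shell-file-name "/bin/bash")

;; Hypothetical convenience command: start an additional shell without
;; first renaming the existing *shell* buffer.
(defun my-second-shell ()
  "Start another interactive shell in a buffer named *shell-2*."
  (interactive)
  (shell "*shell-2*"))
```

After emacs loads this code, META-x my-second-shell starts the extra shell, while META-x shell continues to select the original *shell* buffer.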

A special set of commands is defined in Shell mode (Table 7-20). These commands

are bound mostly to two-key sequences starting with CONTROL-C. Each sequence is sim-

ilar to the ordinary control characters found in Linux but uses a leading CONTROL-C.

Table 7-20 Shell mode commands

Command Result

RETURN If Point is at the end of the buffer, emacs inserts the RETURN and

sends this (the last) line to the shell. If Point is elsewhere, it

copies this line to the end of the buffer, peeling off the old shell

prompt (see the regular expression shell-prompt-pattern), if

one existed. Then this copied line—now the last in the buffer—is

sent to the shell.

CONTROL-C CONTROL-D Sends CONTROL-D to the shell or its subshell.

CONTROL-C CONTROL-C Sends CONTROL-C to the shell or its subshell.

CONTROL-C CONTROL-\ Sends a quit signal to the shell or its subshell.


optional

Customizing emacs

At the heart of emacs is a Lisp interpreter written in C. This version of Lisp is signif-

icantly extended and includes many special editing commands. The interpreter’s

main task is to execute the Lisp-coded system that implements the look-and-feel of

emacs.

Reduced to its essentials, this system implements a continuous loop that watches

keystrokes arrive, parses them into commands, executes those commands, and

updates the screen. This behavior can be customized in a number of ways.

• As single keystrokes arrive, they are mapped immediately through a key-

board translation table. By changing the entries in this table, it is possible

to swap keys. If you are used to vi or vim, for example, you might want to

swap DELETE and CONTROL-H. Then CONTROL-H backspaces as it does in vim, and

DELETE (which is not used by vim) is the help key. If you use DELETE as an inter-

rupt key, you may want to choose another key to swap with CONTROL-H.

• The mapped keystrokes are gathered into small groups called key

sequences. A key sequence may be only a single key, such as CONTROL-N, or

may include two or more keys, such as CONTROL-X CONTROL-F. Once gathered,

the key sequences are used to select a particular procedure to be executed.

The rules for gathering each key sequence and the specific procedure name

to be executed when that sequence comes in are codified in a series of

tables called keymaps. By altering the keymaps, you can change the gath-

ering rules or change which procedure is associated with which sequence.

For example, if you are used to vi’s or vim’s use of CONTROL-W to back up over

the word you are entering, you may want to change emacs’s CONTROL-W binding

from the standard kill-region to backward-kill-word.

• The command behavior is often conditioned by one or more global vari-

ables or options. It may be possible to get the behavior you want by setting

some of these variables.

• The command itself is usually a Lisp program that can be reprogrammed
to make it behave as desired. Although this task is not appropriate for
beginners, the Lisp source to nearly all commands is available and the
internal Lisp system is fully documented. As mentioned earlier, it is
common practice to load customized Lisp code at startup time, even if you
did not write the code yourself.

Table 7-20 Shell mode commands (continued)

Command Result

CONTROL-C CONTROL-U Kills the text on the current line not yet completed.

CONTROL-C CONTROL-R Scrolls back to the beginning of the last shell output,
putting the first line of output at the top of the window.

CONTROL-C CONTROL-O Deletes the last batch of shell output.

Most emacs documentation glosses over the translation, gathering, and procedure

selection steps and talks about keystrokes as though they were commands. However,

it is important to know that the underlying machinery exists and to understand that

you can change its behavior.

The .emacs Startup File

Each time you start emacs, it loads the file of Lisp code named ~/.emacs. Using this

file is the most common way to customize emacs. Two command-line options con-

trol the use of the .emacs file. The –q option ignores the .emacs file so emacs starts

without it; this is one way to get past a bad .emacs file. The –u user option uses the

~user/.emacs file (the .emacs file from the home directory of user).

The .emacs startup file is generally concerned only with key bindings and option

settings; it is possible to write the Lisp statements for this file in a straightforward

style. Each parenthesized Lisp statement is a Lisp function call. Inside the paren-

theses the first symbol is the function name; the rest of the SPACE-separated tokens

are arguments to that function.

Assigning a value to a variable

The most common function in the .emacs file, setq, is a simple assignment to a global

variable. The first argument is the name of the variable to set and the second argu-

ment is its value. The following example sets the variable named c-indent-level to 8:

(setq c-indent-level 8)

Displaying the value of a variable

While you are running emacs, the command CONTROL-H v prompts for the name of a

variable. When you enter the name of a variable and press RETURN, emacs displays the

value of the variable.

Setting the default value of a variable

You can set the default value for a variable that is buffer-private by using the func-

tion named setq-default. To set a specific element of a vector, use the function name

aset. The first argument is the name of the vector, the second is the offset, and the

third is the value of the target entry. In the startup file the new values are usually

constants. Table 7-21 shows the formats of these constants.
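The following lines sketch how setq-default and aset might look in practice, using several of the constant formats that Table 7-21 describes. The variable choices, the values, and the three-element vector v are illustrative assumptions only.

```elisp
(setq-default fill-column 72)         ;number: default for every buffer
(setq-default indent-tabs-mode nil)   ;boolean: nil means false

;; v is a hypothetical three-element vector used only for illustration.
(let ((v (make-vector 3 nil)))
  (aset v 0 ?a)                       ;character constant
  (aset v 1 "\C-s")                   ;string holding CONTROL-S
  (aset v 2 'kill-region)             ;quoted Lisp symbol
  v)                                  ;the value of the let form is the vector
```

The offset passed to aset counts from 0, so the three calls fill the first, second, and third elements of the vector.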

Table 7-21 Formats of constants in .emacs

Type Format

Numbers Decimal integers, with an optional minus sign

Strings Similar to C strings but with extensions for CONTROL and META
characters: \C-s yields CONTROL-S, \M-s yields META-s, and \M-\C-s
yields CONTROL-META-s

Characters Not like C characters; start with ? and continue with a printing
character or with a backslash escape sequence (for example,
?a, ?\C-i, ?\033)


Remapping Keys

The emacs command loop begins each cycle by translating incoming keystrokes into

the name of the command to be executed. The basic translation operation uses the

ASCII value of the incoming character to index a 128-element vector called a keymap.

Sometimes a character’s eighth bit is interpreted as the META case, but this cannot

always be relied on. At the point of translation all META characters appear with the

ESCAPE prefix, whether or not they were typed that way.

Each position in this vector is one of the following:

• Not defined: No translation possible in this map.

• The name of another keymap: Switches to that keymap and waits for the

next character to arrive.

• The name of a Lisp function to be called: Translation process is done; call

this command.

Because keymaps can reference other keymaps, an arbitrarily complex recognition

tree can be set up. The mainstream emacs bindings use at most three keys, with a very

small group of well-known prefix keys, each with its well-known keymap name.

Each buffer can have a local keymap that is used first for any keystrokes arriving

while a window into that buffer is selected. The local keymap allows the regular

mapping to be extended or overridden on a per-buffer basis and is most often used

to add bindings for a Major mode.

The basic translation flow runs as follows:

• Map the first character through the buffer’s local keymap. If it is defined

as a Lisp function name, translation is done and emacs executes that

function. If it is not defined, use this same character to index the global

top-level keymap.

• Map the first character through the top-level global keymap global-map.

At this and each following stage, the following conditions hold:

◆ If the entry for this character is not defined, it is an error. Send a
bell to the terminal and discard all the characters entered in this key
sequence.

◆ If the entry for this character is defined as a Lisp function name,
translation is done and the function is executed.

◆ If the entry for this character is defined as the name of another
keymap, switch to that keymap and wait for another character to select
one of its elements.

Table 7-21 Formats of constants in .emacs (continued)

Type Format

Booleans Not 1 and 0; use t for true and nil for false

Other Lisp objects Begin with a single quotation mark and continue with the
object’s name

Everything input during the remapping process must be either a command or an

error. Ordinary characters that are to be inserted in the buffer are usually bound to

the command self-insert-command. Each of the well-known prefix characters is

associated with a keymap (Table 7-22).
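You can watch this translation at work by evaluating a few lookups, for example with META-: (eval-expression). The sketch below uses the standard functions lookup-key and key-binding; the results noted in the comments are what a default configuration typically reports and will differ if your bindings have been changed.

```elisp
(lookup-key global-map "\C-f")      ;a command, normally forward-char
(lookup-key global-map "a")         ;normally self-insert-command
(lookup-key global-map "\C-x")      ;the CONTROL-X prefix keymap (or a symbol naming it)
(lookup-key ctl-x-map "\C-f")       ;second stage, normally find-file
(lookup-key global-map "\C-x\C-f")  ;both stages in one call
(key-binding "\C-x\C-f")            ;full lookup, local keymap first
```

The third and fourth lines correspond to the two stages of the translation flow: the first character selects another keymap, and the second character selects a command within it.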

To see the current state of the keymaps, type CONTROL-H b. They appear in the follow-

ing order: local, global, and shorter maps for each prefix key. Each line specifies the

name of the Lisp function to be called; the documentation for that function can be

retrieved with the command CONTROL-H f function-name or CONTROL-H k key-sequence.

The most common type of keymap customization is making small changes to the

global command assignments without creating any new keymaps or commands.

This type of customization is most easily done in the .emacs file using the Lisp func-

tion define-key. The define-key function takes three arguments:

• The keymap name

• A single character defining a position in that map

• The command to be executed when this character appears

For instance, to bind the command backward-kill-word to CONTROL-W, use the statement

(define-key global-map "\C-w" 'backward-kill-word)

The \ character causes C-w to be interpreted as CONTROL-W instead of three letters

(equivalent to \^w). The unmatched single quotation mark in front of the com-

mand name is correct. This Lisp escape character keeps the name from being

evaluated too soon. To bind the command kill-region to CONTROL-X CONTROL-K, use the

statement

(define-key ctl-x-map "\C-k" 'kill-region)
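The statements above change global bindings. A binding can also be added to the local keymap of a Major mode, as discussed earlier. The following sketch assumes the standard function local-set-key and the C mode hook c-mode-hook; the choice of CONTROL-C m and of the compile command is purely illustrative.

```elisp
;; Hypothetical example: bind CONTROL-C m to compile in C mode only.
(add-hook 'c-mode-hook
          (lambda ()
            (local-set-key "\C-cm" 'compile)))
```

Because the binding lives in the local keymap, CONTROL-C m keeps whatever meaning it has (if any) in buffers that use other modes.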

Table 7-22 Keymap prefixes

Keymap prefix Applies to

ctl-x-map For characters following CONTROL-X

ctl-x-4-map For characters following CONTROL-X 4

esc-map For characters following ESCAPE (including META characters)

help-map For characters following CONTROL-H

mode-specific-map For characters following CONTROL-C


A Sample .emacs File

The following ~/.emacs file produces a plain editing environment that minimizes

surprises for vi and vim users. If any section or any line is not appropriate for your

situation, you can edit it or make it a comment by placing one or more semicolons

(;) beginning in column 1.

;;; Preference Variables

(setq make-backup-files nil) ;Do not make backup files

(setq backup-by-copying t) ;If you do, at least do not destroy links

(setq delete-auto-save-files t) ;Delete autosave files when writing orig

(setq blink-matching-paren nil) ;Do not blink opening delim

(setq require-final-newline 'ask) ;Ask about missing final newline

;; Reverse mappings for C-h and DEL.

;; Sometimes useful to get DEL character from the Backspace key,

;; and online help from the Delete key.

;; NB: F1 is always bound to online help.

(keyboard-translate ?\C-h ?\177)

(keyboard-translate ?\177 ?\C-h)

;; Some vi sugar: emulate the CR command

;; that positions us to first non-blank on next line.

(defun forward-line-1-skipws ()
  "Position to first nonwhitespace character on next line."
  (interactive)
  (if (= (forward-line) 0)          ;if moved OK to start of next line
      (skip-chars-forward " \t")))  ;skip over horizontal whitespace

;; Bind this to M-n. ("enhanced next-line")

;; C-M-n is arguably more "correct" but (1) it takes three fingers

;; and (2) C-M-n is already bound to forward-list.

(define-key esc-map "n" 'forward-line-1-skipws)

;; C mode customization: set vanilla (8-space bsd) indention style

(require 'cc-mode) ;kiss: be sure it's here

(setq c-default-style
      '((java-mode . "java")
        (awk-mode . "awk")
        (c-mode . "bsd")
        (other . "gnu")
        ))

;; See also CC Mode in online help for more style setup examples.

;; end of c mode style setup


More Information

A lot of emacs documentation is available in both paper and electronic form. The

emacs info page and emacs help functions (page 223) provide an abundance of

information. See also the GNU emacs Web page at www.gnu.org/software/emacs.

The comp.emacs and gnu.emacs.help newsgroups offer support for and a general

discussion about emacs.

Access to emacs

The emacs editor is included in the repositories of most Linux distributions. You

can download and install emacs with apt-get (page 916) or yum (page 910). You can

download the latest version of the source code from www.gnu.org.

The Free Software Foundation can be reached at these addresses:

Mail: Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301, USA

Email: gnu@gnu.org

Phone: +1 617-542-5942

Fax: +1 617-542-2652

World Wide Web: www.gnu.org

Chapter Summary

You can precede many of the commands in the following tables with a numeric

argument to make the command repeat the number of times specified by the argu-

ment. Precede a numeric argument with CONTROL-U to keep emacs from entering the

argument as text.

Table 7-23 lists commands that move the cursor.

Table 7-23 Moving the cursor

Command Result

CONTROL-F Forward by characters

CONTROL-B Backward by characters

META-f Forward by words

META-b Backward by words

META-e To end of sentence

Table 7-23 Moving the cursor (continued)

Command Result

META-a To beginning of sentence

META-} To end of paragraph

META-{ To beginning of paragraph

META-> Forward to end of buffer

META-< Backward to beginning of buffer

CONTROL-E To end of line

CONTROL-A To beginning of line

CONTROL-N Forward (down) one line

CONTROL-P Backward (up) one line

CONTROL-V Scroll forward (down) one window

META-v Scroll backward (up) one window

CONTROL-L Clear and repaint screen, and scroll current line to center of window

META-r To beginning of middle line

CONTROL-U num META-r To beginning of line number num (0 = top, – = bottom)

Table 7-1 lists commands that kill and delete text.

Table 7-1 Killing and deleting text

Command Result

CONTROL-D Deletes character under cursor

DELETE Deletes character to left of cursor

META-d Kills forward to end of current word

META-DELETE Kills backward to beginning of previous word

META-k Kills forward to end of sentence

CONTROL-X DELETE Kills backward to beginning of sentence

CONTROL-K Kills forward to, but not including, line-ending LINEFEED; if there
is no text between the cursor and the LINEFEED, kills the LINEFEED

CONTROL-U 1 CONTROL-K Kills from cursor forward to and including LINEFEED

CONTROL-U 0 CONTROL-K Kills from cursor backward to beginning of line

META-z char Kills forward to and including next occurrence of char


Table 7-2 lists commands that search for strings and regular expressions.

Table 7-3 lists commands that provide online help.

Command Result

META-w Copies Region to Kill Ring (does not delete Region from buffer)

CONTROL-W Kills Region (deletes Region from buffer)

CONTROL-Y Yanks most recently killed text into current buffer at Point; sets

Mark at beginning of this text, with Point and cursor at the end

META-y Erases just-yanked text, rotates Kill Ring, and yanks next item

(only after CONTROL-Y or META-y)

Table 7-2 Search commands

Command Result

CONTROL-S Prompts incrementally for a string and searches forward

CONTROL-S RETURN Prompts for a complete string and searches forward

CONTROL-R Prompts incrementally for a string and searches backward

CONTROL-R RETURN Prompts for a complete string and searches backward

META-CONTROL-S Prompts incrementally for a regular expression and searches forward

META-CONTROL-S RETURN Prompts for a complete regular expression and searches forward

META-CONTROL-R Prompts incrementally for a regular expression and searches backward

META-CONTROL-R RETURN Prompts for a complete regular expression and searches backward

Table 7-3 Online help

Command Result

CONTROL-H a Prompts for string and displays a list of commands whose names contain string

CONTROL-H b Displays a (long) table of all key bindings now in effect

CONTROL-H c key-sequence Displays the name of the command bound to key-sequence

CONTROL-H k key-sequence Displays the name of and documentation for the command bound to key-sequence

CONTROL-H f Prompts for the name of a Lisp function and displays the doc-

umentation for that function

Table 7-1 Killing and deleting text (continued)


Table 7-4 lists commands that work with a Region.

Table 7-5 lists commands that work with lines.

Command Result

CONTROL-H i (lowercase “i”) Displays the top menu of info (page 36)

CONTROL-H l (lowercase “l”) Displays the last 100 characters typed

CONTROL-H m Displays the documentation and special key bindings for the

current Major mode

CONTROL-H n Displays the emacs news file

CONTROL-H t Starts an emacs tutorial session

CONTROL-H v Prompts for a Lisp variable name and displays the documenta-

tion for that variable

CONTROL-H w Prompts for a command name and displays the key sequence,

if any, bound to that command

Table 7-4 Working with a Region

Command Result

META-w Copies Region nondestructively to the Kill Ring

CONTROL-W Kills (deletes) Region

META-x print-region Copies Region to the print spooler

META-x append-to-buffer Prompts for buffer name and appends Region to that buffer

META-x append-to-file Prompts for filename and appends Region to that file

CONTROL-X CONTROL-U Converts Region to uppercase

CONTROL-X CONTROL-L Converts Region to lowercase

Table 7-5 Working with lines

Command Result

META-x occur Prompts for a regular expression and lists each line containing

a match for the expression in a buffer named *Occur*

META-x delete-matching-lines Prompts for a regular expression and deletes lines from Point

forward that have a match for the regular expression

META-x delete-non-matching-lines Prompts for a regular expression and deletes lines from Point

forward that do not have a match for the regular expression

Table 7-3 Online help (continued)


Table 7-6 lists commands that replace strings and regular expressions uncondition-

ally and interactively.

Table 7-7 lists responses to replacement queries.

Table 7-8 lists commands that work with windows.

Table 7-6 Commands that replace text

Command Result

META-x replace-string Prompts for two strings and replaces each instance of the first
string with the second string from Point forward; sets Mark at
the start of the command

META-% or META-x query-replace As above but queries for each replacement (see Table 7-7 for a
list of responses)

META-x replace-regexp Prompts for a regular expression and a string, and replaces
each match for the regular expression with the string; sets Mark
at the start of the command

META-x query-replace-regexp As above but queries for each replacement (see Table 7-7 for a
list of responses)

Table 7-7 Responses to replacement queries

Command Result

RETURN Quits searching (does not make or query for any more

replacements)

SPACE Makes this replacement and continues querying

DELETE Does not make this replacement and continues querying

, (comma) Makes this replacement, displays the result, and asks for

another command

. (period) Makes this replacement and does not make or query for any

more replacements

! (exclamation point) Replaces this and all remaining instances without querying

Table 7-8 Working with windows

Command Result

CONTROL-X b Prompts for and displays a different buffer in current window

CONTROL-X 2 Splits current window vertically into two

CONTROL-X 3 Splits current window horizontally into two


Table 7-9 lists commands that work with files.

Command Result

CONTROL-X o (lowercase “o”) Selects other window

META-CONTROL-V Scrolls other window

CONTROL-X 4b Prompts for buffer name and selects it in other window

CONTROL-X 4f Prompts for filename and selects it in other window

CONTROL-X 0 (zero) Deletes current window

CONTROL-X 1 (one) Deletes all windows except current window

META-x shrink-window Makes current window one line shorter

CONTROL-X ^ Makes current window one line taller

CONTROL-X } Makes current window one character wider

CONTROL-X { Makes current window one character narrower

Table 7-9 Working with files

Command Result

CONTROL-X CONTROL-F Prompts for a filename and reads its contents into a new buffer;

assigns the file’s simple filename as the buffer name.

CONTROL-X CONTROL-V Prompts for a filename and reads its contents into the current

buffer (overwriting the contents of the current buffer).

CONTROL-X 4 CONTROL-F Prompts for a filename and reads its contents into a new buffer;

assigns the file’s simple filename as the buffer name. Creates a

new window for the new buffer and selects that window. This

command splits the screen in half if you begin with only one

window.

CONTROL-X CONTROL-S Saves the current buffer to the original file.

CONTROL-X s Prompts for whether to save each modified buffer (y/n).

META-x set-visited-file-name Prompts for a filename and sets the current buffer’s “original”

name to that filename.

CONTROL-X CONTROL-W Prompts for a filename, sets the current buffer’s “original”

name to that filename, and saves the current buffer in that file.

META-~ (tilde) Clears modified flag from the current buffer. Use with caution.

Table 7-8 Working with windows (continued)


Table 7-10 lists commands that work with buffers.

Table 7-11 lists commands that run shell commands in the foreground. These com-

mands may not work with all shells.

Table 7-10 Working with buffers

Command Result

CONTROL-X CONTROL-S Saves current buffer in its associated file.

CONTROL-X CONTROL-F Prompts for filename and visits (opens) that file.

CONTROL-X b Prompts for buffer name and selects it. If that buffer does not

exist, creates it.

CONTROL-X 4b Prompts for buffer name and displays that buffer in another

window. The existing window is not disturbed, although the

new window may overlap it.

CONTROL-X CONTROL-B Creates a buffer named *Buffer List* and displays it in another

window. The existing window is not disturbed, although the

new window may overlap it. The new buffer is not selected. In

the *Buffer List* buffer, each buffer’s data is displayed with its

name, size, mode(s), and original filename.

META-x rename-buffer Prompts for a new buffer name and assigns this new name to

the current buffer.

CONTROL-X CONTROL-Q Toggles the current buffer’s readonly status and the associated

%% Mode Line indicator.

META-x append-to-buffer Prompts for buffer name and appends Region to the end of that

buffer.

META-x prepend-to-buffer Prompts for buffer name and prepends Region to the beginning

of that buffer.

META-x copy-to-buffer Prompts for buffer name, deletes contents of that buffer, and

copies Region to that buffer.

META-x insert-buffer Prompts for buffer name and inserts entire contents of that

buffer in current buffer at Point.

CONTROL-X k Prompts for buffer name and deletes that buffer.

META-x kill-some-buffers Goes through the entire buffer list and offers the chance to

delete each buffer.


Table 7-12 lists commands that run shell commands in the background.

Table 7-13 lists commands that convert text from uppercase to lowercase, and vice

versa.

Table 7-11 Foreground shell commands

Command Result

META-! (exclamation point) Prompts for shell command, executes it, and displays the

output

CONTROL-U META-! (exclamation point) Prompts for shell command, executes it, and inserts the output at Point

META-| (vertical bar) Prompts for shell command, supplies Region as input to that

command, and displays output of command

CONTROL-U META-| (vertical bar) Prompts for shell command, supplies Region as input to that

command, deletes old Region, and inserts output of command

in place of Region

Table 7-12 Background shell commands

Command Result

META-x compile Prompts for shell command and runs that command in

the background, with output going to the buffer named

*compilation*

META-x kill-compilation Kills background process

Table 7-13 Case conversion commands

Command Result

META-l (lowercase “l”) Converts word to right of Point to lowercase

META-u Converts word to right of Point to uppercase

META-c Converts word to right of Point to initial caps

CONTROL-X CONTROL-L Converts Region to lowercase

CONTROL-X CONTROL-U Converts Region to uppercase


Table 7-14 lists commands that work in C mode.

Type META-x shell to create a buffer named *shell* and start a subshell. Table 7-15

lists commands that work on this buffer.

Table 7-14 C mode commands

Command Result

CONTROL-META-f Moves forward over expression

CONTROL-META-b Moves backward over expression

CONTROL-META-k Moves forward over expression and kills it

CONTROL-META-@ Sets Mark at the position CONTROL-META-f would move to, without changing Point

CONTROL-META-a Moves to beginning of the most recent function definition

CONTROL-META-e Moves to end of the next function definition

CONTROL-META-h Moves Point to beginning and Mark to end of current (or next, if between) function definition

Table 7-15 Shell mode commands

Command Result

RETURN Sends current line to the shell

CONTROL-C CONTROL-D Sends CONTROL-D to shell or its subshell

CONTROL-C CONTROL-C Sends CONTROL-C to shell or its subshell

CONTROL-C CONTROL-\ Sends quit signal to shell or its subshell

CONTROL-C CONTROL-U Kills text on the current line not yet completed

CONTROL-C CONTROL-R Scrolls back to beginning of last shell output, putting first line of output at the top of the window

CONTROL-C CONTROL-O (uppercase “O”) Deletes last batch of shell output

Exercises

1. Given a buffer full of English text, answer the following questions:

a. How would you change every instance of his to hers?

b. How would you make this change only in the final paragraph?

c. Is there a way to look at every usage in context before changing it?

d. How would you deal with the possibility that His might begin a sentence?

2. Which command moves the cursor to the end of the current paragraph?

Can you use this command to skip through the buffer in one-paragraph

steps?

3. Suppose that you get lost in the middle of typing a long sentence.

a. Is there an easy way to kill the botched sentence and start over?

b. What if only one word is incorrect? Is there an alternative to backspac-

ing one letter at a time?

4. After you have been working on a paragraph for a while, most likely

some lines will have become too short and others too long. Is there a

command to “neaten up” the paragraph without rebreaking all the lines

by hand?

5. Is there a way to change the entire contents of the buffer to capital letters?

Can you think of a way to change just one paragraph?

6. How would you reverse the order of two paragraphs?

7. How would you reverse two words?

8. Imagine that you saw a Usenet posting with something particularly funny

in it and saved the posting to a file. How would you incorporate this file

into your own buffer? What if you wanted to use only a couple of para-

graphs from the posting? How would you add > to the beginning of each

included line?

9. On the keyboard alone emacs has always offered a full set of editing

possibilities. Generally several techniques will accomplish the same

goal for any editing task. In the X environment the choice is enlarged

still further with a new group of mouse-oriented visual alternatives.

From these options you must select the way that you like to solve a

given editing puzzle best.

Consider this Shakespearean fragment:

1. Full fathom five thy father lies;

2. Of his bones are coral made;

3. Those are pearls that were his eyes:

4. Nothing of him that doth fade,

5. But doth suffer a sea-change

6. Into something rich and strange.

7. Sea-nymphs hourly ring his knell:

8. Ding-dong.

9. Hark! now I hear them--

10. Ding-dong, bell!


The following fragment has been typed with some errors:

1. Full fathiom five tyy father lies;

2. These are pearls that were his eyes:

3. Of his bones are coral made;

4. Nothin of him that doth fade,

5. But doth susffer a sea-change

6. Into something rich and strange.

7. Sea-nymphs hourly ring his knell:

8. Ding=dong.

9. Hard! now I hear them--

10. Ding-dong, bell!

Use only the keyboard to answer the following:

a. How many ways can you think of to move the cursor to the spelling

errors?

b. Once the cursor is on or near the errors, how many ways can you think

of to fix them?

c. Are there ways to fix errors without explicitly navigating to or search-

ing for them? How many can you think of?

d. Lines 2 and 3 in the retyped material are transposed. How many ways

can you think of to correct this situation?

Advanced Exercises

10. Assume that your buffer contains the C code shown here, with the Major

mode set for C and the cursor positioned at the end of the while line as

shown by the black square:

/*
 * Copy string s2 to s1. s1 must be large enough
 * return s1
 */
char *strcpy(char *s1, char *s2)
{
        char *os1;

        os1 = s1;
        while (*s1++ = *s2++)
                ;
        return os1;
}

/*
 * Copy source into dest, stopping after '\0' is copied, and
 * return a pointer to the '\0' at the end of dest. Then our caller
 * can catenate to the dest string without another strlen call.
 */
char *stpcpy (char *dest, char *source)
{
  while ((*dest++ = *source++) != '\0') ■
    ; /* void loop body */
  return (dest - 1);
}

a. Which command moves the cursor to the opening brace of strcpy?

Which command moves the cursor past the closing brace? Can you use

these commands to skip through the buffer in one-procedure steps?

b. Assume the cursor is just past the closing parenthesis of the while condi-

tion. How do you move to the matching opening parenthesis? How do

you move back to the matching close parenthesis again? Does the same

command set work for matched [] (square brackets) and {} (braces)?

How does this differ from the vim % command?

c. One procedure is indented in the Berkeley indention style; the other is

indented in the GNU style. Which command reindents a line in accor-

dance with the current indention style you have set up? How would you

reindent an entire procedure?

d. Suppose that you want to write five string procedures and intend to use

strcpy as a starting point for further editing. How would you make five

copies of the strcpy procedure?

e. How would you compile the code without leaving emacs?

Chapter 8 The Bourne Again Shell

In This Chapter

Startup Files . . . 271

Redirecting Standard Error . . . 275

Writing a Simple Shell Script . . . 278

Job Control . . . 285

Manipulating the Directory Stack . . . 302

Parameters and Variables . . . 290

Processes . . . 306

History . . . 308

Reexecuting and Editing Commands . . . 324

Functions . . . 327

Controlling bash: Features and Options . . . 344

Processing the Command Line . . . 334

This chapter picks up where Chapter 5 left off by focusing on the Bourne Again Shell (bash). It notes where the tcsh implementation of a feature differs from that of bash and, if appropriate, directs you to the page that discusses the alternative implementation. Chapter 10 expands on this chapter, exploring control flow commands and more advanced aspects of programming the Bourne Again Shell (bash). The bash home page is www.gnu.org/software/bash. The bash info page is a complete Bourne Again Shell reference.

The Bourne Again Shell and TC Shell (tcsh) are command interpreters and high-level programming languages. As command interpreters, they process commands you enter on the command line in response to a prompt. When you use the shell as a programming language, it processes commands stored in files called shell scripts. Like other languages, shells have variables and control flow commands (for example, for loops and if statements).

When you use a shell as a command interpreter, you can customize the environment you work in. You can make your prompt display the name of the working directory, create a function or an alias for cp that keeps it from overwriting certain kinds of files, take advantage of keyword variables to change aspects of how the shell works, and so on. You can also write shell scripts that do your bidding: anything from a one-line script that stores a long, complex command to a longer script that runs a set of reports, prints them, and mails you a reminder when the job is done. More complex shell scripts are themselves programs; they do not just run other programs. Chapter 10 has some examples of these types of scripts.

Most system shell scripts are written to run under bash (or dash; see below). If you

will ever work in single-user or recovery mode—when you boot the system or per-

form system maintenance, administration, or repair work, for example—it is a good

idea to become familiar with this shell.

This chapter expands on the interactive features of the shell described in Chapter 5,

explains how to create and run simple shell scripts, discusses job control, introduces

the basic aspects of shell programming, talks about history and aliases, and

describes command-line expansion. Chapter 9 covers interactive use of the TC Shell

and TC Shell programming, and Chapter 10 presents some more challenging shell

programming problems.

Background

The Bourne Again Shell is based on the Bourne Shell (the early UNIX shell; this book

refers to it as the original Bourne Shell to avoid confusion), which was written by

Steve Bourne of AT&T’s Bell Laboratories. Over the years the original Bourne Shell

has been expanded but it remains the basic shell provided with many commercial

versions of UNIX.

sh Shell Because of its long and successful history, the original Bourne Shell has been used to

write many of the shell scripts that help manage UNIX systems. Some of these

scripts appear in Linux as Bourne Again Shell scripts. Although the Bourne Again

Shell includes many extensions and features not found in the original Bourne Shell,

bash maintains compatibility with the original Bourne Shell so you can run Bourne

Shell scripts under bash. On UNIX systems the original Bourne Shell is named sh.

On some Linux systems sh is a symbolic link to bash, ensuring that scripts that

require the presence of the Bourne Shell still run. When called as sh, bash does its

best to emulate the original Bourne Shell.

dash Shell The bash executable file is about 800 kilobytes, has many features, and is well suited

as a user login shell. The dash (Debian Almquist) shell is about 100 kilobytes, offers

Bourne Shell compatibility for shell scripts (noninteractive use), and, because of its

size, can load and execute shell scripts much more quickly than bash. Most system

scripts are set up to run sh, which under some systems, such as Ubuntu, is a symbolic

link to dash. This setup allows the system to boot and run system shell scripts quickly.

Korn Shell System V UNIX introduced the Korn Shell (ksh), written by David Korn. This shell

extended many features of the original Bourne Shell and added many new features.


Some features of the Bourne Again Shell, such as command aliases and command-

line editing, are based on similar features from the Korn Shell.

POSIX The POSIX (Portable Operating System Interface) family of related standards is

being developed by PASC (IEEE’s Portable Application Standards Committee,

www.pasc.org). A comprehensive FAQ on POSIX, including many links, appears at

www.opengroup.org/austin/papers/posix_faq.html.

POSIX standard 1003.2 describes shell functionality. The Bourne Again Shell pro-

vides the features that match the requirements of this standard. Efforts are under

way to make the Bourne Again Shell fully comply with the POSIX standard. In the

meantime, if you invoke bash with the ––posix option, the behavior of the Bourne

Again Shell will closely match the POSIX requirements.
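As a quick check of the --posix option, the following sketch asks a child bash whether POSIX mode is on. It assumes bash is in your PATH; the variable names are illustrative only. If POSIXLY_CORRECT is already set in your environment, both lines will report on.

```shell
# [[ -o posix ]] succeeds when the shell's posix option is enabled.
check='if [[ -o posix ]]; then echo on; else echo off; fi'

default_mode=$(bash -c "$check")          # ordinary bash
posix_mode=$(bash --posix -c "$check")    # bash started with --posix

echo "default: $default_mode"
echo "--posix: $posix_mode"
```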

Shell Basics

This section covers writing and using startup files, redirecting standard error, writ-

ing and executing simple shell scripts, separating and grouping commands, imple-

menting job control, and manipulating the directory stack.

Startup Files

When a shell starts, it runs startup files to initialize itself. Which files the shell runs depends on whether it is a login shell, an interactive shell that is not a login shell (such as you get by giving the command bash), or a noninteractive shell (one used to execute a shell script). You must have read access to a startup file to execute the commands in it. Typically Linux distributions put appropriate commands in some of these files. This section covers bash startup files. See page 352 for information on tcsh startup files and page 936 for information on startup files under Mac OS X.

The bash-doc package holds Bourne Again Shell info documentation

tip Without the bash-doc package installed, the bash man page is available. Installing this package provides the more complete bash info pages. See page 36 for information on info and Appendix C for information on installing software packages.

chsh: changes your login shell

tip The person who sets up your account determines which shell you use when you first log in on the system or when you open a terminal emulator window in a GUI environment. Under most Linux distributions, bash is the default shell. You can run any shell you like once you are logged in. Enter the name of the shell you want to use (bash, tcsh, or another shell) and press RETURN; the next prompt will be that of the new shell. Give an exit command to return to the previous shell. Because shells you call in this manner are nested (one runs on top of the other), you will be able to log out only from your original shell. When you have nested several shells, keep giving exit commands until you reach your original shell. You will then be able to log out.

Use the chsh utility to change your login shell permanently. First give the command chsh. In response to the prompts, enter your password and the absolute pathname of the shell you want to use (/bin/bash, /bin/tcsh, or the pathname of another shell). When you change your login shell in this manner using a terminal emulator (page 26) under a GUI, subsequent terminal emulator windows will not reflect the change until you log out of the system and log back in.

Login Shells

The files covered in this section are executed by login shells and shells that you start

with the bash ––login option. Login shells are, by their nature, interactive.

/etc/profile The shell first executes the commands in /etc/profile. A user working with root

privileges can set up this file to establish systemwide default characteristics for users

running bash.

.bash_profile

.bash_login

.profile

Next the shell looks for ~/.bash_profile, ~/.bash_login, and ~/.profile (~/ is shorthand

for your home directory), in that order, executing the commands in the first of

these files it finds. You can put commands in one of these files to override the

defaults set in /etc/profile. A shell running on a virtual terminal does not execute

commands in these files.

.bash_logout When you log out, bash executes commands in the ~/.bash_logout file. This file

often holds commands that clean up after a session, such as those that remove

temporary files.

Interactive Nonlogin Shells

The commands in the preceding startup files are not executed by interactive, nonlogin shells. However, these shells inherit values from the login shell variables that are set by these startup files.

/etc/bashrc Although not called by bash directly, many ~/.bashrc files call /etc/bashrc. This

setup allows a user working with root privileges to establish systemwide default

characteristics for nonlogin bash shells.

.bashrc An interactive nonlogin shell executes commands in the ~/.bashrc file. Typically a

startup file for a login shell, such as .bash_profile, runs this file, so both login and

nonlogin shells run the commands in .bashrc.
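Because the choice of startup files depends on the kind of shell, it can help to ask a shell which kind it is. The following is a minimal sketch, not taken from this book: it relies on the $- parameter (which contains i in an interactive shell) and on the bash login_shell option reported by shopt. The variable names are illustrative.

```shell
# Commands for a child bash to run; single quotes keep $- from expanding here.
report='
case $- in
    *i*) mode=interactive ;;       # $- contains i in an interactive shell
    *)   mode=noninteractive ;;
esac
if shopt -q login_shell; then kind=login; else kind=nonlogin; fi
echo "$kind $mode"
'

from_dash_c=$(bash -c "$report")   # a shell started with -c is neither
echo "$from_dash_c"
```

Pasting the same commands at a prompt in a terminal emulator window typically reports nonlogin interactive, whereas a shell you get by logging in on a console typically reports login interactive.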

Noninteractive Shells

The commands in the previously described startup files are not executed by noninteractive shells, such as those that run shell scripts. However, these shells inherit login shell variables that are set by these startup files.

BASH_ENV Noninteractive shells look for the environment variable BASH_ENV (or ENV if the

shell is called as sh) and execute commands in the file named by this variable.
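The effect of BASH_ENV can be seen with a small experiment. This sketch assumes mktemp is available; the filenames env_file and show and the variable greeting are made up for the demonstration.

```shell
workdir=$(mktemp -d)

# A file of commands for noninteractive shells and a script that uses them.
printf '%s\n' 'greeting="set in the BASH_ENV file"' > "$workdir/env_file"
printf '%s\n' 'echo "greeting=$greeting"'          > "$workdir/show"

unset greeting
without_env=$(BASH_ENV= bash "$workdir/show")                   # nothing to read
with_env=$(BASH_ENV=$workdir/env_file bash "$workdir/show")     # reads env_file first

echo "$without_env"
echo "$with_env"
rm -r "$workdir"
```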

Setting Up Startup Files

Although many startup files and types of shells exist, usually all you need are the

.bash_profile and .bashrc files in your home directory. Commands similar to the


following in .bash_profile run commands from .bashrc for login shells (when

.bashrc exists). With this setup, the commands in .bashrc are executed by login and

nonlogin shells.

if [ -f ~/.bashrc ]; then . ~/.bashrc; fi

The [ –f ~/.bashrc ] tests whether the file named .bashrc in your home directory

exists. See pages 399, 401, and 854 for more information on test and its synonym [ ].

See page 274 for information on the . (dot) builtin.

Sample .bash_profile and .bashrc files follow. Some commands used in these files are

not covered until later in this chapter. In any startup file, you must export variables

and functions that you want to be available to child processes. For more information

refer to “Locality of Variables” on page 436.

$ cat ~/.bash_profile
if [ -f ~/.bashrc ]; then
    . ~/.bashrc                 # Read local startup file if it exists
fi
PATH=$PATH:.                    # Add the working directory to PATH
export PS1='[\h \W \!]\$ '      # Set prompt

The first command in the preceding .bash_profile file executes the commands in the

user’s .bashrc file if it exists. The next command adds to the PATH variable

(page 297). Typically PATH is set and exported in /etc/profile so it does not need to

be exported in a user’s startup file. The final command sets and exports PS1

(page 299), which controls the user’s prompt.

A sample .bashrc file is shown on the next page. The first command executes the

commands in the /etc/bashrc file if it exists. Next the file sets and exports the

LANG (page 304) and VIMINIT (for vim initialization) variables and defines sev-

eral aliases. The final command defines a function (page 327) that swaps the names

of two files.

Use .bash_profile to set PATH

tip Because commands in .bashrc may be executed many times, and because subshells inherit

exported variables, it is a good idea to put commands that add to existing variables in the

.bash_profile file. For example, the following command adds the bin subdirectory of the home

directory to PATH (page 297) and should go in .bash_profile:

PATH=$PATH:$HOME/bin

When you put this command in .bash_profile and not in .bashrc, the string is added to the PATH

variable only once, when you log in.

Modifying a variable in .bash_profile causes changes you make in an interactive session to prop-

agate to subshells. In contrast, modifying a variable in .bashrc overrides changes inherited from

a parent shell.
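If you do need to add a directory to PATH from a file that may be read more than once, one common idiom (not part of the sample files in this chapter) is to test whether the directory is already present before appending it. The function name add_once is hypothetical.

```shell
# Append a directory to PATH only if PATH does not already list it.
add_once() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present: do nothing
        *) PATH=$PATH:$1 ;;        # otherwise append it
    esac
}

add_once "$HOME/bin"
add_once "$HOME/bin"               # the second call changes nothing
echo "$PATH"
```

Wrapping PATH in colons lets the pattern match a directory at the start, in the middle, or at the end of the list.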


$ cat ~/.bashrc
if [ -f /etc/bashrc ]; then
    source /etc/bashrc          # read global startup file if it exists
fi
set -o noclobber                # prevent overwriting files
unset MAILCHECK                 # turn off "you have new mail" notice
export LANG=C                   # set LANG variable
export VIMINIT='set ai aw'      # set vim options
alias df='df -h'                # set up aliases
alias rm='rm -i'                # always do interactive rm's
alias lt='ls -ltrh | tail'
alias h='history | tail'
alias ch='chmod 755 '

function switch()               # a function to exchange the names
{                               # of two files
    local tmp=$$switch
    mv "$1" $tmp
    mv "$2" "$1"
    mv $tmp "$2"
}

. (Dot) or source: Runs a Startup File in the

Current Shell

After you edit a startup file such as .bashrc, you do not have to log out and log in

again to put the changes into effect. Instead, you can run the startup file using the .

(dot) or source builtin (they are the same command under bash; only source is avail-

able under tcsh [page 390]). As with all other commands, the . must be followed by

a SPACE on the command line. Using . or source is similar to running a shell script,

except these commands run the script as part of the current process. Consequently,

when you use . or source to run a script, changes you make to variables from within

the script affect the shell you run the script from. If you ran a startup file as a regu-

lar shell script and did not use the . or source builtin, the variables created in the

startup file would remain in effect only in the subshell running the script—not in the

shell you ran the script from. You can use the . or source command to run any shell

script—not just a startup file—but undesirable side effects (such as changes in the

values of shell variables you rely on) may occur. For more information refer to

“Locality of Variables” on page 436.

In the following example, .bashrc sets several variables and sets PS1, the prompt, to

the name of the host. The . builtin puts the new values into effect.

$ cat ~/.bashrc

export TERM=vt100 # set the terminal type

export PS1="$(hostname -f): " # set the prompt string

export CDPATH=:$HOME # add HOME to CDPATH string

stty kill '^u' # set kill line to control-u

$ . ~/.bashrc

bravo.example.com:
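The difference between running a file as a script and running it with the . builtin can be sketched as follows. The file setcolor and the variable color are invented for this demonstration, which assumes mktemp is available.

```shell
workdir=$(mktemp -d)
echo 'color=blue' > "$workdir/setcolor"

color=red
bash "$workdir/setcolor"        # runs in a child process
after_child=$color              # the calling shell's variable is unchanged

. "$workdir/setcolor"           # runs in the current shell
after_dot=$color                # the assignment took effect here

echo "after child: $after_child, after dot: $after_dot"
rm -r "$workdir"
```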


Commands That Are Symbols

The Bourne Again Shell uses the symbols (, ), [, ], and $ in a variety of ways. To

minimize confusion, Table 8-1 lists the most common use of each of these symbols,

even though some of them are not introduced until later in this book.

Redirecting Standard Error

Chapter 5 covered the concept of standard output and explained how to redirect

standard output of a command. In addition to standard output, commands can

send output to standard error. A command can send error messages to standard

error to keep them from getting mixed up with the information it sends to standard

output.

Just as it does with standard output, by default the shell directs standard error to

the screen. Unless you redirect one or the other, you may not know the difference

between the output a command sends to standard output and the output it sends to

standard error. This section describes the syntax used by the Bourne Again Shell to

redirect standard error and to distinguish between standard output and standard

error. See page 359 if you are using the TC Shell.

File descriptors A file descriptor is the place a program sends its output to and gets its input from.

When you execute a program, Linux opens three file descriptors for the program: 0

(standard input), 1 (standard output), and 2 (standard error). The redirect output

symbol (> [page 126]) is shorthand for 1>, which tells the shell to redirect standard

output. Similarly < (page 128) is short for 0<, which redirects standard input. The

symbols 2> redirect standard error. For more information refer to “File Descrip-

tors” on page 431.

The following examples demonstrate how to redirect standard output and standard

error to different files and to the same file. When you run the cat utility with the

name of a file that does not exist and the name of a file that does exist, cat sends an

error message to standard error and copies the file that does exist to standard out-

put. Unless you redirect them, both messages appear on the screen.

Table 8-1 Builtin commands that are symbols

Symbol Command

( ) Subshell (page 284)

$( ) Command substitution (page 340)

(( )) Arithmetic evaluation; a synonym for let (use when the enclosed value con-

tains an equal sign; page 460)

$(( )) Arithmetic expansion (not for use with an enclosed equal sign; page 338)

[ ] The test command (pages 399, 401, and 854)

[[ ]] Conditional expression; similar to [ ] but adds string comparisons (page 461)


$ cat y

This is y.

$ cat x

cat: x: No such file or directory

$ cat x y

cat: x: No such file or directory

This is y.

When you redirect standard output of a command, output sent to standard error is

not affected and still appears on the screen.

$ cat x y > hold

cat: x: No such file or directory

$ cat hold

This is y.

Similarly, when you send standard output through a pipe, standard error is not

affected. The following example sends standard output of cat through a pipe to tr,

which in this example converts lowercase characters to uppercase. (See page 864 for

more information on tr.) The text that cat sends to standard error is not translated

because it goes directly to the screen rather than through the pipe.

$ cat x y | tr "[a-z]" "[A-Z]"

cat: x: No such file or directory

THIS IS Y.

The following example redirects standard output and standard error to different

files. The token following 2> tells the shell where to redirect standard error (file

descriptor 2). The token following 1> tells the shell where to redirect standard out-

put (file descriptor 1). You can use > in place of 1>.

$ cat x y 1> hold1 2> hold2

$ cat hold1

This is y.

$ cat hold2

cat: x: No such file or directory

Combining standard output and standard error

In the next example, the &> token redirects standard output and standard error to

a single file. The >& token performs the same function under tcsh (page 359).

$ cat x y &> hold

$ cat hold

cat: x: No such file or directory

This is y.

Duplicating a file descriptor

In the next example, first 1> redirects standard output to hold and then 2>&1

declares file descriptor 2 to be a duplicate of file descriptor 1. As a result, both stan-

dard output and standard error are redirected to hold.

$ cat x y 1> hold 2>&1

$ cat hold

cat: x: No such file or directory

This is y.

In this case, 1> hold precedes 2>&1. If they had been listed in the opposite order,

standard error would have been made a duplicate of standard output before standard


output was redirected to hold. Only standard output would have been redirected to

hold in that scenario.
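The effect of the order of the two redirections can be checked with a short experiment. In this sketch the function both is a hypothetical stand-in for cat x y: it writes one line to standard output and one to standard error. The filenames right and wrong are also made up.

```shell
workdir=$(mktemp -d)

# A stand-in for cat x y: one line to standard output, one to standard error.
both() { echo "to standard output"; echo "to standard error" 1>&2; }

both 1> "$workdir/right" 2>&1            # both lines land in the file
right=$(cat "$workdir/right")

# Opposite order: 2 is duplicated from 1 before 1 is pointed at the file,
# so the error line follows the old standard output (captured here by $( )).
escaped=$(both 2>&1 1> "$workdir/wrong")
wrong=$(cat "$workdir/wrong")

echo "file, 1> first:   $right"
echo "file, 2>&1 first: $wrong"
echo "escaped the file: $escaped"
rm -r "$workdir"
```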

The next example declares file descriptor 2 to be a duplicate of file descriptor 1 and

sends the output for file descriptor 1 through a pipe to the tr command.

$ cat x y 2>&1 | tr "[a-z]" "[A-Z]"

CAT: X: NO SUCH FILE OR DIRECTORY

THIS IS Y.

Sending errors to standard error

You can use 1>&2 to redirect standard output of a command to standard error.

Shell scripts use this technique to send the output of echo to standard error. In the

following script, standard output of the first echo is redirected to standard error:

$ cat message_demo

echo This is an error message. 1>&2

echo This is not an error message.

If you redirect standard output of message_demo, error messages such as the one pro-

duced by the first echo appear on the screen because you have not redirected standard

error. Because standard output of a shell script is frequently redirected to another file,

you can use this technique to display on the screen any error messages generated by

the script. The lnks script (page 406) uses this technique. You can also use the exec

builtin to create additional file descriptors and to redirect standard input, standard

output, and standard error of a shell script from within the script (page 451).
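To see which line of message_demo goes where, you can discard one stream at a time. The following sketch wraps the two echo commands in a shell function of the same name so that the example is self-contained; the variable names are illustrative.

```shell
# The two commands from message_demo, wrapped in a function for convenience.
message_demo() {
    echo This is an error message. 1>&2
    echo This is not an error message.
}

to_stdout=$(message_demo 2> /dev/null)         # keep standard output only
to_stderr=$(message_demo 2>&1 1> /dev/null)    # keep standard error only

echo "standard output: $to_stdout"
echo "standard error:  $to_stderr"
```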

The Bourne Again Shell supports the redirection operators shown in Table 8-2.

Table 8-2 Redirection operators

Operator      Meaning

< filename    Redirects standard input from filename.

> filename    Redirects standard output to filename unless filename exists and noclobber (page 129) is set. If noclobber is not set, this redirection creates filename if it does not exist and overwrites it if it does exist.

>| filename   Redirects standard output to filename, even if the file exists and noclobber (page 129) is set.

>> filename   Redirects and appends standard output to filename, creating filename if it does not exist. Under bash, noclobber (page 129) does not affect this operator.

&> filename   Redirects standard output and standard error to filename.

<&m           Duplicates standard input from file descriptor m (page 432).

[n]>&m        Duplicates standard output or file descriptor n if specified from file descriptor m (page 432).

[n]<&-        Closes standard input or file descriptor n if specified (page 432).

[n]>&-        Closes standard output or file descriptor n if specified.
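The interaction between noclobber and the >, >|, and >> operators can be tried out in a scratch directory. This is a sketch under the assumption that mktemp is available; the filename hold follows the earlier examples and the variable names are illustrative.

```shell
workdir=$(mktemp -d)
f=$workdir/hold
echo first > "$f"

set -o noclobber
# With noclobber set, > refuses to overwrite an existing file.
if (echo second > "$f") 2> /dev/null; then plain=overwrote; else plain=refused; fi

echo third >| "$f"              # >| overrides noclobber
forced=$(cat "$f")

echo fourth >> "$f"             # >> appends to the existing file
appended=$(cat "$f")
set +o noclobber

echo "plain >: $plain"
echo "after >|: $forced"
echo "after >>:"
echo "$appended"
rm -r "$workdir"
```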


Writing a Simple Shell Script

A shell script is a file that holds commands that the shell can execute. The commands

in a shell script can be any commands you can enter in response to a shell prompt.

For example, a command in a shell script might run a Linux utility, a compiled pro-

gram, or another shell script. Like the commands you give on the command line, a

command in a shell script can use ambiguous file references and can have its input or

output redirected from or to a file or sent through a pipe. You can also use pipes and

redirection with the input and output of the script itself.

In addition to the commands you would ordinarily use on the command line, control

flow commands (also called control structures) find most of their use in shell scripts.

This group of commands enables you to alter the order of execution of commands in

a script in the same way you would alter the order of execution of statements using a

structured programming language. Refer to “Control Structures” on page 398 (bash)

and page 378 (tcsh) for specifics.

The shell interprets and executes the commands in a shell script, one after another.

Thus a shell script enables you to simply and quickly initiate a complex series of

tasks or a repetitive procedure.

chmod: Makes a File Executable

To execute a shell script by giving its name as a command, you must have permission

to read and execute the file that contains the script (refer to “Access Permissions” on

page 93). Read permission enables you to read the file that holds the script. Execute

permission tells the shell and the system that the owner, group, and/or public has

permission to execute the file; it implies that the content of the file is executable.

When you create a shell script using an editor, the file does not typically have its

execute permission set. The following example shows a file named whoson that

contains a shell script:

$ cat whoson

date

echo "Users Currently Logged In"

who

$ ./whoson

bash: ./whoson: Permission denied

You cannot execute whoson by giving its name as a command because you do not

have execute permission for the file. The shell does not recognize whoson as an exe-

cutable file and issues the error message Permission denied when you try to execute

it. (See the tip on the next page if you get a command not found error message.)

When you give the filename as an argument to bash (bash whoson), bash takes the

argument to be a shell script and executes it. In this case bash is executable and

whoson is an argument that bash executes so you do not need to have execute per-

mission to whoson. You must have read permission. You can do the same with tcsh

script files.


The chmod utility changes the access privileges associated with a file. Figure 8-1

shows ls with the –l option displaying the access privileges of whoson before and

after chmod gives execute permission to the file’s owner.

The first ls displays a hyphen (–) as the fourth character, indicating that the owner

does not have permission to execute the file. Next chmod gives the owner execute per-

mission: u+x causes chmod to add (+) execute permission (x) for the owner (u). (The u

stands for user, although it means the owner of the file.) The second argument is the

name of the file. The second ls shows an x in the fourth position, indicating that the

owner has execute permission.

If other users will execute the file, you must also change group and/or public access

permissions for the file. Any user must have execute access to use the file’s name as

a command. If the file is a shell script, the user trying to execute the file must have

read access to the file as well. You do not need read access to execute a binary exe-

cutable (compiled program).

Figure 8-1 Using chmod to make a shell script executable

$ ls -l whoson

-rw-rw-r-- 1 max group 40 May 24 11:30 whoson

$ chmod u+x whoson

$ ls -l whoson

-rwxrw-r-- 1 max group 40 May 24 11:30 whoson

$ ./whoson

Fri May 22 11:40:49 PDT 2009

Users Currently Logged In

zach pts/7 May 21 18:17

hls pts/1 May 22 09:59

sam pts/12 May 22 06:29 (bravo.example.com)

max pts/4 May 22 09:08

Tip: Command not found?

If you give the name of a shell script as a command without including the leading ./, the shell typically displays the following error message:

$ whoson

bash: whoson: command not found

This message indicates the shell is not set up to search for executable files in the working directory.

Give this command instead:

$ ./whoson

The ./ tells the shell explicitly to look for an executable file in the working directory. To change the

environment so the shell searches the working directory automatically, see the section about PATH

on page 297.

280 Chapter 8 The Bourne Again Shell

The final command in Figure 8-1 shows the shell executing the file when its name is

given as a command. For more information refer to “Access Permissions” on

page 93 as well as the discussions of ls and chmod in Part V.
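The sequence in Figure 8-1 can also be checked from inside a script. The following is a minimal sketch, assuming a scratch directory created with mktemp and a hypothetical script named demo_script (neither appears in the text above). It records whether the file is executable before and after chmod u+x and runs the script both ways.

```shell
# Sketch: make a newly created script executable and run it by name.
# demo_script and the scratch directory are hypothetical names for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A file created this way normally has no execute permission.
printf '%s\n' 'echo "Users Currently Logged In"' > demo_script

if [ -x demo_script ]; then before=yes; else before=no; fi

# Running it as an argument to bash needs only read permission.
via_bash=$(bash demo_script)

# Add (+) execute permission (x) for the owner (u), then run it by name.
chmod u+x demo_script
if [ -x demo_script ]; then after=yes; else after=no; fi
by_name=$(./demo_script)

echo "executable before: $before, after: $after"
echo "output: $by_name"
```

The -x test asks the same question ls -l answers visually: whether the current user may execute the file.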

#! Specifies a Shell

You can put a special sequence of characters on the first line of a shell script to tell

the operating system which shell (or other program) should execute the file. Because

the operating system checks the initial characters of a program before attempting to

execute it using exec, these characters save the system from making an unsuccessful

attempt. If #! are the first two characters of a script, the system interprets the char-

acters that follow as the absolute pathname of the utility that should execute the

script. This can be the pathname of any program, not just a shell. The following

example specifies that bash should run the script:

$ cat bash_script

#!/bin/bash

echo "This is a Bourne Again Shell script."

The #! characters are useful if you have a script that you want to run with a shell

other than the shell you are running the script from. The next example shows a

script that should be executed by tcsh:

$ cat tcsh_script

#!/bin/tcsh

echo "This is a tcsh script."

set person = zach

echo "person is $person"

Because of the #! line, the operating system ensures that tcsh executes the script no

matter which shell you run it from.

You can use ps –f within a shell script to display the name of the shell that is execut-

ing the script. The three lines that ps displays in the following example show the

process running the parent bash shell, the process running the tcsh script, and the

process running the ps command:

$ cat tcsh_script2

#!/bin/tcsh

ps -f

$ ./tcsh_script2

UID PID PPID C STIME TTY TIME CMD

max 3031 3030 0 Nov16 pts/4 00:00:00 -bash

max 9358 3031 0 21:13 pts/4 00:00:00 /bin/tcsh ./tcsh_script2

max 9375 9358 0 21:13 pts/4 00:00:00 ps -f

If you do not follow #! with the name of an executable program, the shell reports

that it cannot find the command that you asked it to run. You can optionally follow

#! with SPACEs. If you omit the #! line and try to run, for example, a tcsh script from

bash, the script will run under bash and may generate error messages or not run

properly. See page 578 for an example of a stand-alone sed script that uses #!.
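Because the pathname after #! can name any program, a file can be "executed" by a utility that is not a shell at all. The following sketch assumes a hypothetical file named self_print and looks up the location of cat with command -v, since that pathname differs between systems.

```shell
# Sketch: a #! line can name any program, not only a shell.
# self_print is a hypothetical filename; the path to cat is looked up
# because its location differs between systems.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat_path=$(command -v cat)        # absolute pathname, for example /bin/cat

# First line: #! followed by the interpreter. Second line: plain data.
printf '#!%s\nThis line is data, not a shell command.\n' "$cat_path" > self_print
chmod u+x self_print

# The system runs cat with ./self_print as its argument,
# so cat displays the file itself, #! line included.
output=$(./self_print)
echo "$output"
```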

# Begins a Comment

Comments make shell scripts and all code easier to read and maintain by you and

others. The comment syntax is common to both the Bourne Again and the TC Shells.

If a pound sign (#) in the first character position of the first line of a script is not immediately followed by an exclamation point (!) or if an unquoted pound sign begins a word in any other location in a script, the shell interprets it as the beginning of a comment. The shell then ignores everything between the pound sign and the end of the line (the

next NEWLINE character).
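A short sketch of the comment rule follows. It relies on standard shell behavior: a pound sign starts a comment only when it is unquoted and begins a word. The variable names are illustrative.

```shell
# This whole line is a comment; the shell ignores it.
greeting="hello"                  # text after an unquoted # is ignored too

quoted="# not a comment"          # a quoted # is an ordinary character
inside_word=$(echo item#3)        # a # in the middle of a word is literal

echo "$greeting"
echo "$quoted"
echo "$inside_word"
```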

Executing a Shell Script

fork and exec system calls

A command on the command line causes the shell to fork a new process, creating a

duplicate of the shell process (a subshell). The new process attempts to exec (exe-

cute) the command. Like fork, the exec routine is executed by the operating system

(a system call). If the command is a binary executable program, such as a compiled

C program, exec succeeds and the system overlays the newly created subshell with

the executable program. If the command is a shell script, exec fails. When exec fails,

the command is assumed to be a shell script, and the subshell runs the commands in

the script. Unlike a login shell, which expects input from the command line, the sub-

shell takes its input from a file—namely, the shell script.

As discussed earlier, you can run commands in a shell script file that you do not

have execute permission for by using a bash command to exec a shell that runs the

script directly. In the following example, bash creates a new shell that takes its input

from the file named whoson:

$ bash whoson

Because the bash command expects to read a file containing commands, you do not

need execute permission for whoson. (You do need read permission.) Even though

bash reads and executes the commands in whoson, standard input, standard output,

and standard error remain directed from/to the terminal.

Although you can use bash to execute a shell script, this technique causes the script

to run more slowly than giving yourself execute permission and directly invoking the

script. Users typically prefer to make the file executable and run the script by typing

its name on the command line. It is also easier to type the name, and this practice is

consistent with the way other kinds of programs are invoked (so you do not need to

know whether you are running a shell script or an executable file). However, if bash

is not your interactive shell or if you want to see how the script runs with different

shells, you may want to run a script as an argument to bash or tcsh.

Separating and Grouping Commands

Whether you give the shell commands interactively or write a shell script, you must

separate commands from one another. This section, which applies to the Bourne

Again and the TC Shells, reviews the ways to separate commands that were covered

in Chapter 5 and introduces a few new ones.

; and NEWLINE Separate Commands

The NEWLINE character is a unique command separator because it initiates execution

of the command preceding it. You have seen this behavior throughout this book

each time you press the RETURN key at the end of a command line.

The semicolon (;) is a command separator that does not initiate execution of a

command and does not change any aspect of how the command functions. You can

execute a series of commands sequentially by entering them on a single command

line and separating each from the next with a semicolon (;). You initiate execution

of the sequence of commands by pressing RETURN:

$ x ; y ; z

If x, y, and z are commands, the preceding command line yields the same results as the next three commands. The difference is that in the next example the shell issues a prompt after each of the commands (x, y, and z) finishes executing, whereas the preceding command line causes the shell to issue a prompt only after z is complete:

$ x

$ y

$ z

Whitespace

Although the whitespace around the semicolons in the earlier example makes the

command line easier to read, it is not necessary. None of the command separators

needs to be surrounded by SPACEs or TABs.
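The equivalence of the semicolon and NEWLINE separators can be sketched in a script, where prompts play no role. Here echo stands in for the placeholder commands x, y, and z, and the variable names are illustrative.

```shell
# Three commands on one line, separated by semicolons...
one_line=$(echo x ; echo y ; echo z)

# ...yield the same results as the same commands on separate lines.
separate_lines=$(
    echo x
    echo y
    echo z
)

# Whitespace around the separators is optional.
no_spaces=$(echo x;echo y;echo z)

[ "$one_line" = "$separate_lines" ] && echo "same output"
```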

\ Continues a Command

When you enter a long command line and the cursor reaches the right side of the

screen, you can use a backslash (\) character to continue the command on the next

line. The backslash quotes, or escapes, the NEWLINE character that follows it so the

shell does not treat the NEWLINE as a command terminator. Enclosing a backslash

within single quotation marks or preceding it with another backslash turns off the

power of a backslash to quote special characters such as NEWLINE (not tcsh; see

prompt2 on page 374). Enclosing a backslash within double quotation marks has

no effect on the power of the backslash (not tcsh).

Although you can break a line in the middle of a word (token), it is typically simpler

to break a line immediately before or after whitespace.

optional

Under bash, you can enter a RETURN in the middle of a quoted string on a command line without using a backslash. (See prompt2 on page 374 for tcsh behavior.) The NEWLINE (RETURN) you enter will then be part of the string:

$ echo "Please enter the three values

> required to complete the transaction."

Please enter the three values

required to complete the transaction.

In the three examples in this section, bash does not interpret RETURN as a command terminator because it occurs within a quoted string. The greater than (>) sign is a secondary prompt (PS2; page 300) indicating the shell is waiting for you to continue the unfinished command. In the next example, the first RETURN is quoted (escaped), so the shell removes the backslash-NEWLINE pair and continues the string on the same line instead of interpreting the NEWLINE literally.

$ echo "Please enter the three values \

> required to complete the transaction."

Please enter the three values required to complete the transaction.

Single quotation marks cause the shell to interpret a backslash literally:

$ echo 'Please enter the three values \

> required to complete the transaction.'

Please enter the three values \

required to complete the transaction.
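The same three cases can be written in a script, where no secondary prompt appears. This sketch captures each string in an illustrative variable so the results can be compared.

```shell
# Backslash-NEWLINE inside double quotation marks: the line continues.
continued=$(echo "Please enter the three values \
required to complete the transaction.")

# A bare NEWLINE inside quotation marks becomes part of the string.
embedded=$(echo "Please enter the three values
required to complete the transaction.")

# Inside single quotation marks the backslash is literal, so both the
# backslash and the NEWLINE stay in the string.
literal=$(echo 'Please enter the three values \
required to complete the transaction.')

printf '%s\n---\n' "$continued" "$embedded" "$literal"
```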

| and & Separate Commands and Do Something Else

The pipe symbol (|) and the background task symbol (&) are also command sepa-

rators. They do not start execution of a command but do change some aspect of

how the command functions. The pipe symbol alters the source of standard input

or the destination of standard output. The background task symbol causes the

shell to execute the task in the background and display a prompt immediately; you

can continue working on other tasks.

Each of the following command lines initiates a single job comprising three tasks:

$ x | y | z

$ ls -l | grep tmp | less

In the first job, the shell redirects standard output of task x to standard input of

task y and redirects y’s standard output to z’s standard input. Because it runs the

entire job in the foreground, the shell does not display a prompt until task z runs to

completion: Task z does not finish until task y finishes, and task y does not finish

until task x finishes. In the second job, task x is an ls -l command, task y is grep tmp, and task z is the pager less. The shell displays, through less, those lines of a long (wide) listing of the working directory that contain the string tmp (typically the entries for files with tmp in their names).

The next command line executes tasks d and e in the background and task f in the

foreground:

$ d & e & f

[1] 14271

[2] 14272

The shell displays the job number between brackets and the PID number for each

process running in the background. It displays a prompt as soon as f finishes, which

may be before d or e finishes.

Before displaying a prompt for a new command, the shell checks whether any back-

ground jobs have completed. For each completed job, the shell displays its job number,

the word Done, and the command line that invoked the job; the shell then displays a

prompt. When the job numbers are listed, the number of the last job started is followed

by a + character and the job number of the previous job is followed by a – character.

Other jobs are followed by a SPACE character. After running the last command, the shell

displays the following lines before issuing a prompt:

[1]- Done d

[2]+ Done e

The next command line executes all three tasks as background jobs. The shell displays

a prompt immediately:

$ d & e & f &

[1] 14290

[2] 14291

[3] 14292

You can use pipes to send the output from one task to the next task and an amper-

sand (&) to run the entire job as a background task. Again the shell displays the

prompt immediately. The shell regards the commands joined by a pipe as a single

job. That is, it treats all pipes as single jobs, no matter how many tasks are connected

with the pipe (|) symbol or how complex they are. The Bourne Again Shell reports

only one process in the background (although there are three):

$ d | e | f &

[1] 14295

The TC Shell shows three processes (all belonging to job 1) placed in the background:

tcsh $ d | e | f &

[1] 14302 14304 14306
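In a script, the same ideas can be sketched without job-number messages. The example below assumes hypothetical filenames in a scratch directory, runs a three-task pipeline as one background job, and uses the wait builtin (not covered above) to pause until that job finishes.

```shell
# Sketch: a pipeline followed by & runs as a single background job.
# The filenames are hypothetical.
result=$(mktemp)                  # scratch file for the pipeline's output
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch tmp_notes report tmp_draft

# Three tasks joined by pipes, all placed in the background by the trailing &.
# Plain ls (not ls -l) is used so grep sees only the filenames.
ls | grep tmp | wc -l > "$result" &
job_pid=$!                        # PID of the last process in the pipeline

echo "the shell did not wait for the pipeline"

wait "$job_pid"                   # now wait for the background job to finish
count=$(( $(cat "$result") ))     # arithmetic expansion strips any padding
echo "names containing tmp: $count"
```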

optional

() Groups Commands

You can use parentheses to group commands. The shell creates a copy of itself,

called a subshell, for each group. It treats each group of commands as a job and

creates a new process to execute each command (refer to “Process Structure” on

page 306 for more information on creating subshells). Each subshell (job) has its

own environment, meaning that it has its own set of variables whose values can

differ from those found in other subshells.

The following command line executes commands a and b sequentially in the back-

ground while executing c in the background. The shell displays a prompt immediately.

$ (a ; b) & c &

[1] 15520

[2] 15521

The preceding example differs from the earlier example d & e & f & in that tasks a

and b are initiated sequentially, not concurrently.

Similarly the following command line executes a and b sequentially in the back-

ground and, at the same time, executes c and d sequentially in the background. The

subshell running a and b and the subshell running c and d run concurrently. The shell

displays a prompt immediately.

$ (a ; b) & (c ; d) &

[1] 15528

[2] 15529

The next script copies one directory to another. The second pair of parentheses creates

a subshell to run the commands following the pipe. Because of these parentheses, the

output of the first tar command is available for the second tar command despite the

intervening cd command. Without the parentheses, the output of the first tar command

would be sent to cd and lost because cd does not process input from standard input.

The shell variables $1 and $2 represent the first and second command-line arguments

(page 441), respectively. The first pair of parentheses, which creates a subshell to run

the first two commands, allows users to call cpdir with relative pathnames. Without

them, the first cd command would change the working directory of the script (and con-

sequently the working directory of the second cd command). With them, only the

working directory of the subshell is changed.

$ cat cpdir

(cd $1 ; tar -cf - . ) | (cd $2 ; tar -xvf - )

$ ./cpdir /home/max/sources /home/max/memo/biblio

The cpdir command line copies the files and directories in the /home/max/sources

directory to the directory named /home/max/memo/biblio. This shell script is

almost the same as using cp with the –r option. Refer to Part V for more informa-

tion on cp and tar.
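The isolation that makes cpdir work can be seen in a smaller sketch. The variable names are illustrative. Command substitution, used on one line here to capture output, also runs its commands in a subshell.

```shell
# Sketch: commands run in a subshell cannot change the working directory
# or the variables of the shell that started them.
start_dir=$PWD
color=blue

inner_dir=$(cd / ; pwd)      # the cd happens only inside the subshell
( color=red ; cd / )         # so do this assignment and this cd

echo "subshell was in: $inner_dir"
echo "still in:        $PWD"
echo "color is still:  $color"
```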

Job Control

A job is a command pipeline. You run a simple job whenever you give the shell a

command. For example, if you type date on the command line and press RETURN, you

have run a job. You can also create several jobs with multiple commands on a single

command line:

$ find . -print | sort | lpr & grep -l max /tmp/* > maxfiles &

[1] 18839

[2] 18876

The portion of the command line up to the first & is one job consisting of three pro-

cesses connected by pipes: find (page 688), sort (pages 54 and 817), and lpr (pages 51

and 742). The second job is a single process running grep. The trailing & characters

put each job in the background, so bash does not wait for them to complete before

displaying a prompt.

Using job control you can move commands from the foreground to the background

(and vice versa), stop commands temporarily, and list all commands that are run-

ning in the background or stopped.

jobs: Lists Jobs

The jobs builtin lists all background jobs. In the following example, the sleep command runs in the

background and creates a background job that jobs reports on:

$ sleep 60 &

[1] 7809

$ jobs

[1] + Running sleep 60 &
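A script can start and track background jobs in a similar way. The following sketch assumes a hypothetical scratch file named by job_list and uses the wait builtin, which is not covered above, in place of watching for the Done message. The fg and bg builtins and CONTROL-Z are meant for interactive use and are not shown.

```shell
# Sketch: start a background job, list it, then wait for it.
# job_list names a hypothetical scratch file.
job_list=$(mktemp)

sleep 2 &                  # start a background job
bg_pid=$!                  # PID of that job

jobs > "$job_list"         # jobs runs in this shell; save its report
cat "$job_list"

wait "$bg_pid"             # block until the background job finishes
echo "sleep exited with status $?"
```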

fg: Brings a Job to the Foreground

The shell assigns a job number to each command you run in the background. For

each job run in the background, the shell lists the job number and PID number

immediately, just before it issues a prompt:

$ xclock &

[1] 1246

$ date &

[2] 1247

$ Tue Dec 2 11:44:40 PST 2008

[2]+ Done date

$ find /usr -name ace -print > findout &

[2] 1269

$ jobs

[1]- Running xclock &

[2]+ Running find /usr -name ace -print > findout &

Job numbers, which are discarded when a job is finished, can be reused. When you

start or put a job in the background, the shell assigns a job number that is one more

than the highest job number in use.

In the preceding example, the jobs command lists the first job, xclock, as job 1. The

date command does not appear in the jobs list because it finished before jobs was

run. Because the date command was completed before find was run, the find com-

mand became job 2.

To move a background job to the foreground, use the fg builtin followed by the job

number. Alternatively, you can give a percent sign (%) followed by the job number

as a command. Either of the following commands moves job 2 to the foreground.

When you move a job to the foreground, the shell displays the command it is now

executing in the foreground.

$ fg 2

find /usr -name ace -print > findout

$ %2

find /usr -name ace -print > findout

You can also refer to a job by following the percent sign with a string that uniquely

identifies the beginning of the command line used to start the job. Instead of the

preceding command, you could have used either fg %find or fg %f because both

uniquely identify job 2. If you follow the percent sign with a question mark and a

string, the string can match any part of the command line. In the preceding exam-

ple, fg %?ace also brings job 2 to the foreground.

Often the job you wish to bring to the foreground is the only job running in the

background or is the job that jobs lists with a plus (+). In these cases fg without an

argument brings the job to the foreground.

Suspending a Job

Pressing the suspend key (usually CONTROL-Z) immediately suspends (temporarily

stops) the job in the foreground and displays a message that includes the word

Stopped.

CONTROL-Z

[2]+ Stopped find /usr -name ace -print > findout

For more information refer to “Moving a Job from the Foreground to the Back-

ground” on page 135.

bg: Sends a Job to the Background

To move the foreground job to the background, you must first suspend the job

(above). You can then use the bg builtin to resume execution of the job in the back-

ground.

$ bg

[2]+ find /usr -name ace -print > findout &

If a background job attempts to read from the terminal, the shell stops the program

and displays a message saying the job has been stopped. You must then move the

job to the foreground so it can read from the terminal.

$ (sleep 5; cat > mytext) &

[1] 1343

$ date

Tue Dec 2 11:58:20 PST 2008

[1]+ Stopped ( sleep 5; cat >mytext )

$ fg

( sleep 5; cat >mytext )

Remember to let the cat out!

CONTROL-D

In the preceding example, the shell displays the job number and PID number of the

background job as soon as it starts, followed by a prompt. Demonstrating that you

can give a command at this point, the user gives the command date and its output

appears on the screen. The shell waits until just before it issues a prompt (after date

has finished) to notify you that job 1 is stopped. When you give an fg command, the

shell puts the job in the foreground and you can enter the data the command is

waiting for. In this case the input needs to be terminated with CONTROL-D, which sends an EOF (end of file) signal to cat. When cat exits, the shell displays another prompt.

The shell keeps you informed about changes in the status of a job, notifying you when

a background job starts, completes, or stops, perhaps because it is waiting for input

from the terminal. The shell also lets you know when a foreground job is suspended.

Because notices about a job being run in the background can disrupt your work, the

shell delays displaying these notices until just before it displays a prompt. You can set

notify (page 333) to cause the shell to display these notices without delay.

If you try to exit from a shell while jobs are stopped, the shell issues a warning and

does not allow you to exit. If you then use jobs to review the list of jobs or you imme-

diately try to exit from the shell again, the shell allows you to exit. If huponexit

(page 333) is not set (the default), stopped and background jobs keep running in the

background. If it is set, the shell terminates the jobs.

Manipulating the Directory Stack

Both the Bourne Again and the TC Shells allow you to store a list of directories you

are working with, enabling you to move easily among them. This list is referred to

as a stack. It is analogous to a stack of dinner plates: You typically add plates to and

remove plates from the top of the stack, so this type of stack is named a last in, first

out (LIFO) stack.

dirs: Displays the Stack

The dirs builtin displays the contents of the directory stack. If you call dirs when the

directory stack is empty, it displays the name of the working directory:

$ dirs

~/literature

Figure 8-2 The directory structure in the examples (directories shown: home, sam, demo, names, literature, promo)

Figure 8-3 Creating a directory stack (two pushd commands place demo and then names on top of literature)

The dirs builtin uses a tilde (~) to represent the name of a user’s home directory. The

examples in the next several sections assume that you are referring to the directory

structure shown in Figure 8-2.

pushd: Pushes a Directory on the Stack

When you supply the pushd (push directory) builtin with one argument, it pushes

the directory specified by the argument on the stack, changes directories to the

specified directory, and displays the stack. The following example is illustrated in

Figure 8-3:

$ pushd ../demo

~/demo ~/literature

$ pwd

/home/sam/demo

$ pushd ../names

~/names ~/demo ~/literature

$ pwd

/home/sam/names

When you use pushd without an argument, it swaps the top two directories on the

stack, makes the new top directory (which was the second directory) the new work-

ing directory, and displays the stack (Figure 8-4):

$ pushd

~/demo ~/names ~/literature

$ pwd

/home/sam/demo

Using pushd in this way, you can easily move back and forth between two directories.

You can also use cd – to change to the previous directory, whether or not you have

explicitly created a directory stack. To access another directory in the stack, call pushd

with a numeric argument preceded by a plus sign. The directories in the stack are

numbered starting with the top directory, which is number 0. The following pushd

command continues with the previous example, changing the working directory to

literature and moving literature to the top of the stack:

$ pushd +2

~/literature ~/demo ~/names

$ pwd

/home/sam/literature

Figure 8-4 Using pushd to change working directories (each pushd without an argument swaps the top two entries, names and demo, above literature)

popd: Pops a Directory Off the Stack

To remove a directory from the stack, use the popd (pop directory) builtin. As the fol-

lowing example and Figure 8-5 show, without an argument, popd removes the top

directory from the stack and changes the working directory to the new top directory:

$ dirs

~/literature ~/demo ~/names

$ popd

~/demo ~/names

$ pwd

/home/sam/demo

To remove a directory other than the top one from the stack, use popd with a

numeric argument preceded by a plus sign. The following example removes direc-

tory number 1, demo. Removing a directory other than directory number 0 does

not change the working directory.

$ dirs

~/literature ~/demo ~/names

$ popd +1

~/literature ~/names
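The pushd and popd examples can be replayed in a bash script. This sketch assumes a scratch directory standing in for /home/sam and discards the stack listings that pushd and popd display. The comments show the stack after each step, top entry first.

```shell
# Sketch (bash): replay the pushd and popd examples in a scratch directory.
base=$(mktemp -d)                      # stands in for /home/sam
mkdir "$base/literature" "$base/demo" "$base/names"
cd "$base/literature" || exit 1

pushd ../demo  > /dev/null             # stack: demo literature
pushd ../names > /dev/null             # stack: names demo literature
pushd          > /dev/null             # swap top two: demo names literature
after_swap=${PWD##*/}

pushd +2       > /dev/null             # stack: literature demo names
after_plus2=${PWD##*/}

popd +1        > /dev/null             # remove entry 1 (demo); no cd
after_popd_1=${PWD##*/}

popd           > /dev/null             # remove top; cd to new top (names)
after_popd=${PWD##*/}

echo "$after_swap $after_plus2 $after_popd_1 $after_popd"
dirs
```

The expansion ${PWD##*/} keeps only the last component of the working directory so the result does not depend on where the scratch directory was created.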

Parameters and Variables

Variables

Within a shell, a shell parameter is associated with a value that is accessible to the user. There are several kinds of shell parameters. Parameters whose names consist of letters, digits, and underscores are often referred to as shell variables, or simply variables. A variable name must start with a letter or underscore, not with a digit. Thus A76, MY_CAT, and ___X___ are valid variable names, whereas

69TH_STREET (starts with a digit) and MY-NAME (contains a hyphen) are not.
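The naming rule can be expressed as a pattern test. The following is a minimal sketch. is_valid_name is a hypothetical helper written for this illustration, not a shell builtin.

```shell
# Sketch: test whether a string is a legal variable name.
# is_valid_name is a hypothetical helper, not a shell builtin.
is_valid_name() {
    case $1 in
        # empty, wrong first character, or a disallowed character anywhere
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in A76 MY_CAT ___X___ 69TH_STREET MY-NAME; do
    if is_valid_name "$name"; then
        echo "$name: valid"
    else
        echo "$name: not valid"
    fi
done
```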

User-created variables

Shell variables that you name and assign values to are user-created variables. You can change the values of user-created variables at any time, or you can make them readonly so that their values cannot be changed. You can also make user-created variables global. A global variable (also called an environment variable) is available to all shells and other programs you fork from the original shell. One naming convention is to use only uppercase letters for global variables and to use mixed-case or lowercase letters for other variables. Refer to “Locality of Variables” on page 436 for more information on global variables.

Figure 8-5 Using popd to remove a directory from the stack (popd removes literature from the top of the stack, leaving demo and names)

To assign a value to a variable in the Bourne Again Shell, use the following syntax:

VARIABLE=value

There can be no whitespace on either side of the equal sign (=). An example assignment

follows:

$ myvar=abc

Under the TC Shell the assignment must be preceded by the word set and the SPACEs

on either side of the equal sign are optional:

tcsh $ set myvar = abc

The Bourne Again Shell permits you to put variable assignments on a command

line. This type of assignment creates a variable that is local to the command

shell—that is, the variable is accessible only from the program the command runs.

The my_script shell script displays the value of TEMPDIR. The following command

runs my_script with TEMPDIR set to /home/sam/temp. The echo builtin shows

that the interactive shell has no value for TEMPDIR after running my_script. If

TEMPDIR had been set in the interactive shell, running my_script in this manner

would have had no effect on its value.

$ cat my_script

echo $TEMPDIR

$ TEMPDIR=/home/sam/temp ./my_script

/home/sam/temp

$ echo $TEMPDIR
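The same behavior can be confirmed without a separate script file. In this sketch, sh -c stands in for my_script, and the variable names seen_by_child and seen_by_shell are illustrative.

```shell
# Sketch: an assignment placed before a command affects only that command.
unset TEMPDIR                                   # start with TEMPDIR not set

# The child shell started by sh -c sees the value...
seen_by_child=$(TEMPDIR=/home/sam/temp sh -c 'echo "$TEMPDIR"')

# ...but the shell that ran the command line still has no TEMPDIR.
seen_by_shell=${TEMPDIR:-"(not set)"}

echo "child:  $seen_by_child"
echo "parent: $seen_by_shell"
```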

Keyword variables

Keyword shell variables (or simply keyword variables) have special meaning to the

shell and usually have short, mnemonic names. When you start a shell (by logging

in, for example), the shell inherits several keyword variables from the environment.

Among these variables are HOME, which identifies your home directory, and

PATH, which determines which directories the shell searches and in what order to

locate commands that you give the shell. The shell creates and initializes (with

default values) other keyword variables when you start it. Still other variables do

not exist until you set them.

You can change the values of most keyword shell variables. It is usually not neces-

sary to change the values of keyword variables initialized in the /etc/profile or

/etc/csh.cshrc systemwide startup files. If you need to change the value of a bash

keyword variable, do so in one of your startup files (for bash see page 271; for tcsh

see page 352). Just as you can make user-created variables global, so you can make

keyword variables global—a task usually done automatically in startup files. You

can also make a keyword variable readonly.

Positional and special parameters

The names of positional and special parameters do not resemble variable names. Most of these parameters have one-character names (for example, 1, ?, and #) and are referenced (as are all variables) by preceding the name with a dollar sign ($1, $?, and $#). The values of these parameters reflect different aspects of your ongoing interaction with the shell.

Whenever you give a command, each argument on the command line becomes the

value of a positional parameter (page 440). Positional parameters enable you to

access command-line arguments, a capability that you will often require when you

write shell scripts. The set builtin (page 442) enables you to assign values to posi-

tional parameters.

Other frequently needed shell script values, such as the name of the last command

executed, the number of command-line arguments, and the status of the most

recently executed command, are available as special parameters (page 438). You

cannot assign values to special parameters.
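A brief sketch shows three of these parameters at work. The argument values and the exit status of 3 are arbitrary choices for illustration.

```shell
# Sketch: positional and special parameters.
set -- alpha beta gamma       # the set builtin assigns positional parameters

first=$1                      # first argument
count=$#                      # number of arguments

status=0
sh -c 'exit 3' || status=$?   # $? holds the exit status of the last command

echo "first=$first count=$count status=$status"
```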

User-Created Variables

The first line in the following example declares the variable named person and

initializes it with the value max (use set person = max in tcsh):

$ person=max

$ echo person

person

$ echo $person

max

Parameter substitution

Because the echo builtin copies its arguments to standard output, you can use it to

display the values of variables. The second line of the preceding example shows that

person does not represent max. Instead, the string person is echoed as person. The

shell substitutes the value of a variable only when you precede the name of the vari-

able with a dollar sign ($). Thus the command echo $person displays the value of

the variable person; it does not display $person because the shell does not pass

$person to echo as an argument. Because of the leading $, the shell recognizes that

$person is the name of a variable, substitutes the value of the variable, and passes

that value to echo. The echo builtin displays the value of the variable—not its

name—never “knowing” that you called it with a variable.

Quoting the $

You can prevent the shell from substituting the value of a variable by quoting the

leading $. Double quotation marks do not prevent the substitution; single quotation

marks or a backslash (\) do.

$ echo $person

max

$ echo "$person"

max

$ echo '$person'

$person

$ echo \$person

$person
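The four forms can be compared side by side in a script. The variable names that hold the results are illustrative.

```shell
person=max

unquoted=$(echo $person)      # substituted
double=$(echo "$person")      # substituted: double quotation marks do not quote $
single=$(echo '$person')      # not substituted
escaped=$(echo \$person)      # not substituted

echo "$unquoted / $double / $single / $escaped"
```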

SPACEs

Because they do not prevent variable substitution but do turn off the special meanings of most other characters, double quotation marks are useful when you assign values to variables and when you use those values. To assign a value that contains SPACEs or TABs to a variable, use double quotation marks around the value. Although double quotation marks are not required in all cases, using them is a good habit.

$ person="max and zach"

$ echo $person

max and zach

$ person=max and zach

bash: and: command not found

When you reference a variable whose value contains TABs or multiple adjacent SPACEs,

you need to use quotation marks to preserve the spacing. If you do not quote the

variable, the shell collapses each string of blank characters into a single SPACE before

passing the variable to the utility:

$ person="max    and    zach"
$ echo $person
max and zach
$ echo "$person"
max    and    zach

Pathname expansion in assignments

When you execute a command with a variable as an argument, the shell replaces the

name of the variable with the value of the variable and passes that value to the pro-

gram being executed. If the value of the variable contains a special character, such

as * or ?, the shell may expand that variable.

The first line in the following sequence of commands assigns the string max* to the variable memo. The Bourne Again Shell does not expand the string because bash does not perform pathname expansion (page 136) when it assigns a value to a variable. All shells process a command line in a specific order. Within this order bash (but not tcsh) expands variables before it interprets commands. In the following echo command line, the double quotation marks quote the asterisk (*) in the expanded value of $memo and prevent bash from performing pathname expansion on the expanded memo variable before passing its value to the echo command:

$ memo=max*

$ echo "$memo"

max*

All shells interpret special characters as special when you reference a variable that

contains an unquoted special character. In the following example, the shell expands

the value of the memo variable because it is not quoted:

$ ls

max.report

max.summary

$ echo $memo

max.report max.summary

Here the shell expands the $memo variable to max*, expands max* to max.report

and max.summary, and passes these two values to echo.
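
The quoted and unquoted references can be compared side by side in a script. This sketch assumes a scratch directory created with mktemp so that it touches no real files; the variable names other than memo are illustrative.

```bash
# Work in a scratch directory so the sketch touches no real files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch max.report max.summary

memo=max*                       # assignment: no pathname expansion, memo holds max*
quoted_ref=$(echo "$memo")      # quoted reference: the asterisk stays literal
unquoted_ref=$(echo $memo)      # unquoted reference: max* expands to matching filenames

echo "quoted:   $quoted_ref"
echo "unquoted: $unquoted_ref"

cd / && rm -rf "$scratch"       # clean up the scratch directory
```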

optional

Braces The $VARIABLE syntax is a special case of the more general syntax ${VARIABLE},

in which the variable name is enclosed by ${}. The braces insulate the variable name

from adjacent characters. Braces are necessary when catenating a variable value

with a string:

$ PREF=counter

$ WAY=$PREFclockwise

$ FAKE=$PREFfeit

$ echo $WAY $FAKE

The preceding example does not work as planned. Only a blank line is output

because, although the symbols PREFclockwise and PREFfeit are valid variable

names, they are not set. By default the shell evaluates an unset variable as an empty

(null) string and displays this value (bash) or generates an error message (tcsh). To

achieve the intent of these statements, refer to the PREF variable using braces:

$ PREF=counter

$ WAY=${PREF}clockwise

$ FAKE=${PREF}feit

$ echo $WAY $FAKE

counterclockwise counterfeit

The Bourne Again Shell refers to the arguments on its command line by position, using the special variables $1, $2, $3, and so forth up to $9. If you wish to refer to arguments past the ninth argument, you must use braces: ${10}. The name of the command is held in $0 (page 441).
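
A short sketch of why the braces matter for the tenth argument follows. The function show_args and its arguments are hypothetical; a function receives positional parameters the same way a script does.

```bash
# show_args is a hypothetical function used only for this illustration.
show_args() {
    without_braces=$10      # read as ${1} followed by a literal 0
    with_braces=${10}       # the tenth argument
}

show_args a b c d e f g h i j k
echo "\$10   -> $without_braces"
echo "\${10} -> $with_braces"
```

Without braces the shell takes $1 and appends the character 0, so the first line shows a0 while the second shows j.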

unset: Removes a Variable

Unless you remove a variable, it exists as long as the shell in which it was created

exists. To remove the value of a variable but not the variable itself, assign a null

value to the variable (use set person = in tcsh):

$ person=

$ echo $person

You can remove a variable using the unset builtin. The following command removes

the variable person:

$ unset person
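
The difference between a variable with a null value and a variable that has been removed is not visible with echo, because both display as an empty line. One way to tell them apart, sketched here under the assumption that the ${name+word} form of parameter expansion is available (it is in bash), is the hypothetical helper describe:

```bash
# describe is a hypothetical helper. ${person+x} expands to x when person
# is set (even to a null value) and to nothing when person is unset.
describe() {
    if [ -n "${person+x}" ]; then
        echo "set"
    else
        echo "unset"
    fi
}

person=
after_null=$(describe)      # the variable exists and holds a null value

unset person
after_unset=$(describe)     # the variable no longer exists

echo "after person=: $after_null; after unset person: $after_unset"
```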

Variable Attributes

This section discusses attributes and explains how to assign them to variables.

readonly: Makes the Value of a Variable Permanent

You can use the readonly builtin (not in tcsh) to ensure that the value of a variable cannot be changed. The next example declares the variable person to be readonly. You must assign a value to a variable before you declare it to be readonly; you cannot change its value after the declaration. When you attempt to unset or change the value of a readonly variable, the shell displays an error message:

$ person=zach

$ echo $person

zach

$ readonly person

$ person=helen

bash: person: readonly variable

If you use the readonly builtin without an argument, it displays a list of all readonly shell variables. This list includes keyword variables that are automatically set as readonly as well as keyword or user-created variables that you have declared as readonly. See page 296 for an example (readonly and declare -r produce the same output).

declare and typeset: Assign Attributes to Variables

The declare and typeset builtins (two names for the same command, neither of

which is available in tcsh) set attributes and values for shell variables. Table 8-3 lists

five of these attributes.

The following commands declare several variables and set some attributes. The first

line declares person1 and assigns it a value of max. This command has the same

effect with or without the word declare.

$ declare person1=max

$ declare -r person2=zach

$ declare -rx person3=helen

$ declare -x person4

Table 8-3 Variable attributes (typeset or declare)

Attribute Meaning

–a Declares a variable as an array (page 434)

–f Declares a variable to be a function name (page 327)

–i Declares a variable to be of type integer (page 296)

–r Makes a variable readonly; also readonly (page 295)

–x Exports a variable (makes it global); also export (page 436)

The readonly and export builtins are synonyms for the commands declare –r and

declare –x, respectively. You can declare a variable without assigning a value to it,

as the preceding declaration of the variable person4 illustrates. This declaration

makes person4 available to all subshells (i.e., makes it global). Until an assignment

is made to the variable, it has a null value.

You can list the options to declare separately in any order. The following is equivalent

to the preceding declaration of person3:

$ declare -x -r person3=helen

Use the + character in place of - when you want to remove an attribute from a variable. You cannot remove the readonly attribute. After the following command is given, the variable person3 is no longer exported but it is still readonly.

$ declare +x person3

You can use typeset instead of declare.
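
The effect of adding and removing the export attribute can be checked by asking a child shell what it inherited. In this sketch the variable person5 and the fallback text are illustrative, and the child is started with bash -c.

```bash
# person5 is an illustrative name; a child bash reports what it inherited.
declare -x person5=sam
seen_while_exported=$(bash -c 'echo "${person5:-not inherited}"')

declare +x person5          # remove the export attribute; the value stays in this shell
seen_after_removal=$(bash -c 'echo "${person5:-not inherited}"')

echo "exported:   $seen_while_exported"
echo "unexported: $seen_after_removal"
echo "parent:     $person5"
```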

Listing variable attributes

Without any arguments or options, declare lists all shell variables. The same list is

output when you run set (page 442) without any arguments.

If you use a declare builtin with options but no variable names as arguments, the

command lists all shell variables that have the indicated attributes set. For example,

the command declare –r displays a list of all readonly shell variables. This list is the

same as that produced by the readonly command without any arguments. After the

declarations in the preceding example have been given, the results are as follows:

$ declare -r

declare -ar BASH_VERSINFO='([0]="3" [1]="2" [2]="39" [3]="1" ... )'

declare -ir EUID="500"

declare -ir PPID="936"

declare -r SHELLOPTS="braceexpand:emacs:hashall:histexpand:history:..."

declare -ir UID="500"

declare -r person2="zach"

declare -rx person3="helen"

The first five entries are keyword variables that are automatically declared as readonly. Some of these variables are stored as integers (-i). The -a option indicates that BASH_VERSINFO is an array variable; the value of each element of the array is listed to the right of an equal sign.

Integer By default the values of variables are stored as strings. When you perform arithmetic on a string variable, the shell converts the variable into a number, manipulates it, and then converts it back to a string. A variable with the integer attribute is stored as an integer. Assign the integer attribute as follows:

$ declare -i COUNT
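
One visible consequence of the integer attribute is that bash evaluates the right side of an assignment as an arithmetic expression. The sketch below contrasts COUNT with an ordinary variable; TEXT is an illustrative name.

```bash
declare -i COUNT        # give COUNT the integer attribute
COUNT=5+3               # evaluated as arithmetic
TEXT=5+3                # ordinary variable: stored as the string 5+3

echo "COUNT=$COUNT TEXT=$TEXT"

COUNT=COUNT*2           # names inside the expression are evaluated as variables
echo "COUNT=$COUNT"
```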

Keyword Variables

Keyword variables either are inherited or are declared and initialized by the shell

when it starts. You can assign values to these variables from the command line or

from a startup file. Typically you want these variables to apply to all subshells you

start as well as to your login shell. For those variables not automatically exported

by the shell, you must use export (bash; page 436) or setenv (tcsh; page 366) to make

them available to child shells.

HOME: Your Home Directory

By default your home directory is the working directory when you log in. Your

home directory is established when your account is set up; under Linux its name is

stored in the /etc/passwd file. Mac OS X uses Open Directory (page 926) to store

this information.

$ grep sam /etc/passwd

sam:x:501:501:Sam S. x301:/home/sam:/bin/bash

When you log in, the shell inherits the pathname of your home directory and assigns

it to the variable HOME. When you give a cd command without an argument, cd

makes the directory whose name is stored in HOME the working directory:

$ pwd

/home/max/laptop

$ echo $HOME

/home/max

$ cd

$ pwd

/home/max

This example shows the value of the HOME variable and the effect of the cd builtin.

After you execute cd without an argument, the pathname of the working directory is

the same as the value of HOME: your home directory.

Tilde (~) The shell uses the value of HOME to expand pathnames that use the shorthand tilde (~) notation (page 84) to denote a user’s home directory. The following example uses echo to display the value of this shortcut and then uses ls to list the files in Max’s laptop directory, which is a subdirectory of his home directory:

$ echo ~

/home/max

$ ls ~/laptop

tester count lineup
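
That both cd and the tilde depend on HOME can be seen by changing HOME temporarily. The following sketch uses a scratch directory (demo_home, an illustrative name) and runs each experiment inside a command substitution, which is a subshell, so the real HOME and working directory are left alone.

```bash
# demo_home is an illustrative stand-in for a home directory.
demo_home=$(mktemp -d)

# Each command substitution runs in a subshell, so the changes to HOME
# and to the working directory do not affect this shell.
cd_target=$(HOME=$demo_home; cd && pwd)
tilde_value=$(HOME=$demo_home; echo ~)

echo "cd with no argument went to: $cd_target"
echo "tilde expanded to:           $tilde_value"

rmdir "$demo_home"
```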

PATH: Where the Shell Looks for Programs

When you give the shell an absolute or relative pathname rather than a simple filename as a command, it looks in the specified directory for an executable file with the specified filename. If the file with the pathname you specified does not exist, the shell reports command not found. If the file exists as specified but you do not have execute permission for it, or in the case of a shell script you do not have read and execute permission for it, the shell reports Permission denied.

If you give a simple filename as a command, the shell searches through certain directories (your search path) for the program you want to execute. It looks in several directories for a file that has the same name as the command and that you have execute permission for (a compiled program) or read and execute permission for (a shell script). The PATH shell variable controls this search.

The default value of PATH is determined when bash or tcsh is compiled. It is not set in a startup file, although it may be modified there. Normally the default specifies that the shell search several system directories used to hold common commands. These system directories include /bin and /usr/bin and other directories appropriate to the local system. When you give a command, if the shell does not find the executable (and, in the case of a shell script, readable) file named by the command in any of the directories listed in PATH, the shell generates one of the aforementioned error messages.

Working directory The PATH variable specifies the directories in the order the shell should search them. Each directory must be separated from the next by a colon. The following command sets PATH so that a search for an executable file starts with the /usr/local/bin directory. If it does not find the file in this directory, the shell looks next in /bin, and then in /usr/bin. If the search fails in those directories, the shell looks in the ~/bin directory, a subdirectory of the user’s home directory. Finally the shell looks in the working directory. Exporting PATH makes its value accessible to subshells:

$ export PATH=/usr/local/bin:/bin:/usr/bin:~/bin:

A null value in the string indicates the working directory. In the preceding example, a null value (nothing between the colon and the end of the line) appears as the last element of the string. The working directory is represented by a leading colon (not recommended; see the following security tip), a trailing colon (as in the example), or two colons next to each other anywhere in the string. You can also represent the working directory explicitly with a period (.). See “PATH” on page 373 for a tcsh example.

Because Linux stores many executable files in directories named bin (binary), users

typically put their own executable files in their own ~/bin directories. If you put your

own bin directory at the end of your PATH, as in the preceding example, the shell

looks there for any commands that it cannot find in directories listed earlier in PATH.

If you want to add directories to PATH, you can reference the old value of the PATH variable in setting PATH to a new value (but see the security tip “PATH and security”). The following command adds /usr/local/bin to the beginning of the current PATH and the bin directory in the user’s home directory (~/bin) to the end:

$ PATH=/usr/local/bin:$PATH:~/bin
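
When an assignment like this lives in a startup file that may run more than once, the same directory can end up in PATH several times. One possible guard is sketched below; add_to_path is a hypothetical helper, not a standard command, and the starting PATH value is illustrative.

```bash
# add_to_path is a hypothetical helper, not a standard command.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present: leave PATH alone
        *) PATH="$PATH:$1" ;;         # otherwise append to the end
    esac
}

saved_path=$PATH                      # keep the real value so the demo can restore it
PATH=/usr/local/bin:/bin:/usr/bin     # illustrative starting value
add_to_path "$HOME/bin"
add_to_path "$HOME/bin"               # the second call changes nothing
demo_path=$PATH
PATH=$saved_path

echo "$demo_path"
```

Wrapping PATH in colons before matching lets a single pattern cover the first, middle, and last positions in the list.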

PATH and security

security Do not put the working directory first in PATH when security is a concern. If you are working as

root, you should never put the working directory first in PATH. It is common for root’s PATH to

omit the working directory entirely. You can always execute a file in the working directory by

prepending ./ to the name: ./myprog.

Putting the working directory first in PATH can create a security hole. Most people type ls as the

first command when entering a directory. If the owner of a directory places an executable file

named ls in the directory, and the working directory appears first in a user’s PATH, the user giving

an ls command from the directory executes the ls program in the working directory instead of the

system ls utility, possibly with undesirable results.
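
A quick way to audit a PATH-style string for the working directory is to look for an empty element or a period. The sketch below is one possible check; path_has_cwd is a hypothetical helper and the candidate strings are illustrative.

```bash
# path_has_cwd is a hypothetical helper: it succeeds when the PATH-style
# string in $1 contains an empty element or a period.
path_has_cwd() {
    case ":$1:" in
        *::*|*:.:*) return 0 ;;
        *)          return 1 ;;
    esac
}

for candidate in "/usr/local/bin:/bin:/usr/bin" "/bin:/usr/bin:" ":/bin" "/bin:.:/usr/bin"; do
    if path_has_cwd "$candidate"; then
        echo "includes working directory: $candidate"
    else
        echo "no working directory:       $candidate"
    fi
done
```

Calling path_has_cwd "$PATH" applies the same check to the current search path.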

MAIL: Where Your Mail Is Kept

The MAIL variable (mail under tcsh) contains the pathname of the file that holds

your mail (your mailbox, usually /var/mail/name, where name is your username).

If MAIL is set and MAILPATH (next) is not set, the shell informs you when mail

arrives in the file specified by MAIL. In a graphical environment you can unset

MAIL so the shell does not display mail reminders in a terminal emulator window

(assuming you are using a graphical mail program).

Most Mac OS X systems do not use local files for incoming mail; mail is typically

kept on a remote mail server instead. The MAIL variable and other mail-related

shell variables do not do anything unless you have a local mail server.

The MAILPATH variable (not available under tcsh) contains a list of filenames separated by colons. If this variable is set, the shell informs you when any one of the files is modified (for example, when mail arrives). You can follow any of the filenames in the list with a question mark (?), followed by a message. The message replaces the you have mail message when you receive mail while you are logged in.

The MAILCHECK variable (not available under tcsh) specifies how often, in seconds,

the shell checks for new mail. The default is 60 seconds. If you set this variable to

zero, the shell checks before each prompt.
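
As an illustration, the following lines might appear in a bash startup file. The mailbox pathnames and message text are hypothetical; within a message, bash expands $_ to the name of the file that changed, which is why the value is enclosed in single quotation marks.

```bash
# Illustrative settings for a bash startup file; the pathnames are hypothetical.
# The text after each ? replaces the default message for that file.
MAILPATH='/var/mail/max?New mail for max:/home/max/alerts?$_ has changed'
MAILCHECK=30        # check every 30 seconds instead of the default 60
```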

PS1: User Prompt (Primary)

The default Bourne Again Shell prompt is a dollar sign ($). When you run bash with

root privileges, bash typically displays a pound sign (#) prompt. The PS1 variable

(prompt under tcsh, page 373) holds the prompt string that the shell uses to let you

know that it is waiting for a command. When you change the value of PS1 or

prompt, you change the appearance of your prompt.

You can customize the prompt displayed by PS1. For example, the assignment

$ PS1="[\u@\h \W \!]$ "

displays the following prompt:

[user@host directory event]$

where user is the username, host is the hostname up to the first period, directory is

the basename of the working directory, and event is the event number (page 309) of

the current command.

If you are working on more than one system, it can be helpful to incorporate the

system name into your prompt. For example, you might change the prompt to the

name of the system you are using, followed by a colon and a SPACE (a SPACE at the end

of the prompt makes the commands you enter after the prompt easier to read). This

command uses command substitution (page 340) in the string assigned to PS1:

$ PS1="$(hostname): "

bravo.example.com: echo test

test

bravo.example.com:

Use the following command under tcsh:

tcsh $ set prompt = "`hostname`: "

The first example that follows changes the prompt to the name of the local host, a

SPACE, and a dollar sign (or, if the user is running with root privileges, a pound sign).

The second example changes the prompt to the time followed by the name of the

user. The third example changes the prompt to the one used in this book (a pound

sign for root and a dollar sign otherwise):

$ PS1='\h \$ '

bravo $

$ PS1='\@ \u $ '

09:44 PM max $

$ PS1='\$ '

Table 8-4 describes some of the symbols you can use in PS1. See Table 9-4 on

page 373 for the corresponding tcsh symbols. For a complete list of special characters

you can use in the prompt strings, open the bash man page and search for the second

occurrence of PROMPTING (give the command /PROMPTING and then press n).

Table 8-4 PS1 symbols

Symbol    Display in prompt

\$        # if the user is running with root privileges; otherwise, $

\w        Pathname of the working directory

\W        Basename of the working directory

\!        Current event (history) number (page 313)

\d        Date in Weekday Month Date format

\h        Machine hostname, without the domain

\H        Full machine hostname, including the domain

\u        Username of the current user

\@        Current time of day in 12-hour, AM/PM format

\T        Current time of day in 12-hour HH:MM:SS format

\A        Current time of day in 24-hour HH:MM format

\t        Current time of day in 24-hour HH:MM:SS format

PS2: User Prompt (Secondary)

The PS2 variable holds the secondary prompt (tcsh uses prompt2; page 374). On the first line of the next example, an unclosed quoted string follows echo. The shell assumes the command is not finished and, on the second line, gives the default secondary prompt (>). This prompt indicates the shell is waiting for the user to continue the command line. The shell waits until it receives the quotation mark that closes the string. Only then does it execute the command:

$ echo "demonstration of prompt string

> 2"

demonstration of prompt string

2

$ PS2="secondary prompt: "

$ echo "this demonstrates

secondary prompt: prompt string 2"

this demonstrates

prompt string 2

The second command changes the secondary prompt to secondary prompt: followed

by a SPACE. A multiline echo demonstrates the new prompt.

PS3: Menu Prompt

The PS3 variable holds the menu prompt (tcsh uses prompt3; page 374) for the

select control structure (page 428).
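
A minimal sketch of PS3 in use follows. The menu items and variable names are illustrative, and the answer is fed on standard input with a here string so the example runs unattended; in interactive use you would omit the redirection and type the number.

```bash
PS3='Pick a file type (number): '     # menu prompt displayed by select

# select writes the menu and the PS3 prompt to standard error and reads
# the answer from standard input; here the answer 2 is supplied for it.
choice=$(
    select kind in report summary memo; do
        echo "$kind"
        break
    done <<< "2"
)

echo "You chose: $choice"
```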

PS4: Debugging Prompt

The PS4 variable holds the bash debugging symbol (page 410; not under tcsh).
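
The debugging symbol appears at the start of each line that bash displays when tracing is turned on with set -x. The sketch below assumes you want the line number in each trace line; the traced command is a placeholder that does nothing.

```bash
PS4='+ line $LINENO: '      # single quotation marks delay expansion until each trace line is written

# Capture the trace, which bash writes to standard error, from a short traced section.
trace=$( { set -x; : traced command; set +x; } 2>&1 )

echo "$trace"
```

Each captured line starts with the PS4 text; bash repeats the first character of PS4 to indicate nested levels of execution.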

IFS: Separates Input Fields (Word Splitting)

The IFS (Internal Field Separator) shell variable (not under tcsh) specifies the characters you can use to separate arguments on a command line. It has the default value of SPACE TAB NEWLINE. Regardless of the value of IFS, you can always use one or more SPACE or TAB characters to separate arguments on the command line, provided these characters are not quoted or escaped. When you assign IFS character values, these characters can also separate fields, but only if they undergo expansion. This type of interpretation of the command line is called word splitting.

The following example demonstrates how setting IFS can affect the interpretation

of a command line:

$ a=w:x:y:z

$ cat $a

cat: w:x:y:z: No such file or directory

$ IFS=":"

$ cat $a

cat: w: No such file or directory

cat: x: No such file or directory

cat: y: No such file or directory

cat: z: No such file or directory

Be careful when changing IFS

caution Changing IFS has a variety of side effects, so work cautiously. You may find it useful to save the

value of IFS before changing it. Then you can easily restore the original value if you get unexpected

results. Alternatively, you can fork a new shell with a bash command before experimenting with

IFS; if you get into trouble, you can exit back to the old shell, where IFS is working properly.

The first time cat is called, the shell expands the variable a, interpreting the string w:x:y:z as a single word to be used as the argument to cat. The cat utility cannot find a file named w:x:y:z and reports an error for that filename. After IFS is set to a colon (:), the shell expands the variable a into four words, each of which is an argument to cat. Now cat reports errors for four files: w, x, y, and z. Word splitting based on the colon (:) takes place only after the variable a is expanded.
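
The advice to save IFS before changing it can be sketched as follows. The names saved_ifs, fields, first, second, and rest are illustrative. The second half shows an alternative that avoids changing IFS for the whole shell: an assignment placed in front of a command applies to that command only.

```bash
a=w:x:y:z

saved_ifs=$IFS              # remember the current value
IFS=:
fields=($a)                 # the unquoted expansion is split at each colon
IFS=$saved_ifs              # restore the original value right away

echo "${#fields[@]} fields: ${fields[*]}"

# Alternative: the assignment before read affects only that read command.
IFS=: read -r first second rest <<< "$a"
echo "first=$first second=$second rest=$rest"
```

Because read puts whatever is left over into the last variable, rest holds y:z.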

The shell splits all expanded words on a command line according to the separating

characters found in IFS. When there is no expansion, there is no splitting. Consider

the following commands:

$ IFS="p"

$ export VAR

Although IFS is set to p, the p on the export command line is not expanded, so the

word export is not split.

The following example uses variable expansion in an attempt to produce an export

command:

$ IFS="p"

$ aa=export

$ echo $aa

ex ort

This time expansion occurs, so the character p in the token export is interpreted as

a separator (as the echo command shows). Now when you try to use the value of

the aa variable to export the VAR variable, the shell parses the $aa VAR command

line as ex ort VAR. The effect is that the command line starts the ex editor with two

filenames: ort and VAR.

$ $aa VAR

2 files to edit

"ort" [New File]

Entering Ex mode. Type "visual" to go to Normal mode.

E173: 1 more file to edit

If you unset IFS, bash falls back on the default field separators: SPACE, TAB, and NEWLINE.

CDPATH: Broadens the Scope of cd

The CDPATH variable (cdpath under tcsh) allows you to use a simple filename as

an argument to the cd builtin to change the working directory to a directory other

than a child of the working directory. If you have several directories you typically

work out of, this variable can speed things up and save you the tedium of using cd

with longer pathnames to switch among them.

Multiple separator characters

tip Although the shell treats sequences of multiple SPACE or TAB characters as a single separator, it

treats each occurrence of another field-separator character as a separator.
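
The difference described in this tip shows up when you count the resulting fields. In the sketch below, count_fields is a hypothetical helper, and each experiment runs inside a command substitution (a subshell) so the IFS assignment does not affect the current shell.

```bash
# count_fields is a hypothetical helper that reports its argument count.
count_fields() { echo "$#"; }

# Default IFS: a run of SPACEs counts as one separator.
with_spaces=$(v="a   b"; count_fields $v)

# IFS set to a colon: each colon separates, so a::b has an empty middle field.
with_colons=$(IFS=:; v="a::b"; count_fields $v)

echo "a   b -> $with_spaces fields"
echo "a::b  -> $with_colons fields"
```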

When CDPATH or cdpath is not set and you specify a simple filename as an argument to cd, cd searches the working directory for a subdirectory with the same name as the argument. If the subdirectory does not exist, cd displays an error message. When CDPATH or cdpath is set, cd searches for an appropriately named subdirectory in the directories in the CDPATH list. If it finds one, that directory becomes the working directory. With CDPATH or cdpath set, you can use cd and a simple filename to change the working directory to a child of any of the directories listed in CDPATH or cdpath.

The CDPATH or cdpath variable takes on the value of a colon-separated list of directory pathnames (similar to the PATH variable). It is usually set in the ~/.bash_profile (bash) or ~/.tcshrc (tcsh) startup file with a command line such as the following:

export CDPATH=$HOME:$HOME/literature

Use the following format for tcsh:

setenv cdpath $HOME\:$HOME/literature

These commands cause cd to search your home directory, the literature directory, and then the working directory when you give a cd command. If you do not include the working directory in CDPATH or cdpath, cd searches the working directory if the search of all the other directories in CDPATH or cdpath fails. If you want cd to search the working directory first, include a null string, represented by two colons (::), as the first entry in CDPATH:

export CDPATH=::$HOME:$HOME/literature

If the argument to the cd builtin is an absolute pathname—one starting with a slash

(/)—the shell does not consult CDPATH or cdpath.
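
The following sketch shows CDPATH at work under bash. It builds an illustrative directory tree in a scratch directory and runs the cd commands in a subshell so the working directory of the current shell is not disturbed; the directory names are hypothetical.

```bash
# Build an illustrative directory tree under a scratch directory.
base=$(mktemp -d)
mkdir -p "$base/projects/literature" "$base/elsewhere"

landed=$(
    cd "$base/elsewhere" || exit 1
    CDPATH=$base/projects                 # literature is not a child of the working directory
    cd literature >/dev/null || exit 1    # cd displays the directory it found through CDPATH
    pwd
)

echo "cd literature went to: $landed"
rm -rf "$base"
```

When cd locates a directory through CDPATH, bash displays the pathname of the new working directory; the sketch discards that output and relies on pwd.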

Keyword Variables: A Summary

Table 8-5 presents a list of bash keyword variables. See page 371 for information on

tcsh variables.

Table 8-5 bash keyword variables

Variable Value

BASH_ENV The pathname of the startup file for noninteractive shells (page 272)

CDPATH The cd search path (page 302)

COLUMNS The width of the display used by select (page 427)

FCEDIT The name of the editor that fc uses by default (page 312)

HISTFILE The pathname of the file that holds the history list (default: ~/.bash_history;

page 308)

HISTFILESIZE The maximum number of entries saved in HISTFILE (default: 500; page 308)

HISTSIZE The maximum number of entries saved in the history list (default: 500;

page 308)

HOME The pathname of the user’s home directory (page 297); used as the default argument for cd and in tilde expansion (page 84)

IFS Internal Field Separator (page 301); used for word splitting (page 341)

INPUTRC The pathname of the Readline startup file (default: ~/.inputrc; page 321)

LANG The locale category when that category is not specifically set with an LC_* variable

LC_* A group of variables that specify locale categories including LC_COLLATE, LC_CTYPE, LC_MESSAGES, and LC_NUMERIC; use the locale builtin to display a complete list with values

LINES The height of the display used by select (page 427)

MAIL The pathname of the file that holds a user’s mail (page 299)

MAILCHECK How often, in seconds, bash checks for mail (page 299)

MAILPATH A colon-separated list of file pathnames that bash checks for mail in (page 299)

PATH A colon-separated list of directory pathnames that bash looks for commands in (page 297)

PROMPT_COMMAND A command that bash executes just before it displays the primary prompt

PS1 Prompt String 1; the primary prompt (page 299)

PS2 Prompt String 2; the secondary prompt (default: '> '; page 300)

PS3 The prompt issued by select (page 427)

PS4 The bash debugging symbol (page 410)

REPLY Holds the line that read accepts (page 448); also used by select (page 427)

Special Characters

Table 8-6 lists most of the characters that are special to the bash and tcsh shells.

Table 8-6 Shell special characters

Character Use

NEWLINE Initiates execution of a command (page 282)

; Separates commands (page 282)

( ) Groups commands (page 284) for execution by a subshell or identifies a function (page 327)

(( )) Expands an arithmetic expression (page 338)

& Executes a command in the background (pages 134 and 283)

| Sends standard output of the preceding command to standard input of the following command (pipe; page 283)

> Redirects standard output (page 126)

>> Appends standard output (page 130)

< Redirects standard input (page 128)

<< Here document (page 429)

* Any string of zero or more characters in an ambiguous file reference (page 138)

? Any single character in an ambiguous file reference (page 137)

\ Quotes the following character (page 46)

' Quotes a string, preventing all substitution (page 46)

" Quotes a string, allowing only variable and command substitution (pages 46 and 292)

`...` Performs command substitution (page 340)

[ ] Character class in an ambiguous file reference (page 139)

$ References a variable (page 290)

. (dot builtin) Executes a command (page 274)

# Begins a comment (page 281)

{ } Surrounds the contents of a function (page 327)

: (null builtin) Returns true (page 455)

&& (Boolean AND) Executes command on right only if command on left succeeds (returns a zero exit status; page 466)

|| (Boolean OR) Executes command on right only if command on left fails (returns a nonzero exit status; page 466)

! (Boolean NOT) Reverses exit status of a command

$() (not in tcsh) Performs command substitution (preferred form; page 340)

[] Evaluates an arithmetic expression (page 338)

Processes

A process is the execution of a command by the Linux kernel. The shell that starts

when you log in is a command, or a process, like any other. When you give the

name of a Linux utility on the command line, you initiate a process. When you run

a shell script, another shell process is started and additional processes are created

for each command in the script. Depending on how you invoke the shell script, the

script is run either by the current shell or, more typically, by a subshell (child) of the

current shell. A process is not started when you run a shell builtin, such as cd.

Process Structure

fork system call Like the file structure, the process structure is hierarchical, with parents, children,

and even a root. A parent process forks a child process, which in turn can fork other

processes. (The term fork indicates that, as with a fork in the road, one process

turns into two. Initially the two forks are identical except that one is identified as

the parent and one as the child. You can also use the term spawn; the words are

interchangeable.) The operating system routine, or system call, that creates a new

process is named fork().

When a system is started and Linux begins execution, it starts init, a single process called a spontaneous process, with PID number 1. This process holds the same position in the process structure as the root directory does in the file structure: It is the ancestor of all processes the system and users work with. When a command-line system is in multiuser mode, init runs getty or mingetty processes, which display login: prompts on terminals. When a user responds to the prompt and presses RETURN, getty hands control over to a utility named login, which checks the username and password combination. After the user logs in, the login process becomes the user’s shell process.

Process Identification

PID numbers Linux assigns a unique PID (process identification) number at the inception of each process. As long as a process exists, it keeps the same PID number. During one session the same process is always executing the login shell. When you fork a new process (for example, when you use an editor), the PID number of the new (child) process is different from that of its parent process. When you return to the login shell, it is still being executed by the same process and has the same PID number as when you logged in.

The following example shows that the process running the shell forked (is the parent of) the process running ps. When you call it with the -f option, ps displays a full listing of information about each process. The line of the ps display with bash in the CMD column refers to the process running the shell. The column headed by PID identifies the PID number. The column headed PPID identifies the PID number of the parent of the process. From the PID and PPID columns you can see that the process running the shell (PID 21341) is the parent of the process running sleep (PID 22789). The parent PID number of sleep is the same as the PID number of the shell (21341).

$ sleep 10 &

[1] 22789

$ ps -f

UID PID PPID C STIME TTY TIME CMD

max 21341 21340 0 10:42 pts/16 00:00:00 bash

max 22789 21341 0 17:30 pts/16 00:00:00 sleep 10

max 22790 21341 0 17:30 pts/16 00:00:00 ps -f

Refer to page 796 for more information on ps and the columns it displays with the

–f option. A second pair of sleep and ps –f commands shows that the shell is still

being run by the same process but that it forked another process to run sleep:

$ sleep 10 &

[1] 22791

$ ps -f

UID PID PPID C STIME TTY TIME CMD

max 21341 21340 0 10:42 pts/16 00:00:00 bash

max 22791 21341 0 17:31 pts/16 00:00:00 sleep 10

max 22792 21341 0 17:31 pts/16 00:00:00 ps -f
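
The same parent and child relationship can be observed from inside a child process, without ps. In this sketch a child bash writes its own PID ($$) and the PID of its parent (the PPID variable) to a scratch file; the file and variable names are illustrative.

```bash
report=$(mktemp)                            # scratch file for the child's report

# The child shell writes its own PID and the PID of its parent.
bash -c 'echo "$$ $PPID"' > "$report"
read -r child_pid child_ppid < "$report"
rm -f "$report"

echo "this shell: $$   child: $child_pid   child's parent: $child_ppid"
```

When you run these commands directly from a shell, the parent PID the child reports should match the PID of that shell, while the child has a PID of its own.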

You can also use pstree (or ps --forest, with or without the -e option) to see the parent–child relationship of processes. The next example shows the -p option to pstree, which causes it to display PID numbers:

$ pstree -p

init(1)-+-acpid(1395)

        |-atd(1758)

        |-crond(1702)

...

        |-kdeinit(2223)-+-firefox(8914)---run-mozilla.sh(8920)---firefox-bin(8925)

        |               |-gaim(2306)

        |               |-gqview(14062)

        |               |-kdeinit(2228)

        |               |-kdeinit(2294)

        |               |-kdeinit(2314)-+-bash(2329)---ssh(2561)

        |               |               |-bash(2339)

        |               |               '-bash(15821)---bash(16778)

        |               |-kdeinit(16448)

        |               |-kdeinit(20888)

        |               |-oclock(2317)

        |               '-pam-panel-icon(2305)---pam_timestamp_c(2307)

...

        |-login(1823)---bash(20986)-+-pstree(21028)

        |                           '-sleep(21026)

...

The preceding output is abbreviated. The line that starts with -kdeinit shows a graphical user running many processes, including firefox, gaim, and oclock. The line that starts with -login shows a textual user running sleep in the background and running pstree in the foreground. Refer to “$$: PID Number” on page 439 for a description of how to instruct the shell to report on PID numbers.

308 Chapter 8 The Bourne Again Shell

Executing a Command

fork and sleep When you give the shell a command, it usually forks [spawns using the fork() system

call] a child process to execute the command. While the child process is executing the

command, the parent process sleeps [in practice it blocks in a wait() system call until the child terminates]. While a

process is sleeping, it does not use any computer time; it remains inactive, waiting to

wake up. When the child process finishes executing the command, it tells its parent

of its success or failure via its exit status and then dies. The parent process (which is

running the shell) wakes up and prompts for another command.

Background process When you run a process in the background by ending a command with an ampersand

(&), the shell forks a child process without going to sleep and without waiting for the

child process to run to completion. The parent process, which is executing the shell,

reports the job number and PID number of the child process and prompts for another

command. The child process runs in the background, independent of its parent.
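The difference between foreground and background execution shows up in how the parent collects the exit status of the child. The following sketch assumes a POSIX shell; the exit values 3 and 5 and the variable names are arbitrary choices made for the illustration.

```shell
# Foreground: the shell waits for the child; $? then holds the child's exit status.
fg_status=0
sh -c 'exit 3' || fg_status=$?

# Background: the shell continues immediately; $! is the PID of the child.
sh -c 'sleep 1; exit 5' &
bg_pid=$!
echo "started background child with PID $bg_pid"

# wait blocks until that child terminates and returns the child's exit status.
bg_status=0
wait "$bg_pid" || bg_status=$?

echo "foreground status: $fg_status, background status: $bg_status"
```

The wait builtin lets a script rejoin a background child when it needs the result; an interactive shell instead reports the finished job the next time it displays a prompt.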

Builtins Although the shell forks a process to run most of the commands you give it, some

commands are built into the shell. The shell does not need to fork a process to run

builtins. For more information refer to “Builtins” on page 141.
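One way to find out whether a name refers to a builtin is the type builtin. This is a short sketch; the exact wording of the output varies from shell to shell, and ls may be reported as an alias or a hashed pathname on your system.

```shell
type cd      # reported as a shell builtin: the shell runs it without forking
type ls      # normally an external file that the shell forks a child to run
```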

Variables Within a given process, such as your login shell or a subshell, you can declare, ini-

tialize, read, and change variables. By default, however, a variable is local to a

process. When a process forks a child process, the parent does not pass the value

of a variable to the child. You can make the value of a variable available to child

processes (global) by using the export builtin under bash (page 436) or the setenv

builtin under tcsh (page 366).
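The effect of export can be seen by asking a child shell what it received. The following is a minimal sketch under bash or another POSIX shell; the variable names color and shape are made up for the example.

```shell
unset color shape
color=red                 # an ordinary variable, local to this shell process
export shape=round        # exported, so it is copied into the environment of children

# The single quotes keep the parent from expanding the variables;
# the child shell expands them itself.
child_view=$(sh -c 'echo "color=${color:-unset} shape=${shape:-unset}"')
echo "$child_view"        # color=unset shape=round
```

The child sees only shape because color was never placed in the environment.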

History

The history mechanism, a feature adapted from the C Shell, maintains a list of

recently issued command lines, also called events, that provides a quick way to

reexecute any of the events in the list. This mechanism also enables you to execute

variations of previous commands and to reuse arguments from them. You can use

the history list to replicate complicated commands and arguments that you used

earlier in this login session or in a previous one and enter a series of commands that

differ from one another in minor ways. The history list also serves as a record of

what you have done. It can prove helpful when you have made a mistake and are

not sure what you did or when you want to keep a record of a procedure that

involved a series of commands.

The history builtin (in both bash and tcsh) displays the history list. If it does not,

read the next section, which describes the variables you need to set.

Variables That Control History

The TC Shell’s history mechanism is similar to bash’s but uses different variables

and has other differences. See page 354 for more information.

The value of the HISTSIZE variable determines the number of events preserved in

the history list during a session. A value in the range of 100 to 1,000 is normal.

When you exit from the shell, the most recently executed commands are saved in

the file whose name is stored in the HISTFILE variable (the default is

~/.bash_history). The next time you start the shell, this file initializes the history list.

The value of the HISTFILESIZE variable determines the number of lines of history

saved in HISTFILE. See Table 8-7.

Event number The Bourne Again Shell assigns a sequential event number to each command line.

You can display this event number as part of the bash prompt by including \! in PS1

(page 299). Examples in this section show numbered prompts when they help to

illustrate the behavior of a command.

Give the following command manually, or place it in ~/.bash_profile to affect future

sessions, to establish a history list of the 100 most recent events:

$ HISTSIZE=100

The following command causes bash to save the 100 most recent events across login

sessions:

$ HISTFILESIZE=100

After you set HISTFILESIZE, you can log out and log in again, and the 100 most

recent events from the previous login session will appear in your history list.
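To watch HISTSIZE limit the list without disturbing your own history, you can experiment in a separate, noninteractive bash. The sketch below makes several assumptions: it unsets HISTFILE so no history file is read or written, it uses a deliberately tiny HISTSIZE of 3, and it relies on history –s, which adds an entry to the list without executing it. The five command lines are sample text only.

```shell
out=$(bash --norc --noprofile -c '
  unset HISTFILE            # do not touch any real history file
  HISTSIZE=3                # keep at most three events in the list
  for cmd in "ls -l" "cat temp" "rm temp" "vim memo" "lpr memo"; do
    history -s "$cmd"       # add an entry to the list without executing it
  done
  history                   # only the three newest entries remain
')
printf '%s\n' "$out"
```

Only rm temp, vim memo, and lpr memo should be listed; the two oldest entries are discarded as newer ones arrive.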

Give the command history to display the events in the history list. This list is

ordered so that the oldest events appear at the top. A tcsh history list includes the

time the command was executed. The following history list includes a command to

modify the bash prompt so it displays the history event number. The last event in

the history list is the history command that displayed the list.

32 $ history | tail

23 PS1="\! bash$ "

24 ls -l

25 cat temp

26 rm temp

27 vim memo

28 lpr memo

29 vim memo

30 lpr memo

31 rm memo

32 history | tail

history can help track down mistakes

tip When you have made a mistake on a command line (not an error within a script or program) and

are not sure what you did wrong, look at the history list to review your recent commands. Some-

times this list can help you figure out what went wrong and how to fix things.

Table 8-7 History variables

Variable Default Function

HISTSIZE 500 events Maximum number of events saved during a session

HISTFILE ~/.bash_history Location of the history file

HISTFILESIZE 500 events Maximum number of events saved between sessions

As you run commands and your history list becomes longer, it may run off the top

of the screen when you use the history builtin. Pipe the output of history through less

to browse through it, or give the command history 10 or history | tail to look at the

ten most recent commands.

Reexecuting and Editing Commands

You can reexecute any event in the history list. This feature can save you time,

effort, and aggravation. Not having to reenter long command lines allows you to

reexecute events more easily, quickly, and accurately than you could if you had to

retype the command line in its entirety. You can recall, modify, and reexecute previ-

ously executed events in three ways: You can use the fc builtin (covered next), the

exclamation point commands (page 313), or the Readline Library, which uses a

one-line vi- or emacs-like editor to edit and execute events (page 318).

fc: Displays, Edits, and Reexecutes Commands

The fc (fix command) builtin (not in tcsh) enables you to display the history list and

to edit and reexecute previous commands. It provides many of the same capabilities

as the command-line editors.

Viewing the History List

When you call fc with the –l option, it displays commands from the history list.

Without any arguments, fc –l lists the 16 most recent commands in a numbered list,

with the oldest appearing first:

$ fc -l

1024 cd

1025 view calendar

1026 vim letter.adams01

1027 aspell -c letter.adams01

1028 vim letter.adams01

1029 lpr letter.adams01

1030 cd ../memos

1031 ls

1032 rm *0405

1033 fc -l

1034 cd

1035 whereis aspell

1036 man aspell

1037 cd /usr/share/doc/*aspell*

1038 pwd

1039 ls

1040 ls man-html

A handy history alias

tip Creating the following aliases makes working with history easier. The first allows you to give the

command h to display the ten most recent events. The second alias causes the command hg string

to display all events in the history list that contain string. Put these aliases in your ~/.bashrc file

to make them available each time you log in. See page 324 for more information.

$ alias 'h=history | tail'

$ alias 'hg=history | grep'

Which method to use?

tip If you are more familiar with vi or emacs and less familiar with the C or TC Shell, use fc or the

Readline Library. If you are more familiar with the C or TC Shell, use the exclamation point com-

mands. If it is a toss-up, try the Readline Library; it will benefit you in other areas of Linux more

than learning the exclamation point commands will.

The fc builtin can take zero, one, or two arguments with the –l option. The argu-

ments specify the part of the history list to be displayed:

fc –l [first [last]]

The fc builtin lists commands beginning with the most recent event that matches

first. The argument can be an event number, the first few characters of the command

line, or a negative number, which is taken to be the nth previous command. Without

last, fc displays events through the most recent. If you include last, fc displays commands

from the most recent event that matches first through the most recent event

that matches last.

The next command displays the history list from event 1030 through event 1035:

$ fc -l 1030 1035

1030 cd ../memos

1031 ls

1032 rm *0405

1033 fc -l

1034 cd

1035 whereis aspell

The following command lists the most recent event that begins with view through

the most recent command line that begins with whereis:

$ fc -l view whereis

1025 view calendar

1026 vim letter.adams01

1027 aspell -c letter.adams01

1028 vim letter.adams01

1029 lpr letter.adams01

1030 cd ../memos

1031 ls

1032 rm *0405

1033 fc -l

1034 cd

1035 whereis aspell

To list a single command from the history list, use the same identifier for the first

and second arguments. The following command lists event 1027:

$ fc -l 1027 1027

1027 aspell -c letter.adams01

Editing and Reexecuting Previous Commands

You can use fc to edit and reexecute previous commands.

fc [–e editor] [first [last]]

When you call fc with the –e option followed by the name of an editor, fc calls the

editor with event(s) in the Work buffer, assuming the editor you specify is installed.

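If you omit the –e option, fc uses the editor named by the FCEDIT variable or, if FCEDIT is not set, by EDITOR; when neither is set, the fallback depends on how bash was built (the bash documentation names vi; some systems substitute nano). Without first and last, it defaults to the most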

recent command. The next example invokes the vim editor to edit the most recent

command:

$ fc -e vi

The fc builtin uses the stand-alone vim editor. If you set the FCEDIT variable, you

do not need to use the –e option to specify an editor on the command line. Because

the value of FCEDIT has been changed to /usr/bin/emacs and fc has no arguments,

the following command edits the most recent command using the emacs editor (part

of the emacs package; not installed by default):

$ export FCEDIT=/usr/bin/emacs

$ fc

If you call it with a single argument, fc invokes the editor on the specified command.

The following example starts the editor with event 1029 in the Work buffer. When

you exit from the editor, the shell executes the command:

$ fc 1029

As described earlier, you can identify commands with numbers or by specifying the

first few characters of the command name. The following example calls the editor

to work on events from the most recent event that begins with the letters vim

through event 1030:

$ fc vim 1030

Reexecuting Commands Without Calling the Editor

You can reexecute previous commands without using an editor. If you call fc with

the –s option, it skips the editing phase and reexecutes the command. The following

example reexecutes event 1029:

$ fc -s 1029

lpr letter.adams01

The next example reexecutes the previous command:

$ fc -s

Clean up the fc buffer

caution When you execute an fc command, the shell executes whatever you leave in the editor buffer, pos-

sibly with unwanted results. If you decide you do not want to execute a command, delete everything

from the buffer before you exit from the editor.

When you reexecute a command, you can tell fc to substitute one string for another.

The next example substitutes the string john for the string adams in event 1029 and

executes the modified event:

$ fc -s adams=john 1029

lpr letter.john01
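The substitution form of fc –s can be tried safely by replacing lpr with echo. The following sketch writes a short script to a scratch file and runs it in a noninteractive bash with the history list turned on by set –o history. The scratch file, the variable names, and the use of echo are assumptions made for the demonstration.

```shell
demo=$(mktemp)
printf '%s\n' \
  'unset HISTFILE HISTCONTROL HISTIGNORE' \
  'set -o history' \
  'echo letter.adams01' \
  'fc -s adams=john' > "$demo"

# fc echoes the command it reexecutes on standard error; standard output is captured.
out=$(bash --norc --noprofile "$demo")
rm -f "$demo"
printf '%s\n' "$out"
```

The second line of output should be letter.john01, produced by the reexecuted echo command after john replaced adams.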

Using an Exclamation Point (!) to Reference Events

The C Shell history mechanism uses an exclamation point to reference events. This

technique, which is available under bash and tcsh, is frequently more cumbersome

to use than fc but nevertheless has some useful features. For example, the !! com-

mand reexecutes the previous event, and the shell replaces the !$ token with the last

word on the previous command line.

You can reference an event by using its absolute event number, its relative event

number, or the text it contains. All references to events, called event designators,

begin with an exclamation point (!). One or more characters follow the exclama-

tion point to specify an event.

You can put history events anywhere on a command line. To escape an exclamation point

so that the shell interprets it literally instead of as the start of a history event, precede the

exclamation point with a backslash (\) or enclose it within single quotation marks.

Event Designators

An event designator specifies a command in the history list. See Table 8-8 on the

next page for a list of event designators.

!! reexecutes the previous event

You can reexecute the previous event by giving a !! command. In the following

example, event 45 reexecutes event 44:

44 $ ls -l text

-rw-rw-r-- 1 max group 45 Apr 30 14:53 text

45 $ !!

ls -l text

-rw-rw-r-- 1 max group 45 Apr 30 14:53 text

The !! command works whether or not your prompt displays an event number. As

this example shows, when you use the history mechanism to reexecute an event, the

shell displays the command it is reexecuting.

!n event number A number following an exclamation point refers to an event. If that event is in the

history list, the shell executes it. Otherwise, the shell displays an error message. A

negative number following an exclamation point references an event relative to the

current event. For example, the command !–3 refers to the third preceding event.

After you issue a command, the relative event number of a given event changes

(event –3 becomes event –4). Both of the following commands reexecute event 44:

51 $ !44

ls -l text

-rw-rw-r-- 1 max group 45 Apr 30 14:53 text

52 $ !-8

ls -l text

-rw-rw-r-- 1 max group 45 Apr 30 14:53 text

!string event text When a string of text follows an exclamation point, the shell searches for and exe-

cutes the most recent event that began with that string. If you enclose the string

within question marks, the shell executes the most recent event that contained that

string. The final question mark is optional if a RETURN would immediately follow it.

68 $ history 10

59 ls -l text*

60 tail text5

61 cat text1 text5 > letter

62 vim letter

63 cat letter

64 cat memo

65 lpr memo

66 pine zach

67 ls -l

68 history 10

69 $ !l

ls -l

...

70 $ !lpr

lpr memo

71 $ !?letter?

cat letter

...
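Event designators are normally typed at an interactive prompt, but they can be exercised in a script when both the history and histexpand options are set. The sketch below assumes that behavior and feeds a throwaway script to a noninteractive bash; each designator is kept inside single quotes so the shell you type the example into does not expand it first. The file and variable names are illustrative.

```shell
demo=$(mktemp)
printf '%s\n' \
  'unset HISTFILE HISTCONTROL HISTIGNORE' \
  'set -o history -o histexpand' \
  'echo first' \
  'echo second' \
  '!!' \
  '!-3' \
  '!?fir?' > "$demo"

# The shell displays each expanded command on standard error;
# only the output of the commands themselves is captured here.
out=$(bash --norc --noprofile "$demo")
rm -f "$demo"
printf '%s\n' "$out"
```

Here !! repeats echo second, !–3 reaches back three events to echo first, and !?fir? finds the most recent event that contains fir.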

optional Word Designators

A word designator specifies a word (token) or series of words from an event.

(Table 8-9 on page 316 lists word designators.) The words are numbered starting

with 0 (the first word on the line—usually the command), continuing with 1 (the

first word following the command), and ending with n (the last word on the line).

Table 8-8 Event designators

Designator Meaning

! Starts a history event unless followed immediately by SPACE, NEWLINE, =, or (.

!! The previous command.

!n Command number n in the history list.

!–n The nth preceding command.

!string The most recent command line that started with string.

!?string[?] The most recent command that contained string. The last ? is optional.

!# The current command (as you have it typed so far).

!{event} The event is an event designator. The braces isolate event from the surrounding text. For example, !{–3}3 is the third most recently executed command followed by a 3.

To specify a particular word from a previous event, follow the event designator (such

as !14) with a colon and the number of the word in the previous event. For example,

!14:3 specifies the third word following the command from event 14. You can spec-

ify the first word following the command (word number 1) using a caret (^) and the

last word using a dollar sign ($). You can specify a range of words by separating two

word designators with a hyphen.

72 $ echo apple grape orange pear

apple grape orange pear

73 $ echo !72:2

echo grape

grape

74 $ echo !72:^

echo apple

apple

75 $ !72:0 !72:$

echo pear

pear

76 $ echo !72:2-4

echo grape orange pear

grape orange pear

77 $ !72:0-$

echo apple grape orange pear

apple grape orange pear

As the next example shows, !$ refers to the last word of the previous event. You can

use this shorthand to edit, for example, a file you just displayed with cat:

$ cat report.718

...

$ vim !$

vim report.718

...

If an event contains a single command, the word numbers correspond to the argu-

ment numbers. If an event contains more than one command, this correspondence

does not hold true for commands after the first. In the following example, event 78

contains two commands separated by a semicolon so the shell executes them

sequentially; the semicolon is word number 5.

78 $ !72 ; echo helen zach barbara

echo apple grape orange pear ; echo helen zach barbara

apple grape orange pear

helen zach barbara

79 $ echo !78:7

echo helen

helen

80 $ echo !78:4-7

echo pear ; echo helen

pear

helen
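Word designators can be tried outside an interactive session in the same way as event designators. The following sketch assumes a noninteractive bash with the history and histexpand options set, and it uses relative event references (!!, !–2, and so on) so that it does not depend on absolute event numbers. The scratch file and variable names are illustrative.

```shell
demo=$(mktemp)
printf '%s\n' \
  'unset HISTFILE HISTCONTROL HISTIGNORE' \
  'set -o history -o histexpand' \
  'echo apple grape orange pear' \
  'echo !!:2' \
  'echo !-2:^' \
  'echo !-3:2-4' \
  'echo !-4:$' > "$demo"

out=$(bash --norc --noprofile "$demo")   # expanded commands appear on standard error
rm -f "$demo"
printf '%s\n' "$out"
```

Each designator refers back to the first echo command: word 2 is grape, ^ is apple, 2-4 is grape orange pear, and $ is pear.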

Modifiers

On occasion you may want to change an aspect of an event you are reexecuting.

Perhaps you entered a complex command line with a typo or incorrect pathname

or you want to specify a different argument. You can modify an event or a word of

an event by putting one or more modifiers after the word designator, or after the

event designator if there is no word designator. Each modifier must be preceded by

a colon (:).

Substitute modifier The following example shows the substitute modifier correcting a typo in the previous

event:

$ car /home/zach/memo.0507 /home/max/letter.0507

bash: car: command not found

$ !!:s/car/cat

cat /home/zach/memo.0507 /home/max/letter.0507

...

The substitute modifier has the following syntax:

[g]s/old/new/

where old is the original string (not a regular expression) and new is the string that

replaces old. The substitute modifier substitutes the first occurrence of old with

new. Placing a g before the s (as in gs/old/new/) causes a global substitution, replac-

ing all occurrences of old. Although / is the delimiter in the examples, you can use

any character that is not in either old or new. The final delimiter is optional if a

RETURN would immediately follow it. As with the vim Substitute command, the history

mechanism replaces an ampersand (&) in new with old. The shell replaces a null

old string (s//new/) with the previous old string or string within a command that

you searched for with ?string?.

Table 8-9 Word designators

Designator Meaning

n The nth word. Word 0 is normally the command name.

^ The first word (after the command name).

$ The last word.

m–n All words from word number m through word number n; m defaults to 0 if you omit it (0–n).

n* All words from word number n through the last word.

* All words except the command name. The same as 1*.

% The word matched by the most recent ?string? search.

Quick substitution An abbreviated form of the substitute modifier is quick substitution. Use it to

reexecute the most recent event while changing some of the event text. The quick

substitution character is the caret (^). For example, the command

$ ^old^new^

produces the same results as

$ !!:s/old/new/

Thus substituting cat for car in the previous event could have been entered as

$ ^car^cat

cat /home/zach/memo.0507 /home/max/letter.0507

...

You can omit the final caret if it would be followed immediately by a RETURN. As with

other command-line substitutions, the shell displays the command line as it appears

after the substitution.
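The next sketch contrasts the single, global, and quick forms of substitution. It assumes a noninteractive bash with the history and histexpand options set; echo stands in for a real command, and /tmp/car.txt is a made-up pathname that is only displayed, never opened.

```shell
demo=$(mktemp)
printf '%s\n' \
  'unset HISTFILE HISTCONTROL HISTIGNORE' \
  'set -o history -o histexpand' \
  'echo car /tmp/car.txt' \
  '!!:s/car/cat/' \
  '!-2:gs/car/cat/' \
  '^cat^dog^' > "$demo"

out=$(bash --norc --noprofile "$demo")   # expanded commands appear on standard error
rm -f "$demo"
printf '%s\n' "$out"
```

The s modifier changes only the first car, gs changes both occurrences, and the quick substitution then replaces the first cat in the event that gs produced.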

Other modifiers Modifiers (other than the substitute modifier) perform simple edits on the part of

the event that has been selected by the event designator and the optional word des-

ignators. You can use multiple modifiers, each preceded by a colon (:).

The following series of commands uses ls to list the name of a file, repeats the com-

mand without executing it (p modifier), and repeats the last command, removing

the last part of the pathname (h modifier) again without executing it:

$ ls /var/log/messages

/var/log/messages

$ !!:p

ls /var/log/messages

$ !!:h:p

ls /var/log

Table 8-10 lists event modifiers other than the substitute modifier.

Table 8-10 Event modifiers

Modifier Function

e (extension) Removes all but the filename extension

h (head) Removes the last part of a pathname

p (print-not) Displays the command, but does not execute it

q (quote) Quotes the substitution to prevent further substitutions on it

r (root) Removes the filename extension

t (tail) Removes all elements of a pathname except the last

x Like q but quotes each word in the substitution individually
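Modifiers can also follow a word designator, in which case they edit only that word. The sketch below applies h, t, and r to word 1 of an earlier event. It assumes a noninteractive bash with the history and histexpand options set; the pathname /var/log/messages.old is used only as text for echo, and the scratch file name is illustrative.

```shell
demo=$(mktemp)
printf '%s\n' \
  'unset HISTFILE HISTCONTROL HISTIGNORE' \
  'set -o history -o histexpand' \
  'echo /var/log/messages.old' \
  'echo !!:1:h' \
  'echo !-2:1:t' \
  'echo !-3:1:r' > "$demo"

out=$(bash --norc --noprofile "$demo")   # expanded commands appear on standard error
rm -f "$demo"
printf '%s\n' "$out"
```

The head is /var/log, the tail is messages.old, and the root is /var/log/messages.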

The Readline Library

Command-line editing under the Bourne Again Shell is implemented through the

Readline Library, which is available to any application written in C. Any applica-

tion that uses the Readline Library supports line editing that is consistent with that

provided by bash. Programs that use the Readline Library, including bash, read

~/.inputrc (page 321) for key binding information and configuration settings. The

––noediting command-line option turns off command-line editing in bash.

vi mode You can choose one of two editing modes when using the Readline Library in bash:

emacs or vi(m). Both modes provide many of the commands available in the stand-

alone versions of the emacs and vim editors. You can also use the ARROW keys to move

around. Up and down movements move you backward and forward through the his-

tory list. In addition, Readline provides several types of interactive word completion

(page 320). The default mode is emacs; you can switch to vi mode with the following

command:

$ set -o vi

emacs mode The next command switches back to emacs mode:

$ set -o emacs

vi Editing Mode

Before you start, make sure the shell is in vi mode.

When you enter bash commands while in vi editing mode, you are in Input mode

(page 153). As you enter a command, if you discover an error before you press

RETURN, you can press ESCAPE to switch to vim Command mode. This setup is different

from the stand-alone vim editor’s initial mode. While in Command mode you can

use many vim commands to edit the command line. It is as though you were using

vim to edit a copy of the history file with a screen that has room for only one com-

mand. When you use the k command or the UP ARROW to move up a line, you access

the previous command. If you then use the j command or the DOWN ARROW to move

down a line, you return to the original command. To use the k and j keys to move

between commands, you must be in Command mode; you can use the ARROW keys in

both Command and Input modes.

In addition to cursor-positioning commands, you can use the search-backward (?)

command followed by a search string to look back through your history list for the

most recent command containing that string. If you have moved back in your history

list, use a forward slash (/) to search forward toward your most recent command.

Unlike the search strings in the stand-alone vim editor, these search strings cannot

contain regular expressions. You can, however, start the search string with a caret (^)

to force the shell to locate commands that start with the search string. As in vim,

pressing n after a successful search looks for the next occurrence of the same string.

The stand-alone editor starts in Command mode

tip The stand-alone vim editor starts in Command mode, whereas the command-line vim editor

starts in Input mode. If commands display characters and do not work properly, you are in Input

mode. Press ESCAPE and enter the command again.

You can also use event numbers to access events in the history list. While you are in

Command mode (press ESCAPE), enter the event number followed by a G to go to the

command with that event number.

When you use /, ?, or G to move to a command line, you are in Command mode,

not Input mode: You can edit the command or press RETURN to execute it.

Once the command you want to edit is displayed, you can modify the command line

using vim Command mode editing commands such as x (delete character), r (replace

character), ~ (change case), and . (repeat last change). To change to Input mode, use

an Insert (i, I), Append (a, A), Replace (R), or Change (c, C) command. You do not

have to return to Command mode to execute a command; simply press RETURN, even

if the cursor is in the middle of the command line.

Refer to page 196 for a summary of vim commands.

emacs Editing Mode

Unlike the vim editor, emacs is modeless. You need not switch between Command

mode and Input mode because most emacs commands are control characters

(page 216), allowing emacs to distinguish between input and commands. Like vim,

the emacs command-line editor provides commands for moving the cursor on the

command line and through the command history list and for modifying part or all

of a command. However, in a few cases, the emacs command-line editor commands

differ from those in the stand-alone emacs editor.

In emacs you perform cursor movement by using both CONTROL and ESCAPE commands.

To move the cursor one character backward on the command line, press CONTROL-B.

Press CONTROL-F to move one character forward. As in vim, you may precede these

movements with counts. To use a count you must first press ESCAPE; otherwise, the

numbers you type will appear on the command line.

Like vim, emacs provides word and line movement commands. To move backward

or forward one word on the command line, press ESCAPE b or ESCAPE f. To move several

words using a count, press ESCAPE followed by the number and the appropriate

escape sequence. To move to the beginning of the line, press CONTROL-A; to the end of

the line, press CONTROL-E; and to the next instance of the character c, press CONTROL-X

CONTROL-F followed by c.

You can add text to the command line by moving the cursor to the position you

want to enter text and typing the desired text. To delete text, move the cursor just to

the right of the characters that you want to delete and press the erase key (page 29)

once for each character you want to delete.

CONTROL-D can terminate your screen session

tip If you want to delete the character directly under the cursor, press CONTROL-D. If you enter CONTROL-D

at the beginning of the line, it may terminate your shell session.

If you want to delete the entire command line, type the line kill character (page 30).

You can type this character while the cursor is anywhere in the command line. If

you want to delete from the cursor to the end of the line, press CONTROL-K.

Refer to page 254 for a summary of emacs commands.

Readline Completion Commands

You can use the TAB key to complete words you are entering on the command line.

This facility, called completion, works in both vi and emacs editing modes and is

similar to the completion facility available in tcsh. Several types of completion are

possible, and which one you use depends on which part of a command line you are

typing when you press TAB.

Command Completion

If you are typing the name of a command (usually the first word on the command

line), pressing TAB initiates command completion, in which bash looks for a com-

mand whose name starts with the part of the word you have typed. If no command

starts with the characters you entered, bash beeps. If there is one such command,

bash completes the command name. If there is more than one choice, bash does

nothing in vi mode and beeps in emacs mode. Pressing TAB a second time causes bash

to display a list of commands whose names start with the prefix you typed and

allows you to continue typing the command name.

In the following example, the user types bz and presses TAB. The shell beeps (the user

is in emacs mode) to indicate that several commands start with the letters bz. The

user enters another TAB to cause the shell to display a list of commands that start

with bz followed by the command line as the user had entered it so far:

$ bz → TAB (beep) → TAB

bzcat bzdiff bzip2 bzless

bzcmp bzgrep bzip2recover bzmore

$ bz■

Next the user types c and presses TAB twice. The shell displays the two commands

that start with bzc. The user types a followed by TAB. At this point the shell com-

pletes the command because only one command starts with bzca.

$ bzc → TAB (beep) → TAB

bzcat bzcmp

$ bzca → TAB → t ■

Pathname Completion

Pathname completion, which also uses TABs, allows you to type a portion of a path-

name and have bash supply the rest. If the portion of the pathname you have typed

is sufficient to determine a unique pathname, bash displays that pathname. If more

than one pathname would match it, bash completes the pathname up to the point

where there are choices so that you can type more.

When you are entering a pathname, including a simple filename, and press TAB, the

shell beeps (if the shell is in emacs mode—in vi mode there is no beep). It then

extends the command line as far as it can.

$ cat films/dar → TAB (beep) cat films/dark_■

In the films directory every file that starts with dar has k_ as the next characters, so

bash cannot extend the line further without making a choice among files. The shell

leaves the cursor just past the _ character. At this point you can continue typing the

pathname or press TAB twice. In the latter case bash beeps, displays your choices,

redisplays the command line, and again leaves the cursor just after the _ character.

$ cat films/dark_ → TAB (beep) → TAB

dark_passage dark_victory

$ cat films/dark_■

When you add enough information to distinguish between the two possible files and

press TAB, bash displays the unique pathname. If you enter p followed by TAB after the

_ character, the shell completes the command line:

$ cat films/dark_p → TAB → assage

Because there is no further ambiguity, the shell appends a SPACE so you can finish typ-

ing the command line or just press RETURN to execute the command. If the complete

pathname is that of a directory, bash appends a slash (/) in place of a SPACE.

Variable Completion

When you are typing a variable name, pressing TAB results in variable completion,

wherein bash attempts to complete the name of the variable. In case of an ambiguity,

pressing TAB twice displays a list of choices:

$ echo $HO → TAB → TAB

$HOME $HOSTNAME $HOSTTYPE

$ echo $HOM → TAB → E
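The candidates that TAB would offer can also be listed without pressing any keys by using the compgen builtin, which this section does not otherwise cover. The sketch below is an approximation of pathname and variable completion, not a description of how Readline itself is implemented. The scratch directory, the three filenames, and the prefix HOST are assumptions chosen for the example.

```shell
work=$(mktemp -d)
mkdir "$work/films"
touch "$work/films/dark_passage" "$work/films/dark_victory" "$work/films/casablanca"

# compgen -f lists filename completions; compgen -v lists variable-name completions.
files=$(cd "$work" && bash --norc --noprofile -c 'compgen -f -- films/dar' | sort)
vars=$(bash --norc --noprofile -c 'compgen -v HOST' | sort)
rm -rf "$work"

printf '%s\n' "$files" "$vars"
```

The first two lines of output should be the two dark_ files; the variable list should include HOSTNAME and HOSTTYPE, along with any other variables in your environment whose names start with HOST.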

.inputrc: Configuring the Readline Library

The Bourne Again Shell and other programs that use the Readline Library read the

file specified by the INPUTRC environment variable to obtain initialization infor-

mation. If INPUTRC is not set, these programs read the ~/.inputrc file. They ignore

lines of .inputrc that are blank or that start with a pound sign (#).

Variables

You can set variables in .inputrc to control the behavior of the Readline Library

using the following syntax:

set variable value

Table 8-11 lists some variables and values you can use. See Readline Variables in the

bash man or info page for a complete list.

Pressing RETURN executes the command

caution Pressing RETURN causes the shell to execute the command regardless of where the cursor is on

the command line.

Key Bindings

You can specify bindings that map keystroke sequences to Readline commands, allow-

ing you to change or extend the default bindings. Like the emacs editor, the Readline

Library includes many commands that are not bound to a keystroke sequence. To use

an unbound command, you must map it using one of the following forms:

keyname:command_name

"keystroke_sequence":command_name

In the first form, you spell out the name for a single key. For example, CONTROL-U would

be written as control-u. This form is useful for binding commands to single keys.

In the second form, you specify a string that describes a sequence of keys that will be

bound to the command. You can use the emacs-style backslash escape sequences to

represent the special keys CONTROL (\C), META (\M), and ESCAPE (\e). Specify a backslash

by escaping it with another backslash: \\. Similarly, a double or single quotation

mark can be escaped with a backslash: \" or \'.

The kill-whole-line command, available in emacs mode only, deletes the current

line. Put the following command in .inputrc to bind the kill-whole-line command

(which is unbound by default) to the keystroke sequence CONTROL-R:

control-r: kill-whole-line

bind Give the command bind –P to display a list of all Readline commands. If a command

is bound to a key sequence, that sequence is shown. Commands you can use in vi

mode start with vi. For example, vi-next-word and vi-prev-word move the cursor to

the beginning of the next and previous words, respectively. Commands that do not

begin with vi are generally available in emacs mode.

Table 8-11 Readline variables

| Variable | Effect |
|---|---|
| editing-mode | Set to vi to start Readline in vi mode. Set to emacs to start Readline in emacs mode (the default). Similar to the set -o vi and set -o emacs shell commands (page 318). |
| horizontal-scroll-mode | Set to on to cause long lines to extend off the right edge of the display area. Moving the cursor to the right when it is at the right edge of the display area shifts the line to the left so you can see more of the line. You can shift the line back by moving the cursor back past the left edge. The default value is off, which causes long lines to wrap onto multiple lines of the display. |
| mark-directories | Set to off to cause Readline not to place a slash (/) at the end of directory names it completes. The default value is on. |
| mark-modified-lines | Set to on to cause Readline to precede modified history lines with an asterisk. The default value is off. |

Use bind -q to determine which key sequence is bound to a command:

$ bind -q kill-whole-line

kill-whole-line can be invoked via "\C-r".

You can also bind text by enclosing it within double quotation marks (emacs mode

only):

"QQ": "The Linux Operating System"

This command causes bash to insert the string The Linux Operating System when

you type QQ.

Conditional Constructs

You can conditionally select parts of the .inputrc file using the $if directive. The

syntax of the conditional construct is

$if test[=value]

commands

[$else

commands]

$endif

where test is mode, term, or bash. If test equals value (or if test is true when value is

not specified), this structure executes the first set of commands. If test does not

equal value (or if test is false when value is not specified), this construct executes

the second set of commands if they are present or exits from the structure if they are

not present.

The power of the $if directive lies in the three types of tests it can perform.

1. You can test to see which mode is currently set.

$if mode=vi

The preceding test is true if the current Readline mode is vi and false

otherwise. You can test for vi or emacs.

2. You can test the type of terminal.

$if term=xterm

The preceding test is true if the TERM variable is set to xterm. You can

test for any value of TERM.

3. You can test the application name.

$if bash

The preceding test is true when you are running bash and not another pro-

gram that uses the Readline Library. You can test for any application name.

These tests can customize the Readline Library based on the current mode, the type of

terminal, and the application you are using. They give you a great deal of power and

flexibility when you are using the Readline Library with bash and other programs.

The following commands in .inputrc cause CONTROL-Y to move the cursor to the beginning

of the next word regardless of whether bash is in vi or emacs mode:

$ cat ~/.inputrc

set editing-mode vi

$if mode=vi

"\C-y": vi-next-word

$else

"\C-y": forward-word

$endif

Because bash reads the preceding conditional construct when it is started, you must

set the editing mode in .inputrc. Changing modes interactively using set will not

change the binding of CONTROL-Y.

For more information on the Readline Library, open the bash man page and give the

command /^READLINE, which searches for the word READLINE at the beginning

of a line.

Aliases

An alias is a (usually short) name that the shell translates into another (usually

longer) name or (complex) command. Aliases allow you to define new commands

by substituting a string for the first token of a simple command. They are typically

placed in the ~/.bashrc (bash) or ~/.tcshrc (tcsh) startup files so that they are avail-

able to interactive subshells.

Under bash the syntax of the alias builtin is

alias [name[=value]]

Under tcsh the syntax is

alias [name [value]]

In the bash syntax no SPACEs are permitted around the equal sign. If value contains

SPACEs or TABs, you must enclose value within quotation marks. Unlike aliases under

tcsh, a bash alias does not accept an argument from the command line in value. Use

a bash function (page 327) when you need to use an argument.
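As a minimal sketch of that last point, the following bash function does something an alias cannot do: it uses its argument in two places within the command. The name mkcd is hypothetical and is not part of bash.

```shell
# mkcd is a hypothetical helper: create a directory, then make it the
# working directory. The argument is used twice, which an alias cannot do.
mkcd() {
    mkdir -p -- "$1" && cd -- "$1"
}
```

After you define the function, a command such as mkcd reports/2009 creates the directory if necessary and then changes to it.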

An alias does not replace itself, which avoids the possibility of infinite recursion in

handling an alias such as the following:

$ alias ls='ls -F'

You can nest aliases. Aliases are disabled for noninteractive shells (that is, shell

scripts). To see a list of the current aliases, give the command alias. To view the alias

for a particular name, give the command alias followed by the name of the alias.

You can use the unalias builtin to remove an alias.

If Readline commands do not work, log out and log in again

tip The Bourne Again Shell reads ~/.inputrc when you log in. After you make changes to this file, you

must log out and log in again before the changes will take effect.

When you give an alias builtin command without any arguments, the shell displays

a list of all defined aliases:

$ alias

alias ll='ls -l'

alias l='ls -ltr'

alias ls='ls -F'

alias zap='rm -i'

Most Linux distributions define at least some aliases. Give an alias command to see

which aliases are in effect. You can delete the aliases you do not want from the

appropriate startup file.

Single Versus Double Quotation Marks in Aliases

The choice of single or double quotation marks is significant in the alias syntax

when the alias includes variables. If you enclose value within double quotation

marks, any variables that appear in value are expanded when the alias is created. If

you enclose value within single quotation marks, variables are not expanded until

the alias is used. The following example illustrates the difference.

The PWD keyword variable holds the pathname of the working directory. Max cre-

ates two aliases while he is working in his home directory. Because he uses double

quotation marks when he creates the dirA alias, the shell substitutes the value of the

working directory when he creates this alias. The alias dirA command displays the

dirA alias and shows that the substitution has already taken place:

$ echo $PWD

/home/max

$ alias dirA="echo Working directory is $PWD"

$ alias dirA

alias dirA='echo Working directory is /home/max'

When Max creates the dirB alias, he uses single quotation marks, which prevent the

shell from expanding the $PWD variable. The alias dirB command shows that the

dirB alias still holds the unexpanded $PWD variable:

$ alias dirB='echo Working directory is $PWD'

$ alias dirB

alias dirB='echo Working directory is $PWD'

After creating the dirA and dirB aliases, Max uses cd to make cars his working

directory and gives each of the aliases as commands. The alias he created using dou-

ble quotation marks displays the name of the directory he created the alias in as the

working directory (which is wrong). In contrast, the dirB alias displays the proper

name of the working directory:

$ cd cars

$ dirA

Working directory is /home/max

$ dirB

Working directory is /home/max/cars

Examples of Aliases

The following alias allows you to type r to repeat the previous command or r abc to

repeat the last command line that began with abc:

$ alias r='fc -s'

If you use the command ls –ltr frequently, you can create an alias that substitutes ls

–ltr when you give the command l:

$ alias l='ls -ltr'

$ l

total 41

-rw-r--r-- 1 max group 30015 Mar 1 2008 flute.ps

-rw-r----- 1 max group 3089 Feb 11 2009 XTerm.ad

-rw-r--r-- 1 max group 641 Apr 1 2009 fixtax.icn

-rw-r--r-- 1 max group 484 Apr 9 2009 maptax.icn

drwxrwxr-x 2 max group 1024 Aug 9 17:41 Tiger

drwxrwxr-x 2 max group 1024 Sep 10 11:32 testdir

-rwxr-xr-x 1 max group 485 Oct 21 08:03 floor

drwxrwxr-x 2 max group 1024 Oct 27 20:19 Test_Emacs

Another common use of aliases is to protect yourself from mistakes. The following exam-

ple substitutes the interactive version of the rm utility when you give the command zap:

$ alias zap='rm -i'

$ zap f*

rm: remove 'fixtax.icn'? n

rm: remove 'flute.ps'? n

rm: remove 'floor'? n

The –i option causes rm to ask you to verify each file that would be deleted,

thereby helping you avoid deleting the wrong file. You can also alias rm with the

rm –i command: alias rm='rm –i'.

The aliases in the next example cause the shell to substitute ls –l each time you give

an ll command and ls –F each time you use ls:

$ alias ls='ls -F'

$ alias ll='ls -l'

$ ll

total 41

drwxrwxr-x 2 max group 1024 Oct 27 20:19 Test_Emacs/

drwxrwxr-x 2 max group 1024 Aug 9 17:41 Tiger/

-rw-r----- 1 max group 3089 Feb 11 2009 XTerm.ad

-rw-r--r-- 1 max group 641 Apr 1 2009 fixtax.icn

-rw-r--r-- 1 max group 30015 Mar 1 2008 flute.ps

-rwxr-xr-x 1 max group 485 Oct 21 08:03 floor*

-rw-r--r-- 1 max group 484 Apr 9 2009 maptax.icn

drwxrwxr-x 2 max group 1024 Sep 10 11:32 testdir/

How to prevent the shell from invoking an alias

tip The shell checks only simple, unquoted commands to see if they are aliases. Commands given as

relative or absolute pathnames and quoted commands are not checked. When you want to give a

command that has an alias but do not want to use the alias, precede the command with a backslash,

specify the command’s absolute pathname, or give the command as ./command.

The –F option causes ls to print a slash (/) at the end of directory names and an

asterisk (*) at the end of the names of executable files. In this example, the string

that replaces the alias ll (ls –l) itself contains an alias (ls). When it replaces an alias

with its value, the shell looks at the first word of the replacement string to see

whether it is an alias. In the preceding example, the replacement string contains the

alias ls, so a second substitution occurs to produce the final command ls –F –l. (To

avoid a recursive plunge, the ls in the replacement text, although an alias, is not

expanded a second time.)

When given a list of aliases without the =value or value field, the alias builtin

responds by displaying the value of each defined alias. The alias builtin reports an

error if an alias has not been defined:

$ alias ll l ls zap wx

alias ll='ls -l'

alias l='ls -ltr'

alias ls='ls -F'

alias zap='rm -i'

bash: alias: wx: not found

You can avoid alias substitution by preceding the aliased command with a backslash (\):

$ \ls

Test_Emacs XTerm.ad flute.ps maptax.icn

Tiger fixtax.icn floor testdir

Because the replacement of an alias name with the alias value does not change the rest of

the command line, any arguments are still received by the command that gets executed:

$ ll f*

-rw-r--r-- 1 max group 641 Apr 1 2009 fixtax.icn

-rw-r--r-- 1 max group 30015 Mar 1 2008 flute.ps

-rwxr-xr-x 1 max group 485 Oct 21 08:03 floor*

You can remove an alias with the unalias builtin. When the zap alias is removed, it is no

longer displayed with the alias builtin and its subsequent use results in an error message:

$ unalias zap

$ alias

alias ll='ls -l'

alias l='ls -ltr'

alias ls='ls -F'

$ zap maptax.icn

bash: zap: command not found
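The builtins shown in this section can be combined into a short sketch. It only defines, displays, and removes an alias, so it should behave the same way in a script, where aliases are not expanded by default. The variable names zap_definition and zap_defined are illustrative.

```shell
# Define an alias and capture how the alias builtin displays it.
alias zap='rm -i'
zap_definition=$(alias zap)

# Remove the alias.
unalias zap

# After unalias, asking for the alias fails with a nonzero exit status.
if alias zap >/dev/null 2>&1; then
    zap_defined=yes
else
    zap_defined=no
fi

echo "before: $zap_definition; still defined: $zap_defined"
```

Testing the exit status of alias name, as the if statement does, is one way for a startup file to check whether an alias already exists before defining it.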

Functions

A bash shell function (tcsh does not have functions) is similar to a shell script in that

it stores a series of commands for execution at a later time. However, because the

shell stores a function in the computer’s main memory (RAM) instead of in a file on

the disk, the shell can access it more quickly than the shell can access a script. The

shell also preprocesses (parses) a function so that it starts up more quickly than a

script. Finally the shell executes a shell function in the same shell that called it, so

calling a function avoids the overhead of starting a subshell (as happens when you run

a script).

You can declare a shell function in the ~/.bash_profile startup file, in the script that

uses it, or directly from the command line. You can remove functions with the unset

builtin. The shell does not retain functions after you log out.

The syntax that declares a shell function is

[function] function-name ()

{

commands

}

where the word function is optional, function-name is the name you use to call the

function, and commands comprise the list of commands the function executes when

you call it. The commands can be anything you would include in a shell script,

including calls to other functions.

The opening brace ({) can appear on the same line as the function name. Aliases are

expanded when a function definition is read, not when the function is executed; variables

are expanded when the function is executed. You can use the return builtin within a

function to terminate its execution.
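The following is a minimal sketch of how bash evaluates a function body when the function is called: the variable is expanded each time the function runs, and the return builtin ends the function early. The names greeting, show_greeting, and require_file are hypothetical.

```shell
greeting="hello"

show_greeting() {
    # $greeting is expanded when the function runs, not when it is defined.
    echo "$greeting"
}

first_call=$(show_greeting)
greeting="goodbye"
second_call=$(show_greeting)

require_file() {
    # return ends the function immediately and sets its exit status.
    [ -e "$1" ] || return 1
    echo "found $1"
}

echo "$first_call / $second_call"
```

Because the second call sees the new value of greeting, the two captured strings differ even though the function definition did not change.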

Shell functions are useful as a shorthand as well as to define special commands. The

following function starts a process named process in the background, with the out-

put normally displayed by process being saved in .process.out:

start_process() {

process > .process.out 2>&1 &

}

The next example creates a simple function that displays the date, a header, and a

list of the people who are logged in on the system. This function runs the same com-

mands as the whoson script described on page 278. In this example the function is

being entered from the keyboard. The greater than (>) signs are secondary shell

prompts (PS2); do not enter them.

$ function whoson ()

> {

> date

> echo "Users Currently Logged On"

> who

> }

$ whoson

Sun Aug 9 15:44:58 PDT 2009

Users Currently Logged On

hls console Aug 8 08:59 (:0)

max pts/4 Aug 8 09:33 (0.0)

zach pts/7 Aug 8 09:23 (bravo.example.com)

Removing variables and functions

tip If you have a shell variable and a function with the same name, using unset removes the shell

variable. If you then use unset again with the same name, it removes the function.

Functions in startup files

If you want to have the whoson function always be available without having to

enter it each time you log in, put its definition in ~/.bash_profile. Then run

.bash_profile, using the . (dot) command to put the changes into effect immediately:

$ cat ~/.bash_profile

export TERM=vt100

stty kill '^u'

whoson ()

{

date

echo "Users Currently Logged On"

who

}

$ . ~/.bash_profile

You can specify arguments when you call a function. Within the function these

arguments are available as positional parameters (page 440). The following exam-

ple shows the arg1 function entered from the keyboard:

$ arg1 ( ) {

> echo "$1"

> }

$ arg1 first_arg

first_arg

See the function switch () on page 274 for another example of a function. “Func-

tions” on page 437 discusses the use of local and global variables within a function.
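As a further sketch of positional parameters inside a function, the following hypothetical function, show_args, reports how many arguments it received and lists each one. Quoting "$@" keeps an argument that contains a SPACE as a single word.

```shell
# show_args is a hypothetical function that reports how it was called.
show_args() {
    echo "count: $#"
    echo "first: $1"
    local arg
    for arg in "$@"; do      # "$@" keeps each argument as a separate word
        echo "arg: $arg"
    done
}

show_args "two words" second
```

The call passes two arguments, so the function reports a count of 2 even though the command line contains three words separated by SPACEs.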

optional The following function allows you to export variables using tcsh syntax. The env

builtin lists all environment variables and their values and verifies that setenv

worked correctly:

$ cat .bash_profile

...

# setenv - keep tcsh users happy

function setenv()

{

if [ $# -eq 2 ]

then

eval $1=$2

export $1

else

echo "Usage: setenv NAME VALUE" 1>&2

fi

}

$ . ~/.bash_profile

$ setenv TCL_LIBRARY /usr/local/lib/tcl

$ env | grep TCL_LIBRARY

TCL_LIBRARY=/usr/local/lib/tcl

eval The $# special parameter (page 441) takes on the value of the number of command-

line arguments. This function uses the eval builtin to force bash to scan the command

$1=$2 twice. Because $1=$2 begins with a dollar sign ($), the shell treats the entire

string as a single token—a command. With variable substitution performed, the

command name becomes TCL_LIBRARY=/usr/local/lib/tcl, which results in an

error. Using eval, a second scanning splits the string into the three desired tokens, and

the correct assignment occurs.
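Because eval rescans its arguments, a value that contains SPACEs or shell special characters can be split or even executed during the second scan. One hedged alternative is to give export a single quoted NAME=VALUE word, which needs no second scan. The name setenv_quoted is hypothetical, and this is a sketch under that assumption, not a replacement for the function above.

```shell
setenv_quoted() {
    if [ $# -eq 2 ]; then
        # "$1=$2" is a single word, so export performs the assignment itself.
        export "$1=$2"
    else
        echo "Usage: setenv_quoted NAME VALUE" 1>&2
        return 1
    fi
}

setenv_quoted TCL_LIBRARY "/usr/local/lib/tcl"
setenv_quoted DEMO_MESSAGE "two words; still one value"
```

The second call shows the difference: the semicolon and SPACEs stay part of the value because the shell never scans the value as a command.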

Controlling bash: Features and Options

This section explains how to control bash features and options using command-line

options and the set and shopt builtins.

Command-Line Options

Two kinds of command-line options are available: short and long. Short options

consist of a hyphen followed by a letter; long options have two hyphens followed by

multiple characters. Long options must appear before short options on a command

line that calls bash. Table 8-12 lists some commonly used command-line options.
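The following sketch shows a few of these options on command lines that call bash from a script. It assumes that a program named bash is in your PATH; the variable names are illustrative. The long options appear before the short ones.

```shell
# Start a child bash that skips the startup files and runs one command.
clean_output=$(bash --noprofile --norc -c 'echo started without startup files')

# -O sets a shopt option for the child shell; shopt -q tests it quietly.
if bash --norc -O dotglob -c 'shopt -q dotglob'; then
    dotglob_state=on
else
    dotglob_state=off
fi

echo "$clean_output (dotglob: $dotglob_state)"
```

Options given this way affect only the child shell; the shell that runs these commands keeps its own settings.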

Table 8-12 Command-line options

| Option | Explanation | Syntax |
|---|---|---|
| Help | Displays a usage message. | --help |
| No edit | Prevents users from using the Readline Library (page 318) to edit command lines in an interactive shell. | --noediting |
| No profile | Prevents reading these startup files (page 271): /etc/profile, ~/.bash_profile, ~/.bash_login, and ~/.profile. | --noprofile |
| No rc | Prevents reading the ~/.bashrc startup file (page 272). This option is on by default if the shell is called as sh. | --norc |
| POSIX | Runs bash in POSIX mode. | --posix |
| Version | Displays bash version information and exits. | --version |
| shopt | Runs a shell with the opt shopt option (next page). A -O (uppercase "O") sets the option; +O unsets it. | [±]O [opt] |
| End of options | On the command line, signals the end of options. Subsequent tokens are treated as arguments even if they begin with a hyphen (-). | -- |

Shell Features

You can control the behavior of the Bourne Again Shell by turning features on and

off. Different features use different methods to turn features on and off. The set

builtin controls one group of features, while the shopt builtin controls another group.

You can also control many features from the command line you use to call bash.

set ±o: Turns Shell Features On and Off

The bash set builtin (there is a set builtin in tcsh, but it works differently), when

used with the –o or +o option, enables, disables, and lists certain bash features. For

example, the following command turns on the noclobber feature (page 129):

$ set -o noclobber

You can turn this feature off (the default) by giving the command

$ set +o noclobber

The command set –o without an option lists each of the features controlled by set, fol-

lowed by its state (on or off). The command set +o without an option lists the same fea-

tures in a form you can use as input to the shell. Table 8-13 (next page) lists bash features.
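As a sketch of what noclobber does once it is turned on, the following commands try to overwrite an existing file, then force the overwrite with the >| redirection operator. The file comes from mktemp, and the variable names are illustrative.

```shell
target=$(mktemp)          # mktemp creates the file, so it already exists

set -o noclobber
# With noclobber on, > refuses to overwrite an existing file.
if { echo "first try" > "$target"; } 2>/dev/null; then
    overwrite=allowed
else
    overwrite=refused
fi

# >| overrides noclobber for a single redirection.
echo "forced" >| "$target"
contents=$(cat "$target")

set +o noclobber          # restore the default
rm -f "$target"
echo "overwrite was $overwrite; file held: $contents"
```

Turning the feature off again at the end keeps the change from affecting commands that follow.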

shopt: Turns Shell Features On and Off

The shopt (shell option) builtin (not available in tcsh) enables, disables, and lists cer-

tain bash features that control the behavior of the shell. For example, the following

command causes bash to include filenames that begin with a period (.) when it

expands ambiguous file references (the –s stands for set):

$ shopt -s dotglob

You can turn this feature off (the default) by giving the following command (the –u

stands for unset):

$ shopt -u dotglob

The shell displays how a feature is set if you give the name of the feature as the only

argument to shopt:

$ shopt dotglob

dotglob off

The command shopt without any options or arguments lists the features controlled

by shopt and their state. The command shopt –s without an argument lists the fea-

tures controlled by shopt that are set or on. The command shopt –u lists the features

that are unset or off. Table 8-13 lists bash features.
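The effect of dotglob can be sketched with a temporary directory that holds one hidden file and one ordinary file. The example assumes dotglob starts out off (the default); the directory and array names are illustrative.

```shell
workdir=$(mktemp -d)
touch "$workdir/.hidden" "$workdir/visible"

default_matches=( "$workdir"/* )      # dotglob off: .hidden is skipped

shopt -s dotglob
dotglob_matches=( "$workdir"/* )      # dotglob on: .hidden is included
shopt -u dotglob                      # restore the default

echo "default: ${#default_matches[@]} file(s); dotglob: ${#dotglob_matches[@]} file(s)"
rm -r "$workdir"
```

Storing the result of each expansion in an array makes it easy to count the filenames the pattern matched.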

Features, options, variables?

tip To avoid confusing terminology, this book refers to the various shell behaviors that you can con-

trol as features. The bash info page refers to them as “options” and “values of variables control-

ling optional shell behavior.”

Setting set ±o features using shopt

tip You can use shopt to set/unset features that are otherwise controlled by set ±o. Use the regular

shopt syntax with –s or –u and include the –o option. For example, the following command turns

on the noclobber feature:

$ shopt -o -s noclobber

Table 8-13 bash features

| Feature | Description | Syntax | Alternate syntax |
|---|---|---|---|
| allexport | Automatically exports all variables and functions you create or modify after giving this command. | set -o allexport | set -a |
| braceexpand | Causes bash to perform brace expansion (the default; page 336). | set -o braceexpand | set -B |
| cdspell | Corrects minor spelling errors in directory names used as arguments to cd. | shopt -s cdspell | |
| cmdhist | Saves all lines of a multiline command in the same history entry, adding semicolons as needed. | shopt -s cmdhist | |
| dotglob | Causes shell special characters (wildcards; page 136) in an ambiguous file reference to match a leading period in a filename. By default special characters do not match a leading period. You must always specify the filenames . and .. explicitly because no pattern ever matches them. | shopt -s dotglob | |
| emacs | Specifies emacs editing mode for command-line editing (the default; page 319). | set -o emacs | |
| errexit | Causes bash to exit when a simple command (not a control structure) fails. | set -o errexit | set -e |
| execfail | Causes a shell script to continue running when it cannot find the file that is given as an argument to exec. By default a script terminates when exec cannot find the file that is given as its argument. | shopt -s execfail | |
| expand_aliases | Causes aliases (page 324) to be expanded (by default it is on for interactive shells and off for noninteractive shells). | shopt -s expand_aliases | |
| hashall | Causes bash to remember where commands it has found using PATH (page 297) are located (default). | set -o hashall | set -h |
| histappend | Causes bash to append the history list to the file named by HISTFILE (page 308) when the shell exits. By default bash overwrites this file. | shopt -s histappend | |
| histexpand | Turns on the history mechanism (which uses exclamation points by default; page 313). Turn this feature off to turn off history expansion. | set -o histexpand | set -H |
| history | Enables command history (on by default; page 308). | set -o history | |
| huponexit | Specifies that bash send a SIGHUP signal to all jobs when an interactive login shell exits. | shopt -s huponexit | |
| ignoreeof | Specifies that bash must receive ten EOF characters before it exits. Useful on noisy dial-up lines. | set -o ignoreeof | |
| monitor | Enables job control (on by default; page 285). | set -o monitor | set -m |
| nocaseglob | Causes ambiguous file references (page 136) to match filenames without regard to case (off by default). | shopt -s nocaseglob | |
| noclobber | Helps prevent overwriting files (off by default; page 129). | set -o noclobber | set -C |
| noglob | Disables pathname expansion (off by default; page 136). | set -o noglob | set -f |
| notify | With job control (page 285) enabled, reports the termination status of background jobs immediately. The default behavior is to display the status just before the next prompt. | set -o notify | set -b |
| nounset | Displays an error when you use an unset variable; a noninteractive shell (shell script) also exits. The default is to display a null value for an unset variable. | set -o nounset | set -u |
| nullglob | Causes bash to expand ambiguous file references (page 136) that do not match a filename to a null string. By default bash passes these file references without expanding them. | shopt -s nullglob | |
| posix | Runs bash in POSIX mode. | set -o posix | |
| verbose | Displays command lines as bash reads them. | set -o verbose | set -v |
| vi | Specifies vi editing mode for command-line editing (page 318). | set -o vi | |
| xpg_echo | Causes the echo builtin to expand backslash escape sequences without the need for the -e option (page 424). | shopt -s xpg_echo | |
| xtrace | Turns on shell debugging (page 410). | set -o xtrace | set -x |
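As one example of a feature from Table 8-13, the following sketch compares how bash treats an ambiguous file reference that matches nothing, first with nullglob off (the default) and then with it on. The directory is created with mktemp; the array names are illustrative.

```shell
emptydir=$(mktemp -d)

unmatched=( "$emptydir"/*.txt )       # default: the pattern is passed through
shopt -s nullglob
expanded=( "$emptydir"/*.txt )        # nullglob: the pattern expands to nothing
shopt -u nullglob                     # restore the default

echo "default: ${#unmatched[@]} word(s); nullglob: ${#expanded[@]} word(s)"
rmdir "$emptydir"
```

With nullglob set, a for loop over such a pattern runs zero times instead of running once with the unexpanded pattern as its value.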

Processing the Command Line

Whether you are working interactively or running a shell script, bash needs to read

a command line before it can start processing it—bash always reads at least one line

before processing a command. Some bash control structures, such as if and case, as well as

functions and quoted strings, span multiple lines. When bash recognizes a command

that covers more than one line, it reads the entire command before processing it. In

interactive sessions, bash prompts you with the secondary prompt (PS2, > by

default; page 300) as you type each line of a multiline command until it recognizes

the end of the command:

$ echo 'hi

> end'

hi

end

$ function hello () {

> echo hello there

> }

After reading a command line, bash applies history expansion and alias substitution

to the line.

History Expansion

“Reexecuting and Editing Commands” on page 310 discusses the commands you

can give to modify and reexecute command lines from the history list. History

expansion is the process that bash uses to turn a history command into an execut-

able command line. For example, when you give the command !!, history expansion

changes that command line so it is the same as the previous one. History expansion

is turned on by default for interactive shells; set +o histexpand turns it off. History

expansion does not apply to noninteractive shells (shell scripts).

Alias Substitution

Aliases (page 324) substitute a string for the first word of a simple command. By

default aliases are turned on for interactive shells and off for noninteractive shells.

Give the command shopt –u expand_aliases to turn aliases off.

Parsing and Scanning the Command Line

After processing history commands and aliases, bash does not execute the command

immediately. One of the first things the shell does is to parse (isolate strings of charac-

ters in) the command line into tokens or words. The shell then scans each token for

special characters and patterns that instruct the shell to take certain actions. These

actions can involve substituting one word or words for another. When the shell parses

the following command line, it breaks it into three tokens (cp, ~/letter, and .):

$ cp ~/letter .

After separating tokens and before executing the command, the shell scans the

tokens and performs command-line expansion.

Command-Line Expansion

Both interactive and noninteractive shells transform the command line using command-

line expansion before passing the command line to the program being called. You can

use a shell without knowing much about command-line expansion, but you can use

what a shell has to offer to a better advantage with an understanding of this topic. This

section covers Bourne Again Shell command-line expansion; TC Shell command-line

expansion is covered starting on page 354.

The Bourne Again Shell scans each token for the various types of expansion and

substitution in the following order. Most of these processes expand a word into a

single word. Only brace expansion, word splitting, and pathname expansion can

change the number of words in a command (except for the expansion of the vari-

able "$@"—see page 444).

1. Brace expansion (page 336)

2. Tilde expansion (page 337)

3. Parameter and variable expansion (page 338)

4. Arithmetic expansion (page 338)

5. Command substitution (page 340)

6. Word splitting (page 341)

7. Pathname expansion (page 341)

8. Process substitution (page 343)

Quote removal After bash finishes with the preceding list, it removes from the command line single

quotation marks, double quotation marks, and backslashes that are not a result of

an expansion. This process is called quote removal.
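The following sketch shows how some of these steps change the number of words in a command. It assumes the default value of IFS; count_words is a hypothetical helper that reports how many arguments it received.

```shell
# count_words is a hypothetical helper that reports how many words it received.
count_words() {
    echo $#
}

list="one two three"

unquoted=$(count_words $list)      # word splitting yields three words
quoted=$(count_words "$list")      # double quotes keep the value as one word
braces=$(count_words x{1,2}y)      # brace expansion yields two words

echo "$unquoted $quoted $braces"
```

The variable expansion itself always produces one string; it is the later word-splitting step that turns the unquoted value into three words.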

Order of Expansion

The order in which bash carries out these steps affects the interpretation of com-

mands. For example, if you set a variable to a value that looks like the instruction

for output redirection and then enter a command that uses the variable’s value to

perform redirection, you might expect bash to redirect the output.

$ SENDIT="> /tmp/saveit"

$ echo xxx $SENDIT

xxx > /tmp/saveit

$ cat /tmp/saveit

cat: /tmp/saveit: No such file or directory

In fact, the shell does not redirect the output—it recognizes input and output redi-

rection before it evaluates variables. When it executes the command line, the shell

checks for redirection and, finding none, evaluates the SENDIT variable. After

replacing the variable with > /tmp/saveit, bash passes the arguments to echo, which

dutifully copies its arguments to standard output. No /tmp/saveit file is created.
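If you do want the shell to treat the value of a variable as redirection, the eval builtin can force a second scan of the command line. The following is a sketch, not a recommendation: it assumes the pathname that mktemp returns contains no SPACEs or special characters, and eval should not be used on text you do not control. The names savefile, plain, and saved are illustrative.

```shell
savefile=$(mktemp)
SENDIT="> $savefile"

# Without eval, the > in the variable's value is just an argument to echo.
plain=$(echo xxx $SENDIT)

# eval makes the shell scan the line again, so the > is seen as redirection.
eval "echo xxx $SENDIT"

saved=$(cat "$savefile")
rm -f "$savefile"
echo "plain: $plain"
echo "saved: $saved"
```

On the second scan the variable has already been replaced by its value, so the shell finds the redirection operator when it looks for one.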

The following sections provide more detailed descriptions of the steps involved in

command processing. Keep in mind that double and single quotation marks cause

the shell to behave differently when performing expansions. Double quotation

marks permit parameter and variable expansion, command substitution, and arithmetic

expansion but suppress other types of expansion. Single quotation marks suppress all

types of expansion.
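A short sketch of how quotation marks affect expansion follows. In bash, command substitution and arithmetic expansion are performed within double quotation marks along with variable expansion, whereas brace expansion is not. The variable names are illustrative.

```shell
user="max"

double="User $user has $((2 + 3)) files in $(echo cars)"   # all three expand
single='User $user has $((2 + 3)) files'                    # nothing expands
braces_quoted=$(echo "{a,b}")     # brace expansion is suppressed in quotes
braces_plain=$(echo {a,b})        # unquoted braces expand to two words

echo "$double"
echo "$single"
echo "$braces_quoted versus $braces_plain"
```

The single-quoted string is stored exactly as typed, dollar signs included.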

Brace Expansion

Brace expansion, which originated in the C Shell, provides a convenient way to

specify filenames when pathname expansion does not apply. Although brace

expansion is almost always used to specify filenames, the mechanism can be used

to generate arbitrary strings; the shell does not attempt to match the brace notation

with the names of existing files.

Brace expansion is turned on in interactive and noninteractive shells by default; you

can turn it off with set +o braceexpand. The shell also uses braces to isolate variable

names (page 294).

The following example illustrates how brace expansion works. The ls command does

not display any output because there are no files in the working directory. The echo

builtin displays the strings that the shell generates with brace expansion. In this case

the strings do not match filenames (because there are no files in the working directory).

$ ls

$ echo chap_{one,two,three}.txt

chap_one.txt chap_two.txt chap_three.txt

The shell expands the comma-separated strings inside the braces in the echo com-

mand into a SPACE-separated list of strings. Each string from the list is prepended

with the string chap_, called the preamble, and appended with the string .txt, called

the postscript. Both the preamble and the postscript are optional. The left-to-right

order of the strings within the braces is preserved in the expansion. For the shell to

treat the left and right braces specially and for brace expansion to occur, at least one

comma and no unquoted whitespace characters must be inside the braces. You can

nest brace expansions.
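These rules can be sketched with three echo commands: one with nested braces, one without a comma, and one with an unquoted SPACE inside the braces. The variable names are illustrative.

```shell
nested=$(echo chap_{a,b{1,2}}.txt)   # nested braces
no_comma=$(echo {one})               # no comma: the braces are left alone
with_space=$(echo {a, b})            # unquoted SPACE: no brace expansion

echo "$nested"
echo "$no_comma"
echo "$with_space"
```

In the last two cases the shell passes the braces to echo unchanged, just as it would any other ordinary characters.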

Brace expansion is useful when there is a long preamble or postscript. The following

example copies four files (main.c, f1.c, f2.c, and tmp.c) located in the

/usr/local/src/C directory to the working directory:

$ cp /usr/local/src/C/{main,f1,f2,tmp}.c .

You can also use brace expansion to create directories with related names:

$ ls -F

file1 file2 file3

$ mkdir vrs{A,B,C,D,E}

$ ls -F

file1 file2 file3 vrsA/ vrsB/ vrsC/ vrsD/ vrsE/

The –F option causes ls to display a slash (/) after a directory and an asterisk (*)

after an executable file.

If you tried to use an ambiguous file reference instead of braces to specify the direc-

tories, the result would be different (and not what you wanted):

$ rmdir vrs*

$ mkdir vrs[A-E]

$ ls -F

file1 file2 file3 vrs[A-E]/

An ambiguous file reference matches the names of existing files. In the preceding

example, because it found no filenames matching vrs[A-E], bash passed the ambiguous

file reference to mkdir, which created a directory with that name. Brackets in

ambiguous file references are discussed on page 139.

Tilde Expansion

Chapter 4 introduced a shorthand notation to specify your home directory or the

home directory of another user. This section provides a more detailed explanation

of tilde expansion.

The tilde (~) is a special character when it appears at the start of a token on a com-

mand line. When it sees a tilde in this position, bash looks at the following string of

characters—up to the first slash (/) or to the end of the word if there is no slash—as

a possible username. If this possible username is null (that is, if the tilde appears as

a word by itself or if it is immediately followed by a slash), the shell substitutes the

value of the HOME variable for the tilde. The following example demonstrates this

expansion, where the last command copies the file named letter from Max’s home

directory to the working directory:

$ echo $HOME

/home/max

$ echo ~

/home/max

$ echo ~/letter

/home/max/letter

$ cp ~/letter .

If the string of characters following the tilde forms a valid username, the shell sub-

stitutes the path of the home directory associated with that username for the tilde

and name. If the string is not null and not a valid username, the shell does not make

any substitution:

$ echo ~zach

/home/zach

$ echo ~root

/root

$ echo ~xx

~xx


Tildes are also used in directory stack manipulation (page 288). In addition, ~+ is a

synonym for PWD (the name of the working directory), and ~- is a synonym for

OLDPWD (the name of the previous working directory).
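The two synonyms can be seen in a minimal sketch. It assumes that /tmp and / exist and that you can change to them; the variable names are illustrative.

```shell
#!/bin/bash
# Sketch: /tmp and / are assumed to exist and be searchable.
cd /tmp && cd /      # now PWD is / and OLDPWD is /tmp

current=~+           # tilde-plus expands to the value of PWD
previous=~-          # tilde-minus expands to the value of OLDPWD
quoted="~+"          # quoting suppresses tilde expansion

echo "$current $previous $quoted"
```

As with a plain tilde, the expansion occurs only when the tilde is unquoted and starts the word (or the value in an assignment).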

Parameter and Variable Expansion

On a command line, a dollar sign ($) that is not followed by an open parenthesis

introduces parameter or variable expansion. Parameters include both command-line,

or positional, parameters (page 440) and special parameters (page 438). Variables

include both user-created variables (page 292) and keyword variables (page 296).

The bash man and info pages do not make this distinction.

Parameters and variables are not expanded if they are enclosed within single quotation

marks or if the leading dollar sign is escaped (i.e., preceded with a backslash). If they are

enclosed within double quotation marks, the shell expands parameters and variables.
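A short sketch shows the three cases side by side. The variable name fruit is an assumption made for the example.

```shell
#!/bin/bash
# Sketch: "fruit" is an illustrative variable name.
fruit=apple

expanded="I ate an $fruit"      # double quotation marks: $fruit is expanded
literal='I ate an $fruit'       # single quotation marks: no expansion
escaped="I ate an \$fruit"      # backslash: the dollar sign is literal

printf '%s\n' "$expanded" "$literal" "$escaped"
```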

Arithmetic Expansion

The shell performs arithmetic expansion by evaluating an arithmetic expression and

replacing it with the result. See page 368 for information on arithmetic expansion

under tcsh. Under bash the syntax for arithmetic expansion is

$((expression))

The shell evaluates expression and replaces $((expression)) with the result of the

evaluation. This syntax is similar to the syntax used for command substitution

[$(...)] and performs a parallel function. You can use $((expression)) as an argument

to a command or in place of any numeric value on a command line.

The rules for forming expression are the same as those found in the C programming

language; all standard C arithmetic operators are available (see Table 10-8 on

page 463). Arithmetic in bash is done using integers. Unless you use variables of

type integer (page 296) or actual integers, however, the shell must convert string-

valued variables to integers for the purpose of the arithmetic evaluation.

You do not need to precede variable names within expression with a dollar sign ($).

In the following example, after read (page 447) assigns the user’s response to age, an

arithmetic expression determines how many years are left until age 60:

$ cat age_check

#!/bin/bash

echo -n "How old are you? "

read age

echo "Wow, in $((60-age)) years, you'll be 60!"

$ ./age_check

How old are you? 55

Wow, in 5 years, you'll be 60!

You do not need to enclose the expression within quotation marks because bash

does not perform filename expansion on it. This feature makes it easier for you to

use an asterisk (*) for multiplication, as the following example shows:


$ echo There are $((60*60*24*365)) seconds in a non-leap year.

There are 31536000 seconds in a non-leap year.

The next example uses wc, cut, arithmetic expansion, and command substitution

(page 340) to estimate the number of pages required to print the contents of the file

letter.txt. The output of the wc (word count) utility (page 876) used with the -l

option is the number of lines in the file, in columns (character positions) 1 through

4, followed by a SPACE and the name of the file (the first command following). The

cut utility (page 652) with the -c1-4 option extracts the first four columns.

$ wc -l letter.txt

351 letter.txt

$ wc -l letter.txt | cut -c1-4

351

The dollar sign and single parenthesis instruct the shell to perform command substi-

tution; the dollar sign and double parentheses indicate arithmetic expansion:

$ echo $(( $(wc -l letter.txt | cut -c1-4)/66 + 1))

The preceding example sends standard output from wc to standard input of cut via a

pipe. Because of command substitution, the output of both commands replaces the

commands between the $( and the matching ) on the command line. Arithmetic

expansion then divides this number by 66, the number of lines on a page. A 1 is

added because the integer division results in any remainder being discarded.
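One caveat about adding 1: when the line count is an exact multiple of 66, the formula reports one page too many (66 lines yields 2). A common alternative, sketched here, rounds up by adding 65 before dividing. The function name pages is hypothetical and the 66-line page is carried over from the example above.

```shell
#!/bin/bash
# pages: hypothetical helper that prints the number of 66-line pages
# needed for the line count given as its first argument.
pages() {
    local lines=$1
    local per_page=66
    # Adding per_page-1 before the integer division rounds up.
    echo $(( (lines + per_page - 1) / per_page ))
}

pages 351      # same answer as 351/66 + 1
pages 66       # exactly one full page
pages "$(printf 'a\nb\nc\n' | wc -l)"   # count lines from standard input
```

For an estimate either form is fine; the rounding form matters only when an exact page count is needed.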

Another way to get the same result without using cut is to redirect the input to wc

instead of having wc get its input from a file you name on the command line. When

you redirect its input, wc does not display the name of the file:

$ wc -l < letter.txt

351

It is common practice to assign the result of arithmetic expansion to a variable:

$ numpages=$(( $(wc -l < letter.txt)/66 + 1))

let builtin The let builtin (not available in tcsh) evaluates arithmetic expressions just as the $(( ))

syntax does. The following command is equivalent to the preceding one:

$ let "numpages=$(wc -l < letter.txt)/66 + 1"

Fewer dollar signs ($)

tip When you use variables within $(( and )), the dollar signs that precede individual variable refer-

ences are optional:

$ x=23 y=37

$ echo $((2*$x + 3*$y))

157

$ echo $((2*x + 3*y))

157


The double quotation marks keep the SPACEs (both those you can see and those that

result from the command substitution) from separating the expression into separate

arguments to let. The value of the last expression determines the exit status of let. If the

value of the last expression is 0, the exit status of let is 1; otherwise, its exit status is 0.
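Because the exit status reflects the value of the expression, let can serve as the test in a control structure. The following is a minimal sketch; the names countdown and report are illustrative.

```shell
#!/bin/bash
# Sketch: "countdown" and "report" are illustrative names.
countdown=3
report=""

# let returns exit status 0 while the comparison is true (value 1)
# and exit status 1 once it is false (value 0), which ends the loop.
while let "countdown > 0"; do
    report="$report$countdown "
    countdown=$((countdown - 1))
done

echo "${report}liftoff"
```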

You can supply let with multiple arguments on a single command line:

$ let a=5+3 b=7+2

$ echo $a $b

8 9

When you refer to variables when doing arithmetic expansion with let or $(( )), the

shell does not require a variable name to begin with a dollar sign ($). Nevertheless,

it is a good practice to do so for consistency, as in most places you must precede a

variable name with a dollar sign.

Command Substitution

Command substitution replaces a command with the output of that command. The

preferred syntax for command substitution under bash follows:

$(command)

Under bash you can also use the following, older syntax, which is the only syntax

allowed under tcsh:

`command`

The shell executes command within a subshell and replaces command, along with

the surrounding punctuation, with standard output of command.

In the following example, the shell executes pwd and substitutes the output of the

command for the command and surrounding punctuation. Then the shell passes the

output of the command, which is now an argument, to echo, which displays it.

$ echo $(pwd)

/home/max

The next script assigns the output of the pwd builtin to the variable where and dis-

plays a message containing the value of this variable:

$ cat where

where=$(pwd)

echo "You are using the $where directory."

$ ./where

You are using the /home/zach directory.

Although it illustrates how to assign the output of a command to a variable, this

example is not realistic. You can more directly display the output of pwd without

using a variable:

$ cat where2

echo "You are using the $(pwd) directory."

$ ./where2

You are using the /home/zach directory.


The following command uses find to locate files with the name README in the

directory tree rooted at the working directory. This list of files is standard output of

find and becomes the list of arguments to ls.

$ ls -l $(find . -name README -print)

The next command line shows the older `command` syntax:

$ ls -l `find . -name README -print`

One advantage of the newer syntax is that it avoids the rather arcane rules for token

handling, quotation mark handling, and escaped back ticks within the old syntax.

Another advantage of the new syntax is that it can be nested directly; nesting the old syntax requires escaping the inner back ticks.

For example, you can produce a long listing of all README files whose size

exceeds the size of ./README with the following command:

$ ls -l $(find . -name README -size +$(echo $(cat ./README | wc -c)c ) -print )

Try giving this command after giving a set -x command (page 410) to see how bash

expands it. If there is no README file, you just get the output of ls -l.
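To make the nesting difference concrete, here is a minimal sketch that runs the same nested substitution both ways; in the back tick form the inner back ticks have to be escaped with backslashes. The pathname /usr/local/bin/tool is made up for the example.

```shell
#!/bin/bash
# Sketch: /usr/local/bin/tool is an illustrative pathname; it need not exist
# because dirname and basename only manipulate strings.

# New syntax: the inner substitution nests without extra quoting.
new_style=$(basename $(dirname /usr/local/bin/tool))

# Old syntax: the inner back ticks must be escaped with backslashes.
old_style=`basename \`dirname /usr/local/bin/tool\``

echo "$new_style $old_style"
```

Each additional level of nesting in the old syntax needs another layer of backslashes, which is why the $(...) form is usually easier to read and maintain.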

For additional scripts that use command substitution, see pages 406, 425, and 455.

Word Splitting

The results of parameter and variable expansion, command substitution, and arith-

metic expansion are candidates for word splitting. Using each character of IFS

(page 301) as a possible delimiter, bash splits these candidates into words or tokens.

If IFS is unset, bash uses its default value (SPACE-TAB-NEWLINE). If IFS is null, bash does

not split words.
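The effect of IFS can be sketched with a small helper that counts the words an unquoted expansion produces. The function name count_words is hypothetical, and the sample strings are illustrative.

```shell
#!/bin/bash
# count_words: hypothetical helper that prints how many words the shell
# produces when it splits the unquoted expansion of $2 using IFS set to $1.
count_words() {
    local IFS=$1
    set -- $2        # unquoted on purpose, so word splitting applies
    echo $#
}

count_words ':' 'max:zach:sam'         # split on colons
count_words $' \t\n' 'max:zach:sam'    # default delimiters: no colon split
count_words '' 'max zach sam'          # null IFS: no splitting at all
```

Declaring IFS local to the function keeps the change from leaking into the rest of the shell session.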

Pathname Expansion

Pathname expansion (page 136), also called filename generation or globbing, is the

process of interpreting ambiguous file references and substituting the appropriate list

of filenames. Unless noglob (page 333) is set, the shell performs this function when it

encounters an ambiguous file reference—a token containing any of the unquoted

characters *, ?, [, or ]. If bash cannot locate any files that match the specified pattern,

the token with the ambiguous file reference is left alone. The shell does not

delete the token or replace it with a null string but rather passes it to the program as

is (except see nullglob on page 333). The TC Shell generates an error message.

$(( Versus $(

tip The symbols $(( constitute a single token. They introduce an arithmetic expression, not a command

substitution. Thus, if you want to use a parenthesized subshell (page 284) within $(), you

must insert a SPACE between the $( and the following (.

In the first echo command in the following example, the shell expands the ambiguous

file reference tmp* and passes three tokens (tmp1, tmp2, and tmp3) to echo.

The echo builtin displays the three filenames it was passed by the shell. After rm

removes the three tmp* files, the shell finds no filenames that match tmp* when it

tries to expand it. It then passes the unexpanded string to the echo builtin, which

displays the string it was passed.

$ ls

tmp1 tmp2 tmp3

$ echo tmp*

tmp1 tmp2 tmp3

$ rm tmp*

$ echo tmp*

tmp*

By default the same command causes the TC Shell to display an error message:

tcsh $ echo tmp*

echo: No match

A period that either starts a pathname or follows a slash (/) in a pathname must be

matched explicitly unless you have set dotglob (page 332). The option nocaseglob

(page 333) causes ambiguous file references to match filenames without regard to case.
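The behaviors described in this section, including the effect of nullglob, can be tried safely in a scratch directory. The sketch below assumes mktemp is available; the filenames are illustrative.

```shell
#!/bin/bash
# Sketch: works in a scratch directory created with mktemp so that no
# existing files are touched. The tmp1..tmp3 and .hidden names are illustrative.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
touch tmp1 tmp2 tmp3 .hidden

matched=$(echo tmp*)        # three filenames
all_visible=$(echo *)       # .hidden is skipped: leading period not matched

rm tmp*
unmatched=$(echo tmp*)      # no match: the pattern is passed through as is

shopt -s nullglob
with_nullglob=$(echo tmp*)  # no match: the pattern is removed instead
shopt -u nullglob

cd / && rm -rf "$scratch"
printf '[%s]\n' "$matched" "$all_visible" "$unmatched" "$with_nullglob"
```

Setting nullglob is useful in loops such as for f in *.txt, where an unmatched pattern would otherwise be processed as if it were a filename.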

Quotation marks Putting double quotation marks around an argument causes the shell to suppress

pathname expansion (along with brace expansion, tilde expansion, and word splitting) while still

performing parameter and variable expansion, command substitution, and arithmetic

expansion. Putting single quotation marks around an argument suppresses all types of

expansion. The second echo command in the following example shows the variable

$max between double quotation marks, which allow variable expansion. As a result

the shell expands the variable to its value: sonar. This expansion does not occur in

the third echo command, which uses single quotation marks. Because neither single

nor double quotation marks allow pathname expansion, the last two commands

display the unexpanded argument tmp*.

$ echo tmp* $max

tmp1 tmp2 tmp3 sonar

$ echo "tmp* $max"

tmp* sonar

$ echo 'tmp* $max'

tmp* $max

The shell distinguishes between the value of a variable and a reference to the vari-

able and does not expand ambiguous file references if they occur in the value of a

variable. As a consequence you can assign to a variable a value that includes special

characters, such as an asterisk (*).

Levels of expansion In the next example, the working directory has three files whose names begin with

letter. When you assign the value letter* to the variable var, the shell does not

expand the ambiguous file reference because it occurs in the value of a variable (in

the assignment statement for the variable). No quotation marks surround the string

letter*; context alone prevents the expansion. After the assignment the set builtin

(with the help of grep) shows the value of var to be letter*.

$ ls letter*

letter1 letter2 letter3

$ var=letter*


$ set | grep var

var='letter*'

$ echo '$var'

$var

$ echo "$var"

letter*

$ echo $var

letter1 letter2 letter3

The three echo commands demonstrate three levels of expansion. When $var is

quoted with single quotation marks, the shell performs no expansion and passes the

character string $var to echo, which displays it. With double quotation marks, the

shell performs variable expansion only and substitutes the value of the var variable

for its name, preceded by a dollar sign. No pathname expansion is performed on

this command because double quotation marks suppress it. In the final command,

the shell, without the limitations of quotation marks, performs variable substitution

and then pathname expansion before passing the arguments to echo.

Process Substitution

A special feature of the Bourne Again Shell is the ability to replace filename argu-

ments with processes. An argument with the syntax <(command) causes command

to be executed and the output written to a named pipe (FIFO). The shell replaces

that argument with the name of the pipe. If that argument is then used as the name

of an input file during processing, the output of command is read. Similarly an argu-

ment with the syntax >(command) is replaced by the name of a pipe that command

reads as standard input.

The following example uses sort (pages 54 and 817) with the -m (merge, which

works correctly only if the input files are already sorted) option to combine two

word lists into a single list. Each word list is generated by a pipe that extracts words

matching a pattern from a file and sorts the words in that list.

$ sort -m -f <(grep "[^A-Z]..$" memo1 | sort) <(grep ".*aba.*" memo2 | sort)
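The same technique works with any utility that expects filename arguments. As a self-contained sketch, the following uses comm in place of sort -m and generates both word lists inline, so no memo files are needed; the words themselves are illustrative.

```shell
#!/bin/bash
# Sketch: the two word lists are generated inline with printf.
# comm -12 prints only the lines common to two sorted inputs; each
# <(...) argument is replaced by the name of a pipe that carries the
# sorted output of the command inside it.
common=$(comm -12 <(printf '%s\n' pear apple fig | sort) \
                  <(printf '%s\n' fig kiwi apple | sort))

echo "$common"
```

Without process substitution you would have to write each sorted list to a temporary file and remove the files afterward.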

Chapter Summary

The shell is both a command interpreter and a programming language. As a com-

mand interpreter, it executes commands you enter in response to its prompt. As a

programming language, the shell executes commands from files called shell scripts.

When you start a shell, it typically runs one or more startup files.

Running a

shell script

Assuming the file holding a shell script is in the working directory, there are three

basic ways to execute the shell script from the command line.

1. Type the simple filename of the file that holds the script.

2. Type a relative pathname, including the simple filename preceded by ./.

3. Type bash or tcsh followed by the name of the file.


Technique 1 requires that the working directory be in the PATH variable. Techniques

1 and 2 require that you have execute and read permission for the file holding the

script. Technique 3 requires that you have read permission for the file holding the

script.

Job control A job is one or more commands connected by pipes. You can bring a job running in

the background into the foreground using the fg builtin. You can put a foreground

job into the background using the bg builtin, provided that you first suspend the job

by pressing the suspend key (typically CONTROL-Z). Use the jobs builtin to see which

jobs are running or suspended.

Variables The shell allows you to define variables. You can declare and initialize a variable by

assigning a value to it; you can remove a variable declaration using unset. Variables

are local to a process unless they are exported using the export (bash) or setenv (tcsh)

builtin to make them available to child processes. Variables you declare are called

user-created variables. The shell defines keyword variables. Within a shell script you

can work with the command-line (positional) parameters the script was called with.
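The difference between a local and an exported variable can be sketched under bash as follows. The variable names are illustrative, and the child process is simply another bash started with the -c option.

```shell
#!/bin/bash
# Sketch: "local_only" and "shared" are illustrative variable names.
local_only=first            # not exported: visible only in this shell
export shared=second        # exported: copied into child processes

# The single quotation marks keep this shell from expanding the variables,
# so the child bash process expands them using its own environment.
child_view=$(bash -c 'echo "[$local_only] [$shared]"')

echo "$child_view"

unset shared                # removes the variable from this shell
after_unset=${shared-unset}
```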

Process Each process has a unique identification (PID) number and is the execution of a sin-

gle Linux command. When you give it a command, the shell forks a new (child)

process to execute the command, unless the command is built into the shell. While

the child process is running, the shell is in a state called sleep. By ending a command

line with an ampersand (&), you can run a child process in the background and

bypass the sleep state so that the shell prompt returns immediately after you press

RETURN. Each command in a shell script forks a separate process, each of which may

in turn fork other processes. When a process terminates, it returns its exit status to

its parent process. An exit status of zero signifies success; nonzero signifies failure.
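A minimal sketch of a background child and its exit status follows. The command (exit 3) stands in for any command that fails with status 3; the variable names are illustrative.

```shell
#!/bin/bash
# Sketch: (exit 3) stands in for any command that fails with status 3.
(exit 3) &                  # run the child in the background
bg_pid=$!                   # PID of the most recent background process

bg_status=0
wait "$bg_pid" || bg_status=$?   # wait returns the child's exit status

fg_status=0
true || fg_status=$?        # a successful foreground command returns 0

echo "background: $bg_status, foreground: $fg_status"
```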

History The history mechanism, a feature adapted from the C Shell, maintains a list of

recently issued command lines, also called events, that provides a way to reexecute

previous commands quickly. There are several ways to work with the history list;

one of the easiest is to use a command-line editor.

Command-line

editors

When using an interactive Bourne Again Shell, you can edit a command line and

commands from the history file, using either of the Bourne Again Shell’s command-

line editors (vim or emacs). When you use the vim command-line editor, you start in

Input mode, unlike the vim editor itself, which starts in Command mode. You can switch between Command and Input modes. The

emacs editor is modeless and distinguishes commands from editor input by recog-

nizing control characters as commands.

Aliases An alias is a name that the shell translates into another name or (complex) com-

mand. Aliases allow you to define new commands by substituting a string for the

first token of a simple command. The Bourne Again and TC Shells use different

syntaxes to define an alias, but aliases in both shells work similarly.

Functions A shell function is a series of commands that, unlike a shell script, is parsed prior to

being stored in memory. As a consequence shell functions run faster than shell

scripts. Shell scripts are parsed at runtime and are stored on disk. A function can be

defined on the command line or within a shell script. If you want the function defi-

nition to remain in effect across login sessions, you can define it in a startup file.


Like functions in many programming languages, a shell function is called by giving

its name followed by any arguments.
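As a brief sketch, the following defines a function and calls it twice. The name greet is illustrative; within the function, the arguments appear as positional parameters.

```shell
#!/bin/bash
# greet: illustrative function; like a script, it sees its arguments
# as positional parameters ($1, $2, ...) and their count as $#.
greet() {
    echo "Hello, $1. You passed $# argument(s)."
}

greet Max
greet Zach extra
```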

Shell features There are several ways to customize the shell’s behavior. You can use options on the

command line when you call bash. You can use the bash set and shopt builtins to

turn features on and off.

Command-line

expansion

When it processes a command line, the Bourne Again Shell may replace some

words with expanded text. Most types of command-line expansion are invoked by

the appearance of a special character within a word (for example, a leading dollar

sign denotes a variable). Table 8-6 on page 304 lists these special characters. The

expansions take place in a specific order. Following the history and alias expan-

sions, the common expansions are parameter and variable expansion, command

substitution, and pathname expansion. Surrounding a word with double quotation

marks suppresses all types of expansion except parameter and variable expansion,

command substitution, and arithmetic expansion.

Single quotation marks suppress all types of expansion, as does quoting (escaping)

a special character by preceding it with a backslash.

Exercises

1. Explain the following unexpected result:

$ whereis date

date: /bin/date ...

$ echo $PATH

.:/usr/local/bin:/usr/bin:/bin

$ cat > date

echo "This is my own version of date."

$ ./date

Fri May 22 11:45:49 PDT 2009

2. What are two ways you can execute a shell script when you do not have exe-

cute permission for the file containing the script? Can you execute a shell

script if you do not have read permission for the file containing the script?

3. What is the purpose of the PATH variable?

a. Set the PATH variable so that it causes the shell to search the following

directories in order:

• /usr/local/bin

• /usr/bin

• /bin

• /usr/kerberos/bin

• The bin directory in your home directory

• The working directory


b. If there is a file named doit in /usr/bin and another file with the same

name in your ~/bin directory, which one will be executed? (Assume that

you have execute permission for both files.)

c. If your PATH variable is not set to search the working directory, how

can you execute a program located there?

d. Which command can you use to add the directory /usr/games to the end

of the list of directories in PATH?

4. Assume you have made the following assignment:

$ person=zach

Give the output of each of the following commands:

a. echo $person

b. echo '$person'

c. echo "$person"

5. The following shell script adds entries to a file named journal-file in your

home directory. This script helps you keep track of phone conversations

and meetings.

$ cat journal

# journal: add journal entries to the file

# $HOME/journal-file

file=$HOME/journal-file

date >> $file

echo -n "Enter name of person or group: "

read name

echo "$name" >> $file

echo >> $file

cat >> $file

echo "----------------------------------------------------" >> $file

echo >> $file

a. What do you have to do to the script to be able to execute it?

b. Why does the script use the read builtin the first time it accepts input

from the terminal and the cat utility the second time?

6. Assume the /home/zach/grants/biblios and /home/zach/biblios directories

exist. Give Zach’s working directory after he executes each sequence of

commands given. Explain what happens in each case.

$ pwd

/home/zach/grants

$ CDPATH=$(pwd)

$ cd

$ cd biblios


$ pwd

/home/zach/grants

$ CDPATH=$(pwd)

$ cd $HOME/biblios

7. Name two ways you can identify the PID number of the login shell.

8. Give the following command:

$ sleep 30 | cat /etc/inittab

Is there any output from sleep? Where does cat get its input from? What

has to happen before the shell displays another prompt?

Advanced Exercises

9. Write a sequence of commands or a script that demonstrates variable

expansion occurs before pathname expansion.

10. Write a shell script that outputs the name of the shell executing it.

11. Explain the behavior of the following shell script:

$ cat quote_demo

twoliner="This is line 1.

This is line 2."

echo "$twoliner"

echo $twoliner

a. How many arguments does each echo command see in this script?

Explain.

b. Redefine the IFS shell variable so that the output of the second echo is

the same as the first.

12. Add the exit status of the previous command to your prompt so that it

behaves similarly to the following:

$ [0] ls xxx

ls: xxx: No such file or directory

$ [1]

13. The dirname utility treats its argument as a pathname and writes to stan-

dard output the path prefix—that is, everything up to but not including

the last component:

$ dirname a/b/c/d

a/b/c

If you give dirname a simple filename (no / characters) as an argument,

dirname writes a . to standard output:

$ dirname simple

.

Implement dirname as a bash function. Make sure that it behaves sensibly

when given such arguments as /.

14. Implement the basename utility, which writes the last component of its

pathname argument to standard output, as a bash function. For example,

given the pathname a/b/c/d, basename writes d to standard output:

$ basename a/b/c/d

d

15. The Linux basename utility has an optional second argument. If you give

the command basename path suffix, basename removes the suffix and the

prefix from path:

$ basename src/shellfiles/prog.bash .bash

prog

$ basename src/shellfiles/prog.bash .c

prog.bash

Add this feature to the function you wrote for exercise 14.

The TC Shell

In This Chapter

Shell Scripts

Entering and Leaving the TC Shell

Features Common to the Bourne Again and TC Shells

Redirecting Standard Error

Word Completion

Editing the Command Line

Variables

Reading User Input

Control Structures

Builtins

The TC Shell (tcsh) performs the same function as the Bourne

Again Shell and other shells: It provides an interface between

you and the Linux operating system. The TC Shell is an interactive

command interpreter as well as a high-level programming

language. Although you use only one shell at any given time,

you should be able to switch back and forth comfortably

between shells as the need arises. In fact, you may want to run

different shells in different windows. Chapters 8 and 10 apply

to tcsh as well as to bash so they provide a good background for

this chapter. This chapter explains tcsh features that are not

found in bash and those that are implemented differently from

their bash counterparts. The tcsh home page is www.tcsh.org.

The TC Shell is an expanded version of the C Shell (csh), which

originated on Berkeley UNIX. The “T” in TC Shell comes from

the TENEX and TOPS-20 operating systems, which inspired

command completion and other features in the TC Shell. A

number of features not found in csh are present in tcsh, including

file and username completion, command-line editing, and

spelling correction. As with csh, you can customize tcsh to

make it more tolerant of mistakes and easier to use. By setting the proper shell vari-

ables, you can have tcsh warn you when you appear to be accidentally logging out

or overwriting a file. Many popular features of the original C Shell are now shared

by bash and tcsh.

Assignment

statement

Although some of the functionality of tcsh is present in bash, differences arise in

the syntax of some commands. For example, the tcsh assignment statement has the

following syntax:

set variable = value

Having SPACEs on either side of the equal sign, although illegal in bash, is allowed in

tcsh. By convention shell variables in tcsh are generally named with lowercase letters,

not uppercase (you can use either). If you reference an undeclared variable (one that

has had no value assigned to it), tcsh generates an error message, whereas bash does

not. Finally the default tcsh prompt is a greater than sign (>), but it is frequently set

to a single $ character followed by a SPACE. The examples in this chapter use a prompt

of tcsh $ to avoid confusion with the bash prompt.
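For comparison, the bash side of these differences can be sketched as follows. The variable names city and never_set are illustrative, and the nounset option (-u) is shown only as the closest bash analog of the tcsh error.

```shell
#!/bin/bash
# Sketch of the bash side of this comparison; "city" and "never_set" are
# illustrative names.
city=Paris                      # bash: no SPACEs around the equal sign

unset never_set
quiet="[$never_set]"            # bash expands an unset variable to nothing

# With the nounset option (-u), bash reports an error instead, which is
# closer to the tcsh behavior described above.
strict_failed=no
bash -uc 'echo "$never_set"' 2>/dev/null || strict_failed=yes

echo "$city $quiet $strict_failed"
```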

Shell Scripts

The TC Shell can execute files containing tcsh commands, just as the Bourne Again

Shell can execute files containing bash commands. Although the concepts of writing

and executing scripts in the two shells are similar, the methods of declaring and

assigning values to variables and the syntax of control structures are different.

You can run bash and tcsh scripts while using any one of the shells as a command

interpreter. Various methods exist for selecting the shell that runs a script. Refer to

“#! Specifies a Shell” on page 280 for more information.

If the first character of a shell script is a pound sign (#) and the following character is

not an exclamation point (!), the TC Shell executes the script under tcsh. If the first

character is anything other than #, tcsh calls the sh link to dash or bash to execute

the script.

Do not use tcsh as a programming language

tip If you have used UNIX and are comfortable with the C or TC Shell, you may want to use tcsh as

your login shell. However, you may find that the TC Shell is not as good a programming language

as bash. If you are going to learn only one shell programming language, learn bash. The Bourne

Again Shell and dash (page 270), which is a subset of bash, are used throughout Linux to pro-

gram many system administration scripts.

echo: getting rid of the RETURN

tip The tcsh echo builtin accepts either a -n option or a trailing \c to get rid of the RETURN that echo

normally displays at the end of a line. The bash echo builtin accepts only the -n option (refer to

“read: Accepts User Input” on page 447).


Entering and Leaving the TC Shell

chsh You can execute tcsh by giving the command tcsh. If you are not sure which shell

you are using, use the ps utility to find out. It shows whether you are running tcsh,

bash, sh (linked to bash), or possibly another shell. The finger command followed by

your username displays the name of your login shell, which is stored in the

/etc/passwd file. (Mac OS X uses Open Directory [page 926] in place of this file.) If

you want to use tcsh as a matter of course, you can use the chsh (change shell) util-

ity to change your login shell:

bash $ chsh

Changing shell for sam.

Password:

New shell [/bin/bash]: /bin/tcsh

Shell changed.

bash $

The shell you specify will remain in effect for your next login and all subsequent

logins until you specify a different login shell. The /etc/passwd file stores the name

of your login shell.

Shell game

tip When you are working with an interactive TC Shell, if you run a script in which # is not the first

character of the script and you call the script directly (without preceding its name with tcsh), tcsh

calls the sh link to dash or bash to run the script. The following script was written to be run

under tcsh but, when called from a tcsh command line, is executed by bash. The set builtin

(page 442) works differently under bash and tcsh. As a result the following example (from

page 371) issues a prompt but does not wait for you to respond:

tcsh $ cat user_in

echo -n "Enter input: "

set input_line = "$<"

echo $input_line

tcsh $ user_in

Enter input:

Although in each case the examples are run from a tcsh command line, the following one calls

tcsh explicitly so that tcsh executes the script and it runs correctly:

tcsh $ tcsh user_in

Enter input: here is some input

here is some input

You can leave tcsh in several ways. The approach you choose depends on two factors:

whether the shell variable ignoreeof is set and whether you are using the shell

that you logged in on (your login shell) or another shell that you created after you

logged in. If you are not sure how to exit from tcsh, press CONTROL-D on a line by itself

with no leading SPACEs, just as you would to terminate standard input to a program.

You will either exit or receive instructions on how to exit. If you have not set

ignoreeof (page 377) and it has not been set for you in a startup file, you can exit

from any shell by using CONTROL-D (the same procedure you use to exit from the

Bourne Again Shell).

When ignoreeof is set, CONTROL-D does not work. The ignoreeof variable causes the

shell to display a message telling you how to exit. You can always exit from tcsh by

giving an exit command. A logout command allows you to exit from your login

shell only.

Startup Files

When you log in on the TC Shell, it automatically executes various startup files.

These files are normally executed in the order described in this section, but you can

compile tcsh so that it uses a different order. You must have read access to a startup

file to execute the commands in it. See page 271 for information on bash startup

files and page 936 for information on startup files under Mac OS X.

/etc/csh.cshrc and /etc/csh.login

The shell first executes the commands in /etc/csh.cshrc and /etc/csh.login. A user

working with root privileges can set up these files to establish systemwide default

characteristics for tcsh users. They contain systemwide configuration information,

such as the default path, the location to check for mail, and so on.

.tcshrc and .cshrc

Next the shell looks for ~/.tcshrc or, if it does not exist, ~/.cshrc (~/ is shorthand for

your home directory; page 84). You can use these files to establish variables and

parameters that are local to your shell. Each time you create a new shell, tcsh reini-

tializes these variables for the new shell. The following .tcshrc file sets several shell

variables, establishes two aliases (page 357), and adds two directories to path—one

at the beginning of the list and one at the end:

tcsh $ cat ~/.tcshrc

set noclobber

set dunique

set ignoreeof

set history=256

set path = (~/bin $path /usr/games)

alias h history

alias ll ls -l

.history

Login shells rebuild the history list from the contents of ~/.history. If the histfile

variable exists, tcsh uses the file that histfile points to in place of .history.

.login

Login shells read and execute the commands in ~/.login. This file contains commands

that you want to execute once, at the beginning of each session. You can use

setenv (page 366) to declare environment (global) variables here. You can also

declare the type of terminal you are using and set some terminal characteristics in

your .login file.

tcsh $ cat ~/.login

# history and mail are shell variables, so they are assigned with set

set history = 200

set mail = /var/spool/mail/$user

# $?DISPLAY is 1 when DISPLAY is set (a graphical session) and 0 otherwise

if ( $?DISPLAY ) then

setenv TERM xterm

else

setenv TERM vt100

endif

stty erase '^h' kill '^u' -lcase tab3

date '+Login on %A %B %d at %I:%M %p'

The preceding .login file establishes the type of terminal you are using by setting the

TERM variable (the if statement [page 378] determines whether you are using a

graphical interface and therefore which value should be assigned to TERM). It then

runs stty (page 838) to set terminal characteristics and date (page 655) to display the

time you logged in.

/etc/csh.logout and .logout

The TC Shell runs the /etc/csh.logout and ~/.logout files, in that order, when you

exit from a login shell. The following sample .logout file uses date to display the

time you logged out. The sleep command ensures that date has time to display the

message before the system logs you out. The delay may be useful for dial-up lines

that take some time to display the message.

tcsh $ cat ~/.logout

date '+Logout on %A %B %d at %I:%M %p'

sleep 5

Features Common to the Bourne Again and

TC Shells

Most of the features common to both bash and tcsh are derived from the original

C Shell:

• Command-line expansion (also called substitution; page 354)

• History (page 354)

• Aliases (page 357)

• Job control (page 358)

• Filename substitution (page 358)

• Directory stack manipulation (page 359)

• Command substitution (page 359)

The chapters on bash discuss these features in detail. This section focuses on the

differences between the bash and tcsh implementations.

Command-Line Expansion (Substitution)

Refer to “Processing the Command Line” on page 334 for an introduction to

command-line expansion in the Bourne Again Shell. The tcsh man page uses the

term substitution instead of expansion; the latter is used by bash. The TC Shell

scans each token for possible expansion in the following order:

1. History substitution (below)

2. Alias substitution (page 357)

3. Variable substitution (page 366)

4. Command substitution (page 359)

5. Filename substitution (page 358)

6. Directory stack substitution (page 359)

History

The TC Shell assigns a sequential event number to each command line. You can dis-

play this event number as part of the tcsh prompt (refer to “prompt” on page 373).

Examples in this section show numbered prompts when they help illustrate the

behavior of a command.

The history Builtin

As in bash, the tcsh history builtin displays the events in your history list. The list of

events is ordered with the oldest events at the top. The last event in the history list is

the history command that displayed the list. In the following history list, which is

limited to ten lines by the argument of 10 to the history command, command 23

modifies the tcsh prompt to display the history event number. The time each com-

mand was executed appears to the right of the event number.

32 $ history 10

23 23:59 set prompt = "! $ "

24 23:59 ls -l

25 23:59 cat temp

26 0:00 rm temp

27 0:00 vim memo

28 0:00 lpr memo

29 0:00 vim memo

30 0:00 lpr memo

31 0:00 rm memo

32 0:00 history

History Expansion

The same event and word designators work in both shells. For example, !! refers to

the previous event in tcsh, just as it does in bash. The command !328 executes event

number 328; !?txt? executes the most recent event containing the string txt. For

more information refer to “Using an Exclamation Point (!) to Reference Events” on

page 313. Table 9-1 lists the few tcsh word modifiers not found in bash.

You can use more than one word modifier in a command. For instance, the a modi-

fier, when used in combination with the u or l modifier, enables you to change the

case of an entire word.

tcsh $ echo $VERSION

VERSION: Undefined variable.

tcsh $ echo !!:1:al

echo $version

tcsh 6.14.00 (Astron) 2005-03-25 (i486-intel-linux) options wide,nls,dl,...

In addition to using event designators to access the history list, you can use the command-

line editor to access, modify, and execute previous commands (page 363).

Variables

The variables you set to control the history list in tcsh are different from those used in

bash. Whereas bash uses HISTSIZE and HISTFILESIZE to determine the number of

events that are preserved during and between sessions, respectively, tcsh uses history

and savehist (Table 9-2) for these purposes.

history and savehist

When you exit from a tcsh shell, the most recently executed commands are saved in

your ~/.history file. The next time you start the shell, this file initializes the history

list. The value of the savehist variable determines the number of lines saved in the

.history file (not necessarily the same as the history variable). If savehist is not set,

tcsh does not save history information between sessions. The history and savehist

variables must be local (i.e., declared with set, not setenv). The history variable

holds the number of events remembered during a session and the savehist variable

holds the number remembered between sessions. See Table 9-2.

Table 9-1 Word modifiers

Modifier  Function
u         Converts the first lowercase letter into uppercase
l         Converts the first uppercase letter into lowercase
a         Applies the next modifier globally within a single word

Table 9-2 History variables

Variable  Default     Function
history   100 events  Maximum number of events saved during a session
histfile  ~/.history  Location of the history file
savehist  not set     Maximum number of events saved between sessions

If you set the value of history too high, it can use too much memory. If it is unset or

set to zero, the shell does not save any commands. To establish a history list of the

500 most recent events, give the following command manually or place it in your

~/.tcshrc startup file:

tcsh $ set history = 500

The following command causes tcsh to save the 200 most recent events across login

sessions:

tcsh $ set savehist = 200

You can combine these two assignments into a single command:

tcsh $ set history=500 savehist=200
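For comparison, the bash side of this setting might look like the following ~/.bashrc fragment. It is only a sketch: it assumes the two bash variables named earlier in this section play the roles of history and savehist as described, and it reuses the values from the tcsh command above.

```shell
# ~/.bashrc fragment (bash, not tcsh): rough counterpart of
#   set history=500 savehist=200
HISTSIZE=500        # events kept during a session
HISTFILESIZE=200    # events kept between sessions
```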

After you set savehist, you can log out and log in again; the 200 most recent events

from the previous login sessions will appear in your history list after you log back

in. Set savehist in your ~/.tcshrc file if you want to maintain your event list from

one login session to the next.

histlit

If you set the variable histlit (history literal), history displays the commands in the

history list exactly as they were typed in, without any shell interpretation. The fol-

lowing example shows the effect of this variable (compare the lines numbered 32):

tcsh $ cat /etc/csh.cshrc

...

tcsh $ cp !!:1 ~

cp /etc/csh.cshrc ~

tcsh $ set histlit

tcsh $ history

...

31 9:35 cat /etc/csh.cshrc

32 9:35 cp !!:1 ~

33 9:35 set histlit

34 9:35 history

tcsh $ unset histlit

tcsh $ history

...

31 9:35 cat /etc/csh.cshrc

32 9:35 cp /etc/csh.cshrc ~

33 9:35 set histlit

34 9:35 history

35 9:35 unset histlit

36 9:36 history

optional

The bash and tcsh shells expand history event designators differently. If you give the

command !250w, bash replaces it with command number 250 with a character w

appended to it. In contrast, tcsh looks back through your history list for an event

that begins with the string 250w to execute. The reason for the difference: bash

interprets the first three characters of 250w as the number of a command, whereas

tcsh interprets those characters as part of the search string 250w. (If the 250 stands

alone, tcsh treats it as a command number.)

If you want to append w to command number 250, you can insulate the event

number from the w by surrounding it with braces:

!{250}w

Aliases

The alias/unalias feature in tcsh closely resembles its counterpart in bash (page 324).

However, the alias builtin has a slightly different syntax:

alias name value

The following command creates an alias for ls:

tcsh $ alias ls "ls -lF"

The tcsh alias allows you to substitute command-line arguments, whereas bash

does not:

$ alias nam "echo Hello, \!^ is my name"

$ nam Sam

Hello, Sam is my name

The string \!* within an alias expands to all command-line arguments:

$ alias sortprint "sort \!* | lpr"

The next alias displays its second argument:

$ alias n2 "echo \!:2"
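Because a bash alias cannot refer to its arguments, the usual Bourne-style stand-in is a shell function. The following sketch, for sh or bash rather than tcsh, mirrors the nam alias above; inside a function, $1 is the first argument, much as \!^ is in the tcsh alias.

```shell
# Bourne-style function standing in for the tcsh alias
#   alias nam "echo Hello, \!^ is my name"
nam() {
    echo "Hello, $1 is my name"
}

nam Sam
```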

Special Aliases

Some alias names, called special aliases, have special meaning to tcsh. If you define

an alias with one of these names, tcsh executes it automatically as explained in

Table 9-3. Initially all special aliases are undefined.

To see a list of current aliases, give the command alias. To view the alias for a

particular name, give the command alias followed by the name.

Table 9-3 Special aliases

Alias     When executed
beepcmd   Whenever the shell would normally ring the terminal bell. Gives you a way to have other visual or audio effects take place at those times.
cwdcmd    Whenever you change to another working directory.
periodic  Periodically, as determined by the number of minutes in the tperiod variable. If tperiod is unset or has the value 0, periodic has no meaning.
precmd    Just before the shell displays a prompt.
shell     Gives the absolute pathname of the shell that you want to use to run scripts that do not start with #! (page 280).

History Substitution in Aliases

You can substitute command-line arguments by using the history mechanism, where

a single exclamation point represents the command line containing the alias. Modi-

fiers are the same as those used by history (page 313). In the following example, the

exclamation points are quoted so the shell does not interpret them when building

the aliases:

21 $ alias last echo \!:$

22 $ last this is just a test

test

23 $ alias fn2 echo \!:2:t

24 $ fn2 /home/sam/test /home/zach/temp /home/barbara/new

temp

Event 21 defines for last an alias that displays the last argument. Event 23 defines

for fn2 an alias that displays the simple filename, or tail, of the second argument on

the command line.

Job Control

Job control is similar in both bash (page 285) and tcsh. You can move commands

between the foreground and the background, suspend jobs temporarily, and get a

list of the current jobs. The % character references a job when it is followed by a

job number or a string prefix that uniquely identifies the job. You will see a minor

difference when you run a multiple-process command line in the background from

each shell. Whereas bash displays only the PID number of the last background pro-

cess in each job, tcsh displays the numbers for all processes belonging to a job. The

example from page 285 looks like this under tcsh:

tcsh $ find . -print | sort | lpr & grep -l zach /tmp/* > zachfiles &

[1] 18839 18840 18841

[2] 18876

Filename Substitution

The TC Shell expands the characters *, ?, and [ ] in a pathname just as bash does

(page 136). The * matches any string of zero or more characters, ? matches any

single character, and [] defines a character class, which is used to match single

characters appearing within a pair of brackets.

The TC Shell expands command-line arguments that start with a tilde (~) into file-

names in much the same way that bash does (page 361), with the ~ standing for the

user’s home directory or the home directory of the user whose name follows the

tilde. The bash special expansions ~+ and ~- are not available in tcsh.

Brace expansion (page 336) is available in tcsh. Like tilde expansion, it is regarded

as an aspect of filename substitution even though brace expansion can generate

strings that are not the names of actual files.

In tcsh and its predecessor csh, the process of using patterns to match filenames is

referred to as globbing and the pattern itself is called a globbing pattern. If tcsh is

unable to identify one or more files that match a globbing pattern, it reports an

error (unless the pattern contains a brace). Setting the shell variable noglob sup-

presses filename substitution, including both tilde and brace interpretation.
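By contrast, Bourne-style shells in their default configuration do not treat an unmatched pattern as an error; they pass the pattern to the command unchanged. The following is a minimal sketch for sh or bash (not tcsh). It assumes bash's nullglob and failglob options are unset, and it runs in an empty scratch directory so the pattern cannot match anything.

```shell
# Run under sh or bash with default options.
cd "`mktemp -d`" || exit 1      # empty scratch directory
unmatched=`echo *.nomatch`      # nothing matches, so the pattern is kept as typed
echo "$unmatched"
```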

Manipulating the Directory Stack

Directory stack manipulation in tcsh does not differ much from that in bash

(page 288). The dirs builtin displays the contents of the stack, and the pushd and

popd builtins push directories onto and pop directories off of the stack.

Command Substitution

The $(...) format for command substitution is not available in tcsh. In its place you

must use the original `...` format. Otherwise, the implementation in bash and tcsh

is identical. Refer to page 340 for more information on command substitution.
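The backquote form can be sketched as follows. The assignment line is Bourne-style (sh or bash); the comment shows what the same capture would look like with the tcsh set builtin. The pathname is made up for illustration.

```shell
# Under tcsh the same capture would be written
#   set name = `basename /home/sam/memo.txt`
name=`basename /home/sam/memo.txt`
echo "The simple filename is $name"
```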

Redirecting Standard Error

Both bash and tcsh use a greater than symbol (>) to redirect standard output, but

tcsh does not use the bash notation 2> to redirect standard error. Under tcsh you use

a greater than symbol followed by an ampersand (>&) to combine and redirect

standard output and standard error. Although you can use this notation under bash,

few people do. The following examples, like the bash examples on page 276, refer-

ence file x, which does not exist, and file y, which contains a single line:

tcsh $ cat x

cat: x: No such file or directory

tcsh $ cat y

This is y.

tcsh $ cat x y >& hold

tcsh $ cat hold

cat: x: No such file or directory

This is y.

With an argument of y in the preceding example, cat sends a string to standard

output. An argument of x causes cat to send an error message to standard error.

Unlike bash, tcsh does not provide a simple way to redirect standard error separately

from standard output. A work-around frequently provides a reasonable solution.

The following example runs cat with arguments of x and y in a subshell (the paren-

theses ensure that the command within them runs in a subshell; see page 284). Also

within the subshell, a > redirects standard output to the file outfile. Output sent to

standard error is not touched by the subshell but rather is sent to the parent shell,

where both it and standard output are sent to errfile. Because standard output has

already been redirected, errfile contains only output sent to standard error.

tcsh $ (cat x y > outfile) >& errfile

tcsh $ cat outfile

This is y.

tcsh $ cat errfile

cat: x: No such file or directory
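For comparison, a Bourne-style shell gives each stream its own redirection, so the subshell is unnecessary there. The sketch below is for sh or bash, not tcsh. It reuses the file names from the example above and, as an added assumption, works in a scratch directory so that no existing files are overwritten.

```shell
# Work in a scratch directory so no existing files are touched.
cd "`mktemp -d`" || exit 1
echo "This is y." > y          # y exists; x deliberately does not

# 1> (written simply as >) and 2> redirect the two streams separately.
# cat exits with a nonzero status because x is missing; ignore that here.
cat x y > outfile 2> errfile || true

cat outfile    # the line from y
cat errfile    # the error message about x
```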

It can be useful to combine and redirect output when you want to execute a command

that runs slowly in the background and do not want its output cluttering up the ter-

minal screen. For example, because the find utility (page 688) often takes some time to

complete, it may be a good idea to run it in the background. The next command finds

in the filesystem hierarchy all files that contain the string biblio in their name. This

command runs in the background and sends its output to the findout file. Because the

find utility sends to standard error a report of directories that you do not have permis-

sion to search, the findout file contains a record of any files that are found as well as a

record of the directories that could not be searched.

tcsh $ find / -name "*biblio*" -print >& findout &

In this example, if you did not combine standard error with standard output and

redirected only standard output, the error messages would appear on the screen and

findout would list only files that were found.

While a command that has its output redirected to a file is running in the background,

you can look at the output by using tail (page 843) with the -f option. The

-f option causes tail to display new lines as they are written to the file:

tcsh $ tail -f findout

To terminate the tail command, press the interrupt key (usually CONTROL-C).

Working with the Command Line

This section covers word completion, editing the command line, and correcting

spelling.

Word Completion

The TC Shell completes filenames, commands, and variable names on the command

line when you prompt it to do so. The generic term used to refer to all these features

under tcsh is word completion.

Filename Completion

The TC Shell can complete a filename after you specify a unique prefix. Filename

completion is similar to filename generation, but the goal of filename completion is

to select a single file. Together these capabilities make it practical to use long,

descriptive filenames.

To use filename completion when you are entering a filename on the command line,

type enough of the name to identify the file in the directory uniquely and press TAB;

tcsh fills in the name and adds a SPACE, leaving the cursor so you can enter additional

arguments or press RETURN. In the following example, the user types the command cat

trig1A and presses TAB; the system fills in the rest of the filename that begins with

trig1A:

tcsh $ cat trig1A → TAB → cat trig1A.302488 ■

If two or more filenames match the prefix that you have typed, tcsh cannot com-

plete the filename without obtaining more information from you. The shell

attempts to maximize the length of the prefix by adding characters, if possible, and

then beeps to signify that additional input is needed to resolve the ambiguity:

tcsh $ ls h*

help.hist help.trig01 help.txt

tcsh $ cat h → TAB → cat help. (beep)

You can fill in enough characters to resolve the ambiguity and then press the TAB key

again. Alternatively, you can press CONTROL-D to cause tcsh to display a list of matching

filenames:

tcsh $ cat help. → CONTROL-D

help.hist help.trig01 help.txt

tcsh $ cat help.■

After displaying the filenames, tcsh redraws the command line so you can disambiguate

the filename (and press TAB again) or finish typing the filename manually.

Tilde Completion

The TC Shell parses a tilde (~) appearing as the first character of a word and

attempts to expand it to a username when you enter a TAB:

tcsh $ cd ~za → TAB → cd ~zach/■ → RETURN

tcsh $ pwd

/home/zach

By appending a slash (/), tcsh indicates that the completed word is a directory. The

slash also makes it easy to continue specifying the pathname.

Command and Variable Completion

You can use the same mechanism that you use to list and complete filenames with

command and variable names. Unless you give a full pathname, the shell uses the

variable path in an attempt to complete a command name. The choices tcsh lists

may be located in different directories.

tcsh $ up → TAB (beep) → CONTROL-D

up2date updatedb uptime

up2date-config update-mime-database

up2date-nox updmap

tcsh $ up → t → TAB → uptime ■ → RETURN

9:59am up 31 days, 15:11, 7 users, load average: 0.03, 0.02, 0.00

If you set the autolist variable as in the following example, the shell lists choices

automatically when you invoke completion by pressing TAB. You do not have to

press CONTROL-D.

tcsh $ set autolist

tcsh $ up → TAB (beep)

up2date updatedb uptime

up2date-config update-mime-database

up2date-nox updmap

tcsh $ up → t → TAB → uptime ■ → RETURN

10:01am up 31 days, 15:14, 7 users, load average: 0.20, 0.06, 0.02

If you set autolist to ambiguous, the shell lists the choices when you press TAB only if

the word you enter is the longest prefix of a set of commands. Otherwise, pressing

TAB causes the shell to add one or more characters to the word until it is the longest

prefix; pressing TAB again then lists the choices:

tcsh $ set autolist=ambiguous

tcsh $ echo $h → TAB (beep)

histfile history home

tcsh $ echo $h■ → i → TAB → echo $hist■ → TAB

histfile history

tcsh $ echo $hist■ → o → TAB → echo $history ■ → RETURN

1000

The shell must rely on the context of the word within the input line to determine

whether it is a filename, a username, a command, or a variable name. The first

word on an input line is assumed to be a command name; if a word begins with the

special character $, it is viewed as a variable name; and so on. In the following

example, the second which command does not work properly: The context of the

word up makes it look like the beginning of a filename rather than the beginning of

a command. The TC Shell supplies which with an argument of updates (a nonexe-

cutable file) and which displays an error message:

tcsh $ ls up*

updates

tcsh $ which updatedb ups uptime

/usr/bin/updatedb

/usr/local/bin/ups

/usr/bin/uptime

tcsh $ which up → TAB → which updates

updates: Command not found.

Editing the Command Line

bindkey

The tcsh command-line editing feature is similar to that available under bash. You

can use either emacs mode commands (default) or vi(m) mode commands. Change

to vi(m) mode commands by using bindkey -v and to emacs mode commands by

using bindkey -e. The ARROW keys are bound to the obvious motion commands in

both modes, so you can move back and forth (up and down) through the history list

as well as left and right on the current command line.

Without an argument, the bindkey builtin displays the current mappings between

editor commands and the key sequences you can enter at the keyboard:

tcsh $ bindkey

Standard key bindings

"^@" -> set-mark-command

"^A" -> beginning-of-line

"^B" -> backward-char

"^C" -> tty-sigintr

"^D" -> delete-char-or-list-or-eof

...

Multi-character bindings

"^[[A" -> up-history

"^[[B" -> down-history

"^[[C" -> forward-char

"^[[D" -> backward-char

...

Arrow key bindings

down -> down-history

up -> up-history

left -> backward-char

right -> forward-char

home -> beginning-of-line

end -> end-of-line

The ^ indicates a CONTROL character (^B = CONTROL-B). The ^[ indicates a META or ALT char-

acter; in this case you press and hold the META or ALT key while you press the key for

the next character. If this substitution does not work or if the keyboard you are

using does not have a META or ALT key, press and release the ESCAPE key and then press

the key for the next character. For ^[[F you would press META-[ or ALT-[ followed by the

F key or else ESCAPE [F. The down/up/left/right indicate ARROW keys, and home/end

indicate the HOME and END keys on the numeric keypad. See page 216 for more infor-

mation on the META key.

Under OS X, most keyboards do not have a META or ALT key. See page 935 for an

explanation of how to set up the OPTION key to perform the same functions as the META

key on a Macintosh.

The preceding example shows the output from bindkey with the user in emacs mode.

Change to vi(m) mode (bindkey -v) and give another bindkey command to display

the vi(m) key bindings. You can pipe the output of bindkey through less to make it

easier to read.

Correcting Spelling

You can have tcsh attempt to correct the spelling of command names, filenames,

and variables (but only using emacs-style key bindings). Spelling correction can take

place at either of two times: before or after you press RETURN.

Before You Press RETURN

For tcsh to correct a word or line before you press RETURN, you must indicate that you

want it to do so. The two functions for this purpose are spell-line and spell-word:

$ bindkey | grep spell

"^[$" -> spell-line

"^[S" -> spell-word

"^[s" -> spell-word

The output from bindkey shows that spell-line is bound to META-$ (ALT-$ or ESCAPE $) and

spell-word is bound to META-S and META-s (ALT-s or ESCAPE s and ALT-S or ESCAPE S). To correct

the spelling of the word to the left of the cursor, press META-s. Pressing META-$ invokes

the spell-line function, which attempts to correct all words on a command line:

tcsh $ ls

bigfile.gz

tcsh $ gunzipp → META-s → gunzip bigfele.gz → META-s → gunzip bigfile.gz

tcsh $ gunzip bigfele.gz → META-$ → gunzip bigfile.gz

tcsh $ ecno $usfr → META-$ → echo $user

After You Press RETURN

The variable named correct controls what tcsh attempts to correct or complete after

you press RETURN and before it passes the command line to the command being

called. If you do not set correct, tcsh will not correct anything:

tcsh $ unset correct

tcsh $ ls morning

morning

tcsh $ ecno $usfr morbing

usfr: Undefined variable.

The shell reports the error in the variable name and not the command name because it

expands variables before it executes the command (page 354). When you give a bad

command name without any arguments, the shell reports on the bad command name.

Set correct to cmd to correct only commands; to all to correct commands, variables,

and filenames; or to complete to complete commands:

tcsh $ set correct = cmd

tcsh $ ecno $usfr morbing

CORRECT>echo $usfr morbing (y|n|e|a)? y

usfr: Undefined variable.

tcsh $ set correct = all

tcsh $ ecno $usfr morbing

CORRECT>echo $user morning (y|n|e|a)? y

zach morning

With correct set to cmd, tcsh corrects the command name from ecno to echo. With

correct set to all, tcsh corrects both the command name and the variable. It would

also correct a filename if one was present on the command line.

The TC Shell displays a special prompt that lets you enter y to accept the modified

command line, n to reject it, e to edit it, or a to abort the command. Refer to

“prompt3” on page 374 for a discussion of the special prompt used in spelling

correction.

In the next example, after setting the correct variable the user mistypes the name of

the ls command; tcsh then prompts for a correct command name. Because the com-

mand that tcsh has offered as a replacement is not ls, the user chooses to edit the

command line. The shell leaves the cursor following the command so the user can

correct the mistake:

tcsh $ set correct=cmd

tcsh $ lx -l → RETURN (beep)

CORRECT>lex -l (y|n|e|a)? e

tcsh $ lx -l■

If you assign the value complete to the variable correct, tcsh attempts command

name completion in the same manner as filename completion (page 360). In the fol-

lowing example, after setting correct to complete the user enters the command up.

The shell responds with Ambiguous command because several commands start with

these two letters but differ in the third letter. The shell then redisplays the command

line. The user could press TAB at this point to get a list of commands that start with

up but decides to enter t and press RETURN. The shell completes the command because

these three letters uniquely identify the uptime utility:

tcsh $ set correct = complete

tcsh $ up → RETURN

Ambiguous command

tcsh $ up → t → RETURN → uptime

4:45pm up 5 days, 9:54, 5 users, load average: 1.62, 0.83, 0.33

Variables

Although tcsh stores variable values as strings, you can work with these variables as

numbers. Expressions in tcsh can use arithmetic, logical, and conditional operators.

The @ builtin can evaluate integer arithmetic expressions.

This section uses the term numeric variable to describe a string variable that con-

tains a number that tcsh uses in arithmetic or logical arithmetic computations.

However, no true numeric variables exist in tcsh.

Variable name

A tcsh variable name consists of 1 to 20 characters, which can be letters, digits, and

underscores (_). The first character cannot be a digit but can be an underscore.

Variable Substitution

Three builtins declare, display, and assign values to variables: set, @, and setenv. The

set and setenv builtins both assume nonnumeric string variables. The @ builtin works

only with numeric variables. Both set and @ declare local variables. The setenv

builtin declares a variable and places it in the calling environment of all child pro-

cesses (makes it global). Using setenv is similar to assigning a value to a variable and

then using export in the Bourne Again Shell. See “Locality of Variables” on page 436

for a discussion of local and environment variables.
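The Bourne-style counterpart mentioned above can be sketched as follows, for sh or bash rather than tcsh. SRCDIR and its value are borrowed from the setenv example later in this section; LOCALONLY is a hypothetical name used only to show what happens without export.

```shell
# Counterpart of:  setenv SRCDIR /usr/local/src
SRCDIR=/usr/local/src
export SRCDIR

# A child shell sees the exported variable.
sh -c 'echo "child sees: $SRCDIR"'

# A variable that is assigned but not exported stays local to this shell.
LOCALONLY=here
sh -c 'echo "child sees: ${LOCALONLY:-nothing}"'
```

The single quotation marks keep the parent shell from expanding the variables, so each echo reports what the child shell itself can see.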

Once the value—or merely the existence—of a variable has been established, tcsh

substitutes the value of that variable when the name of the variable, preceded by a

dollar sign ($), appears on a command line. If you quote the dollar sign by preceding

it with a backslash or enclosing it within single quotation marks, the shell does not

perform the substitution. When a variable is within double quotation marks, the

substitution occurs even if you quote the dollar sign by preceding it with a backslash.

String Variables

The TC Shell treats string variables similarly to the way the Bourne Again Shell

does. The major difference lies in their declaration and assignment: tcsh uses an

explicit command, set (or setenv), to declare and/or assign a value to a string

variable.

tcsh $ set name = fred

tcsh $ echo $name

fred

tcsh $ set

argv ()

cwd /home/zach

home /home/zach

name fred

path (/usr/local/bin /bin /usr/bin /usr/X11R6/bin)

prompt $

shell /bin/tcsh

status 0

term vt100

user zach

The first line in the example declares the variable name and assigns the string fred to

it. Unlike bash, tcsh allows (but does not require) SPACEs around the equal sign. The

next line displays the value of name. When you give a set command without any

arguments, it displays a list of all local shell variables and their values (your list will be

longer than the one in the example). When you give a set command with the name of

a variable and no value, the command sets the value of the variable to a null string.

You can use the unset builtin to remove a variable:

tcsh $ set name

tcsh $ echo $name

tcsh $ unset name

tcsh $ echo $name

name: Undefined variable.

setenv With setenv you must separate the variable name from the string being assigned to it

by inserting one or more SPACEs and omitting the equal sign. In the following exam-

ple, the tcsh command creates a subshell, echo shows that the variable and its value

are known to the subshell, and exit returns to the original shell. Try this example,

using set in place of setenv:

tcsh $ setenv SRCDIR /usr/local/src

tcsh $ tcsh

tcsh $ echo $SRCDIR

/usr/local/src

tcsh $ exit

If you use setenv with no arguments, it displays a list of the environment (global)

variables—variables that are passed to the shell’s child processes. By convention,

environment variables are named using uppercase letters.

As with set, giving setenv a variable name without a value sets the value of the variable to a null string. Use unsetenv to remove an environment variable; unset removes only local variables.
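One way to see the difference between set and setenv without starting an interactive subshell is to ask a child tcsh what it can see. The sketch below assumes tcsh is in your search path; the names demo_local and DEMO_ENV are made up for the example. It uses the $?variable-name form, which has a value of 1 if the variable is set and 0 otherwise.

```tcsh
#!/bin/tcsh -f
set demo_local = here
setenv DEMO_ENV there

# The child shell inherits DEMO_ENV but not demo_local,
# so this displays 0 1.
tcsh -f -c 'echo $?demo_local $?DEMO_ENV'

# After unsetenv the variable is no longer passed to child processes,
# so this displays 0.
unsetenv DEMO_ENV
tcsh -f -c 'echo $?DEMO_ENV'
```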

Arrays of String Variables

An array is a collection of strings, each of which is identified by its index (1, 2, 3,

and so on). Arrays in tcsh use one-based indexing (i.e., the first element of the array

has the subscript 1). Before you can access individual elements of an array, you

must declare the entire array by assigning a value to each element of the array. The

list of values must be enclosed in parentheses and separated by SPACEs:

8 $ set colors = (red green blue orange yellow)

9 $ echo $colors

red green blue orange yellow

10 $ echo $colors[3]

blue

11 $ echo $colors[2-4]

green blue orange

12 $ set shapes = ('' '' '' '' '')

13 $ echo $shapes

14 $ set shapes[4] = square

15 $ echo $shapes[4]

square

Event 8 declares the array of string variables named colors to have five elements and

assigns values to each of them. If you do not know the values of the elements at the

time you declare an array, you can declare an array containing the necessary num-

ber of null elements (event 12).

You can reference an entire array by preceding its name with a dollar sign (event 9).

A number in brackets following a reference to the array refers to an element of the

array (events 10, 14, and 15). Two numbers in brackets, separated by a hyphen,

refer to two or more adjacent elements of the array (event 11). Refer to “Special

Variable Forms” on page 371 for more information on arrays.
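As a further sketch of these array references (the array name fruits and its values are illustrative only), the following script extracts a range, replaces one element, and appends an element by redeclaring the array. It also uses the $#variable-name form, which gives the number of elements in an array.

```tcsh
#!/bin/tcsh -f
set fruits = (apple banana cherry date)

# Elements 2 through 3: displays banana cherry
echo $fruits[2-3]

# Replace a single existing element.
set fruits[2] = blueberry

# Append by redeclaring the array with one more element.
set fruits = ($fruits elderberry)

# Displays apple blueberry cherry date elderberry
echo $fruits

# Number of elements: displays 5
echo $#fruits
```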

Numeric Variables

The @ builtin assigns the result of a numeric calculation to a numeric variable (as

described under “Variables” on page 365, tcsh has no true numeric variables). You

can declare single numeric variables with @, just as you can use set to declare non-

numeric variables. However, if you give it a nonnumeric argument, @ displays an

error message. Just as set does, the @ command used without any arguments lists

all shell variables.

Many of the expressions that the @ builtin can evaluate and the operators it recog-

nizes are derived from the C programming language. The following format shows a

declaration or assignment using @ (the SPACE after the @ is required):

@ variable-name operator expression

The variable-name is the name of the variable you are assigning a value to. The operator is one of the C assignment operators: =, +=, -=, *=, /=, or %=. (See Table 12-4 on page 537 for a description of these operators.) The expression is an arithmetic expression that can include most C operators (see the next section). You can use parentheses within the expression for clarity or to change the order of evaluation. Parentheses must surround parts of the expression that contain any of the following characters: <, >, &, or |.

Do not use $ when assigning a value to a variable

tip As with bash, variables having a value assigned to them (those on the left of the operator) must

not be preceded by a dollar sign ($) in tcsh. Thus

tcsh $ @ $answer = 5 + 5

will yield

answer: Undefined variable.

or, if answer is defined,

@: Variable name must begin with a letter.

whereas

tcsh $ @ answer = 5 + 5

assigns the value 10 to the variable answer.

Expressions

An expression can be composed of constants, variables, and most of the bash oper-

ators (page 463). Expressions that involve files rather than numeric variables or

strings are described in Table 9-8 on page 379.

Expressions follow these rules:

1. The shell evaluates a missing or null argument as 0.

2. All results are decimal numbers.

3. Except for != and ==, the operators act on numeric arguments.

4. You must separate each element of an expression from adjacent elements by a SPACE, unless the adjacent element is &, |, <, >, (, or ).

Following are some examples that use @:

216 $ @ count = 0
217 $ echo $count
0
218 $ @ count = ( 10 + 4 ) / 2
219 $ echo $count
7
220 $ @ result = ( $count < 5 )
221 $ echo $result
0
222 $ @ count += 5
223 $ echo $count
12
224 $ @ count++
225 $ echo $count
13

Event 216 declares the variable count and assigns it a value of 0. Event 218 shows

the result of an arithmetic operation being assigned to a variable. Event 220 uses @

to assign the result of a logical operation involving a constant and a variable to

result. The value of the operation is false (= 0) because the variable count is not less

than 5. Event 222 is a compressed form of the following assignment statement:

tcsh $ @ count = $count + 5

Event 224 uses a postfix operator to increment count by 1.
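The rules above can be sketched with a few more assignments. The variable names are arbitrary. Each expression uses a single operator so that the result does not depend on how the shell groups operators of equal precedence.

```tcsh
#!/bin/tcsh -f
@ a = 6
@ b = 3

@ sum = $a + $b
@ quotient = $a / $b

# Parentheses are required around > and &.
@ bigger = ( $a > $b )
@ bits = ( $a & $b )

# Compressed form of: @ a = $a - 2
@ a -= 2

# Displays 9 2 1 2 4
echo $sum $quotient $bigger $bits $a
```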

Postincrement and postdecrement operators

You can use the postincrement (++) and postdecrement (--) operators only in expressions containing a single variable name, as shown in the following example:

tcsh $ @ count = 0
tcsh $ @ count++
tcsh $ echo $count
1
tcsh $ @ next = $count++
@: Badly formed number.

Unlike in the C programming language and bash, expressions in tcsh cannot use

preincrement and predecrement operators.

Arrays of Numeric Variables

You must use the set builtin to declare an array of numeric variables before you can use

@ to assign values to the elements of that array. The set builtin can assign any values to

the elements of a numeric array, including zeros, other numbers, and null strings.

Assigning a value to an element of a numeric array is similar to assigning a value to

a simple numeric variable. The only difference is that you must specify the element,

or index, of the array. The syntax is

@ variable-name[index] operator expression

The index specifies the element of the array that is being addressed. The first element

has an index of 1. The index cannot be an expression but rather must be either a

numeric constant or a variable. In the preceding syntax the brackets around index

are part of the syntax and do not indicate that index is optional. If you specify an

index that is too large for the array you declared with set, tcsh displays @: Subscript

out of range.

226 $ set ages = (0 0 0 0 0)
227 $ @ ages[2] = 15
228 $ @ ages[3] = ($ages[2] + 4)
229 $ echo $ages[3]
19
230 $ echo $ages
0 15 19 0 0
231 $ set index = 3
232 $ echo $ages[$index]
19
233 $ echo $ages[6]
ages: Subscript out of range.

Elements of a numeric array behave as though they were simple numeric variables. Event

226 declares an array with five elements, each having a value of 0. Events 227 and 228

assign values to elements of the array, and event 229 displays the value of one of the ele-

ments. Event 230 displays all the elements of the array, event 232 specifies an element by

using a variable, and event 233 demonstrates the out-of-range error message.
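A loop can fill a numeric array element by element. The following sketch uses the while control structure and the $#variable-name form (the number of elements in an array); the names doubled and i are arbitrary.

```tcsh
#!/bin/tcsh -f
# Declare the array with set before assigning to its elements with @.
set doubled = (0 0 0 0 0)

@ i = 1
while ( $i <= $#doubled )
    # The index is a variable, not an expression.
    @ doubled[$i] = $i + $i
    @ i++
end

# Displays 2 4 6 8 10
echo $doubled
```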

Braces

Like bash, tcsh allows you to use braces to distinguish a variable from the surrounding

text without the use of a separator:

$ set bb=abc

$ echo $bbdef

bbdef: Undefined variable.

$ echo ${bb}def

abcdef

Special Variable Forms

The special variable with the following syntax has the value of the number of elements

in the variable-name array:

$#variable-name

You can determine whether variable-name has been set by looking at the value of

the variable with the following syntax:

$?variable-name

This variable has a value of 1 if variable-name is set and 0 otherwise:

tcsh $ set days = (mon tues wed thurs fri)
tcsh $ echo $#days
5
tcsh $ echo $?days
1
tcsh $ unset days
tcsh $ echo $?days
0
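These two forms are often combined with other constructs. In the sketch below, $#days serves as an index to select the last element of an array, and $?demo_dir guards an assignment so that an existing value is not overwritten. The variable demo_dir and the value /tmp are assumptions made for the example.

```tcsh
#!/bin/tcsh -f
set days = (mon tues wed thurs fri)

# $#days is 5, so this displays: Last day: fri
echo "Last day: $days[$#days]"

# Assign a default only if demo_dir has not been set.
if ( ! $?demo_dir ) set demo_dir = /tmp
echo $demo_dir
```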

Reading User Input

Within a tcsh shell script, you can use the set builtin to read a line from the terminal

and assign it to a variable. The following portion of a shell script prompts the user

and reads a line of input into the variable input_line:

echo -n "Enter input: "

set input_line = "$<"

The value of the shell variable $< is a line from standard input. The quotation

marks around $< keep the shell from assigning only the first word of the line of

input to the variable input_line.
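A script usually checks what it has read before using it. The following sketch extends the fragment above; the prompt text and the variable name reply are illustrative.

```tcsh
#!/bin/tcsh -f
echo -n "Enter your name: "
set reply = "$<"

# An empty line leaves reply set to a null string.
if ( "$reply" == "" ) then
    echo "No name entered."
else
    echo "Hello, $reply."
endif
```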

Shell Variables

TC Shell variables may be set by the shell, inherited by the shell from the environ-

ment, or set by the user and used by the shell. Some variables take on significant

values (for example, the PID number of a background process). Other variables

act as switches: on if they are declared and off if they are not. Many of the shell

variables are often set from one of tcsh’s two startup files: ~/.login and ~/.tcshrc

(page 352).

Shell Variables That Take on Values

argv Contains the command-line arguments (positional parameters) from the command

line that invoked the shell. Like all tcsh arrays, this array uses one-based indexing;

argv[1] contains the first command-line argument. You can abbreviate references to

$argv[n] as $n. The token argv[*] references all the arguments together; you can

abbreviate it as $*. Use $0 to reference the name of the calling program. Refer to

“Positional Parameters” on page 440. The Bourne Again Shell does not use the argv

form, only the abbreviated form. You cannot assign values to the elements of argv.

$#argv or $# Holds the number of elements in the argv array. Refer to “Special Variable Forms”

on page 371.

autolist Controls command and variable completion (page 361).

autologout Enables tcsh’s automatic logout facility, which logs you out if you leave the shell idle

for too long. The value of the variable is the number of minutes of inactivity that

tcsh waits before logging you out. The default is 60 minutes except when you are

running in a graphical environment, in which case this variable is initially unset.

cdpath Affects the operation of cd in the same way as the CDPATH variable does in bash

(page 302). The cdpath variable is assigned an array of absolute pathnames (see

path, later in this section) and is usually set in the ~/.login file with a line such as the

following:

set cdpath = (/home/zach /home/zach/letters)

When you call cd with a simple filename, it searches the working directory for a

subdirectory with that name. If one is not found, cd searches the directories listed in

cdpath for the subdirectory.

correct Set to cmd for automatic spelling correction of command names, to all to correct

the entire command line, and to complete for automatic completion of command

names. This variable works on corrections that are made after you press RETURN.

Refer to “After You Press RETURN” on page 364.

cwd The shell sets this variable to the name of the working directory. When you access

a directory through a symbolic link (page 108), tcsh sets cwd to the name of the

symbolic link.

dirstack The shell keeps the stack of directories used with the pushd, popd, and dirs builtins

in this variable. For more information refer to “Manipulating the Directory Stack”

on page 288.

fignore Holds an array of suffixes that tcsh ignores during filename completion.

gid The shell sets this variable to your group ID.

histfile Holds the full pathname of the file that saves the history list between login sessions (page 355). The default is ~/.history.

history Specifies the size of the history list. Refer to “History” on page 354.

home or HOME Holds the pathname of the user’s home directory. The cd builtin refers to this variable,

as does the filename substitution of ~ (page 337).

mail Specifies files and directories, separated by whitespace, to check for mail. The TC

Shell checks for new mail every 10 minutes unless the first word of mail is a number,

in which case that number specifies how often the shell should check in seconds.

owd The shell keeps the name of your previous (old) working directory in this variable,

which is equivalent to ~- in bash.

path or PATH Holds a list of directories that tcsh searches for executable commands (page 297). If

this array is empty or unset, you can execute commands only by giving their full

pathnames. You can set path with a command such as the following:

tcsh $ set path = ( /usr/bin /bin /usr/local/bin /usr/bin/X11 ~/bin . )

prompt Holds the primary prompt, similar to the bash PS1 variable (page 299). If it is not

set, the prompt is >, or # when you are working with root privileges. The shell

expands an exclamation point in the prompt string to the current event number.

The following is a typical line from a .tcshrc file that sets the value of prompt:

set prompt = '! $ '

Table 9-4 lists some of the formatting sequences you can use in prompt to achieve

special effects.

Table 9-4 prompt formatting sequences

Sequence        Displays in prompt
%/              Value of cwd (the working directory)
%~              Same as %/, but replaces the path of the user’s home directory with a tilde
%! or %h or !   Current event number
%d              Day of the week
%D              Day of the month
%m              Hostname without the domain
%M              Full hostname, including the domain
%n              User’s username
%t              Time of day through the current minute
%p              Time of day through the current second
%W              Month as mm
%y              Year as yy
%Y              Year as yyyy
%#              A pound sign (#) if the user is running with root privileges; otherwise, a greater than sign (>)
%?              Exit status of the preceding command

prompt2 Holds the secondary prompt, which tcsh uses to indicate it is waiting for additional

input. The default value is %R?. The TC Shell replaces %R with nothing when it is

waiting for you to continue an unfinished command, the word foreach while iterat-

ing through a foreach structure (page 383), and the word while while iterating

through a while structure (page 385).

When you press RETURN in the middle of a quoted string on a command line without

ending the line with a backslash, tcsh displays an error message regardless of

whether you use single or double quotation marks:

% echo "Please enter the three values

Unmatched ".

In the next example, the first RETURN is quoted (escaped); the shell interprets it liter-

ally. Under tcsh, single and double quotation marks produce the same result. The

secondary prompt is a question mark (?).

% echo "Please enter the three values \

? required to complete the transaction."

Please enter the three values

required to complete the transaction.

prompt3 Holds the prompt used during automatic spelling correction. The default value is

CORRECT>%R (y|n|e|a)?, where R is replaced by the corrected string.

savehist Specifies the number of commands saved from the history list when you log out. These events are saved in a file named ~/.history. The shell uses these events as the initial history list when you log in again, causing your history list to persist across login sessions.

shell Holds the pathname of the shell you are using.

shlvl Holds the level of the shell. The TC Shell increments this variable each time you

start a subshell and decrements it each time you exit a subshell. The value is set to 1

for a login shell.

status Contains the exit status returned by the last command. Similar to $? in bash

(page 440).

tcsh Holds the version number of tcsh you are running.

time Provides two functions: automatic timing of commands using the time builtin and

the format used by time. You can set this variable to either a single numeric value or

an array holding a numeric value and a string. The numeric value is used to control

automatic timing; any command that takes more than that number of CPU seconds

to run has time display the command statistics when it finishes execution. A value of

0 results in statistics being displayed after every command. The string controls the

formatting of the statistics using formatting sequences, including those listed in

Table 9-5.

By default the time builtin uses the string

"%Uu %Ss %E %P% %X+%Dk %I+%Oio %Fpf+%Ww"

which generates output in the following format:

tcsh $ time

0.200u 0.340s 17:32:33.27 0.0% 0+0k 0+0io 1165pf+0w

You may want to time commands when concerns arise about system performance.

If your commands consistently show many page faults and swaps, the system prob-

ably does not have enough memory; you should consider adding more. You can use

the information that time reports to compare the performance of various system

configurations and program algorithms.

tperiod Controls how often, in minutes, the shell executes the special periodic alias (page 357).

user The shell sets this variable to your username.

version The shell sets this variable to contain detailed information about the version of tcsh

that the system is running.

Table 9-5 time formatting sequences

Sequence   Displays
%U         Time the command spent running user code, in CPU seconds (user mode)
%S         Time the command spent running system code, in CPU seconds (kernel mode)
%E         Wall clock time (total elapsed) taken by the command
%P         Percentage of time the CPU spent on this task during this period, computed as (%U+%S)/%E
%W         Number of times the command’s processes were swapped out to disk
%X         Average amount of shared code memory used by the command, in kilobytes
%D         Average amount of data memory used by the command, in kilobytes
%K         Total memory used by the command (as %X+%D), in kilobytes
%M         Maximum amount of memory used by the command, in kilobytes
%F         Number of major page faults (pages of memory that had to be read from disk)
%I         Number of input operations
%O         Number of output operations

watch Set to an array of user and terminal pairs to watch for logins and logouts. The word any means any user or any terminal, so (any any) monitors all logins and logouts on all terminals, whereas (zach ttyS1 any console $user any) watches for zach on ttyS1, any user who accesses the system console, and any logins and logouts that use your account (presumably to catch intruders). By default logins and logouts are checked once every 10 minutes, but you can change this value by beginning the array with a numeric value giving the number of minutes between checks. If you set watch to (1 any console), logins and logouts by any user on the console will be checked once per minute. Reports are displayed just before a new shell prompt is issued. Also, the log builtin forces an immediate check whenever it is executed. See who (next) for information about how you can control the format of the watch messages.

who Controls the format of the information displayed in watch messages (Table 9-6).

The default string used for watch messages when who is unset is "%n has %a %l

from %m", which generates the following line:

sam has logged on tty2 from local

$ As in bash, this variable contains the PID number of the current shell; use it as $$.
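As a sketch of how several of these variables might be set together in a ~/.tcshrc file, consider the following lines. Every value shown is an example choice, not a recommended or default setting; the cdpath directories are the ones used in the cdpath entry above.

```tcsh
# Prompt: username@hostname, working directory (with ~ for home),
# then > or # (see Table 9-4).
set prompt = '%n@%m %~ %# '

# Keep 200 events in the history list; save 100 of them at logout.
set history = 200
set savehist = 100

# Search these directories when cd is given a simple filename.
set cdpath = (/home/zach /home/zach/letters)

# Report statistics for any command that uses more than 5 CPU seconds
# (see Table 9-5 for the formatting sequences).
set time = ( 5 "%E elapsed, %U user, %S system" )

# Once per minute, check for any user on the console and for any
# login or logout that uses your own account.
set watch = ( 1 any console $user any )
set who = "%n has %a %l from %M"
```

Settings such as these take effect in each new shell that reads the startup file.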

Shell Variables That Act as Switches

The following shell variables act as switches; their values are not significant. If the

variable has been declared, the shell takes the specified action. If not, the action is

not taken or is negated. You can set these variables in your ~/.tcshrc startup file, in

a shell script, or from the command line.

autocorrect Causes the shell to attempt spelling correction automatically, just before each

attempt at completion (page 364).

dunique Normally pushd blindly pushes the new working directory onto the directory stack,

meaning that you can end up with many duplicate entries on this stack. Set dunique

to cause the shell to look for and delete any entries that duplicate the one it is about

to push.

echo Causes the shell to display each command before it executes that command. Set

echo by calling tcsh with the -x option or by using set.

filec Enables filename completion (page 360) when running tcsh as csh (and csh is linked

to tcsh).

histlit Displays the commands in the history list exactly as entered, without interpretation

by the shell (page 356).

Table 9-6 who formatting sequences

Sequence   Displays
%n         Username
%a         Action taken by the user
%l         Terminal on which the action took place
%M         Full hostname of remote host (or local if none) from which the action took place
%m         Hostname without domain name

ignoreeof Prevents you from using CONTROL-D to exit from a shell so you cannot accidentally log

out. When this variable is declared, you must use exit or logout to leave a shell.

listjobs Causes the shell to list all jobs whenever a job is suspended.

listlinks Causes the ls-F builtin to show the type of file each symbolic link points to instead

of marking the symbolic link with an @ symbol.

loginsh Set by the shell if the current shell is running as a login shell.

nobeep Disables all beeping by the shell.

noclobber Prevents you from accidentally overwriting a file when you redirect output and pre-

vents you from creating a file when you attempt to append output to a nonexistent

file (Table 9-7). To override noclobber, add an exclamation point to the symbol you

use for redirecting or appending output (for example, >! and >>!). For more infor-

mation see page 129.

noglob Prevents the shell from expanding ambiguous filenames. Allows you to use *, ?, ~,

and [] literally on the command line or in a shell script without quoting them.

nonomatch Causes the shell to pass an ambiguous file reference that does not match a filename

to the command being called. The shell does not expand the file reference. When

you do not set nonomatch, tcsh generates a No match error message and does not

execute the command.

tcsh $ cat questions?

cat: No match

tcsh $ set nonomatch

tcsh $ cat questions?

cat: questions?: No such file or directory

notify When set, tcsh sends a message to the screen immediately whenever a background

job completes. Ordinarily tcsh notifies you about job completion just before dis-

playing the next prompt. Refer to “Job Control” on page 285.

pushdtohome Causes a call to pushd without any arguments to change directories to your home

directory (equivalent to pushd ~).

pushdsilent Causes pushd and popd not to display the directory stack.

Table 9-7 How noclobber works

Command line: x > fileout
    noclobber not declared: Redirects standard output from process x to fileout. Overwrites fileout if it exists.
    noclobber declared: Redirects standard output from process x to fileout. The shell displays an error message if fileout exists, and does not overwrite the file.

Command line: x >> fileout
    noclobber not declared: Redirects standard output from process x to fileout. Appends new output to the end of fileout if it exists. Creates fileout if it does not exist.
    noclobber declared: Redirects standard output from process x to fileout. Appends new output to the end of fileout if it exists. The shell displays an error message if fileout does not exist, and does not create the file.

rmstar Causes the shell to request confirmation when you give an rm * command.

verbose Causes the shell to display each command after a history expansion (page 354). Set

verbose by calling tcsh with the -v option or by using set.

visiblebell Causes audible beeps to be replaced by flashing the screen.
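The effect of a switch such as noclobber can be sketched with a short script. The filename below is made up for the example and includes the PID number of the shell ($$) to avoid colliding with an existing file.

```tcsh
#!/bin/tcsh -f
set noclobber
set out = /tmp/noclobber_demo.$$

# The file does not exist yet, so > creates it.
echo first > $out

# The file now exists; >! overrides noclobber and overwrites it.
echo second >! $out

# >> appends because the file already exists.
echo third >> $out

# Displays second, then third.
cat $out

rm -f $out
unset noclobber
```

Because the value of a switch is not significant, set noclobber turns the behavior on and unset noclobber turns it off.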

Control Structures

The TC Shell uses many of the same control structures as the Bourne Again Shell. In

each case the syntax is different, but the effects are the same. This section summarizes

the differences between the control structures in the two shells. For more information

refer to “Control Structures” on page 398.

The syntax of the if control structure is

if (expression) simple-command

The if control structure works only with simple commands, not with pipes or lists

of commands. You can use the if...then control structure (page 382) to execute

more complex commands.

tcsh $ cat if_1

#!/bin/tcsh

# Routine to show the use of a simple if control structure.

if ( $#argv == 0 ) echo "if_1: There are no arguments."

The if_1 script checks whether it was called with zero arguments. If the expression

enclosed in parentheses evaluates to true—that is, if zero arguments were on the

command line—the if structure displays a message.

In addition to logical expressions such as the one the if_1 script uses, you can use

expressions that return a value based on the status of a file. The syntax for this type

of expression is

-n filename

where n is one of the values listed in Table 9-8.

If the result of the test is true, the expression has a value of 1; if it is false, the

expression has a value of 0. If the specified file does not exist or is not accessible,

tcsh evaluates the expression as 0. The following example checks whether the file specified on the command line is an ordinary file (and not a directory, device, or other special file):

tcsh $ cat if_2
#!/bin/tcsh
if ( -f $1 ) echo "$1 is an ordinary file."

You can combine operators where it makes sense. For example, -ox filename is true if you own and have execute permission for the file. This expression is equivalent to -o filename && -x filename.
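The following sketch applies several of these file tests, including a combined operator, to a scratch file that the script creates and then removes. The filename is hypothetical.

```tcsh
#!/bin/tcsh -f
set tmp = /tmp/filetest_demo.$$
echo sample > $tmp

if ( -e $tmp ) echo "$tmp exists"
if ( ! -z $tmp ) echo "$tmp is not empty"
if ( ! -d $tmp ) echo "$tmp is not a directory"

# Combined operator: true only if you own the file and can write to it.
if ( -ow $tmp ) echo "you own $tmp and can write to it"

rm -f $tmp
```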

Some operators return useful information about a file other than reporting true or false. They use the same -n filename format, where n is one of the values shown in Table 9-9.

Table 9-8 Value of n

n    Meaning
b    File is a block special file
c    File is a character special file
d    File is a directory file
e    File exists
f    File is an ordinary file
g    File has the set-group-ID bit set
k    File has the sticky bit (page 980) set
l    File is a symbolic link
o    File is owned by user
p    File is a named pipe (FIFO)
r    The user has read access to the file
S    File is a socket special file
s    File is not empty (has nonzero size)
t    File descriptor (a single digit replacing filename) is open and connected to the screen
u    File has the set-user-ID bit set
w    User has write access to the file
X    File is either a builtin or an executable found by searching the directories in $path
x    User has execute access to the file
z    File is 0 bytes long

You can use only one of these operators in a given test, and it must appear as the

last operator in a multiple-operator sequence. Because 0 (zero) can be a valid

response from some of these operators (for instance, the number of bytes in a file

might be 0), most return -1 on failure instead of the 0 that the logical operators

return on failure. The one exception is F, which returns a colon if it cannot deter-

mine the device and inode for the file.

When you want to use one of these operators outside of a control structure expression,

you can use the filetest builtin to evaluate a file test and report the result:

tcsh $ filetest -z if_1
0
tcsh $ filetest -F if_1
2051:12694
tcsh $ filetest -Z if_1
131
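Because filetest writes its result to standard output, you can capture that result with command substitution and use it later in a script. The sketch below does so for a scratch file; the filename and the 100-byte threshold are arbitrary.

```tcsh
#!/bin/tcsh -f
set tmp = /tmp/filetest_size.$$
echo hello > $tmp

# -Z reports the number of bytes: 6 here (hello plus a NEWLINE).
set size = `filetest -Z $tmp`
echo "$tmp holds $size bytes"

if ( $size < 100 ) echo "$tmp is smaller than 100 bytes"

# -P reports the permissions in octal; the value depends on your umask.
echo "permissions: `filetest -P $tmp`"

rm -f $tmp
```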

Table 9-9 Value of n

n    Meaning
A    The last time the file was accessed.*
A:   The last time the file was accessed displayed in a human-readable format.
M    The last time the file was modified.*
M:   The last time the file was modified displayed in a human-readable format.
C    The last time the file’s inode was modified.*
C:   The last time the file’s inode was modified displayed in a human-readable format.
D    Device number for the file. This number uniquely identifies the device (a disk partition, for example) on which the file resides.
I    Inode number for the file. The inode number uniquely identifies a file on a particular device.
F    A string of the form device:inode. This string uniquely identifies a file anywhere on the system.
N    Number of hard links to the file.
P    The file’s permissions, shown in octal, without a leading 0.
U    Numeric user ID of the file’s owner.
U:   Username of the file’s owner.
G    Numeric group ID of the file’s group.
G:   Name of the file’s group.
Z    Number of bytes in the file.

*Time measured in seconds from the epoch (usually the start of January 1, 1970).

goto

The goto statement has the following syntax:

goto label

A goto builtin transfers control to the statement beginning with label:. The following

script fragment demonstrates the use of goto:

tcsh $ cat goto_1

#!/bin/tcsh

# test for 2 arguments

if ($#argv == 2) goto goodargs

echo "Usage: $0 arg1 arg2"

exit 1

goodargs:

...

The goto_1 script displays a usage message (page 402) when it is called with more

or fewer than two arguments.

Interrupt Handling

The onintr (on interrupt) statement transfers control when you interrupt a shell

script. The format of an onintr statement is

onintr label

When you press the interrupt key during execution of a shell script, the shell

transfers control to the statement beginning with label:. This statement allows

you to terminate a script gracefully when it is interrupted. You can use it to ensure

that when you interrupt a shell script, the script removes temporary files before

returning control to the parent shell.

The following script demonstrates the use of onintr. It loops continuously until you

press the interrupt key, at which time it displays a message and returns control to

the shell:

tcsh $ cat onintr_1

#!/bin/tcsh

# demonstration of onintr

onintr close

while ( 1 )

echo "Program is running."

sleep 2

end

close:

echo "End of program."

If a script creates temporary files, you can use onintr to remove them.

close:

rm -f /tmp/$$*

The ambiguous file reference /tmp/$$* matches all files in /tmp that begin with the

PID number of the current shell. Refer to page 439 for a description of this tech-

nique for naming temporary files.
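Putting the pieces together, the following sketch creates a temporary file and removes it whether the script finishes normally or is interrupted. It removes only the one file it created, by name, rather than using the broader /tmp/$$* pattern; the filename and the label cleanup are illustrative.

```tcsh
#!/bin/tcsh -f
onintr cleanup

set tmpfile = /tmp/$$.work
echo "intermediate results" > $tmpfile

# Stand-in for real work; press the interrupt key here to test the handler.
sleep 2
cat $tmpfile

# Control reaches this label after an interrupt or by falling through.
cleanup:
rm -f $tmpfile
echo "Temporary file removed."
```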

if...then...else

The if...then...else control structure has three forms. The first form, an extension of

the simple if structure, executes more complex commands or a series of commands

if expression is true. This form is still a one-way branch.

if (expression) then

commands

endif

The second form is a two-way branch. If expression is true, the first set of commands

is executed. If it is false, the set of commands following else is executed.

if (expression) then

commands

else

commands

endif

The third form is similar to the if...then...elif structure (page 405). It performs

tests until it finds an expression that is true and then executes the corresponding

commands.

if (expression) then

commands

else if (expression) then

commands

. . .

else

commands

endif

The following program assigns a value of 0, 1, 2, or 3 to the variable class based on

the value of the first command-line argument. The program declares the variable

class at the beginning for clarity; you do not need to declare it before its first use.

Also for clarity, the script assigns the value of the first command-line argument to

number.

tcsh $ cat if_else_1
#!/bin/tcsh
# routine to categorize the first
# command-line argument
set class
set number = $argv[1]

if ($number < 0) then
    @ class = 0
else if (0 <= $number && $number < 100) then
    @ class = 1
else if (100 <= $number && $number < 200) then
    @ class = 2
else
    @ class = 3
endif

echo "The number $number is in class ${class}."

The first if statement tests whether number is less than 0. If it is, the script assigns 0
to class and transfers control to the statement following endif. If it is not, the second
if tests whether the number is at least 0 and less than 100. The && Boolean AND operator
yields a value of true if the expression on each side is true. If the number is in that
range, 1 is assigned to class and control is transferred to the statement following
endif. A similar test determines whether the number is at least 100 and less than 200. If
it is not, the final else assigns 3 to class. The endif closes the if control structure. The
final statement uses braces ({}) to isolate the variable class from the following
period. The braces are there for clarity only; the shell does not consider a punctuation
mark to be part of a variable name. The braces would be required if you wanted
characters that could be part of a variable name (letters, digits, or underscores) to
follow immediately after the variable.
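For comparison, the same classification can be sketched in bash with the if...then...elif structure mentioned above. This is an illustrative sketch only, not part of the original script: the function name classify is an assumption, and the function expects a single integer argument.

```shell
# classify: hypothetical bash counterpart of the tcsh if_else_1 script.
# Reports class 0, 1, 2, or 3 depending on the range of its integer argument.
classify() {
    local number=$1
    local class

    if [ "$number" -lt 0 ]; then
        class=0
    elif [ "$number" -lt 100 ]; then      # 0 <= number < 100
        class=1
    elif [ "$number" -lt 200 ]; then      # 100 <= number < 200
        class=2
    else
        class=3
    fi

    echo "The number $number is in class ${class}."
}
```

Because each elif is reached only when the earlier tests have failed, the lower bound of each range does not need to be tested again.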

foreach

The foreach structure parallels the bash for...in structure (page 411). The syntax is

foreach loop-index (argument-list)

commands

end

This structure loops through commands. The first time through the loop, the struc-

ture assigns the value of the first argument in argument-list to loop-index. When

control reaches the end statement, the shell assigns the value of the next argument

from argument-list to loop-index and executes the commands again. The shell

repeats this procedure until it exhausts argument-list.

The following tcsh script uses a foreach structure to loop through the files in the

working directory containing a specified string of characters in their filename and to

change the string. For example, you can use it to change the string memo in

filenames to letter. Thus the filenames memo.1, dailymemo, and memories would be

changed to letter.1, dailyletter, and letterries, respectively.

This script requires two arguments: the string to be changed (the old string) and the

new string. The argument-list of the foreach structure uses an ambiguous file refer-

ence to loop through all files in the working directory with filenames that contain

the first argument. For each filename that matches the ambiguous file reference, the

mv utility changes the filename. The echo and sed commands appear within back

ticks (`) that indicate command substitution: Executing the commands within the

back ticks replaces the back ticks and everything between them with the output of

those commands. Refer to “Command Substitution” on page 340 for more information.

The sed utility (page 565) replaces the first argument with the second argument in

the filename. The $1 and $2 are abbreviated forms of $argv[1] and $argv[2], respectively.

tcsh $ cat ren

#!/bin/tcsh

# Usage: ren arg1 arg2

# changes the string arg1 in the names of files

# in the working directory into the string arg2

if ($#argv != 2) goto usage

foreach i ( *$1* )

mv $i `echo $i | sed -n s/$1/$2/p`

end

exit 0

usage:

echo "Usage: ren arg1 arg2"

exit 1
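A comparable rename loop can be sketched in bash with the for...in structure. The sketch below is an assumption-laden illustration rather than a translation of ren: the name ren_sketch is hypothetical, and it uses bash pattern substitution in place of echo and sed, which is a different technique from the one in the tcsh script.

```shell
# ren_sketch OLD NEW: hypothetical bash counterpart of the tcsh ren script.
# Renames each file in the working directory whose name contains OLD,
# replacing the first occurrence of OLD with NEW.
ren_sketch() {
    if [ $# -ne 2 ]; then
        echo "Usage: ren_sketch old new" 1>&2
        return 1
    fi

    local old=$1 new=$2 name newname
    for name in *"$old"*; do
        # With no match, bash leaves the pattern unexpanded; skip it.
        [ -e "$name" ] || continue
        newname=${name/"$old"/$new}    # replace the first occurrence only
        mv -- "$name" "$newname"
    done
}
```

Quoting $old inside the expansion makes bash treat it as a literal string rather than as a pattern.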

optional The next script uses a foreach loop to assign the command-line arguments to the

elements of an array named buffer:

tcsh $ cat foreach_1

#!/bin/tcsh

# routine to zero-fill argv to 20 arguments

set buffer = (0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)

set count = 1

if ($#argv > 20) goto toomany

foreach argument ($argv[*])

set buffer[$count] = $argument

@ count++

end

# REPLACE command ON THE NEXT LINE WITH

# THE PROGRAM YOU WANT TO CALL.

exec command $buffer[*]

toomany:

echo "Too many arguments given."

echo "Usage: foreach_1 [up to 20 arguments]"

exit 1

The foreach_1 script calls another program named command with a command line

guaranteed to contain 20 arguments. If foreach_1 is called with fewer than 20

arguments, it fills the command line with zeros to complete the 20 arguments for

command. Providing more than 20 arguments causes it to display a usage message

and exit with an error status of 1.


The foreach structure loops through the commands one time for each command-

line argument. Each time through the loop, foreach assigns the value of the next

argument from the command line to the variable argument. Then the script assigns

each of these values to an element of the array buffer. The variable count maintains

the index for the buffer array. A postfix operator increments the count variable

using @ (@ count++). The exec builtin (bash and tcsh; page 450) calls command so

that a new process is not initiated. (Once command is called, the process running

this routine is no longer needed so a new process is not required.)
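The zero-fill idea can also be sketched in bash. The following is a hedged illustration, not the book's script: pad_args is a hypothetical name, it grows an array instead of indexing into a preset one, and it prints the padded list rather than calling exec.

```shell
# pad_args: hypothetical bash sketch of the zero-fill idea in foreach_1.
# Prints its arguments followed by enough zeros to make 20 words in all.
pad_args() {
    if [ $# -gt 20 ]; then
        echo "Too many arguments given." 1>&2
        return 1
    fi

    local buffer=("$@")              # start with the supplied arguments
    while [ ${#buffer[@]} -lt 20 ]; do
        buffer+=(0)                  # append zeros until there are 20 elements
    done

    echo "${buffer[@]}"
}
```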

while

The syntax of the while structure is

while (expression)

commands

end

This structure continues to loop through commands while expression is true. If

expression is false the first time it is evaluated, the structure never executes

commands.

tcsh $ cat while_1

#!/bin/tcsh

# Demonstration of a while control structure.

# This routine sums the numbers between 1 and n;

# n is the first argument on the command line.

set limit = $argv[1]

set index = 1

set sum = 0

while ($index <= $limit)

@ sum += $index

@ index++

end

echo "The sum is $sum"

This program computes the sum of all integers up to and including n, where n is the

first argument on the command line. The += operator assigns the value of sum +

index to sum.
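For readers coming from bash, the same loop might look like the sketch below. The function name sum_to is an assumption made for illustration; the arithmetic uses bash $(( )) expansion where the tcsh script uses the @ builtin.

```shell
# sum_to N: hypothetical bash counterpart of the tcsh while_1 script.
# Adds the integers from 1 through N using a while loop.
sum_to() {
    local limit=$1 index=1 sum=0
    while [ "$index" -le "$limit" ]; do
        sum=$((sum + index))         # same effect as tcsh: @ sum += $index
        index=$((index + 1))         # same effect as tcsh: @ index++
    done
    echo "The sum is $sum"
}
```

As in the tcsh version, the body never runs when the test fails the first time, so sum_to 0 reports a sum of 0.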

break and continue

You can interrupt a foreach or while structure with a break or continue statement.

These statements execute the remaining commands on the line before they transfer

control. The break statement transfers control to the statement after the end state-

ment, terminating execution of the loop. The continue statement transfers control

to the end statement, which continues execution of the loop.
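The bash break and continue statements play the same roles inside bash loops. The sketch below is an illustration in bash, not tcsh, and the function name sum_odd_until is hypothetical; it shows one loop that uses both statements.

```shell
# sum_odd_until LIMIT NUMBER...: hypothetical bash sketch.
# Adds the odd numbers in its argument list, skipping even ones with
# continue and leaving the loop with break once the sum reaches LIMIT.
sum_odd_until() {
    local limit=$1 sum=0 n
    shift
    for n in "$@"; do
        if [ $((n % 2)) -eq 0 ]; then
            continue        # even: go straight to the next iteration
        fi
        sum=$((sum + n))
        if [ "$sum" -ge "$limit" ]; then
            break           # limit reached: leave the loop entirely
        fi
    done
    echo "$sum"
}
```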


switch

The switch structure is analogous to the bash case structure (page 421):

switch (test-string)

case pattern:

commands

breaksw

case pattern:

commands

breaksw

...

default:

commands

breaksw

endsw

The breaksw statement transfers control to the statement following the endsw state-

ment. If you omit breaksw, control falls through to the next command. You can use

any of the special characters listed in Table 10-2 on page 423 within pattern except

the pipe symbol (|).

tcsh $ cat switch_1

#!/bin/tcsh

# Demonstration of a switch control structure.

# This routine tests the first command-line argument

# for yes or no in any combination of uppercase and

# lowercase letters.

# test that argv[1] exists

if ($#argv != 1) then

echo "Usage: $0 [yes|no]"

exit 1

else

# argv[1] exists, set up switch based on its value

switch ($argv[1])

# case of YES

case [yY][eE][sS]:

echo "Argument one is yes."

breaksw

# case of NO

case [nN][oO]:

echo "Argument one is no."

breaksw


# default case

default:

echo "Argument one is not yes or no."

breaksw

endsw

endif
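Because switch is analogous to the bash case structure, the same test can be sketched in bash as follows. This is an illustrative sketch under the assumption that the logic lives in a function named yes_or_no (a hypothetical name); in bash each branch ends with ;; where tcsh uses breaksw.

```shell
# yes_or_no ARG: hypothetical bash counterpart of the tcsh switch_1 script.
yes_or_no() {
    if [ $# -ne 1 ]; then
        echo "Usage: yes_or_no [yes|no]" 1>&2
        return 1
    fi

    case "$1" in
        [yY][eE][sS])
            echo "Argument one is yes."
            ;;
        [nN][oO])
            echo "Argument one is no."
            ;;
        *)
            echo "Argument one is not yes or no."
            ;;
    esac
}
```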

Builtins

Builtins are commands that are part of (built into) the shell. When you give a simple

filename as a command, the shell first checks whether it is the name of a builtin. If it

is, the shell executes it as part of the calling process; the shell does not fork a new

process to execute the builtin. The shell does not need to search the directory struc-

ture for builtin programs because they are immediately available to the shell.

If the simple filename you give as a command is not a builtin, the shell searches the

directory structure for the program you want, using the PATH variable as a guide.

When it finds the program, the shell forks a new process to execute the program.
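In bash, the type builtin reports how the shell would interpret a name, which makes the builtin-versus-program distinction easy to observe. The sketch below is bash, not tcsh (tcsh offers the where and which builtins for a similar purpose), and the wrapper name command_kind is hypothetical.

```shell
# command_kind NAME: prints how bash would interpret NAME as a command
# (alias, keyword, function, builtin, or file), or "not found".
command_kind() {
    local kind
    kind=$(type -t "$1") || kind="not found"
    echo "$kind"
}

command_kind cd        # a builtin: runs inside the shell, no new process
command_kind if        # a control structure keyword
```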

Although they are not listed in Table 9-10, the control structure keywords (if,
foreach, endsw, and so on) are builtins. Table 9-10 describes many of the tcsh
builtins, some of which are also built into other shells.

Table 9-10 tcsh builtins

Builtin Function

%job A synonym for the fg builtin. The job is the job number of the job you want to
bring to the foreground (page 286).

%job & A synonym for the bg builtin. The job is the number of the job you want to put
in the background (page 287).

@ Similar to the set builtin but evaluates numeric expressions. Refer to
“Numeric Variables” on page 368.

alias Creates and displays aliases; bash uses a different syntax than tcsh. Refer to

“Aliases” on page 357.

alloc Displays a report of the amount of free and used memory.

bg Moves a suspended job into the background (page 287).

bindkey Controls the mapping of keys to the tcsh command-line editor commands.

bindkey Without any arguments, bindkey lists all key bindings (page 363).

bindkey –l Lists all available editor commands and gives a short description of each.

bindkey –e Puts the command-line editor in emacs mode (page 363).


bindkey –v Puts the command-line editor in vi(m) mode (page 363).

bindkey key command Attaches the editor command command to the key key.

bindkey –b key command Similar to the previous form but allows you to specify control
keys by using the form C-x (where x is the character you type while you press the CONTROL
key), specify meta key sequences as M-x (on most keyboards used with
Linux, the ALT key is the META key; Mac OS X uses the OPTION key [but you
have to set an option in the Terminal utility to use it; see page 935]), and
specify function keys as F-x.

bindkey –c key command Binds the key key to the command command. Here the command
is not an editor command but rather a shell builtin or an executable program.

bindkey –s key string Causes tcsh to substitute string when you press key.

builtins Displays a list of all builtins.

cd or chdir Changes the working directory (page 87).

dirs Displays the directory stack (page 288).

echo Displays its arguments. You can prevent echo from displaying a RETURN at the

end of a line by using the –n option (see “Reading User Input” on page 371)

or by using a trailing \c (see “read: Accepts User Input” on page 447). The

echo builtin is similar to the echo utility (page 680).

eval Scans and evaluates the command line. When you put eval in front of a

command, the command is scanned twice by the shell before it is executed.

This feature is useful with a command that is generated through command

or variable substitution. Because of the order in which the shell processes

a command line, it is sometimes necessary to repeat the scan to achieve the

desired result (page 329).

exec Overlays the program currently being executed with another program in the

same shell. The original program is lost. Refer to “exec: Executes a Command

or Redirects File Descriptors” on page 450 for more information; also refer to

source on page 390.

exit Exits from a TC Shell. When you follow exit with a numeric argument, tcsh

returns that number as the exit status (page 440).

fg Moves a job into the foreground (page 285).

filetest Takes one of the file inquiry operators followed by one or more filenames and

applies the operator to each filename (page 380). Returns the results as a

SPACE-separated list.


glob Like echo, but does not display SPACEs between its arguments and does not

follow its display with a NEWLINE.

hashstat Reports on the efficiency of tcsh’s hash mechanism. The hash mechanism

speeds the process of searching through the directories in your search path.

See also rehash (page 390) and unhash (page 391).

history Displays a list of recent commands (page 354).

jobs Displays a list of jobs (suspended commands and those running in the

background).

kill Terminates a job or process (page 456).

limit Limits the computer resources that the current process and any processes it

creates can use. You can put limits on the number of seconds of CPU time the

process can use, the size of files that the process can create, and so forth.

log Immediately produces the report that the watch shell variable (page 375)

would normally cause tcsh to produce every 10 minutes.

logout Ends a session if you are using your original (login) shell.

ls–F Similar to ls –F (page 746) but faster. (This builtin is the characters ls–F without

any SPACEs.)

nice Lowers the processing priority of a command or a shell. This builtin is useful

if you want to run a command that makes large demands on the system and

you do not need the output right away. If you are working with root privileges,

you can use nice to raise the priority of a command. Refer to page 773 for

more information on the nice builtin and the nice utility.

nohup Allows you to log out without terminating processes running in the background.

Some systems are set up this way by default. Refer to page 775 for information

on the nohup builtin and the nohup utility.

notify Causes the shell to notify you immediately when the status of one of your jobs

changes (page 285).

onintr Controls the action that an interrupt causes within a script (page 381). See

“trap: Catches a Signal” on page 453 for information on the equivalent

command in bash.

popd Changes the working directory to the directory on the top of the directory stack

and removes that directory from the directory stack (page 288).

printenv Displays all environment variable names and values.


pushd Changes the working directory and places the new directory at the top of the

directory stack (page 288).

rehash Re-creates the internal tables used by the hash mechanism. Whenever a new

instance of tcsh is invoked, the hash mechanism creates a sorted list of all

available commands based on the value of path. After you add a command to

a directory in path, use rehash to re-create the sorted list of commands. If

you do not, tcsh may not be able to find the new command. Also refer to

hashstat (page 389) and unhash (page 391).

repeat Takes two arguments—a count and a simple command (no pipes or lists of com-

mands)—and repeats the command the number of times specified by the count.

sched Executes a command at a specified time. For example, the following command

causes the shell to display the message Dental appointment. at 10 AM:

tcsh $ sched 10:00 echo "Dental appointment."

Without any arguments, sched displays the list of scheduled commands.

When the time to execute a scheduled command arrives, tcsh executes the

command just before it displays a prompt.

set Declares, initializes, and displays local variables (page 365).

setenv Declares, initializes, and displays environment variables (page 365).

shift Analogous to the bash shift builtin (page 442). Without an argument, shift

promotes the indexes of the argv array. You can use it with an argument of an

array name to perform the same operation on that array.

source Executes the shell script given as its argument: source does not fork another

process. It is similar to the bash . (dot) builtin (page 274). The source builtin

expects a TC Shell script so no leading #! is required in the script. The current

shell executes source; thus the script can contain commands, such as set,

that affect the current shell. After you make changes to your .tcshrc or .login

file, you can use source to execute it from the shell, thereby putting the

changes into effect without logging out and in. You can nest source builtins.

stop Stops a job or process that is running in the background. The stop builtin

accepts multiple arguments.

suspend Stops the current shell and puts it in the background. It is similar to the suspend
key, which stops jobs running in the foreground. This builtin will not suspend a
login shell.

time Executes the command you give it as an argument. It displays a summary of

time-related information about the executed command, according to the time

shell variable (page 374). Without an argument, time displays the times for

the current shell and its children.

umask Identifies or changes the access permissions that tcsh assigns to files you
create (page 870).

unalias Removes an alias (page 357).

unhash Turns off the hash mechanism. See also hashstat (page 389) and rehash
(page 390).

unlimit Removes limits (page 389) on the current process.

unset Removes a variable declaration (page 365).

unsetenv Removes an environment variable declaration (page 365).

wait Causes the shell to wait for all child processes to terminate. When you give a
wait command in response to a shell prompt, tcsh does not display a prompt
until all background processes have finished execution. If you interrupt it with
the interrupt key, wait displays a list of background processes before tcsh
displays a prompt.

where When given the name of a command as an argument, locates all occurrences
of the command and, for each, tells you whether it is an alias, a builtin, or an
executable program in your path.

which Similar to where but reports on only the command that would be executed,
not all occurrences. This builtin is much faster than the Linux which utility and
reports on aliases and builtins.

Chapter Summary

Like the Bourne Again Shell, the TC Shell is both a command interpreter and a
programming language. The TC Shell, which is based on the C Shell that was
developed at the University of California at Berkeley, includes popular features
such as history, alias, and job control.

You may prefer to use tcsh as a command interpreter, especially if you are familiar
with the C Shell. You can use chsh to change your login shell to tcsh. However,
running tcsh as your interactive shell does not cause tcsh to run shell scripts; they will
continue to be run by bash unless you explicitly specify another shell on the first line
of the script or specify the script name as an argument to tcsh. Specifying the shell
on the first line of a shell script ensures the behavior you expect.

If you are familiar with bash, you will notice some differences between the two
shells. For instance, the syntax you use to assign a value to a variable differs. In
addition, tcsh allows SPACEs around the equal sign. Both numeric and nonnumeric
variables are created and given values using the set builtin. The @ builtin can
evaluate numeric expressions for assignment to numeric variables.

setenv Because there is no export builtin in tcsh, you must use the setenv builtin to create an
environment (global) variable. You can also assign a value to the variable with the
setenv command. The command unset removes local (shell) variables, whereas the
command unsetenv removes environment variables.

Aliases The syntax of the tcsh alias builtin is slightly different from that of alias in bash.

Unlike bash, the tcsh aliases permit you to substitute command-line arguments using

the history mechanism syntax.

Most other tcsh features, such as history, word completion, and command-line edit-

ing, closely resemble their bash counterparts. The syntax of the tcsh control structures

is slightly different but provides functionality equivalent to that found in bash.

Globbing The term globbing, a carryover from the original Bourne Shell, refers to the match-

ing of strings containing special characters (such as * and ?) to filenames. If tcsh is

unable to generate a list of filenames matching a globbing pattern, it displays an

error message. This behavior contrasts with that of bash, which simply leaves the

pattern alone.
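The bash side of this contrast can be observed with a short sketch. It assumes the default bash settings (the nullglob and failglob options are off), and the function name show_unmatched_glob is hypothetical.

```shell
# show_unmatched_glob: hypothetical sketch of bash's default behavior
# when a globbing pattern matches nothing (nullglob and failglob unset).
show_unmatched_glob() {
    local dir
    dir=$(mktemp -d) || return 1     # fresh, empty directory
    # Nothing matches, so bash passes the pattern itself to echo.
    echo "$dir"/*.txt
    rmdir "$dir"
}
```

The output ends in the literal characters *.txt; under tcsh, the equivalent echo command would instead fail with an error message about the unmatched pattern.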

Standard output and standard error can be redirected in tcsh, but there is no
straightforward way to redirect them independently. Doing so requires the creation
of a subshell that redirects standard output to a file while making standard error
available to the parent process.
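The subshell technique can be sketched as follows. The block is written for bash, which happens to accept the same parenthesized form with the >& operator; the helper emit_both and the filenames are assumptions made for illustration. The last command shows the direct form that bash allows, for contrast.

```shell
# emit_both: hypothetical helper that writes one line to each stream.
emit_both() {
    echo "to stdout"
    echo "to stderr" 1>&2
}

workdir=$(mktemp -d)

# Subshell idiom: the inner redirection captures standard output, so the
# outer >& receives only what is left, which is standard error.
( emit_both > "$workdir/out.sub" ) >& "$workdir/err.sub"

# bash can name each stream directly by its file descriptor.
emit_both > "$workdir/out.direct" 2> "$workdir/err.direct"
```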

Exercises

1. Assume that you are working with the following history list:

37 mail zach

38 cd /home/sam/correspondence/business/cheese_co

39 less letter.0321

40 vim letter.0321

41 cp letter.0321 letter.0325

42 grep hansen letter.0325

43 vim letter.0325

44 lpr letter*

45 cd ../milk_co

46 pwd

47 vim wilson.0321 wilson.0329

Using the history mechanism, give commands to

a. Send mail to Zach.

b. Use vim to edit a file named wilson.0329.


c. Send wilson.0329 to the printer.

d. Send both wilson.0321 and wilson.0329 to the printer.

2. a. How can you display the aliases currently in effect?

b. Write an alias named homedots that lists the names (only) of all hidden

files in your home directory.

3. a. How can you prevent a command from sending output to the terminal

when you start it in the background?

b. What can you do if you start a command in the foreground and later

decide that you want it to run in the background?

4. Which statement can you put in your ~/.tcshrc file to prevent accidentally

overwriting a file when you redirect output? How can you override this

feature?

5. Assume that the working directory contains the following files:

adams.ltr.03

adams.brief

adams.ltr.07

abelson.09

abelson.brief

anthony.073

anthony.brief

azevedo.99

What happens if you press TAB after typing the following commands?

a. less adams.l

b. cat a

c. ls ant

d. file az

What happens if you press CONTROL-D after typing the following commands?

e. ls ab

f. less a

6. Write an alias named backup that takes a filename as an argument and

creates a copy of that file with the same name and a filename extension

of .bak.

7. Write an alias named qmake (quiet make) that runs make with both

standard output and standard error redirected to the file named

make.log. The command qmake should accept the same options and

arguments as make.


8. How can you make tcsh always display the pathname of the working

directory as part of its prompt?

Advanced Exercises

9. Which lines do you need to change in the Bourne Again Shell script

command_menu (page 423) to turn it into a TC Shell script? Make the

changes and verify that the new script works.

10. Users often find rm (and even rm –i) too unforgiving because this utility

removes files irrevocably. Create an alias named delete that moves files

specified by its argument(s) into the ~/.trash directory. Create a second

alias named undelete that moves a file from the ~/.trash directory into the

working directory. Put the following line in your ~/.logout file to remove

any files that you deleted during the login session:

/bin/rm -f $HOME/.trash/* >& /dev/null

Explain what could be different if the following line were put in your

~/.logout file instead:

rm $HOME/.trash/*

11. Modify the foreach_1 script (page 384) so that it takes the command to

exec as an argument.

12. Rewrite the program while_1 (page 385) so that it runs faster. Use the time

builtin to verify the improvement in execution time.

13. Write your own version of find named myfind that writes output to the file

findout but without the clutter of error messages, such as those generated

when you do not have permission to search a directory. The myfind com-

mand should accept the same options and arguments as find. Can you

think of a situation in which myfind does not work as desired?

14. When the foreach_1 script (page 384) is supplied with 20 or fewer argu-

ments, why are the commands following toomany: not executed? (Why is

there no exit command?)

Programming the Bourne Again Shell

In This Chapter

Control Structures 398
File Descriptors 431
Parameters and Variables 434
Array Variables 434
Locality of Variables 436
Special Parameters 438
Positional Parameters 440
Builtin Commands 446
Expressions 460
Shell Programs 468
A Recursive Shell Script 469
The quiz Shell Script 472

Chapter 5 introduced the shells and Chapter 8 went into detail
about the Bourne Again Shell. This chapter introduces additional
Bourne Again Shell commands, builtins, and concepts
that carry shell programming to a point where it can be useful.
Although you may make use of shell programming as a system
administrator, you do not have to read this chapter to perform
system administration tasks. Feel free to skip this chapter and
come back to it if and when you like.

The first part of this chapter covers programming control structures,
also called control flow constructs. These structures allow
you to write scripts that can loop over command-line arguments,
make decisions based on the value of a variable, set up menus,
and more. The Bourne Again Shell uses the same constructs
found in such high-level programming languages as C.

The next part of this chapter discusses parameters and variables,
going into detail about array variables, local versus global variables,
special parameters, and positional parameters. The exploration
of builtin commands covers type, which displays
information about a command, and read, which allows a shell
script to accept user input. The section on the exec builtin demonstrates how to use
exec to execute a command efficiently by replacing a process and explains how to
use exec to redirect input and output from within a script.

The next section covers the trap builtin, which provides a way to detect and respond

to operating system signals (such as the signal generated when you press CONTROL-C).

The discussion of builtins concludes with a discussion of kill, which can abort a pro-

cess, and getopts, which makes it easy to parse options for a shell script. Table 10-6

on page 459 lists some of the more commonly used builtins.

Next the chapter examines arithmetic and logical expressions as well as the operators

that work with them. The final section walks through the design and implementation

of two major shell scripts.

This chapter contains many examples of shell programs. Although they illustrate cer-

tain concepts, most use information from earlier examples as well. This overlap not

only reinforces your overall knowledge of shell programming but also demonstrates

how you can combine commands to solve complex tasks. Running, modifying, and

experimenting with the examples in this book is a good way to become comfortable

with the underlying concepts.

This chapter illustrates concepts with simple examples, which are followed by more

complex ones in sections marked “Optional.” The more complex scripts illustrate

traditional shell programming practices and introduce some Linux utilities often

used in scripts. You can skip these sections without loss of continuity. Return to

them when you feel comfortable with the basic concepts.

Control Structures

The control flow commands alter the order of execution of commands within a shell
script. The TC Shell uses a different syntax for these commands than the Bourne
Again Shell does; see page 378. Control structures include the if...then, for...in,
while, until, and case statements. In addition, the break and continue statements
work in conjunction with the control structures to alter the order of execution of
commands within a script.

if...then

The if...then control structure has the following syntax:

if test-command
    then
        commands
fi

Do not name a shell script test

tip You can unwittingly create a problem if you give a shell script the name test because a Linux utility

has the same name. Depending on how the PATH variable is set up and how you call the program,

you may run either your script or the utility, leading to confusing results.


The bold words in the syntax description are the items you supply to cause the

structure to have the desired effect. The nonbold words are the keywords the shell

uses to identify the control structure.

test builtin Figure 10-1 shows that the if statement tests the status returned by the test-command

and transfers control based on this status. The end of the if structure is marked by a

fi statement (if spelled backward). The following script prompts for two words,

reads them, and then uses an if structure to execute commands based on the result

returned by the test builtin (tcsh uses the test utility) when it compares the two

words. (See page 854 for information on the test utility, which is similar to the test

builtin.) The test builtin returns a status of true if the two words are the same and

false if they are not. Double quotation marks around $word1 and $word2 make

sure test works properly if you enter a string that contains a SPACE or other special

character:

$ cat if1
echo -n "word 1: "
read word1
echo -n "word 2: "
read word2

if test "$word1" = "$word2"
    then
        echo "Match"
fi
echo "End of program."

Figure 10-1 An if...then flowchart (test-command: if True, execute the then commands; if False, skip them)

$ ./if1

word 1: peach

word 2: peach

Match

End of program.

In the preceding example the test-command is test "$word1" = "$word2". The test
builtin returns a true status if its first and third arguments have the relationship
specified by its second argument. If this command returns a true status (an exit status
of 0), the shell executes the commands between the then and fi statements. If the command
returns a false status (a nonzero exit status), the shell passes control to the statement
following fi without executing the statements between then and fi. The effect of this if
statement is to display Match if the two words are the same. The script always displays
End of program.
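The exit status that test returns can be examined directly. The following is a minimal sketch, assuming a hypothetical helper named words_match that wraps the same comparison; it is not part of the if1 script.

```shell
# words_match A B: hypothetical helper wrapping the comparison used in if1.
# Its exit status is that of test: 0 (true) when the strings are equal.
words_match() {
    test "$1" = "$2"
}

if words_match peach peach
    then
        echo "Match"
fi

# The quotation marks keep "peach pie" together as a single argument.
words_match peach "peach pie" || echo "No match (status $?)"
```

The first call displays Match; the second displays No match (status 1), because test returns 1 when the comparison fails.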

Builtins In the Bourne Again Shell, test is a builtin, part of the shell. It is also a
stand-alone utility kept in /usr/bin/test. This chapter discusses and demonstrates many
Bourne Again Shell builtins. Each bash builtin may or may not be a builtin in tcsh. You
typically use the builtin version if it is available and the utility if it is not. Each version
of a command may vary slightly from one shell to the next and from the utility to
any of the shell builtins. See page 446 for more information on shell builtins.

Checking arguments The next program uses an if structure at the beginning of a script to confirm that
you have supplied at least one argument on the command line. The test –eq operator
compares two integers; the $# special parameter (page 441) takes on the
value of the number of command-line arguments. This structure displays a message
and exits from the script with an exit status of 1 if you do not supply at least
one argument:

$ cat chkargs
if test $# -eq 0
    then
        echo "You must supply at least one argument."
        exit 1
fi
echo "Program running."

$ ./chkargs

You must supply at least one argument.

$ ./chkargs abc

Program running.

A test like the one shown in chkargs is a key component of any script that requires

arguments. To prevent the user from receiving meaningless or confusing informa-

tion from the script, the script needs to check whether the user has supplied the

appropriate arguments. Some scripts simply test whether arguments exist (as in

chkargs). Other scripts test for a specific number or specific kinds of arguments.
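A check for a specific number of arguments follows the same pattern. The sketch below is illustrative only; it assumes a hypothetical function named need_two_args and uses the test –ne (not equal) operator in place of –eq.

```shell
# need_two_args: hypothetical sketch of a check for an exact argument count.
need_two_args() {
    if test $# -ne 2
        then
            echo "You must supply exactly two arguments." 1>&2
            return 1
    fi
    echo "Program running."
}
```

Inside a function, return takes the place of exit so that a failed check does not terminate the calling shell.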

You can use test to verify the status of a file argument or the relationship between
two file arguments. After verifying that at least one argument has been given on the
command line, the following script tests whether the argument is the name of an
ordinary file (not a directory or other type of file) in the working directory. The test
builtin with the –f option and the first command-line argument ($1) check the file:

$ cat is_ordfile
if test $# -eq 0
    then
        echo "You must supply at least one argument."
        exit 1
fi
if test -f "$1"
    then
        echo "$1 is an ordinary file in the working directory"
    else
        echo "$1 is NOT an ordinary file in the working directory"
fi

You can test many other characteristics of a file using test options; see Table 10-1. Other test options provide ways to test relationships between two files, such as whether one file is newer than another. Refer to later examples in this chapter and test on page 854 for more information.
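The following sketch exercises a few of these file tests, including the –nt (newer than) comparison that bash's test builtin provides. The helper name is_newer and the scratch filenames are hypothetical; the files are created in a temporary directory and given explicit modification times so the comparison does not depend on timing.

```shell
is_newer() {    # hypothetical helper: succeed if file $1 is newer than file $2
    test "$1" -nt "$2"
}

scratch=$(mktemp -d) || exit 1
# touch -t sets an explicit modification time (format CCYYMMDDhhmm).
touch -t 202001010000 "$scratch/old"
touch -t 202101010000 "$scratch/new"

if test -d "$scratch"
    then
        echo "scratch is a directory"
fi
if test -f "$scratch/old"
    then
        echo "old is an ordinary file"
fi
if test -s "$scratch/old"
    then
        echo "old has data"
    else
        echo "old is empty"
fi
if is_newer "$scratch/new" "$scratch/old"
    then
        echo "new is newer than old"
fi
rm -r "$scratch"
```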

[] is a synonym for test

The following example, another version of chkargs, checks for arguments in a way that is more traditional for Linux shell scripts. This example uses the bracket ([]) synonym for test. Rather than using the word test in scripts, you can surround the arguments to test with brackets. The brackets must be surrounded by whitespace (SPACEs or TABs).

Table 10-1 Options to the test builtin

Option    Tests file to see if it
–d        Exists and is a directory file
–e        Exists
–f        Exists and is an ordinary file (not a directory)
–r        Exists and is readable
–s        Exists and has a size greater than 0 bytes
–w        Exists and is writable
–x        Exists and is executable

Tip: Always test the arguments

To keep the examples in this book short and focused on specific concepts, the code to verify arguments is often omitted or abbreviated. It is good practice to test arguments in shell programs that other people will use. Doing so results in scripts that are easier to run and debug.


$ cat chkargs2
if [ $# -eq 0 ]
    then
        echo "Usage: chkargs2 argument..." 1>&2
        exit 1
fi
echo "Program running."
exit 0

$ ./chkargs2
Usage: chkargs2 argument...
$ ./chkargs2 abc
Program running.

Usage messages

The error message that chkargs2 displays is called a usage message and uses the 1>&2 notation to redirect its output to standard error (page 275). After issuing the usage message, chkargs2 exits with an exit status of 1, indicating an error has occurred. The exit 0 command at the end of the script causes chkargs2 to exit with a 0 status after the program runs without an error. If you omit the status code, the script returns the exit status of the last command it ran, which is 0 here because the final echo succeeds.
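The sketch below shows the two effects described above: the usage message travels on standard error, and the exit status distinguishes failure from success. The function usage_demo is a hypothetical stand-in for chkargs2; its body is enclosed in parentheses so that it runs in a subshell and exit ends only the function, not the calling shell.

```shell
usage_demo() (    # ( ) body: runs in a subshell, so exit is confined to it
    if [ $# -eq 0 ]
        then
            echo "Usage: usage_demo argument..." 1>&2
            exit 1
    fi
    echo "Program running."
    exit 0
)

# Discard standard error to show that the usage message is not on standard output.
usage_demo 2>/dev/null || echo "no arguments: exit status $?"
usage_demo abc && echo "one argument: exit status $?"
```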

The usage message is commonly employed to specify the type and number of arguments the script takes. Many Linux utilities provide usage messages similar to the one in chkargs2. If you call a utility or other program with the wrong number or wrong kind of arguments, it will often display a usage message. Following is the usage message that cp displays when you call it without any arguments:

$ cp

cp: missing file operand

Try 'cp --help' for more information.

if...then...else

The introduction of an else statement turns the if structure into the two-way branch shown in Figure 10-2. The if...then...else control structure (available in tcsh with a slightly different syntax) has the following syntax:

if test-command
    then
        commands
    else
        commands
fi

Because a semicolon (;) ends a command just as a NEWLINE does, you can place then on the same line as if by preceding it with a semicolon. (Because if and then are separate builtins, they require a command separator between them; a semicolon and NEWLINE work equally well [page 282].) Some people prefer this notation for aesthetic reasons; others like it because it saves space.

if test-command; then
        commands
    else
        commands
fi


If the test-command returns a true status, the if structure executes the commands between the then and else statements and then diverts control to the statement following fi. If the test-command returns a false status, the if structure executes the commands following the else statement.

When you run the out script with arguments that are filenames, it displays the files on the terminal. If the first argument is –v (called an option in this case), out uses less (page 48) to display the files one screen at a time. After determining that it was called with at least one argument, out tests its first argument to see whether it is –v. If the result of the test is true (the first argument is –v), out uses the shift builtin (page 442) to shift the arguments to get rid of the –v and displays the files using less. If the result of the test is false (the first argument is not –v), the script uses cat to display the files:

$ cat out
if [ $# -eq 0 ]
    then
        echo "Usage: out [-v] filenames..." 1>&2
        exit 1
fi
if [ "$1" = "-v" ]
    then
        shift
        less -- "$@"
    else
        cat -- "$@"
fi

Figure 10-2 An if...then...else flowchart (test-command: True leads to the then commands; False leads to the else commands)


optional

In out the –– argument to cat and less tells these utilities that no more options follow on the command line and not to consider leading hyphens (–) in the following list as indicating options. Thus –– allows you to view a file with a name that starts with a hyphen. Although not common, filenames beginning with a hyphen do occasionally occur. (You can create such a file by using the command cat > –fname.) The –– argument works with all Linux utilities that use the getopts builtin (page 456) to parse their options; it does not work with a few other utilities. This argument is particularly useful when used in conjunction with rm to remove a file whose name starts with a hyphen (rm –– –fname), including any you create while experimenting with the –– argument.
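A short sketch of the –– argument at work follows. The function name hyphen_demo is hypothetical, and the experiment is confined to a temporary directory. For a utility that does not accept ––, naming the file with a leading ./ (as in ./-fname) is a common alternative, because the argument then no longer begins with a hyphen.

```shell
hyphen_demo() (    # ( ) runs the body in a subshell, so cd does not leak out
    scratch=$(mktemp -d) || exit 1
    cd "$scratch" || exit 1

    echo "some text" > -fname    # a redirection target is never parsed as an option
    cat -- -fname                # -- marks the end of options for cat
    rm -- -fname                 # the same works for rm
    if [ ! -e ./-fname ]         # the ./ prefix keeps test from seeing an operator
        then
            echo "removed"
    fi

    cd / && rmdir "$scratch"
)
hyphen_demo
```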

Figure 10-3 An if...then...elif flowchart (first test-command: True leads to the then commands, False leads to the elif test-command; elif test-command: True leads to its then commands, False leads to the else commands)


if...then...elif

The if...then...elif control structure (Figure 10-3; not available in tcsh) has the following syntax:

if test-command
    then
        commands
    elif test-command
        then
            commands
    . . .
    else
        commands
fi

The elif statement combines the else statement and the if statement and enables you to construct a nested set of if...then...else structures (Figure 10-3). The difference between the else statement and the elif statement is that each else statement must be paired with a fi statement, whereas multiple nested elif statements require only a single closing fi statement.

The following example shows an if...then...elif control structure. This shell script compares three words that the user enters. The first if statement uses the Boolean AND operator (–a) as an argument to test. The test builtin returns a true status only if the first and second logical comparisons are true (that is, word1 matches word2 and word2 matches word3). If test returns a true status, the script executes the command following the next then statement, passes control to the statement following fi, and terminates:

$ cat if3
echo -n "word 1: "
read word1
echo -n "word 2: "
read word2
echo -n "word 3: "
read word3

if [ "$word1" = "$word2" -a "$word2" = "$word3" ]
    then
        echo "Match: words 1, 2, & 3"
    elif [ "$word1" = "$word2" ]
    then
        echo "Match: words 1 & 2"
    elif [ "$word1" = "$word3" ]
    then
        echo "Match: words 1 & 3"
    elif [ "$word2" = "$word3" ]
    then
        echo "Match: words 2 & 3"
    else
        echo "No match"
fi

$ ./if3
word 1: apple
word 2: orange
word 3: pear
No match
$ ./if3
word 1: apple
word 2: orange
word 3: apple
Match: words 1 & 3
$ ./if3
word 1: apple
word 2: apple
word 3: apple
Match: words 1, 2, & 3

If the three words are not the same, the structure passes control to the first elif, which begins a series of tests to see if any pair of words is the same. As the nesting continues, if any one of the if statements is satisfied, the structure passes control to the next then statement and subsequently to the statement following fi. Each time an elif statement is not satisfied, the structure passes control to the next elif statement. The double quotation marks around the arguments to echo that contain ampersands (&) prevent the shell from interpreting the ampersands as special characters.
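The same chain of comparisons can be sketched as a function that takes the three words as arguments, which makes it easy to try without typing at prompts. The name match3 is hypothetical. This version also joins two bracketed tests with the shell's && operator instead of passing –a to test; POSIX marks –a as obsolescent, so this form is a commonly recommended alternative, although the original works in bash as written.

```shell
match3() {    # hypothetical function form of if3: compare three words
    if [ "$1" = "$2" ] && [ "$2" = "$3" ]
        then
            echo "Match: words 1, 2, & 3"
        elif [ "$1" = "$2" ]
        then
            echo "Match: words 1 & 2"
        elif [ "$1" = "$3" ]
        then
            echo "Match: words 1 & 3"
        elif [ "$2" = "$3" ]
        then
            echo "Match: words 2 & 3"
        else
            echo "No match"
    fi
}

match3 apple orange pear
match3 apple orange apple
match3 apple apple apple
```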

optional

The lnks Script

The following script, named lnks, demonstrates the if...then and if...then...elif control structures. This script finds hard links to its first argument, a filename. If you provide the name of a directory as the second argument, lnks searches for links in the directory hierarchy rooted at that directory. If you do not specify a directory, lnks searches the working directory and its subdirectories. This script does not locate symbolic links.

$ cat lnks
#!/bin/bash
# Identify links to a file
# Usage: lnks file [directory]

if [ $# -eq 0 -o $# -gt 2 ]; then
    echo "Usage: lnks file [directory]" 1>&2
    exit 1
fi
if [ -d "$1" ]; then
    echo "First argument cannot be a directory." 1>&2
    echo "Usage: lnks file [directory]" 1>&2
    exit 1
else
    file="$1"
fi

if [ $# -eq 1 ]; then
        directory="."
    elif [ -d "$2" ]; then
        directory="$2"
    else
        echo "Optional second argument must be a directory." 1>&2
        echo "Usage: lnks file [directory]" 1>&2
        exit 1
fi

# Check that file exists and is an ordinary file
if [ ! -f "$file" ]; then
    echo "lnks: $file not found or special file" 1>&2
    exit 1
fi

# Check link count on file
set -- $(ls -l "$file")

linkcnt=$2
if [ "$linkcnt" -eq 1 ]; then
    echo "lnks: no other hard links to $file" 1>&2
    exit 0
fi

# Get the inode of the given file
set $(ls -i "$file")

inode=$1

# Find and print the files with that inode number
echo "lnks: using find to search for links..." 1>&2
find "$directory" -xdev -inum $inode -print

Max has a file named letter in his home directory. He wants to find links to this file

in his and other users’ home directory file trees. In the following example, Max calls

lnks from his home directory to perform the search. The second argument to lnks,

/home, is the pathname of the directory where he wants to start the search. If you

are running Mac OS X, substitute /Users for /home. The lnks script reports that

/home/max/letter and /home/zach/draft are links to the same file:

$ ./lnks letter /home

lnks: using find to search for links...

/home/max/letter

/home/zach/draft

In addition to the if...then...elif control structure, lnks introduces other features that are commonly used in shell programs. The following discussion describes lnks section by section.

Specify the shell

The first line of the lnks script uses #! (page 280) to specify the shell that will execute the script:

#!/bin/bash


In this chapter, the #! notation appears only in more complex examples. It ensures

that the proper shell executes the script, even when the user is running a different

shell or the script is called from a script running a different shell.

Comments

The second and third lines of lnks are comments; the shell ignores text that follows a pound sign up to the next NEWLINE character. These comments in lnks briefly identify what the file does and explain how to use it:

# Identify links to a file

# Usage: lnks file [directory]

Usage messages

The first if statement tests whether lnks was called with zero arguments or more than two arguments:

if [ $# -eq 0 -o $# -gt 2 ]; then
    echo "Usage: lnks file [directory]" 1>&2
    exit 1
fi

If either of these conditions is true, lnks sends a usage message to standard error

and exits with a status of 1. The double quotation marks around the usage message

prevent the shell from interpreting the brackets as special characters. The brackets

in the usage message indicate that the directory argument is optional.

The second if statement tests whether the first command-line argument ($1) is a directory (the –d argument to test returns true if the file exists and is a directory):

if [ -d "$1" ]; then
    echo "First argument cannot be a directory." 1>&2
    echo "Usage: lnks file [directory]" 1>&2
    exit 1
else
    file="$1"
fi

If the first argument is a directory, lnks displays a usage message and exits. If it is not a directory, lnks saves the value of $1 in the file variable because later in the script set resets the command-line arguments. If the value of $1 is not saved before the set command is issued, its value is lost.

Test the arguments

The next section of lnks is an if...then...elif statement:

if [ $# -eq 1 ]; then
        directory="."
    elif [ -d "$2" ]; then
        directory="$2"
    else
        echo "Optional second argument must be a directory." 1>&2
        echo "Usage: lnks file [directory]" 1>&2
        exit 1
fi


The first test-command determines whether the user specified a single argument on

the command line. If the test-command returns 0 (true), the directory variable is

assigned the value of the working directory (.). If the test-command returns false, the

elif statement tests whether the second argument is a directory. If it is a directory, the

directory variable is set equal to the second command-line argument, $2. If $2 is not

a directory, lnks sends a usage message to standard error and exits with a status of 1.

The next if statement in lnks tests whether $file does not exist or is not an ordinary file. This test keeps lnks from wasting time looking for links to a nonexistent file. The test builtin, when called with the three arguments !, –f, and $file, evaluates to true if the file $file does not exist or is not an ordinary file:

[ ! -f "$file" ]

The ! operator preceding the –f argument to test negates its result, yielding false if the file $file does exist and is an ordinary file.

Next lnks uses set and ls –l to check the number of links $file has:

# Check link count on file
set -- $(ls -l "$file")

linkcnt=$2
if [ "$linkcnt" -eq 1 ]; then
    echo "lnks: no other hard links to $file" 1>&2
    exit 0
fi

The set builtin uses command substitution (page 340) to set the positional parameters to the output of ls –l. The second field in this output is the link count, so the user-created variable linkcnt is set equal to $2. The –– used with set prevents set from interpreting as an option the first argument produced by ls –l (the first argument is the access permissions for the file and typically begins with –). The if statement checks whether $linkcnt is equal to 1; if it is, lnks displays a message and exits. Although this message is not truly an error message, it is redirected to standard error. The way lnks has been written, all informational messages are sent to standard error. Only the final product of lnks (the pathnames of links to the specified file) is sent to standard output, so you can redirect the output as you please.
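To see how set –– splits a line into positional parameters, the following sketch feeds it a canned line in ls –l format instead of running ls. The function name link_count and the sample line are hypothetical; a real script would use $(ls -l "$file") as lnks does.

```shell
link_count() (    # ( ) body: set -- changes only this subshell's parameters
    # Canned line in ls -l format for a hypothetical file named letter.
    listing="-rw-r--r-- 2 max max 1024 Jan  5 10:15 letter"
    # Unquoted on purpose: the shell splits $listing into words. Without --,
    # set would try to interpret -rw-r--r-- as a string of options.
    set -- $listing
    echo "$2"     # second field: the link count
)
link_count
```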

If the link count is greater than 1, lnks goes on to identify the inode (page 959) for $file. As explained on page 107, comparing the inodes associated with filenames is a good way to determine whether the filenames are links to the same file. (On an HFS+ filesystem, the underlying data structures are different, but the system behaves comparably.) The lnks script uses set to set the positional parameters to the output of ls –i. The first argument to set is the inode number for the file, so the user-created variable named inode is assigned the value of $1:

# Get the inode of the given file

set $(ls -i "$file")

inode=$1


Finally lnks uses the find utility (page 688) to search for files having inode numbers that match $inode:

# Find and print the files with that inode number

echo "lnks: using find to search for links..." 1>&2

find "$directory" -xdev -inum $inode -print

The find utility searches the directory hierarchy rooted at the directory specified by its first argument ($directory) for files that meet the criteria specified by the remaining arguments. In this example, the remaining arguments send the names of files having inodes matching $inode to standard output. Because files in different filesystems can have the same inode number yet not be linked, find must search only directories in the same filesystem as $directory. The –xdev (cross-device) argument prevents find from searching directories on other filesystems. Refer to page 104 for more information about filesystems and links.

The echo command preceding the find command in lnks, which tells the user that find is running, is included because find can take a long time to run. Because lnks does not include a final exit statement, the exit status of lnks is that of the last command it runs, find.
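The core of lnks (obtain the inode number with ls –i, then search with find –inum) can be tried safely in a scratch directory. In this sketch the function name links_demo and the directory layout are hypothetical, chosen to echo the max and zach example; the output is sorted only to make the order predictable.

```shell
links_demo() (    # ( ) body: cd and set affect only this subshell
    scratch=$(mktemp -d) || exit 1
    mkdir "$scratch/max" "$scratch/zach"
    echo "Dear Zach" > "$scratch/max/letter"
    ln "$scratch/max/letter" "$scratch/zach/draft"    # a second hard link
    echo "unrelated" > "$scratch/zach/memo"           # a different file

    set -- $(ls -i "$scratch/max/letter")    # $1 is now the inode number
    inode=$1

    # Search from inside the scratch directory so pathnames are relative.
    cd "$scratch" || exit 1
    find . -xdev -inum "$inode" -print | sort
    cd / && rm -r "$scratch"
)
links_demo
```

Both links to the file are listed; memo, which has its own inode, is not.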

Debugging Shell Scripts

When you are writing a script such as lnks, it is easy to make mistakes. You can use

the shell’s –x option to help debug a script. This option causes the shell to display

each command before it runs the command. Tracing a script’s execution in this way

can give you information about where a problem lies.

You can run lnks as in the previous example and cause the shell to display each

command before it is executed. Either set the –x option for the current shell (set –x)

so all scripts display commands as they are run or use the –x option to affect only

the shell running the script called by the command line.

$ bash -x lnks letter /home

+ '[' 2 -eq 0 -o 2 -gt 2 ']'

+ '[' -d letter ']'

+ file=letter

+ '[' 2 -eq 1 ']'

+ '[' -d /home ']'

+ directory=/home

+ '[' '!' -f letter ']'

...

PS4

Each command the script executes is preceded by the value of the PS4 variable (a plus sign [+] by default), so you can distinguish debugging output from script-produced output. You must export PS4 if you set it in the shell that calls the script. The next command sets PS4 to >>>> followed by a SPACE and exports it:

$ export PS4='>>>> '


You can also set the –x option of the shell running the script by putting the following set command near the beginning of the script:

set -x

Put set –x anywhere in the script you want to turn debugging on. Turn the debugging

option off with a plus sign:

set +x

The set –o xtrace and set +o xtrace commands do the same things as set –x and set +x,

respectively.
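As a small sketch of turning tracing on and off around one section of code, the hypothetical function trace_demo below sets PS4, traces a single test, and then stops tracing. The trace lines go to standard error, so they stay separate from the function's normal output; their exact appearance (for example, how the brackets are quoted) depends on the shell.

```shell
trace_demo() (    # ( ) body: the option and PS4 changes end with the subshell
    PS4='>>>> '
    count=3
    set -x                    # start tracing
    [ "$count" -gt 2 ]
    set +x                    # stop tracing (this command itself is traced)
    echo "tracing is off again"
)
trace_demo
```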

for...in

The for...in control structure (tcsh uses foreach) has the following syntax:

for loop-index in argument-list
do
    commands
done

Figure 10-4 A for...in flowchart (assign the next argument in argument-list to loop-index; run the commands; if another argument remains in argument-list, repeat; otherwise continue after done)


The for...in structure (Figure 10-4) assigns the value of the first argument in the argument-list to the loop-index and executes the commands between the do and done statements. The do and done statements mark the beginning and end of the for loop.

After it passes control to the done statement, the structure assigns the value of the

second argument in the argument-list to the loop-index and repeats the commands.

It then repeats the commands between the do and done statements one time for

each argument in the argument-list. When the structure exhausts the argument-list,

it passes control to the statement following done.

The following for...in structure assigns apples to the user-created variable fruit and

then displays the value of fruit, which is apples. Next the structure assigns oranges

to fruit and repeats the process. When it exhausts the argument list, the structure

transfers control to the statement following done, which displays a message.

$ cat fruit
for fruit in apples oranges pears bananas
do
    echo "$fruit"
done
echo "Task complete."

$ ./fruit
apples
oranges
pears
bananas
Task complete.

The next script lists the names of the directory files in the working directory by looping through the files in the working directory and using test to determine which are directory files:

$ cat dirfiles
for i in *
do
    if [ -d "$i" ]
        then
            echo "$i"
    fi
done

The ambiguous file reference character * matches the names of all files (except hidden files) in the working directory. Prior to executing the for loop, the shell expands the * and uses the resulting list to assign successive values to the index variable i.
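One detail worth trying is what happens when * matches nothing. The sketch below, a hypothetical function named count_dirs that counts rather than lists subdirectories, runs against a populated scratch directory and an empty one. The directory names are made up for the demonstration.

```shell
count_dirs() (    # hypothetical helper: count the subdirectories of directory $1
    cd "$1" || exit 1
    n=0
    for i in *
    do
        # If nothing matches, bash leaves * unexpanded; -d then fails on the
        # literal name "*", so an empty directory is counted as 0.
        if [ -d "$i" ]
            then
                n=$((n + 1))
        fi
    done
    echo "$n"
)

scratch=$(mktemp -d) || exit 1
mkdir "$scratch/docs" "$scratch/src" "$scratch/empty"
touch "$scratch/notes"
count_dirs "$scratch"          # three directories, one ordinary file
count_dirs "$scratch/empty"    # nothing for * to match
rm -r "$scratch"
```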

for

The for control structure (not available in tcsh) has the following syntax:

for loop-index
do
    commands
done


In the for structure, the loop-index takes on the value of each of the command-line arguments, one at a time. The for structure is the same as the for...in structure (Figure 10-4) except in terms of where it gets values for the loop-index. The for structure performs a sequence of commands, usually involving each argument in turn.

The following shell script shows a for structure displaying each command-line argument. The first line of the script, for arg, implies for arg in "$@", where the shell expands "$@" into a list of quoted command-line arguments "$1" "$2" "$3" and so on. The balance of the script corresponds to the for...in structure.

$ cat for_test
for arg
do
    echo "$arg"
done

$ ./for_test candy gum chocolate
candy
gum
chocolate

optional

The whos Script

The following script, named whos, demonstrates the usefulness of the implied "$@" in the for structure. You give whos one or more users’ full names or usernames as arguments, and whos displays information about the users. The whos script gets the information it displays from the first and fifth fields in the /etc/passwd file. The first field contains a username, and the fifth field typically contains the user’s full name. You can provide a username as an argument to whos to identify the user’s name or provide a name as an argument to identify the username. The whos script is similar to the finger utility, although whos delivers less information. Mac OS X uses Open Directory in place of the passwd file; see page 926 for a similar script that runs under OS X.

$ cat whos
#!/bin/bash

if [ $# -eq 0 ]
    then
        echo "Usage: whos id..." 1>&2
        exit 1
fi
for id
do
    gawk -F: '{print $1, $5}' /etc/passwd |
    grep -i "$id"
done

In the next example, whos identifies the user whose username is chas and the user

whose name is Marilou Smith:

$ ./whos chas "Marilou Smith"

chas Charles Casey

msmith Marilou Smith


Use of "$@"

The whos script uses a for statement to loop through the command-line arguments. In this script the implied use of "$@" in the for loop is particularly beneficial because it causes the for loop to treat an argument that contains a SPACE as a single argument. This example encloses Marilou Smith in quotation marks, which causes the shell to pass it to the script as a single argument. Then the implied "$@" in the for statement causes the shell to regenerate the quoted argument Marilou Smith so that it is again treated as a single argument.
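The effect of the implied "$@" can be checked by counting loop iterations. In this sketch the two function names are hypothetical; the first uses the bare for form and the second loops over an unquoted $*, which the shell splits at whitespace.

```shell
count_quoted() {
    n=0
    for arg                 # same as: for arg in "$@"
    do
        n=$((n + 1))
    done
    echo "$n"
}

count_unquoted() {
    n=0
    for arg in $*           # unquoted: arguments are split at whitespace
    do
        n=$((n + 1))
    done
    echo "$n"
}

count_quoted chas "Marilou Smith"      # Marilou Smith stays one argument
count_unquoted chas "Marilou Smith"    # Marilou and Smith become two
```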

gawk

For each command-line argument, whos searches the /etc/passwd file. Inside the for loop, the gawk utility (Chapter 12; awk and mawk will work the same way) extracts the first ($1) and fifth ($5) fields from each line in /etc/passwd. The –F: option causes gawk to use a colon (:) as a field separator when it reads /etc/passwd, allowing it to break each line into fields. The gawk command sets and uses the $1 and $5 arguments; they are included within single quotation marks and are not interpreted by the shell. Do not confuse these arguments with positional parameters, which correspond to command-line arguments. The first and fifth fields are sent to grep (page 719) via a pipe. The grep utility searches for $id (to which the shell has assigned the value of a command-line argument) in its input. The –i option causes grep to ignore case as it searches; grep displays each line in its input that contains $id.
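Because /etc/passwd differs from system to system, the following sketch runs the same pipeline against a small sample file in passwd format. The function name whos_sketch, the extra first argument naming the file, and the sample entries are assumptions made for illustration; awk is used in place of gawk, which the text notes works the same way here.

```shell
whos_sketch() {    # usage: whos_sketch passwd-format-file id...
    pwfile=$1
    shift                   # the remaining arguments are the ids to look up
    for id
    do
        awk -F: '{print $1, $5}' "$pwfile" |
        grep -i -- "$id"
    done
}

sample=$(mktemp) || exit 1
cat > "$sample" <<'EOF'
chas:x:1001:1001:Charles Casey:/home/chas:/bin/bash
msmith:x:1002:1002:Marilou Smith:/home/msmith:/bin/bash
zach:x:1003:1003:Zach Brill:/home/zach:/bin/bash
EOF
whos_sketch "$sample" chas "marilou smith"
rm "$sample"
```

The second lookup is typed in lowercase to show the effect of the –i option.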

| at the end of a line

An interesting syntactical exception that bash makes for the pipe symbol (|) appears on the line with the gawk command: You do not have to quote a NEWLINE that immediately follows a pipe symbol (that is, a pipe symbol that is the last character on a line) to keep the NEWLINE from executing a command. Try giving the command who | and pressing RETURN. The shell (not tcsh) displays a secondary prompt. If you then enter sort followed by another RETURN, you see a sorted who list. The pipe works even though a NEWLINE follows the pipe symbol.
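In a script, this rule lets a long pipeline be laid out one stage per line without backslashes. A minimal sketch, with a hypothetical function name and made-up input words:

```shell
first_two_sorted() {
    printf '%s\n' pear apple orange |
    sort |
    head -n 2
}
first_two_sorted
```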

while

The while control structure (available in tcsh with a slightly different syntax) has the following syntax:

while test-command
do
    commands
done

As long as the test-command (Figure 10-5) returns a true exit status, the while structure continues to execute the series of commands delimited by the do and done statements. Before each loop through the commands, the structure executes the test-command. When the exit status of the test-command is false, the structure passes control to the statement after the done statement.

test builtin

The following shell script first initializes the number variable to zero. The test builtin then determines whether number is less than 10. The script uses test with the –lt argument to perform a numerical test. For numerical comparisons, you must use –ne (not equal), –eq (equal), –gt (greater than), –ge (greater than or equal to), –lt (less than), or –le (less than or equal to). For string comparisons, use = (equal) or != (not equal) when you are working with test. In this example, test has an exit status of 0 (true) as long as number is less than 10. As long as test returns true, the structure executes the commands between the do and done statements. See page 854 for information on the test utility, which is very similar to the test builtin.

$ cat count
#!/bin/bash
number=0
while [ "$number" -lt 10 ]
do
    echo -n "$number"
    ((number +=1))
done
echo

$ ./count
0123456789

The echo command following do displays number. The –n prevents echo from issuing a NEWLINE following its output. The next command uses arithmetic evaluation [((...)); page 460] to increment the value of number by 1. The done statement terminates the loop and returns control to the while statement to start the loop over again. The final echo causes count to send a NEWLINE character to standard output, so the next prompt occurs in the leftmost column on the display (rather than immediately following 9).
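The same loop can be sketched as a function whose upper limit is an argument. The name count_to is hypothetical. This variant uses printf and the $((...)) arithmetic expansion in place of echo –n and ((...)), an assumption made so the sketch also runs in shells other than bash.

```shell
count_to() {    # print the integers from 0 up to, but not including, $1
    number=0
    while [ "$number" -lt "$1" ]
    do
        printf '%s' "$number"
        number=$((number + 1))
    done
    echo        # finish the line with a NEWLINE
}
count_to 10
```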

optional

The spell_check Script

The aspell utility checks the words in a file against a dictionary of correctly spelled words. With the list command, aspell (Linux only) runs in list mode: Input comes from standard input and aspell sends each potentially misspelled word to standard output. The following command produces a list of possible misspellings in the file letter.txt:

$ aspell list < letter.txt

quikly

portible

frendly

Figure 10-5 A while flowchart (test-command: True leads to the commands and back to the test; False leads to the statement after done)


The next shell script, named spell_check, shows another use of a while structure. To find the incorrect spellings in a file, spell_check calls aspell to check a file against a system dictionary. But it goes a step further: It enables you to specify a list of correctly spelled words and removes these words from the output of aspell. This script is useful for removing words that you use frequently, such as names and technical terms, that do not appear in a standard dictionary. Although you can duplicate the functionality of spell_check by using additional aspell dictionaries, the script is included here for its instructive value.

The spell_check script requires two filename arguments: the file containing the list of correctly spelled words and the file you want to check. The first if statement verifies that the user specified two arguments. The next two if statements verify that both arguments are readable files. (The exclamation point negates the sense of the following operator; the –r operator causes test to determine whether a file is readable. The result is a test that determines whether a file is not readable.)

$ cat spell_check
#!/bin/bash
# remove correct spellings from aspell output

if [ $# -ne 2 ]
    then
        echo "Usage: spell_check file1 file2" 1>&2
        echo "file1: list of correct spellings" 1>&2
        echo "file2: file to be checked" 1>&2
        exit 1
fi

if [ ! -r "$1" ]
    then
        echo "spell_check: $1 is not readable" 1>&2
        exit 1
fi

if [ ! -r "$2" ]
    then
        echo "spell_check: $2 is not readable" 1>&2
        exit 1
fi

aspell list < "$2" |
while read line
do
    if ! grep "^$line$" "$1" > /dev/null
        then
            echo $line
    fi
done

The spell_check script sends the output from aspell (with the list argument, so it produces a list of misspelled words on standard output) through a pipe to standard input of a while structure, which reads one line at a time (each line has one word on it) from standard input. The test-command (that is, read line) returns a true exit status as long as it receives a line from standard input.

Inside the while loop, an if statement (see footnote 1) monitors the return value of grep, which determines whether the line that was read is in the user’s list of correctly spelled words. The pattern grep searches for (the value of $line) is preceded and followed by special characters that specify the beginning and end of a line (^ and $, respectively). These special characters ensure that grep finds a match only if the $line variable matches an entire line in the file of correctly spelled words. (Otherwise, grep would match a string, such as paul, in the output of aspell if the file of correctly spelled words contained the word paulson.) These special characters, together with the value of the $line variable, form a regular expression (Appendix A).

The output of grep is redirected to /dev/null (page 131) because the output is not needed; only the exit code is important. The if statement checks the negated exit status of grep. The leading exclamation point negates, or changes the sense of, the exit status: true becomes false, and vice versa. When grep finds a matching line, its exit status is 0 (true), which the exclamation point turns into false, so the word is not displayed. When grep does not find a match, its exit status is nonzero (false), which becomes true when negated: the word was not in the file of correctly spelled words. The echo builtin sends a list of words that are not in the file of correctly spelled words to standard output.

Once it detects the EOF (end of file), the read builtin returns a false exit status, control passes out of the while structure, and the script terminates.
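The filtering loop can be tried without aspell by piping a fixed list of candidate words into it. In this sketch the function name filter_known is hypothetical, and grep's –F (fixed string), –x (match the whole line), and –q (no output) options are swapped in for the ^...$ pattern and the /dev/null redirection, so characters in a word are never interpreted as regular expression syntax.

```shell
filter_known() {    # usage: filter_known word-list-file   (candidates on stdin)
    while read -r line
    do
        # grep reads the word list named by $1, not the loop's standard input.
        if ! grep -Fxq -- "$line" "$1"
            then
                echo "$line"
        fi
    done
}

words=$(mktemp) || exit 1
printf '%s\n' Blinkenship Klimowski > "$words"
printf '%s\n' Blinkenship Klimowski targat hte | filter_known "$words"
rm "$words"
```

Only targat and hte pass through, matching the spell_check behavior described above.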

Before you use spell_check, create a file of correct spellings containing words you

use frequently but that are not in a standard dictionary. For example, if you work

for a company named Blinkenship and Klimowski, Attorneys, you would put

Blinkenship and Klimowski in the file. The following example shows how

spell_check checks the spelling in a file named memo and removes Blinkenship and

Klimowski from the output list of incorrectly spelled words:

$ aspell list < memo

Blinkenship

Klimowski

targat

hte

$ cat word_list

Blinkenship

Klimowski

$ ./spell_check word_list memo

targat

hte

Refer to page 607 for more information on aspell (Linux only).

1. This if statement can also be written as

if ! grep -qw "$line" "$1"

The –q option suppresses the output from grep so it returns only an exit code. The –w option causes grep to match only a whole word.


until

The until (not available in tcsh) and while (available in tcsh with a slightly different syntax) structures are very similar, differing only in the sense of the test performed at the top of the loop. Figure 10-6 shows that until continues to loop until the test-command returns a true exit status. The while structure loops while the test-command continues to return a true or nonerror condition. The until control structure has the following syntax:

until test-command
do
    commands
done

The following script demonstrates an until structure that includes read. When the user enters the correct string of characters, the test-command is satisfied and the structure passes control out of the loop.

$ cat until1
secretname=zach
name=noname
echo "Try to guess the secret name!"
echo
until [ "$name" = "$secretname" ]
do
    echo -n "Your guess: "
    read name
done
echo "Very good."

$ ./until1
Try to guess the secret name!

Your guess: helen
Your guess: barbara
Your guess: rachael
Your guess: zach
Very good.
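When the guesses come from a file or a pipe rather than the keyboard, read fails at end of file and leaves name empty, so a loop like the one in until1 would spin forever if the secret never appears. The following sketch, a hypothetical function named guess_name with an added guess counter, leaves the loop when input runs out.

```shell
guess_name() {    # usage: guess_name secret   (guesses are read from stdin)
    secretname=$1
    name=noname
    tries=0
    until [ "$name" = "$secretname" ]
    do
        # Give up if input runs out; otherwise the loop would never end.
        read -r name || return 1
        tries=$((tries + 1))
    done
    echo "Very good. Guesses: $tries"
}

printf '%s\n' helen barbara rachael zach | guess_name zach
```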

Figure 10-6 An until flowchart (test-command: False leads to the commands and back to the test; True leads to the statement after done)


The following locktty script is similar to the lock command on Berkeley UNIX and the Lock Screen menu selection in GNOME. The script prompts for a key (password) and uses an until control structure to lock the terminal. The until statement causes the system to ignore any characters typed at the keyboard until the user types the key followed by a RETURN on a line by itself, which unlocks the terminal. The locktty script can keep people from using your terminal while you are away from it for short periods of time. It saves you from having to log out if you are concerned about other users using your login.

$ cat locktty

#! /bin/bash

trap '' 1 2 3 18

stty -echo

echo -n "Key: "

read key_1

echo

echo -n "Again: "

read key_2

echo

key_3=

if [ "$key_1" = "$key_2" ]
    then
        tput clear
        until [ "$key_3" = "$key_2" ]
        do
            read key_3
        done
    else
        echo "locktty: keys do not match" 1>&2
fi
stty echo

trap builtin

The trap builtin (page 453; not available in tcsh) at the beginning of the locktty script stops a user from being able to terminate the script by sending it a signal (for example, by pressing the interrupt key). Trapping signal 18 means that no one can use CONTROL-Z (job control, a stop from a tty) to defeat the lock. Table 10-5 on page 453 provides a list of signals. The stty -echo command (page 838) causes the terminal not to display characters typed at the keyboard, preventing the key the user enters from appearing on the screen. After turning off keyboard echo, the script prompts the user for a key, reads it into the user-created variable key_1, prompts the user to enter the same key again, and saves it in key_2. The statement key_3= creates a variable with a NULL value. If key_1 and key_2 match, locktty clears the screen (with the tput command) and starts an until loop. The until loop keeps attempting to read from the terminal and assigning the input to the key_3 variable. Once the user types a string that matches one of the original keys (key_2), the until loop terminates and keyboard echo is turned on again.

Forget your password for locktty?

tip If you forget your key (password), you will need to log in from another (virtual) terminal and kill the process running locktty.

break and continue

You can interrupt a for, while, or until loop by using a break or continue statement. The break statement transfers control to the statement after the done statement, thereby terminating execution of the loop. The continue statement transfers control to the done statement, continuing execution of the loop.

The following script demonstrates the use of these two statements. The for...in structure loops through the values 1–10. The first if statement executes its commands when the value of the index is less than or equal to 3 ($index -le 3). The second if statement executes its commands when the value of the index is greater than or equal to 8 ($index -ge 8). In between the two ifs, echo displays the value of the index. For all values up to and including 3, the first if statement displays continue, executes a continue statement that skips echo $index and the second if statement, and continues with the next for statement. For the value of 8, the second if statement displays break and executes a break statement that exits from the for loop.

$ cat brk
for index in 1 2 3 4 5 6 7 8 9 10
do
    if [ $index -le 3 ] ; then
        echo "continue"
        continue
    fi

    echo $index

    if [ $index -ge 8 ] ; then
        echo "break"
        break
    fi
done

$ ./brk
continue
continue
continue
4
5
6
7
8
break
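The same two statements work in a while loop. The following sketch is not from the original text; the function name sum_odds is hypothetical. It uses continue to skip even numbers and break to leave an otherwise endless loop.

```bash
#!/bin/bash
# Hypothetical helper: add up the odd integers from 1 through $1.
sum_odds() {
    local limit=$1 i=0 sum=0
    while true
    do
        i=$((i + 1))
        if [ "$i" -gt "$limit" ]; then
            break               # leave the loop entirely
        fi
        if [ $((i % 2)) -eq 0 ]; then
            continue            # skip the rest of this iteration
        fi
        sum=$((sum + i))
    done
    echo "$sum"
}

sum_odds 7      # 1 + 3 + 5 + 7
```

Called with an argument of 7, the function displays 16.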


case

The case structure (Figure 10-7, next page) is a multiple-branch decision mechanism. The path taken through the structure depends on a match or lack of a match between the test-string and one of the patterns. The case control structure (tcsh uses switch) has the following syntax:

case test-string in

pattern-1)

commands-1

;;

pattern-2)

commands-2

;;

pattern-3)

commands-3

;;

. . .

esac

The following case structure examines the character the user enters as the test-string. This value is held in the variable letter. If the test-string has a value of A, the structure executes the command following the pattern A. The right parenthesis is part of the case control structure, not part of the pattern. If the test-string has a value of B or C, the structure executes the command following the matching pattern. The asterisk (*) indicates any string of characters and serves as a catchall in case there is no match. If no pattern matches the test-string and if there is no catchall (*) pattern, control passes to the command following the esac statement, without the case structure taking any action.

$ cat case1

echo -n "Enter A, B, or C: "

read letter

case "$letter" in
    A)
        echo "You entered A"
        ;;
    B)
        echo "You entered B"
        ;;
    C)
        echo "You entered C"
        ;;
    *)
        echo "You did not enter A, B, or C"
        ;;
esac

$ ./case1

Enter A, B, or C: B

You entered B


The next execution of case1 shows the user entering a lowercase b. Because the test-string b does not match the uppercase B pattern (or any other pattern in the case statement), the program executes the commands following the catchall pattern and displays a message:

$ ./case1

Enter A, B, or C: b

You did not enter A, B, or C

The pattern in the case structure is analogous to an ambiguous file reference. It can

include any special characters and strings shown in Table 10-2.

The next script accepts both uppercase and lowercase letters:

Figure 10-7 A case flowchart


$ cat case2

echo -n "Enter A, B, or C: "

read letter

case "$letter" in
    a|A)
        echo "You entered A"
        ;;
    b|B)
        echo "You entered B"
        ;;
    c|C)
        echo "You entered C"
        ;;
    *)
        echo "You did not enter A, B, or C"
        ;;
esac

$ ./case2

Enter A, B, or C: b

You entered B
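The other pattern characters listed in Table 10-2 can be combined in one case structure. The following is a minimal sketch, not taken from the original; the function name classify and its messages are hypothetical. The shell tests the patterns from top to bottom and runs only the first branch that matches.

```bash
#!/bin/bash
# Hypothetical helper: report which kind of pattern matched the argument.
classify() {
    case "$1" in
        [0-9])          # character class with a range
            echo "digit"
            ;;
        [a-zA-Z])       # two ranges in one class
            echo "letter"
            ;;
        ??)             # exactly two characters of any kind
            echo "two characters"
            ;;
        yes|YES)        # alternatives separated by |
            echo "affirmative"
            ;;
        *)              # catchall
            echo "other"
            ;;
    esac
}

classify 7
classify q
classify ab
classify yes
classify hello
```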

Table 10-2 Patterns

Pattern    Function
*          Matches any string of characters. Use for the default case.
?          Matches any single character.
[...]      Defines a character class. Any characters enclosed within brackets are tried, one at a time, in an attempt to match a single character. A hyphen between two characters specifies a range of characters.
|          Separates alternative choices that satisfy a particular branch of the case structure.

optional

The following example shows how you can use the case structure to create a simple menu. The command_menu script uses echo to present menu items and prompt the user for a selection. (The select control structure [page 427] is a much easier way of coding a menu.) The case structure then executes the appropriate utility depending on the user’s selection.

$ cat command_menu
#!/bin/bash
# menu interface to simple commands

echo -e "\n COMMAND MENU\n"
echo " a. Current date and time"
echo " b. Users currently logged in"
echo " c. Name of the working directory"
echo -e " d. Contents of the working directory\n"
echo -n "Enter a, b, c, or d: "
read answer
echo

case "$answer" in
    a)
        date
        ;;
    b)
        who
        ;;
    c)
        pwd
        ;;
    d)
        ls
        ;;
    *)
        echo "There is no selection: $answer"
        ;;
esac

$ ./command_menu

COMMAND MENU

a. Current date and time

b. Users currently logged in

c. Name of the working directory

d. Contents of the working directory

Enter a, b, c, or d: a

Wed Jan 3 12:31:12 PST 2009

echo -e

The -e option causes echo to interpret \n as a NEWLINE character. If you do not include this option, echo does not output the extra blank lines that make the menu easy to read but instead outputs the (literal) two-character sequence \n. The -e option causes echo to interpret several other backslash-quoted characters (Table 10-3). Remember to quote (i.e., place double quotation marks around the string) the backslash-quoted character so the shell does not interpret it but passes the backslash and the character to echo. See xpg_echo (page 333) for a way to avoid using the -e option.

Table 10-3 Special characters in echo (must use -e)

Quoted character    echo displays
\a                  Alert (bell)
\b                  BACKSPACE
\c                  Suppress trailing NEWLINE
\f                  FORMFEED
\n                  NEWLINE
\r                  RETURN
\t                  Horizontal TAB
\v                  Vertical TAB
\\                  Backslash
\nnn                The character with the ASCII octal code nnn; if nnn is not valid, echo displays the string literally

You can also use the case control structure to take various actions in a script, depending on how many arguments the script is called with. The following script, named safedit, uses a case structure that branches based on the number of command-line arguments ($#). It saves a backup copy of a file you are editing with vim.

$ cat safedit
#!/bin/bash

PATH=/bin:/usr/bin
script=$(basename $0)

case $# in
    0)
        vim
        exit 0
        ;;
    1)
        if [ ! -f "$1" ]
            then
                vim "$1"
                exit 0
            fi
        if [ ! -r "$1" -o ! -w "$1" ]
            then
                echo "$script: check permissions on $1" 1>&2
                exit 1
            else
                editfile=$1
            fi
        if [ ! -w "." ]
            then
                echo "$script: backup cannot be " \
                    "created in the working directory" 1>&2
                exit 1
            fi
        ;;
    *)
        echo "Usage: $script [file-to-edit]" 1>&2
        exit 1
        ;;
esac

tempfile=/tmp/$$.$script
cp $editfile $tempfile
if vim $editfile
    then
        mv $tempfile bak.$(basename $editfile)
        echo "$script: backup file created"
    else
        mv $tempfile editerr
        echo "$script: edit error--copy of " \
            "original file is in editerr" 1>&2
fi

If you call safedit without any arguments, the case structure executes its first branch and calls vim without a filename argument. Because an existing file is not being edited, safedit does not create a backup file. (See the :w command on page 163 for an explanation of how to exit from vim when you have called it without a filename.) If you call safedit with one argument, it runs the commands in the second branch of the case structure and verifies that the file specified by $1 does not yet exist or is the name of a file for which the user has read and write permission. The safedit script also verifies that the user has write permission for the working directory. If the user calls safedit with more than one argument, the third branch of the case structure presents a usage message and exits with a status of 1.
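Branching on the number of arguments can be tried without an editor. The following sketch reduces the idea to a function; the name describe_args and its messages are hypothetical and are not part of safedit.

```bash
#!/bin/bash
# Hypothetical helper: branch on the argument count, as safedit does with $#.
describe_args() {
    case $# in
        0)
            echo "no arguments"
            ;;
        1)
            echo "one argument: $1"
            ;;
        *)
            echo "too many arguments ($#)" 1>&2
            return 1
            ;;
    esac
}

describe_args
describe_args memo
describe_args memo letter
```

Inside a function, $# counts the arguments passed to the function rather than those passed to the script, which is why the sketch can be exercised from a single file.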

Set PATH

In addition to using a case structure for branching based on the number of command-line arguments, the safedit script introduces several other features. At the beginning of the script, the PATH variable is set to search /bin and /usr/bin. Setting PATH in this way ensures that the commands executed by the script are standard utilities, which are kept in those directories. By setting this variable inside a script, you can avoid the problems that might occur if users have set PATH to search their own directories first and have scripts or programs with the same names as the utilities the script calls. You can also include absolute pathnames within a script to achieve this end, although this practice can make a script less portable.

Name of the program

The next line creates a variable named script and uses command substitution to assign the simple filename of the script to it:

script=$(basename $0)

The basename utility sends the simple filename component of its argument to standard output, which is assigned to the script variable, using command substitution. The $0 holds the command the script was called with (page 441). No matter which of the following commands the user calls the script with, the output of basename is the simple filename safedit:

$ /home/max/bin/safedit memo

$ ./safedit memo

$ safedit memo

After the script variable is set, it replaces the filename of the script in usage and error messages. By using a variable that is derived from the command that invoked the script rather than a filename that is hardcoded into the script, you can create links to the script or rename it, and the usage and error messages will still provide accurate information.
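The effect of basename on the three invocations can be checked directly. In this sketch the function name script_name is hypothetical; it stands in for the $(basename $0) assignment so the example does not depend on how the file itself is invoked.

```bash
#!/bin/bash
# Hypothetical helper: the simple filename of a command as it was typed.
script_name() {
    basename "$1"
}

for invoked_as in /home/max/bin/safedit ./safedit safedit
do
    script_name "$invoked_as"
done
```

Each of the three iterations displays safedit.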

Naming temporary files

Another feature of safedit relates to the use of the $$ parameter in the name of a

temporary file. The statement following the esac statement creates and assigns a

value to the tempfile variable. This variable contains the name of a temporary file

that is stored in the /tmp directory, as are many temporary files. The temporary file-

name begins with the PID number of the shell and ends with the name of the script.

Using the PID number ensures that the filename is unique. Thus safedit will not

attempt to overwrite an existing file, as might happen if two people were using

safedit at the same time. The name of the script is appended so that, should the file

be left in /tmp for some reason, you can figure out where it came from.

The PID number is used in front of, rather than after, $script in the filename because of the 14-character limit placed on filenames by some older versions of UNIX. Linux systems do not have this limitation. Because the PID number ensures the uniqueness of the filename, it is placed first so that it cannot be truncated. (If the $script component is truncated, the filename is still unique.) For the same reason, when a backup file is created inside the if control structure a few lines down in the script, the filename consists of the string bak. followed by the name of the file being edited. On an older system, if bak were used as a suffix rather than a prefix and the original filename were 14 characters long, .bak might be lost and the original file would be overwritten. The basename utility extracts the simple filename of $editfile before it is prefixed with bak.
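The naming scheme can be sketched on its own. The function name temp_name below is hypothetical; it only builds the name in the same PID-first order that safedit uses and does not create a file.

```bash
#!/bin/bash
# Hypothetical helper: build a temporary filename the way safedit does,
# with the PID of the shell ($$) first and the script name last.
temp_name() {
    echo "/tmp/$$.$1"
}

temp_name safedit       # for example /tmp/12345.safedit; the number varies
```

Because $$ expands to the PID of the shell running the script, two copies of the script running at the same time produce different names.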

The safedit script uses an unusual test-command in the if structure: vim $editfile. The test-command calls vim to edit $editfile. When you finish editing the file and exit from vim, vim returns an exit code. The if control structure uses that exit code to determine which branch to take. If the editing session completed successfully, vim returns 0 and the statements following the then statement are executed. If vim does not terminate normally (as would occur if the user killed [page 729] the vim process), vim returns a nonzero exit status and the script executes the statements following else.
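Any command can serve as the test-command of an if structure in this way. The following sketch substitutes the true and false utilities for vim so it can run unattended; the function name run_and_report is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: run the command given as arguments and branch on
# its exit status, as safedit does with vim.
run_and_report() {
    if "$@"
        then
            echo "succeeded"
        else
            # In the else branch $? still holds the status of the test-command
            echo "failed with status $?"
    fi
}

run_and_report true
run_and_report false
```

The first call displays succeeded; the second displays failed with status 1.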

select

The select control structure (not available in tcsh) is based on the one found in the Korn Shell. It displays a menu, assigns a value to a variable based on the user’s choice of items, and executes a series of commands. The select control structure has the following syntax:

select varname [in arg . . . ]
do
    commands
done

The select structure displays a menu of the arg items. If you omit the keyword in and the list of arguments, select uses the positional parameters in place of the arg items. The menu is formatted with numbers before each item. For example, a select structure that begins with

select fruit in apple banana blueberry kiwi orange watermelon STOP

displays the following menu:

1) apple 3) blueberry 5) orange 7) STOP

2) banana 4) kiwi 6) watermelon

The select structure uses the values of the LINES and COLUMNS variables to specify the size of the display. (LINES has a default value of 24; COLUMNS has a default value of 80.) With COLUMNS set to 20, the menu looks like this:

1) apple

2) banana

3) blueberry

4) kiwi

5) orange

6) watermelon

7) STOP

PS3

After displaying the menu, select displays the value of PS3, the select prompt. The default value of PS3 is #? followed by a SPACE, but it is typically set to a more meaningful value. When you enter a valid number (one in the menu range) in response to the PS3 prompt, select sets varname to the argument corresponding to the number you entered. An invalid entry causes the shell to set varname to null. Either way select stores your response in the keyword variable REPLY and then executes the commands between do and done. If you press RETURN without entering a choice, the shell redisplays the menu and the PS3 prompt.

The select structure continues to issue the PS3 prompt and execute the commands until something causes it to exit, typically a break or exit statement. A break statement exits from the loop and an exit statement exits from the script.

The following script illustrates the use of select:

$ cat fruit2

#!/bin/bash

PS3="Choose your favorite fruit from these possibilities: "

select FRUIT in apple banana blueberry kiwi orange watermelon STOP
do
    if [ "$FRUIT" == "" ]; then
        echo -e "Invalid entry.\n"
        continue
    elif [ $FRUIT = STOP ]; then
        echo "Thanks for playing!"
        break
    fi
echo "You chose $FRUIT as your favorite."
echo -e "That is choice number $REPLY.\n"
done


$ ./fruit2

1) apple 3) blueberry 5) orange 7) STOP

2) banana 4) kiwi 6) watermelon

Choose your favorite fruit from these possibilities: 3

You chose blueberry as your favorite.

That is choice number 3.

Choose your favorite fruit from these possibilities: 99

Invalid entry.

Choose your favorite fruit from these possibilities: 7

Thanks for playing!

After setting the PS3 prompt and establishing the menu with the select statement, fruit2 executes the commands between do and done. If the user submits an invalid entry, the shell sets varname ($FRUIT) to a null value. If $FRUIT is null, echo displays an error; continue then causes the shell to redisplay the PS3 prompt. If the entry is valid, the script tests whether the user wants to stop. If so, echo displays a message and break exits from the select structure (and from the script). If the user enters a valid response and does not want to stop, the script displays the name and number of the user’s response. (See page 424 for information about the echo -e option.)
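A select structure reads its choices from standard input, so it can be exercised from a pipe as well as from the keyboard. The following sketch assumes bash; the function name pick_fruit and the shortened item list are hypothetical. It combines select with a case structure in place of the if...elif tests in fruit2.

```bash
#!/bin/bash
# Hypothetical helper: a shortened fruit menu driven by select and case.
pick_fruit() {
    local PS3="Pick a fruit: " fruit
    select fruit in apple banana STOP
    do
        case "$fruit" in
            STOP)
                echo "stopping"
                break
                ;;
            "")             # invalid entry: select set the variable to null
                echo "invalid: $REPLY"
                ;;
            *)
                echo "chose $fruit"
                ;;
        esac
    done
}

# bash writes the menu and the PS3 prompt to standard error;
# discarding it leaves only the messages from the case structure.
printf '2\n9\n3\n' | pick_fruit 2>/dev/null
```

With the three piped responses, the function displays chose banana, invalid: 9, and stopping.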

Here Document

A Here document allows you to redirect input to a shell script from within the shell script itself. A Here document is so named because it is here (immediately accessible in the shell script) instead of there, perhaps in another file.

The following script, named birthday, contains a Here document. The two less than symbols (<<) in the first line indicate a Here document follows. One or more characters that delimit the Here document follow the less than symbols; this example uses a plus sign. Whereas the opening delimiter must appear adjacent to the less than symbols, the closing delimiter must be on a line by itself. The shell sends everything between the two delimiters to the process as standard input. In the example it is as though you have redirected standard input to grep from a file, except that the file is embedded in the shell script:

$ cat birthday

grep -i "$1" <<+
Max June 22
Barbara February 3
Darlene May 8
Helen March 13
Zach January 23
Nancy June 26
+

$ ./birthday Zach

Zach January 23

$ ./birthday june

Max June 22

Nancy June 26


When you run birthday, it lists all the Here document lines that contain the argument you called it with. In this case the first time birthday is run, it displays Zach’s birthday because it is called with an argument of Zach. The second run displays all the birthdays in June. The -i argument causes grep’s search not to be case sensitive.
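The same technique works inside a function. In this sketch the function name lookup_port and its three-line table are hypothetical; the closing delimiter EOF starts in the first column because the shell looks for it on a line by itself.

```bash
#!/bin/bash
# Hypothetical helper: look up a service name in a table that is
# embedded in the function as a Here document.
lookup_port() {
    grep -i "^$1 " <<EOF
ssh 22
smtp 25
http 80
EOF
}

lookup_port HTTP
lookup_port ssh
```

The first call displays http 80 even though the argument is uppercase, because of the -i option.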

optional

The next script, named bundle,2 includes a clever use of a Here document. The bundle script is an elegant example of a script that creates a shell archive (shar) file. The script creates a file that is itself a shell script containing several other files as well as the code to re-create the original files:

$ cat bundle

#!/bin/bash

# bundle: group files into distribution package

echo "# To unbundle, bash this file"

for i
do
    echo "echo $i 1>&2"
    echo "cat >$i <<'End of $i'"
    cat $i
    echo "End of $i"
done

Just as the shell does not treat special characters that occur in standard input of a shell script as special, so the shell does not treat the special characters that occur between the delimiters in a Here document as special when the opening delimiter is quoted, as it is in the cat commands that bundle writes (<<'End of $i').
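The effect of quoting the opening delimiter can be seen by comparing two Here documents that contain the same line. The function names show_expanded and show_literal in this sketch are hypothetical.

```bash
#!/bin/bash
name=world

# Unquoted delimiter: the shell expands $name inside the Here document.
show_expanded() {
    cat <<EOF
hello $name
EOF
}

# Quoted delimiter: the text is passed through unchanged.
show_literal() {
    cat <<'EOF'
hello $name
EOF
}

show_expanded
show_literal
```

The first function displays hello world; the second displays hello $name. The bundle script quotes its delimiters so that the archived files are reproduced exactly.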

As the following example shows, the output of bundle is a shell script, which is redirected to a file named bothfiles. It contains the contents of each file given as an argument to bundle (file1 and file2 in this case) inside a Here document. To extract the original files from bothfiles, you simply give it as an argument to a bash command. Before each Here document is a cat command that causes the Here document to be written to a new file when bothfiles is run:

$ cat file1

This is a file.

It contains two lines.

$ cat file2

This is another file.

It contains

three lines.

$ ./bundle file1 file2 > bothfiles

$ cat bothfiles
# To unbundle, bash this file
echo file1 1>&2
cat >file1 <<'End of file1'
This is a file.
It contains two lines.
End of file1
echo file2 1>&2
cat >file2 <<'End of file2'
This is another file.
It contains
three lines.
End of file2

2. Thanks to Brian W. Kernighan and Rob Pike, The Unix Programming Environment (Englewood Cliffs, N.J.: Prentice-Hall, 1984), 98. Reprinted with permission.

In the next example, file1 and file2 are removed before bothfiles is run. The bothfiles script echoes the names of the files it creates as it creates them. The ls command then shows that bothfiles has re-created file1 and file2:

$ rm file1 file2

$ bash bothfiles

file1

file2

$ ls

bothfiles

file1

file2

File Descriptors

As discussed on page 275, before a process can read from or write to a file, it must

open that file. When a process opens a file, Linux associates a number (called a file

descriptor) with the file. A file descriptor is an index into the process’s table of open

files. Each process has its own set of open files and its own file descriptors. After

opening a file, a process reads from and writes to that file by referring to its file

descriptor. When it no longer needs the file, the process closes the file, freeing the

file descriptor.

A typical Linux process starts with three open files: standard input (file descriptor 0),

standard output (file descriptor 1), and standard error (file descriptor 2). Often these

are the only files the process needs. Recall that you redirect standard output with the

symbol > or the symbol 1> and that you redirect standard error with the symbol 2>.

Although you can redirect other file descriptors, because file descriptors other than

0, 1, and 2 do not have any special conventional meaning, it is rarely useful to do so.

The exception is in programs that you write yourself, in which case you control the

meaning of the file descriptors and can take advantage of redirection.

Opening a file descriptor

The Bourne Again Shell opens files using the exec builtin as follows:

exec n>outfile

exec m<infile

The first line opens outfile for output and holds it open, associating it with file

descriptor n. The second line opens infile for input and holds it open, associating it

with file descriptor m.


Duplicating a file descriptor

The <& token duplicates an input file descriptor; >& duplicates an output file

descriptor. You can duplicate a file descriptor by making it refer to the same file as

another open file descriptor, such as standard input or output. Use the following

format to open or redirect file descriptor n as a duplicate of file descriptor m:

exec n<&m

Once you have opened a file, you can use it for input and output in two ways. First, you can use I/O redirection on any command line, redirecting standard output to a file descriptor with >&n or redirecting standard input from a file descriptor with <&n. Second, you can use the read (page 447) and echo builtins. If you invoke other commands, including functions (page 327), they inherit these open files and file descriptors. When you have finished using a file, you can close it using

exec n<&-
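The open, use, and close steps can be sketched in a few lines. The function name fd_demo is hypothetical, and the sketch assumes the mktemp utility is available to supply a scratch file; neither appears in the original text.

```bash
#!/bin/bash
# Hypothetical helper: write two lines through file descriptor 3,
# then read the first one back through file descriptor 4.
fd_demo() {
    local tmp line
    tmp=$(mktemp) || return 1   # scratch file (assumes mktemp is installed)

    exec 3> "$tmp"              # open the file for output on descriptor 3
    echo "first line" >&3       # redirect standard output to descriptor 3
    echo "second line" >&3
    exec 3>&-                   # close descriptor 3

    exec 4< "$tmp"              # open the same file for input on descriptor 4
    read -r line <&4            # redirect standard input from descriptor 4
    exec 4<&-                   # close descriptor 4

    rm -f "$tmp"
    echo "$line"
}

fd_demo
```

The function displays first line. Because the file is held open between the two echo commands, the second echo appends to what the first wrote instead of starting over.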

When you invoke the shell function in the next example, named mycp, with two

arguments, it copies the file named by the first argument to the file named by the

second argument. If you supply only one argument, the script copies the file named

by the argument to standard output. If you invoke mycp with no arguments, it

copies standard input to standard output.

function mycp ()
{
case $# in
    0)
        # Zero arguments
        # File descriptor 3 duplicates standard input
        # File descriptor 4 duplicates standard output
        exec 3<&0 4<&1
        ;;
    1)
        # One argument
        # Open the file named by the argument for input
        # and associate it with file descriptor 3
        # File descriptor 4 duplicates standard output
        exec 3< $1 4<&1
        ;;
    2)
        # Two arguments
        # Open the file named by the first argument for input
        # and associate it with file descriptor 3
        # Open the file named by the second argument for output
        # and associate it with file descriptor 4
        exec 3< $1 4> $2
        ;;
    *)
        echo "Usage: mycp [source [dest]]"
        return 1
        ;;
esac

# Call cat with input coming from file descriptor 3
# and output going to file descriptor 4
cat <&3 >&4

# Close file descriptors 3 and 4
exec 3<&- 4<&-
}

A function is not a shell script

tip The mycp example is a shell function; it will not work as you expect if you execute it as a shell script. (It will work: The function will be created in a very short-lived subshell, which is probably of little use.) You can enter this function from the keyboard. If you put the function in a file, you can run it as an argument to the . (dot) builtin (page 274). You can also put the function in a startup file if you want it to be always available (page 329).

The real work of this function is done in the line that begins with cat. The rest of the script arranges for file descriptors 3 and 4, which are the input and output of the cat command, to be associated with the appropriate files.

optional The next program takes two filenames on the command line, sorts both, and sends

the output to temporary files. The program then merges the sorted files to standard

output, preceding each line by a number that indicates which file it came from.

$ cat sortmerg

#!/bin/bash

usage ()
{
if [ $# -ne 2 ]; then
    echo "Usage: $0 file1 file2" 1>&2
    exit 1
fi
}

# Default temporary directory

: ${TEMPDIR:=/tmp}

# Check argument count

usage "$@"

# Set up temporary files for sorting

file1=$TEMPDIR/$$.file1

file2=$TEMPDIR/$$.file2

# Sort

sort $1 > $file1

sort $2 > $file2

# Open $file1 and $file2 for reading. Use file descriptors 3 and 4.

exec 3<$file1

exec 4<$file2

# Read the first line from each file to figure out how to start.

read Line1 <&3

status1=$?

read Line2 <&4

status2=$?


# Strategy: while there is still input left in both files:

# Output the line that should come first.

# Read a new line from the file that line came from.

while [ $status1 -eq 0 -a $status2 -eq 0 ]
do
    if [[ "$Line2" > "$Line1" ]]; then
        echo -e "1.\t$Line1"
        read -u3 Line1
        status1=$?
    else
        echo -e "2.\t$Line2"
        read -u4 Line2
        status2=$?
    fi
done

# Now one of the files is at end-of-file.

# Read from each file until the end.

# First file1:

while [ $status1 -eq 0 ]
do
    echo -e "1.\t$Line1"
    read Line1 <&3
    status1=$?
done

# Next file2:

while [[ $status2 -eq 0 ]]
do
    echo -e "2.\t$Line2"
    read Line2 <&4
    status2=$?
done

# Close and remove both input files

exec 3<&- 4<&-

rm -f $file1 $file2

exit 0

Parameters and Variables

Shell parameters and variables were introduced on page 290. This section adds to the previous coverage with a discussion of array variables, global versus local variables, special and positional parameters, and expansion of null and unset variables.

Array Variables

The Bourne Again Shell supports one-dimensional array variables. The subscripts are integers with zero-based indexing (i.e., the first element of the array has the subscript 0). The following format declares and assigns values to an array:

name=(element1 element2 ...)

The following example assigns four values to the array NAMES:

$ NAMES=(max helen sam zach)

You reference a single element of an array as follows:

$ echo ${NAMES[2]}

sam

The subscripts [*] and [@] both extract the entire array but work differently when used within double quotation marks. An @ produces an array that is a duplicate of the original array; an * produces a single element of an array (or a plain variable) that holds all the elements of the array separated by the first character in IFS (normally a SPACE). In the following example, the array A is filled with the elements of the NAMES variable using an *, and B is filled using an @. The declare builtin with the -a option displays the values of the arrays (and reminds you that bash uses zero-based indexing for arrays):

$ A=("${NAMES[*]}")

$ B=("${NAMES[@]}")

$ declare -a

declare -a A='([0]="max helen sam zach")'

declare -a B='([0]="max" [1]="helen" [2]="sam" [3]="zach")'

...

declare -a NAMES='([0]="max" [1]="helen" [2]="sam" [3]="zach")'

From the output of declare, you can see that NAMES and B have multiple elements. In contrast, A, which was assigned its value with an * within double quotation marks, has only one element: A has all its elements enclosed between double quotation marks.

In the next example, echo attempts to display element 1 of array A. Nothing is displayed because A has only one element and that element has an index of 0. Element 0 of array A holds all four names. Element 1 of B holds the second item in the array and element 0 holds the first item.

$ echo ${A[1]}

$ echo ${A[0]}

max helen sam zach

$ echo ${B[1]}

helen

$ echo ${B[0]}

max

You can apply the ${#name[*]} operator to array variables, returning the number

of elements in the array:

$ echo ${#NAMES[*]}
4

The same operator, when given the index of an element of an array in place of *,

returns the length of the element:

$ echo ${#NAMES[1]}
5

You can use subscripts on the left side of an assignment statement to replace

selected elements of the array:

$ NAMES[1]=max

$ echo ${NAMES[*]}

max max sam zach
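One way to see the difference between the quoted [@] and [*] subscripts is to count the arguments each one produces. This sketch assumes bash, because arrays are not part of the plain Bourne shell; the function name count_args is hypothetical.

```bash
#!/bin/bash
NAMES=(max helen sam zach)

# Hypothetical helper: report how many arguments it received.
count_args() {
    echo $#
}

count_args "${NAMES[@]}"    # each element becomes a separate argument
count_args "${NAMES[*]}"    # all elements are joined into one argument
echo "${#NAMES[*]}"         # number of elements in the array
echo "${#NAMES[1]}"         # length of element 1 (helen)
```

The four commands display 4, 1, 4, and 5, in that order.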

Locality of Variables

By default variables are local to the process in which they are declared. Thus a shell script does not have access to variables declared in your login shell unless you explicitly make the variables available (global). Under bash, export makes a variable available to child processes. Under tcsh, setenv (page 366) assigns a value to a variable and makes it available to child processes. The examples in this section use the bash syntax but the theory applies to both shells.

export

Once you use the export builtin with a variable name as an argument, the shell places the value of the variable in the calling environment of child processes. This call by value gives each child process a copy of the variable for its own use.

The following extest1 shell script assigns a value of american to the variable named cheese and then displays its filename (extest1) and the value of cheese. The extest1 script then calls subtest, which attempts to display the same information. Next subtest declares a cheese variable and displays its value. When subtest finishes, it returns control to the parent process, which is executing extest1. At this point extest1 again displays the value of the original cheese variable.

$ cat extest1

cheese=american

echo "extest1 1: $cheese"

subtest

echo "extest1 2: $cheese"

$ cat subtest

echo "subtest 1: $cheese"

cheese=swiss

echo "subtest 2: $cheese"

$ ./extest1

extest1 1: american

subtest 1:

subtest 2: swiss

extest1 2: american

The subtest script never receives the value of cheese from extest1, and extest1 never

loses the value. In bash, unlike in the real world, a child can never affect its parent’s attributes. When a process attempts to display the value of a variable that has not been declared, as is the case with subtest, the process displays nothing; the value of an undeclared variable is that of a null string.

The following extest2 script is the same as extest1 except it uses export to make cheese available to the subtest script:

$ cat extest2

export cheese=american

echo "extest2 1: $cheese"

subtest

echo "extest2 2: $cheese"

$ ./extest2

extest2 1: american

subtest 1: american

subtest 2: swiss

extest2 2: american

Here the child process inherits the value of cheese as american and, after displaying

this value, changes its copy to swiss. When control is returned to the parent, the

parent’s copy of cheese retains its original value: american.

The export builtin can optionally include an assignment:

export cheese=american

The preceding statement is equivalent to the following two statements:

cheese=american

export cheese

Although it is rarely done, you can export a variable before you assign a value to it.

You do not need to export an already-exported variable a second time after you

change its value.
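The last two points can be checked with a short experiment. The following is a sketch only: it uses bash -c to start a child shell, and the names first and second are illustrative.

```bash
unset cheese          # start clean: no value and no export attribute
export cheese         # export the name before it has a value
cheese=american       # assign afterward; the variable is still exported

first=$(bash -c 'echo "$cheese"')    # what a child process sees now

cheese=swiss          # change the value; no second export is needed
second=$(bash -c 'echo "$cheese"')   # what a later child process sees

echo "first child saw $first, second child saw $second"
```

Each child receives a copy of the value that was current when the child was started.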

Functions

Because functions run in the same environment as the shell that calls them, variables

are implicitly shared by a shell and a function it calls.

$ function nam () {

> echo $myname

> myname=zach

> }

$ myname=sam

$ nam

sam

$ echo $myname

zach

In the preceding example, the myname variable is set to sam in the interactive shell.

The nam function then displays the value of myname (sam) and sets myname to

zach. The final echo shows that, in the interactive shell, the value of myname has been changed to zach.


Function local variables

Local variables are helpful in a function written for general use. Because the func-

tion is called by many scripts that may be written by different programmers, you

need to make sure the names of the variables used within the function do not con-

flict with (i.e., duplicate) the names of the variables in the programs that call the

function. Local variables eliminate this problem. When used within a function, the typeset builtin declares a variable to be local to the function it is defined in.

The next example shows the use of a local variable in a function. It features two

variables named count. The first is declared and assigned a value of 10 in the interactive shell. Its value never changes, as echo verifies after count_down is run. The other count is declared, using typeset, to be local to the function. Its value, which is unknown outside the function, ranges from 4 to 1, as the echo command within the function confirms.

The example shows the function being entered from the keyboard; it is not a shell

script. See the tip “A function is not a shell script” on page 432.

$ function count_down () {

> typeset count

> count=$1

> while [ $count -gt 0 ]

> do

> echo "$count..."

> ((count=count-1))

> sleep 1

> done

> echo "Blast Off."

> }

$ count=10

$ count_down 4

4...

3...

2...

1...

Blast Off.

$ echo $count

10

The ((count=count-1)) assignment is enclosed between double parentheses, which

cause the shell to perform an arithmetic evaluation (page 460). Within the double

parentheses you can reference shell variables without the leading dollar sign ($).

Special Parameters

Special parameters enable you to access useful values pertaining to command-line

arguments and the execution of shell commands. You reference a shell special

parameter by preceding a special character with a dollar sign ($). As with posi-

tional parameters, it is not possible to modify the value of a special parameter by

assignment.


$$: PID Number

The shell stores in the $$ parameter the PID number of the process that is executing

it. In the following interaction, echo displays the value of this variable and the ps utility confirms its value. Both commands show that the shell has a PID number of 5209:

$ echo $$

5209

$ ps

PID TTY TIME CMD

5209 pts/1 00:00:00 bash

6015 pts/1 00:00:00 ps

Because echo is built into the shell, the shell does not create another process when you give an echo command. However, the results are the same whether echo is a builtin or not, because the shell substitutes the value of $$ before it forks a new process to run a command. Try using the echo utility (/bin/echo), which is run by another process, and see what happens. In the following example, the shell substitutes the value of $$ and passes that value to cp as a prefix for a filename:

$ echo $$

8232

$ cp memo $$.memo

$ ls

8232.memo memo

Incorporating a PID number in a filename is useful for creating unique filenames

when the meanings of the names do not matter; this technique is often used in shell

scripts for creating names of temporary files. When two people are running the

same shell script, having unique filenames keeps the users from inadvertently shar-

ing the same temporary file.
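A script might use this technique along the following lines. This is a sketch under stated assumptions: the /tmp location and the report prefix are hypothetical, and the names tmpfile and contents are illustrative.

```bash
# Hypothetical temporary filename; $$ expands to the PID of this shell.
tmpfile="/tmp/report.$$"

echo "scratch data" > "$tmpfile"    # write to the temporary file
contents=$(cat "$tmpfile")          # read it back
rm -f "$tmpfile"                    # clean up before finishing

echo "used $tmpfile and read back: $contents"
```

Because a PID can be predicted, the mktemp utility is commonly preferred where other users could interfere with a temporary file.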

The following example demonstrates that the shell creates a new shell process when

it runs a shell script. The id2 script displays the PID number of the process running

it (not the process that called it—the substitution for $$ is performed by the shell

that is forked to run id2):

$ cat id2

echo "$0 PID= $$"

$ echo $$

8232

$ id2

./id2 PID= 8362

$ echo $$

8232

The first echo displays the PID number of the interactive shell. Then id2 displays its name ($0) and the PID number of the subshell that it is running in. The last echo shows that the PID number of the interactive shell has not changed.


$! The shell stores the value of the PID number of the last process that ran in the background in $! (not available in tcsh). The following example executes sleep as a background task and uses echo to display the value of $!:

$ sleep 60 &

[1] 8376

$ echo $!

8376
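One common use of $! is to save the PID number so a script can wait for the background task later. The following is a minimal sketch; the names bgpid and status are illustrative.

```bash
sleep 1 &          # start a background task
bgpid=$!           # save its PID number right away

wait "$bgpid"      # block until that task finishes
status=$?          # wait returns the exit status of the task

echo "background process $bgpid finished with status $status"
```

Saving $! immediately matters because the next background command replaces its value.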

$?: Exit Status

When a process stops executing for any reason, it returns an exit status to its parent

process. The exit status is also referred to as a condition code or a return code. The

$? ($status under tcsh) variable stores the exit status of the last command.

By convention a nonzero exit status represents a false value and means the command

failed. A zero is true and indicates the command executed successfully. In the following example, the first ls command succeeds and the second fails, as demonstrated by the exit status:

$ ls es

es

$ echo $?

0

$ ls xxx

ls: xxx: No such file or directory

$ echo $?

You can specify the exit status that a shell script returns by using the exit builtin, followed by a number, to terminate the script. If you do not use exit with a number to terminate a script, the exit status of the script is that of the last command the script ran.

$ cat es

echo This program returns an exit status of 7.

exit 7

$ es

This program returns an exit status of 7.

$ echo $?

7

$ echo $?

0

The es shell script displays a message and terminates execution with an exit command that returns an exit status of 7, the user-defined exit status in this script. The first echo then displays the value of the exit status of es. The second echo displays the value of the exit status of the first echo. This value is 0, indicating the first echo was successful.
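Because every command resets $?, a script usually copies the value into a variable at once. The following sketch captures three exit statuses using the true and false utilities and a subshell; the names ok, failed, and custom are illustrative.

```bash
# The command after && runs only on success; the one after || only on failure.
true && ok=$?               # true succeeds, so ok is 0
false || failed=$?          # false fails, so failed is 1
( exit 7 ) || custom=$?     # a subshell that exits with a status of 7

echo "ok=$ok failed=$failed custom=$custom"
```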

Positional Parameters

Positional parameters comprise the command name and command-line arguments.

These parameters are called positional because within a shell script, you refer to them by their position on the command line. Only the set builtin (page 442) allows you to change the values of positional parameters. However, you cannot change the value of the command name from within a script. The tcsh set builtin does not change the values of positional parameters.

$#: Number of Command-Line Arguments

The $# parameter holds the number of arguments on the command line (positional

parameters), not counting the command itself:

$ cat num_args

echo "This script was called with $# arguments."

$ ./num_args sam max zach

This script was called with 3 arguments.

$0: Name of the Calling Program

The shell stores the name of the command you used to call a program in parameter

$0. This parameter is numbered zero because it appears before the first argument

on the command line:

$ cat abc

echo "The command used to run this script is $0"

$ ./abc

The command used to run this script is ./abc

$ ~sam/abc

The command used to run this script is /home/sam/abc

The preceding shell script uses echo to verify the name of the script you are executing. You can use the basename utility and command substitution to extract and display the simple filename of the command:

$ cat abc2

echo "The command used to run this script is $(basename $0)"

$ ~sam/abc2

The command used to run this script is abc2

$1–$n: Command-Line Arguments

The first argument on the command line is represented by parameter $1, the second

argument by $2, and so on up to $n. For values of n greater than 9, the number

must be enclosed within braces. For example, the twelfth command-line argument

is represented by ${12}. The following script displays positional parameters that

hold command-line arguments:

$ cat display_5args

echo First 5 arguments are $1 $2 $3 $4 $5

$ ./display_5args zach max helen

First 5 arguments are zach max helen

The display_5args script displays the first five command-line arguments. The shell

assigns a null value to each parameter that represents an argument that is not


present on the command line. Thus the $4 and $5 parameters have null values in

this example.
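The need for braces with parameters above $9 is easy to overlook. The following sketch uses the set builtin to supply twelve sample arguments (the letters are arbitrary), then compares ${12} with $12; the names twelfth and mistake are illustrative.

```bash
set -- a b c d e f g h i j k l    # twelve sample positional parameters

twelfth=${12}     # braces: the twelfth parameter
mistake=$12       # no braces: $1 followed by the character 2

echo "with braces: $twelfth, without braces: $mistake"
```

Without the braces the shell reads only the first digit as the parameter number.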

shift: Promotes Command-Line Arguments

The shift builtin promotes each command-line argument. The first argument (which was $1) is discarded. The second argument (which was $2) becomes the first argument (now $1), the third becomes the second, and so on. Because no “unshift” command exists, you cannot bring back arguments that have been discarded. An optional argument to shift specifies the number of positions to shift (and the number of arguments to discard); the default is 1.

The following demo_shift script is called with three arguments. Double quotation marks around the arguments to echo preserve the spacing of the output. The program displays the arguments and shifts them repeatedly until no more arguments are left to shift:

$ cat demo_shift

echo "arg1= $1 arg2= $2 arg3= $3"

shift

echo "arg1= $1 arg2= $2 arg3= $3"

shift

echo "arg1= $1 arg2= $2 arg3= $3"

shift

echo "arg1= $1 arg2= $2 arg3= $3"

shift

$ ./demo_shift alice helen zach

arg1= alice arg2= helen arg3= zach

arg1= helen arg2= zach arg3=

arg1= zach arg2= arg3=

arg1= arg2= arg3=

Repeatedly using shift is a convenient way to loop over all command-line arguments in shell scripts that expect an arbitrary number of arguments. See page 403 for a shell script that uses shift.
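A loop of this kind typically tests $# and calls shift once per pass. The following is a minimal sketch, written as a function so it can be tried from the keyboard; the function name collect and the variable seen are hypothetical.

```bash
collect() {
    seen=""
    while [ $# -gt 0 ]        # continue while arguments remain
    do
        seen="$seen<$1>"      # process the current first argument
        shift                 # discard it; $2 becomes $1
    done
}

collect alice "helen of troy" zach
echo "$seen"
```

Quoting "$1" keeps an argument that contains SPACEs in one piece.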

set: Initializes Command-Line Arguments

When you call the set builtin with one or more arguments, it assigns the values of the arguments to the positional parameters, starting with $1 (not available in tcsh). The following script uses set to assign values to the positional parameters $1, $2, and $3:

$ cat set_it

set this is it

echo $3 $2 $1

$ ./set_it

it is this

Combining command substitution (page 340) with the set builtin is a convenient way to get standard output of a command in a form that can be easily manipulated in a shell script. The following script shows how to use date and set to provide the date in a useful format. The first command shows the output of date. Then cat displays the contents of the dateset script. The first command in this script uses command substitution to set the positional parameters to the output of the date utility. The next command, echo $*, displays all positional parameters resulting from the previous set. Subsequent commands display the values of parameters $1, $2, $3, and $6. The final command displays the date in a format you can use in a letter or report:

$ date

Wed Aug 13 17:35:29 PDT 2008

$ cat dateset

set $(date)

echo $*

echo

echo "Argument 1: $1"

echo "Argument 2: $2"

echo "Argument 3: $3"

echo "Argument 6: $6"

echo

echo "$2 $3, $6"

$ ./dateset

Wed Aug 13 17:35:34 PDT 2008

Argument 1: Wed

Argument 2: Aug

Argument 3: 13

Argument 6: 2008

Aug 13, 2008

You can also use the +format argument to date (page 655) to modify the format of its output.

When used without any arguments, set displays a list of the shell variables that are set, including user-created variables and keyword variables. Under bash, this list is the same as that displayed by declare and typeset when they are called without any arguments.

The set builtin also accepts options that let you customize the behavior of the shell (not available in tcsh). For more information refer to “set ±o: Turns Shell Features On and Off” on page 331.

$* and $@: Represent All Command-Line Arguments

The $* parameter represents all command-line arguments, as the display_all program

demonstrates:

$ cat display_all

echo All arguments are $*

$ ./display_all a b c d e f g h i j k l m n o p

All arguments are a b c d e f g h i j k l m n o p


It is a good idea to enclose references to positional parameters between double quo-

tation marks. The quotation marks are particularly important when you are using

positional parameters as arguments to commands. Without double quotation

marks, a positional parameter that is not set or that has a null value disappears:

$ cat showargs

echo "$0 was called with $# arguments, the first is :$1:."

$ ./showargs a b c

./showargs was called with 3 arguments, the first is :a:.

$ echo $xx

$ ./showargs $xx a b c

./showargs was called with 3 arguments, the first is :a:.

$ ./showargs "$xx" a b c

./showargs was called with 4 arguments, the first is ::.

The showargs script displays the number of arguments ($#) followed by the value of

the first argument enclosed between colons. In the preceding example, showargs is

initially called with three simple arguments. Next the echo command demonstrates that the $xx variable, which is not set, has a null value. In the final two calls to

showargs, the first argument is $xx. In the first case the command line becomes

showargs a b c; the shell passes showargs three arguments. In the second case the

command line becomes showargs "" a b c, which results in calling showargs with

four arguments. The difference in the two calls to showargs illustrates a subtle

potential problem that you should keep in mind when using positional parameters

that may not be set or that may have a null value.

"$*" versus "$@" The $* and $@ parameters work the same way except when they are enclosed within

double quotation marks. Using "$*" yields a single argument (with SPACEs or the value

of the first character of IFS [page 301] between the positional parameters), whereas

using "$@" produces a list wherein each positional parameter is a separate argument.

This difference typically makes "$@" more useful than "$*" in shell scripts.

The following scripts help explain the difference between these two special parameters. In the second line of both scripts, the single quotation marks keep the shell from interpreting the enclosed special characters so they are passed to echo and displayed as themselves. The bb1 script shows that set "$*" assigns multiple arguments to the first command-line parameter:

$ cat bb1

set "$*"

echo $# parameters with '"$*"'

echo 1: $1

echo 2: $2

echo 3: $3

$ ./bb1 a b c

1 parameters with "$*"

1: a b c

The bb2 script shows that set "$@" assigns each argument to a different command-

line parameter:


$ cat bb2

set "$@"

echo $# parameters with '"$@"'

echo 1: $1

echo 2: $2

echo 3: $3

$ ./bb2 a b c

3 parameters with "$@"

1: a

2: b

3: c
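The difference matters most when an argument contains a SPACE, and the role of IFS shows up when it is set to another character. The following sketch uses a hypothetical helper function, count_args, that reports how many arguments it received; the sample arguments are arbitrary.

```bash
count_args() { echo $#; }        # hypothetical helper: prints its argument count

set -- "a b" c d                 # three parameters; the first holds a SPACE

star=$(count_args "$*")          # "$*" is passed as one argument
at=$(count_args "$@")            # "$@" is passed as three arguments
joined=$(IFS=,; echo "$*")       # "$*" joins with the first character of IFS

echo "\"\$*\" gives $star, \"\$@\" gives $at, joined with a comma: $joined"
```

The assignment to IFS is made inside the command substitution, so it does not affect the rest of the shell.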

Expanding Null and Unset Variables

The expression ${name} (or just $name if it is not ambiguous) expands to the value of the name variable. If name is null or not set, bash expands ${name} to a null string. The Bourne Again Shell provides the following alternatives to accepting the expanded null string as the value of the variable:

• Use a default value for the variable.

• Use a default value and assign that value to the variable.

• Display an error.

You can choose one of these alternatives by using a modifier with the variable name. In addition, you can use set -o nounset (page 333) to cause bash to display an error and exit from a script whenever an unset variable is referenced.
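The effect of nounset can be observed safely by running the test in a child shell. This is a sketch only: the variable name no_such_variable_demo is hypothetical and assumed to be unset, and the exact wording of the message may differ between bash versions.

```bash
status=0
# The child shell stops at the unset variable, so the second echo never runs.
result=$(bash -c 'set -o nounset
                  echo "value: $no_such_variable_demo"
                  echo "not reached"' 2>&1) || status=$?

echo "child exit status: $status"
echo "child output: $result"
```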

:- Uses a Default Value

The :- modifier uses a default value in place of a null or unset variable while allowing a nonnull variable to represent itself:

${name:-default}

The shell interprets :- as “If name is null or unset, expand default and use the expanded value in place of name; else use name.” The following command lists the

contents of the directory named by the LIT variable. If LIT is null or unset, it lists

the contents of /home/max/literature:

$ ls ${LIT:-/home/max/literature}

The default can itself have variable references that are expanded:

$ ls ${LIT:-$HOME/literature}

:= Assigns a Default Value

The :- modifier does not change the value of a variable. However, you can change the value of a null or unset variable to its default in a script by using the := modifier:

${name:=default}

The shell expands the expression ${name:=default} in the same manner as it expands ${name:-default} but also sets the value of name to the expanded value of default. If a script contains a line such as the following and LIT is unset or null at the time this line is executed, LIT is assigned the value /home/max/literature:

$ ls ${LIT:=/home/max/literature}

: (null) builtin Shell scripts frequently start with the : (null) builtin followed on the same line by the

:= expansion modifier to set any variables that may be null or unset. The : builtin

evaluates each token in the remainder of the command line but does not execute

any commands. Without the leading colon (:), the shell evaluates and attempts to

execute the “command” that results from the evaluation.

Use the following syntax to set a default for a null or unset variable in a shell script

(a SPACE follows the first colon):

: ${name:=default}

When a script needs a directory for temporary files and uses the value of TEMPDIR

for the name of this directory, the following line assigns to TEMPDIR the value

/tmp if TEMPDIR is null:

: ${TEMPDIR:=/tmp}
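The difference between :- and := is easiest to see side by side. The following is a minimal sketch using the TEMPDIR variable from the text; the names picked and still_empty are illustrative.

```bash
unset TEMPDIR

picked=${TEMPDIR:-/tmp}                 # :- supplies /tmp for this expansion only
still_empty=${TEMPDIR:-"(no value)"}    # TEMPDIR itself was not changed

: ${TEMPDIR:=/tmp}                      # := supplies /tmp and assigns it

echo "picked=$picked before=$still_empty after=$TEMPDIR"
```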

:? Displays an Error Message

Sometimes a script needs the value of a variable but you cannot supply a reasonable

default at the time you write the script. If the variable is null or unset, the :? modifier causes the script to display an error message and terminate with an exit status of 1:

${name:?message}

If you omit message, the shell displays the default error message (parameter null or not set). Interactive shells do not exit when you use :?. In the following command, TESTDIR is not set so the shell displays on standard error the expanded value of the string following :?. In this case the string includes command substitution for date with the %T format, followed by the string error, variable not set.

cd ${TESTDIR:?$(date +%T) error, variable not set.}

bash: TESTDIR: 16:16:14 error, variable not set.
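To see a noninteractive shell terminate at :?, the expansion can be run in a child shell. This is a sketch: it assumes TESTDIR is not set, and the names status and msg are illustrative.

```bash
unset TESTDIR
status=0
# The child shell exits at the failed expansion; the echo is never run.
msg=$(bash -c 'cd "${TESTDIR:?variable not set}"
               echo "not reached"' 2>&1) || status=$?

echo "child exit status: $status"
echo "child message: $msg"
```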

Builtin Commands

Builtin commands, which were introduced in Chapter 5, do not fork a new process

when you execute them. This section discusses the type, read, exec, trap, kill, and getopts builtins. Table 10-6 on page 459 lists many bash builtin commands. See Table 9-10 on page 387 for a list of tcsh builtins.


type: Displays Information About a Command

The type builtin (use which under tcsh) provides information about a command:

$ type cat echo who if lt

cat is hashed (/bin/cat)

echo is a shell builtin

who is /usr/bin/who

if is a shell keyword

lt is aliased to 'ls -ltrh | tail'

The preceding output shows the files that would be executed if you gave cat or who as a command. Because cat has already been called from the current shell, it is in the hash table (page 957) and type reports that cat is hashed. The output also shows that a call to echo runs the echo builtin, if is a keyword, and lt is an alias.
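In a script, a one-word answer is often easier to test than a sentence. Under bash, the -t option to type (not covered in the text above) displays a single word such as builtin, keyword, alias, function, or file. The following is a small sketch; the variable names are illustrative.

```bash
kind_echo=$(type -t echo)    # echo is built into the shell
kind_if=$(type -t if)        # if is a shell keyword

echo "echo is a $kind_echo; if is a $kind_if"
```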

read: Accepts User Input

One of the most common uses for user-created variables is storing information that a user enters in response to a prompt. Using read, scripts can accept input from the user and store that input in variables. See page 371 for information about reading user input under tcsh. The read builtin reads one line from standard input and assigns the words on the line to one or more variables:

$ cat read1

echo -n "Go ahead: "

read firstline

echo "You entered: $firstline"

$ ./read1

Go ahead: This is a line.

You entered: This is a line.

The first line of the read1 script uses echo to prompt for a line of text. The -n option suppresses the following NEWLINE, allowing you to enter a line of text on the same line as the prompt. The second line reads the text into the variable firstline. The third line verifies the action of read by displaying the value of firstline. The variable is quoted (along with the text string) in this example because you, as the script writer, cannot anticipate which characters the user might enter in response to the prompt. Consider what would happen if the variable were not quoted and the user entered * in response to the prompt:

$ cat read1_no_quote

echo -n "Go ahead: "

read firstline

echo You entered: $firstline

$ ./read1_no_quote

Go ahead: *

You entered: read1 read1_no_quote script.1

$ ls

read1 read1_no_quote script.1

The ls command lists the same words as the script, demonstrating that the shell expands the asterisk into a list of files in the working directory. When the variable $firstline is surrounded by double quotation marks, the shell does not expand the asterisk. Thus the read1 script behaves correctly:

$ ./read1

Go ahead: *

You entered: *

REPLY The read builtin includes several features that can make it easier to use. For example, when you do not specify a variable to receive read’s input, bash puts the input into the variable named REPLY. You can use the -p option to prompt the user instead of using a separate echo command. The following read1a script performs exactly the same task as read1:

$ cat read1a

read -p "Go ahead: "

echo "You entered: $REPLY"

The read2 script prompts for a command line, reads the user’s response, and assigns

it to the variable cmd. The script then attempts to execute the command line that

results from the expansion of the cmd variable:

$ cat read2

read -p "Enter a command: " cmd

$cmd

echo "Thanks"

In the following example, read2 reads a command line that calls the echo builtin. The shell executes the command and then displays Thanks. Next read2 reads a command line that executes the who utility:

$ ./read2

Enter a command: echo Please display this message.

Please display this message.

Thanks

$ ./read2

Enter a command: who

max pts/4 Jun 17 07:50 (:0.0)

sam pts/12 Jun 17 11:54 (bravo.example.com)

Thanks

If cmd does not expand into a valid command line, the shell issues an error message:

$ ./read2

Enter a command: xxx

./read2: line 2: xxx: command not found

Thanks

The read3 script reads values into three variables. The read builtin assigns one word (a sequence of nonblank characters) to each variable:

$ cat read3

read -p "Enter something: " word1 word2 word3

echo "Word 1 is: $word1"

echo "Word 2 is: $word2"

echo "Word 3 is: $word3"


$ ./read3

Enter something: this is something

Word 1 is: this

Word 2 is: is

Word 3 is: something

When you enter more words than read has variables, read assigns one word to each variable, assigning all leftover words to the last variable. Both read1 and read2 assigned the first word and all leftover words to the one variable the scripts each had to work with. In the following example, read assigns five words to three variables: It assigns the first word to the first variable, the second word to the second variable, and the third through fifth words to the third variable.

$ ./read3

Enter something: this is something else, really.

Word 1 is: this

Word 2 is: is

Word 3 is: something else, really.

Table 10-4 lists some of the options supported by the read builtin.

The read builtin returns an exit status of 0 if it successfully reads any data. It has a nonzero exit status when it reaches the EOF (end of file).

Table 10-4 read options

Option       Function

-a aname     (array) Assigns each word of input to an element of array aname.

-d delim     (delimiter) Uses delim to terminate the input instead of NEWLINE.

-e           (Readline) If input is coming from a keyboard, uses the Readline Library (page 318) to get input.

-n num       (number of characters) Reads num characters and returns. As soon as the user types num characters, read returns; there is no need to press RETURN.

-p prompt    (prompt) Displays prompt on standard error without a terminating NEWLINE before reading input. Displays prompt only when input comes from the keyboard.

-s           (silent) Does not echo characters.

-u fd        (file descriptor) Uses the integer fd as the file descriptor that read takes its input from. Thus read -u4 arg1 arg2 is equivalent to read arg1 arg2 <&4. See “File Descriptors” (page 431) for a discussion of redirection and file descriptors.
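Several of these options can be tried without typing at a prompt by feeding read from a here string (<<<), a bash feature not described above. The following is a sketch; the sample input and variable names are illustrative, and the -r option, which is not listed in the table, tells read not to treat backslashes specially.

```bash
read -r -a words <<< "this is something else"    # one word per array element
read -r -d , first_field <<< "alpha,beta"        # stop at the first comma
read -r -n 3 prefix <<< "abcdef"                 # stop after three characters

echo "${#words[*]} words; last is ${words[3]}"
echo "first field: $first_field"
echo "prefix: $prefix"
```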


The following example runs a while loop from the command line. It takes its input

from the names file and terminates after reading the last line from names.

$ cat names

Alice Jones

Robert Smith

Alice Paulson

John Q. Public

$ while read first rest

> do

> echo $rest, $first

> done < names

Jones, Alice

Smith, Robert

Paulson, Alice

Q. Public, John

The placement of the redirection symbol (<) for the while structure is critical. It is important that you place the redirection symbol at the done statement and not at the call to read.

optional Each time you redirect input, the shell opens the input file and repositions the read

pointer at the start of the file:

$ read line1 < names; echo $line1; read line2 < names; echo $line2

Alice Jones

Alice Jones

Here each read opens names and starts at the beginning of the names file. In the following example, names is opened once, as standard input of the subshell created by the parentheses. Each read then reads successive lines of standard input:

$ (read line1; echo $line1; read line2; echo $line2) < names

Alice Jones

Robert Smith

Another way to get the same effect is to open the input file with exec and hold it open (refer to “File Descriptors” on page 431):

$ exec 3< names

$ read -u3 line1; echo $line1; read -u3 line2; echo $line2

Alice Jones

Robert Smith

$ exec 3<&-

exec: Executes a Command or Redirects File Descriptors

The exec builtin (not available in tcsh) has two primary purposes: to run a command without creating a new process and to redirect a file descriptor (including standard input, output, or error) of a shell script from within the script (page 431). When the shell executes a command that is not built into the shell, it typically creates a new process. The new process inherits environment (global or exported) variables from its parent but does not inherit variables that are not exported by the parent. (For more information refer to “Locality of Variables” on page 436.) In contrast, exec executes a command in place of (overlays) the current process.

exec: Executes a Command

The exec builtin used for running a command has the following syntax:

exec command arguments

exec versus . (dot)

Insofar as exec runs a command in the environment of the original process, it is similar to the . (dot) command (page 274). However, unlike the . command, which can run only shell scripts, exec can run both scripts and compiled programs. Also, whereas the . command returns control to the original script when it finishes running, exec does not. Finally, the . command gives the new program access to local variables, whereas exec does not.

exec does not return control

Because the shell does not create a new process when you use exec, the command runs more quickly. However, because exec does not return control to the original program, it can be used only as the last command in a script. The following script shows that control is not returned to the script:

$ cat exec_demo

who

exec date

echo "This line is never displayed."

$ ./exec_demo

zach pts/7 May 30 7:05 (bravo.example.com)

hls pts/1 May 30 6:59 (:0.0)

Mon May 25 11:42:56 PDT 2009
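The same behavior can be observed without creating a script file by handing the commands to a child shell. The following is a sketch; it assumes an echo utility exists on the search path (exec runs a utility, not a builtin), and the name out is illustrative.

```bash
# The child shell is replaced by the echo utility at exec,
# so the final echo in the command string is never run.
out=$(bash -c 'echo before
               exec echo replaced
               echo "This line is never displayed."')

echo "$out"
```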

The next example, a modified version of the out script (page 403), uses exec to execute the final command the script runs. Because out runs either cat or less and then terminates, the new version, named out2, uses exec with both cat and less:

$ cat out2
if [ $# -eq 0 ]
    then
        echo "Usage: out2 [-v] filenames" 1>&2
        exit 1
fi
if [ "$1" = "-v" ]
    then
        shift
        exec less "$@"
    else
        exec cat -- "$@"
fi

exec: Redirects Input and Output

The second major use of exec is to redirect a file descriptor (including standard input, output, or error) from within a script. The next command causes all subsequent input to a script that would have come from standard input to come from the file named infile:

exec < infile


Similarly the following command redirects standard output and standard error to

outfile and errfile, respectively:

exec > outfile 2> errfile

When you use exec in this manner, the current process is not replaced with a new process, and exec can be followed by other commands in the script.
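A script that redirects standard output with exec can also put it back later by first saving a duplicate of the original descriptor. The following is a sketch under stated assumptions: file descriptor 3 is assumed to be free, the temporary file comes from the mktemp utility, and the names log and captured are illustrative.

```bash
log=$(mktemp)          # temporary file to receive the output

exec 3>&1              # save the current standard output as descriptor 3
exec > "$log"          # standard output now goes to the file
echo "captured line"   # written to the file, not displayed
exec >&3 3>&-          # restore standard output and close descriptor 3

captured=$(cat "$log")
rm -f "$log"
echo "standard output restored; the file held: $captured"
```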

/dev/tty When you redirect the output from a script to a file, you must make sure the user sees any prompts the script displays. The /dev/tty device is a pseudonym for the screen the user is working on; you can use this device to refer to the user’s screen without knowing which device it is. (The tty utility displays the name of the device you are using.) By redirecting the output from a script to /dev/tty, you ensure that prompts and messages go to the user’s terminal, regardless of which terminal the user is logged in on. Messages sent to /dev/tty are also not diverted if standard output and standard error from the script are redirected.

The to_screen1 script sends output to three places: standard output, standard error,

and the user’s screen. When run with standard output and standard error redi-

rected, to_screen1 still displays the message sent to /dev/tty on the user’s screen.

The out and err files hold the output sent to standard output and standard error.

$ cat to_screen1

echo "message to standard output"

echo "message to standard error" 1>&2

echo "message to the user" > /dev/tty

$ ./to_screen1 > out 2> err

message to the user

$ cat out

message to standard output

$ cat err

message to standard error

The following command redirects the output from a script to the user’s screen:

exec > /dev/tty

Putting this command at the beginning of the previous script changes where the output goes. In to_screen2, exec redirects standard output to the user’s screen so the > /dev/tty is superfluous. Following the exec command, all output sent to standard output goes to /dev/tty (the screen). Output to standard error is not affected.

$ cat to_screen2

exec > /dev/tty

echo "message to standard output"

echo "message to standard error" 1>&2

echo "message to the user" > /dev/tty

$ ./to_screen2 > out 2> err

message to standard output

message to the user

One disadvantage of using exec to redirect the output to /dev/tty is that all subsequent output is redirected unless you use exec again in the script.
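One common way to use exec again is to save the original standard output on a spare file descriptor before redirecting and to restore it afterward. The following is a sketch under stated assumptions: descriptor 3 is assumed to be free, and a scratch file from mktemp stands in for the destination (a file is used instead of /dev/tty so the example also runs without a terminal).

```bash
log=$(mktemp)            # scratch file; stands in for any output destination
(
    exec 3>&1            # save the current standard output as descriptor 3
    exec > "$log"        # from here on, standard output goes to the file
    echo "written to the log"
    exec 1>&3 3>&-       # restore standard output, then close descriptor 3
    echo "written to the original standard output"
)
logged=$(cat "$log")
rm -f "$log"
```

The parentheses only keep the descriptor changes local to a subshell; in a script of your own the four exec and echo lines can appear directly.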


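You can also redirect the input to read (standard input) so that it comes from /dev/tty (the keyboard). The first command below redirects input for a single read; the second uses exec to redirect all subsequent standard input:

read name < /dev/tty

exec < /dev/tty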

trap: Catches a Signal

A signal is a report to a process about a condition. Linux uses signals to report interrupts generated by the user (for example, pressing the interrupt key) as well as bad system calls, broken pipes, illegal instructions, and other conditions. The trap builtin (tcsh uses onintr) catches (traps) one or more signals, allowing you to direct the actions a script takes when it receives a specified signal.

This discussion covers six signals that are significant when you work with shell scripts. Table 10-5 lists these signals, the signal numbers that systems often ascribe to them, and the conditions that usually generate each signal. Give the command kill -l (lowercase “ell”), trap -l (lowercase “ell”), or man 7 signal to display a list of all signal names.

Table 10-5 Signals

Type                  Name              Number  Generating condition
Not a real signal     EXIT              0       Exit because of exit command or reaching the end of the program (not an actual signal but useful in trap)
Hang up               SIGHUP or HUP     1       Disconnect the line
Terminal interrupt    SIGINT or INT     2       Press the interrupt key (usually CONTROL-C)
Quit                  SIGQUIT or QUIT   3       Press the quit key (usually CONTROL-SHIFT-| or CONTROL-SHIFT-\)
Kill                  SIGKILL or KILL   9       The kill builtin with the -9 option (cannot be trapped; use only as a last resort)
Software termination  SIGTERM or TERM   15      Default of the kill command
Stop                  SIGTSTP or TSTP   20      Press the suspend key (usually CONTROL-Z)
Debug                 DEBUG                     Executes commands specified in the trap statement before each simple command in current versions of bash (not an actual signal but useful in trap)
Error                 ERR                       Executes commands specified in the trap statement after each command that returns a nonzero exit status (not an actual signal but useful in trap)


When it traps a signal, a script takes whatever action you specify: It can remove files or finish other processing as needed, display a message, terminate execution immediately, or ignore the signal. If you do not use trap in a script, any of the six actual signals listed in Table 10-5 (not EXIT, DEBUG, or ERR) will terminate the script (or, in the case of TSTP, suspend it). Because a process cannot trap a KILL signal, you can use kill -KILL (or kill -9) as a last resort to terminate a script or other process. (See page 456 for more information on kill.)

The trap command has the following syntax:

trap ['commands'] [signal]

The optional commands specifies the commands that the shell executes when it catches one of the signals specified by signal. The signal can be a signal name or number (for example, INT or 2). If commands is not present, trap resets the trap to its initial condition, which is usually to exit from the script.

Quotation marks
The trap builtin does not require single quotation marks around commands as shown in the preceding syntax, but it is a good practice to use them. The single quotation marks cause shell variables within the commands to be expanded when the signal occurs, rather than when the shell evaluates the arguments to trap. Even if you do not use any shell variables in the commands, you need to enclose any command that takes arguments within either single or double quotation marks. Quoting commands causes the shell to pass to trap the entire command as a single argument.
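The difference between the two kinds of quotation marks can be seen in a short experiment. This is an illustrative sketch: the function names single_quoted and double_quoted are hypothetical, and each function body runs in a subshell so that its EXIT trap fires when the body finishes.

```bash
# Each function body runs in a subshell ( ) so the EXIT trap fires when it ends.
single_quoted() (
    msg=early
    trap 'echo "$msg"' EXIT      # $msg is expanded when the trap runs
    msg=late
)
double_quoted() (
    msg=early
    trap "echo \"$msg\"" EXIT    # $msg is expanded now, while trap is being set
    msg=late
)
at_trap_time=$(single_quoted)
at_set_time=$(double_quoted)
echo "single: $at_trap_time, double: $at_set_time"
```

With single quotation marks the trap sees the value the variable holds when the trap runs (late); with double quotation marks it sees the value at the time trap was called (early).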

After executing the commands, the shell resumes executing the script where it left off. If you want trap to prevent a script from exiting when it receives a signal but not to run any commands explicitly, you can specify a null (empty) commands string, as shown in the locktty script (page 419). The following command traps signal number 15, after which the script continues:

trap '' 15

The following script demonstrates how the trap builtin can catch the terminal interrupt signal (2). You can use SIGINT, INT, or 2 to specify this signal. The script returns an exit status of 1:

$ cat inter
#!/bin/bash
trap 'echo PROGRAM INTERRUPTED; exit 1' INT
while true
do
    echo "Program running."
    sleep 1
done
$ ./inter
Program running.
CONTROL-C
PROGRAM INTERRUPTED


: (null) builtin
The second line of inter sets up a trap for the terminal interrupt signal using INT. When trap catches the signal, the shell executes the two commands between the single quotation marks in the trap command. The echo builtin displays the message PROGRAM INTERRUPTED, exit terminates the shell running the script, and the parent shell displays a prompt. If exit were not there, the shell would return control to the while loop after displaying the message. The while loop repeats continuously until the script receives a signal because the true utility always returns a true exit status. In place of true you can use the : (null) builtin, which is written as a colon and always returns a 0 (true) status.

The trap builtin frequently removes temporary files when a script is terminated prematurely, thereby ensuring the files are not left to clutter the filesystem. The following shell script, named addbanner, uses two trap commands to remove a temporary file when the script terminates normally or owing to a hangup, software interrupt, quit, or software termination signal:

$ cat addbanner
#!/bin/bash
script=$(basename $0)
if [ ! -r "$HOME/banner" ]
    then
        echo "$script: need readable $HOME/banner file" 1>&2
        exit 1
fi
trap 'exit 1' 1 2 3 15
trap 'rm /tmp/$$.$script 2> /dev/null' 0
for file
do
    if [ -r "$file" -a -w "$file" ]
        then
            cat $HOME/banner $file > /tmp/$$.$script
            cp /tmp/$$.$script $file
            echo "$script: banner added to $file" 1>&2
        else
            echo "$script: need read and write permission for $file" 1>&2
    fi
done

When called with one or more filename arguments, addbanner loops through the files, adding a header to the top of each. This script is useful when you use a standard format at the top of your documents, such as a standard layout for memos, or when you want to add a standard header to shell scripts. The header is kept in a file named ~/banner. Because addbanner uses the HOME variable, which contains the pathname of the user’s home directory, the script can be used by several users without modification. If Max had written the script with /home/max in place of $HOME and then given the script to Zach, either Zach would have had to change it or addbanner would have used Max’s banner file when Zach ran it (assuming Zach had read permission for the file).


The first trap in addbanner causes it to exit with a status of 1 when it receives a hangup, software interrupt (terminal interrupt or quit signal), or software termination signal. The second trap uses a 0 in place of signal-number, which causes trap to execute its command argument whenever the script exits because it receives an exit command or reaches its end. Together these traps remove a temporary file whether the script terminates normally or prematurely. Standard error of the second trap is sent to /dev/null whenever trap attempts to remove a nonexistent temporary file. In those cases rm sends an error message to standard error; because standard error is redirected, the user does not see the message.

See page 419 for another example that uses trap.
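The same cleanup pattern can be sketched with signal names and the mktemp utility, which creates a uniquely named temporary file. This is an illustration rather than a replacement for addbanner: the function name make_report is hypothetical, mktemp is assumed to be installed, and the body runs in a subshell so the traps stay local to it.

```bash
# Hypothetical sketch; the ( ) body is a subshell, so the traps are local to it.
make_report() (
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp"' EXIT            # runs on any exit from this subshell
    trap 'exit 1' HUP INT QUIT TERM     # turn these signals into a normal exit
    echo "draft" > "$tmp"
    echo "$tmp"                         # report the name so the caller can check
)
tmpname=$(make_report)
if [ -e "$tmpname" ]; then leftover=yes; else leftover=no; fi
echo "temporary file left behind: $leftover"
```

Using rm -f makes redirecting standard error unnecessary, because rm -f does not complain about a file that does not exist.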

kill: Aborts a Process

The kill builtin sends a signal to a process or job; kill has the following syntax:

kill [-signal] PID

where signal is the signal name or number (for example, INT or 2) and PID is the process identification number of the process that is to receive the signal. You can specify a job number (page 135) as %n in place of PID. If you omit signal, kill sends a TERM (software termination, number 15) signal. For more information on signal names and numbers, see Table 10-5 on page 453.

The following command sends the TERM signal to job number 1, regardless of

whether it is in the foreground (running) or in the background (running or stopped):

$ kill -TERM %1

Because TERM is the default signal for kill, you can also give this command as kill %1. Give the command kill -l (lowercase “l”) to display a list of signal names.

A program that is interrupted can leave matters in an unpredictable state: Temporary

files may be left behind (when they are normally removed), and permissions may be

changed. A well-written application traps, or detects, signals and cleans up before

exiting. Most carefully written applications trap the INT, QUIT, and TERM signals.

To terminate a program, first try INT (press CONTROL-C, if the job running is in the

foreground). Because an application can be written to ignore these signals, you may

need to use the KILL signal, which cannot be trapped or ignored; it is a “sure kill.”

Refer to page 729 for more information on kill. See also killall (page 731).
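To see kill and trap working together without a terminal, the following sketch starts a child bash that sends the TERM signal to itself. The child script is made up for this illustration; inside it, $$ is the process ID of the child shell.

```bash
# The child bash sends itself TERM; its trap runs and exits with status 1.
status=0
output=$(bash -c 'trap "echo caught TERM; exit 1" TERM
kill -TERM $$
echo "not reached"') || status=$?
echo "$output (exit status $status)"
```

Because the trap ends with exit 1, the final echo in the child never runs. Replacing TERM with KILL in the kill command would bypass the trap entirely, because KILL cannot be trapped.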

getopts: Parses Options

The getopts builtin (not available in tcsh) parses command-line arguments, making it easier to write programs that follow the Linux argument conventions. The syntax for getopts is

getopts optstring varname [arg ...]

where optstring is a list of the valid option letters, varname is the variable that receives the options one at a time, and arg is the optional list of parameters to be processed. If arg is not present, getopts processes the command-line arguments. If optstring starts with a colon (:), the script must take care of generating error messages; otherwise, getopts generates error messages.


The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to track and store option-related values. When a shell script starts, the value of OPTIND is 1. Each time getopts is called and locates an argument, it increments OPTIND to the index of the next option to be processed. If the option takes an argument, bash assigns the value of the argument to OPTARG.

To indicate that an option takes an argument, follow the corresponding letter in optstring with a colon (:). The option string dxo:lt:r indicates that getopts should search for -d, -x, -o, -l, -t, and -r options and that the -o and -t options take arguments.

Using getopts as the test-command in a while control structure allows you to loop over the options one at a time. The getopts builtin checks the option list for options that are in optstring. Each time through the loop, getopts stores the option letter it finds in varname.

Suppose that you want to write a program that can take three options:

1. A –b option indicates that the program should ignore whitespace at the

start of input lines.

2. A –t option followed by the name of a directory indicates that the program

should store temporary files in that directory. Otherwise, it should use

/tmp.

3. A –u option indicates that the program should translate all output to

uppercase.

In addition, the program should ignore all other options and end option processing

when it encounters two hyphens (––).

The problem is to write the portion of the program that determines which options the user has supplied. The following solution does not use getopts:

SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while [[ "$1" = -* ]]           # [[ = ]] does pattern match
do
    case $1 in
        -b) SKIPBLANKS=TRUE ;;
        -t) if [ -d "$2" ]
                then
                    TMPDIR=$2
                    shift
                else
                    echo "$0: -t takes a directory argument." >&2
                    exit 1
            fi ;;
        -u) CASE=upper ;;
        --) break ;;            # Stop processing options
        *)  echo "$0: Invalid option $1 ignored." >&2 ;;
    esac
    shift
done


This program fragment uses a loop to check and shift arguments while the argument is not --. As long as the argument is not two hyphens, the program continues to loop through a case statement that checks for possible options. The -- case label breaks out of the while loop. The * case label recognizes any option; it appears as the last case label to catch any unknown options, displays an error message, and allows processing to continue. On each pass through the loop, the program uses shift to access the next argument. If an option takes an argument, the program uses an extra shift to get past that argument.

The following program fragment processes the same options, but uses getopts:

SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while getopts :bt:u arg
do
    case $arg in
        b)  SKIPBLANKS=TRUE ;;
        t)  if [ -d "$OPTARG" ]
                then
                    TMPDIR=$OPTARG
                else
                    echo "$0: $OPTARG is not a directory." >&2
                    exit 1
            fi ;;
        u)  CASE=upper ;;
        :)  echo "$0: Must supply an argument to -$OPTARG." >&2
            exit 1 ;;
        \?) echo "Invalid option -$OPTARG ignored." >&2 ;;
    esac
done

In this version of the code, the while structure evaluates the getopts builtin each time control transfers to the top of the loop. The getopts builtin uses the OPTIND variable to keep track of the index of the argument it is to process the next time it is called. There is no need to call shift in this example.

In the getopts version of the script, the case patterns do not start with a hyphen because the value of arg is just the option letter (getopts strips off the hyphen). Also, getopts recognizes -- as the end of the options, so you do not have to specify it explicitly, as in the case statement in the first example.

Because you tell getopts which options are valid and which require arguments, it can detect errors in the command line and handle them in two ways. This example uses a leading colon in optstring to specify that you check for and handle errors in your code; when getopts finds an invalid option, it sets varname to ? and OPTARG to the option letter. When it finds an option that is missing an argument, getopts sets varname to : and OPTARG to the option lacking an argument.

The \? case pattern specifies the action to take when getopts detects an invalid option. The : case pattern specifies the action to take when getopts detects a missing option argument. In both cases getopts does not write any error message but rather leaves that task to you.

If you omit the leading colon from optstring, both an invalid option and a missing option argument cause varname to be assigned the string ?. OPTARG is not set and getopts writes its own diagnostic message to standard error. Generally this method is less desirable because you have less control over what the user sees when an error occurs.

Using getopts will not necessarily make your programs shorter. Its principal advantages are that it provides a uniform programming interface and that it enforces standard option handling.
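After the getopts loop finishes, OPTIND holds the index of the first argument that is not an option, so a script typically discards the options with shift before processing the remaining arguments. The following sketch wraps the loop in a function; the names parse_opts, TMPDIR_OPT, and REMAINING are hypothetical, and OPTIND is declared local so the function can be called more than once.

```bash
# Hypothetical wrapper around the getopts loop for the -b, -t dir, and -u options.
parse_opts() {
    local OPTIND=1 opt          # start a fresh scan on every call
    SKIPBLANKS= TMPDIR_OPT=/tmp CASE=lower
    while getopts :bt:u opt
    do
        case $opt in
            b)  SKIPBLANKS=TRUE ;;
            t)  TMPDIR_OPT=$OPTARG ;;
            u)  CASE=upper ;;
            :)  echo "Must supply an argument to -$OPTARG." >&2
                return 1 ;;
            \?) echo "Invalid option -$OPTARG ignored." >&2 ;;
        esac
    done
    shift $((OPTIND - 1))       # drop the options (and --) that getopts consumed
    REMAINING=("$@")            # whatever is left is the list of operands
}

parse_opts -b -t /var/tmp -u -- file1 file2
echo "$SKIPBLANKS $TMPDIR_OPT $CASE ${REMAINING[*]}"
```

The sketch stores the -t argument without testing that it is a directory; a real script would keep the [ -d "$OPTARG" ] test from the fragment above.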

A Partial List of Builtins

Table 10-6 lists some of the bash builtins. You can use type (page 447) to see if a command runs a builtin. See “Listing bash builtins” on page 141 for instructions on how to display complete lists of builtins.

Table 10-6 bash builtins

Builtin     Function
:           Returns 0 or true (the null builtin; page 455)
. (dot)     Executes a shell script as part of the current process (page 274)
bg          Puts a suspended job in the background (page 287)
break       Exits from a looping control structure (page 420)
cd          Changes to another working directory (page 87)
continue    Starts with the next iteration of a looping control structure (page 420)
echo        Displays its arguments (page 57)
eval        Scans and evaluates the command line (page 329)
exec        Executes a shell script or program in place of the current process (page 450)
exit        Exits from the current shell (usually the same as CONTROL-D from an interactive shell; page 440)
export      Places the value of a variable in the calling environment (makes it global; page 436)
fg          Brings a job from the background into the foreground (page 286)
getopts     Parses arguments to a shell script (page 456)
jobs        Displays a list of background jobs (page 285)
kill        Sends a signal to a process or job (page 729)
pwd         Displays the name of the working directory (page 82)


Expressions

An expression comprises constants, variables, and operators that the shell can process to return a value. This section covers arithmetic, logical, and conditional expressions as well as operators. Table 10-8 on page 463 lists the bash operators.

Arithmetic Evaluation

The Bourne Again Shell can perform arithmetic assignments and evaluate many different types of arithmetic expressions, all using integers. The shell performs arithmetic assignments in a number of ways. One is with arguments to the let builtin:

$ let "VALUE=VALUE * 10 + NEW"

In the preceding example, the variables VALUE and NEW contain integer values. Within a let statement you do not need to use dollar signs ($) in front of variable names. Double quotation marks must enclose a single argument, or expression, that contains SPACEs. Because most expressions contain SPACEs and need to be quoted, bash accepts ((expression)) as a synonym for let "expression", obviating the need for both quotation marks and dollar signs:

$ ((VALUE=VALUE * 10 + NEW))

You can use either form wherever a command is allowed and can remove the SPACEs

if you like. In the following example, the asterisk (*) does not need to be quoted

because the shell does not perform pathname expansion on the right side of an

assignment (page 293):

$ let VALUE=VALUE*10+NEW

Table 10-6 bash builtins (continued)

Builtin     Function
read        Reads a line from standard input (page 447)
readonly    Declares a variable to be readonly (page 295)
set         Sets shell flags or command-line argument variables; with no argument, lists all variables (pages 331, 366, and 442)
shift       Promotes each command-line argument (page 442)
test        Compares arguments (pages 399 and 854)
times       Displays total times for the current shell and its children
trap        Traps a signal (page 453)
type        Displays how each argument would be interpreted as a command (page 447)
umask       Returns the value of the file-creation mask (page 870)
unset       Removes a variable or function (page 294)
wait        Waits for a background process to terminate (page 391)


Because each argument to let is evaluated as a separate expression, you can assign values to more than one variable on a single line:

$ let "COUNT = COUNT + 1" VALUE=VALUE*10+NEW

You need to use commas to separate multiple assignments within a set of double

parentheses:

$ ((COUNT = COUNT + 1, VALUE=VALUE*10+NEW))
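The following sketch runs both forms with concrete starting values (the values 4, 2, and 0 are chosen only for illustration) and then uses ((expression)) as the test in an if statement, relying on the status it returns.

```bash
VALUE=4 NEW=2 COUNT=0
let "VALUE = VALUE * 10 + NEW"                      # VALUE is now 42
((COUNT = COUNT + 1, VALUE = VALUE * 10 + NEW))     # COUNT is 1, VALUE is 422
if ((VALUE > 400)); then
    size=large
else
    size=small
fi
echo "$COUNT $VALUE $size"
```

Both let and ((expression)) return a status of 1 (false) when the last expression evaluates to 0, which matters if the script tests their exit status.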

Logical expressions
You can use the ((expression)) syntax for logical expressions, although that task is frequently left to [[expression]]. The next example expands the age_check script (page 338) to include logical arithmetic evaluation in addition to arithmetic expansion:

$ cat age2
#!/bin/bash
echo -n "How old are you? "
read age
if ((30 < age && age < 60)); then
        echo "Wow, in $((60-age)) years, you'll be 60!"
    else
        echo "You are too young or too old to play."
fi
$ ./age2
How old are you? 25
You are too young or too old to play.

The test-statement for the if structure evaluates two logical comparisons joined by a

Boolean AND and returns 0 (true) if they are both true or 1 (false) otherwise.

Logical Evaluation (Conditional Expressions)

The syntax of a conditional expression is

[[ expression ]]

where expression is a Boolean (logical) expression. You must precede a variable name with a dollar sign ($) within expression. The result of executing this builtin, as with the test builtin, is a return status. The conditions allowed within the brackets are almost a superset of those accepted by test (page 854). Where the test builtin uses -a as a Boolean AND operator, [[ expression ]] uses &&. Similarly, where test uses -o as a Boolean OR operator, [[ expression ]] uses ||.

Arithmetic evaluation versus arithmetic expansion

tip
Arithmetic evaluation differs from arithmetic expansion. As explained on page 338, arithmetic expansion uses the syntax $((expression)), evaluates expression, and replaces $((expression)) with the result. You can use arithmetic expansion to display the value of an expression or to assign that value to a variable.

Arithmetic evaluation uses the let expression or ((expression)) syntax, evaluates expression, and returns a status code. You can use arithmetic evaluation to perform a logical comparison or an assignment.


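To see how conditional expressions work, replace the line that tests age in the age2 script with the following conditional expression. You must surround the [[ and ]] tokens with whitespace or a command terminator, and place dollar signs before the variables:

if [[ 30 < $age && $age < 60 ]]; then

Within [[ ]], the < and > operators compare strings for order rather than comparing numbers, so this form gives the expected result only when the operands have the same number of digits (for example, it treats 9 as greater than 30). For numeric comparisons, use test’s relational operators -gt, -ge, -lt, -le, -eq, and -ne:

if [[ 30 -lt $age && $age -lt 60 ]]; then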
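The difference between the two forms shows up when the operands have different numbers of digits. In this sketch (the value 9 is chosen only for illustration), the same comparison is made three ways: as a string comparison, as a numeric comparison inside [[ ]], and as an arithmetic evaluation.

```bash
age=9
[[ $age < 30 ]]   && string_cmp=true  || string_cmp=false   # compares "9" with "30" as strings
[[ $age -lt 30 ]] && numeric_cmp=true || numeric_cmp=false  # compares 9 with 30 as integers
(( age < 30 ))    && arith_cmp=true   || arith_cmp=false    # arithmetic evaluation
echo "$string_cmp $numeric_cmp $arith_cmp"
```

The string comparison reports false because the character 9 sorts after the character 3; the two numeric forms report true.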

String comparisons
The test builtin tests whether strings are equal. The [[ expression ]] syntax adds comparison tests for string operators. The > and < operators compare strings for order (for example, "aa" < "bbb"). The = operator tests for pattern match, not just equality: [[ string = pattern ]] is true if string matches pattern. This operator is not symmetrical; the pattern must appear on the right side of the equal sign. For example, [[ artist = a* ]] is true (= 0), whereas [[ a* = artist ]] is false (= 1):

$ [[ artist = a* ]]
$ echo $?
0
$ [[ a* = artist ]]
$ echo $?
1
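Quoting the right side of the = operator turns off pattern matching, so the operand is compared as a literal string. The following sketch, with a variable name chosen for illustration, contrasts the unquoted and quoted forms:

```bash
name=artist
[[ $name = a* ]]   && unquoted=match || unquoted=nomatch   # a* is a pattern
[[ $name = "a*" ]] && quoted=match   || quoted=nomatch     # "a*" is the literal string a*
[[ "a*" = "a*" ]]  && literal=match  || literal=nomatch    # literal string equals itself
echo "$unquoted $quoted $literal"
```

This is useful when the right operand comes from a variable whose value may contain *, ?, or [ characters that should not act as wildcards.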

The next example uses a command list that starts with a compound condition. The condition tests that the directory bin and the file src/myscript.bash exist. If this is true, cp copies src/myscript.bash to bin/myscript. If the copy succeeds, chmod makes myscript executable. If any of these steps fails, echo displays a message.

$ [[ -d bin && -f src/myscript.bash ]] && cp src/myscript.bash \

bin/myscript && chmod +x bin/myscript || echo "Cannot make \

executable version of myscript"

String Pattern Matching

The Bourne Again Shell provides string pattern-matching operators that can manipulate pathnames and other strings. These operators can delete from strings prefixes or suffixes that match patterns. Table 10-7 lists the four operators.

Table 10-7 String operators

Operator    Function
#           Removes minimal matching prefixes
##          Removes maximal matching prefixes
%           Removes minimal matching suffixes
%%          Removes maximal matching suffixes

The syntax for these operators is

${varname op pattern}

where op is one of the operators listed in Table 10-7 and pattern is a match pattern similar to that used for filename generation. These operators are commonly used to manipulate pathnames to extract or remove components or to change suffixes:

$ SOURCEFILE=/usr/local/src/prog.c

$ echo ${SOURCEFILE#/*/}

local/src/prog.c

$ echo ${SOURCEFILE##/*/}

prog.c

$ echo ${SOURCEFILE%/*}

/usr/local/src

$ echo ${SOURCEFILE%%/*}

$ echo ${SOURCEFILE%.c}

/usr/local/src/prog

$ CHOPFIRST=${SOURCEFILE#/*/}

$ echo $CHOPFIRST

local/src/prog.c

$ NEXT=${CHOPFIRST%%/*}

$ echo $NEXT

local

Here the string-length operator, ${#name}, is replaced by the number of characters in the value of name:

$ echo $SOURCEFILE
/usr/local/src/prog.c
$ echo ${#SOURCEFILE}
21
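As a further sketch, the four operators can split a pathname into its directory, filename, stem, and final extension. The pathname and the variable names here are made up for illustration.

```bash
path=/usr/local/src/prog.tar.gz
dir=${path%/*}        # shortest suffix matching /* removed: /usr/local/src
file=${path##*/}      # longest prefix matching */ removed: prog.tar.gz
stem=${file%%.*}      # longest suffix matching .* removed: prog
ext=${file##*.}       # longest prefix matching *. removed: gz
echo "$dir $file $stem $ext (${#file} characters in file)"
```

These expansions are performed by the shell itself, so they avoid starting the dirname and basename utilities for each pathname.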

Operators

Arithmetic expansion and arithmetic evaluation in bash use the same syntax, precedence, and associativity of expressions as in the C language. Table 10-8 lists operators in order of decreasing precedence (priority of evaluation); each group of operators has equal precedence. Within an expression you can use parentheses to change the order of evaluation.
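A few arithmetic expansions illustrate how precedence and parentheses affect a result. The expressions and variable names are chosen only for illustration:

```bash
a=$((2 + 3 * 4))        # multiplication binds tighter than addition: 14
b=$(( (2 + 3) * 4 ))    # parentheses force the addition first: 20
c=$((2 ** 3 ** 2))      # exponentiation groups right to left: 2 ** 9 = 512
d=$((1 + 2 << 3))       # addition before shift: 3 << 3 = 24
e=$((7 & 3 == 3))       # == before &: 7 & 1 = 1
echo "$a $b $c $d $e"
```

The last two lines show why explicit parentheses are worth the extra typing: (7 & 3) == 3 and 1 + (2 << 3) give different results from the unparenthesized forms.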

Table 10-8 Operators

Type of operator/operator            Function
Post
  var++                              Postincrement
  var--                              Postdecrement
Pre
  ++var                              Preincrement
  --var                              Predecrement
Unary
  -                                  Unary minus
  +                                  Unary plus
Negation
  !                                  Boolean NOT (logical negation)
  ~                                  Complement (bitwise negation)
Exponentiation
  **                                 Exponent
Multiplication, division, remainder
  *                                  Multiplication
  /                                  Division
  %                                  Remainder
Addition, subtraction
  -                                  Subtraction
  +                                  Addition
Bitwise shifts
  <<                                 Left bitwise shift
  >>                                 Right bitwise shift
Comparison
  <=                                 Less than or equal
  >=                                 Greater than or equal
  <                                  Less than
  >                                  Greater than
Equality, inequality
  ==                                 Equality
  !=                                 Inequality
Bitwise
  &                                  Bitwise AND
  ^                                  Bitwise XOR (exclusive OR)
  |                                  Bitwise OR


Pipe The pipe token has higher precedence than operators. You can use pipes anywhere

in a command that you can use simple commands. For example, the command line

$ cmd1 | cmd2 || cmd3 | cmd4 && cmd5 | cmd6

is interpreted as if you had typed

$ ((cmd1 | cmd2) || (cmd3 | cmd4)) && (cmd5 | cmd6)

Increment and decrement
The postincrement, postdecrement, preincrement, and predecrement operators work with variables. The pre- operators, which appear in front of the variable name (as in ++COUNT and --VALUE), first change the value of the variable (++ adds 1; -- subtracts 1) and then provide the result for use in the expression. The post- operators appear after the variable name (as in COUNT++ and VALUE--); they first provide the unchanged value of the variable for use in the expression and then change the value of the variable.

$ N=10
$ echo $N
10
$ echo $((--N+3))
12
$ echo $N
9
$ echo $((N++ - 3))
6
$ echo $N
10
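The same behavior can be traced step by step by capturing the value each operator provides. In this sketch the variable names are arbitrary, and each comment gives the state after the line runs:

```bash
i=5
a=$((i++))      # a gets the old value 5; i becomes 6
b=$((++i))      # i becomes 7 first; b gets 7
c=$((i--))      # c gets 7; i becomes 6
d=$((--i))      # i becomes 5 first; d gets 5
echo "$a $b $c $d $i"
```

The post- forms are the ones to watch in assignments, because the value assigned differs from the value the variable holds afterward.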

Table 10-8 Operators (continued)

Type of operator/operator            Function
Boolean (logical)
  &&                                 Boolean AND
  ||                                 Boolean OR
Conditional evaluation
  ? :                                Ternary operator
Assignment
  =, *=, /=, %=, +=, -=,
  <<=, >>=, &=, ^=, |=               Assignment
Comma
  ,                                  Comma

Do not rely on rules of precedence: use parentheses

tip
Do not rely on the precedence rules when you use compound commands. Instead, use parentheses to explicitly state the order in which you want the shell to interpret the commands.


Remainder The remainder operator (%) yields the remainder when its first operand is divided

by its second. For example, the expression $((15%7)) has the value 1.

Boolean The result of a Boolean operation is either 0 (false) or 1 (true).

The && (AND) and || (OR) Boolean operators are called short-circuiting operators. If the result of using one of these operators can be decided by looking only at the left operand, the right operand is not evaluated. The && operator causes the shell to test the exit status of the command preceding it. If the command succeeded, bash executes the next command; otherwise, it skips the remaining commands on the command line. You can use this construct to execute commands conditionally.

$ mkdir bkup && cp -r src bkup

This compound command creates the directory bkup. If mkdir succeeds, the contents of directory src are copied recursively to bkup.

The || separator also causes bash to test the exit status of the first command but has the opposite effect: The remaining command(s) are executed only if the first one failed (that is, exited with nonzero status).

$ mkdir bkup || echo "mkdir of bkup failed" >> /tmp/log

The exit status of a command list is the exit status of the last command in the list.

You can group lists with parentheses. For example, you could combine the previous

two examples as

$ (mkdir bkup && cp -r src bkup) || echo "mkdir failed" >> /tmp/log

In the absence of parentheses, && and || have equal precedence and are grouped from left to right. The following examples use the true and false utilities. These utilities do nothing and return true (0) and false (1) exit statuses, respectively:

$ false; echo $?
1

The $? variable holds the exit status of the preceding command (page 440). The next two commands yield an exit status of 1 (false):

$ true || false && false
$ echo $?
1
$ (true || false) && false
$ echo $?
1

Similarly the next two commands yield an exit status of 0 (true):

$ false && false || true
$ echo $?
0
$ (false && false) || true
$ echo $?
0


Because || and && have equal precedence, the parentheses in the two preceding

pairs of examples do not change the order of operations.
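One practical consequence of left-to-right grouping is that a list of the form A && B || C is not equivalent to an if...then...else structure: C runs whenever the list to its left fails, including when A succeeds and B fails. The following sketch uses true and false as stand-ins for real commands:

```bash
# A && B || C is not if-then-else: C also runs when A succeeds but B fails.
list_result=$(true && false || echo "fallback ran")

# An explicit if runs the else branch only when the condition itself fails.
if_result=$(if true; then false; else echo "fallback ran"; fi) || :
echo "list: '$list_result'  if: '$if_result'"
```

The trailing || : only discards the nonzero status that the if structure returns here; it is not part of the comparison.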

Because the expression on the right side of a short-circuiting operator may never be

executed, you must be careful when placing assignment statements in that location.

The following example demonstrates what can happen:

$ ((N=10,Z=0))
$ echo $((N || ((Z+=1)) ))
1
$ echo $Z
0

Because the value of N is nonzero, the result of the || (OR) operation is 1 (true), no matter what the value of the right side is. As a consequence, ((Z+=1)) is never evaluated and Z is not incremented.

Ternary The ternary operator, ? :, decides which of two expressions should be evaluated,

based on the value returned by a third expression:

expression1 ? expression2 : expression3

If expression1 produces a false (0) value, expression3 is evaluated; otherwise,

expression2 is evaluated. The value of the entire expression is the value of

expression2 or expression3, depending on which is evaluated. If expression1 is true,

expression3 is not evaluated. If expression1 is false, expression2 is not evaluated.

$ ((N=10,Z=0,COUNT=1))
$ ((T=N>COUNT?++Z:--Z))
$ echo $T
1
$ echo $Z
1
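Two common uses of the ternary operator are selecting the larger of two values and computing an absolute difference. The variable names and values in this sketch are chosen for illustration:

```bash
a=7 b=12
((max = a > b ? a : b))          # picks b because a > b is false
diff=$((a > b ? a - b : b - a))  # absolute difference of a and b
echo "$max $diff"
```

Only the selected branch is evaluated, so side effects such as ++ in the other branch do not occur.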

Assignment
The assignment operators, such as +=, are shorthand notations. For example, ((N+=3)) is the same as ((N=N+3)).

Other bases
The following commands use the syntax base#n to assign base 2 (binary) values. First v1 is assigned a value of 0101 (5 decimal) and then v2 is assigned a value of 0110 (6 decimal). The echo utility verifies the decimal values.

$ ((v1=2#0101))

$ ((v2=2#0110))

$ echo "$v1 and $v2"

5 and 6

Next the bitwise AND operator (&) selects the bits that are on in both 5 (0101

binary) and 6 (0110 binary). The result is binary 0100, which is 4 decimal.

$ echo $(( v1 & v2 ))
4


The Boolean AND operator (&&) produces a result of 1 if both of its operands are nonzero and a result of 0 otherwise. The bitwise inclusive OR operator (|) selects the bits that are on in either 0101 or 0110, resulting in 0111, which is 7 decimal. The Boolean OR operator (||) produces a result of 1 if either of its operands is nonzero and a result of 0 otherwise.

$ echo $(( v1 && v2 ))
1
$ echo $(( v1 | v2 ))
7
$ echo $(( v1 || v2 ))
1

Next the bitwise exclusive OR operator (^) selects the bits that are on in either, but

not both, of the operands 0101 and 0110, yielding 0011, which is 3 decimal. The

Boolean NOT operator (!) produces a result of 1 if its operand is 0 and a result of 0

otherwise. Because the exclamation point in $(( ! v1 )) is enclosed within double

parentheses, it does not need to be escaped to prevent the shell from interpreting the

exclamation point as a history event. The comparison operators produce a result of

1 if the comparison is true and a result of 0 otherwise.

$ echo $(( v1 ^ v2 ))
3
$ echo $(( ! v1 ))
0
$ echo $(( v1 < v2 ))
1
$ echo $(( v1 > v2 ))
0
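The base#n syntax works with other bases as well, and the bitwise operators are commonly used to test and set individual flag bits. In the following sketch the flag names READ, WRITE, and EXEC are hypothetical:

```bash
hex=$((16#ff))          # 255
octal=$((8#17))         # 15
binary=$((2#1010))      # 10

READ=$((2#100)) WRITE=$((2#010)) EXEC=$((2#001))   # hypothetical flag bits
perms=$((READ | EXEC))                             # 101 binary, 5 decimal
if ((perms & WRITE)); then writable=yes; else writable=no; fi
perms=$((perms | WRITE))                           # turn the WRITE bit on: 7
echo "$hex $octal $binary $writable $perms"
```

The test ((perms & WRITE)) succeeds only when the WRITE bit is on, because the bitwise AND is nonzero only in that case.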

Shell Programs

The Bourne Again Shell has many features that make it a good programming language. The structures that bash provides are not a random assortment, but rather have been chosen to provide most of the structural features that are found in other procedural languages, such as C or Perl. A procedural language provides the following abilities:

• Declare, assign, and manipulate variables and constant data. The Bourne

Again Shell provides string variables, together with powerful string opera-

tors, and integer variables, along with a complete set of arithmetic operators.

• Break large problems into small ones by creating subprograms. The Bourne

Again Shell allows you to create functions and call scripts from other

scripts. Shell functions can be called recursively; that is, a Bourne Again

Shell function can call itself. You may not need to use recursion often, but it

may allow you to solve some apparently difficult problems with ease.

• Execute statements conditionally, using statements such as if.

• Execute statements iteratively, using statements such as while and for.


• Transfer data to and from the program, communicating with both data

files and users.

Programming languages implement these capabilities in different ways but with the

same ideas in mind. When you want to solve a problem by writing a program, you

must first figure out a procedure that leads you to a solution—that is, an algorithm.

Typically you can implement the same algorithm in roughly the same way in differ-

ent programming languages, using the same kinds of constructs in each language.

Chapter 8 and this chapter have introduced numerous bash features, many of which are useful for both interactive use and shell programming. This section develops two complete shell programs, demonstrating how to combine some of these features effectively. The programs are presented as problems for you to solve, with sample solutions provided.

A Recursive Shell Script

A recursive construct is one that is defined in terms of itself. Alternatively, you

might say that a recursive program is one that can call itself. This concept may seem

circular, but it need not be. To avoid circularity, a recursive definition must have a

special case that is not self-referential. Recursive ideas occur in everyday life. For

example, you can define an ancestor as your mother, your father, or one of their

ancestors. This definition is not circular; it specifies unambiguously who your

ancestors are: your mother or your father, or your mother’s mother or father or

your father’s mother or father, and so on.

A number of Linux system utilities can operate recursively. See the –R option to the chmod (page 626) and chown (page 631) utilities, and to the utility described on page 640, for examples.

Solve the following problem by using a recursive shell function:

Write a shell function named makepath that, given a pathname, creates all components in that pathname as directories. For example, the command makepath a/b/c/d should create directories a, a/b, a/b/c, and a/b/c/d. (The mkdir –p option creates directories in this manner. Solve the problem without using mkdir –p.)

One algorithm for a recursive solution follows:

1. Examine the path argument. If it is a null string or if it names an existing directory, do nothing and return.

2. If the path argument is a simple path component, create it (using mkdir) and return.

3. Otherwise, call makepath using the path prefix of the original argument. This step eventually creates all the directories up to the last component, which you can then create using mkdir.

In general, a recursive function must invoke itself with a simpler version of the problem than it was given until it is finally called with a simple case that does not need to call itself. Following is one possible solution based on this algorithm:


# This is a function
# Enter it at the keyboard, do not run it as a shell script
#
function makepath()
{
    if [[ ${#1} -eq 0 || -d "$1" ]]
        then
            return 0        # Do nothing
    fi
    if [[ "${1%/*}" = "$1" ]]
        then
            mkdir $1
            return $?
    fi
    makepath ${1%/*} || return 1
    mkdir $1
    return $?
}

In the test for a simple component (the if statement in the middle of the function), the

left expression is the argument after the shortest suffix that starts with a / character

has been stripped away (page 462). If there is no such character (for example, if $1 is

max), nothing is stripped off and the two sides are equal. If the argument is a simple

filename preceded by a slash, such as /usr, the expression ${1%/*} evaluates to a null

string. To make the function work in this case, you must take two precautions: Put

the left expression within quotation marks and ensure that the recursive function

behaves sensibly when it is passed a null string as an argument. In general, good pro-

grams are robust: They should be prepared for borderline, invalid, or meaningless

input and behave appropriately in such cases.
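The three cases just described can be checked directly. The following is a small sketch, not part of makepath itself; prefix_of is a hypothetical helper that displays what the ${1%/*} expansion yields for each kind of argument.

```bash
# prefix_of is a hypothetical helper used only for this demonstration.
function prefix_of ()
{
    printf '%s\n' "${1%/*}"     # strip the shortest suffix that matches /*
}

for path in a/b/c max /usr; do
    echo "$path -> [$(prefix_of "$path")]"
done
# Displays:
#   a/b/c -> [a/b]
#   max -> [max]
#   /usr -> []
```

The empty brackets on the last line show the null string that makepath must be prepared to receive.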

By giving the following command from the shell you are working in, you turn on

debugging tracing so that you can watch the recursion work:

$ set -o xtrace

(Give the same command, but replace the hyphen with a plus sign (+) to turn debug-

ging off.) With debugging turned on, the shell displays each line in its expanded

form as it executes the line. A + precedes each line of debugging output.

In the following example, the first line that starts with + shows the shell calling makepath. The makepath function is initially called from the command line with arguments of a/b/c. It then calls itself with arguments of a/b and finally a. All the work is done (using mkdir) as each call to makepath returns.

$ makepath a/b/c

+ makepath a/b/c

+ [[ 5 -eq 0 ]]

+ [[ -d a/b/c ]]

+ [[ a/b = \a\/\b\/\c ]]

+ makepath a/b

+ [[ 3 -eq 0 ]]

+ [[ -d a/b ]]

+ [[ a = \a\/\b ]]


+ makepath a

+ [[ 1 -eq 0 ]]

+ [[ -d a ]]

+ [[ a = \a ]]

+ mkdir a

+ return 0

+ mkdir a/b

+ return 0

+ mkdir a/b/c

+ return 0

The function works its way down the recursive path and back up again.

It is instructive to invoke makepath with an invalid path and see what happens. The

following example, which is run with debugging turned on, tries to create the path

/a/b. Creating this path requires that you create directory a in the root directory.

Unless you have permission to write to the root directory, you are not permitted to

create this directory.

$ makepath /a/b

+ makepath /a/b

+ [[ 4 -eq 0 ]]

+ [[ -d /a/b ]]

+ [[ /a = \/\a\/\b ]]

+ makepath /a

+ [[ 2 -eq 0 ]]

+ [[ -d /a ]]

+ [[ '' = \/\a ]]

+ makepath

+ [[ 0 -eq 0 ]]

+ return 0

+ mkdir /a

mkdir: cannot create directory '/a': Permission denied

+ return 1

The recursion stops when makepath is denied permission to create the /a directory.

The error returned is passed all the way back, so the original makepath exits with

nonzero status.

Use local variables with recursive functions

tip The preceding example glossed over a potential problem that you may encounter when you use a

recursive function. During the execution of a recursive function, many separate instances of that

function may be active simultaneously. All but one of them are waiting for their child invocation to

complete.

Because functions run in the same environment as the shell that calls them, variables are implicitly

shared by a shell and a function it calls. As a consequence, all instances of the function share a

single copy of each variable. Sharing variables can give rise to side effects that are rarely what you

want. As a rule, you should use typeset to make all variables of a recursive function be local variables. See page 437 for more information.
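The following sketch illustrates the side effect this tip warns about. It is not from the original text: the functions fact_shared and fact_local and the variable result are hypothetical, and the two functions differ only in whether n is declared with typeset.

```bash
# Both functions leave n! in the global variable result.

function fact_shared ()
{
    n=$1                            # n is shared by every active instance
    if (( n <= 1 )); then
        result=1
        return
    fi
    fact_shared $(( n - 1 ))        # the nested calls overwrite n
    result=$(( result * n ))        # n is 1 by the time this line runs
}

function fact_local ()
{
    typeset n=$1                    # each instance has its own n
    if (( n <= 1 )); then
        result=1
        return
    fi
    fact_local $(( n - 1 ))
    result=$(( result * n ))
}

fact_shared 5
echo "shared variable: $result"     # displays 1, which is wrong
fact_local 5
echo "local variable:  $result"     # displays 120
```

The makepath function avoids this problem only because it uses no variables other than its positional parameters, which are always local to each call.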


The quiz Shell Script

Solve the following problem using a bash script:

Write a generic multiple-choice quiz program. The program should get its questions from data files, present them to the user, and keep track of the number of correct and incorrect answers. The user must be able to exit from the program at any time and receive a summary of results to that point.

The detailed design of this program and even the detailed description of the problem depend on a number of choices: How will the program know which subjects are available for quizzes? How will the user choose a subject? How will the program know when the quiz is over? Should the program present the same questions (for a given subject) in the same order each time, or should it scramble them?

Of course, you can make many perfectly good choices that implement the specification of the problem. The following details narrow the problem specification:

• Each subject will correspond to a subdirectory of a master quiz directory. This directory will be named in the environment variable QUIZDIR, whose default will be ~/quiz. For example, you could have the following directories correspond to the subjects engineering, art, and politics: ~/quiz/engineering, ~/quiz/art, and ~/quiz/politics. Put the quiz directory in /usr/games if you want all users to have access to it (requires root privileges).

• Each subject can have several questions. Each question is represented by a file in its subject’s directory.

• The first line of each file that represents a question holds the text of the question. If it takes more than one line, you must escape the NEWLINE with a backslash. (This setup makes it easy to read a single question with the read builtin.) The second line of the file is an integer that specifies the number of choices. The next lines are the choices themselves. The last line is the correct answer. Following is a sample question file:

Who discovered the principle of the lever?
4
Euclid
Archimedes
Thomas Edison
The Lever Brothers
Archimedes

• The program presents all the questions in a subject directory. At any point the user can interrupt the quiz with CONTROL-C, whereupon the program will summarize the results so far and exit. If the user does not interrupt the program, the program summarizes the results and exits when it has asked all questions for the chosen subject.

• The program scrambles the questions in a subject before presenting them.
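Because the quiz depends on every question file following the layout described above, it can be useful to check a file before adding it to a subject directory. The following function is a sketch under that assumption and is not part of the quiz program; the name check_question and its return codes are hypothetical.

```bash
# check_question FILE (hypothetical helper)
# Returns 0 if FILE follows the question-file layout and the answer on its
# last line matches one of the choices, 1 if the answer matches no choice,
# and 2 if the file is unreadable or too short.
function check_question ()
{
    typeset ques num_opts line correct
    typeset -a choices=()
    typeset -i i

    [[ -r $1 ]] || return 2
    {
        read ques || return 2                   # line 1: the question
        read num_opts || return 2               # line 2: number of choices
        [[ $num_opts =~ ^[0-9]+$ ]] || return 2
        for (( i = 0; i < num_opts; i++ )); do
            read line || return 2               # one choice per line
            choices+=("$line")
        done
        read correct || return 2                # last line: correct answer
    } < "$1"

    for line in "${choices[@]}"; do
        [[ $line == "$correct" ]] && return 0
    done
    return 1
}
```

You might call it as check_question ~/quiz/engineering/lever, where lever is a hypothetical filename, and then examine $? to see whether the file is usable.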


Following is a top-level design for this program:

1. Initialize. This involves a number of steps, such as setting the counts of the

number of questions asked so far and the number of correct and wrong

answers to zero. It also sets up the program to trap CONTROL-C.

2. Present the user with a choice of subjects and get the user’s response.

3. Change to the corresponding subject directory.

4. Determine the questions to be asked (that is, the filenames in that direc-

tory). Arrange them in random order.

5. Repeatedly present questions and ask for answers until the quiz is over or

is interrupted by the user.

6. Present the results and exit.

Clearly some of these steps (such as step 3) are simple, whereas others (such as step 4) are complex and worthy of analysis on their own. Use shell functions for any complex step, and use the trap builtin to handle a user interrupt.

Here is a skeleton version of the program with empty shell functions:

function initialize

{

# Initializes variables.

}

function choose_subj

{

# Writes choice to standard output.

}

function scramble

{

# Stores names of question files, scrambled,

# in an array variable named questions.

}

function ask

{

# Reads a question file, asks the question, and checks the

# answer. Returns 1 if the answer was correct, 0 otherwise. If it

# encounters an invalid question file, exit with status 2.

}

function summarize

{

# Presents the user's score.

}

# Main program
initialize                          # Step 1 in top-level design

subject=$(choose_subj)              # Step 2
[[ $? -eq 0 ]] || exit 2            # If no valid choice, exit

cd $subject || exit 2               # Step 3
echo                                # Skip a line
scramble                            # Step 4

for ques in ${questions[*]}; do     # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                            # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                           # Step 6
exit 0

To make reading the results a bit easier for the user, a sleep call appears inside the question loop. It delays $QUIZDELAY seconds (default = 1) between questions.

Now the task is to fill in the missing pieces of the program. In a sense this program

is being written backward. The details (the shell functions) come first in the file but

come last in the development process. This common programming practice is called

top-down design. In top-down design you fill in the broad outline of the program

first and supply the details later. In this way you break the problem up into smaller

problems, each of which you can work on independently. Shell functions are a great

help in using the top-down approach.

One way to write the initialize function follows. The cd command causes QUIZDIR to be the working directory for the rest of the script and defaults to ~/quiz if QUIZDIR is not set.

function initialize ()

{

trap 'summarize ; exit 0' INT # Handle user interrupts

num_ques=0 # Number of questions asked so far

num_correct=0 # Number answered correctly so far

first_time=true # true until first question is asked

cd ${QUIZDIR:=~/quiz} || exit 2

}

Be prepared for the cd command to fail. The directory may be unsearchable or conceivably another user may have removed it. The preceding function exits with a status code of 2 if cd fails.
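The initialize function and the main program both rely on the := form of parameter expansion to supply a default. The following sketch contrasts it with the :- form, which substitutes a default without assigning it. The variable demo_delay is hypothetical and is used so that the example does not disturb QUIZDIR or QUIZDELAY.

```bash
unset demo_delay                            # hypothetical variable, starts unset

echo "using :-  ${demo_delay:-1}"           # displays 1; demo_delay stays unset
echo "set afterward? ${demo_delay+yes}"     # displays nothing after the ?

echo "using :=  ${demo_delay:=1}"           # displays 1 and assigns it
echo "set afterward? ${demo_delay+yes}"     # displays yes

demo_delay=5
echo "already set: ${demo_delay:=1}"        # displays 5; the default is ignored
```

Because := assigns the default, a later reference to $QUIZDIR in the quiz script sees ~/quiz even when the user never set the variable.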

The next function, choose_subj, is a bit more complicated. It displays a menu using a select statement:


function choose_subj ()
{
    subjects=($(ls))
    PS3="Choose a subject for the quiz from the preceding list: "
    select Subject in ${subjects[*]}; do
        if [[ -z "$Subject" ]]; then
            echo "No subject chosen. Bye." >&2
            exit 1
        fi
        echo $Subject
        return 0
    done
}

The function first uses an ls command and command substitution to put a list of subject directories in the subjects array. Next the select structure (page 427) presents the user with a list of subjects (the directories found by ls) and assigns the chosen directory name to the Subject variable. Finally the function writes the name of the subject directory to standard output. The main program uses command substitution to assign this value to the subject variable [subject=$(choose_subj)].
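The select structure displays its menu and prompt on standard error and reads the reply from standard input, which is why choose_subj can be called inside command substitution without the menu ending up in the subject variable. The following function is a sketch of the same pattern in a more general form; the name pick_one is hypothetical, and unlike choose_subj it takes the menu items as arguments and returns instead of exiting.

```bash
# pick_one ITEM... (hypothetical helper)
# Presents the items as a menu and writes the chosen item to standard output.
function pick_one ()
{
    typeset choice
    PS3="Choose an item from the preceding list: "
    select choice in "$@"; do
        if [[ -z $choice ]]; then           # reply was not a valid number
            echo "No item chosen. Bye." >&2
            return 1
        fi
        echo "$choice"
        return 0
    done
    return 1                                # end of input before any reply
}
```

A call such as subject=$(pick_one art engineering politics) captures only the chosen name. Quoting "$@" keeps an item that contains whitespace together as a single menu entry.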

The scramble function presents a number of difficulties. In this solution it uses an

array variable (questions) to hold the names of the questions. It scrambles the entries

in an array using the RANDOM variable (each time you reference RANDOM, it has

the value of a [random] integer between 0 and 32767):

function scramble ()

{

typeset -i index quescount

questions=($(ls))

quescount=${#questions[*]} # Number of elements

((index=quescount-1))

while [[ $index > 0 ]]; do

((target=RANDOM % index))

exchange $target $index

((index -= 1))

done

}

This function initializes the array variable questions to the list of filenames (questions) in the working directory. The variable quescount is set to the number of such files. Then the following algorithm is used: Let the variable index count down from quescount – 1 (the index of the last entry in the array variable). For each value of index, the function chooses a random value target between 0 and index – 1, inclusive. The command

((target=RANDOM % index))

produces a random value between 0 and index – 1 by taking the remainder (the % operator) when $RANDOM is divided by index. The function then exchanges the elements of questions at positions target and index. It is convenient to take care of this step in another function named exchange:


function exchange ()

{

temp_value=${questions[$1]}

questions[$1]=${questions[$2]}

questions[$2]=$temp_value

}
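Because scramble draws target from 0 through index – 1, the element at position index is always moved, so some orderings (including the original one) can never be produced. A common variant, usually called the Fisher-Yates shuffle, draws from 0 through index inclusive so that every ordering is possible. The following sketch shows that variant; the function name shuffle_items and the array items are hypothetical, and the remainder operation on RANDOM is assumed to be random enough for a quiz.

```bash
# shuffle_items (hypothetical): rearranges the global array items in place.
function shuffle_items ()
{
    typeset -i i j
    typeset tmp
    for (( i = ${#items[@]} - 1; i > 0; i-- )); do
        j=$(( RANDOM % (i + 1) ))       # 0 <= j <= i, so j may equal i
        tmp=${items[i]}
        items[i]=${items[j]}
        items[j]=$tmp
    done
}

items=(q1 q2 q3 q4 q5)
shuffle_items
echo "${items[@]}"                      # the same five names in some order
```

The only change to the algorithm is the divisor: index + 1 instead of index.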

The ask function also uses the select structure. It reads the question file named in its argument and uses the contents of that file to present the question, accept the answer, and determine whether the answer is correct. (See the code that follows.)

The ask function uses file descriptor 3 to read successive lines from the question file, whose name was passed as an argument and is represented by $1 in the function. It reads the question into the ques variable and the number of choices into num_opts. The function constructs the variable choices by initializing it to an empty array and successively appending the next choice. Then it sets PS3 to the value of ques and uses a select structure to prompt the user with ques. The select structure places the user’s answer in answer, and the function then checks that response against the correct answer from the file.

The construction of the choices variable is done with an eye toward avoiding a potential problem. Suppose that one answer has some whitespace in it; in that case it might appear as two or more arguments in choices. To avoid this problem, make sure that choices is an array variable. The select statement does the rest of the work:

$ cat quiz
#!/bin/bash

# remove the # on the following line to turn on debugging
# set -o xtrace

#==================
function initialize ()
{
    trap 'summarize ; exit 0' INT   # Handle user interrupts
    num_ques=0                      # Number of questions asked so far
    num_correct=0                   # Number answered correctly so far
    first_time=true                 # true until first question is asked
    cd ${QUIZDIR:=~/quiz} || exit 2
}

#==================
function choose_subj ()
{
    subjects=($(ls))
    PS3="Choose a subject for the quiz from the preceding list: "
    select Subject in ${subjects[*]}; do
        if [[ -z "$Subject" ]]; then
            echo "No subject chosen. Bye." >&2
            exit 1
        fi
        echo $Subject
        return 0
    done
}

#==================
function exchange ()
{
    temp_value=${questions[$1]}
    questions[$1]=${questions[$2]}
    questions[$2]=$temp_value
}

#==================
function scramble ()
{
    typeset -i index quescount
    questions=($(ls))
    quescount=${#questions[*]}      # Number of elements
    ((index=quescount-1))
    while [[ $index > 0 ]]; do
        ((target=RANDOM % index))
        exchange $target $index
        ((index -= 1))
    done
}

#==================
function ask ()
{
    exec 3<$1
    read -u3 ques || exit 2
    read -u3 num_opts || exit 2

    index=0
    choices=()
    while (( index < num_opts )) ; do
        read -u3 next_choice || exit 2
        choices=("${choices[@]}" "$next_choice")
        ((index += 1))
    done
    read -u3 correct_answer || exit 2
    exec 3<&-

    if [[ $first_time = true ]]; then
        first_time=false
        echo -e "You may press the interrupt key at any time to quit.\n"
    fi

    PS3=$ques" "                    # Make $ques the prompt for select
                                    # and add some spaces for legibility.
    select answer in "${choices[@]}"; do
        if [[ -z "$answer" ]]; then
            echo Not a valid choice. Please choose again.
        elif [[ "$answer" = "$correct_answer" ]]; then
            echo "Correct!"
            return 1
        else
            echo "No, the answer is $correct_answer."
            return 0
        fi
    done
}

#==================
function summarize ()
{
    echo                            # Skip a line
    if (( num_ques == 0 )); then
        echo "You did not answer any questions"
        exit 0
    fi

    (( percent=num_correct*100/num_ques ))
    echo "You answered $num_correct questions correctly, out of \
$num_ques total questions."
    echo "Your score is $percent percent."
}

#==================
# Main program
initialize                          # Step 1 in top-level design

subject=$(choose_subj)              # Step 2
[[ $? -eq 0 ]] || exit 2            # If no valid choice, exit

cd $subject || exit 2               # Step 3
echo                                # Skip a line
scramble                            # Step 4

for ques in ${questions[*]}; do     # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                            # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                           # Step 6
exit 0
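The reason ask keeps the choices in an array, and expands it as "${choices[@]}", can be seen by counting the arguments each form produces. The following lines are a small sketch of that point; count_args, as_string, and as_array are hypothetical names.

```bash
# count_args (hypothetical) reports how many arguments it was given.
function count_args ()
{
    echo $#
}

as_string="Euclid Archimedes Thomas Edison"
as_array=("Euclid" "Archimedes" "Thomas Edison")

count_args $as_string               # displays 4: Thomas and Edison are split
count_args "${as_array[@]}"         # displays 3: one argument per element
count_args "${as_array[*]}"         # displays 1: all elements joined together
```

Only the quoted [@] form preserves a choice such as Thomas Edison as a single menu entry for select.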

Chapter Summary

The shell is a programming language. Programs written in this language are called

shell scripts, or simply scripts. Shell scripts provide the decision and looping control

structures present in high-level programming languages while allowing easy access

to system utilities and user programs. Shell scripts can use functions to modularize

and simplify complex tasks.

Control structures The control structures that use decisions to select alternatives are if...then,

if...then...else, and if...then...elif. The case control structure provides a multiway

branch and can be used when you want to express alternatives using a simple

pattern-matching syntax.

The looping control structures are for...in, for, until, and while. These structures

perform one or more tasks repetitively.


The break and continue control structures alter control within loops: break trans-

fers control out of a loop, and continue transfers control immediately to the top of

a loop.

The Here document allows input to a command in a shell script to come from

within the script itself.

File descriptors The Bourne Again Shell provides the ability to manipulate file descriptors. Coupled with the read and echo builtins, file descriptors allow shell scripts to have as much control over input and output as do programs written in lower-level languages.
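As a reminder of how these pieces fit together, the following sketch opens a file on descriptor 3, reads it one line at a time with read, and closes the descriptor when it is finished. The function name number_lines is hypothetical; the pattern is the one the ask function uses.

```bash
# number_lines FILE (hypothetical): displays each line of FILE with a number.
function number_lines ()
{
    typeset line
    typeset -i n=0

    [[ -r $1 ]] || return 1         # refuse a missing or unreadable file
    exec 3< "$1"                    # open the file on descriptor 3
    while read -r -u3 line; do      # standard input is left untouched
        n=$(( n + 1 ))
        echo "$n: $line"
    done
    exec 3<&-                       # close descriptor 3
}
```

Because the loop reads from descriptor 3, commands inside it can still prompt the user and read replies from standard input.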

Variables The typeset builtin assigns attributes, such as readonly, to bash variables. The Bourne Again Shell provides operators to perform pattern matching on variables, provide default values for variables, and evaluate the length of variables. This shell also supports array variables and local variables for functions and provides built-in integer arithmetic, using the let builtin and an expression syntax similar to that found in the C programming language.

Builtins Bourne Again Shell builtins include type, read, exec, trap, kill, and getopts. The type builtin displays information about a command, including its location; read allows a script to accept user input.

The exec builtin executes a command without creating a new process. The new command overlays the current process, assuming the same environment and PID number of that process. This builtin executes user programs and other Linux commands when it is not necessary to return control to the calling process.

The trap builtin catches a signal sent by Linux to the process running the script and allows you to specify actions to be taken upon receipt of one or more signals. You can use this builtin to cause a script to ignore the signal that is sent when the user presses the interrupt key.

The kill builtin terminates a running program. The getopts builtin parses command-line arguments, making it easier to write programs that follow standard Linux conventions for command-line arguments and options.
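The following function is a minimal sketch of the usual getopts loop, included here as a refresher. The function name parse_opts, the options –f and –o, and the variables force, outfile, and names are all hypothetical.

```bash
# parse_opts [-f] [-o file] [name ...] (hypothetical)
# Sets the global variables force, outfile, and names.
function parse_opts ()
{
    typeset opt
    typeset OPTIND=1                # restart parsing on every call
    force=false
    outfile=
    while getopts ":fo:" opt; do
        case $opt in
            f)  force=true ;;
            o)  outfile=$OPTARG ;;
            :)  echo "Option -$OPTARG requires an argument." >&2
                return 1 ;;
            \?) echo "Usage: parse_opts [-f] [-o file] [name ...]" >&2
                return 1 ;;
        esac
    done
    shift $(( OPTIND - 1 ))         # discard the options already parsed
    names=("$@")
}

parse_opts -f -o report.txt memo1 memo2
echo "force=$force outfile=$outfile names=${names[*]}"
# Displays: force=true outfile=report.txt names=memo1 memo2
```

The leading colon in the option string tells getopts not to display its own error messages, which lets the function report errors in its own words.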

Utilities in scripts In addition to using control structures, builtins, and functions, shell scripts generally call Linux utilities. The find utility, for instance, is commonplace in shell scripts that search for files in the system hierarchy and can perform a vast range of tasks, from simple to complex.

Expressions There are two basic types of expressions: arithmetic and logical. Arithmetic expres-

sions allow you to do arithmetic on constants and variables, yielding a numeric

result. Logical (Boolean) expressions compare expressions or strings, or test condi-

tions to yield a true or false result. As with all decisions within Linux shell scripts, a

true status is represented by the value zero; false, by any nonzero value.
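The distinction between the value of an expression and its exit status is easy to confuse, because a true comparison has the arithmetic value 1 but an exit status of 0. The following lines are a small sketch of the difference; the variable name value is hypothetical.

```bash
value=$(( 5 > 3 ))                  # arithmetic value of a true comparison
echo "value of (5 > 3): $value"     # displays 1

if (( 5 > 3 )); then                # exit status is 0, so the branch runs
    echo "exit status of a true test is 0"
fi

(( 5 < 3 )) || echo "exit status of a false test is $?"    # displays 1

[[ -d / ]] && echo "[[ -d / ]] succeeded with status $?"   # displays 0
```

This is also why the ask function in the quiz script, which returns 1 for a correct answer, must have its result tested explicitly with [[ $result == 1 ]] rather than with a plain if.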

Good programming practices A well-written shell script adheres to standard programming practices, such as specifying the shell to execute the script on the first line of the script, verifying the number and type of arguments that the script is called with, displaying a standard usage message to report command-line errors, and redirecting all informational messages to standard error.


Exercises

1. Rewrite the journal script of Chapter 8 (exercise 5, page 346) by adding

commands to verify that the user has write permission for a file named

journal-file in the user’s home directory, if such a file exists. The script

should take appropriate actions if journal-file exists and the user does not

have write permission to the file. Verify that the modified script works.

2. The special parameter "$@" is referenced twice in the out script

(page 403). Explain what would be different if the parameter "$*" was

used in its place.

3. Write a filter that takes a list of files as input and outputs the basename

(page 426) of each file in the list.

4. Write a function that takes a single filename as an argument and adds execute

permission to the file for the user.

a. When might such a function be useful?

b. Revise the script so it takes one or more filenames as arguments and

adds execute permission for the user for each file argument.

c. What can you do to make the function available every time you log in?

d. Suppose that, in addition to having the function available on subsequent login sessions, you want to make the function available now in your current shell. How would you do so?

5. When might it be necessary or advisable to write a shell script instead of a

shell function? Give as many reasons as you can think of.

6. Write a shell script that displays the names of all directory files, but no

other types of files, in the working directory.

7. Write a script to display the time every 15 seconds. Read the date man page and display the time, using the %r field descriptor. Clear the window (using the clear command) each time before you display the time.

8. Enter the following script named savefiles, and give yourself execute

permission to the file:

$ cat savefiles

#! /bin/bash

echo "Saving files in current directory in file savethem."

exec > savethem

for i in *

do

echo "==================================================="

echo "File: $i"

echo "==================================================="

cat "$i"

done


a. Which error message do you get when you execute this script? Rewrite

the script so that the error does not occur, making sure the output still

goes to savethem.

b. What might be a problem with running this script twice in the same

directory? Discuss a solution to this problem.

9. Read the bash man or info page, try some experiments, and answer the following questions:

a. How do you export a function?

b. What does the hash builtin do?

c. What happens if the argument to exec is not executable?

10. Using the find utility, perform the following tasks:

a. List all files in the working directory and all subdirectories that have

been modified within the last day.

b. List all files that you have read access to on the system that are larger

than 1 megabyte.

c. Remove all files named core from the directory structure rooted at your

home directory.

d. List the inode numbers of all files in the working directory whose filenames

end in .c.

e. List all files that you have read access to on the root filesystem that have

been modified in the last 30 days.

11. Write a short script that tells you whether the permissions for two files,

whose names are given as arguments to the script, are identical. If the per-

missions for the two files are identical, output the common permission

field. Otherwise, output each filename followed by its permission field.

(Hint: Try using the cut utility.)

12. Write a script that takes the name of a directory as an argument and

searches the file hierarchy rooted at that directory for zero-length files.

Write the names of all zero-length files to standard output. If there is no

option on the command line, have the script delete the file after displaying

its name, asking the user for confirmation, and receiving positive confir-

mation. A –f (force) option on the command line indicates that the script

should display the filename but not ask for confirmation before deleting

the file.


Advanced Exercises

13. Write a script that takes a colon-separated list of items and outputs the

items, one per line, to standard output (without the colons).

14. Generalize the script written in exercise 13 so that the character separating

the list items is given as an argument to the function. If this argument is

absent, the separator should default to a colon.

15. Write a function named funload that takes as its single argument the name

of a file containing other functions. The purpose of funload is to make all

functions in the named file available in the current shell; that is, funload

loads the functions from the named file. To locate the file, funload

searches the colon-separated list of directories given by the environment

variable FUNPATH. Assume that the format of FUNPATH is the same as

PATH and that searching FUNPATH is similar to the shell’s search of the

PATH variable.

16. Rewrite bundle (page 430) so the script it creates takes an optional list of

filenames as arguments. If one or more filenames are given on the com-

mand line, only those files should be re-created; otherwise, all files in the

shell archive should be re-created. For example, suppose all files with the

filename extension .c are bundled into an archive named srcshell, and you

want to unbundle just the files test1.c and test2.c. The following command

will unbundle just these two files:

$ bash srcshell test1.c test2.c

17. What kind of links will the lnks script (page 406) not find? Why?

18. In principle, recursion is never necessary. It can always be replaced by an

iterative construct, such as while or until. Rewrite makepath (page 470) as

a nonrecursive function. Which version do you prefer? Why?

19. Lists are commonly stored in environment variables by putting a colon

(:) between each of the list elements. (The value of the PATH variable is

an example.) You can add an element to such a list by catenating the new

element to the front of the list, as in

PATH=/opt/bin:$PATH

If the element you add is already in the list, you now have two copies of it

in the list. Write a shell function named addenv that takes two arguments:

(1) the name of a shell variable and (2) a string to prepend to the list that is

the value of the shell variable only if that string is not already an element

of the list. For example, the call

addenv PATH /opt/bin


would add /opt/bin to PATH only if that pathname is not already in

PATH. Be sure that your solution works even if the shell variable starts out

empty. Also make sure that you check the list elements carefully. If

/usr/opt/bin is in PATH but /opt/bin is not, the example just given should

still add /opt/bin to PATH. (Hint: You may find this exercise easier to

complete if you first write a function locate_field that tells you whether a

string is an element in the value of a variable.)

20. Write a function that takes a directory name as an argument and writes to

standard output the maximum of the lengths of all filenames in that direc-

tory. If the function’s argument is not a directory name, write an error

message to standard output and exit with nonzero status.

21. Modify the function you wrote for exercise 20 to descend all subdirectories

of the named directory recursively and to find the maximum length of any

filename in that hierarchy.

22. Write a function that lists the number of ordinary files, directories, block

special files, character special files, FIFOs, and symbolic links in the work-

ing directory. Do this in two different ways:

a. Use the first letter of the output of ls –l to determine a file’s type.

b. Use the file type condition tests of the [[ expression ]] syntax to determine

a file’s type.

23. Modify the quiz program (page 476) so that the choices for a question are

randomly arranged.

485485

Larry Wall created the Perl (Practical Extraction and Report

Language) programming language for working with text. Perl

uses syntax and concepts from

awk

sed

, C, the Bourne Shell,

Smalltalk, Lisp, and English. It was designed to scan and extract

information from text files and generate reports based on that

information. Since its introduction in 1987, Perl has expanded

enormously—its documentation growing up with it. Today, in

addition to text processing, Perl is used for system administra-

tion, software development, and general-purpose programming.

Perl code is portable because Perl has been implemented on many

operating systems (see www.cpan.org/ports). Perl is an informal,

practical, robust, easy-to-use, efficient, and complete language. It

is a down-and-dirty language that supports procedural and

object-oriented programming. It is not necessarily elegant.

One of the things that distinguishes Perl from many other lan-

guages is its linguistic origins. In English you say, “I will buy a

car if I win the lottery.” Perl allows you to mimic that syntax.

Another distinction is that Perl has singular and plural vari-

ables, the former holding single values and the latter holding

lists of values.
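The two distinctions just described can be sketched in a few lines of Perl. This is a minimal illustration rather than an excerpt from the Perl manual; the names $won_lottery and @purchases are made up for the example. The statement modifier form places the condition after the action, as the English sentence does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names chosen for this sketch.
my $won_lottery = 1;       # singular: a scalar holds one value
my @purchases   = ();      # plural: an array holds a list of values

# Statement modifiers: the condition follows the action, as in English.
push @purchases, 'car'  if $won_lottery;
push @purchases, 'boat' unless $won_lottery;

my $count = @purchases;    # an array in scalar context yields its length
print "Bought $count item(s): @purchases\n";
```

With $won_lottery set to a true value, only the first push runs, so the array ends up holding the single element car.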

In This Chapter

Running a Perl Program (page 490)
Syntax (page 491)
Help (page 487)
Variables (page 493)
Control Structures (page 501)
Working with Files (page 510)
Sort (page 513)
Subroutines (page 515)
Regular Expressions (page 517)
CPAN Modules (page 523)
Examples (page 525)

Chapter 11 The Perl Scripting Language

Introduction to Perl

A couple of quotes from the manual shed light on Perl’s philosophy:

Many of Perl’s syntactic elements are optional. Rather than requir-

ing you to put parentheses around every function call and declare

every variable, you can often leave such explicit elements off and

Perl will frequently figure out what you meant. This is known as

Do What I Mean, abbreviated DWIM. It allows programmers to

be lazy and to code in a style with which they are comfortable.

The Perl motto is “There’s more than one way to do it.” Divining

how many more is left as an exercise to the reader.

One of Perl’s biggest assets is its support by thousands of third-party modules. The

Comprehensive Perl Archive Network (CPAN; www.cpan.org) is a repository for

many of the modules and other information related to Perl. See page 523 for infor-

mation on downloading, installing, and using these modules in Perl programs.

The best way to learn Perl is to work with it. Copy and modify the programs in this

chapter until they make sense to you. Many system tools are written in Perl. The

first line of most of these tools begins with #!/usr/bin/perl, which tells the shell to

pass the program to Perl for execution. Most files that contain the string

/usr/bin/perl are Perl programs. The following command uses grep to search the
/usr/bin and /usr/sbin directories recursively (–r) for files containing the string
/usr/bin/perl; it lists many local system tools written in Perl:

$ grep -r /usr/bin/perl /usr/bin /usr/sbin | head -4

/usr/bin/defoma-user:#! /usr/bin/perl -w

/usr/bin/pod2latex:#!/usr/bin/perl

/usr/bin/pod2latex: eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'

/usr/bin/splain:#!/usr/bin/perl

Review these programs—they demonstrate how Perl is used in the real world. Copy

a system program to a directory you own before modifying it. Do not run a system

program while running with root privileges unless you know what you are doing.

More Information

Local man pages: See the perl and perltoc man pages for lists of Perl man pages

Web Perl home page: www.perl.com

CPAN: www.cpan.org

blog: perlbuzz.com

Install perl-doc

tip The perl-doc package holds a wealth of information. Install this package before you start using

Perl; see the next page for more information.


Book Programming Perl, third edition, by Wall, Christiansen, & Orwant, O’Reilly &

Associates (July 2000)

Help

Perl is a forgiving language. As such, it is easy to write Perl code that runs but

does not perform as you intended. Perl includes many tools that can help you

find coding mistakes. The –w option and the use warnings statement can produce

helpful diagnostic messages. The use strict statement (see the perldebtut man page)
can impose order on a program by requiring, among other things, that you
declare variables before you use them. When all else fails, you can use Perl’s
builtin debugger to step through a program. See the perldebtut and perldebug
man pages for more information.

perldoc

You must install the perl-doc package before you can use perldoc.

The perldoc utility locates and displays local Perl documentation. It is similar to
man (page 33) but specific to Perl. It works with files that include lines of pod (plain old
documentation), a clean and simple documentation language. When embedded in a
Perl program, pod enables you to include documentation for the entire program,
not just code-level comments.

Following is a simple Perl program that includes pod. The two lines following =cut

are the program; the rest is pod-format documentation.

$ cat pod.ex1.pl

#!/usr/bin/perl

=head1 A Perl Program to Say I<Hi there.>

This simple Perl program includes documentation in B<pod> format.

The following B<=cut> command tells B<perldoc> that what follows

is not documentation.

=cut

# A Perl program

print "Hi there.\n";

=head1 pod Documentation Resumes with Any pod Command

See the B<perldoc.perl.org/perlpod.html> page for more information

on B<pod> and B<perldoc.perl.org> for complete Perl documentation.

You can use Perl to run the program:

$ perl pod.ex1.pl

Hi there.


Or you can use perldoc to display the documentation:

$ perldoc pod.ex1.pl

POD.EX1(1) User Contributed Perl Documentation POD.EX1(1)

A Perl Program to Say Hi there.

This simple Perl program includes documentation in pod format. The

following =cut command tells perldoc that what follows is not documen-

tation.

pod Documentation Resumes with Any pod Command

See the perldoc.perl.org/perlpod.html page for more information on pod

and perldoc.perl.org for complete Perl documentation.

perl v5.10.0 2008-10-14 POD.EX1(1)

Most publicly distributed modules and scripts, as well as Perl itself, include

embedded pod-format documentation. For example, the following command dis-

plays information about the Perl print function:

$ perldoc -f print

print FILEHANDLE LIST

print LIST

print Prints a string or a list of strings. Returns true if success-

ful. FILEHANDLE may be a scalar variable name, in which case

the variable contains the name of or a reference to the file-

handle, thus introducing one level of indirection. (NOTE: If

FILEHANDLE is a variable and the next token is a term, it may

...

Once you have installed a module (page 523), you can use perldoc to display
documentation for that module. The following example shows perldoc displaying
information on the locally installed Timestamp::Simple module:

$ perldoc Timestamp::Simple

Timestamp::Simple(3) User Contributed Perl Documentation Timestamp::Simple(3)

NAME

Timestamp::Simple - Simple methods for timestamping

SYNOPSIS

use Timestamp::Simple qw(stamp);

print stamp, "\n";

...

Give the command man perldoc or perldoc perldoc to display the perldoc man page
and read more about this tool.

Make Perl programs readable

tip Although Perl has many shortcuts that are good choices for one-shot programming,
this chapter presents code that is easy to understand and easy to maintain.


Terminology

This section defines some of the terms used in this chapter.

Module A Perl module is a self-contained chunk of Perl code, frequently containing several

functions that work together. A module can be called from another module or

from a Perl program. A module must have a unique name. To help ensure unique

names, Perl provides a hierarchical namespace (page 967) for modules, separating

components of a name with double colons (::). Example module names are Time-

stamp::Simple and WWW::Mechanize.

Distribution A Perl distribution is a set of one or more modules that perform a task. You can

search for distributions and modules at search.cpan.org. Examples of distributions

include Timestamp-Simple (the Timestamp-Simple-1.01.tar.gz archive file contains

the Timestamp::Simple module only) and WWW-Mechanize (WWW-Mechanize-

1.34.tar.gz contains the WWW::Mechanize module, plus supporting modules

including WWW::Mechanize::Link and WWW::Mechanize::Image).

Package A package defines a Perl namespace. For example, in the variable with the name

$WWW::Mechanize::ex, $ex is a scalar variable in the WWW::Mechanize package,

where “package” is used in the sense of a namespace. Using the same name, such as

WWW::Mechanize, for a distribution, a package, and a module can be confusing.

Block A block is zero or more statements, delimited by curly braces ({}), that defines a

scope. The shell control structure syntax explanations refer to these elements as

commands. See the if...then control structure on page 398 for an example.

Package variable A package variable is defined within the package it appears in. Other packages can

refer to package variables by using the variable’s fully qualified name (for example,

$Text::Wrap::columns). By default, variables are package variables unless you

define them as lexical variables.

Lexical variable A lexical variable, which is defined by preceding the name of a variable with the
keyword my (see the tip on page 494), is defined only within the block or file it appears
in. Other languages refer to a lexical variable as a local variable. Because Perl 4 used
the keyword local with a different meaning, Perl 5 uses the term lexical in its
place. When programming using bash, variables that are not exported (page 436) are
local to the program they are used in.
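The difference between a package variable and a lexical variable can be sketched as follows. The package name Counter, the variables $total and $step, and the subroutine next_total are all hypothetical names invented for this illustration. The keyword our, which this section does not cover, declares a package variable in a way that use strict accepts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Counter;            # hypothetical package name
our $total = 10;            # package variable; its full name is $Counter::total
{
    my $step = 2;           # lexical variable: visible only inside this block
    sub next_total { return $total + $step }
}

package main;

# Another package reaches a package variable through its fully qualified name.
my $seen = $Counter::total;

# The lexical $step has no qualified name; only code inside its block can use it.
my $next = Counter::next_total();

print "$seen $next\n";
```

Code in package main can read $Counter::total directly, but it can reach $step only indirectly, by calling a subroutine that was defined inside the block where $step lives.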

List A list is a series of zero or more scalars. The following list has three elements—two

numbers and a string:

(2, 4, 'Zach')

Array An array is a variable that holds a list of elements in a defined order. In the follow-

ing line of code, @a is an array. See page 497 for more information about array

variables.

@a = (2, 4, 'Zach')

Compound statement A compound statement is a statement made up of other statements. For example,

the if compound statement (page 501) incorporates an if statement that normally

includes other statements within the block it controls.


Running a Perl Program

There are several ways you can run a program written in Perl. The –e option

enables you to enter a program on the command line:

$ perl -e 'print "Hi there.\n"'

Hi there.

The –e option is a good choice for testing Perl syntax and running brief, one-shot

programs. This option requires that the Perl program appear as a single argument

on the command line. The program must immediately follow this option—it is an

argument to this option. An easy way to write this type of program is to enclose the

program within single quotation marks.

Because Perl is a member of the class of utilities that take input from a file or standard

input (page 128), you can give the command perl and enter the program terminated

by CONTROL-D (end of file). Perl reads the program from standard input:

$ perl

print "Hi there.\n";

CONTROL-D

Hi there.

The preceding techniques are useful for quick, one-off command-line programs but

are not helpful for running more complex programs. Most of the time, a Perl program

is stored in a text file. Although not required, the file typically has a filename exten-

sion of .pl. Following is the same simple program used in the previous examples

stored in a file:

$ cat simple.pl

print "Hi there.\n";

You can run this program by specifying its name as an argument to Perl:

$ perl simple.pl

Hi there.

Most commonly and similarly to most shell scripts, the file containing the Perl program
is executable. In the following example, chmod (page 278) makes the simple2.pl
file executable. As explained on page 280, the #! at the start of the first line of the file
instructs the shell to pass the rest of the file to /usr/bin/perl for execution.

$ chmod 755 simple2.pl

$ cat simple2.pl

#!/usr/bin/perl -w

print "Hi there.\n";

$ ./simple2.pl

Hi there.

In this example, the simple2.pl program is executed as ./simple2.pl because the

working directory is not in the user’s PATH (page 297). The –w option tells Perl to

issue warning messages when it identifies potential errors in the code.


Perl Version 5.10

All examples in this chapter were run under Perl 5.10. Give the following command

to see which version of Perl the local system is running:

$ perl -v

This is perl, v5.10.0 built for i486-linux-gnu-thread-multi

...

use feature 'say' The say function is a Perl 6 feature that is available in Perl 5.10. It works the same

way as print, except it adds a NEWLINE (\n) at the end of each line it outputs. Some ver-

sions of Perl require you to tell Perl explicitly that you want to use say. The use

function in the following example tells Perl to enable say. Try running this program

without the use line to see if the local version of Perl requires it.

$ cat 5.10.pl

use feature 'say';

say 'Output by say.';

print 'Output by print.';

say 'End.'

$ perl 5.10.pl

Output by say.

Output by print.End.

Earlier versions of Perl If you are running an earlier version of Perl, you will need to replace say in the

examples in this chapter with print and terminate the print statement with a

quoted \n:

$ cat 5.8.pl

print 'Output by print in place of say.', "\n";

print 'Output by print.';

print 'End.', "\n";

$ perl 5.8.pl

Output by print in place of say.

Output by print.End.
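The difference between say and print can be checked byte for byte by writing to an in-memory filehandle instead of the screen. The following is a small sketch that assumes Perl 5.10 or later; the variable names $buffer and $out are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

# Write to an in-memory filehandle so the exact output can be inspected.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory file: $!";

say   {$out} 'Output by say.';      # say appends a NEWLINE
print {$out} 'Output by print.';    # print does not
say   {$out} 'End.';
close($out);

print $buffer;
```

Under Perl 5.10 and later, the statement use 5.010; also enables say, as does running a one-line program with the –E option in place of –e.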

Syntax

This section describes the major components of a Perl program.

Statements A Perl program comprises one or more statements, each terminated by a semicolon

(;). These statements are free-form with respect to whitespace (page 987), except for

whitespace within quoted strings. Multiple statements can appear on a single line,

each terminated by a semicolon. The following programs are equivalent. The first

occupies two lines, the second only one; look at the differences in the spacing

around the equal and plus signs. See use feature 'say' (above) if these programs

complain about say not being available.


$ cat statement1.pl

$n=4;

say "Answer is ", $n + 2;

$ perl statement1.pl

Answer is 6

$ cat statement2.pl

$n = 4; say "Answer is ", $n+2;

$ perl statement2.pl

Answer is 6

Expressions The syntax of Perl expressions frequently corresponds to the syntax of C expressions,

but is not always the same. Perl expressions are covered in examples throughout this

chapter.

Quotation marks All character strings must be enclosed within single or double quotation marks. Perl

differentiates between the two types of quotation marks in a manner similar to the

way the shell does (page 292): Double quotation marks allow Perl to interpolate

enclosed variables and interpret special characters such as \n (NEWLINE), whereas sin-

gle quotation marks do not. Table 11-1 lists some of Perl’s special characters.

The following example demonstrates how different types of quotation marks, and

the absence of quotation marks, affect Perl in converting scalars between numbers

and strings. The single quotation marks in the first print statement prevent Perl

from interpolating the $string variable and from interpreting the \n special charac-

ter. The leading \n in the second print statement forces the output of that statement

to appear on a new line.

$ cat string1.pl

$string="5"; # $string declared as a string, but it will not matter

print '$string+5\n'; # Perl displays $string+5 literally because of

# the single quotation marks

print "\n$string+5\n"; # Perl interpolates the value of $string as a string

# because of the double quotation marks

print $string+5, "\n"; # Lack of quotation marks causes Perl to interpret

# $string as a numeric variable and to add 5;

# the \n must appear between double quotation marks

$ perl string1.pl

$string+5\n

5+5

10

Slash By default, regular expressions are delimited by slashes (/). The following example

tests whether the string hours contains the pattern our; see page 518 for more infor-

mation on regular expressions in Perl.

$ perl -e 'if ("hours" =~ /our/) {say "yes";}'

The local version of Perl may require use feature 'say' (page 491) to work properly:

$ perl -e 'use feature "say"; if ("hours" =~ /our/) {say "yes";}'
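Slashes are only the default delimiters. When a pattern itself contains slashes, as a pathname does, the m operator lets you choose other delimiters so that the slashes need no escaping. The following sketch uses a made-up sample string and variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $shebang = '#!/usr/bin/perl -w';     # sample text for this sketch

# With the default delimiters, each slash in the pattern must be escaped.
my $with_slashes = ($shebang =~ /\/usr\/bin\/perl/) ? 'yes' : 'no';

# With m and a different delimiter, the slashes need no escaping.
my $with_braces  = ($shebang =~ m{/usr/bin/perl}) ? 'yes' : 'no';

# The !~ operator is the negated form of =~.
my $no_python    = ($shebang !~ m{python}) ? 'yes' : 'no';

print "$with_slashes $with_braces $no_python\n";
```

All three tests succeed for this sample string, so the program displays yes three times.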


Backslash Within a string enclosed between double quotation marks, a backslash escapes

(quotes) another backslash. Thus Perl displays "\\n" as \n. Within a regular expres-

sion, Perl does not expand a metacharacter preceded by a backslash. See the

string1.pl program on the previous page.

Comments As in the shell, a comment in Perl begins with a pound sign (#) and ends at the end

of the line (just before the NEWLINE character).

Special characters Table 11-1 lists some of the characters that are special within strings in Perl. Perl

interpolates these characters when they appear between double quotation marks

but not when they appear between single quotation marks. Table 11-3 on page 520

lists metacharacters, which are special within regular expressions.
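One way to see what Perl does with special characters is to compare the lengths of strings quoted in different ways. The following sketch uses made-up variable names and assumes an ASCII-based system such as Linux.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $double  = "a\tb\n";    # \t and \n become TAB and NEWLINE: 4 characters
my $single  = 'a\tb\n';    # taken literally: 6 characters
my $escaped = "\\n";       # an escaped backslash followed by n: 2 characters

# An octal escape names a character by its value; octal 012 is decimal 10.
my $octal_is_newline = ("\012" eq "\n") ? 'yes' : 'no';
my $escape_code      = ord("\e");          # 27

printf "%d %d %d %s %d\n",
    length($double), length($single), length($escaped),
    $octal_is_newline, $escape_code;
```

The string "\\n" between double quotation marks holds the same two characters as '\n' between single quotation marks.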

Variables

Like human languages, Perl distinguishes between singular and plural data. Strings

and numbers are singular; lists of strings or numbers are plural. Perl provides three

types of variables: scalar (singular), array (plural), and hash (plural; also called

associative arrays). Perl identifies each type of variable by a special character pre-

ceding its name. The name of a scalar variable begins with a dollar sign ($), an array

variable begins with an at sign (@), and a hash variable begins with a percent sign

(%). As opposed to the way the shell identifies variables, Perl requires the leading

character to appear each time you reference a variable, including when you assign a

value to the variable:

$ name="Zach" ; echo "$name"                            (bash)

Zach

$ perl -e '$name="Zach" ; print "$name\n";'             (perl)

Zach
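The three types of variables can be seen side by side in a short sketch. The names and the ages in the hash are made up for the example. Because a single element of an array or hash is one value, it is written with a leading dollar sign.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name  = 'Zach';                     # scalar: one value
my @names = ('Zach', 'Sam', 'Max');     # array: an ordered list
my %ages  = (Zach => 30, Sam => 25);    # hash: key-value pairs (made-up data)

# One element of an array or hash is a single value, so it takes a $.
my $second   = $names[1];               # square brackets index an array
my $sams_age = $ages{Sam};              # curly braces index a hash

print "$name $second $sams_age\n";
```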

Variable names, which are case sensitive, can include letters, digits, and the underscore
character (_). A Perl variable is a package variable (page 489) unless it is preceded by
the keyword my, in which case it is a lexical variable (page 489) that is defined only
within the block or file it appears in. See “Subroutines” on page 515 for a discussion of
the locality of Perl variables.

Table 11-1 Some Perl special characters

Character      When within double quotation marks, interpolated as
\0xx (zero)    The ASCII character whose octal value is xx
\a             An alarm (bell or beep) character (ASCII 7)
\e             An ESCAPE character (ASCII 27)
\n             A NEWLINE character (ASCII 10)
\r             A RETURN character (ASCII 13)
\t             A TAB character (ASCII 9)

A Perl variable comes into existence when you assign a value to it—you do not need

to define or initialize a variable, although it may make a program more understand-

able to do so. Normally, Perl does not complain when you reference an uninitialized

variable:

$ cat variable1.pl

#!/usr/bin/perl

my $name = 'Sam';

print "Hello, $nam, how are you?\n"; # Typo, e left off of name

$ ./variable1.pl

Hello, , how are you?

use strict Include use strict to cause Perl to require variables to be declared before being

assigned values. See the perldebtut man page for more information. When you

include use strict in the preceding program, Perl displays an error message:

$ cat variable1b.pl

#!/usr/bin/perl

use strict;

my $name = 'Sam';

print "Hello, $nam, how are you?\n"; # Typo, e left off of name

$ ./variable1b.pl

Global symbol "$nam" requires explicit package name at ./variable1b.pl line 4.

Execution of ./variable1b.pl aborted due to compilation errors.
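The same check can be observed from inside a running program. In the following sketch, which is an illustration and not part of variable1b.pl, the faulty code is held in a string and compiled with eval so that the error can be captured instead of stopping the script. The names $program, $ok, and $error are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The program text is held in a string so that its compile-time error can be
# caught with eval instead of stopping this script.
my $program = <<'END_PROGRAM';
use strict;
my $name = 'Sam';
my $greeting = "Hello, $nam, how are you?";   # typo: $nam
END_PROGRAM

my $ok    = eval "$program; 1";     # undef when the string fails to compile
my $error = $ok ? '' : $@;          # eval puts the error message in $@

print "compiled: ", ($ok ? 'yes' : 'no'), "\n";
print "error: $error" if $error;
```

Because use strict rejects the undeclared $nam when the string is compiled, none of the code in the string runs.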

Lexical variables overshadow package variables

caution If a lexical variable and a package variable have the same name, within the block or file in which

the lexical variable is defined, the name refers to the lexical variable and not to the package

variable.

Using my: lexical versus package variables

tip In variable1.pl, $name is declared to be lexical by preceding its name with the keyword my; its

name and value are known within the file variable1.pl only. Declaring a variable to be lexical limits

its scope to the block or file it is defined in. Although not necessary in this case, declaring variables

to be lexical is good practice. This habit becomes especially useful when you write longer pro-

grams, subroutines, and packages, where it is harder to keep variable names unique. Declaring all

variables to be lexical is mandatory when you write routines that will be used within code written

by others. This practice allows those who work with your routines to use whichever variable

names they like, without regard to which variable names you used in the code you wrote.

The shell and Perl scope variables differently. In the shell, if you do not export a variable, it is local
to the routine it is used in (page 436). In Perl, if you do not use my to declare a variable to be
lexical, it is defined for the package it appears in.
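The caution about overshadowing can be sketched with one package variable and one lexical variable that share a name. The variable names are made up, and the keyword our is used to declare the package variable so that the sketch runs under use strict.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $color = 'red';                 # package variable $main::color

my @seen;
{
    my $color = 'blue';             # lexical: hides the package variable here
    push @seen, $color;             # blue
    push @seen, $main::color;       # red: the qualified name still reaches it
}
push @seen, $color;                 # red: the lexical ended with its block

print "@seen\n";
```

Inside the block the plain name refers to the lexical variable; once the block ends, the same name refers to the package variable again.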


–w and use warnings The –w option and the use warnings statement serve the same purpose: They
cause Perl to issue a warning message when it detects a questionable construct, such as a
variable that is used only once or a value that is used before it is initialized. In the
following example, Perl displays two warnings. The first tells you that you have used

the variable named $nam once, on line 3, which probably indicates an error. This

message is helpful when you mistype the name of a variable. Under Perl 5.10, the

second warning specifies the name of the uninitialized variable. This warning

refers to the same problem as the first warning. Although it is not hard to figure

out which of the two variables is undefined in this simple program, doing so in a

complex program can take a lot of time.

$ cat variable1a.pl

#!/usr/bin/perl -w

my $name = 'Sam';

print "Hello, $nam, how are you?\n"; # Prints warning because of typo and -w

$ ./variable1a.pl

Name "main::nam" used only once: possible typo at ./variable1a.pl line 3.

Use of uninitialized value $nam in concatenation (.) or string at ./variable1a.pl line 3.

Hello, , how are you?

You can also use –w on the command line. If you use –e as well, make sure the argu-

ment that follows this option is the program you want to execute (e.g., –e –w does

not work). See the tip on page 518.

$ perl -w -e 'my $name = "Sam"; print "Hello, $nam, how are you?\n"'

Name "main::nam" used only once: possible typo at -e line 1.

Use of uninitialized value $nam in concatenation (.) or string at -e line 1.

Hello, , how are you?
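A warning does not stop a program, which is why the greeting still appears after the messages. The following sketch makes that visible by collecting warnings in an array instead of letting Perl print them. It relies on the %SIG hash, which this section does not cover, and the names @warnings and $greeting are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # collect instead of printing

my $name;                                   # declared but never assigned a value
my $greeting = "Hello, $name, how are you?\n";

print $greeting;                            # the program keeps running
print "warnings caught: ", scalar(@warnings), "\n";
print $warnings[0];
```

Because $name is declared with my, use strict is satisfied and only the uninitialized-value warning is produced.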

undef and defined An undefined variable has the special value undef, which evaluates to zero (0) in a

numeric expression and expands to an empty string ("") when you print it. Use the

defined function to determine whether a variable has been defined. The following

example, which uses constructs explained later in this chapter, calls defined with an

argument of $name and negates the result with an exclamation point (!). The result

is that the print statement is executed if $name is not defined.

$ cat variable2.pl

#!/usr/bin/perl

if (!defined($name)) {

print "The variable '\$name' is not defined.\n"

};

$ ./variable2.pl

The variable '$name' is not defined.

Because the –w option causes Perl to warn you when you reference an undefined

variable, using this option would generate a warning.
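The behavior of undef can be sketched as follows. The variable names are made up, and the // (defined-or) operator, which supplies a default for an undefined value, assumes Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name;                                       # no value yet: undef

my $before = defined($name) ? 'defined' : 'undefined';
my $shown  = $name // 'anonymous';              # defined-or: use a default for undef

my $as_number;
{
    no warnings 'uninitialized';                # silence the warning in this block only
    $as_number = $name + 0;                     # undef acts as 0 in arithmetic
}

$name = '';                                     # an empty string is still defined
my $after = defined($name) ? 'defined' : 'undefined';

print "$before $shown $as_number $after\n";
```

An empty string prints the same way undef does, but defined distinguishes between the two.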

Scalar Variables

A scalar variable has a name that begins with a dollar sign ($) and holds a single

string or number: It is a singular variable. Because Perl converts between the two


when necessary, you can use strings and numbers interchangeably. Perl interprets

scalar variables as strings when it makes sense to interpret them as strings, and as

numbers when it makes sense to interpret them as numbers. Perl's judgment in these

matters is generally good.

The following example shows some uses of scalar variables. The first two lines of

code (lines 3 and 4) assign the string Sam to the scalar variable $name and the

numbers 5 and 2 to the scalar variables $n1 and $n2, respectively. In this example,

multiple statements, each terminated with a semicolon (;), appear on a single line.

See use feature 'say' on page 491 if this program complains about say not being

available.

$ cat scalars1.pl

#!/usr/bin/perl -w

$name = "Sam";

$n1 = 5; $n2 = 2;

say "$name $n1 $n2";

say "$n1 + $n2";

say '$name $n1 $n2';

say $n1 + $n2, " ", $n1 * $n2;

say $name + $n1;

$ ./scalars1.pl

Sam 5 2

5 + 2

$name $n1 $n2

7 10

Argument "Sam" isn't numeric in addition (+) at ./scalars1.pl line 11.

5

Double quotation marks The first say statement sends the string enclosed within double quotation marks to

standard output (the screen unless you redirect it). Within double quotation marks,

Perl expands variable names to the value of the named variable. Thus the first say

statement displays the values of three variables, separated from each other by SPACEs.

The second say statement includes a plus sign (+). Perl does not recognize operators

such as + within either type of quotation marks. Thus Perl displays the plus sign

between the values of the two variables.

Single quotation marks The third say statement sends the string enclosed within single quotation marks to

standard output. Within single quotation marks, Perl interprets all characters liter-

ally, so it displays the string exactly as it appears between the single quotation

marks.

In the fourth say statement, the operators are not quoted, and Perl performs the

addition and multiplication as specified. Without the quoted SPACE, Perl would cate-

nate the two numbers (710). The last say statement attempts to add a string and a

number; the –w option causes Perl to display an error message before displaying 5.

The 5 results from adding Sam, which Perl evaluates as 0 in a numerical context, to

the number 5 (0 + 5 = 5).
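The operator, not the variable, decides whether Perl treats a scalar as a string or as a number. The following sketch, with made-up variable names, shows the same scalar used both ways and contrasts the numeric comparison operator == with the string comparison operator eq.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "5";
my $sum  = $text + 5;            # + asks for numbers: 10
my $cat  = $text . 5;            # . asks for strings: 55

# == compares as numbers; eq compares as strings.
my $num_equal = ("5.0" == 5)   ? 'yes' : 'no';    # yes
my $str_equal = ("5.0" eq "5") ? 'yes' : 'no';    # no

my $partial;
{
    no warnings 'numeric';       # "3 apples" would otherwise draw a warning
    $partial = "3 apples" + 2;   # the leading digits are used: 5
}

print "$sum $cat $num_equal $str_equal $partial\n";
```

A string that does not begin with a number, such as Sam, evaluates to 0; a string that begins with digits evaluates to the number those digits form.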


Array Variables

An array variable is an ordered container of scalars whose name begins with an at

sign (@) and whose first element is numbered zero (zero-based indexing). Because

an array can hold zero or more scalars, it is a plural variable. Arrays are ordered;

hashes (page 500) are unordered. In Perl, arrays grow as needed. If you reference an

uninitialized element of an array, such as an element beyond the end of the array,

Perl returns undef.
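The way arrays grow can be sketched with an array of three elements. The variable names are made up for the example. Reading an element beyond the end of the array returns undef without changing the array, whereas assigning to such an element extends the array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list = (8, 18, "Sam");

my $beyond     = defined($list[10]) ? 'defined' : 'undef';  # reading does not grow the array
my $len_before = @list;                                      # 3

$list[5] = "new";                                            # assigning past the end extends it
my $len_after = @list;                                       # 6
my $gap       = defined($list[4]) ? 'defined' : 'undef';    # skipped elements are undef

print "$beyond $len_before $len_after $gap\n";
```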

The first statement in the following program assigns the values of two numbers

and a string to the array variable named @arrayvar. Because Perl uses zero-based

indexing, the first say statement displays the value of the second element of the

array (the element with the index 1). This statement specifies the variable

$arrayvar[1] as a scalar (singular) because it refers to a single value. The second

say statement specifies the variable @arrayvar[1,2] as a list (plural) because it refers

to multiple values (the elements with the indexes 1 and 2).

$ cat arrayvar1.pl

#!/usr/bin/perl -w

@arrayvar = (8, 18, "Sam");

say $arrayvar[1];

say "@arrayvar[1,2]";

$ perl arrayvar1.pl

18

18 Sam

The next example shows a couple of ways to determine the length of an array and

presents more information on using quotation marks within print statements. The

first assignment statement in arrayvar2.pl assigns values to the first six elements of

the @arrayvar2 array. When used in a scalar context, Perl evaluates the name of an

array as the length of the array. The second assignment statement assigns the num-

ber of elements in @arrayvar2 to the scalar variable $num.

$ cat arrayvar2.pl

#!/usr/bin/perl -w

@arrayvar2 = ("apple", "bird", 44, "Tike", "metal", "pike");

$num = @arrayvar2; # number of elements in array

print "Elements: ", $num, "\n"; # two equivalent print statements

print "Elements: $num\n";

print "Last: $#arrayvar2\n"; # index of last element in array

$ ./arrayvar2.pl

Elements: 6

Elements: 6

Last: 5

The first two print statements in arrayvar2.pl display the string Elements:, a SPACE,

the value of $num, and a NEWLINE, each using a different syntax. The first of these
statements displays three values, using commas to separate them within the print

statement. The second print statement has one argument and demonstrates that Perl

expands a variable (replaces the variable with its value) when the variable is

enclosed within double quotation marks.

$#array The final print statement in arrayvar2.pl shows that Perl evaluates the variable

$#array as the index of the last element in the array named array. Because Perl uses

zero-based indexing by default, this variable evaluates to one less than the number

of elements in the array.
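The context in which an array name appears determines what Perl returns. The following sketch, with made-up names, gathers the common ways of asking about the size and ends of an array. Note the parentheses in the last assignment: they supply list context, so the scalar receives the first element and not the length.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @fruit = ("apple", "fig", "pear", "plum");

my $count      = @fruit;            # scalar context: the number of elements, 4
my $explicit   = scalar(@fruit);    # scalar() forces scalar context: 4
my $last_index = $#fruit;           # index of the last element: 3
my $last       = $fruit[-1];        # negative indexes count from the end: plum
my ($first)    = @fruit;            # list context: the first element, apple

print "$count $explicit $last_index $last $first\n";
```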

The next example works with elements of an array and uses a dot (.; the string cat-

enation operator). The first two lines assign values to four scalar variables. The

third line shows that you can assign values to array elements using scalar variables,

arithmetic, and catenated strings. The dot operator catenates strings, so Perl evalu-

ates $va . $vb as Sam catenated with uel—that is, as Samuel (see the output of the

last print statement).

$ cat arrayvar3.pl

#!/usr/bin/perl -w

$v1 = 5; $v2 = 8;

$va = "Sam"; $vb = "uel";

@arrayvar3 = ($v1, $v1 * 2, $v1 * $v2, "Max", "Zach", $va . $vb);

print $arrayvar3[2], "\n"; # one element of an array is a scalar

print @arrayvar3[2,4], "\n"; # two elements of an array are a list

print @arrayvar3[2..4], "\n\n"; # a slice

print "@arrayvar3[2,4]", "\n"; # a list, elements separated by SPACEs

print "@arrayvar3[2..4]", "\n\n"; # a slice, elements separated by SPACEs

print "@arrayvar3\n"; # an array, elements separated by SPACEs

$ ./arrayvar3.pl

40

40Zach

40MaxZach

40 Zach

40 Max Zach

5 10 40 Max Zach Samuel

The first print statement in arrayvar3.pl displays the third element (the element with

an index of 2) of the @arrayvar3 array. This statement uses $ in place of @ because

it refers to a single element of the array. The subsequent print statements use @

because they refer to more than one element. Within the brackets that specify an

array subscript, two subscripts separated by a comma specify two elements of an

array. The second print statement, for example, displays the third and fifth elements

of the array.


Array slice When you separate two elements of an array with two dots (..; the range operator),

Perl substitutes all elements between and including the two specified elements. A

portion of an array comprising elements is called a slice. The third print statement

in the preceding example displays the elements with indexes 2, 3, and 4 (the third,

fourth, and fifth elements) as specified by 2..4. Perl puts no SPACEs between the ele-

ments it displays.

Within a print statement, when you enclose an array variable, including its subscripts,

within double quotation marks, Perl puts a SPACE between each of the elements. The

fourth and fifth print statements in the preceding example illustrate this syntax. The

last print statement displays the entire array, with elements separated by SPACEs.
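The SPACE that separates interpolated elements is the default value of the special variable $", which is not otherwise covered in this section. The following sketch, with made-up names, copies a slice into a new array and shows three ways to turn it into a string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items = (5, 10, 40, "Max", "Zach", "Samuel");

my @middle = @items[2..4];              # slice: 40, Max, Zach
my $spaced = "@middle";                 # interpolation separates elements with a SPACE
my $joined = join(", ", @middle);       # join accepts any separator

my $custom;
{
    local $" = "-";                     # change the interpolation separator in this block
    $custom = "@middle";
}

print "$spaced\n$joined\n$custom\n";
```

Using local restores the default separator when the block ends, so later print statements are not affected.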

shift, push, pop, and splice The next example demonstrates several functions you can use to manipulate arrays.

The example uses the @colors array, which is initialized to a list of seven colors. The

shift function returns and removes the first element of an array, push adds an ele-

ment to the end of an array, and pop returns and removes the last element of an

array. The splice function replaces elements of an array with another array; in the

example, splice inserts the @ins array starting at index 1 (the second element),

replacing two elements of the array. See use feature 'say' on page 491 if this program

complains about say not being available. See the perlfunc man page for more

information on the functions described in this paragraph.

$ cat ./shift1.pl

#!/usr/bin/perl -w

@colors = ("red", "orange", "yellow", "green", "blue", "indigo", "violet");

say " Display array: @colors";

say " Display and remove first element of array: ", shift (@colors);

say " Display remaining elements of array: @colors";

push (@colors, "WHITE");

say " Add element to end of array and display: @colors";

say " Display and remove last element of array: ", pop (@colors);

say " Display remaining elements of array: @colors";

@ins = ("GRAY", "FERN");

splice (@colors, 1, 2, @ins);

say "Replace second and third elements of array: @colors";

$ ./shift1.pl

Display array: red orange yellow green blue indigo violet

Display and remove first element of array: red

Display remaining elements of array: orange yellow green blue indigo violet

Add element to end of array and display: orange yellow green blue indigo violet WHITE

Display and remove last element of array: WHITE

Display remaining elements of array: orange yellow green blue indigo violet

Replace second and third elements of array: orange GRAY FERN blue indigo violet
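As a supplement to shift1.pl, the following minimal sketch captures the values these functions return. It relies on the fact that, in a list context, splice returns the elements it removes. The four-color array is a shortened, illustrative version of @colors.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @colors = ("red", "orange", "yellow", "green");
my @ins    = ("GRAY", "FERN");

# In a list context, splice returns the elements it removes.
my @removed = splice(@colors, 1, 2, @ins);

print "Removed: @removed\n";    # Removed: orange yellow
print "Result:  @colors\n";     # Result:  red GRAY FERN green

my $first = shift(@colors);     # removes and returns "red"
my $last  = pop(@colors);       # removes and returns "green"
push(@colors, "WHITE");         # adds an element to the end

print "After shift, pop, and push: @colors\n";   # GRAY FERN WHITE
```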

Hash Variables

A hash variable, sometimes called an associative array variable, is a plural data

structure that holds an array of key–value pairs. It uses strings as keys (indexes) and

is optimized to return a value quickly when given a key. Each key must be a unique

scalar. Hashes are unordered; arrays (page 497) are ordered. When you assign a

hash to a list, the key–value pairs are preserved, but their order is neither alphabeti-

cal nor the order in which they were inserted into the hash; instead, the order is

effectively random.

Perl provides two syntaxes to assign values to a hash. The first uses a single assignment

statement for each key–value pair:

$ cat hash1.pl

#!/usr/bin/perl -w

$hashvar1{boat} = "tuna";

$hashvar1{"number five"} = 5;

$hashvar1{4} = "fish";

@arrayhash1 = %hashvar1;

say "@arrayhash1";

$ ./hash1.pl

boat tuna 4 fish number five 5

Within an assignment statement, the key is located within braces to the left of the

equal sign; the value is on the right side of the equal sign. As illustrated in the pre-

ceding example, keys and values can take on either numeric or string values. You do

not need to quote string keys unless they contain SPACEs. This example also shows

that you can display the keys and values held by a hash, each separated from the

next by a SPACE, by assigning the hash to an array variable and then printing that

variable enclosed within double quotation marks.

The next example shows the other way of assigning values to a hash and illustrates

how to use the keys and values functions to extract keys and values from a hash. After

assigning values to the %hash2 hash, hash2.pl calls the keys function with an argu-

ment of %hash2 and assigns the resulting list of keys to the @array_keys array. The

program then uses the values function to assign values to the @array_values array.

$ cat hash2.pl

#!/usr/bin/perl -w

%hash2 = (

boat => "tuna",

"number five" => 5,

4 => "fish",

);

@array_keys = keys(%hash2);

say " Keys: @array_keys";

@array_values = values(%hash2);

say "Values: @array_values";

$ ./hash2.pl

Keys: boat 4 number five

Values: tuna fish 5

Because Perl automatically quotes a single word that appears to the left of the =>

operator, you do not need quotation marks around boat in the third line of this pro-

gram. However, removing the quotation marks from around number five would

generate an error because the string contains a SPACE.
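Because a hash is unordered, a program that needs repeatable output typically sorts the keys before using them. The following sketch applies sort (page 513) to the list that keys returns and then looks up each value; the @lines array is a name introduced only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash2 = (
    boat          => "tuna",
    "number five" => 5,
    4             => "fish",
);

# keys returns the keys in no particular order; sort makes the order repeatable.
my @lines;
foreach my $key (sort(keys(%hash2))) {
    push(@lines, "$key => $hash2{$key}");
}
print "$_\n" foreach @lines;
```

With the default textual sort, the key 4 comes first, followed by boat and number five.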

Control Structures

Control flow statements alter the order of execution of statements within a Perl

program. Starting on page 398, Chapter 10 discusses bash control structures in

detail and includes flow diagrams. Perl control structures perform the same functions

as their bash counterparts, although the two languages use different syntaxes.

The description of each control structure in this section references the discussion of

the same control structure under bash.

In this section, the bold italic words in the syntax description are the items you sup-

ply to cause the structure to have the desired effect, the nonbold italic words are the

keywords Perl uses to identify the control structure, and {...} represents a block

(page 489) of statements. Many of these structures use an expression, denoted as

expr, to control their execution. See if/unless (next) for an example and explanation

of a syntax description.

if/unless

The if and unless control structures are compound statements that have the following

syntax:

if (expr) {...}

unless (expr) {...}

These structures differ only in the sense of the test they perform. The if structure

executes the block of statements if expr evaluates to true; unless executes the block

of statements unless expr evaluates to true (i.e., if expr is false).

The if appears in nonbold type because it is a keyword; it must appear exactly as

shown. The expr is an expression; Perl evaluates it and executes the block

(page 489) of statements represented by {...} if the expression evaluates as required

by the control structure.

File test operators The expr in the following example, –r memo1, uses the –r file test operator to

determine if a file named memo1 exists in the working directory and if the file is

readable. Although this operator tests only whether you have read permission

for the file, the file must exist for you to have read permission; thus it implicitly

tests that the file is present. (Perl uses the same file test operators as bash; see

Table 10-1 on page 401.) If this expression evaluates to true, Perl executes the

block of statements (in this case one statement) between the braces. If the

expression evaluates to false, Perl skips the block of statements. In either case,

Perl then exits and returns control to the shell.

$ cat if1.pl

#!/usr/bin/perl -w

if (-r "memo1") {

say "The file 'memo1' exists and is readable.";

}

$ ./if1.pl

The file 'memo1' exists and is readable.

Following is the same program written using the postfix if syntax. Which syntax you

use depends on which part of the statement is more important to someone reading

the code.

$ cat if1a.pl

#!/usr/bin/perl -w

say "The file 'memo1' exists and is readable." if (-r "memo1");
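The examples above use if only. As a complement, the following sketch shows unless in both the block and postfix forms. The filename is hypothetical and is assumed not to exist in the working directory, so the -r test is false and both messages are displayed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = "no-such-memo-for-this-example";   # hypothetical; assumed not to exist

my $message = "";
unless (-r $file) {           # the block runs because the test is false
    $message = "The file '$file' is missing or is not readable.";
}
print "$message\n";

# The same test written with the postfix syntax:
print "Cannot read '$file'.\n" unless -r $file;
```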

The next example uses a print statement to display a prompt on standard output

and uses the statement $entry = <>; to read a line from standard input and assign

the line to the variable $entry. Reading from standard input, working with other

files, and use of the magic file handle (<>) for reading files specified on the com-

mand line are covered on page 510.

Comparison operators

Perl uses different operators to compare numbers from those it uses to compare

strings. Table 11-2 lists numeric and string comparison operators. In the following

example, the expression in the if statement uses the == numeric comparison operator

to compare the value the user entered and the number 28. This operator performs a

numeric comparison, so the user can enter 28, 28.0, or 00028 and in all cases the

result of the comparison will be true. Also, because the comparison is numeric, Perl

ignores both the whitespace around and the NEWLINE following the user’s entry. The –w

option causes Perl to issue a warning if the user enters a nonnumeric value and the

program uses that value in an arithmetic expression; without this option Perl silently

evaluates the expression as false.

$ cat if2.pl

#!/usr/bin/perl -w

print "Enter 28: ";

$entry = <>;

if ($entry == 28) { # use == for a numeric comparison

print "Thank you for entering 28.\n";

}

print "End.\n";

$ ./if2.pl

Enter 28: 28.0

Thank you for entering 28.

End.

The next program is similar to the preceding one, except it tests for equality

between two strings. The chomp function (page 511) removes the trailing NEWLINE

from the user’s entry—without this function the strings in the comparison would

never match. The eq comparison operator compares strings. In this example the

result of the string comparison is true when the user enters the string five. Leading

or trailing whitespace will yield a result of false, as would the string 5, although

none of these entries would generate a warning because they are legitimate strings.

$ cat if2a.pl

#!/usr/bin/perl -w

print "Enter the word 'five': ";

$entry = <>;

chomp ($entry);

if ($entry eq "five") { # use eq for a string comparison

print "Thank you for entering 'five'.\n";

}

print "End.\n";

$ ./if2a.pl

Enter the word 'five': five

Thank you for entering 'five'.

End.
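To make the difference between the two families of operators concrete, the following sketch applies both kinds of comparison to the same data. The sample string stands in for a line a user might type; it is an assumption made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $entry = "28.0\n";                        # as if read from standard input

my $num_equal = ($entry == 28) ? 1 : 0;      # numeric: 28.0 equals 28
chomp($entry);
my $str_equal = ($entry eq "28") ? 1 : 0;    # string: "28.0" differs from "28"

print "numeric: $num_equal, string: $str_equal\n";   # numeric: 1, string: 0

# <=> and cmp return -1, 0, or 1 instead of true or false.
my $num_order = (10 <=> 9);        #  1: the number 10 is greater than 9
my $str_order = ("10" cmp "9");    # -1: the string "10" sorts before "9"
print "$num_order $str_order\n";   # 1 -1
```

The last two lines show why the choice of operator matters: the same two values compare in opposite orders depending on whether Perl treats them as numbers or as strings.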

if...else

The if...else control structure is a compound statement that is similar to the bash

if...then...else control structure (page 402). It implements a two-way branch using

the following syntax:

if (expr) {...} else {...}

Table 11-2   Comparison operators

Numeric     String      Value returned based on the relationship between the
operators   operators   values preceding and following the operator
==          eq          True if equal
!=          ne          True if not equal
<           lt          True if less than
>           gt          True if greater than
<=          le          True if less than or equal
>=          ge          True if greater than or equal
<=>         cmp         0 if equal, 1 if greater than, -1 if less than

die The next program prompts the user for two different numbers and stores those

numbers in $num1 and $num2. If the user enters the same number twice, an if

structure executes a die function, which sends its argument to standard error and

aborts program execution.

If the user enters different numbers, the if...else structure reports which number is

larger. Because expr performs a numeric comparison, the program accepts numbers

that include decimal points.

$ cat ifelse.pl

#!/usr/bin/perl -w

print "Enter a number: ";

$num1 = <>;

print "Enter another, different number: ";

$num2 = <>;

if ($num1 == $num2) {

die ("Please enter two different numbers.\n");

}

if ($num1 > $num2) {

print "The first number is greater than the second number.\n";

}

else {

print "The first number is less than the second number.\n";

}

$ ./ifelse.pl

Enter a number: 8

Enter another, different number: 8

Please enter two different numbers.

$ ./ifelse.pl

Enter a number: 5.5

Enter another, different number: 5

The first number is greater than the second number.

if...elsif...else

Similar to the bash if...then...elif control structure (page 405), the Perl if...elsif...else

control structure is a compound statement that implements a nested set of if...else

structures using the following syntax:

if (expr) {...} elsif (expr) {...} ... else {...}

The next program implements the functionality of the preceding ifelse.pl program

using an if...elsif...else structure. A print statement replaces the die statement

because the last statement in the program displays the error message; the program

terminates after executing this statement anyway. You can use the STDERR handle

(page 510) to cause Perl to send this message to standard error instead of standard

output.

$ cat ifelsif.pl

#!/usr/bin/perl -w

print "Enter a number: ";

$num1 = <>;

print "Enter another, different number: ";

$num2 = <>;

if ($num1 > $num2) {

print "The first number is greater than the second number.\n";

}

elsif ($num1 < $num2) {

print "The first number is less than the second number.\n";

}

else {

print "Please enter two different numbers.\n";

}
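The following variation on ifelsif.pl is a sketch of the STDERR suggestion made above. Fixed sample values replace the user's input so the sketch runs without prompting, and the $where variable is introduced only to record which branch ran; both are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($num1, $num2) = (7, 7);    # sample values standing in for user input

my $where;                     # records which branch ran (illustration only)
if ($num1 > $num2) {
    print "The first number is greater than the second number.\n";
    $where = "stdout";
}
elsif ($num1 < $num2) {
    print "The first number is less than the second number.\n";
    $where = "stdout";
}
else {
    # Naming the STDERR handle sends the message to standard error.
    print STDERR "Please enter two different numbers.\n";
    $where = "stderr";
}
```

Unlike die, print STDERR does not terminate the program; any statements that follow the structure still run.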

foreach/for

The Perl foreach and for keywords are synonyms; you can replace one with the other

in any context. These structures are compound statements that have two syntaxes.

Some programmers use one syntax with foreach and the other syntax with for,

although there is no need to do so. This book uses foreach with both syntaxes.

foreach: Syntax 1

The first syntax for the foreach structure is similar to the shell’s for...in structure

(page 411):

foreach|for [var] (list) {...}

where list is a list of expressions or variables. Perl executes the block of statements

once for each item in list, sequentially assigning to var the value of one item in list

on each iteration, starting with the first item. If you do not specify var, Perl assigns

values to the $_ variable (page 509).

The following program demonstrates a simple foreach structure. On the first pass

through the loop, Perl assigns the string Mo to the variable $item and the say state-

ment displays the value of this variable followed by a NEWLINE. On the second and

third passes through the loop, $item is assigned the values Larry and Curly, respectively. When

there are no items left in the list, Perl continues with the statement following the

foreach structure. In this case, the program terminates. See use feature 'say' on

page 491 if this program complains about say not being available.

$ cat foreach.pl

foreach $item ("Mo", "Larry", "Curly") {

say "$item says hello.";

}

$ perl foreach.pl

Mo says hello.

Larry says hello.

Curly says hello.

Using $_ (page 509), you can write this program as follows:

$ cat foreacha.pl

foreach ("Mo", "Larry", "Curly") {

say "$_ says hello.";

}

Following is the program using an array:

$ cat foreachb.pl

@stooges = ("Mo", "Larry", "Curly");

foreach (@stooges) {

say "$_ says hello.";

}

Following is the program using the foreach postfix syntax:

$ cat foreachc.pl

@stooges = ("Mo", "Larry", "Curly");

say "$_ says hello." foreach @stooges;

The loop variable ($item and $_ in the preceding examples) references the elements

in the list within the parentheses. When you modify the loop variable, you modify

the element in the list. The uc function returns an upshifted version of its argument.

The next example shows that modifying the loop variable $stooge modifies the

@stooges array:

$ cat foreachd.pl

@stooges = ("Mo", "Larry", "Curly");

foreach $stooge (@stooges) {

$stooge = uc $stooge;

say "$stooge says hello.";

}

say "$stooges[1] is uppercase";

$ perl foreachd.pl

MO says hello.

LARRY says hello.

CURLY says hello.

LARRY is uppercase

See page 513 for an example that loops through command-line arguments.

last and next

Perl’s last and next statements allow you to interrupt a loop; they are analogous to the

Bourne Again Shell’s break and continue statements (page 420). The last statement

transfers control to the statement following the block of statements controlled by the

loop structure, terminating execution of the loop. The next statement transfers control

to the end of the block of statements, which continues execution of the loop with the

next iteration.

In the following program, the if structure tests whether $item is equal to the string

two; if it is, the structure executes the next command, which skips the say statement

and continues with the next iteration of the loop. If you replaced next with last, Perl

would exit from the loop and not display three. See use feature 'say' on page 491 if

this program complains about say not being available.

$ cat foreach1.pl

foreach $item ("one", "two", "three") {

if ($item eq "two") {

next;

}

say "$item";

}

$ perl foreach1.pl

one

three
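For comparison, the following sketch is foreach1.pl with last in place of next. The @shown array is an addition that records what the loop displays; it is not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @shown;                    # records the items the loop displays
foreach my $item ("one", "two", "three") {
    if ($item eq "two") {
        last;                 # leave the loop entirely
    }
    push(@shown, $item);
    print "$item\n";
}
print "Displayed: @shown\n";  # Displayed: one
```

Because last transfers control past the end of the loop, the loop never reaches three.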

foreach: Syntax 2

The second syntax for the foreach structure is similar to the C for structure:

foreach|for (expr1; expr2; expr3) {...}

The expr1 initializes the foreach loop; Perl evaluates expr1 one time, before it executes

the block of statements. The expr2 is the termination condition; Perl evaluates it before

each pass through the block of statements and executes the block of statements if

expr2 evaluates as true. Perl evaluates expr3 after each pass through the block of

statements—it typically increments a variable that is part of expr2.

In the next example, the foreach2.pl program prompts for three numbers (a starting

number, an ending number, and an increment); displays the starting number; repeatedly

adds the increment to this number, displaying each result until the result would be

greater than the ending number; and quits.

See page 510 for a discussion of the magic file handle (<>).

$ cat ./foreach2.pl

#!/usr/bin/perl -w

print "Enter starting number: ";

$start = <>;

print "Enter ending number: ";

$end = <>;

print "Enter increment: ";

$incr = <>;

if ($start >= $end || $incr < 1) {

die ("The starting number must be less than the ending number\n",

"and the increment must be greater than zero.\n");

}

foreach ($count = $start+0; $count <= $end; $count += $incr) {

say "$count";

}

$ ./foreach2.pl

Enter starting number: 2

Enter ending number: 10

Enter increment: 3

2

5

8

After prompting for three numbers, the preceding program tests whether the starting

number is greater than or equal to the ending number or if the increment is less than

1. The || is a Boolean OR operator; the expression within the parentheses following if

evaluates to true if either the expression before or the expression after this operator

evaluates to true.

The foreach statement begins by assigning the value of $start+0 to $count. Adding

0 (zero) to the string $start forces Perl to work in a numeric context, removing the

trailing NEWLINE when it performs the assignment. Without this fix, the program

would display an extra NEWLINE following the first number it displayed.
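The effect of adding zero can be seen in isolation in the following sketch, which uses a fixed string in place of a line read by <> and then runs the same kind of C-style loop. The variable names and the limits 10 and 3 are assumptions chosen to mirror the sample run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $start = "2\n";            # a line as read by <>, including the NEWLINE

my $as_string = $start;       # a plain copy keeps the NEWLINE
my $as_number = $start + 0;   # arithmetic yields the number 2

print "[$as_string]\n";       # the closing bracket lands on the next line
print "[$as_number]\n";       # [2]

# C-style loop: initialize once; test before each pass; step after each pass.
my @counts;
foreach (my $count = $as_number; $count <= 10; $count += 3) {
    push(@counts, $count);
}
print "@counts\n";            # 2 5 8
```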

while/until

The while (page 414) and until (page 418) control structures are compound statements

that implement conditional loops using the following syntax:

while (expr) {...}

until (expr) {...}

These structures differ only in the sense of their termination conditions. The while

structure repeatedly executes the block of statements while expr evaluates to true;

until continues until expr evaluates to true (i.e., while expr remains false).

The following example demonstrates one technique for reading and processing

input until there is no more input. Although this example shows input coming from

the user (standard input), the technique works the same way for input coming from

a file (see the example on page 512). The user enters CONTROL-D on a line by itself to

signal the end of file.

In this example, expr is $line = <>. This statement uses the magic file handle (<>;

page 510) to read one line from standard input and assigns the string it reads to the

$line variable. This statement evaluates to true as long as it reads data. When it

reaches the end of file, the statement evaluates to false. The while loop continues to

execute the block of statements (in this example, only one statement) as long as

there is data to read.

$ cat while1.pl

#!/usr/bin/perl -w

$count = 0;

while ($line = <>) {

print ++$count, ". $line";

}

print "\n$count lines entered.\n";

$ ./while1.pl

Good Morning.

1. Good Morning.

Today is Monday.

2. Today is Monday.

CONTROL-D

2 lines entered.

In the preceding example, $count keeps track of the number of lines the user enters.

Putting the ++ increment operator before a variable (++$count; called a preincre-

ment operator) increments the variable before Perl evaluates it. Alternatively, you

could initialize $count to 1 and increment it with $count++ (postincrement), but

then in the final print statement $count would equal one more than the number of

lines entered.
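The preceding example uses while only. The following sketch shows the same count written with until and then with while, using the preincrement operator in the first loop and the postincrement operator in the second. It reads no input; the limit of 3 is arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 0;
my @seen;

# until runs the block as long as the expression is false.
until ($count >= 3) {
    push(@seen, ++$count);    # preincrement: add 1, then use the new value
}
print "@seen\n";              # 1 2 3

# The equivalent while loop negates the test.
$count = 0;
while ($count < 3) {
    $count++;                 # postincrement; the returned value is not used here
}
print "$count\n";             # 3
```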

$. The $. variable keeps track of the number of lines of input a program has read.

Using $. you can rewrite the previous example as follows:

$ cat while1a.pl

#!/usr/bin/perl -w

while ($line = <>) {

print $., ". $line";

}

print "\n$. lines entered.\n";

$_ Frequently you can simplify Perl code by using the $_ variable. You can use $_

many places in a Perl program—think of $_ as meaning it, the object of what you

are doing. It is the default operand for many operations. For example, the following

section of code processes a line using the $line variable. It reads a line into $line,

removes any trailing NEWLINE from $line using chomp (page 511), and checks whether

a regular expression matches $line.

while (my $line = <>) {

chomp $line;

if ($line =~ /regex/) ...

}

You can rewrite this code by using $_ to replace $line. (Do not declare $_ with my;

lexical $_ was experimental and was removed in Perl 5.24.)

while ($_ = <>) {

chomp $_;

if ($_ =~ /regex/) ...

}

Because $_ is the default operand in each of these instances, you can also omit $_

altogether:

while (<>) { # read into $_

chomp; # chomp $_

if (/regex/) ... # if $_ matches regex

}
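The following runnable sketch applies the same shorthand to an in-memory list instead of to standard input, so it needs no input file. The sample strings and the pattern /^apple/ are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ("apple pie\n", "banana split\n", "apple tart\n");   # sample data

my @matches;
foreach (@lines) {            # each element becomes $_ in turn
    chomp;                    # chomp $_
    if (/^apple/) {           # does $_ match the regular expression?
        push(@matches, $_);
    }
}
print "$_\n" foreach @matches;
```

Because the loop variable references the elements of the list (page 506), chomp here also removes the NEWLINEs from the strings stored in @lines.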

Working with Files

Opening a file and assigning a handle

A handle is a name that you can use in a Perl program to refer to a file or process

that is open for reading and/or writing. When you are working with the

shell, handles are referred to as file descriptors (page 431). As when you are

working with the shell, the kernel automatically opens handles for standard

input (page 123), standard output (page 123), and standard error (page 275)

before it runs a program. The kernel closes these descriptors after a program

finishes running. The names for these handles are STDIN, STDOUT, and

STDERR, respectively. You must manually open handles to read from or write

to other files or processes. The syntax of an open statement is

open (file-handle, ['mode',] "file-ref");

where file-handle is the name of the handle or a variable you will use in the pro-

gram to refer to the file or process named by file-ref. If you omit mode or specify a

mode of <, Perl opens the file for input (reading). Specify mode as > to truncate and

write to a file or as >> to append to a file.

See page 526 for a discussion of reading from and writing to processes.

Writing to a file The print function writes output to a file or process. The syntax of a print statement is

print [file-handle] "text";

where file-handle is the name of the handle you specified in an open statement and

text is the information you want to output. The file-handle can also be STDOUT

or STDERR, as explained earlier. Except when you send information to standard

output, you must specify a handle in a print statement. Do not place a comma after

file-handle. Also, do not enclose arguments to print within parentheses because

doing so can create problems.
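The three modes and the print syntax can be seen together in the following sketch, which writes, appends to, and then reads a scratch file. The File::Temp module (part of the standard Perl distribution) supplies a temporary directory; the filename notes.txt and the variable names are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A scratch directory and filename invented for this example.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/notes.txt";

open(my $out, '>', $file) or die "Cannot write '$file': $!\n";    # truncate and write
print $out "first line\n";          # no comma after the handle
close($out);

open($out, '>>', $file) or die "Cannot append to '$file': $!\n";  # append
print $out "second line\n";
close($out);

open(my $in, '<', $file) or die "Cannot read '$file': $!\n";      # read
my @contents = <$in>;               # in a list context, reads every line
close($in);

print @contents;
```

Opening the file a second time with > instead of >> would have discarded the first line.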

Reading from a file The following expression reads one line, including the NEWLINE (\n), from the file or

process associated with file-handle:

<file-handle>

This expression is typically used in a statement such as

$line = <IN>;

which reads into the variable $line one line from the file or process identified by the

handle IN.

Magic file handle (<>)

To facilitate reading from files named on the command line or from standard input,

Perl provides the magic file handle. This book uses this file handle in most exam-

ples. In place of the preceding line, you can use

$line = <>;

This file handle causes a Perl program to work like many Linux utilities: It reads

from standard input unless the program is called with one or more arguments, in

which case it reads from the files named by the arguments. See page 128 for an

explanation of how this feature works with cat.

The print statement in the first line in the next example includes the optional han-

dle STDOUT; the next print statement omits this handle; the final print statement

uses the STDERR file handle, which causes print’s output to go to standard error.

The first print statement prompts the user to enter something. The string that this

statement outputs is terminated with a SPACE, not a NEWLINE, so the user can enter

information on the same line as the prompt. The second line then uses a magic file

handle to read one line from standard input, which it assigns to $userline. Because

of the magic file handle, if you call file1.pl with an argument that is a filename, it

reads one line from that file instead of from standard input. The command line

that runs file1.pl uses 2> (see “File descriptors” on page 275) to redirect standard

error (the output of the third print statement) to the file1.err file.

$ cat file1.pl

print STDOUT "Enter something: ";

$userline = <>;

print "1>>>$userline<<<\n";

chomp ($userline);

print "2>>>$userline<<<\n";

print STDERR "3. Error message.\n";

$ perl file1.pl 2> file1.err

Enter something: hi there

1>>>hi there

<<<

2>>>hi there<<<

$ cat file1.err

3. Error message.

chomp/chop The two print statements following the user input in file1.pl display the value of

$userline immediately preceded by greater than signs (>) and followed by less than

signs (<). The first of these statements demonstrates that $userline includes a NEWLINE:

The less than signs following the string the user entered appear on the line following

the string. The chomp function removes a trailing NEWLINE, if it exists, from a string.

After chomp processes $userline, the print statement shows that this variable no

longer contains a NEWLINE. (The chop function is similar to chomp, except it removes

any trailing character from a string.)
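The difference between the two functions shows up when the string does not end with a NEWLINE, as in the following sketch. The sample strings are taken from the file1.pl run; the variable names are introduced for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $with_newline    = "hi there\n";
my $without_newline = "hi there";

chomp($with_newline);         # removes the trailing NEWLINE
chomp($without_newline);      # nothing to remove; the string is unchanged

my $chopped = "hi there";
chop($chopped);               # removes the final character, whatever it is

print ">>>$with_newline<<<\n";      # >>>hi there<<<
print ">>>$without_newline<<<\n";   # >>>hi there<<<
print ">>>$chopped<<<\n";           # >>>hi ther<<<
```

For this reason chomp is usually the safer choice for cleaning up a line of input.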

The next example shows how to read from a file. It uses an open statement to assign

the lexical file handle $infile to the file /usr/share/dict/words. Each iteration of the

while structure evaluates an expression that reads a line from the file represented by

$infile and assigns the line to $line. When while reaches the end of file, the expres-

sion evaluates to false; control then passes out of the while structure. The block of

one statement displays the line as it was read from the file, including the NEWLINE.

This program copies /usr/share/dict/words to standard output. A pipe (|; page 56)

is then used to send the output through head (page 52), which displays the first four

lines of the file (the first line is blank).

$ cat file2.pl

open (my $infile, "/usr/share/dict/words") or die "Cannot open dictionary: $!\n";

while ($line = <$infile>) {

print $line;

}

$ perl file2.pl | head -4

A's

AOL

$! The $! variable holds the last system error. In a numeric context, it holds the system

error number; in a string context, it holds the system error string. If the words file is

not present on the system, file2.pl displays the following message:

Cannot open dictionary: No such file or directory

If you do not have read permission for the file, the program displays this message:

Cannot open dictionary: Permission denied

Displaying the value of $! gives the user more information about what went wrong

than simply saying that the program could not open the file.
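The two contexts can be captured separately, as the following sketch suggests. The path is hypothetical and is assumed not to exist; the program saves both forms of $! immediately after open fails, before another system call can change the value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = "/no/such/dir/words";   # hypothetical path, assumed not to exist

my ($errno, $errstr);
unless (open(my $infile, '<', $file)) {
    $errno  = $! + 0;              # numeric context: the system error number
    $errstr = "$!";                # string context: the system error string
    print STDERR "Cannot open '$file': $errstr (error $errno)\n";
}
```

The exact number and wording come from the operating system, so they can differ from one system to another.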

@ARGV The @ARGV array holds the arguments from the command line Perl was called

with. When you call the following program with a list of filenames, it displays the

first line of each file. If the program cannot read a file, die (page 503) sends an

error message to standard error and quits. The foreach structure loops through the

command-line arguments, as represented by @ARGV, assigning each argument in

turn to $filename. The foreach block starts with an open statement. Perl executes

the open statement that precedes the Boolean OR operator (or); only if open fails does

Perl execute the statement following the or operator (die). The result is that Perl either

opens the file named by $filename and assigns IN as its handle or, if it cannot open

that file, executes the die statement and quits. The print statement displays the

name of the file followed by a colon and the first line of the file. When it accepts

$line = <IN> as an argument to print, Perl displays the value of $line following the

assignment. After reading a line from a file, the program closes the file.

Always check for an error when opening a file

tip When a Perl program attempts to open a file and fails, the program does not display an error

message unless it checks whether open returned an error. In file2.pl, the or operator in the

open statement causes Perl to execute die (page 503) if open fails. The die statement sends

the message Cannot open dictionary: followed by the system error string to standard error

and terminates the program.

$ cat file3.pl

foreach $filename (@ARGV) {

open (IN, $filename) or die "Cannot open file '$filename': $!\n";

print "$filename: ", $line = <IN>;

close (IN);

}

$ perl file3.pl f1 f2 f3 f4

f1: First line of file f1.

f2: First line of file f2.

Cannot open file 'f3': No such file or directory

The next example is similar to the preceding one, except it takes advantage of several

Perl features that make the code simpler. It does not quit when it cannot read a file.

Instead, Perl displays an error message and continues. The first line of the program

uses my to declare $filename to be a lexical variable. Next, while uses the magic file

handle to open and read each line of each file named by the command-line argu-

ments; $ARGV holds the name of the file. When there are no more files to read, the

while condition [(<>)] is false, while transfers control outside the while block, and

the program terminates. Perl takes care of all file opening and closing operations;

you do not have to write code to take care of these tasks. Perl also performs error

checking.

The program displays the first line of each file named by a command-line argument.

Each time through the while block, while reads another line. When it finishes with

one file, it starts reading from the next file. Within the while block, if tests whether

it is processing a new file. If it is, the if block displays the name of the file and the

(first) line from the file and then assigns the new filename ($ARGV) to $filename.

$ cat file3a.pl

my $filename;

while (<>) {

if ($ARGV ne $filename) {

print "$ARGV: $_";

$filename = $ARGV;

}

}

$ perl file3a.pl f1 f2 f3 f4

f1: First line of file f1.

f2: First line of file f2.

Can't open f3: No such file or directory at file3a.pl line 3, <> line 3.

f4: First line of file f4.

Sort

reverse The sort function returns elements of an array ordered numerically or alphabeti-

cally, based on the locale (page 963) environment. The reverse function is not

related to sort; it simply returns the elements of an array in reverse order.

The first two lines of the following program assign values to the @colors array and

display these values. Each of the next two pairs of lines uses sort to put the values in

the @colors array in order, assign the result to @scolors, and display @scolors. These

sorts put uppercase letters before lowercase letters. Observe the positions of Orange

and Violet, both of which begin with an uppercase letter, in the sorted output. The

first assignment statement in these two pairs of lines uses the full sort syntax, including

the block {$a cmp $b} that tells Perl to use the cmp operator, which compares

strings, and to put the result in ascending order. When you omit the block in a sort

statement, as is the case in the second assignment statement, Perl also performs an

ascending textual sort.

$ cat sort3.pl

@colors = ("red", "Orange", "yellow", "green", "blue", "indigo", "Violet");

say "@colors";

@scolors = sort {$a cmp $b} @colors; # ascending sort with

say "@scolors"; # an explicit block

@scolors = sort @colors; # ascending sort with

say "@scolors"; # an implicit block

@scolors = sort {$b cmp $a} @colors; # descending sort

say "@scolors";

@scolors = sort {lc($a) cmp lc($b)} @colors; # ascending folded sort

say "@scolors";

$ perl sort3.pl

red Orange yellow green blue indigo Violet

Orange Violet blue green indigo red yellow

Orange Violet blue green indigo red yellow

yellow red indigo green blue Violet Orange

blue green indigo Orange red Violet yellow

The third sort in the preceding example reverses the positions of $a and $b in the

block to specify a descending sort. The last sort converts the strings to lowercase

before comparing them, providing a sort wherein the uppercase letters are folded

into the lowercase letters. As a result, Orange and Violet appear in alphabetical

order.

To perform a numerical sort, specify the <=> operator in place of cmp. The following

example demonstrates ascending and descending numerical sorts:

$ cat sort4.pl

@numbers = (22, 188, 44, 2, 12);

print "@numbers\n";

@snumbers = sort {$a <=> $b} @numbers;

print "@snumbers\n";

@snumbers = sort {$b <=> $a} @numbers;

print "@snumbers\n";

$ perl sort4.pl

22 188 44 2 12

2 12 22 44 188

188 44 22 12 2
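The reverse function mentioned at the start of this section can be combined with sort. The following sketch is not part of the original examples, and its variable names are illustrative. It suggests that reversing an ascending numeric sort yields the same list as a descending sort:

```perl
use strict;
use warnings;

my @numbers = (22, 188, 44, 2, 12);

# Descending sort written two ways
my @desc1 = sort { $b <=> $a } @numbers;           # swap $a and $b in the block
my @desc2 = reverse sort { $a <=> $b } @numbers;   # sort ascending, then reverse

print "@desc1\n";
print "@desc2\n";
```

Both print statements display 188 44 22 12 2.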

Subroutines

All variables are package variables (page 489) unless you use the my function to

define them to be lexical variables (page 489). Lexical variables defined in a subrou-

tine are local to that subroutine.

The following program includes a main part and a subroutine named add(). This

program uses the variables named $one, $two, and $ans, all of which are package

variables: They are available to both the main program and the subroutine. The call

to the subroutine does not pass values to the subroutine and the subroutine returns

no values. This setup is not typical: It demonstrates that all variables are package

variables unless you use my to declare them to be lexical variables.

The subroutine1.pl program assigns values to two variables and calls a subroutine.

The subroutine adds the values of the two variables and assigns the result to

another variable. The main part of the program displays the result.

$ cat subroutine1.pl

$one = 1;

$two = 2;

add();

print "Answer is $ans\n";

sub add {

$ans = $one + $two;

}

$ perl subroutine1.pl

Answer is 3

The next example is similar to the previous one, except the subroutine takes advan-

tage of a return statement to return a value to the main program. The program

assigns the value returned by the subroutine to the variable $ans and displays that

value. Again, all variables are package variables.

$ cat subroutine2.pl

$one = 1;

$two = 2;

$ans = add();

print "Answer is $ans\n";

sub add {

return ($one + $two)

}

$ perl subroutine2.pl

Answer is 3

Keeping variables local to a subroutine is important in many cases. The subroutine in

the next example changes the values of variables and insulates the calling program

from these changes by declaring and using lexical variables. This setup is more typical.

@_ When you pass values in a call to a subroutine, Perl makes those values available in

the array named @_ in the subroutine. Although @_ is local to the subroutine, its

elements are aliases for the parameters the subroutine was called with. Changing a

value in the @_ array changes the value of the underlying variable, which may not

be what you want. The next program avoids this pitfall by assigning the values

passed to the subroutine to lexical variables.

The subroutine3.pl program calls the addplusone() subroutine with two variables as

arguments and assigns the value returned by the subroutine to a variable. The first

statement in the subroutine declares two lexical variables and assigns to them the

values from the @_ array. The my function declares these variables to be lexical.

(See the tip on lexical and package variables on page 494.) Although you can use

my without assigning values to the declared variables, the syntax in the example is

more commonly used. The next two statements increment the lexical variables

$lcl_one and $lcl_two. The print statement displays the value of $lcl_one within the

subroutine. The return statement returns the sum of the two incremented, lexical

variables.

$ cat subroutine3.pl

$one = 1;

$two = 2;

$ans = addplusone($one, $two);

print "Answer is $ans\n";

print "Value of 'lcl_one' in main: $lcl_one\n";

print "Value of 'one' in main: $one\n";

sub addplusone {

my ($lcl_one, $lcl_two) = @_;

$lcl_one++;

$lcl_two++;

print "Value of 'lcl_one' in sub: $lcl_one\n";

return ($lcl_one + $lcl_two)

}

$ perl subroutine3.pl

Value of 'lcl_one' in sub: 2

Answer is 5

Value of 'lcl_one' in main:

Value of 'one' in main: 1

After displaying the result returned by the subroutine, the print statements in the

main program demonstrate that $lcl_one is not defined in the main program (it is

local to the subroutine) and that the value of $one has not changed.
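For contrast, the aliasing behavior of @_ described earlier can be seen in a short sketch. The subroutine names bump_in_place and bump_copy are hypothetical and do not appear in the original programs. The first subroutine changes its argument through @_; the second copies the argument to a lexical variable, as subroutine3.pl does.

```perl
use strict;
use warnings;

my $one = 1;

# Changes the caller's variable because $_[0] is an alias for it
sub bump_in_place {
    $_[0]++;
    return;
}

# Copies the argument to a lexical variable, leaving the caller's value alone
sub bump_copy {
    my ($n) = @_;
    $n++;
    return $n;
}

my $copy = bump_copy($one);
print "After bump_copy: one=$one, returned $copy\n";

bump_in_place($one);
print "After bump_in_place: one=$one\n";
```

The first line of output shows $one still equal to 1; the second shows that the call to bump_in_place() changed it to 2.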

The next example illustrates another way to work with parameters passed to a sub-

routine. This subroutine does not use variables other than the @_ array it was

passed and does not change the values of any elements of that array.

$ cat subroutine4.pl

$one = 1;

$two = 2;

$ans = addplusone($one, $two);

print "Answer is $ans\n";

sub addplusone {

return ($_[0] + $_[1] + 2);

}

$ perl subroutine4.pl

Answer is 5

The final example in this section presents a more typical Perl subroutine. The sub-

routine max() can be called with any number of numeric arguments and returns the

value of the largest argument. It uses the shift function to assign to $biggest the

value of the first argument the subroutine was called with and to shift the rest of the

arguments. After using shift, argument number 2 becomes argument number 1 (8),

argument 3 becomes argument 2 (64), and argument 4 becomes argument 3 (2).

Next, foreach loops over the remaining arguments (@_). Each time through the

foreach block, Perl assigns to $_ the value of each of the arguments, in order. The

$biggest variable is assigned the value of $_ if $_ is bigger than $biggest. When

max() finishes going through its arguments, $biggest holds the maximum value,

which max() returns.

$ cat subroutine5.pl

$ans = max (16, 8, 64, 2);

print "Maximum value is $ans\n";

sub max {

my $biggest = shift; # Assign first and shift the rest of the arguments to max()

foreach (@_) { # Loop through remaining arguments

$biggest = $_ if $_ > $biggest;

}

return ($biggest);

}

$ perl subroutine5.pl

Maximum value is 64
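Writing max() by hand is a useful exercise. As an alternative, the List::Util module, which is distributed with Perl, exports a max function on request. The following sketch assumes that module is available on the local system:

```perl
use strict;
use warnings;
use List::Util qw(max);   # core module; max is exported only when requested

my $ans = max(16, 8, 64, 2);
print "Maximum value is $ans\n";
```

Importing max this way would clash with a subroutine of the same name defined in the same program, so use one approach or the other.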

Regular Expressions

Appendix A defines and discusses regular expressions as you can use them in many

Linux utilities. All of the material in Appendix A applies to Perl, except as noted. In

addition to the facilities described in Appendix A, Perl offers regular expression fea-

tures that allow you to perform more complex string processing. This section

reviews some of the regular expressions covered in Appendix A and describes some

of the additional features of regular expressions available in Perl. It also introduces

the syntax Perl uses for working with regular expressions.

Syntax and the =~ Operator

The -l option The Perl -l option applies chomp to each line of input and places \n at the end of

each line of output. The examples in this section use the Perl -l and -e (page 490)

options. Because the program must be specified as a single argument, the examples

enclose the Perl programs within single quotation marks. The shell interprets the

quotation marks and does not pass them to Perl.

/ is the default delimiter

By default, Perl delimits a regular expression with slashes (/). The first program uses

the =~ operator to search for the pattern ge in the string aged. You can think of the

=~ operator as meaning “contains.” Using different terminology, the =~ operator

determines whether the regular expression ge has a match in the string aged. The

regular expression in this example contains no special characters; the string ge is

part of the string aged. Thus the expression within the parentheses evaluates to true

and Perl executes the print statement.

$ perl -le 'if ("aged" =~ /ge/) {print "true";}'

true

You can achieve the same functionality by using a postfix if statement:

$ perl -le 'print "true" if "aged" =~ /ge/'

true

!~ The !~ operator works in the opposite sense from the =~ operator. The expression in

the next example evaluates to true because the regular expression xy does not

match any part of aged:

$ perl -le 'print "true" if ("aged" !~ /xy/)'

true

As explained on page 889, a period within a regular expression matches any single

character, so the regular expression a..d matches the string aged:

$ perl -le 'print "true" if ("aged" =~ /a..d/)'

true

You can use a variable to hold a regular expression. The following syntax quotes

string as a regular expression:

qr/string/

Tip: Using other options with -e

When you use another option with -e, the program must immediately follow the -e on the command

line. Like many other utilities, Perl allows you to combine options following a single hyphen;

if -e is one of the combined options, it must appear last in the list of options. Thus you can use

perl -l -e or perl -le but not perl -e -l or perl -el.

The next example uses this syntax to assign the regular expression /a..d/ (includ-

ing the delimiters) to the variable $re and then uses that variable as the regular

expression:

$ perl -le '$re = qr/a..d/; print "true" if ("aged" =~ $re)'

true

If you want to include the delimiter within a regular expression, you must quote

it. In the next example, the default delimiter, a slash (/), appears in the regular

expression. To keep Perl from interpreting the / in /usr as the end of the regular

expression, the / that is part of the regular expression is quoted by preceding it

with a backslash (\). See page 891 for more information on quoting characters in

regular expressions.

$ perl -le 'print "true" if ("/usr/doc" =~ /\/usr/)'

true

Quoting several characters by preceding each one with a backslash can make a

complex regular expression harder to read. Instead, you can precede a delimited

regular expression with m and use a paired set of characters, such as {}, as the

delimiters. In the following example, the caret (^) anchors the regular expression to

the beginning of the line (page 890):

$ perl -le 'print "true" if ("/usr/doc" =~ m{^/usr})'

true

You can use the same syntax when assigning a regular expression to a variable:

$ perl -le '$pn = qr{^/usr}; print "true" if ("/usr/doc" =~ $pn)'

true
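A regular expression held in a variable can be reused against many strings. The following sketch applies one qr pattern to a list of pathnames; the list and the names $usr_re, @paths, and @matches are illustrative assumptions, not part of the original examples.

```perl
use strict;
use warnings;

my $usr_re = qr{^/usr};    # quoted once, reused for every test below

my @paths   = ("/usr/doc", "/home/sam", "/usr/local/bin", "/var/usr");
my @matches = grep { $_ =~ $usr_re } @paths;   # keep pathnames that begin with /usr

print "$_\n" foreach @matches;
```

Because of the caret anchor, /var/usr is not selected even though it contains the string /usr.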

Replacement string and assignment

Perl uses the syntax shown in the next example to substitute a string (the replacement

string) for a matched regular expression. The syntax is the same as that found in vim and sed.

In the second line of the example, an s before the regular expression instructs

Perl to substitute the string between the second and third slashes (worst; the replacement

string) for a match of the regular expression between the first two slashes (best).

Implicit in this syntax is the notion that the substitution is made in the string held in

the variable on the left of the =~ operator.

$ cat re10a.pl

$stg = "This is the best!";

$stg =~ s/best/worst/;

print "$stg\n";

$ perl re10a.pl

This is the worst!
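By default a substitution replaces only the first match in a string. The g modifier, which the preceding example does not use, asks Perl to replace every match. The following sketch compares the two forms on an illustrative string:

```perl
use strict;
use warnings;

my $stg = "best of the best";

# Copy $stg into a new variable, then substitute in the copy
(my $first_only = $stg) =~ s/best/worst/;     # replaces the first match only
(my $all        = $stg) =~ s/best/worst/g;    # g modifier: replaces every match

print "$first_only\n";
print "$all\n";
```

The program displays worst of the best followed by worst of the worst.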

Table 11-3 lists some of the characters, called metacharacters,

that are considered special within Perl regular expressions. Give the command

perldoc perlre for more information.

Greedy Matches

By default Perl performs greedy matching, which means a regular expression

matches the longest string possible (page 891). In the following example, the regular

expression /{.*} / matches an opening brace followed by any string of characters, a

closing brace, and a SPACE ({remove me} may have two {keep me} ). Perl substitutes a

null string (//) for this match.

$ cat 5ha.pl

$string = "A line {remove me} may have two {keep me} pairs of braces.";

$string =~ s/{.*} //;

print "$string\n";

$ perl 5ha.pl

A line pairs of braces.

Table 11-3 Some Perl regular expression metacharacters

Character         Matches

^ (caret)         Anchors a regular expression to the beginning of a line (page 890)

$ (dollar sign)   Anchors a regular expression to the end of a line (page 890)

(...)             Brackets a regular expression (page 521)

. (period)        Any single character except NEWLINE (\n; page 889)

\\                A backslash (\)

\b                A word boundary (zero-width match)

\B                A nonword boundary (zero-width match at any position where \b does not match)

\d                A single decimal digit ([0-9])

\D                A single nondigit character ([^0-9] or [^\d])

\s (lowercase)    A single whitespace character: SPACE, NEWLINE, RETURN, TAB, or FORMFEED

\S (uppercase)    A single nonwhitespace character ([^\s])

\w (lowercase)    A single word character (a letter, digit, or underscore; [a-zA-Z0-9_])

\W (uppercase)    A single nonword character ([^\w])

Nongreedy matches The next example shows the classic way of matching the shorter brace-enclosed string

from the previous example. This type of match is called nongreedy or parsimonious

matching. Here the regular expression matches

1. An opening brace followed by

2. A character belonging to the character class (page 889) that includes all

characters except a closing brace ([^}]) followed by

3. Zero or more occurrences of the preceding character (*) followed by

4. A closing brace followed by

5. A SPACE.

(A caret as the first character of a character class specifies the class of all characters

that do not match the following characters, so [^}] matches any character that is not

a closing brace.)

$ cat re5b.pl

$string = "A line {remove me} may have two {keep me} pairs of braces.";

$string =~ s/{[^}]*} //;

print "$string\n";

$ perl re5b.pl

A line may have two {keep me} pairs of braces.

Perl provides a shortcut that allows you to specify a nongreedy match. In the follow-

ing example, the question mark in {.*?} causes the regular expression to match the

shortest string that starts with an opening brace followed by any string of characters

followed by a closing brace.

$ cat re5c.pl

$string = "A line {remove me} may have two {keep me} pairs of braces.";

$string =~ s/{.*?} //;

print "$string\n";

$ perl re5c.pl

A line may have two {keep me} pairs of braces.
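To see exactly what each form matches, you can bracket the quantified part of the expression and display what it captured. The following sketch is an illustration rather than one of the original programs. It escapes the braces with backslashes as a precaution, on the assumption that some versions of Perl are stricter about unquoted braces in regular expressions.

```perl
use strict;
use warnings;

my $string = "A line {remove me} may have two {keep me} pairs of braces.";

# A match in list context returns the bracketed portion
my ($greedy)    = $string =~ /\{(.*)\}/;     # longest possible match
my ($nongreedy) = $string =~ /\{(.*?)\}/;    # shortest possible match

print "greedy:    $greedy\n";
print "nongreedy: $nongreedy\n";
```

The greedy form captures everything from the first opening brace to the last closing brace; the nongreedy form stops at the first closing brace.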

Bracketing Expressions

As explained on page 892, you can bracket parts of a regular expression and

recall those parts in the replacement string. Most Linux utilities use quoted parentheses

[i.e., \( and \)] to bracket a regular expression. In Perl regular expressions,

parentheses are special characters. Perl omits the backslashes and uses unquoted

parentheses to bracket regular expressions. To specify a parenthesis as a regular

character within a regular expression in Perl, you must quote it (page 891).

The next example uses unquoted parentheses in a regular expression to bracket part

of the expression. It then assigns the part of the string that the bracketed expression

matched to the variable that held the string in which Perl originally searched for the

regular expression.

First the program assigns the string My name is Sam to $stg. The next statement

looks for a match for the regular expression /My name is (.*)/ in the string held by

$stg. The part of the regular expression bracketed by parentheses matches Sam; the

$1 in the replacement string matches the first (and only in this case) matched brack-

eted portion of the regular expression. The result is that the string held in $stg is

replaced by the string Sam.

$ cat re11.pl

$stg = "My name is Sam";

$stg =~ s/My name is (.*)/$1/;

print "Matched: $stg\n";

$ perl re11.pl

Matched: Sam

The next example uses regular expressions to parse a string for numbers. Two vari-

ables are initialized to hold a string that contains two numbers. The third line of the

program uses a regular expression to isolate the first number in the string. The \D*

matches a string of zero or more characters that does not include a digit: The \D

special character matches any single nondigit character. The trailing asterisk makes

this part of the regular expression perform a greedy match that does not include a

digit (it matches What is ). The bracketed regular expression \d+ matches a string of

one or more digits. The parentheses do not affect what the regular expression

matches; they allow the $1 in the replacement string to match what the bracketed

regular expression matched. The final .* matches the rest of the string. This line

assigns the value of the first number in the string to $string.

The next line is similar but assigns the second number in the string to $string2. The

print statements display the numbers and the result of subtracting the second num-

ber from the first.

$ cat re8.pl

$string = "What is 488 minus 78?";

$string2 = $string;

$string =~ s/\D*(\d+).*/$1/;

$string2 =~ s/\D*\d+\D*(\d+).*/$1/;

print "$string\n";

print "$string2\n";

print $string - $string2, "\n";

$ perl re8.pl

488

78

410
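Substitution is one way to isolate bracketed text; another is to use a match in list context, which returns the strings the bracketed expressions matched. The following sketch is an alternative to re8.pl, not the method the original program uses, and the variable names $first and $second are illustrative.

```perl
use strict;
use warnings;

my $string = "What is 488 minus 78?";

# In list context a match returns the bracketed substrings in order
my ($first, $second) = $string =~ /(\d+)\D+(\d+)/;

print "$first\n";
print "$second\n";
print $first - $second, "\n";
```

This version leaves $string unchanged and needs only one regular expression.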

The next few programs show some of the pitfalls of using unquoted parentheses in

regular expressions when you do not intend to bracket part of the regular expres-

sion. The first of these programs attempts to match parentheses in a string with

unquoted parentheses in a regular expression, but fails. The regular expression ag(ed)

matches the same string as the regular expression aged because the parentheses are

special characters; the regular expression does not match the string ag(ed).

$ perl -le 'if ("ag(ed)" =~ /ag(ed)/) {print "true";} else {print "false";}'

false

The regular expression in the next example quotes the parentheses by preceding

each with a backslash, causing Perl to interpret them as regular characters. The

match is successful.

$ perl -le 'if ("ag(ed)" =~ /ag\(ed\)/) {print "true";} else {print "false";}'

true

Next, Perl finds an unmatched parenthesis in a regular expression:

$ perl -le 'if ("ag(ed)" =~ /ag(e/) {print "true";} else {print "false";}'

Unmatched ( in regex; marked by <-- HERE in m/ag( <-- HERE e/ at -e line 1.

When you quote the parenthesis, all is well and Perl finds a match:

$ perl -le 'if ("ag(ed)" =~ /ag\(e/) {print "true";} else {print "false";}'

true
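When the text you want to match literally is held in a variable, quoting each special character by hand is impractical. The following sketch uses Perl's quotemeta function and the equivalent \Q...\E escape to do the quoting; the variable names are illustrative.

```perl
use strict;
use warnings;

my $literal = "ag(ed)";
my $quoted  = quotemeta $literal;    # places a backslash before each nonword character

print "$quoted\n";
print "true\n" if "ag(ed)" =~ /$quoted/;        # parentheses are now regular characters
print "true\n" if "ag(ed)" =~ /\Q$literal\E/;   # \Q...\E quotes the variable inline
```

The first line of output shows the quoted string, ag\(ed\); both matches then succeed.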

CPAN Modules

CPAN (Comprehensive Perl Archive Network) provides Perl documentation, FAQs,

modules (page 489), and scripts on its Web site (www.cpan.org). It holds more than

16,000 distributions (page 489) and provides links, mailing lists, and versions of

Perl compiled to run under various operating systems (ports of Perl). One way to

locate a module is to visit search.cpan.org and use the search box or click one of the

classes of modules listed on that page.

This section explains how to download a module from CPAN and how to install

and run the module. Perl provides a hierarchical namespace for modules, separating

components of a name with double colons (::). The example in this section uses the

module named Timestamp::Simple, which you can read about and download from

search.cpan.org/dist/Timestamp-Simple. The timestamp is the date and time in the

format YYYYMMDDHHMMSS.

To use a Perl module, you first download the file that holds the module. For this

example, the search.cpan.org/~shoop/Timestamp-Simple-1.01/Simple.pm Web page

has a link on the right side labeled Download. Click this link and save the file to the

directory you want to work in. You do not need to work as a privileged user until

the last step of this procedure, when you install the module.

Most Perl modules come as compressed tar files (page 63). With the downloaded file

in the working directory, decompress and extract the archive:

$ tar xzvf Timestamp-Simple-1.01.tar.gz

Timestamp-Simple-1.01/

Timestamp-Simple-1.01/Simple.pm

Timestamp-Simple-1.01/Makefile.PL

Timestamp-Simple-1.01/README

Timestamp-Simple-1.01/test.pl

Timestamp-Simple-1.01/Changes

Timestamp-Simple-1.01/MANIFEST

Timestamp-Simple-1.01/ARTISTIC

Timestamp-Simple-1.01/GPL

Timestamp-Simple-1.01/META.yml

The README file in the newly created directory usually provides instructions for

building and installing the module. Most modules follow the same steps.

$ cd Timestamp-Simple-1.01

$ perl Makefile.PL

Checking if your kit is complete...

Looks good

Writing Makefile for Timestamp::Simple

If the module you are building depends on other modules that are not installed on

the local system, running perl Makefile.PL will display one or more warnings about

prerequisites that are not found. This step writes out the makefile even if modules

are missing. In this case the next step will fail, and you must build and install miss-

ing modules before continuing.

The next step is to run make on the makefile you just created. After you run make,

run make test to be sure the module is working.

$ make

cp Simple.pm blib/lib/Timestamp/Simple.pm

Manifying blib/man3/Timestamp::Simple.3pm

$ make test

PERL_DL_NONLAZY=1 /usr/bin/perl "-Iblib/lib" "-Iblib/arch" test.pl

1..1

# Running under perl version 5.100000 for linux

# Current time local: Fri Sep 4 18:20:41 2009

# Current time GMT: Sat Sep 5 01:20:41 2009

# Using Test.pm version 1.25

ok 1

ok 2

ok 3

Finally, running with root privileges, install the module:

# make install

Installing /usr/local/share/perl/5.10.0/Timestamp/Simple.pm

Installing /usr/local/man/man3/Timestamp::Simple.3pm

Writing /usr/local/lib/perl/5.10.0/auto/Timestamp/Simple/.packlist

Appending installation info to /usr/local/lib/perl/5.10.0/perllocal.pod

Once you have installed a module, you can use perldoc to display the documentation

that tells you how to use the module. See page 487 for an example.

Some modules contain SYNOPSIS sections. If the module you installed includes

such a section, you can test the module by putting the code from the SYNOPSIS sec-

tion in a file and running it as a Perl program:

$ cat times.pl

use Timestamp::Simple qw(stamp);

print stamp, "\n";

$ perl times.pl

20090905182627

You can then incorporate the module in a Perl program. The following example

uses the timestamp module to generate a unique filename:

$ cat fn.pl

use Timestamp::Simple qw(stamp);

# Save timestamp in a variable

$ts = stamp;

# Strip off the year

$ts =~ s/....(.*)/$1/;

# Create a unique filename

$fn = "myfile." . $ts;

# Open, write to, and close the file

open (OUTFILE, '>', "$fn");

print OUTFILE "Hi there.\n";

close (OUTFILE);

$ perl fn.pl

$ ls myf*

myfile.0905183010

substr You can use the substr function in place of the regular expression to strip off the

year. To do so, replace the line that starts with $ts =~ with the following line. Here,

substr takes on the value of the string $ts starting at position 4 and continuing to

the end of the string:

$ts = substr ($ts, 4);
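The following sketch compares the two approaches on a fixed sample timestamp, so it runs without the Timestamp::Simple module. The sample value and the variable names $by_regex and $by_substr are assumptions made for illustration.

```perl
use strict;
use warnings;

my $ts = "20090905183010";    # sample timestamp in YYYYMMDDHHMMSS format

(my $by_regex = $ts) =~ s/....(.*)/$1/;   # keep everything after the first four characters
my $by_substr = substr($ts, 4);           # everything from position 4 to the end

print "$by_regex\n";
print "$by_substr\n";
```

Both statements yield 0905183010.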

Examples

This section provides some sample Perl programs. First try running these programs

as is, and then modify them to learn more about programming with Perl.

The first example runs under Linux and displays the list of groups that the user

given as an argument is a member of. Without an argument, it displays the list of

groups that the user running the program is a member of. In a Perl program, the

%ENV hash holds the environment variables from the shell that called Perl. The

keys in this hash are the names of environment variables; the values in this hash are

the values of the corresponding variables. The first line of the program assigns a

username to $user. The shift function (page 499) takes on the value of the first

command-line argument and shifts the rest of the arguments, if any remain. If the

user runs the program with an argument, that argument is assigned to $user. If no

argument appears on the command line, shift fails and Perl executes the statement

following the Boolean OR (||). This statement extracts the value associated with

the USER key in %ENV, which is the name of the user running the program.

Accepting output from a process

The third statement initializes the array @list. Although this statement is not

required, it is good practice to include it to make the code easier to read. The next

statement opens the $fh lexical handle. The trailing pipe symbol (|) in the file-ref

(page 510) portion of this open statement tells Perl to pass the command line pre-

ceding the pipe symbol to the shell for execution and to accept standard output

from the command when the program reads from the file handle. In this case the

command uses grep to filter the /etc/group file for lines containing the username

held in $user. The die statement displays an error message if Perl cannot open the

handle.

$ cat groupfind.pl

$user = shift || $ENV{"USER"};

say "User $user belongs to these groups:";

@list = ();

open (my $fh, "grep $user /etc/group |") or die "Error: $!\n";

while ($group = <$fh>) {

chomp $group;

$group =~ s/(.*?):.*/$1/;

push @list, $group;

}

close $fh;

@slist = sort @list;

say "@slist";

$ perl groupfind.pl

User sam belongs to these groups:

adm admin audio cdrom dialout dip floppy kvm lpadmin ...

The while structure in groupfind.pl reads lines from standard output of grep and

terminates when grep finishes executing. The name of the group appears first on

each line in /etc/group, followed by a colon and other information, including the

names of the users who belong to the group. Following is a line from this file:

sam:x:1000:max,zach,helen

The line

$group =~ s/(.*?):.*/$1/;

uses a regular expression and substitution to remove everything except the name of

the group from each line. The regular expression .*: would perform a greedy match

of zero or more characters followed by a colon; putting a question mark after the

asterisk causes the expression to perform a nongreedy match (page 520). Putting

parentheses around the part of the expression that matches the string the program

needs to display enables Perl to use the string that the regular expression matches in

the replacement string. The final .* matches the rest of the line. Perl replaces the $1

in the replacement string with the string the bracketed portion of the regular expres-

sion (the part between the parentheses) matched and assigns this value (the name of

the group) to $group.

The chomp statement removes the trailing NEWLINE (the regular expression did not

match this character). The push statement adds the value of $group to the end of

the @list array. Without chomp, each group would appear on a line by itself in the

output. After the while structure finishes processing input from grep, sort orders

@list and assigns the result to @slist. The final statement displays the sorted list of

groups the user belongs to.
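Because grep selects any line that contains the username as a substring, a search for sam would also select a group such as sambashare. The following sketch shows one way to test for an exact match by splitting each line into fields. It works on hypothetical sample lines instead of reading /etc/group, and treating a group with the same name as the user as a match is an assumption of this sketch.

```perl
use strict;
use warnings;

my $user = "sam";

# Hypothetical sample lines in /etc/group format (name:password:GID:members)
my @lines = (
    "sam:x:1000:max,zach,helen",
    "admin:x:115:sam,max",
    "sambashare:x:122:zach",
);

my @list;
foreach my $line (@lines) {
    my ($name, $passwd, $gid, $members) = split /:/, $line;
    $members = "" unless defined $members;    # a group may list no members
    my @members   = split /,/, $members;
    my $is_member = grep { $_ eq $user } @members;   # count of exact matches
    push @list, $name if $name eq $user or $is_member;
}

my @slist = sort @list;
print "@slist\n";
```

With these sample lines the program displays admin sam; sambashare is not selected.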

opendir and readdir The next example introduces the opendir and readdir functions. The opendir func-

tion opens a directory in a manner similar to the way open opens an ordinary file. It

takes two arguments: the name of the directory handle and the name of the direc-

tory to open. The readdir function reads the name of a file from an open directory.

In the example, opendir opens the working directory (specified by .) using the

$dir lexical directory handle. If opendir fails, Perl executes the statement follow-

ing the or operator: die sends an error message to standard error and terminates

the program. With the directory opened, while loops through the files in the

directory, assigning the filename that readdir returns to the lexical variable

$entry. An if statement executes print only for those files that are directories (-d).

The print function displays the name of the directory unless the directory is

named . or .. (the working directory or its parent). When readdir has read all files in the

working directory, it returns false and control passes to the statement following the while

block. The closedir function closes the open directory and print displays a NEWLINE

following the list of directories the program displayed.

$ cat dirs2a.pl

#!/usr/bin/perl

print "The working directory contains these directories:\n";

opendir my $dir, '.' or die "Could not open directory: $!\n";

while (my $entry = readdir $dir) {

if (-d $entry) {

print $entry, ' ' unless ($entry eq '.' || $entry eq '..');

}

}

closedir $dir;

print "\n";

$ ./dirs2a.pl

The working directory contains these directories:

two one
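In list context readdir returns all remaining entries at once, so the loop can be replaced by a call to Perl's grep function. The following sketch builds a scratch directory with the File::Temp module so that it can run anywhere; the directory and file names are illustrative. Note that -d is given the full pathname, because the directory being read is not the working directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory holding two subdirectories and one ordinary file
my $top = tempdir(CLEANUP => 1);
mkdir "$top/one" or die "mkdir: $!\n";
mkdir "$top/two" or die "mkdir: $!\n";
open(my $fh, '>', "$top/afile") or die "open: $!\n";
close $fh;

opendir(my $dir, $top) or die "Could not open directory: $!\n";
# Keep only the entries that are directories, skipping . and ..
my @dirs = grep { -d "$top/$_" && $_ ne '.' && $_ ne '..' } readdir $dir;
closedir $dir;

@dirs = sort @dirs;
print "@dirs\n";
```

The program displays one two; the ordinary file is not listed.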

split The split function divides a string into substrings as specified by a delimiter. The

syntax of a call to split is

split (/re/, string);

where re is the delimiter, which is a regular expression (frequently a single regular

character), and string is the string that is to be divided. As the next example shows,

you can assign the list that split returns to an array variable.

The next program runs under Linux and lists the usernames of users with UIDs

greater than or equal to 100 listed in the /etc/passwd file. It uses a while structure to

read lines from passwd into $user, and it uses split to break the line into substrings

separated by colons. The line that begins with @row assigns each of these substrings

to an element of the @row array. The expression the if statement evaluates is true if

the third substring (the UID) is greater than or equal to 100. This expression uses

the >= numeric comparison operator because it compares two numbers; an alpha-

betic comparison would use the ge string comparison operator.

The print statement sends the UID number and the associated username to the

$sortout file handle. The open statement for this handle establishes a pipe that sends

its output to sort -n. Because the sort utility (page 54) does not display any output

until it finishes receiving all of the input, split3.pl does not display anything until it

closes the $sortout handle, which it does when it finishes reading the passwd file.

$ cat split3.pl

#!/usr/bin/perl -w

open ($pass, "/etc/passwd");

open ($sortout, "| sort -n");

while ($user = <$pass>) {

@row = split (/:/, $user);

if ($row[2] >= 100) {

print $sortout "$row[2] $row[0]\n";

}

}

close ($pass);

close ($sortout);

$ ./split3.pl

100 libuuid

101 syslog

102 klog

103 avahi-autoipd

104 pulse

...
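To see what split returns for a single line, the following sketch splits one hypothetical line in /etc/passwd format and uses an array slice to pick out selected fields. The sample line and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line in /etc/passwd format
my $line = "sam:x:1000:1000:Sam the Great:/home/sam:/bin/bash\n";
chomp $line;

my @row = split(/:/, $line);                 # one element per colon-separated field
my ($name, $uid, $shell) = @row[0, 2, 6];    # array slice selects three fields

print "$uid $name $shell\n" if $uid >= 100;
```

The program displays 1000 sam /bin/bash.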

The next example counts and displays the arguments it was called with, using

@ARGV (page 512). A foreach structure loops through the elements of the @ARGV

array, which holds the command-line arguments. The ++ preincrement operator

increments $count before it is displayed.

$ cat 10.pl

#!/usr/bin/perl -w

$count = 0;

$num = @ARGV;

print "You entered $num arguments on the command line:\n";

foreach $arg (@ARGV) {

print ++$count, ". $arg\n";

}

$ ./10.pl apple pear banana watermelon

You entered 4 arguments on the command line:

1. apple

2. pear

3. banana

4. watermelon
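The assignment $num = @ARGV in the preceding program works because an array evaluated in scalar context yields its number of elements. The following sketch illustrates scalar and list context using an ordinary array named @args, a stand-in for @ARGV chosen for this illustration.

```perl
use strict;
use warnings;

my @args = ("apple", "pear", "banana", "watermelon");   # stands in for @ARGV

my $num        = @args;    # scalar context: the number of elements
my ($first)    = @args;    # list context: the first element
my $last_index = $#args;   # index of the last element

print "$num $first $last_index\n";
```

The program displays 4 apple 3.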

Chapter Summary

Perl was written by Larry Wall in 1987. Since that time Perl has grown in size and

functionality and is now a very popular language used for text processing, system

administration, software development, and general-purpose programming. One of

Perl’s biggest assets is its support by thousands of third-party modules, many of

which are stored in the CPAN repository.

The perldoc utility locates and displays local Perl documentation. It also allows you

to document a Perl program by displaying lines of pod (plain old documentation)

that you include in the program.

Perl provides three types of variables: scalar (singular variables that begin with a $),

array (plural variables that begin with an @), and hash (also called associative

arrays; plural variables that begin with a %). Array and hash variables both hold

lists, but arrays are ordered while hashes are unordered. Standard control flow

statements allow you to alter the order of execution of statements within a Perl pro-

gram. In addition, Perl programs can take advantage of subroutines that can include

variables local to the subroutines (lexical variables).

Regular expressions are one of Perl’s strong points. In addition to the same facilities

that are available in many utilities, Perl offers regular expression features that allow

you to perform more complex string processing.

Exercises

1. What are two different ways to turn on warnings in Perl?

2. What is the difference between an array and a hash?

3. In each example, when would you use a hash and when would you use an

array?

a. Counting the number of occurrences of an IP address in a log file.

b. Generating a list of users who are over disk quota for use in a report.

4. Write a regular expression to match a quoted string, such as

He said, "Go get me the wrench," but I didn’t hear him.


5. Write a regular expression to match an IP address in a log file.

6. Many configuration files contain many comments, including commented-

out default configuration directives. Write a program to remove these

comments from a configuration file.

Advanced Exercises

7. Write a program that removes *~ and *.ico files from a directory hierarchy.

(Hint: Use the File::Find module.)

8. Describe a programming mistake that Perl’s warnings do not report on.

9. Write a Perl program that counts the number of files in the working

directory and the number of bytes in those files, by filename extension.

10. Describe the difference between quoting strings using single quotation

marks and using double quotation marks.

11. Write a program that copies all files with a .ico filename extension in a

directory hierarchy to a directory named icons in your home directory.

(Hint: Use the File::Find and File::Copy modules.)

12. Write a program that analyzes Apache logs. Display the number of bytes

served by each path. Ignore unsuccessful page requests. If there are more

than ten paths, display the first ten only.

Following is a sample line from an Apache access log. The two numbers

following the HTTP/1.1 are the response code and the byte count. A

response code of 200 means the request was successful. A byte count of –

means no data was transferred.

__DATA__

92.50.103.52 - - [19/Aug/2008:08:26:43 -0400] "GET /perl/automated-testing/next_active.gif

HTTP/1.1" 200 980 "http://example.com/perl/automated-testing/navigation_bar.htm"

"Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/3.0.0.6

(Ubuntu-feisty); Blazer/4.0"


AWK is a pattern-scanning and processing language that

searches one or more files for records (usually lines) that

match specified patterns. It processes lines by performing

actions, such as writing the record to standard output or

incrementing a counter, each time it finds a match. Unlike

procedural languages, AWK is data driven: You describe the

data you want to work with and tell AWK what to do with

the data once it finds it.

You can use AWK to generate reports or filter text. It works

equally well with numbers and text; when you mix the two,

AWK usually comes up with the right answer. The authors

of AWK (Alfred V. Aho, Peter J. Weinberger, and Brian W.

Kernighan) designed the language to be easy to use. To achieve this end they sacrificed execution speed in the original implementation.

In This Chapter

Syntax . . . . . . . . . . . . . . . . . . . . . . 532

Arguments . . . . . . . . . . . . . . . . . . . 532

Options . . . . . . . . . . . . . . . . . . . . . 533

Patterns . . . . . . . . . . . . . . . . . . . . . 534

Actions. . . . . . . . . . . . . . . . . . . . . . 535

Variables . . . . . . . . . . . . . . . . . . . . 535

Functions. . . . . . . . . . . . . . . . . . . . 536

Associative Arrays. . . . . . . . . . . . . 538

Control Structures . . . . . . . . . . . . . 539

Examples. . . . . . . . . . . . . . . . . . . . 541

getline: Controlling Input . . . . . . . 558

Coprocess: Two-Way I/O. . . . . . . . 560

Getting Input from a Network. . . . 562

The AWK Pattern

Processing Language

532 Chapter 12 The AWK Pattern Processing Language

AWK takes many of its constructs from the C programming language. It includes

the following features:

• A flexible format

• Conditional execution

• Looping statements

• Numeric variables

• String variables

• Regular expressions

• Relational expressions

• C’s printf

• Coprocess execution (gawk only)

• Network data exchange (gawk only)

Syntax

A gawk command line has the following syntax:

gawk [options] [program] [file-list]

gawk [options] -f program-file [file-list]

The gawk utility takes its input from files you specify on the command line or from standard input. An advanced command, getline, gives you more choices about where input comes from and how gawk reads it (page 558). Using a coprocess, gawk can interact with another program or exchange data over a network (page 560; not available under awk and mawk). Output from gawk goes to standard output.

Arguments

In the preceding syntax, program is a gawk program that you include on the command line. The program-file is the name of the file that holds a gawk program. Putting the program on the command line allows you to write short gawk programs without having to create a separate program-file. To prevent the shell from interpreting the gawk commands as shell commands, enclose the program within single quotation marks. Putting a long or complex program in a file can reduce errors and retyping.

The file-list contains the pathnames of the ordinary files that gawk processes. These files are the input files. When you do not specify a file-list, gawk takes input from standard input or as specified by getline (page 558) or a coprocess (page 560).
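The three ways of supplying a program and its input described above can be sketched as follows. This is only an illustration: the scratch directory and the file names data and number.awk are hypothetical, and the commands call awk on the assumption that it is gawk or a compatible implementation (substitute gawk if you prefer).

```shell
# Hypothetical scratch files used only for this illustration.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/data"
printf '{ print NR ": " $0 }\n' > "$dir/number.awk"

# 1. Program on the command line (single quoted), input from a file-list.
inline=$(awk '{ print NR ": " $0 }' "$dir/data")

# 2. Program read from a program-file with -f, input from a file-list.
fromfile=$(awk -f "$dir/number.awk" "$dir/data")

# 3. No file-list: the same program reads standard input.
fromstdin=$(awk -f "$dir/number.awk" < "$dir/data")

printf '%s\n' "$inline"
rm -r "$dir"
```

All three invocations run the same program on the same data, so they should produce identical output.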


Options

Options preceded by a double hyphen (--) work under gawk only. They are not available under awk and mawk.

--field-separator fs
-F fs
    Uses fs as the value of the input field separator (FS variable; page 536).

--file program-file
-f program-file
    Reads the gawk program from the file named program-file instead of the command line. You can specify this option more than once on a command line. See page 545 for examples.

--help
-W help
    Summarizes how to use gawk (gawk only).

--lint
-W lint
    Warns about gawk constructs that may not be correct or portable (gawk only).

--posix
-W posix
    Runs a POSIX-compliant version of gawk. This option introduces some restrictions; see the gawk man page for details (gawk only).

--traditional
-W traditional
    Ignores the new GNU features in a gawk program, making the program conform to UNIX awk (gawk only).

--assign var=value
-v var=value
    Assigns value to the variable var. The assignment takes place prior to execution of the gawk program and is available within the BEGIN pattern (page 535). You can specify this option more than once on a command line.
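As a brief sketch of the two options most often used together, the following command sets the field separator with -F and passes a value in with -v. The colon-separated input and the variable name min are made up for the illustration; the short option forms are used because they are the ones that also work outside gawk.

```shell
# -F: makes the colon the input field separator.
# -v min=1002 assigns the variable before the program runs,
# so it is already visible in the BEGIN action.
out=$(printf 'sam:1001:/home/sam\nmax:1002:/home/max\n' |
      awk -F: -v min=1002 '
          BEGIN     { print "min is", min }
          $2 >= min { print $1 }')
printf '%s\n' "$out"
```

Because min comes from the command line, you can change the threshold from one run to the next without editing the program.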

Tip: AWK has many implementations

The AWK language was originally implemented under UNIX as the awk utility. Most Linux distributions provide gawk (the GNU implementation of awk) or mawk (a faster, stripped-down version of awk). Mac OS X provides awk. This chapter describes gawk. All the examples in this chapter work under awk and mawk except as noted; the exceptions make use of coprocesses (page 560). You can easily install gawk on most Linux distributions. See gawk.darwinports.com if you are running Mac OS X. For a complete list of gawk extensions, see GNU EXTENSIONS in the gawk man page or see the gawk info page.


Notes

See the tip on the previous page for information on AWK implementations.

For convenience many Linux systems provide a link from /bin/awk to /bin/gawk or

/bin/mawk. As a result you can run the program using either name.

Language Basics

A gawk program (from program on the command line or from program-file) consists of one or more lines containing a pattern and/or action in the following format:

pattern { action }

The pattern selects lines from the input. The gawk utility performs the action on all lines that the pattern selects. The braces surrounding the action enable gawk to differentiate it from the pattern. If a program line does not contain a pattern, gawk selects all lines in the input. If a program line does not contain an action, gawk copies the selected lines to standard output.

To start, gawk compares the first line of input (from the file-list or standard input) with each pattern in the program. If a pattern selects the line (if there is a match), gawk takes the action associated with the pattern. If the line is not selected, gawk does not take the action. When gawk has completed its comparisons for the first line of input, it repeats the process for the next line of input. It continues this process of comparing subsequent lines of input until it has read all of the input.

If several patterns select the same line, gawk takes the actions associated with each of the patterns in the order in which they appear in the program. It is possible for gawk to send a single line from the input to standard output more than once.
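The last point is easy to see with a two-line program in which both patterns select the same input line. The input here is invented for the illustration, and awk is assumed to be gawk or a compatible implementation.

```shell
# Both patterns match "chevy malibu", so both actions run on it,
# in the order the program lines appear. "ford mustang" matches neither.
out=$(printf 'chevy malibu\nford mustang\n' |
      awk '/chevy/  { print "first:", $0 }
           /malibu/ { print "second:", $0 }')
printf '%s\n' "$out"
```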

Patterns

~ and !~ You can use a regular expression (Appendix A), enclosed within slashes, as a pattern.

The ~ operator tests whether a field or variable matches a regular expression (exam-

ples on page 543). The !~ operator tests for no match. You can perform both numeric

and string comparisons using the relational operators listed in Table 12-1. You can

combine any of the patterns using the Boolean operators || (OR) or && (AND).

Table 12-1 Relational operators

Relational operator   Meaning
<                     Less than
<=                    Less than or equal to
==                    Equal to

BEGIN and END Two unique patterns, BEGIN and END, execute commands before gawk starts processing the input and after it finishes processing the input. The gawk utility executes the actions associated with the BEGIN pattern before, and with the END pattern after, it processes all the input. See pages 545 and 546 for examples.

, (comma) The comma is the range operator. If you separate two patterns with a comma on a single gawk program line, gawk selects a range of lines, beginning with the first line that matches the first pattern. The last line selected by gawk is the next subsequent line that matches the second pattern. If no line matches the second pattern, gawk selects every line through the end of the input. After gawk finds the second pattern, it begins the process again by looking for the first pattern. See page 544 for examples.
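The pattern types described in this section can be combined in one program. The following is a minimal sketch with made-up three-field input (make, year, price); awk is assumed to be gawk or a compatible implementation.

```shell
out=$(printf 'ford 1965 10000\nvolvo 1998 9850\nbmw 1985 450\n' |
      awk 'BEGIN { print "start" }                  # runs before any input
           $1 ~ /^[fv]/ && $3 > 9000 { print $1 }   # regex match AND numeric test
           $1 !~ /o/ || $2 == 1998 { print "other:", $1 }   # no-match OR equality
           END { print "done" }')                   # runs after all input
printf '%s\n' "$out"
```

The volvo line satisfies both middle patterns, so it produces two lines of output.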

Actions

The action portion of a gawk command causes gawk to take that action when it matches a pattern. When you do not specify an action, gawk performs the default action, which is the print command (explicitly represented as {print}). This action copies the record (normally a line; see “Record separators” on the next page) from the input to standard output.

When you follow a print command with arguments, gawk displays only the arguments you specify. These arguments can be variables or string constants. You can send the output from a print command to a file (use > within the gawk program; page 549), append it to a file (>>), or send it through a pipe to the input of another program (|). A coprocess (|&) is a two-way pipe that exchanges data with a program running in the background (available under gawk only; page 560).

Unless you separate items in a print command with commas, gawk catenates them. Commas cause gawk to separate the items with the output field separator (OFS, normally a SPACE; page 536).

You can include several actions on one line by separating them with semicolons.
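The difference between catenated and comma-separated print arguments, and the use of semicolons to put several actions on one line, can be sketched as follows. The input line is invented for the illustration; awk is assumed to be gawk or a compatible implementation.

```shell
# Three print commands and one assignment on a single line, separated by semicolons.
#   print $1 $2   no comma: the fields are catenated
#   print $1, $2  comma: the fields are separated by OFS (a SPACE by default)
#   OFS = "-"     changes the separator used by the last print
out=$(printf 'ford mustang 1965\n' |
      awk '{ print $1 $2; print $1, $2; OFS = "-"; print $1, $2 }')
printf '%s\n' "$out"
```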

Comments

The gawk utility disregards anything on a program line following a pound sign (#). You can document a gawk program by preceding comments with this symbol.

Table 12-1 Relational operators (continued)

Relational operator   Meaning
!=                    Not equal to
>=                    Greater than or equal to
>                     Greater than

Variables

Although you do not need to declare gawk variables prior to their use, you can assign initial values to them if you like. Unassigned numeric variables are initialized to 0; string variables are initialized to the null string. In addition to supporting user variables, gawk maintains program variables. You can use both user and program variables in the pattern and action portions of a gawk program. Table 12-2 lists a few program variables.

In addition to initializing variables within a program, you can use the --assign (-v) option to initialize variables on the command line. This feature is useful when the value of a variable changes from one run of gawk to the next.

Record separators By default the input and output record separators are NEWLINE characters. Thus gawk takes each line of input to be a separate record and appends a NEWLINE to the end of each output record. By default the input field separators are SPACEs and TABs; the default output field separator is a SPACE. You can change the value of any of the separators at any time by assigning a new value to its associated variable either from within the program or from the command line by using the --assign (-v) option.
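As a small sketch of changing the separators from the command line, the following command reads records that end with semicolons instead of NEWLINEs and fields that are separated by colons. The input string is invented for the illustration; awk is assumed to be gawk or a compatible implementation.

```shell
# RS=';'  records end at a semicolon instead of a NEWLINE
# FS=':'  fields within a record are separated by colons
# OFS='=' print puts an equal sign between comma-separated items
out=$(printf 'make:ford;make:volvo' |
      awk -v RS=';' -v FS=':' -v OFS='=' '{ print $2, NR }')
printf '%s\n' "$out"
```

The output record separator is left at its default, so each output record still ends with a NEWLINE.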

Functions

Table 12-3 lists a few of the functions gawk provides for manipulating numbers and strings.

Table 12-2 Variables

Variable    Meaning
$0          The current record (as a single variable)
$1–$n       Fields in the current record
FILENAME    Name of the current input file (null for standard input)
FS          Input field separator (default: SPACE or TAB; page 550)
NF          Number of fields in the current record (page 554)
NR          Record number of the current record (page 546)
OFS         Output field separator (default: SPACE; page 547)
ORS         Output record separator (default: NEWLINE; page 554)
RS          Input record separator (default: NEWLINE)

Table 12-3 Functions

Function              Meaning
length(str)           Returns the number of characters in str; without an argument, returns the number of characters in the current record (page 545)
int(num)              Returns the integer portion of num
index(str1,str2)      Returns the index of str2 in str1 or 0 if str2 is not present
split(str,arr,del)    Places elements of str, delimited by del, in the array arr[1]...arr[n]; returns the number of elements in the array (page 556)

Arithmetic Operators

The gawk arithmetic operators listed in Table 12-4 are from the C programming language.

Table 12-3 Functions (continued)

Function              Meaning
sprintf(fmt,args)     Formats args according to fmt and returns the formatted string; mimics the C programming language function of the same name
substr(str,pos,len)   Returns the substring of str that begins at pos and is len characters long
tolower(str)          Returns a copy of str in which all uppercase letters are replaced with their lowercase counterparts
toupper(str)          Returns a copy of str in which all lowercase letters are replaced with their uppercase counterparts
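The functions in Table 12-3 can be tried from a BEGIN action, which needs no input. The strings below are made up for the illustration, and awk is assumed to be gawk or a compatible implementation.

```shell
out=$(awk 'BEGIN {
    s = "ford mustang"
    print length(s)                  # number of characters in s
    print index(s, "must")           # position where "must" starts in s
    print substr(s, 6, 4)            # 4 characters starting at position 6
    print toupper(substr(s, 1, 4))   # first 4 characters, in uppercase
    n = split("1965:45:10000", part, ":")
    print n, part[3]                 # element count and the third element
    print int(7.9), sprintf("%05d", 42)
}')
printf '%s\n' "$out"
```

Note that string positions in these functions start at 1, not 0.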

Table 12-4 Arithmetic operators

Operator   Meaning
**         Raises the expression preceding the operator to the power of the expression following it
*          Multiplies the expression preceding the operator by the expression following it
/          Divides the expression preceding the operator by the expression following it
%          Takes the remainder after dividing the expression preceding the operator by the expression following it
+          Adds the expression preceding the operator to the expression following it
-          Subtracts the expression following the operator from the expression preceding it
=          Assigns the value of the expression following the operator to the variable preceding it
++         Increments the variable preceding the operator
--         Decrements the variable preceding the operator
+=         Adds the expression following the operator to the variable preceding it and assigns the result to the variable preceding the operator
-=         Subtracts the expression following the operator from the variable preceding it and assigns the result to the variable preceding the operator
*=         Multiplies the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator
/=         Divides the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator
%=         Assigns the remainder, after dividing the variable preceding the operator by the expression following it, to the variable preceding the operator
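A short sketch of several of these operators follows; it traces one variable through a series of assignments. The starting value is arbitrary, and awk is assumed to be gawk or a compatible implementation. The ** operator is left out of this sketch on the assumption that not every awk implementation accepts it.

```shell
out=$(awk 'BEGIN {
    n = 7
    print n % 3        # remainder of 7 divided by 3
    n += 5; print n    # 7 + 5
    n *= 2; print n    # 12 * 2
    n /= 4; print n    # 24 / 4
    n++;    print n    # 6 + 1
    n %= 4; print n    # remainder of 7 divided by 4
}')
printf '%s\n' "$out"
```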


Associative Arrays

The associative array is one of gawk’s most powerful features. These arrays use strings as indexes. Using an associative array, you can mimic a traditional array by using numeric strings as indexes. In Perl, an associative array is called a hash (page 500).

You assign a value to an element of an associative array using the following syntax:

array[string] = value

where array is the name of the array, string is the index of the element of the array you are assigning a value to, and value is the value you are assigning to that element.

Using the following syntax, you can use a for structure with an associative array:

for (elem in array) action

where elem is a variable that takes on the value of the index of each element of the array as the for structure loops through them, array is the name of the array, and action is the action that gawk takes for each element in the array. You can use the elem variable in this action.

See page 551 for example programs that use associative arrays.
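A common use of an associative array is counting how often each string appears in the input. The following is a minimal sketch with invented input; the array name count and the loop variable make are arbitrary, and awk is assumed to be gawk or a compatible implementation.

```shell
# count[$1]++ creates the element on first use (starting from 0) and adds 1.
# The for loop visits the indexes in no particular order, so the
# output is piped through sort to make it predictable.
out=$(printf 'ford\nchevy\nford\nvolvo\nford\n' |
      awk '{ count[$1]++ }
           END { for (make in count) print make, count[make] }' |
      sort)
printf '%s\n' "$out"
```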

printf

You can use the printf command in place of print to control the format of the output gawk generates. The gawk version of printf is similar to that found in the C language. A printf command has the following syntax:

printf "control-string", arg1, arg2, ..., argn

The control-string determines how printf formats arg1, arg2, ..., argn. These arguments can be variables or other expressions. Within the control-string you can use \n to indicate a NEWLINE and \t to indicate a TAB. The control-string contains conversion specifications, one for each argument. A conversion specification has the following syntax:

%[-][x[.y]]conv

where - causes printf to left-justify the argument, x is the minimum field width, and .y is the number of places to the right of a decimal point in a number. The conv indicates the type of numeric conversion and can be selected from the letters in Table 12-5. See page 548 for example programs that use printf.
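The parts of a conversion specification can be sketched with a two-line BEGIN action. The values are arbitrary, and awk is assumed to be gawk or a compatible implementation.

```shell
out=$(awk 'BEGIN {
    # %-8s  string, left-justified in a field at least 8 wide
    # %6d   decimal integer, right-justified in a field at least 6 wide
    # %8.2f floating-point, at least 8 wide, 2 places after the decimal point
    printf "%-8s|%6d|%8.2f\n", "ford", 1965, 10500
    # hexadecimal, octal, and exponential notation
    printf "%x %o %e\n", 255, 8, 1234.5
}')
printf '%s\n' "$out"
```

Unlike print, printf does not append a NEWLINE, which is why each control-string ends with \n.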

Table 12-5 Numeric conversion

conv    Type of conversion
d       Decimal
e       Exponential notation
f       Floating-point number

Control Structures

Control (flow) statements alter the order of execution of commands within a gawk program. This section details the if...else, while, and for control structures. In addition, the break and continue statements work in conjunction with the control structures to alter the order of execution of commands. See page 398 for more information on control structures. You do not need to use braces around commands when you specify a single, simple command.

if...else

The if...else control structure tests the status returned by the condition and transfers

control based on this status. The syntax of an if...else structure is shown below. The

else part is optional.

if (condition)

{commands}

[else

{commands}]

The simple if statement shown here does not use braces:

if ($5 <= 5000) print $0

Next is a gawk program that uses a simple if...else structure. Again, there are no braces.

$ cat if1

BEGIN {

nam="sam"

if (nam == "max")

print "nam is max"

else

print "nam is not max, it is", nam

}

$ gawk -f if1

nam is not max, it is sam

Table 12-5 Numeric conversion (continued)

conv    Type of conversion
g       Use f or e, whichever is shorter
o       Unsigned octal
s       String of characters
x       Unsigned hexadecimal

while

The while structure loops through and executes the commands as long as the condition is true. The syntax of a while structure is

while (condition)

{commands}

The next gawk program uses a simple while structure to display powers of 2. This example uses braces because the while loop contains more than one statement. This program does not accept input; all processing takes place when gawk executes the statements associated with the BEGIN pattern.

$ cat while1

BEGIN {

n = 1

while (n <= 5)

{

print "2^" n, 2**n

n++

}

}

$ gawk -f while1

2^1 2

2^2 4

2^3 8

2^4 16

2^5 32

for

The syntax of a for control structure is

for (init;condition;increment)

{commands}

A for structure starts by executing the init statement, which usually sets a counter to 0 or 1. It then loops through the commands as long as the condition remains true. After each loop it executes the increment statement. The for1 gawk program does the same thing as the preceding while1 program except that it uses a for statement, which makes the program simpler:

$ cat for1

BEGIN {

for (n=1; n <= 5; n++)

print "2^" n, 2**n

}

$ gawk -f for1

2^1 2

2^2 4

2^3 8

2^4 16

2^5 32

The gawk utility supports an alternative for syntax for working with associative arrays:

for (var in array)

{commands}

This for structure loops through elements of the associative array named array, assigning the value of the index of each element of array to var each time through the loop. The following line of code (from the program on page 551) demonstrates a for structure:

END {for (name in manuf) print name, manuf[name]}

break

The break statement transfers control out of a for or while loop, terminating execution

of the innermost loop it appears in.

continue

The continue statement transfers control to the end of a for or while loop, causing

execution of the innermost loop it appears in to continue with the next iteration.
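The following sketch shows break and continue in the same for loop. The loop bounds are arbitrary, and awk is assumed to be gawk or a compatible implementation.

```shell
out=$(awk 'BEGIN {
    for (n = 1; n <= 10; n++) {
        if (n % 2 == 0) continue   # even: skip to the next iteration
        if (n > 7) break           # past 7: leave the loop entirely
        print n
    }
}')
printf '%s\n' "$out"
```

The loop displays the odd numbers 1 through 7; when n reaches 9, break ends the loop before print runs.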

Examples

cars data file Many of the examples in this section work with the cars data file. From left to right, the columns in the file contain each car’s make, model, year of manufacture, mileage in thousands of miles, and price. All whitespace in this file is composed of single TABs (the file does not contain any SPACEs).

$ cat cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

ford thundbd 2003 15 10500

chevy malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000

ford taurus 2004 10 17000

toyota rav4 2002 180 750

chevy impala 1985 85 1550

ford explor 2003 25 9500

Missing pattern A simple gawk program is

{ print }

This program consists of one program line that is an action. Because the pattern is missing, gawk selects all lines of input. When used without any arguments the print command displays each selected line in its entirety. This program copies the input to standard output.

$ gawk '{ print }' cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

...


Missing action The next program has a pattern but no explicit action. The slashes indicate that

chevy is a regular expression.

/chevy/

In this case gawk selects from the input just those lines that contain the string chevy. When you do not specify an action, gawk assumes the action is print. The following example copies to standard output all lines from the input that contain the string chevy:

$ gawk '/chevy/' cars

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

chevy impala 1985 85 1550

Single quotation marks Although neither gawk nor shell syntax requires single quotation marks on the command line, it is still a good idea to use them because they can prevent problems. If the gawk program you create on the command line includes SPACEs or characters that are special to the shell, you must quote them. Always enclosing the program in single quotation marks is the easiest way to make sure you have quoted any characters that need to be quoted.
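To see the kind of problem single quotation marks prevent, compare the same program quoted two ways. This is only a sketch: the shell function demo is a hypothetical helper that exists so that the shell's own $1 is empty, and awk is assumed to be gawk or a compatible implementation.

```shell
# Single quotation marks: the shell passes $1 to awk untouched,
# so awk displays the first field.
quoted=$(printf 'ford mustang\n' | awk '{ print $1 }')

# Double quotation marks: the shell replaces $1 with its own first
# positional parameter before awk runs. Inside demo, which is called
# without arguments, that parameter is empty, so awk sees { print }
# and displays the whole line.
demo() { printf 'ford mustang\n' | awk "{ print $1 }"; }
unquoted=$(demo)

printf '%s\n%s\n' "$quoted" "$unquoted"
```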

Fields The next example selects all lines from the file (it has no pattern). The braces enclose the action; you must always use braces to delimit the action so gawk can distinguish it from the pattern. This example displays the third field ($3), a SPACE (the output field separator, indicated by the comma), and the first field ($1) of each selected line:

$ gawk '{print $3, $1}' cars

1970 plym

1999 chevy

1965 ford

1998 volvo

...

The next example, which includes both a pattern and an action, selects all lines that

contain the string chevy and displays the third and first fields from the selected

lines:

$ gawk '/chevy/ {print $3, $1}' cars

1999 chevy

2000 chevy

1985 chevy

In the following example, gawk selects lines that contain a match for the regular expression h. Because there is no explicit action, gawk displays all the lines it selects.

$ gawk '/h/' cars

chevy malibu 1999 60 3000

ford thundbd 2003 15 10500

chevy malibu 2000 50 3500

honda accord 2001 30 6000

chevy impala 1985 85 1550

~ (matches operator) The next pattern uses the matches operator (~) to select all lines that contain the letter h in the first field:

$ gawk '$1 ~ /h/' cars

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

honda accord 2001 30 6000

chevy impala 1985 85 1550

The caret (^) in a regular expression forces a match at the beginning of the line

(page 890) or, in this case, at the beginning of the first field:

$ gawk '$1 ~ /^h/' cars

honda accord 2001 30 6000

Brackets surround a character class definition (page 889). In the next example, gawk selects lines that have a second field that begins with t or m and displays the third and second fields, a dollar sign, and the fifth field. Because there is no comma between the "$" and the $5, gawk does not put a SPACE between them in the output.

$ gawk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars

1999 malibu $3000

1965 mustang $10000

2003 thundbd $10500

2000 malibu $3500

2004 taurus $17000

Dollar signs The next example shows three roles a dollar sign can play in a gawk program. First, a dollar sign followed by a number names a field. Second, within a regular expression a dollar sign forces a match at the end of a line or field (5$). Third, within a string a dollar sign represents itself.

$ gawk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars

1965 ford $10000

1985 bmw $450

1985 chevy $1550

In the next example, the equal-to relational operator (==) causes gawk to perform a numeric comparison between the third field in each line and the number 1985. The gawk command takes the default action, print, on each line where the comparison is true.

$ gawk '$3 == 1985' cars

bmw 325i 1985 115 450

chevy impala 1985 85 1550

The next example finds all cars priced at or less than $3,000.

$ gawk '$5 <= 3000' cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

bmw 325i 1985 115 450

toyota rav4 2002 180 750

chevy impala 1985 85 1550

Textual comparisons When you use double quotation marks, gawk performs textual comparisons by using the ASCII (or other local) collating sequence as the basis of the comparison. In the following example, gawk shows that the strings 450 and 750 fall in the range that lies between the strings 2000 and 9000, which is probably not the intended result.

$ gawk '"2000" <= $5 && $5 < "9000"' cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000

toyota rav4 2002 180 750

When you need to perform a numeric comparison, do not use quotation marks. The

next example gives the intended result. It is the same as the previous example except

it omits the double quotation marks.

$ gawk '2000 <= $5 && $5 < 9000' cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

honda accord 2001 30 6000

, (range operator) The range operator (,) selects a group of lines. The first line it selects is the one specified by the pattern before the comma. The last line is the one selected by the pattern after the comma. If no line matches the pattern after the comma, gawk selects every line through the end of the input. The next example selects all lines, starting with the line that contains volvo and ending with the line that contains bmw.

$ gawk '/volvo/ , /bmw/' cars

volvo s80 1998 102 9850

ford thundbd 2003 15 10500

chevy malibu 2000 50 3500

bmw 325i 1985 115 450

After the range operator finds its first group of lines, it begins the process again, looking for a line that matches the pattern before the comma. In the following example, gawk finds three groups of lines that fall between chevy and ford. Although the fifth line of input contains ford, gawk does not select it because at the time it is processing the fifth line, it is searching for chevy.

$ gawk '/chevy/ , /ford/' cars

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

chevy malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000


ford taurus 2004 10 17000

chevy impala 1985 85 1550

ford explor 2003 25 9500

--file option When you are writing a longer gawk program, it is convenient to put the program in a file and reference the file on the command line. Use the -f (--file) option followed by the name of the file containing the gawk program.

BEGIN The following gawk program, which is stored in a file named pr_header, has two actions and uses the BEGIN pattern. The gawk utility performs the action associated with BEGIN before processing any lines of the data file: It displays a header. The second action, {print}, has no pattern part and displays all lines from the input.

$ cat pr_header

BEGIN {print "Make Model Year Miles Price"}

{print}

$ gawk -f pr_header cars

Make Model Year Miles Price

plym fury 1970 73 2500

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

...

The next example expands the action associated with the BEGIN pattern. In the

previous and the following examples, the whitespace in the headers is composed of

single TABs, so the titles line up with the columns of data.

$ cat pr_header2

BEGIN {

print "Make Model Year Miles Price"

print "----------------------------------------"

}

{print}

$ gawk -f pr_header2 cars

Make Model Year Miles Price

----------------------------------------

plym fury 1970 73 2500

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

...

length function When you call the length function without an argument, it returns the number of characters in the current line, including field separators. The $0 variable always contains the value of the current line. In the next example, gawk prepends the line length to each line and then a pipe sends the output from gawk to sort (the -n option specifies a numeric sort; page 817). As a result, the lines of the cars file appear in order of line length.

$ gawk '{print length, $0}' cars | sort -n

21 bmw 325i 1985 115 450

22 plym fury 1970 73 2500

23 volvo s80 1998 102 9850

24 ford explor 2003 25 9500

24 toyota rav4 2002 180 750

25 chevy impala 1985 85 1550

25 chevy malibu 1999 60 3000

25 chevy malibu 2000 50 3500

25 ford taurus 2004 10 17000

25 honda accord 2001 30 6000

26 ford mustang 1965 45 10000

26 ford thundbd 2003 15 10500

The formatting of this report depends on TABs for horizontal alignment. The three

extra characters at the beginning of each line throw off the format of several lines; a

remedy for this situation is covered shortly.

NR (record number) The NR variable contains the record (line) number of the current line. The following pattern selects all lines that contain more than 24 characters. The action displays the line number of each of the selected lines.

$ gawk 'length > 24 {print NR}' cars

2

3

5

6

8

9

11

You can combine the range operator (,) and the NR variable to display a group of

lines of a file based on their line numbers. The next example displays lines 2

through 4:

$ gawk 'NR == 2 , NR == 4' cars

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

END

The END pattern works in a manner similar to the BEGIN pattern, except gawk takes the actions associated with this pattern after processing the last line of input. The following report displays information only after it has processed all the input. The NR variable retains its value after gawk finishes processing the data file, so an action associated with an END pattern can use it.

$ gawk 'END {print NR, "cars for sale." }' cars

12 cars for sale.

The next example uses if control structures to expand the abbreviations used in some of the first fields. As long as gawk does not change a record, it leaves the entire record, including any separators, intact. Once it makes a change to a record, gawk changes all separators in that record to the value of the output field separator. The default output field separator is a SPACE.

$ cat separ_demo

{

if ($1 ~ /ply/) $1 = "plymouth"

if ($1 ~ /chev/) $1 = "chevrolet"

print

}

$ gawk -f separ_demo cars

plymouth fury 1970 73 2500

chevrolet malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

ford thundbd 2003 15 10500

chevrolet malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000

ford taurus 2004 10 17000

toyota rav4 2002 180 750

chevrolet impala 1985 85 1550

ford explor 2003 25 9500
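The separator behavior described above can be checked in isolation. The following is a minimal sketch, not part of the original example set: the one-line TAB-separated record is invented for illustration, and the sketch calls plain awk on the assumption that gawk behaves the same way here.

```shell
#!/bin/sh
# Hypothetical one-line input: two fields separated by a single TAB.
record=$(printf 'plym\tfury')

# Unmodified record: awk prints $0 exactly as it was read, TAB intact.
unchanged=$(printf '%s\n' "$record" | awk '{ print }')

# Assigning to a field makes awk rebuild $0 using OFS (a SPACE by default).
rebuilt=$(printf '%s\n' "$record" | awk '{ $1 = "plymouth"; print }')

# od -c makes the separator visible (\t versus a space).
printf '%s\n' "$unchanged" | od -c | head -n 1
printf '%s\n' "$rebuilt"   | od -c | head -n 1
```

Comparing the two od lines shows the TAB surviving in the first case and a single SPACE replacing it in the second.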

Stand-alone script

Instead of calling gawk from the command line with the -f option and the name of the program you want to run, you can write a script that calls gawk with the commands you want to run. The next example is a stand-alone script that runs the same program as the previous example. The #!/bin/gawk -f command (page 280) runs the gawk utility directly. To execute it, you need both read and execute permission to the file holding the script (page 278).

$ chmod u+rx separ_demo2

$ cat separ_demo2

#!/bin/gawk -f

{

if ($1 ~ /ply/) $1 = "plymouth"

if ($1 ~ /chev/) $1 = "chevrolet"

print

}

$ ./separ_demo2 cars

plymouth fury 1970 73 2500

chevrolet malibu 1999 60 3000

ford mustang 1965 45 10000

...

OFS variable

You can change the value of the output field separator by assigning a value to the

OFS variable. The following example assigns a TAB character to OFS, using the

backslash escape sequence \t. This fix improves the appearance of the report but

does not line up the columns properly.

$ cat ofs_demo

BEGIN {OFS = "\t"}

{

if ($1 ~ /ply/) $1 = "plymouth"

if ($1 ~ /chev/) $1 = "chevrolet"

print

}

$ gawk -f ofs_demo cars

plymouth fury 1970 73 2500

chevrolet malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

ford thundbd 2003 15 10500

chevrolet malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000

ford taurus 2004 10 17000

toyota rav4 2002 180 750

chevrolet impala 1985 85 1550

ford explor 2003 25 9500

printf

You can use printf (page 538) to refine the output format. The following example uses a backslash at the end of two program lines to quote the following NEWLINE. You can use this technique to continue a long line over one or more lines without affecting the outcome of the program.

$ cat printf_demo

BEGIN {

print " Miles"

print "Make Model Year (000) Price"

print \

"--------------------------------------------------"

}

{

if ($1 ~ /ply/) $1 = "plymouth"

if ($1 ~ /chev/) $1 = "chevrolet"

printf "%-10s %-8s %2d %5d $ %8.2f\n",\

$1, $2, $3, $4, $5

}

$ gawk -f printf_demo cars

Miles

Make Model Year (000) Price

--------------------------------------------------

plymouth fury 1970 73 $ 2500.00

chevrolet malibu 1999 60 $ 3000.00

ford mustang 1965 45 $ 10000.00

volvo s80 1998 102 $ 9850.00

ford thundbd 2003 15 $ 10500.00

chevrolet malibu 2000 50 $ 3500.00

bmw 325i 1985 115 $ 450.00

honda accord 2001 30 $ 6000.00

ford taurus 2004 10 $ 17000.00

toyota rav4 2002 180 $ 750.00

chevrolet impala 1985 85 $ 1550.00

ford explor 2003 25 $ 9500.00

Redirecting output

The next example creates two files: one with the lines that contain chevy and one

with the lines that contain ford.

$ cat redirect_out

/chevy/ {print > "chevfile"}

/ford/ {print > "fordfile"}

END {print "done."}

$ gawk -f redirect_out cars

done.

$ cat chevfile

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

chevy impala 1985 85 1550

The summary program produces a summary report on all cars and newer cars. Although they are not required, the initializations at the beginning of the program represent good programming practice; gawk automatically declares and initializes variables as you use them. After reading all the input data, gawk computes and displays the averages.

$ cat summary

BEGIN {

yearsum = 0 ; costsum = 0

newcostsum = 0 ; newcount = 0

}

{

yearsum += $3

costsum += $5

}

$3 > 2000 {newcostsum += $5 ; newcount ++}

END {

printf "Average age of cars is %4.1f years\n",\

2006 - (yearsum/NR)

printf "Average cost of cars is $%7.2f\n",\

costsum/NR

printf "Average cost of newer cars is $%7.2f\n",\

newcostsum/newcount

}

$ gawk -f summary cars

Average age of cars is 13.1 years

Average cost of cars is $6216.67

Average cost of newer cars is $8750.00
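The summary program divides by newcount, which stays at zero when no record has a year later than 2000; awk implementations generally treat division by zero as a fatal error. The following sketch shows one way to guard the division. The average_newer function name and the sample records are assumptions made for illustration, and plain awk stands in for gawk.

```shell
#!/bin/sh
# average_newer: hypothetical helper; reads cars-style records on standard
# input (year in field 3, price in field 5).
average_newer() {
    awk '
        $3 > 2000 { newcostsum += $5; newcount++ }
        END {
            if (newcount > 0)
                printf "Average cost of newer cars is $%7.2f\n", newcostsum / newcount
            else
                print "No newer cars found."
        }'
}

# One newer car and one older car: the average is computed.
printf 'ford\ttaurus\t2004\t10\t17000\nbmw\t325i\t1985\t115\t450\n' | average_newer

# Only an older car: the guard avoids dividing by zero.
printf 'bmw\t325i\t1985\t115\t450\n' | average_newer
```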

The following gawk command shows the format of a line from a Linux passwd file that the next example uses:

$ awk '/mark/ {print}' /etc/passwd

mark:x:107:100:ext 112:/home/mark:/bin/tcsh

FS variable

The next example demonstrates a technique for finding the largest number in a field. Because it works with a Linux passwd file, which delimits fields with colons (:), the example changes the input field separator (FS) before reading any data. It reads the passwd file and determines the next available user ID number (field 3). The numbers do not have to be in order in the passwd file for this program to work.

The pattern ($3 > saveit) causes gawk to select records that contain a user ID number greater than any previous user ID number it has processed. Each time it selects a record, gawk assigns the value of the new user ID number to the saveit variable. Then gawk uses the new value of saveit to test the user IDs of all subsequent records. Finally gawk adds 1 to the value of saveit and displays the result.

$ cat find_uid

BEGIN {FS = ":"

saveit = 0}

$3 > saveit {saveit = $3}

END {print "Next available UID is " saveit + 1}

$ gawk -f find_uid /etc/passwd

Next available UID is 1092

The next example produces another report based on the cars file. This report uses

nested if...else control structures to substitute values based on the contents of the

price field. The program has no pattern part; it processes every record.

$ cat price_range

{

if ($5 <= 5000) $5 = "inexpensive"

else if (5000 < $5 && $5 < 10000) $5 = "please ask"

else if (10000 <= $5) $5 = "expensive"

printf "%-10s %-8s %2d %5d %-12s\n",\

$1, $2, $3, $4, $5

}

$ gawk -f price_range cars

plym fury 1970 73 inexpensive

chevy malibu 1999 60 inexpensive

ford mustang 1965 45 expensive

volvo s80 1998 102 please ask

ford thundbd 2003 15 expensive

chevy malibu 2000 50 inexpensive

bmw 325i 1985 115 inexpensive

honda accord 2001 30 please ask

ford taurus 2004 10 expensive

toyota rav4 2002 180 inexpensive

chevy impala 1985 85 inexpensive

ford explor 2003 25 please ask

Associative arrays

Next the manuf associative array uses the contents of the first field of each record in the cars file as an index. The array consists of the elements manuf[plym], manuf[chevy], manuf[ford], and so on. Each new element is initialized to 0 (zero) as it is created. The ++ operator increments the variable it follows.

for structure

The action following the END pattern is a for structure, which loops through the elements of an associative array. A pipe sends the output through sort to produce an alphabetical list of cars and the quantities in stock. Because it is a shell script and not a gawk program file, you must have both read and execute permission to the manuf file to execute it as a command.

$ cat manuf

gawk ' {manuf[$1]++}

END {for (name in manuf) print name, manuf[name]}

' cars |

sort

$ ./manuf

bmw 1

chevy 3

ford 4

honda 1

plym 1

toyota 1

volvo 1

The next program, manuf.sh, is a more general shell script that includes error

checking. This script lists and counts the contents of a column in a file, with both

the column number and the name of the file specified on the command line.

The first action (the one that starts with {count) uses the shell variable $1 in the middle of the gawk program to specify an array index. Because of the way the single quotation marks are paired, the $1 that appears to be within single quotation marks is actually not quoted: The two quoted strings in the gawk program surround, but do not include, the $1. Because the $1 is not quoted, and because this is a shell script, the shell substitutes the value of the first command-line argument in place of $1 (page 441). As a result, the $1 is interpreted before the gawk command is invoked. The leading dollar sign (the one before the first single quotation mark on that line) causes gawk to interpret what the shell substitutes as a field number.

$ cat manuf.sh

if [ $# != 2 ]

then

echo "Usage: manuf.sh field file"

exit 1

fi

gawk < $2 '

{count[$'$1']++}

END {for (item in count) printf "%-20s%-20s\n",\

item, count[item]}' |

sort

$ ./manuf.sh

Usage: manuf.sh field file

$ ./manuf.sh 1 cars

bmw 1

chevy 3

ford 4

honda 1

plym 1

toyota 1

volvo 1

$ ./manuf.sh 3 cars

1965 1

1970 1

1985 2

1998 1

1999 1

2000 1

2001 1

2002 1

2003 2

2004 1

A way around the tricky use of quotation marks that allow parameter expansion within the gawk program is to use the -v option on the command line to pass the field number to gawk as a variable. This change makes it easier for someone else to read and debug the script. You call the manuf2.sh script the same way you call manuf.sh:

$ cat manuf2.sh

if [ $# != 2 ]

then

echo "Usage: manuf.sh field file"

exit 1

fi

gawk -v "field=$1" < $2 '

{count[$field]++}

END {for (item in count) printf "%-20s%-20s\n",\

item, count[item]}' |

sort
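Because the -v option passes the field number as an ordinary string, the shell can validate it before awk runs. The following sketch wraps the manuf2.sh technique in a function and rejects a non-numeric field argument. The count_field name, the error message, and the sample input are assumptions for illustration; plain awk stands in for gawk.

```shell
#!/bin/sh
# count_field: hypothetical function form of manuf2.sh.
# Usage: count_field FIELD < file
count_field() {
    # Reject an empty argument or one containing a non-digit.
    case $1 in
        ''|*[!0-9]*) echo "count_field: field must be a number" >&2; return 2 ;;
    esac
    awk -v "field=$1" '
        { count[$field]++ }                       # index by the chosen field
        END { for (item in count) print item, count[item] }' |
    sort
}

# Invented sample data: count the values in the first column.
printf 'chevy malibu\nford mustang\nchevy impala\n' | count_field 1
```

Keeping the check in the shell means awk never sees an argument such as 1+1 or an empty string, either of which would select an unintended field.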

The word_usage script displays a word usage list for a file you specify on the command line. The tr utility (page 864) lists the words from standard input, one to a line. The sort utility orders the file, putting the most frequently used words first. The script sorts groups of words that are used the same number of times in alphabetical order.

$ cat word_usage

tr -cs 'a-zA-Z' '[\n*]' < $1 |

gawk '

{count[$1]++}

END {for (item in count) printf "%-15s%3s\n", item, count[item]}' |

sort +1nr +0f -1

$ ./word_usage textfile

the 42

file 29

fsck 27

system 22

you 22

to 21

it 17

SIZE 14

and 13

MODE 13

...

Following is a similar program in a different format. The style mimics that of a C program and may be easier to read and work with for more complex gawk programs.

$ cat word_count

tr -cs 'a-zA-Z' '[\n*]' < $1 |

gawk ' {

count[$1]++

}

END {

for (item in count)

{

if (count[item] > 4)

{

printf "%-15s%3s\n", item, count[item]

}

}

} ' |

sort +1nr +0f -1

The tail utility displays the last ten lines of output, illustrating that words occurring fewer than five times are not listed:

$ ./word_count textfile | tail

directories 5

if 5

information 5

INODE 5

more 5

no 5

on 5

response 5

this 5

will 5
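The sort +1nr +0f -1 arguments in word_usage and word_count use the older zero-based +pos notation, which many current sort implementations no longer accept by default. As a sketch under that assumption, the same ordering (count descending, then word ignoring case) can be requested with -k keys. The input lines below are invented stand-ins for the gawk output.

```shell
#!/bin/sh
# Invented "word count" lines standing in for the output of the gawk program.
# -k2nr   : sort on field 2 onward, numerically, in reverse (largest first)
# -k1,1f  : break ties on field 1 only, folding lowercase to uppercase
sorted=$(printf 'you 22\nthe 42\nSystem 22\nfile 29\n' | sort -k2nr -k1,1f)
printf '%s\n' "$sorted"
```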

The next example shows one way to put a date on a report. The first line of input to the gawk program comes from date. The program reads this line as record number 1 (NR == 1), processes it accordingly, and processes all subsequent lines with the action associated with the next pattern (NR > 1).

$ cat report

if (test $# = 0) then

echo "You must supply a filename."

exit 1

fi

(date; cat $1) |

gawk '

NR == 1 {print "Report for", $1, $2, $3 ", " $6}

NR > 1 {print $5 "\t" $1}'

$ ./report cars

Report for Mon Jan 31, 2010

2500 plym

3000 chevy

10000 ford

9850 volvo

10500 ford

3500 chevy

450 bmw

6000 honda

17000 ford

750 toyota

1550 chevy

9500 ford

The next example sums each of the columns in a file you specify on the command line; it takes its input from the numbers file. The program performs error checking, reporting on and discarding rows that contain nonnumeric entries. It uses the next command (with the comment skip bad records) to skip the rest of the commands for the current record if the record contains a nonnumeric entry. At the end of the program, gawk displays a grand total for the file.

$ cat numbers

10 20 30.3 40.5

20 30 45.7 66.1

30 xyz 50 70

40 75 107.2 55.6

50 20 30.3 40.5

60 30 45.O 66.1

70 1134.7 50 70

80 75 107.2 55.6

90 176 30.3 40.5

100 1027.45 45.7 66.1

110 123 50 57a.5

120 75 107.2 55.6

$ cat tally

gawk ' BEGIN {

ORS = ""

}

NR == 1 { # first record only

nfields = NF # set nfields to number of

} # fields in the record (NF)

{

if ($0 ~ /[^0-9. \t]/) # check each record to see if it contains

{ # any characters that are not numbers,

print "\nRecord " NR " skipped:\n\t" # periods, spaces, or TABs

print $0 "\n"

next # skip bad records

}

else

{

for (count = 1; count <= nfields; count++) # for good records loop through fields

{

printf "%10.2f", $count > "tally.out"

sum[count] += $count

gtotal += $count

}

print "\n" > "tally.out"

}

}

END { # after processing last record

for (count = 1; count <= nfields; count++) # print summary

{

print " -------" > "tally.out"

}

print "\n" > "tally.out"

for (count = 1; count <= nfields; count++)

{

printf "%10.2f", sum[count] > "tally.out"

}

print "\n\n Grand Total " gtotal "\n" > "tally.out"

} ' < numbers

$ ./tally

Record 3 skipped:

30 xyz 50 70

Record 6 skipped:

60 30 45.O 66.1

Record 11 skipped:

110 123 50 57a.5

$ cat tally.out

10.00 20.00 30.30 40.50

20.00 30.00 45.70 66.10

40.00 75.00 107.20 55.60

50.00 20.00 30.30 40.50

70.00 1134.70 50.00 70.00

80.00 75.00 107.20 55.60

90.00 176.00 30.30 40.50

100.00 1027.45 45.70 66.10

120.00 75.00 107.20 55.60

------- ------- ------- -------

580.00 2633.15 553.90 490.50

Grand Total 4257.55
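The test in tally rejects a record only when it contains a character other than a digit, period, SPACE, or TAB, so an entry such as 1.2.3 would pass. The following sketch shows a stricter, per-field check. The check_rows name, the regular expression, and the sample rows are assumptions for illustration and are not taken from tally; plain awk stands in for gawk.

```shell
#!/bin/sh
# check_rows: hypothetical helper; labels each input row "ok" or "bad".
check_rows() {
    awk '{
        # Accept digits with an optional fraction, or a bare fraction like .5
        for (i = 1; i <= NF; i++)
            if ($i !~ /^([0-9]+\.?[0-9]*|\.[0-9]+)$/) {
                print "bad: " $0
                next              # skip the rest of this record
            }
        print "ok:  " $0
    }'
}

# The second row contains the letter O; the third has a malformed number.
printf '10 20 30.3 40.5\n60 30 45.O 66.1\n1.2.3 4 5 6\n' | check_rows
```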

The next example reads the passwd file, listing users who do not have passwords and users who have duplicate user ID numbers. (The pwck utility [Linux only] performs similar checks.) Because Mac OS X uses Open Directory (page 926) and not the passwd file, this example will not work under OS X.

$ cat /etc/passwd

bill::102:100:ext 123:/home/bill:/bin/bash

roy:x:104:100:ext 475:/home/roy:/bin/bash

tom:x:105:100:ext 476:/home/tom:/bin/bash

lynn:x:166:100:ext 500:/home/lynn:/bin/bash

mark:x:107:100:ext 112:/home/mark:/bin/bash

sales:x:108:100:ext 102:/m/market:/bin/bash

anne:x:109:100:ext 355:/home/anne:/bin/bash

toni::164:100:ext 357:/home/toni:/bin/bash

ginny:x:115:100:ext 109:/home/ginny:/bin/bash

chuck:x:116:100:ext 146:/home/chuck:/bin/bash

neil:x:164:100:ext 159:/home/neil:/bin/bash

rmi:x:118:100:ext 178:/home/rmi:/bin/bash

vern:x:119:100:ext 201:/home/vern:/bin/bash

bob:x:120:100:ext 227:/home/bob:/bin/bash

janet:x:122:100:ext 229:/home/janet:/bin/bash

maggie:x:124:100:ext 244:/home/maggie:/bin/bash

dan::126:100::/home/dan:/bin/bash

dave:x:108:100:ext 427:/home/dave:/bin/bash

mary:x:129:100:ext 303:/home/mary:/bin/bash

$ cat passwd_check

gawk < /etc/passwd ' BEGIN {

uid[void] = "" # tell gawk that uid is an array

}

{ # no pattern indicates process all records

dup = 0 # initialize duplicate flag

split($0, field, ":") # split into fields delimited by ":"

if (field[2] == "") # check for null password field

{

if (field[5] == "") # check for null info field

{

print field[1] " has no password."

}

else

{

print field[1] " ("field[5]") has no password."

}

}

for (name in uid) # loop through uid array

{

if (uid[name] == field[3]) # check for second use of UID

{

print field[1] " has the same UID as " name " : UID = " uid[name]

dup = 1 # set duplicate flag

}

}

if (!dup) # same as if (dup == 0)

# assign UID and login name to uid array

{

uid[field[1]] = field[3]

}

}'

$ ./passwd_check

bill (ext 123) has no password.

toni (ext 357) has no password.

neil has the same UID as toni : UID = 164

dan has no password.

dave has the same UID as sales : UID = 108
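The duplicate-UID check in passwd_check loops through the whole uid array for every record. As an alternative sketch, not the approach used above, the array can be indexed by UID so that the in operator answers the question directly. The find_dup_uids and owner names are assumptions for illustration; the four input lines are copied from the sample passwd file, and plain awk stands in for gawk.

```shell
#!/bin/sh
# find_dup_uids: hypothetical helper; reads passwd-format lines on standard input.
find_dup_uids() {
    awk -F: '
        # UID already recorded: report it against the first login name seen
        ($3 in owner) {
            print $1 " has the same UID as " owner[$3] " : UID = " $3
            next
        }
        # First time this UID appears: remember which login name owns it
        { owner[$3] = $1 }
    '
}

printf '%s\n' 'sales:x:108:100:ext 102:/m/market:/bin/bash' \
              'toni::164:100:ext 357:/home/toni:/bin/bash' \
              'neil:x:164:100:ext 159:/home/neil:/bin/bash' \
              'dave:x:108:100:ext 427:/home/dave:/bin/bash' | find_dup_uids
```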

The next example shows a complete interactive shell script that uses gawk to generate a report on the cars file based on price ranges:

$ cat list_cars

trap 'rm -f $$.tem > /dev/null;echo $0 aborted.;exit 1' 1 2 15

echo -n "Price range (for example, 5000 7500):"

read lowrange hirange

echo '

Miles

Make Model Year (000) Price

--------------------------------------------------' > $$.tem

gawk < cars '

$5 >= '$lowrange' && $5 <= '$hirange' {

if ($1 ~ /ply/) $1 = "plymouth"

if ($1 ~ /chev/) $1 = "chevrolet"

printf "%-10s %-8s %2d %5d $ %8.2f\n", $1, $2, $3, $4, $5

}' | sort -n +5 >> $$.tem

cat $$.tem

rm $$.tem

$ ./list_cars

Price range (for example, 5000 7500):3000 8000

Miles

Make Model Year (000) Price

--------------------------------------------------

chevrolet malibu 1999 60 $ 3000.00

chevrolet malibu 2000 50 $ 3500.00

honda accord 2001 30 $ 6000.00

$ ./list_cars

Price range (for example, 5000 7500):0 2000

Miles

Make Model Year (000) Price

--------------------------------------------------

bmw 325i 1985 115 $ 450.00

toyota rav4 2002 180 $ 750.00

chevrolet impala 1985 85 $ 1550.00

$ ./list_cars

Price range (for example, 5000 7500):15000 100000

Miles

Make Model Year (000) Price

--------------------------------------------------

ford taurus 2004 10 $ 17000.00

optional

Advanced gawk Programming

This section discusses some of the advanced features of AWK. It covers how to control input using the getline statement, how to use a coprocess to exchange information between gawk and a program running in the background, and how to use a coprocess to exchange data over a network. Coprocesses are available under gawk only; they are not available under awk and mawk.

getline: Controlling Input

Using the getline statement gives you more control over the data gawk reads than other methods of input do. When you provide a variable name as an argument to getline, getline reads data into that variable. The BEGIN block of the g1 program uses getline to read one line into the variable aa from standard input:

$ cat g1

BEGIN {

getline aa

print aa

}

$ echo aaaa | gawk -f g1

aaaa

The next few examples use the alpha file:

$ cat alpha

aaaaaaaaa

bbbbbbbbb

ccccccccc

ddddddddd

Even when g1 is given more than one line of input, it processes only the first line:

$ gawk -f g1 < alpha

aaaaaaaaa

When getline is not given an argument, it reads input into $0 and modifies the field variables ($1, $2, ...):

$ gawk 'BEGIN {getline;print $1}' < alpha

aaaaaaaaa

The g2 program uses a while loop in the BEGIN block to loop over the lines in standard input. The getline statement reads each line into holdme and print outputs each value of holdme.

$ cat g2

BEGIN {

while (getline holdme)

print holdme

}

$ gawk -f g2 < alpha

aaaaaaaaa

bbbbbbbbb

ccccccccc

ddddddddd
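The getline statement returns 1 when it reads a record, 0 at end of file, and -1 on an error such as an unreadable file. Because -1 counts as true, a loop that reads from a named file is usually written to test for a value greater than zero. The following sketch illustrates that pattern; the read_all function name and the file names are assumptions for illustration, and plain awk stands in for gawk.

```shell
#!/bin/sh
# read_all: hypothetical helper that prints every line of the file named by
# its first argument, or reports an error if the file cannot be read.
read_all() {
    awk -v file="$1" '
        BEGIN {
            # Loop only while getline reports that it read a record (1).
            while ((status = (getline line < file)) > 0)
                print line
            # A negative status means the read failed.
            if (status < 0) {
                print "cannot read " file > "/dev/stderr"
                exit 1
            }
        }'
}

# Build a small two-line file (a stand-in for alpha), then read it back.
tmp=$(mktemp)
printf 'aaaaaaaaa\nbbbbbbbbb\n' > "$tmp"
read_all "$tmp"

# A path assumed not to exist: the loop body never runs and awk exits with 1.
read_all /nonexistent/alpha || echo "read failed"
rm -f "$tmp"
```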

The g3 program demonstrates that gawk automatically reads each line of input into $0 when it has statements in its body (and not just a BEGIN block). This program outputs the record number (NR), the string $0:, and the value of $0 (the current record) for each line of input.

$ cat g3

{print NR, "$0:", $0}

$ gawk -f g3 < alpha

1 $0: aaaaaaaaa

2 $0: bbbbbbbbb

3 $0: ccccccccc

4 $0: ddddddddd

Next g4 demonstrates that getline works independently of gawk's automatic reads and $0. When getline reads data into a variable, it does not modify either $0 or any of the fields in the current record ($1, $2, ...). The first statement in g4, which is the same as the statement in g3, outputs the line that gawk has automatically read. The getline statement reads the next line of input into the variable named aa. The third statement outputs the record number, the string aa:, and the value of aa. The output from g4 shows that getline processes records independently of gawk's automatic reads.

$ cat g4

{

print NR, "$0:", $0

getline aa

print NR, "aa:", aa

}

$ gawk -f g4 < alpha

1 $0: aaaaaaaaa

2 aa: bbbbbbbbb

3 $0: ccccccccc

4 aa: ddddddddd

The g5 program outputs each line of input except those read by getline, that is, the line that follows each line beginning with the letter b. The first print statement outputs each line that gawk reads automatically. Next the /^b/ pattern selects all lines that begin with b for special processing. The action uses getline to read the next line of input into the variable hold, outputs the string skip this line: followed by the value of hold, and outputs the value of $1. The $1 holds the value of the first field of the record that gawk read automatically, not the record read by getline. The final statement displays a string and the value of NR, the current record number. Even though getline does not change $0 when it reads data into a variable, gawk increments NR.

$ cat g5

# print all lines except those read with getline

{print "line #", NR, $0}

# if line begins with "b" process it specially

/^b/ {

# use getline to read the next line into variable named hold

getline hold

# print value of hold

print "skip this line:", hold

# $0 is not affected when getline reads data into a variable

# $1 still holds previous value

print "previous line began with:", $1

}

{

print ">>>> finished processing line #", NR

print ""

}

$ gawk -f g5 < alpha

line # 1 aaaaaaaaa

>>>> finished processing line # 1

line # 2 bbbbbbbbb

skip this line: ccccccccc

previous line began with: bbbbbbbbb

>>>> finished processing line # 3

line # 4 ddddddddd

>>>> finished processing line # 4

Coprocess: Two-Way I/O

A coprocess is a process that runs in parallel with another process. Starting with version 3.1, gawk can invoke a coprocess to exchange information directly with a background process. A coprocess can be useful when you are working in a client/server environment, setting up an SQL (page 980) front end/back end, or exchanging data with a remote system over a network. The gawk syntax identifies a coprocess by preceding the name of the program that starts the background process with a |& operator.

The coprocess command must be a filter (i.e., it reads from standard input and writes to standard output) and must flush its output whenever it has a complete line rather than accumulating lines for subsequent output. When a command is invoked as a coprocess, it is connected via a two-way pipe to a gawk program so you can read from and write to the coprocess.

to_upper

When used alone the tr utility (page 864) does not flush its output after each line. The to_upper shell script is a wrapper for tr that does flush its output; this filter can be run as a coprocess. For each line read, to_upper writes the line, translated to uppercase, to standard output. Remove the # before set -x if you want to_upper to display debugging output.

$ cat to_upper

#!/bin/bash

#set -x

while read arg

do

echo "$arg" | tr '[a-z]' '[A-Z]'

done

$ echo abcdef | ./to_upper

ABCDEF

The g6 program invokes to_upper as a coprocess. This gawk program reads standard input or a file specified on the command line, translates the input to uppercase, and writes the translated data to standard output.

$ cat g6

{

print $0 |& "to_upper"

"to_upper" |& getline hold

print hold

}

$ gawk -f g6 < alpha

AAAAAAAAA

BBBBBBBBB

CCCCCCCCC

DDDDDDDDD

The g6 program has one compound statement, enclosed within braces, comprising three statements. Because there is no pattern, gawk executes the compound statement once for each line of input.

In the first statement, print $0 sends the current record to standard output. The |&

operator redirects standard output to the program named to_upper, which is run-

ning as a coprocess. The quotation marks around the name of the program are

required. The second statement redirects standard output from to_upper to a getline

statement, which copies its standard input to the variable named hold. The third

statement, print hold, sends the contents of the hold variable to standard output.

Only gawk supports coprocesses

tip The awk and mawk utilities do not support coprocesses. Only gawk supports coprocesses.

Getting Input from a Network

Building on the concept of a coprocess, gawk can exchange information with a process on another system via an IP network connection. When you specify one of the special filenames that begins with /inet/, gawk processes the request using a network connection. The format of these special filenames is

/inet/protocol/local-port/remote-host/remote-port

where protocol is usually tcp but can be udp, local-port is 0 (zero) if you want gawk to pick a port (otherwise it is the number of the port you want to use), remote-host is the IP address (page 960) or fully qualified domain name (page 955) of the remote host, and remote-port is the port number on the remote host. Instead of a port number in local-port and remote-port, you can specify a service name such as http or ftp.

The g7 program reads the rfc-retrieval.txt file from the server at www.rfc-editor.org. On www.rfc-editor.org the file is located at /rfc/rfc-retrieval.txt. The first statement in g7 assigns the special filename to the server variable. The filename specifies a TCP connection, allows the local system to select an appropriate port, and connects to www.rfc-editor.org on port 80. You can use http in place of 80 to specify the standard HTTP port.

The second statement uses a coprocess to send a GET request to the remote server. This request includes the pathname of the file gawk is requesting. A while loop uses a coprocess to redirect lines from the server to getline. Because getline has no variable name as an argument, it saves its input in the current record buffer $0. The final print statement sends each record to standard output. Experiment with this script, replacing the final print statement with gawk statements that process the file.

$ cat g7

BEGIN {

# set variable named server

# to special networking filename

server = "/inet/tcp/0/www.rfc-editor.org/80"

# use coprocess to send GET request to remote server

print "GET /rfc/rfc-retrieval.txt" |& server

# while loop uses coprocess to redirect

# output from server to getline

while (server |& getline)

print $0

}

$ gawk -f g7

Where and how to get new RFCs

=============================

RFCs may be obtained via FTP or HTTP or email from many RFC repositories.

The official repository for RFCs is:

http://www.rfc-editor.org/

...

Chapter Summary

AWK is a pattern-scanning and processing language that searches one or more files for records (usually lines) that match specified patterns. It processes lines by performing actions, such as writing the record to standard output or incrementing a counter, each time it finds a match. AWK has several implementations, including awk, gawk, and mawk.

An AWK program consists of one or more lines containing a pattern and/or action

in the following format:

pattern { action }

The pattern selects lines from the input. An AWK program performs the action on all lines that the pattern selects. If a program line does not contain a pattern, AWK selects all lines in the input. If a program line does not contain an action, AWK copies the selected lines to standard output.

An AWK program can use variables, functions, arithmetic operators, associative arrays, control statements, and C's printf statement. Advanced AWK programming takes advantage of getline statements to fine-tune input, coprocesses to enable gawk to exchange data with other programs (gawk only), and network connections to exchange data with programs running on remote systems on a network (gawk only).

Exercises

1. Write an AWK program that numbers each line in a file and sends its output

to standard output.

2. Write an AWK program that displays the number of characters in the first

field followed by the first field and sends its output to standard output.

3. Write an AWK program that uses the cars file (page 541), displays all cars

priced at more than $5,000, and sends its output to standard output.

4. Use AWK to determine how many lines in /usr/share/dict/words contain the string abul. Verify your answer using grep.

Advanced Exercises

5. Experiment with pgawk (available only with gawk). What does it do? How can it be useful?

6. Write a gawk (not awk or mawk) program named net_list that reads from the rfc-retrieval.txt file on www.rfc-editor.org (see “Getting Input from a Network” on page 562) and displays the last word on each line in all uppercase letters.

7. Expand the net_list program developed in Exercise 6 to use to_upper

(page 561) as a coprocess to display the list of cars with only the make of

the cars in uppercase. The model and subsequent fields on each line should

appear as they do in the cars file.

8. How can you get gawk (not awk or mawk) to neatly format (that is, “pretty print”) a gawk program file? (Hint: See the gawk man page.)

The sed (stream editor) utility is a batch (noninteractive) editor. It transforms an input stream that can come from a file or standard input. It is frequently used as a filter in a pipe. Because it makes only one pass through its input, sed is more efficient than an interactive editor such as ed. Most Linux distributions provide GNU sed; Mac OS X supplies BSD sed. This chapter applies to both versions.

In This Chapter

Syntax . . . . . . . . . . . . . . . . . . . . . . 566

Arguments . . . . . . . . . . . . . . . . . . . 566

Options . . . . . . . . . . . . . . . . . . . . . 566

Editor Basics . . . . . . . . . . . . . . . . . 567

Addresses . . . . . . . . . . . . . . . . . . . 567

Instructions . . . . . . . . . . . . . . . . . . 568

Control Structures . . . . . . . . . . . . . 569

The Hold Space . . . . . . . . . . . . . . . 570

Examples. . . . . . . . . . . . . . . . . . . . 570

The sed Editor

566 Chapter 13 The sed Editor

Syntax

A sed command line has the following syntax:

sed [-n] program [file-list]

sed [-n] -f program-file [file-list]

The sed utility takes its input from files you specify on the command line or from standard input. Output from sed goes to standard output.

Arguments

The program is a sed program included on the command line. The first format allows you to write simple, short sed programs without creating a separate file to hold the sed program. The program-file in the second format is the pathname of a file containing a sed program (see “Editor Basics”). The file-list contains pathnames of the ordinary files that sed processes; these are the input files. When you do not specify a file-list, sed takes its input from standard input.

Options

Options preceded by a double hyphen (––) work under Linux (GNU sed) only. Options named with a single letter and preceded by a single hyphen work under Linux (GNU sed) and OS X (BSD sed).

––file program-file

–f program-file

Causes

sed

to read its program from the file named program-file instead of

from the command line. You can use this option more than once on the com-

mand line.

––help Summarizes how to use sed.

––in-place[=suffix]

–i[suffix]

Edits files in place. Without this option

sed

sends its output to standard out-

put. With this option

sed

replaces the file it is processing with its output. When

you specify a suffix,

sed

makes a backup of the original file. The backup has

the original filename with suffix appended. You must include a period in suffix

if you want a period to appear between the original filename and suffix.

––quiet or ––silent

–n Causes sed not to copy lines to standard output except as specified by the Print (p) instruction or flag.
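As an illustration of in-place editing with a backup, the sketch below attaches the suffix directly to the single-letter option (-i.bak). The filename notes is hypothetical, and writing the suffix with no intervening space is an assumption chosen because that form is generally accepted by both GNU and BSD sed.

```shell
# Scratch directory and a hypothetical file to edit.
dir=$(mktemp -d)
printf 'Line one.\nThe second line.\n' > "$dir/notes"

# Edit the file in place; the original is kept as notes.bak.
sed -i.bak 's/line/sentence/' "$dir/notes"

edited=$(cat "$dir/notes")
backup=$(cat "$dir/notes.bak")
printf '%s\n' "$edited"
```

The period is part of the suffix, as described above; -ibak would instead create a backup named notesbak.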


Editor Basics

A sed program consists of one or more lines with the following syntax:

[address[,address]] instruction [argument-list]

The addresses are optional. If you omit the address,

sed

processes all lines of

input. The instruction is an editing instruction that modifies the text. The

addresses select the line(s) the instruction part of the command operates on. The

number and kinds of arguments in the argument-list depend on the instruction. If

you want to put several

sed

commands on one line, separate the commands with

semicolons (;).

The

sed

utility processes input as follows:

1. Reads one line of input from file-list or standard input.

2. Reads the first instruction from the program or program-file. If the

address(es) select the input line, acts on the input line as the instruction

specifies.

3. Reads the next instruction from the program or program-file. If the

address(es) select the input line, acts on the input line (possibly modified

by the previous instruction) as the instruction specifies.

4. Repeats step 3 until it has executed all instructions in the program or

program-file.

5. Starts over with step 1 if there is another line of input; otherwise, sed is finished.
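The processing steps above imply that each instruction sees the line as the previous instruction left it. A small sketch, using made-up input, shows the effect of instruction order:

```shell
# The second instruction acts on the result of the first: cat -> dog -> bird.
chained=$(printf 'a cat\n' | sed 's/cat/dog/; s/dog/bird/')

# With the order swapped, s/dog/bird/ runs first and finds nothing to
# change, so only cat -> dog takes effect.
swapped=$(printf 'a cat\n' | sed 's/dog/bird/; s/cat/dog/')

printf '%s\n%s\n' "$chained" "$swapped"
```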

Addresses

A line number is an address that selects a line. As a special case, the line number $

represents the last line of input.

A regular expression (Appendix A) is an address that selects those lines containing a

string that the regular expression matches. Although slashes are often used to

delimit these regular expressions,

sed

permits you to use any character other than a

backslash or NEWLINE for this purpose.

Except as noted, zero, one, or two addresses (either line numbers or regular expres-

sions) can precede an instruction. If you do not specify an address,

sed

selects all

lines, causing the instruction to act on every line of input. Specifying one address

causes the instruction to act on each input line the address selects. Specifying two

addresses causes the instruction to act on groups of lines. In this case the first

address selects the first line in the first group. The second address selects the next

subsequent line that it matches; this line is the last line in the first group. If no

match for the second address is found, the second address points to the end of the

file. After selecting the last line in a group,

sed

starts the selection process over,

looking for the next line the first address matches. This line is the first line in the

next group. The

sed

utility continues this process until it has finished going through

the entire file.
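The way two addresses select groups of lines can be seen with a short sketch. The BEGIN and END markers and the sample paths are invented for this illustration. The second command uses % as the delimiter for an address; when an address uses a delimiter other than a slash, the opening delimiter must be preceded by a backslash.

```shell
input=$(printf 'a\nBEGIN\nb\nEND\nc\nBEGIN\nd\n')

# Each group runs from a line matching BEGIN through the next line
# matching END. The second group never finds END, so it extends to
# the end of the input.
groups=$(printf '%s\n' "$input" | sed -n '/BEGIN/,/END/ p')

# An address delimited by % instead of /, convenient when the regular
# expression itself contains slashes.
paths=$(printf '/usr/bin/sed\n/bin/ed\n' | sed -n '\%/usr/bin% p')

printf '%s\n' "$groups" "$paths"
```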

Instructions

Pattern space The

sed

utility has two buffers. The following commands work with the Pattern

space, which initially holds the line of input that

sed

just read. The other buffer, the

Hold space, is discussed on page 570.

a (append) The Append instruction appends one or more lines to the currently selected line. If you precede an Append instruction with two addresses, it appends to each line that is selected by the addresses. If you do not precede an Append instruction with an address, it appends to each input line. An Append instruction has the following format:

[address[,address]] a\

text \

...

text

You must end each line of appended text, except the last, with a backslash, which

quotes the following NEWLINE. The appended text concludes with a line that does not

end with a backslash. The

sed

utility always writes out appended text, regardless of

whether you use a –n flag on the command line. It even writes out the text if you

delete the line to which you are appending the text.

c (change) The Change instruction is similar to Append and Insert except it changes the selected lines so that they contain the new text. When you specify an address range, Change replaces the range of lines with a single occurrence of the new text.

d (delete) The Delete instruction causes sed not to write out the lines it selects and not to finish processing the lines. After sed executes a Delete instruction, it reads the next input line and then begins anew with the first instruction from the program or program-file.

i (insert) The Insert instruction is identical to the Append instruction except it places the new text before the selected line.

N (next without write) The Next (N) instruction reads the next input line and appends it to the current line. An embedded NEWLINE separates the original line and the new line. You can use the N command to remove NEWLINEs from a file. See the example on page 575.


n (next) The Next (n) instruction writes out the currently selected line if appropriate, reads the next input line, and starts processing the new line with the next instruction from the program or program-file.

p (print) The Print instruction writes the selected lines to standard output, writing the lines immediately, and does not reflect the effects of subsequent instructions. This instruction overrides the –n option on the command line.

q (quit) The Quit instruction causes sed to terminate immediately.

r file (read) The Read instruction reads the contents of the specified file and appends it to the selected line. A single SPACE and the name of the input file must follow a Read instruction.
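A brief sketch of the Read instruction follows. The filenames main and extra and the MARK placeholder are hypothetical. The program is double-quoted here only so the shell can expand the scratch directory name.

```shell
dir=$(mktemp -d)
printf 'top\nMARK\nbottom\n' > "$dir/main"
printf 'inserted 1\ninserted 2\n' > "$dir/extra"

# After each line that matches MARK, write out the contents of extra.
merged=$(sed "/MARK/ r $dir/extra" "$dir/main")

printf '%s\n' "$merged"
```

The filename runs to the end of the line, so Read is easiest to use as the last instruction on its line.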

s (substitute) The Substitute instruction in sed is similar to that in vim (page 176). It has the following format:

[address[,address]] s/pattern/replacement-string/[g][p][w file]

The pattern is a regular expression (Appendix A) that traditionally is delimited by a

slash (/); you can use any character other than a SPACE or NEWLINE. The replacement-

string starts immediately following the second delimiter and must be terminated by

the same delimiter. The final (third) delimiter is required. The replacement-string

can contain an ampersand (&), which

sed

replaces with the matched pattern.

Unless you use the g flag, the Substitute instruction replaces only the first occur-

rence of the pattern on each selected line.

The g (global) flag causes the Substitute instruction to replace all nonoverlapping

occurrences of the pattern on the selected lines.

The p (print) flag causes

sed

to send all lines on which it makes substitutions to

standard output. This flag overrides the –n option on the command line.

The w (write) flag is similar to the p flag but sends its output to the file specified by

file. A single SPACE and the name of the output file must follow a w flag.
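The following sketch exercises the pieces of the Substitute instruction described above: the default first-occurrence behavior, the g flag, the ampersand in the replacement-string, and a delimiter other than a slash. The sample strings are made up for the illustration.

```shell
# Without g, only the first occurrence on the line changes.
first=$(printf 'line by line\n' | sed 's/line/row/')

# With g, every nonoverlapping occurrence changes.
all=$(printf 'line by line\n' | sed 's/line/row/g')

# & stands for whatever the pattern matched.
amp=$(printf 'port 80 and 443\n' | sed 's/[0-9][0-9]*/<&>/g')

# A | delimiter avoids quoting the slashes in a pathname.
alt=$(printf '/usr/local/bin\n' | sed 's|/usr/local|/opt|')

printf '%s\n' "$first" "$all" "$amp" "$alt"
```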

w file (write) This instruction is similar to the Print instruction except it sends the output to the file specified by file. A single SPACE and the name of the output file must follow a Write instruction.

Control Structures

! (NOT) Causes sed to apply the following instruction, located on the same line, to each of the lines not selected by the address portion of the instruction. For example, 3!d deletes all lines except line 3 and $!p displays all lines except the last.

{} (group instructions) When you enclose a group of instructions within a pair of

braces, a single address or address pair selects the lines on which the group of

instructions operates. Use semicolons (;) to separate multiple commands appearing

on a single line.
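A short sketch of both control structures, using invented four-line input. The group is written with one instruction per line, a layout that avoids any differences in how sed versions treat a closing brace on the same line as other instructions.

```shell
input=$(printf 'one\ntwo\nthree\nfour\n')

# NOT: delete every line except line 3.
not3=$(printf '%s\n' "$input" | sed '3!d')

# Group: on lines 2 and 3 only, substitute and then print.
group=$(printf '%s\n' "$input" | sed -n '2,3 {
s/t/T/
p
}')

printf '%s\n' "$not3" "$group"
```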

Branch instructions The GNU sed info page identifies the branch instructions as “Commands for sed gurus” and suggests that if you need them you might be better off writing your program in awk or Perl.

:label Identifies a location within a sed program. The label is useful as a target for the b and t branch instructions.

b [label] Unconditionally transfers control to label. Without label, skips the rest of the instructions for the current line of input and reads the next line of input.

t [label] Transfers control to label only if a Substitute instruction has been successful since the most recent line of input was read (conditional branch). Without label, skips the rest of the instructions for the current line of input and reads the next line of input.
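As a minimal sketch of the branch instructions, the first command below loops with t until no run of two SPACEs remains, and the second uses b without a label to skip the remaining instructions on comment lines. The label name again and the sample input are hypothetical. Each instruction is given as a separate -e argument (an option both GNU and BSD sed accept) so that the label ends cleanly.

```shell
# Conditional loop: replace a pair of SPACEs with one SPACE, then branch
# back to :again for as long as the substitution keeps succeeding.
squeezed=$(printf 'a    b  c\n' | sed -e ':again' -e 's/  / /' -e 't again')

# Unconditional branch without a label: lines starting with # skip the
# substitution that follows.
skipped=$(printf '#x\nx\n' | sed -e '/^#/ b' -e 's/x/y/')

printf '%s\n' "$squeezed" "$skipped"
```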

The Hold Space

The commands reviewed up to this point work with the Pattern space, a buffer that

initially holds the line of input that

sed

just read. The Hold space can hold data

while you manipulate data in the Pattern space; it is a temporary buffer. Until you

place data in the Hold space, it is empty. This section discusses commands that

move data between the Pattern space and the Hold space.

g Copies the contents of the Hold space to the Pattern space. The original contents of the Pattern space are lost.

G Appends a NEWLINE and the contents of the Hold space to the Pattern space.

h Copies the contents of the Pattern space to the Hold space. The original contents of the Hold space are lost.

H Appends a NEWLINE and the contents of the Pattern space to the Hold space.

x Exchanges the contents of the Pattern space and the Hold space.
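One way to see these commands cooperate is a sketch that displays the line preceding a match. Every line is saved in the Hold space with h; when a line matches, x brings the saved (previous) line into the Pattern space so p can display it, and a second x restores the current line. The input is a made-up fragment.

```shell
before=$(printf 'Line one.\nThe third.\nThis is line four.\nFive.\n' | sed -n '/four/ {
x
p
x
}
h')

printf '%s\n' "$before"
```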

Examples

lines data file The following examples use the lines file for input:

$ cat lines

Line one.

The second line.

The third.

This is line four.

Five.

This is the sixth sentence.

This is line seven.

Eighth and last.

Unless you instruct it not to,

sed

sends all lines—selected or not—to standard out-

put. When you use the –n option on the command line,

sed

sends only certain lines,

such as those selected by a Print (p) instruction, to standard output.


The following command line displays all lines in the lines file that contain the word

line (all lowercase). In addition, because there is no –n option,

sed

displays all the

lines of input. As a result,

sed

displays the lines that contain the word line twice.

$ sed '/line/ p' lines
Line one.
The second line.
The second line.
The third.
This is line four.
This is line four.
Five.
This is the sixth sentence.
This is line seven.
This is line seven.
Eighth and last.

The preceding command uses the address /line/, a regular expression that is a simple

string. The

sed

utility selects each of the lines that contains a match for that pattern.

The Print (p) instruction displays each of the selected lines.

The following command uses the –n option, so

sed

displays only the selected lines:

$ sed -n '/line/ p' lines

The second line.

This is line four.

This is line seven.

In the next example,

sed

displays part of a file based on line numbers. The Print

instruction selects and displays lines 3 through 6.

$ sed -n '3,6 p' lines

The third.

This is line four.

Five.

This is the sixth sentence.

The next command line uses the Quit instruction to cause

sed

to display only the

beginning of a file. In this case

sed

displays the first five lines of lines just as a head

–5 lines command would.

$ sed '5 q' lines

Line one.

The second line.

The third.

This is line four.

Five.

program-file When you need to give

sed

more complex or lengthy instructions, you can use a

program-file. The print3_6 program performs the same function as the command

line in the second preceding example. The –f option tells

sed

to read its program

from the file named following this option.

$ cat print3_6

3,6 p

$ sed -n -f print3_6 lines

The third.

This is line four.

Five.

This is the sixth sentence.

Append The next program selects line 2 and uses an Append instruction to append a NEWLINE

and the text AFTER. to the selected line. Because the command line does not

include the –n option,

sed

copies all lines from the input file lines.

$ cat append_demo

2 a\

AFTER.

$ sed -f append_demo lines

Line one.

The second line.

AFTER.

The third.

This is line four.

Five.

This is the sixth sentence.

This is line seven.

Eighth and last.

Insert The insert_demo program selects all lines containing the string This and inserts a

NEWLINE and the text BEFORE. before the selected lines.

$ cat insert_demo

/This/ i\

BEFORE.

$ sed -f insert_demo lines

Line one.

The second line.

The third.

BEFORE.

This is line four.

Five.

BEFORE.

This is the sixth sentence.

BEFORE.

This is line seven.

Eighth and last.

Change The next example demonstrates a Change instruction with an address range. When

you specify a range of lines for a Change instruction, it does not change each line

within the range but rather changes the block of lines to a single occurrence of the

new text.


$ cat change_demo

2,4 c\

SED WILL INSERT THESE\

THREE LINES IN PLACE\

OF THE SELECTED LINES.

$ sed -f change_demo lines

Line one.

SED WILL INSERT THESE

THREE LINES IN PLACE

OF THE SELECTED LINES.

Five.

This is the sixth sentence.

This is line seven.

Eighth and last.

Substitute The next example demonstrates a Substitute instruction. The

sed

utility selects all

lines because the instruction has no address. On each line subs_demo replaces the

first occurrence of line with sentence. The p flag displays each line where a substitu-

tion occurs. The command line calls

sed

with the –n option, so

sed

displays only the

lines the program explicitly specifies.

$ cat subs_demo

s/line/sentence/p

$ sed -n -f subs_demo lines

The second sentence.

This is sentence four.

This is sentence seven.

The next example is similar to the preceding one except that a w flag and filename

(temp) at the end of the Substitute instruction cause

sed

to create the file named

temp. The command line does not include the –n option, so it displays all lines in

addition to writing the changed lines to temp. The

cat

utility displays the contents of

the file temp. The word Line (starting with an uppercase L) is not changed.

$ cat write_demo1

s/line/sentence/w temp

$ sed -f write_demo1 lines

Line one.

The second sentence.

The third.

This is sentence four.

Five.

This is the sixth sentence.

This is sentence seven.

Eighth and last.

$ cat temp

The second sentence.

This is sentence four.

This is sentence seven.

The following bash script changes all occurrences of REPORT to report, FILE to file, and PROCESS to process in a group of files. Because it is a shell script and not a sed program file, you must have read and execute permission to the sub file to execute it as a command (page 278). The for structure (page 412) loops through the list of files on the command line. As it processes each file, the script displays each filename before processing the file with sed. This program uses embedded sed commands that span multiple lines. Because the NEWLINEs between the commands are quoted (they appear between single quotation marks), sed accepts multiple commands on a single, extended command line (within the shell script). Each Substitute instruction includes a g (global) flag to take care of the case where a string occurs more than once on a line.

$ cat sub

for file
do
    echo $file
    mv $file $$.subhld
    sed 's/REPORT/report/g
         s/FILE/file/g
         s/PROCESS/process/g' $$.subhld > $file
done
rm $$.subhld

$ sub file1 file2 file3

file1

file2

file3

In the next example, a Write instruction copies part of a file to another file (temp2).

The line numbers 2 and 4, separated by a comma, select the range of lines

sed

is to

copy. This program does not alter the lines.

$ cat write_demo2

2,4 w temp2

$ sed -n -f write_demo2 lines

$ cat temp2

The second line.

The third.

This is line four.

The program write_demo3 is similar to write_demo2 but precedes the Write

instruction with the NOT operator (!), causing

sed

to write to the file those lines

not selected by the address.

$ cat write_demo3

2,4 !w temp3

$ sed -n -f write_demo3 lines


$ cat temp3

Line one.

Five.

This is the sixth sentence.

This is line seven.

Eighth and last.

Next (n) The following example demonstrates the Next (n) instruction. When it processes the selected line (line 3), sed immediately starts processing the next line without displaying line 3.

$ cat next_demo1
3 n
p

$ sed -n -f next_demo1 lines

Line one.

The second line.

This is line four.

Five.

This is the sixth sentence.

This is line seven.

Eighth and last.

The next example uses a textual address. The sixth line contains the string the, so

the Next (n) instruction causes

sed

not to display it.

$ cat next_demo2
/the/ n
p

$ sed -n -f next_demo2 lines

Line one.

The second line.

The third.

This is line four.

Five.

This is line seven.

Eighth and last.

Next (N) The following example is similar to the preceding example except it uses the uppercase Next (N) instruction in place of the lowercase Next (n) instruction. Here the Next (N) instruction appends the next line to the line that contains the string the. In the lines file, sed appends line 7 to line 6 and embeds a NEWLINE between the two lines. The Substitute command replaces the embedded NEWLINE with a SPACE. The Substitute command does not affect other lines because they do not contain embedded NEWLINEs; rather, they are terminated by NEWLINEs. See page 926 for an example of the Next (N) instruction in a sed script running under OS X.

$ cat Next_demo3
/the/ N
s/\n/ /
p

$ sed -n -f Next_demo3 lines

Line one.

The second line.

The third.

This is line four.

Five.

This is the sixth sentence. This is line seven.

Eighth and last.

The next set of examples uses the compound.in file to demonstrate how

sed

instructions work together.

$ cat compound.in

1. The words on this page...

2. The words on this page...

3. The words on this page...

4. The words on this page...

The following example substitutes the string words with text on lines 1, 2, and 3

and the string text with TEXT on lines 2, 3, and 4. The example also selects and

deletes line 3. The result is text on line 1, TEXT on line 2, no line 3, and words on

line 4. The

sed

utility makes two substitutions on lines 2 and 3: text for words and

TEXT for text. Then

sed

deletes line 3.

$ cat compound

1,3 s/words/text/

2,4 s/text/TEXT/

3 d

$ sed -f compound compound.in

1. The text on this page...

2. The TEXT on this page...

4. The words on this page...

The ordering of instructions within a

sed

program is critical. Both Substitute

instructions are applied to the second line in the following example, as in the previ-

ous example, but the order in which the substitutions occur changes the result.

$ cat compound2

2,4 s/text/TEXT/

1,3 s/words/text/

3 d

$ sed -f compound2 compound.in

1. The text on this page...

2. The text on this page...

4. The words on this page...

In the next example, compound3 appends two lines to line 2. The

sed

utility dis-

plays all lines from the file once because no –n option appears on the command

line. The Print instruction at the end of the program file displays line 3 an addi-

tional time.


$ cat compound3

2 a\

This is line 2a.\

This is line 2b.

3 p

$ sed -f compound3 compound.in

1. The words on this page...

2. The words on this page...

This is line 2a.

This is line 2b.

3. The words on this page...

3. The words on this page...

4. The words on this page...

The next example shows that

sed

always displays appended text. Here line 2 is

deleted but the Append instruction still displays the two lines that were appended to

it. Appended lines are displayed even if you use the –n option on the command line.

$ cat compound4

2 a\

This is line 2a.\

This is line 2b.

2 d

$ sed -f compound4 compound.in

1. The words on this page...

This is line 2a.

This is line 2b.

3. The words on this page...

4. The words on this page...

The next example uses a regular expression as the pattern. The regular expression

in the following instruction (^.) matches one character at the beginning of every line

that is not empty. The replacement string (between the second and third slashes)

contains a backslash escape sequence that represents a TAB character (\t) followed by

an ampersand (&). The ampersand takes on the value of what the regular expres-

sion matched.

$ sed 's/^./\t&/' lines

        Line one.

        The second line.

        The third.

...

This type of substitution is useful for indenting a file to create a left margin. See

Appendix A for more information on regular expressions.

You can also use the simpler form s/^/\t/ to add TABs to the beginnings of lines. In

addition to placing TABs at the beginnings of lines with text on them, this instruction

places a TAB at the beginning of every empty line—something the preceding com-

mand does not do.
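The \t escape in a replacement string is not interpreted by every version of sed. A more portable sketch, offered as an assumption-labeled alternative rather than the method used above, has the shell generate a literal TAB and passes it to sed inside a double-quoted program. The variable name tab is arbitrary.

```shell
# Let printf produce a literal TAB character.
tab=$(printf '\t')

# Indent only nonempty lines, as s/^./\t&/ does; the empty second line
# of this sample input is left alone.
indented=$(printf 'Line one.\n\nThe third.\n' | sed "s/^./${tab}&/")

printf '%s\n' "$indented"
```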

You may want to put the preceding

sed

instruction into a shell script so you do not

have to remember it (and retype it) each time you want to indent a file. The

chmod

utility gives you read and execute permission to the ind file.

$ cat ind

sed 's/^./\t&/' $*

$ chmod u+rx ind

$ ind lines

        Line one.

        The second line.

        The third.

...

Stand-alone script When you run the preceding shell script, it creates two processes: It calls a shell,

which in turn calls

sed

. You can eliminate the overhead associated with the shell

process by putting the line #!/bin/sed –f (page 280) at the beginning of the script,

which runs the

sed

utility directly. You need read and execute permission to the file

holding the script.

$ cat ind2

#!/bin/sed -f

s/^./\t&/

In the following sed program, the regular expression (two SPACEs followed by *$) matches one or more SPACEs at the end of a line. This program removes trailing SPACEs at the ends of lines, which is useful for cleaning up files you created using vim.

$ cat cleanup
sed 's/  *$//' $*

The cleanup2 script runs the same

sed

command as cleanup but stands alone: It

calls

sed

directly with no intermediate shell.

$ cat cleanup2

#!/bin/sed -f

s/  *$//
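A quick check of the trailing-SPACE pattern on made-up input follows. The second command is a variation, not part of the scripts above: it uses the POSIX character class [[:blank:]] so that trailing TABs are removed as well.

```shell
# Two SPACEs followed by * means "one SPACE, then zero or more SPACEs".
cleaned=$(printf 'keep  \nalso \nnone\n' | sed 's/  *$//')

# Variation (an assumption about what you may want): strip trailing
# SPACEs and TABs together.
cleaned_ws=$(printf 'x \t\ny\n' | sed 's/[[:blank:]][[:blank:]]*$//')

printf '%s\n' "$cleaned" "$cleaned_ws"
```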

Hold space The next

sed

program makes use of the Hold space to exchange pairs of lines in a

file.

$ cat s1

h # Copy Pattern space (line just read) to Hold space.

n # Read the next line of input into Pattern space.

p # Output Pattern space.

g # Copy Hold space to Pattern space.

p # Output Pattern space (which now holds the previous line).

$ sed -nf s1 lines

The second line.

Line one.

This is line four.


The third.

This is the sixth sentence.

Five.

Eighth and last.

This is line seven.

The commands in the s1 program process pairs of input lines. This program reads a

line and stores it; reads another line and displays it; and then retrieves the stored

line and displays it. After processing a pair of lines, the program starts over with the

next pair of lines.

The next

sed

program adds a blank line after each line in the input file (i.e., it double-

spaces a file).

$ sed 'G' lines

Line one.

The second line.

The third.

This is line four.

The G instruction appends a NEWLINE and the contents of the Hold space to the Pattern space. Unless you put something in the Hold space, it will be empty. Thus the G instruction appends a NEWLINE to each line of input before sed displays the line(s) from the Pattern space.

The s2

sed

program reverses the order of the lines in a file just as the

tac

utility does.

$ cat s2

2,$G # On all but the first line, append a NEWLINE and the

# contents of the Hold space to the Pattern space.

h # Copy the Pattern space to the Hold space.

$!d # Delete all except the last line.

$ sed -f s2 lines

Eighth and last.

This is line seven.

This is the sixth sentence.

Five.

This is line four.

The third.

The second line.

Line one.

This program comprises three commands: 2,$G, h, and $!d. To understand this script it is important to understand how the address of the last command works: The $ is the address of the last line of input and the ! negates the address. The result is an address that selects all lines except the last line of input. In the same fashion you could replace the first command with 1!G: It would select all lines except the first line for processing; the results would be the same.

Here is what happens as s2 processes the lines file:

1. The

sed

utility reads the first line of input (Line one.) into the Pattern

space.

a. The 2,$G does not process the first line of input—because of its address

the G instruction starts processing at the second line.

b. The h copies Line one. from the Pattern space to the Hold space.

c. The $!d deletes the contents of the Pattern space. Because there is nothing

in the Pattern space,

sed

does not display anything.

2. The

sed

utility reads the second line of input (The second line.) into the

Pattern space.

a. The 2,$G adds what is in the Hold space (Line one.) to the Pattern

space. The Pattern space now contains The second line.NEWLINELine one.

b. The h copies what is in the Pattern space to the Hold space.

c. The $!d deletes the second line of input. Because it is deleted,

sed

does

not display it.

3. The

sed

utility reads the third line of input (The third.) into the Pattern

space.

a. The 2,$G adds what is in the Hold space (The second line.NEWLINELine

one.) to the Pattern space. The Pattern space now has The third.NEWLINE

The second line.NEWLINELine one.

b. The h copies what is in the Pattern space to the Hold space.

c. The $!d deletes the contents of the Pattern space. Because there is nothing

in the Pattern space,

sed

does not display anything.

. . .

8. The

sed

utility reads the eighth (last) line of input into the Pattern space.

a. The 2,$G adds what is in the Hold space to the Pattern space. The Pat-

tern space now contains all the lines from lines in reverse order.

b. The h copies what is in the Pattern space to the Hold space. This step is

not necessary for the last line of input but does not alter the program’s

output.

c. The $!d does not process the last line of input. Because of its address the

d instruction does not delete the last line.

d. The

sed

utility displays the contents of the Pattern space.
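The same accumulate-in-the-Hold-space idea can be written as a one-line program. This sketch is a variation on s2, not the book's script: it uses 1!G in place of 2,$G, and it runs under -n with an explicit $p on the last line instead of deleting every other line with $!d. The three-line input is invented.

```shell
# 1!G  on all but the first line, append the Hold space to the Pattern space
# h    copy the Pattern space to the Hold space
# $p   on the last line only, display the Pattern space
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```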


Chapter Summary

The

sed

(stream editor) utility is a batch (noninteractive) editor. It takes its input

from files you specify on the command line or from standard input. Unless you redi-

rect the output from

sed

, it goes to standard output.

A sed program consists of one or more lines with the following syntax:

[address[,address]] instruction [argument-list]

The addresses are optional. If you omit the address,

sed

processes all lines of input.

The instruction is the editing instruction that modifies the text. The addresses select

the line(s) the instruction part of the command operates on. The number and kinds

of arguments in the argument-list depend on the instruction.

In addition to basic instructions,

sed

includes some powerful advanced instructions.

One set of these instructions allows

sed

programs to store data temporarily in a

buffer called the Hold space. Other instructions provide unconditional and condi-

tional branching in

sed

programs.

Exercises

1. Write a

sed

command that copies a file to standard output, removing all

lines that begin with the word Today.

2. Write a

sed

command that copies only those lines of a file that begin with

the word Today to standard output.

3. Write a

sed

command that copies a file to standard output, removing all

blank lines (i.e., lines with no characters on them).

4. Write a

sed

program named ins that copies a file to standard output,

changing all occurrences of cat to dog and preceding each modified line

with a line that says following line is modified:

5. Write a

sed

program named div that copies a file to standard output,

copies the first five lines to a file named first, and copies the rest of the file

to a file named last.

6. Write a

sed

command that copies a file to standard output, replacing a

single SPACE as the first character on a line with a 0 (zero) only if the SPACE is

immediately followed by a number (0–9). For example:

abc → abc

 85c → 085c

55b → 55b

 000 → 0000

7. How can you use

sed

to triple-space (i.e., add two blank lines after each

line in) a file?


The

rsync

(remote synchronization) utility copies an ordinary

file or directory hierarchy locally or from the local system to or

from another system on a network. By default, this utility uses

OpenSSH to transfer files and the same authentication mecha-

nism as OpenSSH; therefore it provides the same security as

OpenSSH. The

rsync

utility prompts for a password when it

needs one. Alternatively, you can use the rsyncd daemon as a

transfer agent.

In This Chapter

Syntax . . . . . . . . . . . . . . . . . . . . . . 584

Arguments . . . . . . . . . . . . . . . . . . . 584

Options . . . . . . . . . . . . . . . . . . . . . 584

Examples. . . . . . . . . . . . . . . . . . . . 587

Removing Files . . . . . . . . . . . . . . . 588

Copying Files to and from a

Remote System . . . . . . . . . . . . . 590

Mirroring a Directory . . . . . . . . . . . 590

Making Backups . . . . . . . . . . . . . . 591

The

rsync

Secure

Copy Utility

584 Chapter 14 The

rsync

Secure Copy Utility

Syntax

An rsync command line has the following syntax:

rsync [options] [[user@]from-host:]source-file [[user@]to-host:][destination-file]

The

rsync

utility copies files, including directory hierarchies, on the local system or

between the local system and a remote system.

Arguments

The from-host is the name of the system you are copying files from; the to-host is

the name of the system you are copying files to. When you do not specify a host,

rsync

assumes the local system. The user on either system defaults to the user who is

giving the command on the local system; you can specify a different user with

user@. Unlike scp, rsync does not permit copying between remote systems.

The source-file is the ordinary or directory file you are copying; the destination-file

is the resulting copy. You can specify files as relative or absolute pathnames. On

the local system, relative pathnames are relative to the working directory; on a

remote system, relative pathnames are relative to the specified or implicit user’s

home directory. When the source-file is a directory, you must use the ––recursive or

––archive option to copy its contents. When the destination-file is a directory, each

of the source files maintains its simple filename. If the source-file is a single file,

you can omit destination-file; the copied file will have the same simple filename as

source-file (useful only when copying to or from a remote machine).
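When source-file is a directory, whether its name ends with a slash changes what rsync copies. The following local-only sketch shows the difference. The directory names src, dst1, and dst2 are hypothetical, and the commands are guarded so that nothing runs on a system where rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dst1" "$work/dst2"
echo hello > "$work/src/sub/f"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash: the directory src itself is copied into dst1.
    rsync -a "$work/src" "$work/dst1/"
    # Trailing slash: only the contents of src are copied into dst2.
    rsync -a "$work/src/" "$work/dst2/"
    find "$work/dst1" "$work/dst2" -type f
fi
```

After the first command the file is at dst1/src/sub/f; after the second it is at dst2/sub/f.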

Options

––acls –A Preserves ACLs (page 99) of copied files.

––archive –a Copies files including dereferenced symbolic links, device files, and special files recursively, preserving ownership, group, permissions, and modification times associated with the files. Using this option is the same as specifying the ––devices, ––specials, ––group, ––links, ––owner, ––perms, ––recursive, and ––times options. This option does not include the ––acls, ––hard-links, or ––xattrs options; you must specify these options in addition to ––archive if you want to use them. See page 588 for an example.

A trailing slash (/) on source-file is critical

caution When source-file is a directory, a trailing slash in source-file causes rsync to copy the contents of the directory. The slash is equivalent to /*; it tells rsync to ignore the directory itself and copy the files within the directory. Without a trailing slash, rsync copies the directory. See page 587.

The Mac OS X version of rsync accepts long options

tip Options for rsync preceded by a double hyphen (––) work under Mac OS X as well as under Linux.

––backup –b Renames files that otherwise would be deleted or overwritten. By default, the

rsync utility renames files by appending a tilde (~) to existing filenames. See

––backup-dir=dir if you want rsync to put these files in a specified directory

instead of renaming them. See also ––link-dest=dir.

––backup-dir=dir When used with the ––backup option, moves files that otherwise would be deleted or overwritten into the directory named dir. After moving the older version of the file to dir, rsync copies the newer version of the file from source-file to destination-file.

The directory named dir is located on the same system as destination-file. If

dir is a relative pathname, it is relative to destination-file.

––copy-unsafe-links (partial dereference) For each file that is a symbolic link that refers to a file outside the source-file hierarchy, copies the file the link points to, not the symbolic link itself. Without this option rsync copies all symbolic links, even if it does not copy the file the link refers to.

–D Same as ––devices ––specials.

––delete Deletes files in the destination-file that are not in the source-file. This option

can easily remove files you did not intend to remove; see the caution box on

page 589.

––devices Copies device files (root user only).

––dry-run Runs rsync without writing to disk. With the ––verbose option, this option

reports on what rsync would have done had it been run without this option.

Useful with the ––delete option.

––group –g Preserves group associations of copied files.

––hard-links –H Preserves hard links of copied files.

––links –l (lowercase “l”; no dereference) For each file that is a symbolic link, copies the

symbolic link, not the file the link points to, even if the file the link points to is

not in the source-file.

––link-dest=dir If rsync would normally copy a file (that is, if the file exists in source-file but not in destination-file or is changed in destination-file), rsync looks in the directory named dir for the same file. If it finds an exact copy of the file in dir, rsync makes a hard link from the file in dir to destination-file. If it does not find an exact copy, rsync copies the file to destination-file. See page 592 for an example.

The directory named dir is located on the same system as destination-file. If

dir is a relative pathname, it is relative to destination-file.

––owner –o Preserves the owner of copied files (root user only).

––perms –p Preserves the permissions of copied files.

––recursive –r Recursively descends a directory specified in source-file and copies all files in

the directory hierarchy. See page 587 for an example.

––specials Copies special files.

––times –t Preserves the modification times of copied files. This option also speeds up copying files because it causes rsync not to copy a file that has the same modification time and size in both the source-file and the destination-file. See page 587 for an example.

––update –u Skips files that are newer in the destination-file than in the source-file.

––verbose –v Displays information about what rsync is doing. This option is useful with the

––dry-run option. See page 587 for an example.

––xattrs –X Preserves the extended attributes of copied files. This option is not available

with all compilations of rsync.

––compress –z Compresses files while copying them.

Notes

The rsync utility has many options. This chapter describes a few of them; see the

rsync man page for a complete list.

OpenSSH By default, rsync copies files to and from a remote system using OpenSSH. The

remote system must be running an OpenSSH server. If you can use ssh to log in on

the remote system, you can use rsync to copy files to or from that system. If ssh

requires you to enter a password, rsync will require a password. See “Copying Files

to and from a Remote System” on page 590 for examples. See “Discussion” on

page 829 for more information on setting up and using OpenSSH.

rsyncd daemon If you use a double colon (::) in place of a single colon (:) following the name of a

remote system, rsync connects to the rsyncd daemon on the remote system (it does

not use OpenSSH). See the rsync man page for more information.

More Information

man page: rsync

rsync home page: www.samba.org/rsync

Backup information: www.mikerubel.org/computers/rsync_snapshots

Backup tools: www.rsnapshot.org, backuppc.sourceforge.net

File synchronization: alliance.seas.upenn.edu/~bcpierce

Examples

––recursive

––verbose

The first example shows rsync making a copy of a directory using the ––recursive and

––verbose options. Both the source and destination directories are in the working

directory.

$ ls -l memos

total 12

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

$ rsync --recursive --verbose memos memos.copy

building file list ... done

created directory memos.copy

memos/

memos/0606

memos/0608

sent 7671 bytes received 70 bytes 15482.00 bytes/sec

total size is 7501 speedup is 0.97

$ ls -l memos.copy

total 4

drwxr-xr-x 2 max max 4096 Jul 20 17:42 memos

In the preceding example, rsync copies the memos directory to the memos.copy

directory. As the following ls command shows, rsync changed the modification

times on the copied files to the time it made the copies:

$ ls -l memos.copy/memos

total 12

-rw-r--r-- 1 max max 1500 Jul 20 17:42 0606

-rw-r--r-- 1 max max 6001 Jul 20 17:42 0608

Using a Trailing Slash (/) on source-file

Whereas the previous example copied a directory to another directory, you may

want to copy the contents of a directory to another directory. A trailing slash (/) on

the source-file causes rsync to act as though you had specified a trailing /* and

causes rsync to copy the contents of the specified directory. A trailing slash on the

destination-file has no effect.
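The difference is easy to check on a scratch copy. The following is a minimal sketch, assuming rsync is installed; the temporary directory and the names with_dir and contents_only are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: effect of a trailing slash on source-file.
# All paths are throwaway names under a temporary directory (assumptions).
set -eu

if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; nothing to demonstrate" >&2
    exit 0
fi

work=$(mktemp -d)
mkdir "$work/memos"
echo "memo text" > "$work/memos/0606"

# No trailing slash: rsync copies the memos directory itself,
# so the file ends up in with_dir/memos/0606.
rsync --recursive "$work/memos" "$work/with_dir"

# Trailing slash: rsync copies only the contents of memos,
# so the file ends up in contents_only/0606.
rsync --recursive "$work/memos/" "$work/contents_only"

# List the copied files to compare the two layouts.
find "$work/with_dir" "$work/contents_only" -type f
```

Comparing the two pathnames that find prints shows where the extra memos level appears.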

––times The next example makes another copy of the memos directory, using ––times to preserve modification times of the copied files. It uses a trailing slash on memos to copy the contents of the memos directory (not the directory itself) to memos.copy2.

$ rsync --recursive --verbose --times memos/ memos.copy2

building file list ... done

created directory memos.copy2

0606

0608

sent 7665 bytes received 70 bytes 15470.00 bytes/sec

total size is 7501 speedup is 0.97

$ ls -l memos.copy2

total 12

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

––archive The ––archive option causes rsync to copy directories recursively, copying symbolic links as symbolic links (not the files the links point to) and preserving modification times, ownership, group association of the copied files, and more. This option does not preserve hard links; use the ––hard-links option for that purpose. See page 584 for more information on ––archive. The following two commands copy the same files as the previous command and, because of ––archive, also preserve the additional attributes just listed:

$ rsync --archive --verbose memos/ memos.copy2

$ rsync -av memos/ memos.copy2

Removing Files

––delete

––dry-run

The ––delete option causes rsync to delete from destination-file files that are not in

source-file. Together, the ––dry-run and ––verbose options report on what an rsync

command would do without the ––dry-run option, without rsync taking any action.

With the ––delete option, the ––dry-run and ––verbose options can help you avoid

removing files you did not intend to remove. This combination of options marks

files rsync would remove with the word deleting. The next example uses these

options in addition to the ––archive option.

$ ls -l memos memos.copy3

memos:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

memos.copy3:

total 16

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 1200 Jul 21 10:17 notes

$ rsync --archive --verbose --delete --dry-run memos/ memos.copy3

building file list ... done

deleting notes

0610

sent 125 bytes received 32 bytes 314.00 bytes/sec

total size is 13412 speedup is 85.43

The rsync utility reports deleting notes, indicating which file it would remove if you

ran it without the ––dry-run option. It also reports it would copy the 0610 file.

If you get tired of using the long versions of options, you can use the single-letter

versions. The next rsync command is the same as the previous one (there is no short

version of the ––delete option):

$ rsync -avn --delete memos/ memos.copy3

The next example runs the same rsync command, omitting the ––dry-run option.

The ls command shows the results of the rsync command: The ––delete option

causes rsync to remove the notes file from the destination-file (memos.copy3)

because it is not in the source-file (memos). In addition, rsync copies the 0610 file.

$ rsync --archive --verbose --delete memos/ memos.copy3

building file list ... done

deleting notes

0610

sent 6076 bytes received 48 bytes 12248.00 bytes/sec

total size is 13412 speedup is 2.19

$ ls -l memos memos.copy3

memos:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

memos.copy3:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610
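The preview-then-run habit can be wrapped in a few lines of shell. This is only a sketch, assuming rsync is installed; the directories under the temporary directory and the variable names preview and kept_after_preview are hypothetical.

```shell
#!/bin/sh
# Sketch: preview a --delete run before performing it.
# Directory names are throwaway examples (assumptions).
set -eu

if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; nothing to demonstrate" >&2
    exit 0
fi

work=$(mktemp -d)
mkdir "$work/memos" "$work/memos.copy"
echo "memo" > "$work/memos/0610"
echo "stray" > "$work/memos.copy/notes"

# Step 1: --dry-run changes nothing; --verbose lists what would happen.
preview=$(rsync --archive --verbose --delete --dry-run \
    "$work/memos/" "$work/memos.copy")
echo "$preview"

# The stray file must still be there after the dry run.
kept_after_preview=no
if [ -e "$work/memos.copy/notes" ]; then
    kept_after_preview=yes
fi

# Step 2: run for real only when the preview names the file we expect to lose.
if echo "$preview" | grep -q 'deleting notes'; then
    rsync --archive --delete "$work/memos/" "$work/memos.copy"
fi

ls "$work/memos.copy"
```

In practice you would read the preview yourself instead of matching it with grep; the grep test stands in for that review here.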

caution Test to make sure ––delete is going to do what you think it will do

The ––delete option can easily delete an entire directory tree if you omit a needed slash (/) or include an unneeded slash in source-file. Use ––delete with the ––dry-run and ––verbose options to test an rsync command.

To this point, the examples have copied files locally, in the working directory. To

copy files to other directories, replace the simple filenames with relative or absolute

pathnames. On the local system, relative pathnames are relative to the working

directory; on a remote system, relative pathnames are relative to the user’s home

directory. For example, the following command copies memos from the working

directory to the /backup directory on the local system:

$ rsync --archive --verbose --delete memos/ /backup

Copying Files to and from a Remote System

To copy files to or from a remote system, that system must be running an

OpenSSH server or another transport mechanism rsync can connect to. For more

information refer to “Notes” on page 586. To specify a file on a remote system,

preface the filename with the name of the remote system and a colon. Relative

pathnames on the remote system are relative to the user’s home directory. Absolute

pathnames are absolute (i.e., they are relative to the root directory). See page 83

for more information on relative and absolute pathnames.

In the next example, Max copies the contents of the memos directory from the working directory on the local system to the holdfiles directory in his home directory on the remote system named coffee. The ssh utility runs an ls command on coffee to show the result of the rsync command. The rsync and ssh utilities do not request a password because Max has set up OpenSSH-based utilities to log in automatically on coffee (page 830).

$ rsync --archive memos/ coffee:holdfiles

$ ssh coffee 'ls -l holdfiles'

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

When copying from a remote system to the local system, place the name of the

remote system before source-file:

$ rsync --archive coffee:holdfiles/ ~/memo.copy4

$ rsync --archive coffee:holdfiles/ /home/max/memo.copy5

Both of these commands copy the contents of the holdfiles directory from Max’s home directory on coffee to a directory in his home directory on the local system. Under Mac OS X, replace /home with /Users.

Mirroring a Directory

You can use rsync to maintain a copy of a directory. Because it is an exact copy, this type of copy is called a mirror. The mirror directory must be on an OpenSSH server (you must be able to connect to it using an OpenSSH utility such as ssh). If you want to run the following mirror script using crontab (page 649), you must set up OpenSSH so you can log in on the remote system automatically (without providing a password; page 830).

––compress

––update

The next example introduces the rsync ––compress and ––update options. The ––compress option causes rsync to compress files as it copies them, usually making the transfer go more quickly. In some cases, such as a setup with a fast network connection and a slower CPU, compressing files can slow down the transfer. The ––update option keeps rsync from overwriting a newer file with an older one.

As with all shell scripts, you must have read and execute access to the mirror script.

To make it easier to read, each option in this script appears on a line by itself. Each

line of each command except the last is terminated with a SPACE and a backslash (\).

The SPACE separates one option from the next; the backslash quotes the following

NEWLINE so the shell passes all arguments to rsync and does not interpret the NEWLINEs as

the end of the command.

$ cat mirror

rsync \

--archive \

--verbose \

--compress \

--update \

--delete \

~/mirrordir/ coffee:mirrordir

$ ./mirror > mirror.out

The mirror command in the example redirects output to mirror.out for review. Remove the ––verbose option if you do not want the command to produce any output except for errors. The rsync command in mirror copies the mirrordir directory

hierarchy from the user’s home directory on the local system to the user’s home

directory on the remote (server) system. In this example the remote system is named

coffee. Because of the ––update option, rsync will not overwrite newer versions of

files on the server with older versions of the same files from the local system.

Although this option is not required if files on the server system are never changed

manually, it can save you from grief if you accidentally update files on or add files

to the server system. The ––delete option causes rsync to remove files on the server

system that are not present on the local system.
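The protective effect of ––update can be tried without a remote system. In this sketch, which assumes rsync is installed, a local directory named server stands in for the remote mirror; that directory, the directory named local, and the file named notes are hypothetical.

```shell
#!/bin/sh
# Sketch: --update leaves a newer destination file alone.
# "local" and "server" are stand-in directories (assumptions); no network is used.
set -eu

if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; nothing to demonstrate" >&2
    exit 0
fi

work=$(mktemp -d)
mkdir "$work/local" "$work/server"

# The local copy is the older one: backdate its modification time.
echo "older local version" > "$work/local/notes"
touch -t 200001010000 "$work/local/notes"

# Someone edited the file on the server more recently.
echo "newer edit made on the server" > "$work/server/notes"

# With --update, rsync skips notes because the destination copy is newer.
rsync --archive --update "$work/local/" "$work/server"

cat "$work/server/notes"
```

Running the same rsync command without ––update would replace the server copy with the older local version.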

Making Backups

After performing an initial full backup, rsync is able to perform subsequent incremental backups efficiently with regard to running time and storage space. By definition, an incremental backup stores only those files that have changed since the last backup;

these are the only files that rsync needs to copy. As the following example shows,

rsync, without using extra disk space, can make each incremental backup appear to be

a full backup by creating hard links between the incremental backup and the

unchanged files in the initial full backup.
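The space saving rests on ordinary hard links, which can be examined without rsync. The following sketch uses only ln, ls, and awk; the directory names full and incr are hypothetical stand-ins for a full and an incremental backup.

```shell
#!/bin/sh
# Sketch: a hard link gives a second name to the same data.
# "full" and "incr" are made-up directory names (assumptions).
set -eu

work=$(mktemp -d)
mkdir "$work/full" "$work/incr"
echo "unchanged memo" > "$work/full/0608"

# Link the unchanged file into the incremental backup instead of copying it.
ln "$work/full/0608" "$work/incr/0608"

# The second field of ls -l is the link count; it is now 2.
links=$(ls -l "$work/full/0608" | awk '{ print $2 }')
echo "link count: $links"

# Removing one name does not remove the data; the other name still works.
rm "$work/full/0608"
cat "$work/incr/0608"
```

Because both names refer to the same data, removing an old backup directory does not disturb files that newer backups still link to.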

––link-dest=dir The rsync ––link-dest=dir option makes backups easy and efficient. It presents the

user and/or system administrator with snapshots that appear to be full backups

while taking minimal extra space in addition to the initial backup. The dir directory

is always located on the machine holding the destination-file. If dir is a relative

pathname, the pathname is relative to the destination-file. See page 585 for a

description of this option.

Following is a simple rsync command that uses the ––link-dest=dir option:

$ rsync --archive --link-dest=../backup source/ destination

When you run this command, rsync descends the source directory, examining each file it finds. For each file in the source directory, rsync looks in the destination directory to find an exact copy of the file.

• If it finds an exact copy of the file in the destination directory, rsync continues with the next file.

• If it does not find an exact copy of the file in the destination directory, rsync looks in the backup directory to find an exact copy of the file.

    • If it finds an exact copy of the file in the backup directory, rsync makes a hard link from the file in the backup directory to the destination directory.

    • If it does not find an exact copy of the file in the backup directory, rsync copies the file from the source directory to the destination directory.

Next is a simple example showing how to use rsync to make full and incremental

backups using the ––link-dest=dir option. Although the backup files reside on the

local system, they could easily be located on a remote system.

As specified by the two arguments to rsync in the bkup script, rsync copies the

memos directory to the bu.0 directory. The ––link-dest=dir option causes rsync to

check whether each file it needs to copy exists in bu.1. If it does, rsync creates a link

to the bu.1 file instead of copying it.

The bkup script rotates three backup directories named bu.0, bu.1, and bu.2 and

calls rsync. The script removes bu.2, moves bu.1 to bu.2, and moves bu.0 to bu.1.

The first time you run the script, rsync copies all the files from memos because they

do not exist in bu.0 or bu.1.

$ cat bkup

rm -rf bu.2

mv bu.1 bu.2

mv bu.0 bu.1

rsync --archive --link-dest=../bu.1 memos/ bu.0

Before you run bkup for the first time, bu.0, bu.1, and bu.2 do not exist. Because of the –f option, rm does not display an error message when it tries to remove the nonexistent bu.2 directory. Until bkup creates bu.0 and bu.1, mv displays error messages saying there is No such file or directory.

In the following example, ls shows the bkup script and the contents of the memos

directory. After running bkup, ls shows the contents of memos and of the new bu.0

directory; bu.0 holds exact copies of the files in memos. The rsync utility created no

links because there were no files in bu.1: The directory did not exist.

$ ls -l *

-rwxr-xr-x 1 max max 87 Jul 22 11:23 bkup

memos:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

$ ./bkup

mv: cannot stat 'bu.1': No such file or directory

mv: cannot stat 'bu.0': No such file or directory

$ ls -l *

-rwxr-xr-x 1 max max 87 Jul 22 11:23 bkup

bu.0:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

memos:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

After working with the files in memos, ls shows 0606 has been modified, 0610 has been removed, and newfile has been added:

$ ls -l memos

total 20

-rw-r--r-- 1 max max 2100 Jul 22 14:31 0606

-rw-r--r-- 1 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5251 Jul 22 14:32 newfile

After running bkup again, bu.0 holds the same files as memos and bu.1 holds the files that bu.0 held before running bkup. The 0608 file has not changed, so rsync, with the ––link-dest=dir option, has not copied it but rather has made a link from the copy in bu.1 to the copy in bu.0, as indicated by the 2 that ls displays between the permissions and max.

$ ./bkup

mv: cannot stat 'bu.1': No such file or directory

$ ls -l bu.0 bu.1

bu.0:

total 20

-rw-r--r-- 1 max max 2100 Jul 22 14:31 0606

-rw-r--r-- 2 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5251 Jul 22 14:32 newfile

bu.1:

total 20

-rw-r--r-- 1 max max 1500 Jun 6 14:24 0606

-rw-r--r-- 2 max max 6001 Jun 8 16:16 0608

-rw-r--r-- 1 max max 5911 Jun 10 12:02 0610

The beauty of this setup is that each incremental backup occupies only the space

needed to hold files that have changed. Files that have not changed are stored as

links, which take up minimal disk space. Yet users and the system administrator

have access to a directory that appears to hold a full backup.

You can run a backup script such as bkup once an hour, once a day, or as often as

you like. Storage space permitting, you can have as many backup directories as you

like. If rsync does not require a password, you can automate this process by using

crontab (page 649).
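When a backup script runs unattended, the mv error messages from its first two runs are only noise. One possible variant, sketched here under the assumption that rsync is installed, tests for each directory before moving it. The function name rotate_and_backup is hypothetical; the bu.0, bu.1, and bu.2 names follow the bkup script.

```shell
#!/bin/sh
# Sketch: a bkup variant that stays quiet while bu.0 and bu.1 do not exist yet.
# rotate_and_backup is a made-up name (assumption), not part of rsync.
set -eu

if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; nothing to demonstrate" >&2
    exit 0
fi

# rotate_and_backup SRC: rotate bu.0 through bu.2 in the working directory,
# then copy SRC into a fresh bu.0.
rotate_and_backup() {
    src=$1
    rm -rf bu.2
    if [ -d bu.1 ]; then mv bu.1 bu.2; fi
    if [ -d bu.0 ]; then mv bu.0 bu.1; fi
    if [ -d bu.1 ]; then
        # A relative --link-dest is taken relative to the destination (bu.0).
        rsync --archive --link-dest=../bu.1 "$src/" bu.0
    else
        rsync --archive "$src/" bu.0
    fi
}

# Demonstration in a throwaway directory.
cd "$(mktemp -d)"
mkdir memos
echo "memo" > memos/0608

rotate_and_backup memos    # first run: creates bu.0
rotate_and_backup memos    # second run: bu.0 becomes bu.1; 0608 is linked

ls -l bu.0 bu.1
```

The link count of 2 that ls reports for 0608 in both directories shows that the second run linked the unchanged file instead of copying it.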

Chapter Summary

The rsync utility copies an ordinary file or directory hierarchy locally or between the

local system and a remote system on a network. By default, this utility uses OpenSSH to transfer files and the same authentication mechanism as OpenSSH; therefore it provides the same security as OpenSSH. The rsync utility prompts for a

password when it needs one.

Exercises

1. List three features of rsync.

2. Write an rsync command that copies the backmeup directory from your home

directory on the local system to the /tmp directory on coffee, preserving file

ownership, permissions, and modification times. Write a command that will

copy the same directory to your home directory on coffee. Do not assume

your working directory on the local system is your home directory.

3. You are writing an rsync command that includes the ––delete option.

Which options would you use to test the command without copying or

removing any files?

4. What does the ––archive option do? Why is it useful?

5. When running a script such as bkup (page 592) to back up files on a

remote system, how could you rotate (rename) files on a remote system?

6. What effect does a trailing slash (/) on the source-file have?

Command Reference

The following tables list the utilities covered in this part of the book grouped by function and alphabetically within function. Although most of these are true utilities (programs that are separate from the shells), some are built into the shells (shell builtins). The sample utility on page 605 shows the format of the description of each utility in this part of the book.

Utilities That Display and Manipulate Files

aspell Checks a file for spelling errors—page 607

bzip2 Compresses or decompresses files—page 615

cat Joins and displays files—page 618

cmp Compares two files—page 634

comm Compares sorted files—page 636

cp Copies files—page 640

cpio Creates an archive, restores files from an archive, or copies a directory

hierarchy—page 644

cut Selects characters or fields from input lines—page 652

dd Converts and copies a file—page 658

diff Displays the differences between two text files—page 663

ditto Copies files and creates and unpacks archives—page 671 O

emacs Editor—page 205

find Finds files based on criteria—page 688

fmt Formats text very simply—page 697

grep Searches for a pattern in files—page 719

gzip Compresses or decompresses files—page 724

head Displays the beginning of a file—page 727

less Displays text files, one screen at a time—page 735

ln Makes a link to a file—page 740

lpr Sends files to printers—page 742

ls Displays information about one or more files—page 745

man Displays documentation for commands—page 759

mkdir Creates a directory—page 763

mv Renames or moves a file—page 771

od Dumps the contents of a file—page 776

open Opens files, directories, and URLs—page 780 O

otool Displays object, library, and executable files—page 782 O

paste Joins corresponding lines from files—page 784

pax Creates an archive, restores files from an archive, or copies a directory hierarchy—page 786

plutil Manipulates property list files—page 792 O

pr Paginates files for printing—page 794

rm Removes a file (deletes a link)—page 804

rmdir Removes directories—page 806

sed Edits a file noninteractively—page 565

sort Sorts and/or merges files—page 817

split Divides a file into sections—page 826

strings Displays strings of printable characters—page 837

tail Displays the last part (tail) of a file—page 843

tar Stores or retrieves files to/from an archive file—page 846

touch Creates a file or changes a file’s access and/or modification time—page 862

uniq Displays unique lines—page 872

vim Editor—page 149

wc Displays the number of lines, words, and bytes—page 876

Network Utilities

ftp Transfers files over a network—page 704

rcp Copies one or more files to or from a remote system—page 800

rlogin Logs in on a remote system—page 803

rsh Executes commands on a remote system—page 807

rsync Copies files and directory hierarchies securely over a network—page 583

scp Securely copies one or more files to or from a remote system—page 810

ssh Securely executes commands on a remote system—page 828

telnet Connects to a remote system over a network—page 852

Utilities That Display and Alter Status

cd Changes to another working directory—page 620

chgrp Changes the group associated with a file—page 622

chmod Changes the access mode (permissions) of a file—page 626

chown Changes the owner of a file and/or the group the file is associated

with—page 631

date Displays or sets the system time and date—page 655

df Displays disk space usage—page 661

dscl Displays and manages Directory Service information—page 674 O

dmesg Displays kernel messages—page 673

du Displays information on disk usage by directory hierarchy and/or

file—page 677

file Displays the classification of a file—page 686

finger Displays information about users—page 695

GetFileInfo Displays file attributes—page 717 O

kill Terminates a process by PID—page 729

killall Terminates a process by name—page 731

nice Changes the priority of a command—page 773

nohup Runs a command that keeps running after you log out—page 775

ps Displays process status—page 796

renice Changes the priority of a process—page 802

SetFile Sets file attributes—page 813 O

sleep Creates a process that sleeps for a specified interval—page 815

stat Displays information about files—page 835

stty Displays or sets terminal parameters—page 838

sysctl Displays and alters kernel variables—page 842 O

top Dynamically displays process status—page 858

umask Establishes the file-creation permissions mask—page 870

w Displays information about system users—page 874

which Shows where in PATH a command is located—page 877

who Displays information about logged-in users—page 879

Utilities That Are Programming Tools

awk Searches for and processes patterns in a file—page 531

configure Configures source code automatically—page 638

gawk Searches for and processes patterns in a file—page 531

gcc Compiles C and C++ programs—page 712

make Keeps a set of programs current—page 753

mawk Searches for and processes patterns in a file—page 531

Miscellaneous Utilities

at Executes commands at a specified time—page 611

cal Displays a calendar—page 617

crontab Maintains crontab files—page 649

diskutil Checks, modifies, and repairs local volumes—page 668 O

echo Displays a message—page 680

expr Evaluates an expression—page 682

fsck Checks and repairs a filesystem—page 699

launchctl Controls the launchd daemon—page 733 O

mkfs Creates a filesystem on a device—page 764

Mtools Uses DOS-style commands on files and directories—page 767

tee Copies standard input to standard output and one or more files—page 851

test Evaluates an expression—page 854

tr Replaces specified characters—page 864

tty Displays the terminal pathname—page 867

tune2fs Changes parameters on an ext2, ext3, or ext4 filesystem—page 868

xargs Converts standard input to command lines—page 881

Standard Multiplicative Suffixes

Several utilities allow you to use the suffixes listed in Table V-1 following byte counts. You can precede a multiplicative suffix with a number that is a multiplier. For example, 5K means 5 × 2^10. The absence of a multiplier indicates that the multiplicative suffix is to be multiplied by 1. The utilities that allow these suffixes are marked as such.

BLOCKSIZE Under Mac OS X, some utilities use the BLOCKSIZE environment variable to set a

default block size. You can set BLOCKSIZE to a value that is a number of bytes or

to a value that uses one of the K, M, or G suffixes. The text identifies utilities that

use BLOCKSIZE.
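The difference between the power-of-two and power-of-ten suffixes can be worked out with shell arithmetic. This is only an illustrative calculation; the variable names are hypothetical.

```shell
#!/bin/sh
# Sketch: byte counts that the K, KB, and M suffixes stand for.
# Variable names are made up for this example (assumptions).
k=$(( 1 << 10 ))      # K  is 2^10 = 1,024
kb=1000               # KB is 10^3 = 1,000
m=$(( 1 << 20 ))      # M  is 2^20 = 1,048,576

five_k=$(( 5 * k ))   # 5K:  the multiplier 5 times the K suffix
five_kb=$(( 5 * kb )) # 5KB: the multiplier 5 times the KB suffix

echo "5K  = $five_k bytes"
echo "5KB = $five_kb bytes"
echo "M   = $m bytes"
```

The first two lines of output differ by 120 bytes (5,120 versus 5,000), which is why it matters whether a suffix ends in B.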

Common Options

Several GNU utilities share the options listed in Table V-2. The utilities that use

these options are marked as such.

Table V-1 Multiplicative suffixes

Suffix  Multiplicative value        Suffix  Multiplicative value

KB      1,000 (10^3)                PB      10^15

K       1,024 (2^10)                P       2^50

MB      1,000,000 (10^6)            EB      10^18

M       1,048,576 (2^20)            E       2^60

GB      1,000,000,000 (10^9)        ZB      10^21

G       1,073,741,824 (2^30)        Z       2^70

TB      10^12                       YB      10^24

T       2^40                        Y       2^80

Table V-2 Common command-line options

Option Effect

– A single hyphen appearing in place of a filename instructs the utility to accept standard input in place of the file.

–– A double hyphen marks the end of the options on a command line. You can follow this option with an argument that begins with a hyphen. Without this option the utility assumes that an argument that begins with a hyphen is an option.

––help Displays a help message for the utility. Some of these messages are quite long; using a pipe, you can send the output through less to display it one screen at a time. For example, you could give the command ls ––help | less. Alternatively, you can send the output through a pipe to grep if you are looking for specific information. For example, you could give the following command to get information on the –d option to ls: ls ––help | grep –– –d. See the preceding entry in this table for information on the double hyphen.

––version Displays version information for the utility.
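The double hyphen entry is easiest to appreciate with a filename or pattern that begins with a hyphen. The following sketch works in a temporary directory; the file named -d and the two sample text lines fed to grep are made up for illustration.

```shell
#!/bin/sh
# Sketch: using -- so an argument that begins with a hyphen is not read as an option.
# The file name and the sample text are invented for this example (assumptions).
set -eu

work=$(mktemp -d)
cd "$work"

# Create a file whose name begins with a hyphen.
echo "sample" > ./-d

# After --, ls and rm treat -d as a filename, not as the -d option.
listed=$(ls -- -d)
echo "$listed"
rm -- -d

# The same applies to a grep pattern that begins with a hyphen.
match=$(printf '%s\n' '-a  all entries' '-d  directories only' | grep -- -d)
echo "$match"
```

An alternative for filenames is to give a pathname such as ./-d, which does not begin with a hyphen; the double hyphen also covers cases such as the grep pattern, where that is not possible.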

The sample Utility

The following description of the sample utility shows the format that this part of the

book uses to describe the utilities. These descriptions are similar to the man page

descriptions (pages 33 and 759); however, most users find the descriptions in this

book easier to read and understand. They emphasize the most useful features of the

utilities and often leave out the more obscure features. For information about the

less commonly used features, refer to the man and info pages or call the utility with

the ––help option, which works with many utilities.

sample O ← OS X in an oval indicates this utility runs under Mac OS X only.

Brief description of what the utility does

sample [options] arguments

Following the syntax line is a description of the utility. The syntax line shows how to run the utility

from the command line. Options and arguments enclosed in brackets ([]) are not required. Enter

words that appear in this italic typeface as is. Words that you must substitute when you type appear

in this bold italic typeface. Words listed as arguments to a command identify single arguments (for

example, source-file) or groups of similar arguments (for example, directory-list).

Arguments This section describes the arguments you can use when you run the utility. The

argument itself, as shown in the preceding syntax line, is printed in this bold italic

typeface.

Options This section lists some of the options you can use with the command. Unless otherwise specified, you must precede options with one or two hyphens. Most commands accept a single hyphen before multiple options (page 119). Options in this section are ordered alphabetically by short (single-hyphen) options. If an option has only a long version (two hyphens), it is ordered by its long option. Following are some sample options:

––delimiter=dchar

–d dchar

This option includes an argument. The argument is set in a bold italic typeface in both the heading and the description. You substitute another word (filename, string of characters, or other value) for any arguments you see in this typeface. Type characters that are in bold type (such as the ––delimiter and –d) as is.

––make-dirs –m This option has a long and a short version. You can use either option; they

are equivalent. This option description ends with Linux in a box, indicating

it is available under Linux only. Options not followed by Linux or OS X are

available under both operating systems. L

–t (table of contents) This simple option is preceded by a single hyphen and not

followed by arguments. It has no long version. The table of contents appearing

in parentheses at the beginning of the description is a cue, suggestive of what

the option letter stands for. This option description ends with OS X in a box,

indicating it is available under OS X only. Options not followed by Linux or

OS X are available under both operating systems. O
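The single-hyphen rule described under Options is easy to check for yourself. The following sketch uses ls with its –l and –a options as a stand-in, because sample is not a real utility; the scratch directory and the filenames are made up for the illustration.

```shell
# Scratch directory holding one ordinary file and one hidden file
# (both names are illustrative).
tmp=$(mktemp -d)
: > "$tmp/visible"
: > "$tmp/.hidden"

separate=$(ls -l -a "$tmp")   # each option preceded by its own hyphen
combined=$(ls -la "$tmp")     # one hyphen before both options

if [ "$separate" = "$combined" ]; then
    echo "ls -l -a and ls -la produce the same listing"
fi
rm -rf "$tmp"
```

Not every utility accepts combined options, so check the Options section of a utility before relying on this shortcut.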


Discussion This optional section describes how to use the utility and identifies any quirks it

may have.

Notes This section contains miscellaneous notes—some important and others merely

interesting.

Examples This section contains examples illustrating how to use the utility. This section is a

tutorial, so it takes a more casual tone than the preceding sections of the description.


aspell

Checks a file for spelling errors

aspell check [options] filename

aspell list [options] < filename

aspell config

aspell help

The aspell utility checks the spelling of words in a document against a standard dictionary. You can

use aspell interactively: It displays each misspelled word in context, together with a menu that gives

you the choice of accepting the word as is, choosing one of aspell’s suggested replacements for the

word, inserting the word into your personal dictionary, or replacing the word with one you enter.

You can also use aspell in batch mode so it reads from standard input and writes to standard output.

The aspell utility is available under Linux only. L

Actions You must choose one and only one action when you run aspell.

check –c Runs aspell as an interactive spelling checker. Input comes from a single file

named on the command line. Refer to “Discussion” on page 608.

config Displays aspell’s configuration, both default and current values. Send the out-

put through a pipe to less for easier viewing, or use grep to find the option you

are looking for (for example, aspell config | grep backup).

help –? Displays an extensive page of help. Send the output through a pipe to less for

easier viewing.

list –l Runs aspell in batch mode (noninteractively) with input coming from standard

input and output going to standard output.

Arguments The filename is the name of the file you want to check. The aspell utility accepts this argument only when you use the check (–c) action. With the list (–l) action, input must come from standard input.

Options The aspell utility has many options. A few of the more commonly used ones are

listed in this section; see the aspell man page for a complete list. Default values of

many options are determined when aspell is compiled (see the config action).

You can specify options on the command line, as the value of the ASPELL_CONF shell variable, or in your personal configuration file (~/.aspell.conf). A user working with root privileges can create a global configuration file (/etc/aspell.conf). Put one option per line in a configuration file; separate options with a semicolon (;) in ASPELL_CONF. Options on the command line override those in ASPELL_CONF, which override those in your personal configuration file, which override those in the global configuration file.

aspell is not like other utilities regarding its input

tip Unlike many other utilities, aspell does not accept input from standard input when you do not specify a filename on the command line. Instead, the action specifies where aspell gets its input.

The following list contains two types of options: Boolean and value. The Boolean

options turn a feature on (enable the feature) or off (disable the feature). Precede a

Boolean option with dont– to turn it off. For example, ––ignore-case turns the

ignore-case feature on and ––dont-ignore-case turns it off.

Value options assign a value to a feature. Follow the option with an equal sign and

a value—for example, ––ignore=4.

For all options in a configuration file or in the ASPELL_CONF variable, drop the

leading hyphens (ignore-case or dont-ignore-case).

––dont-backup Does not create a backup file named filename.bak (default is ––backup when

action is check).

––ignore=n Ignores words with n or fewer characters (default is 1).

––ignore-case Ignores the case of letters in words being checked (default is ––dont-ignore-case).

––lang=cc Specifies the two-letter language code (cc). The language code defaults to the

value of LC_MESSAGES (page 304).

––mode=mod Specifies a filter to use. Select mod from url (default), none, sgml, and others.

The modes work as follows: url: skips URLs, hostnames, and email addresses;

none: turns off all filters; sgml: skips SGML, HTML, XHTML, and XML

commands.

––strip-accents Removes accent marks from all the words in the dictionary before checking

words (default is ––dont-strip-accents).

Discussion The aspell utility has two basic modes of operation: batch and interactive. You

specify batch mode by using the list or –l action. In batch mode, aspell accepts the

document you want to check for spelling errors as standard input and sends the list

of potentially misspelled words to standard output.

You specify interactive mode by using the check or –c action. In interactive mode, aspell displays a screen with the potentially misspelled word highlighted in context and a menu of choices. See “Examples” for an illustration. The menu includes various commands (Table V-3) as well as some suggestions of similar, correctly spelled words. You either enter one of the numbers from the menu to select a suggested word to replace the word in question or enter a letter to give a command.

aspell options and leading hyphens

caution The way you specify options differs depending on whether you are specifying them on the command line, using the ASPELL_CONF shell variable, or in a configuration file.

On the command line, prefix long options with two hyphens (for example, ––ignore-case or ––dont-ignore-case). In ASPELL_CONF and configuration files, drop the leading hyphens (for example, ignore-case or dont-ignore-case).

Notes For more information refer to the aspell home page located at aspell.net and to the

/usr/share/doc/aspell directory.

The aspell utility is not a foolproof way of finding spelling errors. It also does not

check for misused, properly spelled words (such as red instead of read).

Spelling from emacs You can make it easy to use aspell from emacs by adding the following line to your ~/.emacs file (page 250). This line causes emacs’ ispell functions to call aspell:

(setq-default ispell-program-name "aspell")

Spelling from vim Similarly, you can make it easy to use aspell from vim by adding the following line to

your ~/.vimrc file (page 185):

map ^T :w!<CR>:!aspell check %<CR>:e! %<CR>

When you enter this line in ~/.vimrc using vim, enter the ^T as CONTROL-V CONTROL-T

(page 169). With this line in ~/.vimrc, CONTROL-T brings up aspell to spell check the file

you are editing with vim.

Examples The following examples use aspell to correct the spelling in the memo.txt file:

$ cat memo.txt

Here's a document for teh aspell utilitey

to check. It obviosly needs proofing

quiet badly.

Table V-3 aspell commands

Command     Action
SPACE       Takes no action and goes on to the next misspelled word.
n           Replaces the misspelled word with suggested word number n.
a           Adds the “misspelled” word to your personal dictionary.
b           Aborts aspell; does not save changes.
i or I      (letter “i”) Ignores the misspelled word. I (uppercase “I”) ignores all
            occurrences of this word; i ignores this occurrence only and is the
            same as SPACE.
l           (lowercase “l”) Changes the “misspelled” word to lowercase and adds it
            to your personal dictionary.
r or R      Replaces the misspelled word with the word you enter at the bottom of
            the screen. R replaces all occurrences of this word; r replaces this
            occurrence only.
x           Saves the file as corrected so far and exits from aspell.


The first example uses aspell with the check action and no options. The appearance

of the screen for the first misspelled word, teh, is shown. At the bottom of the

screen is the menu of commands and suggested words. Each of the numbered words

differs slightly from the misspelled word:

$ aspell check memo.txt

Here's a document for teh aspell utilitey

to check. It obviosly needs proofing

quiet badly.

============================================================

1) the 6) th

2) Te 7) tea

3) tech 8) tee

4) Th 9) Ted

5) eh 0) tel

i) Ignore I) Ignore all

r) Replace R) Replace all

a) Add l) Add Lower

b) Abort x) Exit

============================================================

Enter one of the menu choices in response to the preceding display; aspell will do

your bidding and move the highlight to the next misspelled word (unless you choose

to abort or exit). In this case, entering 1 (one) would change teh to the in the file.

The next example uses the list action to display a list of misspelled words. The word

quiet is not in the list—it is not properly used but is properly spelled.

$ aspell list < memo.txt

teh

aspell

utilitey

obviosly

The last example also uses the list action. It shows a quick way to check the spelling

of a word or two with a single command. The user gives the aspell list command

and then enters seperate temperature into aspell’s standard input (the keyboard).

After the user presses RETURN and CONTROL-D (to mark the end of file), aspell writes the

misspelled word to standard output (the screen):

$ aspell list

seperate temperature RETURN

CONTROL-D

seperate
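In a script, a pipe can take the place of the keyboard and CONTROL-D. The following sketch wraps that idea in a shell function. The function name misspelled is made up for this illustration, and what aspell reports depends on the dictionary installed on the system.

```shell
# misspelled WORD...
# Hypothetical helper: sends its arguments to aspell list, one word per line,
# and displays the words aspell does not find in its dictionary.
misspelled() {
    printf '%s\n' "$@" | aspell list
}

# Run the check only where aspell is installed.
if command -v aspell >/dev/null 2>&1; then
    misspelled seperate temperature
fi
```

Because the list action reads standard input, no filename appears on the aspell command line.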

at

Executes commands at a specified time

at [options] time [date | +increment]

atq

atrm job-list

batch [options] [time]

The at and batch utilities execute commands at a specified time. They accept commands from stan-

dard input or, with the –f option, from a file. Commands are executed in the same environment as

the at or batch command. Unless redirected, standard output and standard error from commands

are emailed to the user who ran the at or batch command. A job is the group of commands that is

executed by one call to at. The batch utility differs from at in that it schedules jobs so they run when

the CPU load on the system is low.

The atq utility displays a list of at jobs you have queued; atrm cancels pending at jobs.

Arguments The time is the time of day when at runs the job. You can specify the time as a

one-, two-, or four-digit number. One- and two-digit numbers specify an hour,

and four-digit numbers specify an hour and minute. You can also give the time in

the form hh:mm. The at utility assumes a 24-hour clock unless you place am or

pm immediately after the number, in which case it uses a 12-hour clock. You can

also specify time as now, midnight, noon, or teatime (4:00 PM).

The date is the day of the week or day of the month when you want at to execute

the job. When you do not specify a day, at executes the job today if the hour you

specify in time is greater than the current hour. If the hour is less than the current

hour, at executes the job tomorrow.

You specify a day of the week by spelling it out or abbreviating it to three letters.

You can also use the words today and tomorrow. Use the name of a month followed

by the number of the day in the month to specify a date. You can follow the month

and day number with a year.

The increment is a number followed by one of the following (both plural and singular are allowed): minutes, hours, days, or weeks. The at utility adds the increment to time. You cannot specify an increment for a date.

When using atrm, job-list is a list of one or more at job numbers. You can list job numbers by running at with the –l option or by using atq.

Options The batch utility accepts options under OS X only. The –c, –d, and –l options are

not for use when you initiate a job with at; use these options to determine the status

of a job or to cancel a job only.


–c job-list

(cat) Displays the environment and commands specified by the job numbers in

job-list.

–d job-list

(delete) Cancels jobs that you submitted with at. The job-list is a list of one or

more at job numbers to cancel. If you do not remember the job number, use the

–l option or run atq to list your jobs and their numbers. Using this option with

at has the same effect as running atrm. This option is deprecated under OS X;

use the –r option instead.

–f file

(file) Specifies that commands come from file instead of standard input. This

option is useful for long lists of commands or commands that are executed

repeatedly.

–l (list) Displays a list of your at jobs. Using this option with at has the same

effect as running atq.

–m (mail) Sends you email after a job is run, even when nothing is sent to standard

output or standard error. When a job generates output, at always emails it to

you, regardless of this option.

–r job-list

(remove) Same as the –d option. O

Notes The at utility uses /bin/sh to execute commands. Under Linux, this file is typically a

link to bash or dash.

The shell saves the environment variables and the working directory at the time you

submit an at job so they are available when at executes commands.

at.allow and at.deny A user running with root privileges can always use at. The Linux /etc/at.allow (OS

X uses /var/at/at.allow) and Linux /etc/at.deny (OS X uses /var/at/at.deny) files,

which should be readable and writable by root only (mode 600), control which

ordinary, local users can use at. When at.deny exists and is empty, all users can use

at. When at.deny does not exist, only those users listed in at.allow can use at. Users

listed in at.deny cannot use at unless they are also listed in at.allow.
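The rules above can be sketched as a small shell function. This is an illustration of the decision logic only, not how at itself is implemented: the name may_use_at is made up, and the sketch assumes that at.allow, when it exists, is the only file consulted (an assumption that agrees with the three rules stated above). It works on scratch copies of the files, so it is safe to run as an ordinary user.

```shell
# may_use_at USER ALLOW_FILE DENY_FILE
# Hypothetical helper (not part of at): displays yes or no.
# Assumption: when at.allow exists, it alone decides.
may_use_at() {
    if [ "$1" = root ]; then
        echo yes                        # root can always use at
    elif [ -f "$2" ]; then
        # at.allow exists: only the users it lists can use at
        if grep -qxF -- "$1" "$2"; then echo yes; else echo no; fi
    elif [ -f "$3" ]; then
        # only at.deny exists: everyone except the users it lists
        if grep -qxF -- "$1" "$3"; then echo no; else echo yes; fi
    else
        echo no                         # neither file exists
    fi
}

# Try the rules on scratch copies, not on the real files.
tmp=$(mktemp -d)
: > "$tmp/at.deny"                      # empty at.deny, no at.allow
empty_deny=$(may_use_at sam "$tmp/at.allow" "$tmp/at.deny")
echo sam > "$tmp/at.deny"               # sam is listed in at.deny
denied=$(may_use_at sam "$tmp/at.allow" "$tmp/at.deny")
echo sam > "$tmp/at.allow"              # sam is also listed in at.allow
both=$(may_use_at sam "$tmp/at.allow" "$tmp/at.deny")
echo "$empty_deny $denied $both"
rm -rf "$tmp"
```

The three results correspond, in order, to the three sentences that describe at.deny and at.allow.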

Under Linux, jobs you submit using at are run by the atd daemon. This daemon

stores jobs in /var/spool/at or /var/spool/cron/atjobs and stores their output in

/var/spool/at/spool or /var/spool/cron/atspool. These files should be set to mode

700 and owned by the user named daemon.

Under Mac OS X, jobs you submit using at are run by atrun, which is called every 30

seconds by launchd. The atrun utility stores jobs in /var/at/jobs and stores their out-

put in /var/at/spool, both of which should be set to mode 700 and owned by the

user named daemon.


Under OS X 10.4 and later, the atrun daemon is disabled by default. Working as a

privileged user, you can enable and disable atrun with the following commands:

# launchctl load -w /System/Library/LaunchDaemons/com.apple.atrun.plist

# launchctl unload -w /System/Library/LaunchDaemons/com.apple.atrun.plist

See launchctl (page 733) for more information.

Examples You can use any of the following techniques to paginate and print long_file tomorrow

at 2:00 AM. The first example executes the command directly from the command line;

the last two examples use the pr_tonight file, which contains the necessary command,

and execute that command using at. Prompts and output from different versions of at

differ.

$ at 2am

at> pr long_file | lpr

at>CONTROL-D<EOT>

job 8 at 2009-08-17 02:00

$ cat pr_tonight

#!/bin/bash

pr long_file | lpr

$ at -f pr_tonight 2am

job 9 at 2009-08-17 02:00

$ at 2am < pr_tonight

job 10 at 2009-08-17 02:00

If you execute commands directly from the command line, you must signal the end of

the commands by pressing CONTROL-D at the beginning of a line. After you press CONTROL-D,

at displays a line that begins with job followed by the job number and the time at will

execute the job.

If you run atq after the preceding commands, it displays a list of jobs in its queue:

$ atq

8 2009-08-17 02:00 a

9 2009-08-17 02:00 a

10 2009-08-17 02:00 a

The following command removes job number 9 from the queue:

$ atrm 9

$ atq

8 2009-08-17 02:00 a

10 2009-08-17 02:00 a

The next example executes cmdfile at 3:30 PM (1530 hours) one week from today:

$ at -f cmdfile 1530 +1 week

job 12 at 2009-08-23 15:30


The next at command executes a job at 7 PM on Thursday. This job uses find to create

an intermediate file, redirects the output sent to standard error, and prints the file.

$ at 7pm Thursday

at> find / -name "core" -print >report.out 2>report.err

at> lpr report.out

at>CONTROL-D<EOT>

job 13 at 2009-08-18 19:00

The final example shows some of the output generated by the –c option when at is

queried about the preceding job. Most of the lines show the environment; only the

last few lines execute the commands:

$ at -c 13

#!/bin/sh

# atrun uid=500 gid=500

# mail sam 0

umask 2

PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:.;

export PATH

PWD=/home/sam/book.examples/99/cp; export PWD

EXINIT=set\ ai\ aw; export EXINIT

LANG=C; export LANG

PS1=\\\$\ ; export PS1

...

cd /home/sam/book\.examples/99/cp || {

echo 'Execution directory inaccessible' >&2

exit 1

}

find / -name "core" -print >report.out 2>report.err

lpr report.out


bzip2

Compresses or decompresses files

bzip2 [options] [file-list]

bunzip2 [options] [file-list]

bzcat [options] [file-list]

bzip2recover [file]

The bzip2 utility compresses files; bunzip2 restores files compressed with bzip2; and bzcat displays files

compressed with bzip2.

Arguments The file-list is a list of one or more files (no directories) that are to be compressed or

decompressed. If file-list is empty or if the special option – is present, bzip2 reads

from standard input. The ––stdout option causes bzip2 to write to standard output.

Options Under Linux, bzip2, bunzip2, and bzcat accept the common options described on

page 603.

––stdout –c Writes the results of compression or decompression to standard output.

––decompress –d Decompresses a file compressed with bzip2. This option with bzip2 is equivalent

to the bunzip2 command.

––fast or ––best –n Sets the block size when compressing a file. The n is a digit from 1 to 9, where 1 (––fast) generates a block size of 100 kilobytes and 9 (––best) generates a block size of 900 kilobytes. The default level is 9. The options ––fast and ––best are provided for compatibility with gzip and do not necessarily yield the fastest or best compression.

––force –f Forces compression even if a file already exists, has multiple links, or comes

directly from a terminal. The option has a similar effect with bunzip2.

––keep –k Does not delete input files while compressing or decompressing them.

––quiet –q Suppresses warning messages; does display critical messages.

––test –t Verifies the integrity of a compressed file. Displays nothing if the file is OK.

––verbose –v For each file being compressed, displays the name of the file, the compression

ratio, the percentage of space saved, and the sizes of the decompressed and

compressed files.

The Mac OS X version of bzip2 accepts long options

tip Options for bzip2 preceded by a double hyphen (––) work under Mac OS X as well as under

Linux.


Discussion The bzip2 and bunzip2 utilities work similarly to gzip and gunzip; see the discussion

of gzip (page 725) for more information. Normally bzip2 does not overwrite a file;

you must use ––force to overwrite a file during compression or decompression.

Notes See page 62 for additional information on and examples of tar.

The bzip2 home page is bzip.org.

The bzip2 utility does a better job of compressing files than does gzip.

Use the ––bzip2 modifier with tar (page 847) to compress archive files with bzip2.

bzcat file-list Works like cat except it uses bunzip2 to decompress file-list as it copies files to stan-

dard output.

bzip2recover Attempts to recover a damaged file that was compressed with bzip2.

Examples In the following example, bzip2 compresses a file and gives the resulting file the

same name with a .bz2 filename extension. The –v option displays statistics about

the compression.

$ ls -l

total 728

-rw-r--r-- 1 sam sam 737414 Feb 20 19:05 bigfile

$ bzip2 -v bigfile

bigfile: 3.926:1, 2.037 bits/byte, 74.53% saved, 737414 in, 187806 out

$ ls -l

total 188

-rw-r--r-- 1 sam sam 187806 Feb 20 19:05 bigfile.bz2

Next touch creates a file with the same name as the original file; bunzip2 refuses to

overwrite the file in the process of decompressing bigfile.bz2. The ––force option

enables bunzip2 to overwrite the file.

$ touch bigfile

$ bunzip2 bigfile.bz2

bunzip2: Output file bigfile already exists.

$ bunzip2 --force bigfile.bz2

$ ls -l

total 728

-rw-r--r-- 1 sam sam 737414 Feb 20 19:05 bigfile
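The following sketch strings several of the options together in a script: ––keep retains the original file, ––test verifies the compressed file, and bzcat reads the compressed data without creating a file. The scratch directory and the numbers file are made up for the illustration, and the script does nothing when bzip2 is not installed.

```shell
if command -v bzip2 >/dev/null 2>&1; then
    tmp=$(mktemp -d)
    # Build a sample file of 2,000 short lines.
    i=1
    while [ "$i" -le 2000 ]; do
        echo "line $i"
        i=$((i + 1))
    done > "$tmp/numbers"

    bzip2 --keep "$tmp/numbers"               # creates numbers.bz2, keeps numbers
    if bzip2 --test "$tmp/numbers.bz2"; then  # displays nothing when the file is OK
        verified=yes
    fi
    # bzcat decompresses to standard output; cmp compares that with the original.
    if bzcat "$tmp/numbers.bz2" | cmp -s - "$tmp/numbers"; then
        identical=yes
    fi
    kept=$(ls "$tmp")
    rm -rf "$tmp"
    echo "verified=$verified identical=$identical"
fi
```

Without ––keep, bzip2 would have removed numbers after creating numbers.bz2, as the first example in this section shows.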


cal

Displays a calendar

cal [options] [[month] year]

The cal utility displays a calendar.

Arguments The arguments specify the month and year for which cal displays a calendar. The

month is a decimal integer from 1 to 12 and the year is a decimal integer. Without

any arguments, cal displays a calendar for the current month. When you specify a

single argument, it is taken to be the year.

Options –j (Julian) Displays Julian days—a calendar that numbers the days consecutively

from January 1 (1) through December 31 (365 or 366).

–m (Monday) Makes Monday the first day of the week. Without this option,

Sunday is the first day of the week. L

–y (year) Displays a calendar for the current year. L

–3 (three months) Displays the previous, current, and next months. L

Notes Do not abbreviate the year. The year 05 is not the same as 2005.

The ncal (new cal) utility displays a more compact calendar.

Examples The following command displays a calendar for August 2007:

$ cal 8 2007
    August 2007
Su Mo Tu We Th Fr Sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31

Next is a Julian calendar for 1949:

$ cal -j 1949

                          1949

          January                     February
 Su  Mo  Tu  We  Th  Fr  Sa   Su  Mo  Tu  We  Th  Fr  Sa
                          1           32  33  34  35  36
  2   3   4   5   6   7   8   37  38  39  40  41  42  43
  9  10  11  12  13  14  15   44  45  46  47  48  49  50
 16  17  18  19  20  21  22   51  52  53  54  55  56  57
 23  24  25  26  27  28  29   58  59
 30  31
...


cat

Joins and displays files

cat [options] [file-list]

The cat utility copies files to standard output. You can use cat to display the contents of one or more

text files on the screen.

Arguments The file-list is a list of the pathnames of one or more files that cat processes. If you

do not specify an argument or if you specify a hyphen (–) in place of a filename, cat

reads from standard input.

Options Under Linux, cat accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––show-all –A Same as –vET. L

––number-nonblank

–b Numbers all lines that are not blank as they are written to standard output.

–e (end) Same as –vE.

––show-ends –E Marks the ends of lines with dollar signs. L

––number –n Numbers all lines as they are written to standard output.

––squeeze-blank –s Removes extra blank lines so there are never two or more blank lines in a row.

–t (tab) Same as –vT.

––show-tabs –T Displays TABs as ^I. L

––show-nonprinting

–v Displays CONTROL characters with the caret notation (^M) and displays characters

that have the high bit set (META characters) with the M- notation (page 216).

This option does not convert TABs and LINEFEEDs. Use –T (––show-tabs) if you want to display TABs as ^I. LINEFEEDs cannot be displayed as anything but themselves; otherwise, the line would be too long.
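A short experiment shows what several of these options do to the same input. The sketch below uses only single-letter options that the list above describes as working under both Linux and OS X; the sample text is made up for the illustration.

```shell
# Four lines of input: text containing a TAB, two blank lines, more text.
sample=$(printf 'a\tb\n\n\nc')

numbered=$(printf '%s\n' "$sample" | cat -n)    # number every line
nonblank=$(printf '%s\n' "$sample" | cat -b)    # number nonblank lines only
squeezed=$(printf '%s\n' "$sample" | cat -s)    # collapse the run of blank lines
visible=$(printf '%s\n' "$sample" | cat -et)    # TABs as ^I, line ends as $

printf '%s\n' "$visible"
```

Displaying the other variables in the same way shows the effect of each option in turn.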

Notes See page 125 for a discussion of cat, standard input, and standard output.

Use the od utility (page 776) to display the contents of a file that does not contain

text (for example, an executable program file).

Use the tac utility to display lines of a text file in reverse order. See the tac info page

for more information.


The name cat is derived from one of the functions of this utility, catenate, which

means to join together sequentially, or end to end.

Examples The following command displays the contents of the memo text file on the terminal:

$ cat memo

...

The next example catenates three text files and redirects the output to the all file:

$ cat page1 letter memo > all

You can use cat to create short text files without using an editor. Enter the following

command line, type the text you want in the file, and press CONTROL-D on a line by

itself:

$ cat > new_file

...

(text)

...

CONTROL-D

In this case cat takes input from standard input (the keyboard) and the shell redi-

rects standard output (a copy of the input) to the file you specify. The CONTROL-D sig-

nals the EOF (end of file) and causes cat to return control to the shell.
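In a script, where no one is at the keyboard to press CONTROL-D, a here document can supply the text instead. A minimal sketch follows; the scratch directory and the text are made up for the illustration.

```shell
tmp=$(mktemp -d)

# The shell feeds the lines up to EOF to cat's standard input and
# redirects cat's standard output to new_file.
cat > "$tmp/new_file" <<'EOF'
first line of text
second line of text
EOF

contents=$(cat "$tmp/new_file")
printf '%s\n' "$contents"
rm -rf "$tmp"
```

The line holding only EOF plays the role of CONTROL-D: it marks the end of the input.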

In the next example, a pipe sends the output from who to standard input of cat. The

shell redirects cat’s output to the file named output; after the commands have fin-

ished executing, output contains the contents of the header file, the output of who,

and footer. The hyphen on the command line causes cat to read standard input after

reading header and before reading footer.

$ who | cat header - footer > output
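Because the output of who differs from system to system, the following sketch repeats the technique with printf standing in for who. The contents of header and footer are made up for the illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'report start' > "$tmp/header"
printf '%s\n' 'report end'   > "$tmp/footer"

# printf stands in for who; the hyphen marks where its output belongs.
printf 'sam\nmax\n' | cat "$tmp/header" - "$tmp/footer" > "$tmp/output"

output=$(cat "$tmp/output")
printf '%s\n' "$output"
rm -rf "$tmp"
```

The lines from the pipe land between the two files, in the position the hyphen occupies on the command line.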

Set noclobber to avoid overwriting a file

caution Despite cat’s warning message, the shell destroys the input file (letter) before invoking cat in the

following example:

$ cat memo letter > letter

cat: letter: input file is output file

You can prevent overwriting a file in this situation by setting the noclobber variable (pages 129

and 377).
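A sketch of the protection at work under bash or dash follows. It uses scratch copies of memo and letter with made-up contents. With noclobber set, the shell refuses the redirection and displays an error message, so cat never runs and letter survives.

```shell
tmp=$(mktemp -d)
echo memo > "$tmp/memo"
echo original > "$tmp/letter"

set -o noclobber                 # bash and dash syntax
# The shell refuses to overwrite letter, so the command fails.
if ! cat "$tmp/memo" "$tmp/letter" > "$tmp/letter"; then
    blocked=yes
fi
after=$(cat "$tmp/letter")       # still holds the original text

# >| overrides noclobber for one command when you do want to overwrite.
echo replaced >| "$tmp/letter"
forced=$(cat "$tmp/letter")

set +o noclobber                 # turn the protection off again
rm -rf "$tmp"
echo "blocked=$blocked after=$after forced=$forced"
```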

cd

Changes to another working directory

cd [options] [directory]

The cd builtin makes directory the working directory.

Arguments The directory is the pathname of the directory you want to be the new working

directory. Without an argument, cd makes your home directory the working direc-

tory. Using a hyphen in place of directory changes to the previous working directory.

Options The following options are available under bash and dash only.

–L (no dereference) If directory is a symbolic link, cd makes the symbolic link the

working directory (default).

–P (dereference) If directory is a symbolic link, cd makes the directory the symbolic

link points to the working directory.
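The difference between the two options shows up in what pwd reports afterward. The following sketch, for bash or dash, builds a scratch directory holding a directory named real and a symbolic link to it named link; both names are made up for the illustration.

```shell
# Resolve the scratch directory to its physical pathname first, because
# /tmp is itself a symbolic link on some systems.
tmp=$(cd -P "$(mktemp -d)" && pwd)
mkdir "$tmp/real"
ln -s real "$tmp/link"

# Each command substitution runs in a subshell, so the working directory
# of this shell does not change.
logical=$(cd -L "$tmp/link" && pwd)     # pathname ends in /link
physical=$(cd -P "$tmp/link" && pwd)    # pathname ends in /real

printf '%s\n%s\n' "$logical" "$physical"
rm -rf "$tmp"
```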

Notes The cd command is a bash, dash, and tcsh builtin.

See page 87 for a discussion of cd.

Without an argument, cd makes your home directory the working directory; it uses

the value of the HOME (bash; page 297) or home (tcsh; page 372) variable to deter-

mine the pathname of your home directory.

With an argument of a hyphen, cd makes the previous working directory the work-

ing directory. It uses the value of the OLDPWD (bash) or owd (tcsh) variable to

determine the pathname of the previous working directory.

The CDPATH (bash; page 302) or cdpath (tcsh; page 372) variable contains a

colon-separated list of directories that cd searches. Within the list a null directory

name (::) or a period (:.:) represents the working directory. If CDPATH or cdpath is

not set, cd searches only the working directory for directory. If this variable is set

and directory is not an absolute pathname (does not begin with a slash), cd searches

the directories in the list; if the search fails, cd searches the working directory. See

page 302 for a discussion of CDPATH.
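The following sketch, for bash or dash, exercises both the hyphen argument and CDPATH. The directory names projects, alpha, and elsewhere are made up for the illustration. In both cases cd displays the name of the directory it changes to, which the sketch discards.

```shell
tmp=$(cd -P "$(mktemp -d)" && pwd)
mkdir -p "$tmp/projects/alpha" "$tmp/elsewhere"

# cd - returns to the previous working directory.
back=$(cd "$tmp/projects" && cd "$tmp/elsewhere" && cd - >/dev/null && pwd)

# With CDPATH set, cd finds alpha even though alpha is not a subdirectory
# of the working directory (elsewhere).
found=$(CDPATH=$tmp/projects && cd "$tmp/elsewhere" && cd alpha >/dev/null && pwd)

printf '%s\n%s\n' "$back" "$found"
rm -rf "$tmp"
```

Each test runs inside a command substitution, so neither the working directory nor CDPATH of the calling shell is changed.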

Examples A cd command without an argument makes a user’s home directory the working directory. In the following example, cd makes Max’s home directory the working directory and the pwd builtin verifies the change.


$ pwd

/home/max/literature

$ cd

$ pwd

/home/max

Under Mac OS X, home directories are stored in /Users, not /home.

The next command makes the /home/max/literature directory the working directory:

$ cd /home/max/literature

$ pwd

/home/max/literature

Next the cd builtin makes a subdirectory of the working directory the new working directory:

$ cd memos

$ pwd

/home/max/literature/memos

Finally cd uses the .. reference to the parent of the working directory (page 88) to

make the parent the new working directory:

$ cd ..

$ pwd

/home/max/literature


chgrp

Changes the group associated with a file

chgrp [options] group file-list

chgrp [options] ––reference=rfile file-list L

The chgrp utility changes the group associated with one or more files. The second format works

under Linux only.

Arguments The group is the name or numeric group ID of the new group. The file-list is a list of

the pathnames of the files whose group association is to be changed. The rfile is the

pathname of a file whose group is to become the new group associated with file-list.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––changes –c Displays a message for each file whose group is changed. L

––dereference For each file that is a symbolic link, changes the group of the file the link

points to, not the symbolic link itself. Under Linux, this option is the

default. L

––quiet or ––silent –f Suppresses warning messages about files whose permissions prevent you from changing their group IDs.

––no-dereference –h For each file that is a symbolic link, changes the group of the symbolic link,

not the file the link points to.

–H (partial dereference) For each file that is a symbolic link, changes the group of

the file the link points to, not the symbolic link itself. This option affects files

specified on the command line; it does not affect files found while descending a

directory hierarchy. This option treats files that are not symbolic links nor-

mally and works with –R only. See page 623 for an example of the use of the

–H versus –L options.

–L (dereference) For each file that is a symbolic link, changes the group of the file

the link points to, not the symbolic link itself. This option affects all files,

treats files that are not symbolic links normally, and works with –R only. See

page 623 for an example of the use of the –H versus –L options.

–P (no dereference) For each file that is a symbolic link, changes the group of the

symbolic link, not the file the link points to (default). This option affects all

files, treats files that are not symbolic links normally, and works with –R only.

See page 625 for an example of the use of the –P option.


––recursive –R Recursively descends a directory specified in file-list and changes the group ID

on all files in the directory hierarchy.

––reference=rfile Changes the group of the files in file-list to that of rfile. L

––verbose –v For each file, displays a message saying whether its group was retained or

changed.

Notes Only the owner of a file or a user working with root privileges can change the group

association of a file.

Unless you are working with root privileges, you must belong to the specified group

to change the group ID of a file to that group.

See page 631 for information on how to use chown to change the group associated

with, as well as the owner of, a file.

Examples The following command changes the group that the manuals file is associated with;

the new group is pubs.

$ chgrp pubs manuals
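The group argument can also be a numeric group ID. The following sketch changes a scratch file to one of the groups the current user already belongs to, so it should work without root privileges. Which group that is depends on the system, and on a system where you belong to only one group the file's group may not visibly change.

```shell
tmp=$(mktemp -d)
: > "$tmp/manuals"

# id -G lists the numeric IDs of the groups you belong to; take the last one.
gid=$(id -G | awk '{ print $NF }')
chgrp "$gid" "$tmp/manuals"

# The fourth field of ls -n output is the numeric group ID of the file.
newgid=$(ls -n "$tmp/manuals" | awk '{ print $4 }')
echo "group ID is now $newgid"
rm -rf "$tmp"
```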

–H versus –L The following examples demonstrate the difference between the –H and –L options.

Initially all files and directories in the working directory are associated with the

zach group:

$ ls -lR

total 12

-rw-r--r-- 1 zach zach 102 Jul 2 12:31 bb

drwxr-xr-x 2 zach zach 4096 Jul 2 15:34 dir1

drwxr-xr-x 2 zach zach 4096 Jul 2 15:33 dir4

./dir1:

total 4

-rw-r--r-- 1 zach zach 102 Jul 2 12:32 dd

lrwxrwxrwx 1 zach zach 7 Jul 2 15:33 dir4.link -> ../dir4

./dir4:

total 8

-rw-r--r-- 1 zach zach 125 Jul 2 15:33 gg

-rw-r--r-- 1 zach zach 375 Jul 2 15:33 hh

When you call chgrp with the –R and –H options (–H does not work without –R),

chgrp dereferences only symbolic links you list on the command line, which includes

symbolic links found in directories you list on the command line. That means chgrp

changes the group association of the files these links point to. It does not derefer-

ence symbolic links it finds as it descends into directory hierarchies, nor does it

change symbolic links themselves. While descending the dir1 hierarchy, chgrp does

not change dir4.link, but it does change dir4, the directory dir4.link points to.


$ chgrp -RH pubs bb dir1

$ ls -lR

total 12

-rw-r--r-- 1 zach pubs 102 Jul 2 12:31 bb

drwxr-xr-x 2 zach pubs 4096 Jul 2 15:34 dir1

drwxr-xr-x 2 zach pubs 4096 Jul 2 15:33 dir4

./dir1:

total 4

-rw-r--r-- 1 zach pubs 102 Jul 2 12:32 dd

lrwxrwxrwx 1 zach zach 7 Jul 2 15:33 dir4.link -> ../dir4

./dir4:

total 8

-rw-r--r-- 1 zach zach 125 Jul 2 15:33 gg

-rw-r--r-- 1 zach zach 375 Jul 2 15:33 hh

When you call chgrp with the –R and –L options (–L does not work without –R),

chgrp dereferences all symbolic links: those you list on the command line and those

it finds as it descends the directory hierarchy. It does not change the symbolic links

themselves. This command changes the files dir4.link points to:

$ chgrp -RL pubs bb dir1

$ ls -lR

total 12

-rw-r--r-- 1 zach pubs 102 Jul 2 12:31 bb

drwxr-xr-x 2 zach pubs 4096 Jul 2 15:34 dir1

drwxr-xr-x 2 zach pubs 4096 Jul 2 15:33 dir4

./dir1:

total 4

-rw-r--r-- 1 zach pubs 102 Jul 2 12:32 dd

lrwxrwxrwx 1 zach zach 7 Jul 2 15:33 dir4.link -> ../dir4

./dir4:

total 8

-rw-r--r-- 1 zach pubs 125 Jul 2 15:33 gg

-rw-r--r-- 1 zach pubs 375 Jul 2 15:33 hh
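To experiment with these options, you can rebuild the sample hierarchy in a scratch directory. The sketch below is one way to do so; it uses your primary group in place of pubs (an assumption that lets it run without root privileges), so the group column changes only if you substitute another group you belong to.

```shell
# Rebuild the sample hierarchy in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir4
touch bb dir1/dd dir4/gg dir4/hh
ln -s ../dir4 dir1/dir4.link

# Stand-in for pubs: replace with another group you belong to.
grp=$(id -gn)

# Partial dereference, then compare the listing with the first example.
chgrp -RH "$grp" bb dir1
ls -lR

# Full dereference, then compare the listing with the second example.
chgrp -RL "$grp" bb dir1
ls -lR
```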

The –H option under Mac OS X

caution The chgrp –H option works slightly differently under Mac OS X than it does under Linux. Under

OS X, chgrp –RH changes the group of the symbolic link it finds in a directory listed on the com-

mand line and does not change the file the link points to. (It does not dereference the symbolic

link.) When you run the preceding example under OS X, the group association of dir4 is not

changed, but the group association of dir4.link is.

If your program depends on how the –H option functions with a utility under OS X, test the option

with that utility to determine exactly how it works.


–P When you call chgrp with the –R and –P options (–P does not work without –R),

chgrp does not dereference symbolic links. It does change the group of the symbolic

link itself.

$ ls -l bb*

-rw-r--r-- 1 zach zach 102 Jul 2 12:31 bb

lrwxrwxrwx 1 zach zach 2 Jul 2 16:02 bb.link -> bb

$ chgrp -PR pubs bb.link

$ ls -l bb*

-rw-r--r-- 1 zach zach 102 Jul 2 12:31 bb

lrwxrwxrwx 1 zach pubs 2 Jul 2 16:02 bb.link -> bb


chmod

Changes the access mode (permissions) of a file

chmod [options] who operator permission file-list    symbolic

chmod [options] mode file-list    absolute

chmod [options] ––reference=rfile file-list    referential L

The chmod utility changes the ways in which a file can be accessed by the owner of the file, the group

the file is associated with, and/or all other users. You can specify the new access mode absolutely or

symbolically. Under Linux you can also specify the mode referentially (third format). Under Mac OS

X, you can use chmod to modify ACLs (page 932).

Arguments Arguments specify which files are to have their modes changed in what ways. The

rfile is the pathname of a file whose permissions are to become the new permissions

of file-list.

Symbolic

You can specify multiple sets of symbolic modes (who operator permission) by sep-

arating each set from the next with a comma.

The chmod utility changes the access permission for the class of users specified by

who. The class of users is designated by one or more of the letters specified in the

who column of Table V-4.

Table V-5 lists the symbolic mode operators.

The access permission is specified by one or more of the letters listed in Table V-6.

Table V-4 Symbolic mode user class specification

who    User class    Meaning
u      User          Owner of the file
g      Group         Group the file is associated with
o      Other         All other users
a      All           Use in place of ugo

Table V-5 Symbolic mode operators

operator    Meaning
+           Adds the permission for the specified user class
–           Removes the permission for the specified user class
=           Sets the permission for the specified user class; resets all other permissions for that user class


Absolute

You can use an octal number to specify the access mode. Construct the number by

ORing the appropriate values from Table V-7. To OR two or more octal numbers

from this table, just add them. (Refer to Table V-8 on the next page for examples.)
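The arithmetic can be checked in the shell. The following sketch builds the mode for "owner can read and write; group and others can read" from the individual values. It assumes a shell that treats constants with a leading 0 as octal in arithmetic expansion (bash and other POSIX shells do); the scratch file is hypothetical.

```shell
# Combine individual permission values with a bitwise OR.
# The leading 0 makes the shell read each constant as octal.
mode=$(printf '%04o' $(( 0400 | 0200 | 0040 | 0004 )))
echo "$mode"

# No two values share a bit, so adding them gives the same number.
sum=$(printf '%04o' $(( 0400 + 0200 + 0040 + 0004 )))
echo "$sum"

# Apply the result to a scratch file and display its permissions.
f=$(mktemp)
chmod "$mode" "$f"
ls -l "$f"
```

Both echo commands display 0644, and ls shows the permissions as -rw-r--r--.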

Table V-6 Symbolic mode permissions

permission    Meaning
r             Sets read permission
w             Sets write permission
x             Sets execute permission
s             Sets the user ID or group ID (depending on the who argument) to that of the owner of the file while the file is being executed (For more information see page 96.)
t             Sets the sticky bit (Only a user working with root privileges can set the sticky bit, and it can be used only with u; see page 980.)
X             Makes the file executable only if it is a directory or if another user class has execute permission
u             Sets the specified permissions to those of the owner
g             Sets the specified permissions to those of the group
o             Sets the specified permissions to those of others
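The X permission is most useful with –R, where it can open directories for searching without making ordinary data files executable. The following sketch illustrates the rule with a directory, a data file, and a file the owner can already execute; all names are hypothetical.

```shell
work=$(mktemp -d)
mkdir "$work/d"
touch "$work/d/data" "$work/d/script"
chmod 700 "$work/d" "$work/d/script"   # owner can search/execute these
chmod 600 "$work/d/data"               # no execute permission for anyone

# Give group and others read access, plus execute only where X allows it.
chmod -R go+rX "$work/d"

ls -ld "$work/d" "$work/d/data" "$work/d/script"
```

The directory and script gain execute permission for group and others; data gains read permission only.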

Table V-7 Absolute mode specifications

mode Meaning

4000 Sets the user ID when the program is executed (page 96)

2000 Sets the group ID when the program is executed (page 96)

1000 Sticky bit (page 980)

0400 Owner can read the file

0200 Owner can write to the file

0100 Owner can execute the file

0040 Group can read the file

0020 Group can write to the file

0010 Group can execute the file

0004 Others can read the file

0002 Others can write to the file

0001 Others can execute the file


Table V-8 lists some typical modes.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––changes –c Displays a message for each file whose permissions are changed. L

––quiet or

––silent

–f Suppresses warning messages about files whose permissions prevent chmod

from changing the permissions of the file.

–H (partial dereference) For each file that is a symbolic link, changes permissions

of the file the link points to, not the symbolic link itself. This option affects

files specified on the command line; it does not affect files found while

descending a directory hierarchy. This option treats files that are not symbolic

links normally and works with –R only. See page 623 for an example of the

use of the –H versus –L options. O

–L (dereference) For each file that is a symbolic link, changes permissions of the

file the link points to, not the symbolic link itself. This option affects all files,

treats files that are not symbolic links normally, and works with –R only. See

page 623 for an example of the use of the –H versus –L options. O

–P (no dereference) For each file that is a symbolic link, changes permissions of

the symbolic link, not the file the link points to. This option affects all files,

treats files that are not symbolic links normally, and works with –R only. See

page 625 for an example of the use of the –P option. O

––recursive –R Recursively descends a directory specified in file-list and changes the permissions

on all files in the directory hierarchy.

––reference=rfile Changes the permissions of the files in file-list to that of rfile. L

Table V-8 Examples of absolute mode specifications

Mode Meaning

0777 Owner, group, and others can read, write, and execute the file

0755 Owner can read, write, and execute the file; group and others can read and exe-

cute the file

0711 Owner can read, write, and execute the file; group and others can execute the

file

0644 Owner can read and write the file; group and others can read the file

0640 Owner can read and write the file, group can read the file, and others cannot

access the file


––verbose –v For each file, displays a message saying that its permissions were changed

(even if they were not changed) and specifying the permissions. Use ––changes

to display messages only when permissions are actually changed.

Notes Only the owner of a file or a user working with root privileges can change the

access mode, or permissions, of a file.

When you use symbolic arguments, you can omit the permission from the com-

mand line when the operator is =. This omission takes away all permissions for the

specified user class. See the second example in the next section.

Under Linux, chmod never changes the permissions of symbolic links.

Under Linux, chmod dereferences symbolic links found on the command line. In

other words, chmod changes the permissions of files that symbolic links found on

the command line point to; chmod does not affect files found while descending a

directory hierarchy. This behavior mimics the behavior of the Mac OS X –H option.

See page 94 for another discussion of chmod.

See page 933 for a discussion of using chmod under Mac OS X to change ACLs.

Examples See page 932 for examples of using chmod to change ACLs under Mac OS X.

The following examples show how to use the chmod utility to change the permis-

sions of the file named temp. The initial access mode of temp is shown by ls. See

“Discussion” on page 748 for information about the ls display.

$ ls -l temp

-rw-rw-r-- 1 max pubs 57 Jul 12 16:47 temp

When you do not follow an equal sign with a permission, chmod removes all permis-

sions for the specified user class. The following command removes all access permis-

sions for the group and all other users so only the owner has access to the file:

$ chmod go= temp

$ ls -l temp

-rw------- 1 max pubs 57 Jul 12 16:47 temp

The next command changes the access modes for all users (owner, group, and

others) to read and write. Now anyone can read from or write to the file.

$ chmod a=rw temp

$ ls -l temp

-rw-rw-rw- 1 max pubs 57 Jul 12 16:47 temp

Using an absolute argument, a=rw becomes 666. The next command performs the

same function as the previous one:

$ chmod 666 temp


The next command removes write access permission for other users. As a result,

members of the pubs group can read from and write to the file, but other users can

only read from the file:

$ chmod o-w temp

$ ls -l temp

-rw-rw-r-- 1 max pubs 57 Jul 12 16:47 temp

The following command yields the same result, using an absolute argument:

$ chmod 664 temp

The next command adds execute access permission for all users:

$ chmod a+x temp

$ ls -l temp

-rwxrwxr-x 1 max pubs 57 Jul 12 16:47 temp

If temp is a shell script or other executable file, all users can now execute it. (The

operating system requires read and execute access to execute a shell script but only

execute access to execute a binary file.) The absolute command that yields the same

result is

$ chmod 775 temp

The final command uses symbolic arguments to achieve the same result as the preced-

ing one. It sets permissions to read, write, and execute for the owner and the group,

and to read and execute for other users. A comma separates the sets of symbolic

modes.

$ chmod ug=rwx,o=rx temp
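You can confirm that symbolic and absolute arguments are interchangeable by applying each in turn and comparing the permissions ls reports. The sketch below does so on a scratch file; the mode_of function is a hypothetical helper, not part of chmod.

```shell
f=$(mktemp)

# Hypothetical helper: print the ten-character permission string from ls.
mode_of() { ls -l "$1" | cut -c1-10; }

chmod 664 "$f"            # start from -rw-rw-r--
chmod a+x "$f"            # add execute permission for all users
symbolic=$(mode_of "$f")

chmod 775 "$f"            # the absolute equivalent
absolute=$(mode_of "$f")

chmod ug=rwx,o=rx "$f"    # two symbolic sets separated by a comma
combined=$(mode_of "$f")

echo "$symbolic $absolute $combined"
```

All three strings are -rwxrwxr-x.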


chown

Changes the owner of a file and/or the group the file is associated with

chown [options] owner file-list

chown [options] owner:group file-list

chown [options] owner: file-list

chown [options] :group file-list

chown [options] ––reference=rfile file-list L

The chown utility changes the owner of a file and/or the group the file is associated with. Only a user

working with root privileges can change the owner of a file. Only a user working with root privileges

or the owner of a file who belongs to the new group can change the group a file is associated with.

The last format works under Linux only.

Arguments The owner is the username or numeric user ID of the new owner. The file-list is a

list of the pathnames of the files whose ownership and/or group association you

want to change. The group is the group name or numeric group ID of the new

group the file is to be associated with. The rfile is the pathname of a file whose

owner and/or group association is to become the new owner and/or group associa-

tion of file-list. Table V-9 shows the ways you can specify the new owner and/or

group.

Options Under Linux, chown accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––changes –c Displays a message for each file whose ownership/group is changed. L

––dereference Changes the ownership/group of the files symbolic links point to, not the symbolic

links themselves. Under Linux, this option is the default. L

––quiet or

––silent

–f Suppresses error messages about files whose ownership and/or group association

chown cannot change.

Table V-9 Specifying the new owner and/or group

Argument       Meaning
owner          The new owner of file-list; the group is not changed
owner:group    The new owner of and new group associated with file-list
owner:         The new owner of file-list; the group associated with file-list is changed to the new owner’s login group
:group         The new group associated with file-list; the owner is not changed


–H (partial dereference) For each file that is a symbolic link, changes the owner

and/or group association of the file the link points to, not the symbolic link

itself. This option affects files specified on the command line; it does not affect

files found while descending a directory hierarchy. This option treats files that

are not symbolic links normally and works with –R only. See page 623 for an

example of the use of the –H versus –L options.

––no-dereference –h For each file that is a symbolic link, changes the owner and/or group associa-

tion of the symbolic link, not the file the link points to.

–L (dereference) For each file that is a symbolic link, changes the owner and/or

group association of the file the link points to, not the symbolic link itself. This

option affects all files, treats files that are not symbolic links normally, and

works with –R only. See page 623 for an example of the use of the –H versus

–L options.

–P (no dereference) For each file that is a symbolic link, changes the owner and/or

group association of the symbolic link, not the file the link points to. This

option affects all files, treats files that are not symbolic links normally, and

works with –R only. See page 625 for an example of the use of the –P option.

––recursive –R When you include directories in the file-list, this option descends the directory

hierarchy, setting the specified owner and/or group association for all files in

the hierarchy.

––reference=rfile Changes the owner and/or group association of the files in the file-list to that

of rfile. L

––verbose –v Displays for each file a message saying whether its owner and/or group association

was retained or changed.

Notes The chown utility clears setuid and setgid bits when it changes the owner of a file.

Examples The following command changes the owner of the chapter1 file in the manuals

directory. The new owner is Sam.

# chown sam manuals/chapter1

The following command makes Max the owner of, and Max’s login group the

group associated with, all files in the /home/max/literature directory and in all its

subdirectories.

# chown -R max: /home/max/literature

Under Mac OS X, home directories are stored in /Users, not /home.

The next command changes the ownership of the files in literature to max and the

group associated with these files to pubs:

# chown max:pubs /home/max/literature/*


The final example changes the group association of the files in manuals to pubs

without altering their ownership. The owner of the files, who is executing this com-

mand, must belong to the pubs group.

$ chown :pubs manuals/*
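The :group form can be tried without root privileges as long as the group is one you belong to. The following sketch uses your primary group on a scratch file (both are assumptions made so the command succeeds for any user) and shows that the owner is unaffected.

```shell
f=$(mktemp)
grp=$(id -gn)        # primary group of the current user

# Change only the group association; the owner is left alone.
chown ":$grp" "$f"
ls -l "$f"
```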


cmp

Compares two files

cmp [options] file1 [file2 [skip1 [skip2]]]

The cmp utility displays the differences between two files on a byte-by-byte basis. If the files are the

same, cmp is silent. If the files differ, cmp displays the byte and line number of the first difference.

Arguments The file1 and file2 arguments are the pathnames of the files that cmp compares. If

file2 is omitted, cmp uses standard input instead. Using a hyphen (–) in place of file1

or file2 causes cmp to read standard input instead of that file.

The skip1 and skip2 arguments are decimal numbers indicating the number of bytes

to skip in each file before beginning the comparison. You can use the standard

multiplicative suffixes after skip1 and skip2; see Table V-1 on page 603.

Options Under Linux and OS X, cmp accepts the common options described on page 603.

Options preceded by a double hyphen (––) work under Linux only. Options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

––print–bytes –b Displays more information, including filenames, byte and line numbers, and

the octal and ASCII values of the first differing byte.

––ignore–initial=n1[:n2]

–i n1[:n2]

Without n2, skips the first n1 bytes in both files before beginning the compari-

son. With n1 and n2, skips the first n1 bytes in file1 and skips the first n2

bytes in file2 before beginning the comparison. You can follow n1 and/or n2

with one of the multiplicative suffixes listed in Table V-1 on page 603.

––verbose –l (lowercase “l”) Instead of stopping at the first byte that differs, continues com-

paring the two files and displays both the location and the value of each byte

that differs. Locations are displayed as decimal byte count offsets from the

beginning of the files; byte values are displayed in octal. The comparison ter-

minates when an EOF is encountered on either file.

––silent or

––quiet

–s Suppresses output from cmp; only sets the exit status (see “Notes”).

Notes Byte and line numbering start at 1.

The cmp utility does not display a message if the files are identical; it only sets the exit

status. This utility returns an exit status of 0 if the files are the same and an exit status

of 1 if they are different. An exit status greater than 1 means an error occurred.
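Because cmp reports its result through the exit status, it is convenient in shell scripts. The following sketch wraps cmp –s in a function that maps the three possible outcomes to a word; the function name and the file names are hypothetical.

```shell
work=$(mktemp -d)
printf 'same\n' > "$work/one"
printf 'same\n' > "$work/two"
printf 'different\n' > "$work/three"

# Hypothetical helper: classify two files using only the exit status of cmp.
compare() {
    status=0
    cmp -s "$1" "$2" 2>/dev/null || status=$?
    case $status in
        0) echo identical ;;
        1) echo different ;;
        *) echo error ;;
    esac
}

r1=$(compare "$work/one" "$work/two")
r2=$(compare "$work/one" "$work/three")
r3=$(compare "$work/one" "$work/missing")   # this file does not exist
echo "$r1 $r2 $r3"
```

The script displays identical different error.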

When you use skip1 (and skip2), the offset values cmp displays are based on the

byte where the comparison began.


Under Mac OS X, cmp compares data forks (page 929) of a file only. If you want to

compare resource forks, you can manually compare the ..namedfork/rsrc files

(page 930) for the target files.

Unlike diff (page 663), cmp works with binary as well as ASCII files.

Examples The examples use the files a and b shown below. These files have two differences.

The first difference is that the word lazy in file a is replaced by lasy in file b. The

second difference is more subtle: A TAB character appears just before the NEWLINE char-

acter in file b.

$ cat a

The quick brown fox jumped over the lazy dog.

$ cat b

The quick brown fox jumped over the lasy dog.TAB

The first example uses cmp without any options to compare the two files. The cmp

utility reports that the files are different and identifies the offset from the beginning

of the files where the first difference is found:

$ cmp a b

a b differ: char 39, line 1

You can display the values of the bytes at that location by adding the –b

(––print–bytes) option:

$ cmp --print-bytes a b

a b differ: char 39, line 1 is 172 z 163 s

The –l option displays all bytes that differ between the two files. Because this option

creates a lot of output if the files have many differences, you may want to redirect

the output to a file. The following example shows the two differences between files

a and b. The –b option displays the values for the bytes as well. Where file a has a

CONTROL–J (NEWLINE), file b has a CONTROL–I (TAB). The message saying that the end of file on

file a has been reached indicates that file b is longer than file a.

$ cmp -lb a b

39 172 z 163 s

46 12 ^J 11 ^I

cmp: EOF on a

In the next example, the ––ignore–initial option is used to skip over the first differ-

ence in the files. The cmp utility now reports on the second difference. The difference

is put at character 7, which is the 46th character in the original file b (7 characters

past the ignored 39 characters).

$ cmp --ignore-initial=39 a b

a b differ: char 7, line 1

You can use skip1 and skip2 in place of the ––ignore–initial option used in the

preceding example:

$ cmp a b 39 39

a b differ: char 7, line 1
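To reproduce these examples you need files a and b with the trailing TAB in place. One way to create them, sketched below, is with printf, which interprets \t and \n; the scratch directory is an assumption. The sketch then extracts the offsets of the differing bytes from the output of cmp –l.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# printf interprets \t and \n, so the TAB before the NEWLINE in b is explicit.
printf 'The quick brown fox jumped over the lazy dog.\n' > a
printf 'The quick brown fox jumped over the lasy dog.\t\n' > b

# List every differing byte; report the nonzero exit status explicitly.
cmp -l a b || echo "exit status $?"

# Keep only the offsets (first column); discard the EOF message on standard error.
offsets=$(cmp -l a b 2>/dev/null | awk '{print $1}' | tr '\n' ' ')
echo "$offsets"
```

The offsets are 39 and 46, matching the preceding examples.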


comm

Compares sorted files

comm [options] file1 file2

The comm utility displays a line-by-line comparison of two sorted files. The first of the three columns

it displays lists the lines found only in file1, the second column lists the lines found only in file2, and

the third lists the lines common to both files.

Arguments The file1 and file2 arguments are pathnames of the files that comm compares. Using

a hyphen (–) in place of file1 or file2 causes comm to read standard input instead of

that file.

Options You can combine the options. With no options, comm produces three-column output.

–1 Does not display column 1 (does not display lines found only in file1).

–2 Does not display column 2 (does not display lines found only in file2).

–3 Does not display column 3 (does not display lines found in both files).

Notes If the files have not been sorted, comm will not work properly.

Lines in the second column are preceded by one TAB, and those in the third column

are preceded by two TABs.

The exit status indicates whether comm completed normally (0) or abnormally (not 0).
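When the input is not already sorted, sort it first. The sketch below sorts two short lists and then asks comm for the lines they share. Setting LC_ALL=C for both utilities is an assumption made here so that sort and comm agree on the collating order; the file names are hypothetical.

```shell
work=$(mktemp -d)
printf '%s\n' pear apple fig > "$work/list1"
printf '%s\n' fig kiwi apple > "$work/list2"

# Sort both files using the same collating order that comm will use.
LC_ALL=C sort "$work/list1" > "$work/list1.sorted"
LC_ALL=C sort "$work/list2" > "$work/list2.sorted"

# Display only the lines common to both files (columns 1 and 2 suppressed).
common=$(LC_ALL=C comm -12 "$work/list1.sorted" "$work/list2.sorted")
echo "$common"
```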

Examples The following examples use two files, c and d. As with all input to comm, the files

have already been sorted:

$ cat c

bbbbb

ccccc

ddddd

eeeee

fffff

$ cat d

aaaaa

ddddd

eeeee

ggggg

hhhhh

Refer to sort on page 817 for information on sorting files.


The following example calls comm without any options, so it displays three

columns. The first column lists those lines found only in file c, the second column

lists those found only in d, and the third lists the lines found in both c and d:

$ comm c d

        aaaaa

bbbbb

ccccc

                ddddd

                eeeee

fffff

        ggggg

        hhhhh

The next example shows the use of options to prevent comm from displaying columns

1 and 2. The result is column 3, a list of the lines common to files c and d:

$ comm -12 c d

ddddd

eeeee
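The same technique isolates the lines unique to either file. The following sketch recreates c and d in a scratch directory (an assumption) and suppresses two columns at a time.

```shell
work=$(mktemp -d)
printf '%s\n' bbbbb ccccc ddddd eeeee fffff > "$work/c"
printf '%s\n' aaaaa ddddd eeeee ggggg hhhhh > "$work/d"

# Suppress columns 2 and 3: lines found only in c.
only_c=$(comm -23 "$work/c" "$work/d")

# Suppress columns 1 and 3: lines found only in d.
only_d=$(comm -13 "$work/c" "$work/d")

echo "$only_c"
echo "$only_d"
```

A suppressed column contributes no leading TABs, so each result starts at the left margin.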


configure

Configures source code automatically

./configure [options]

The configure script is part of the GNU Configure and Build System. Software developers who supply

source code for their products face the problem of making it easy for relatively naive users to build and

install their software package on a wide variety of machine architectures, operating systems, and sys-

tem software. To facilitate this process many software developers supply a shell script named configure

with their source code.

When you run configure, it determines the capabilities of the local system. The data collected by config-

ure is used to build the makefiles with which make (page 753) builds the executables and libraries. You

can adjust the behavior of configure by specifying command-line options and environment variables.

Options

––disable-feature Works in the same manner as ––enable-feature except it disables support for

feature.

––enable-feature The feature is the name of a feature that can be supported by the software

being configured. For example, configuring the Z Shell source code with the

command configure ––enable-zsh-mem configures the source code to use the

special memory allocation routines provided with zsh instead of using the sys-

tem memory allocation routines. Check the README file supplied with the

software distribution to see the choices available for feature.

––help Displays a detailed list of all options available for use with configure. The

contents of this list depend on the software you are installing.

––prefix=directory

By default configure builds makefiles that install software in the /usr/local

directory hierarchy (when you give the command make install). To install soft-

ware into a different directory, replace directory with the pathname of the

desired directory.

––with-package The package is the name of an optional package that can be included with the

software you are configuring. For example, if you configure the source code

for the Windows emulator wine with the command configure ––with-dll, the

source code is configured to build a shared library of Windows emulation

support. Check the README file supplied with the software you are installing to

see the choices available for package. The command configure ––help usually

displays the choices available for package.

The Mac OS X version of configure accepts long options

tip Options for configure preceded by a double hyphen (––) work under Mac OS X as well as under

Linux.

Discussion The GNU Configure and Build System allows software developers to distribute soft-

ware that can configure itself to be built on a variety of systems. It builds a shell

script named configure, which prepares the software distribution to be built and

installed on a local system. The configure script searches the local system to find the

dependencies for the software distribution and constructs the appropriate makefiles.

Once you have run configure, you can build the software with a make command and

install the software with a make install command.

The configure script determines which C compiler to use (usually gcc) and specifies a

set of flags to pass to that compiler. You can set the environment variables CC and

CFLAGS to override these values. See the “Examples” section.

Notes Each package that uses the GNU autoconfiguration utility provides its own custom

copy of configure, which the software developer created using the GNU autoconf

utility (www.gnu.org/software/autoconf). Read the README and INSTALL files

that are provided with the software you are installing for information about the

available options.

The configure scripts are self-contained and run correctly on a wide variety of systems.

You do not need any special system resources to use configure.

The configure utility will exit with an error message if a dependency is not installed.

Examples The simplest way to call configure is to cd to the base directory for the software you

are installing and then run the following command:

$ ./configure

The ./ is prepended to the command name to ensure you are running the configure

script supplied with the software you are installing. For example, to cause configure

to build makefiles that pass the flags –Wall and –O2 to gcc, give the following com-

mand from bash:

$ CFLAGS="-Wall -O2" ./configure

If you are using tcsh, give the following command:

tcsh $ env CFLAGS="-Wall -O2" ./configure
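Variables assigned on the command line in this way are placed in the environment of configure only; they do not persist in the calling shell. The sketch below demonstrates the mechanism with a stand-in script that merely reports what it receives. The stand-in is hypothetical and takes the place of a real configure script, which would do far more.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical stand-in for a real configure script: it only reports its input.
cat > configure <<'EOF'
#!/bin/sh
echo "CC=${CC:-unset} CFLAGS=${CFLAGS:-unset} args=$*"
EOF
chmod +x configure

# Assignments that precede the command name apply to that command only.
report=$(CC=gcc CFLAGS="-Wall -O2" ./configure --prefix="$HOME/local")
echo "$report"

# The calling shell is unaffected by the assignments above.
echo "CFLAGS in the shell: ${CFLAGS:-unset}"
```

With a real package, the same command line would select gcc, pass the two flags to it, and arrange for make install to copy the software into the local directory hierarchy under your home directory.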

cp

Copies files

cp [options] source-file destination-file

cp [options] source-file-list destination-directory

The cp utility copies one or more files. It can either make a copy of a single file (first format) or copy

one or more files to a directory (second format). With the –R option, cp can copy directory hierarchies.

Arguments The source-file is the pathname of the file that cp makes a copy of. The destination-

file is the pathname that cp assigns to the resulting copy of the file.

The source-file-list is a list of one or more pathnames of files that cp makes copies

of. The destination-directory is the pathname of the directory in which cp places

the copied files. With this format, cp gives each copied file the same simple filename

as its source-file.

The –R option enables cp to copy directory hierarchies recursively from the source-

file-list into the destination-directory.

Options Under Linux, cp accepts the common options described on page 603. Options preceded

by a double hyphen (––) work under Linux only. Except as noted, options named with

a single letter and preceded by a single hyphen work under Linux and OS X.

––archive –a Attempts to preserve the owner, group, permissions, access date, and modifi-

cation date of source file(s) while copying recursively without dereferencing

symbolic links. Same as –dpR. L

––backup –b If copying a file would remove or overwrite an existing file, this option makes

a backup copy of the file that would be overwritten. The backup copy has the

same name as the destination-file with a tilde (~) appended to it. When you

use both ––backup and ––force, cp makes a backup copy when you try to copy

a file over itself. For more backup options, search for Backup options in the

coreutils info page. L

–d For each file that is a symbolic link, copies the symbolic link, not the file the

link points to. Also preserves hard links in destination-files that exist between

corresponding source-files. This option is equivalent to ––no-dereference and

––preserve=links. L

––force –f When the destination-file exists and cannot be opened for writing, causes cp

to try to remove destination-file before copying source-file. This option is use-

ful when the user copying a file does not have write permission to an existing

destination-file but does have write permission to the directory containing the


destination-file. Use this option with –b to back up a destination file before

removing or overwriting it.

–H (partial dereference) For each file that is a symbolic link, copies the file the link

points to, not the symbolic link itself. This option affects files specified on the

command line; it does not affect files found while descending a directory hier-

archy. This option treats files that are not symbolic links normally. Under

OS X works with –R only. See page 623 for an example of the use of the –H

versus –L options.

––interactive –i Prompts you whenever cp would overwrite a file. If you respond with a string

that starts with y or Y, cp copies the file. If you enter anything else, cp does not

copy the file.

––dereference –L (dereference) For each file that is a symbolic link, copies the file the link points

to, not the symbolic link itself. This option affects all files and treats files that

are not symbolic links normally. Under OS X works with –R only. See

page 623 for an example of the use of the –H versus –L options.

––no-dereference –P (no dereference) For each file that is a symbolic link, copies the symbolic link,

not the file the link points to. This option affects all files and treats files that

are not symbolic links normally. Under OS X works with –R only. See

page 625 for an example of the use of the –P option.

––preserve[=attr] –p Creates a destination-file with the same owner, group, permissions, access

date, and modification date as the source-file. The –p option does not take an

argument.

Without attr, ––preserve works as described above. The attr is a comma-

separated list that can include mode (permissions and ACLs), ownership

(owner and group), timestamps (access and modification dates), links (hard

links), and all (all attributes).

––parents Copies a relative pathname to a directory, creating directories as needed. See

the “Examples” section. L

––recursive –R or –r

Recursively copies directory hierarchies including ordinary files. Under Linux,

the ––no-dereference (–d) option is implied: With the –R, –r, or ––recursive

option, cp copies the links (not the files the links point to). The –r and

––recursive options are available under Linux only.

––update –u Copies only when the destination-file does not exist or when it is older

than the source-file (i.e., this option will not overwrite a newer destination

file). L

––verbose –v Displays the name of each file as cp copies it.

Notes Under Linux, cp dereferences symbolic links unless you use one or more of the –R,

–r, ––recursive, –P, –d, or ––no-dereference options. As explained on the previous

page, under Linux the –H option dereferences only symbolic links listed on the

command line. Under Mac OS X, without the –R option, cp always dereferences

symbolic links; with the –R option, cp does not dereference symbolic links (–P is the

default) unless you specify –H or –L.
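The differences among –P, –L, and –H are easiest to see in a scratch directory. The following sketch is not one of the original examples: the names src, link.txt, srclink, and the copy_* directories are made up for illustration, and it assumes a cp that accepts –R together with –P, –L, and –H as described above.

```shell
# Build a scratch hierarchy: one ordinary file, one symbolic link to it,
# and one symbolic link to the directory itself. All names are illustrative.
demo=$(mktemp -d)
mkdir "$demo/src"
echo "hello" > "$demo/src/target.txt"
ln -s target.txt "$demo/src/link.txt"
ln -s src "$demo/srclink"

cp -R -P "$demo/src" "$demo/copy_P"      # links are copied as links
cp -R -L "$demo/src" "$demo/copy_L"      # every link is dereferenced
cp -R -H "$demo/srclink" "$demo/copy_H"  # only the command-line link is dereferenced

# Report what each copy holds.
for f in copy_P/link.txt copy_L/link.txt copy_H copy_H/link.txt
do
    if [ -L "$demo/$f" ]; then
        echo "$f: symbolic link"
    else
        echo "$f: not a symbolic link"
    fi
done
```

With –H, the link named on the command line (srclink) is followed, so copy_H is a real directory, but the link found inside the hierarchy is still copied as a link.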

Many options are available for cp under Linux. See the coreutils info page for a

complete list.

If the destination-file exists before you execute a cp command, cp overwrites the

file, destroying the contents but leaving the access privileges, owner, and group

associated with the file as they were.

If the destination-file does not exist, cp uses the access privileges of the source-file.

The user who copies the file becomes the owner of the destination-file and the

user’s login group becomes the group associated with the destination-file.
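The overwrite case described above can be checked with two small files. This is a minimal sketch, assuming a scratch directory created with mktemp; the filenames source and dest and the chosen modes are illustrative only.

```shell
work=$(mktemp -d)
echo "new contents" > "$work/source"
echo "old contents" > "$work/dest"
chmod 644 "$work/source"
chmod 600 "$work/dest"

# dest already exists, so cp replaces its contents but not its permissions.
cp "$work/source" "$work/dest"

dest_mode=$(ls -l "$work/dest" | cut -c1-10)
dest_data=$(cat "$work/dest")
echo "$dest_mode $dest_data"
```

The listing should show the data from source with the more restrictive mode that dest had before the copy.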

Using the –p option without any arguments causes cp to attempt to set the owner,

group, permissions, access date, and modification date to match those of the

source-file.

Unlike with the ln utility (page 740), the destination-file that cp creates is independent

of its source-file.

Under Mac OS X version 10.4 and later, cp copies extended attributes (page 928).

Examples The first command makes a copy of the file letter in the working directory. The

name of the copy is letter.sav.

$ cp letter letter.sav

The next command copies all files with filenames ending in .c into the archives

directory, which is a subdirectory of the working directory. Each copied file retains

its simple filename but has a new absolute pathname. The –p (––preserve) option

causes the copied files in archives to have the same owner, group, permissions,

access date, and modification date as the source files.

$ cp -p *.c archives

The next example copies memo from Sam’s home directory to the working directory:

$ cp ~sam/memo .

The next example runs under Linux and uses the ––parents option to copy the file

memo/thursday/max to the dir directory as dir/memo/thursday/max. The find util-

ity shows the newly created directory hierarchy.

$ cp --parents memo/thursday/max dir

$ find dir

dir

dir/memo

dir/memo/thursday

dir/memo/thursday/max

The following command copies the files named memo and letter into another direc-

tory. The copies have the same simple filenames as the source files (memo and letter)

but have different absolute pathnames. The absolute pathnames of the copied files

are /home/sam/memo and /home/sam/letter, respectively.

$ cp memo letter /home/sam

The final command demonstrates one use of the –f (––force) option. Max owns the

working directory and tries unsuccessfully to copy one over another file (me) that

he does not have write permission for. Because he has write permission to the direc-

tory that holds me, Max can remove the file but cannot write to it. The –f (––force)

option unlinks, or removes, me and then copies one to the new file named me.

$ ls -ld

drwxrwxr-x 2 max max 4096 Oct 21 22:55 .

$ ls -l

-rw-r--r-- 1 root root 3555 Oct 21 22:54 me

-rw-rw-r-- 1 max max 1222 Oct 21 22:55 one

$ cp one me

cp: cannot create regular file 'me': Permission denied

$ cp -f one me

$ ls -l

-rw-r--r-- 1 max max 1222 Oct 21 22:58 me

-rw-rw-r-- 1 max max 1222 Oct 21 22:55 one

If Max had used the –b (––backup) option in addition to –f (––force), cp would

have created a backup of me named me~. Refer to “Directory Access Permissions”

on page 98 for more information.

cpio

Creates an archive, restores files from an archive, or copies a directory hierarchy

cpio ––create|–o [options]

cpio ––extract|–i [options] [pattern-list]

cpio ––pass-through|–p [options] destination-directory

The cpio utility has three modes of operation: Create (copy-out) mode places multiple files into a sin-

gle archive file, extract (copy-in) mode restores files from an archive, and pass-through (copy-pass)

mode copies a directory hierarchy. The archive file cpio creates can be saved on disk, tape, other

removable media, or a remote system.

Create mode reads a list of ordinary or directory filenames from standard input and writes the

resulting archive file to standard output. You can use this mode to create an archive. Extract mode

reads an archive from standard input and extracts files from that archive. You can restore all files

from the archive or only those files whose names match a pattern. Pass-through mode reads a list of

names of ordinary or directory files from standard input and copies the files to a specified directory.

Arguments By default cpio in extract mode extracts all files found in the archive. You can

choose to extract files selectively by supplying a pattern-list. If the name of a file

in the archive matches one of the patterns in pattern-list, cpio extracts that file;

otherwise, it ignores the file. The patterns in a cpio pattern-list are similar to shell

wildcards (page 136) except that cpio patterns match slashes (/) and a leading

period (.) in a filename.

In pass-through mode you must give the name of the destination-directory as an

argument to cpio.

Options A major option specifies the mode in which cpio operates: create, extract, or

pass-through.

Major Options

You must include exactly one of these options. Options preceded by a double

hyphen (––) work under Linux only. Options named with a single letter and pre-

ceded by a single hyphen work under Linux and OS X.

––extract –i (copy-in mode) Reads the archive from standard input and extracts files. With-

out a pattern-list on the command line, cpio extracts all files from the archive.

With a pattern-list, cpio extracts only files with names that match one of the

patterns in pattern-list. The following example extracts from the device

mounted on /dev/sde1 only those files whose names end in .c:

$ cpio -i \*.c < /dev/sde1

The backslash prevents the shell from expanding the * before it passes the

argument to cpio.

––create –o (copy-out mode) Constructs an archive from the files named on standard

input. These files may be ordinary or directory files, and each must appear on

a separate line. The archive is written to standard output as it is built. The find

utility frequently generates the filenames that cpio uses. The following com-

mand builds an archive of the entire local system and writes it to the device

mounted on /dev/sde1:

$ find / -depth -print | cpio -o >/dev/sde1

The –depth option causes find to search for files in a depth-first search, thereby

reducing the likelihood of permissions problems when you restore the files

from the archive. See the discussion of this option on page 647.

––pass-through –p (copy-pass mode) Copies files from one place on the system to another. Instead

of constructing an archive file containing the files named on standard input,

cpio copies them to the destination-directory (the last argument on the cpio

command line). The effect is the same as if you had created an archive with

copy-out mode and then extracted the files with copy-in mode, except using

pass-through mode avoids creating an archive. The following example copies

the files in the working directory and all subdirectories into ~max/code:

$ find . -depth -print | cpio -pdm ~max/code

Other Options

The following options alter the behavior of cpio. These options work with one or

more of the preceding major options.

Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––reset-access-time

–a Resets the access times of source files after copying them so they have the same

access time after copying as they did before.

–B (block) Sets the block size to 5,120 bytes instead of the default 512 bytes.

Under Linux this option affects input and output block sizes; under OS X it

affects only output block sizes.

––block-size=n Sets the block size used for input and output to n 512-byte blocks. L

–c (compatible) Writes header information in ASCII so older (incompatible) cpio

utilities on other systems can read the file. This option is rarely needed.

––make-directories

–d Creates directories as needed when copying files. For example, you need this

option when you are extracting files from an archive with a file list generated

by find with the –depth option. This option can be used only with the –i

(––extract) and –p (––pass-through) options.

––pattern-file=filename

–E filename

Reads pattern-list from filename, one pattern per line. Additionally, you can

specify pattern-list on the command line.

––file=archive –F archive

Uses archive as the name of the archive file. In extract mode, reads from

archive instead of standard input. In create mode, writes to archive instead of

standard output. You can use this option to access a device on another system

on a network; see the –f (––file) option to tar (page 847) for more information.

––nonmatching –f (flip) Reverses the sense of the test performed on pattern-list when extracting

files from an archive. Files are extracted from the archive only if they do not

match any of the patterns in the pattern-list.

––help Displays a list of options. L

––dereference –L For each file that is a symbolic link, copies the file the link points to (not

the symbolic link itself). This option treats files that are not symbolic links

normally.

––link –l When possible, links files instead of copying them.

––preserve-modification-time

–m Preserves the modification times of files that are extracted from an archive.

Without this option the files show the time they were extracted. With this

option the created files show the time they had when they were copied into the

archive.

––no-absolute-filenames

In extract mode, creates all filenames relative to the working directory—even

files that were archived with absolute pathnames. L

––rename –r Allows you to rename files as cpio copies them. When cpio prompts you with

the name of a file, you respond with the new name. The file is then copied with

the new name. If you press RETURN instead, cpio does not copy the file.

––list –t (table of contents) Displays a table of contents of the archive. This option

works only with the –i (––extract) option, although no files are actually

extracted from the archive. With the –v (––verbose) option, it displays a

detailed table of contents in a format similar to that used by ls –l.

––unconditional –u Overwrites existing files regardless of their modification times. Without this

option cpio will not overwrite a more recently modified file with an older one;

it displays a warning message.

––verbose –v Lists files as they are processed. With the –t (––list) option, it displays a

detailed table of contents in a format similar to that used by ls –l.

Discussion Without the –u (––unconditional) option, cpio will not overwrite a more recently

modified file with an older file.

You can use both ordinary and directory filenames as input when you create an

archive. If the name of an ordinary file appears in the input list before the name of

its parent directory, the ordinary file appears before its parent directory in the

archive as well. This order can lead to an avoidable error: When you extract files

from the archive, the child has nowhere to go in the file structure if its parent has

not yet been extracted.

Making sure that files appear after their parent directories in the archive is not

always a solution. One problem occurs if the –m (––preserve-modification-time)

option is used when extracting files. Because the modification time of a parent

directory is updated whenever a file is created within it, the original modification

time of the parent directory is lost when the first file is written to it.

The solution to this potential problem is to ensure that all files appear before their

parent directories when creating an archive and to create directories as needed when

extracting files from an archive. When you use this technique, directories are

extracted only after all files have been written to them and their modification times

are preserved.

With the –depth option, find generates a list of files, with all children appearing in

the list before their parent directories. If you use this list to create an archive, the

files are in the proper order. (Refer to the first example in the next section.) When

extracting files from an archive, the –d (––make-directories) option causes cpio to

create parent directories as needed and the –m (––preserve-modification-time)

option does just what its name says. Using this combination of utilities and options

preserves directory modification times through a create/extract sequence.
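The effect of –depth on the order of the list is easy to see on a small hierarchy. The sketch below assumes a scratch directory; the memo/thursday/max names are borrowed from the cp examples purely for illustration.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/memo/thursday"
touch "$tree/memo/thursday/max"

# Default order: each directory is listed before its contents.
plain_list=$(cd "$tree" && find memo -print)
# With -depth: the contents of a directory are listed before the directory.
depth_list=$(cd "$tree" && find memo -depth -print)

printf 'without -depth:\n%s\n\nwith -depth:\n%s\n' "$plain_list" "$depth_list"
```

Feeding the second list to cpio in create mode places each file in the archive ahead of its parent directory.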

This strategy also solves another potential problem. Sometimes a parent directory

may not have permissions set so that you can extract files into it. When cpio auto-

matically creates the directory with –d (––make-directories), you can be assured

that you have write permission to the directory. When the directory is extracted

from the archive (after all the files are written into the directory), it is extracted with

its original permissions.

Examples The first example creates an archive of the files in Sam’s home directory, writing the

archive to a USB flash drive mounted at /dev/sde1.

$ find /home/sam -depth -print | cpio -oB >/dev/sde1

The find utility produces the filenames that cpio uses to build the archive. The –depth

option causes all entries in a directory to be listed before listing the directory name

itself, making it possible for cpio to preserve the original modification times of direc-

tories (see the preceding “Discussion” section). Use the –d (––make-directories) and

–m (––preserve-modification-time) options when you extract files from this archive

(see the following examples). The –B option sets the block size to 5,120 bytes.

Under Mac OS X, home directories are stored in /Users, not /home.

To check the contents of the archive file and display a detailed listing of the files it

contains, give the following command:

$ cpio -itv < /dev/sde1

The following command restores the files that formerly were in the memo subdirectory

of Sam’s home directory:

$ cpio -idm /home/sam/memo/\* < /dev/sde1

The –d (––make-directories) option ensures that any subdirectories that were in the

memo directory are re-created as needed. The –m (––preserve-modification-time)

option preserves the modification times of files and directories. The asterisk in the

pattern is escaped to keep the shell from expanding it.

The next command is the same as the preceding command except it uses the Linux

––no-absolute-filenames option to re-create the memo directory in the working

directory, which is named memocopy. The pattern does not start with the slash that

represents the root directory, allowing cpio to create the files with relative pathnames.

$ pwd

/home/sam/memocopy

$ cpio -idm --no-absolute-filenames home/sam/memo/\* < /dev/sde1

The final example uses the –f option to restore all files in the archive except those

that were formerly in the memo subdirectory:

$ cpio -ivmdf /home/sam/memo/\* < /dev/sde1

The –v option lists the extracted files as cpio processes the archive, verifying the

expected files are extracted.
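The examples above read from and write to a device. To experiment without a spare device, the same create, list, and extract sequence can be run against an ordinary file. The following is a sketch under stated assumptions: the scratch directory and the archive name memo.cpio are hypothetical, and cpio must be installed on the local system.

```shell
work=$(mktemp -d)
mkdir -p "$work/home/memo/thursday" "$work/restore"
echo "lunch at noon" > "$work/home/memo/thursday/max"

# Create mode: find supplies the names (children first), cpio writes the archive.
(cd "$work/home" && find memo -depth -print | cpio -o > "$work/memo.cpio") 2>/dev/null

# Table of contents: names only, nothing is extracted.
contents=$(cpio -it < "$work/memo.cpio" 2>/dev/null)
printf '%s\n' "$contents"

# Extract mode: -d creates directories as needed, -m keeps modification times.
(cd "$work/restore" && cpio -idm < "$work/memo.cpio") 2>/dev/null

restored=$(cat "$work/restore/memo/thursday/max")
echo "$restored"
```

Because the archive was built from relative pathnames, the files are re-created under the directory in which cpio runs in extract mode.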

crontab

Maintains crontab files

crontab [–u user-name] filename

crontab [–u user-name] option

A crontab file associates periodic times (such as 14:00 on Wednesdays) with commands. The cron

utility executes each command at the specified time. When you are working as yourself, the crontab

utility installs, removes, lists, and allows you to edit your crontab file. A user working with root

privileges can work with any user’s crontab file.

Arguments The first format copies the contents of filename (which contains crontab commands)

into the crontab file of the user who runs the command or that of username. When

the user does not have a crontab file, this process creates a new one; when the user

has a crontab file, this process overwrites the file. When you replace filename with a

hyphen (–), crontab reads commands from standard input.

The second format lists, removes, or edits the crontab file, depending on which

option you specify.

Options Choose only one of the –e, –l, or –r options. A user working with root privileges

can use –u with one of these options.

–e (edit) Runs the text editor specified by the VISUAL or EDITOR shell variable

on the crontab file, enabling you to add, change, or delete entries. This option

installs the modified crontab file when you exit from the editor.

–l (list) Displays the contents of the crontab file.

–r (remove) Deletes the crontab file.

–u username

(user) Works on username’s crontab file. Only a user working with root privi-

leges can use this option.

Notes This section covers the versions of cron, crontab, and crontab files that were written

by Paul Vixie—hence this version of cron is called Vixie cron. These versions are

POSIX compliant and differ from an earlier version of Vixie cron as well as from

the classic SVR3 syntax.

User crontab files are kept in the /var/spool/cron or /var/spool/cron/crontabs

directory. Each file is named with the username of the user that it belongs to.

The system utility named cron reads the crontab files and runs the commands. If a

command line in a crontab file does not redirect its output, all output sent to standard

output and standard error is mailed to the user unless you set the MAILTO variable

within the crontab file to a different username.

To make the system administrator’s job easier, the directories named /etc/cron.hourly,

/etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly hold crontab files that, on

most systems, are run by run-parts, which in turn is run from the /etc/crontab file.

Each of these directories contains files that execute system tasks at the interval named

by the directory. A user working with root privileges can add files to these directories

instead of adding lines to root’s crontab file. A typical /etc/crontab file looks like this:

$ cat /etc/crontab

SHELL=/bin/bash

PATH=/sbin:/bin:/usr/sbin:/usr/bin

MAILTO=root

HOME=/

# run-parts

01 * * * * root run-parts /etc/cron.hourly

02 4 * * * root run-parts /etc/cron.daily

22 4 * * 0 root run-parts /etc/cron.weekly

42 4 1 * * root run-parts /etc/cron.monthly

Each entry in a crontab file begins with five fields that specify when the command is

to run (minute, hour, day of the month, month, and day of the week). The cron util-

ity interprets an asterisk appearing in place of a number as a wildcard representing

all possible values. In the day-of-the-week field, you can use either 7 or 0 to repre-

sent Sunday.
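To make the field layout concrete, the following sketch splits one entry into its five time fields and its command by using the shell's read builtin. It is only an illustration: the sample entry is of the kind found in a user's crontab file (entries in /etc/crontab carry an additional username field, as the listing above shows), and the variable names are invented here, not part of cron.

```shell
# A sample entry; single quotes keep the shell from expanding * and $HOME.
entry='05 10,16 * * * $HOME/bin/twice.day'

# read splits on whitespace: the first five words are the time fields,
# and the remainder of the line is the command.
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day-of-month=$day_of_month"
echo "month=$month day-of-week=$day_of_week"
echo "command=$command"
```

Here the hour field holds the list 10,16 and the three asterisks stand for every day of the month, every month, and every day of the week.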

It is a good practice to start cron jobs a variable number of minutes before or after

the hour, half-hour, or quarter-hour. When you start jobs at these times, it becomes

less likely that many processes will start at the same time, thereby potentially over-

loading the system.

When cron starts (usually when the system is booted), it reads all of the crontab files

into memory. The cron utility mostly sleeps but wakes up once a minute, reviews all

crontab entries it has stored in memory, and runs whichever jobs are due to be run at

that time.

cron.allow, cron.deny

By creating, editing, and removing the cron.allow and cron.deny files, a user work-

ing with root privileges determines which users can run crontab. Under Linux these

files are kept in the /etc directory; under Mac OS X they are kept in /var/at (which

has a symbolic link at /usr/lib/cron). When you create a cron.deny file with no

entries and no cron.allow file exists, everyone can use crontab. When the cron.allow

file exists, only users listed in that file can use crontab, regardless of the presence and

contents of cron.deny. Otherwise, you can list in the cron.allow file those users who

are allowed to use crontab and in cron.deny those users who are not allowed to use

it. (Listing a user in cron.deny is not strictly necessary because, if a cron.allow file

exists and the user is not listed in it, the user will not be able to use crontab anyway.)

Examples The following example uses crontab –l to list the contents of Sam’s crontab file

(/var/spool/cron/sam). All the scripts that Sam runs are in his ~/bin directory. The

first line sets the MAILTO variable to max so that Max gets the output from com-

mands run from Sam’s crontab file that is not redirected. The sat.job script runs

every Saturday (day 6) at 2:05 AM, twice.week runs at 2:00 AM on Sunday and

Thursday (days 0 and 4), and twice.day runs twice a day, every day, at 10:05 AM

and 4:05 PM.

$ who am i

sam

$ crontab -l

MAILTO=max

05 02 * * 6 $HOME/bin/sat.job

00 02 * * 0,4 $HOME/bin/twice.week

05 10,16 * * * $HOME/bin/twice.day

To add an entry to your crontab file, run the crontab utility with the –e (edit) option.

Some Linux systems use a version of crontab that does not support the –e option. If

the local system runs such a version, you need to make a copy of your existing

crontab file, edit it, and then resubmit it, as in the example that follows. The –l (list)

option displays a copy of your crontab file.

$ crontab -l > newcron

$ vim newcron

...

$ crontab newcron

cut

Selects characters or fields from input lines

cut [options] [file-list]

The cut utility selects characters or fields from lines of input and writes them to standard output.

Character and field numbering start with 1.

Arguments The file-list is a list of ordinary files. If you do not specify an argument or if you

specify a hyphen (–) in place of a filename, cut reads from standard input.

Options Under Linux, cut accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Options named with a

single letter and preceded by a single hyphen work under Linux and OS X.

––characters=clist

–c clist

Selects the characters given by the column numbers in clist. The value of clist is

one or more comma-separated column numbers or column ranges. A range is

specified by two column numbers separated by a hyphen. A range of –n means

columns 1 through n; n– means columns n through the end of the line.

––delimiter=dchar

–d dchar

Specifies dchar as the input field delimiter. Also specifies dchar as the output

field delimiter unless you use the ––output-delimiter option. The default delim-

iter is a TAB character. Quote characters as necessary to protect them from shell

expansion.

––fields=flist –f flist

Selects the fields specified in flist. The value of flist is one or more comma-

separated field numbers or field ranges. A range is specified by two field numbers

separated by a hyphen. A range of –n means fields 1 through n; n– means fields n

through the last field. The field delimiter is a TAB character unless you use the

––delimiter option to change it.

––output-delimiter=ochar

Specifies ochar as the output field delimiter. By default the output delimiter is

the same as the input delimiter: a TAB character unless you use the ––delimiter option.

Quote characters as necessary to protect them from shell expansion.

––only-delimited –s Copies only lines containing delimiters. Without this option, cut copies—but

does not modify—lines that do not contain delimiters.
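The options above can be tried on a few lines of inline data before running them against real files. This is a minimal sketch; the sample records and the variable names are made up for illustration.

```shell
records='root:x:0:0:Root:/:/bin/sh
sam:x:401:50:Sam the Great:/home/sam:/bin/zsh
this line has no delimiter'

# Fields 1 and 5, colon delimited; the undelimited line passes through unchanged.
fields=$(printf '%s\n' "$records" | cut -d: -f1,5)

# Same selection with -s: the undelimited line is dropped.
delimited_only=$(printf '%s\n' "$records" | cut -s -d: -f1,5)

# Character ranges: columns 1 through 3 of each line.
first_three=$(printf '%s\n' "$records" | cut -c-3)

printf '%s\n---\n%s\n---\n%s\n' "$fields" "$delimited_only" "$first_three"
```

The selected fields are joined with the input delimiter (a colon here) because no output delimiter is given.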

Notes Although limited in functionality, cut is easy to learn and use and is a good choice

when columns and fields can be selected without using pattern matching. Some-

times cut is used with paste (page 784).

Examples For the next two examples, assume that an ls –l command produces the following

output:

$ ls -l

total 2944

-rwxr-xr-x 1 zach pubs 259 Feb 1 00:12 countout

-rw-rw-r-- 1 zach pubs 9453 Feb 4 23:17 headers

-rw-rw-r-- 1 zach pubs 1474828 Jan 14 14:15 memo

-rw-rw-r-- 1 zach pubs 1474828 Jan 14 14:33 memos_save

-rw-rw-r-- 1 zach pubs 7134 Feb 4 23:18 tmp1

-rw-rw-r-- 1 zach pubs 4770 Feb 4 23:26 tmp2

-rw-rw-r-- 1 zach pubs 13580 Nov 7 08:01 typescript

The following command outputs the permissions of the files in the working direc-

tory. The cut utility with the –c option selects characters 2 through 10 from each

input line. The characters in this range are written to standard output.

$ ls -l | cut -c2-10

otal 2944

rwxr-xr-x

rw-rw-r--

The next command outputs the size and name of each file in the working directory.

The –f option selects the fifth and ninth fields from the input lines. The –d option

tells cut to use SPACEs, not TABs, as delimiters. The tr utility (page 864) with the –s

option changes sequences of more than one SPACE character into a single SPACE; other-

wise, cut counts the extra SPACE characters as separate fields.

$ ls -l | tr -s ' ' ' ' | cut -f5,9 -d' '

259 countout

9453 headers

1474828 memo

1474828 memos_save

7134 tmp1

4770 tmp2

13580 typescript

The last example displays a list of full names as stored in the fifth field of the

/etc/passwd file. The –d option specifies that the colon character be used as the field

delimiter. Although this example works under Mac OS X, /etc/passwd does not

contain information about most users; see “Open Directory” on page 926 for more

information.

$ cat /etc/passwd

root:x:0:0:Root:/:/bin/sh

sam:x:401:50:Sam the Great:/home/sam:/bin/zsh

max:x:402:50:Max Wild:/home/max:/bin/bash

zach:x:504:500:Zach Brill:/home/zach:/bin/tcsh

hls:x:505:500:Helen Simpson:/home/hls:/bin/bash

$ cut -d: -f5 /etc/passwd

Root

Sam the Great

Max Wild

Zach Brill

Helen Simpson

date

Displays or sets the system time and date

date [options] [+format]

date [options] [newdate]

The date utility displays the time and date known to the system. A user working with root privileges

can use date to change the system clock.

Arguments The +format argument specifies the format for the output of date. The format

string, which consists of field descriptors and text, follows a plus sign (+). The field

descriptors are preceded by percent signs, and date replaces each one by its value in

the output. Table V-10 lists some of the field descriptors.

Table V-10 Selected field descriptors

Descriptor Meaning

%a Abbreviated weekday—Sun to Sat

%A Unabbreviated weekday—Sunday to Saturday

%b Abbreviated month—Jan to Dec

%B Unabbreviated month—January to December

%c Date and time in default format used by date

%d Day of the month—01 to 31

%D Date in mm/dd/yy format

%H Hour—00 to 23

%I Hour—01 to 12

%j Julian date (day of the year—001 to 366)

%m Month of the year—01 to 12

%M Minutes—00 to 59

%n NEWLINE character

%p AM or PM

%r Time in AM/PM notation

%s Number of seconds since the beginning of January 1, 1970

%S Seconds—00 to 60 (the 60 accommodates leap seconds)

By default date zero fills numeric fields. Placing an underscore (_) immediately fol-

lowing the percent sign (%) for a field causes date to blank fill the field. Placing a

hyphen (–) following the percent sign causes date not to fill the field—that is, to left

justify the field.

The date utility assumes that, in a format string, any character that is not a percent

sign, an underscore or a hyphen following the percent sign, or a field descriptor is

ordinary text and copies it to standard output. You can use ordinary text to add

punctuation to the date and to add labels (for example, you can put the word DATE:

in front of the date). Surround the format argument with single quotation marks if it

contains SPACEs or other characters that have a special meaning to the shell.
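The fill flags and the use of ordinary text in a format string can be sketched as follows. To keep the output repeatable, the sketch formats a fixed date with the –d option, which is available under Linux (GNU date) only; the date chosen, the LC_ALL=C setting, and the variable names are assumptions made for this illustration.

```shell
# -d (GNU date) formats a fixed date instead of the current one.
# LC_ALL=C keeps day and month names in English; -u avoids time zone differences.
fixed='2009-08-05 09:03:00'

zero_fill=$(LC_ALL=C date -u -d "$fixed" '+%d')     # default: zero filled
blank_fill=$(LC_ALL=C date -u -d "$fixed" '+%_d')   # underscore: blank filled
no_fill=$(LC_ALL=C date -u -d "$fixed" '+%-d')      # hyphen: not filled

# Ordinary text (the label, commas, and the word "at") is copied as is.
labeled=$(LC_ALL=C date -u -d "$fixed" '+DATE: %A, %B %-d, %Y at %H:%M')

echo "[$zero_fill] [$blank_fill] [$no_fill]"
echo "$labeled"
```

The single quotation marks around each format argument keep the shell from splitting it at the SPACEs.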

Setting the system clock

When a user working with root privileges specifies newdate, the system changes the

system clock to reflect the new date. The newdate argument has the format

nnddhhmm[[cc]yy][.ss]

where nn is the number of the month (01–12), dd is the day of the month (01–31),

hh is the hour based on a 24-hour clock (00–23), and mm is the minutes (00–59).

When you change the date, you must specify at least these fields.

The optional cc specifies the first two digits of the year (the value of the century

minus 1), and yy specifies the last two digits of the year. You can specify yy or ccyy

following mm. When you do not specify a year, date assumes that the year has not

changed.

You can specify the number of seconds past the start of the minute with .ss.
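One way to avoid mistakes in a newdate string is to let date assemble it from field descriptors. The sketch below only prints the strings; it does not set the clock. It assumes GNU date (the –d option is Linux only), and the moment chosen and the variable names are illustrative.

```shell
# Format a fixed moment (2:07:30 PM on August 19, 2009) as a newdate string.
# -u keeps the result independent of the local time zone.
moment='2009-08-19 14:07:30'

newdate_short=$(date -u -d "$moment" '+%m%d%H%M.%S')    # nnddhhmm.ss
newdate_full=$(date -u -d "$moment" '+%m%d%H%M%Y.%S')   # nnddhhmmccyy.ss

echo "$newdate_short"
echo "$newdate_full"
```

A user working with root privileges could then pass either string to date as the newdate argument.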

Options Under Linux, date accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

––date=datestring

–d datestring

Displays the date specified by datestring, not the current date. This option

does not change the system clock. L

Table V-10 Selected field descriptors (continued)

Descriptor Meaning

%t TAB character

%T Time in HH:MM:SS format

%w Day of the week—0 to 6 (0 = Sunday)

%y Last two digits of the year—00 to 99

%Y Year in four-digit format (for example, 2009)

%Z Time zone (for example, PDT)

––reference=file –r file

Displays the modification date and time of file in place of the current date and

time. L

––utc or

––universal

–u Displays or sets the time and date using Coordinated Universal Time (UTC;

page 986). UTC is also called Greenwich Mean Time (GMT).

Notes If you set up a locale database, date uses that database to substitute terms appropriate

to your locale (page 963).

Examples The first example shows how to set the date for 2:07:30 PM on August 19 without

changing the year:

# date 08191407.30

Wed Aug 19 14:07:30 PDT 2009

The next example shows the format argument, which causes date to display the date

in a commonly used format:

$ date '+Today is %h %d, %Y'

Today is Aug 19, 2009

dd

Converts and copies a file

dd [arguments]

The dd (device-to-device copy) utility converts and copies a file. The primary use of dd is to copy files

to and from hard disk files and removable media. It can operate on hard disk partitions and create

block-for-block identical disk images. Often dd can handle the transfer of information to and from

other operating systems when other methods fail. Its rich set of arguments gives you precise control

over the characteristics of the transfer.

Arguments Under Linux, dd accepts the common options described on page 603. By default dd

copies standard input to standard output.

bs=n (block size) Reads and writes n bytes at a time. This argument overrides the ibs

and obs arguments.

cbs=n (conversion block size) When performing data conversion during the copy,

converts n bytes at a time.

conv=type[,type...]

By applying conversion types in the order given on the command line, converts

the data being copied. The types must be separated by commas with no SPACEs.

The types of conversions are shown in Table V-11.

count=numblocks Restricts to numblocks the number of blocks of input that dd copies. The size

of each block is the number of bytes specified by the bs or ibs argument.

ibs=n (input block size) Reads n bytes at a time.

if=filename (input file) Reads from filename instead of from standard input. You can use a

device name for filename to read from that device.

obs=n (output block size) Writes n bytes at a time.

of=filename (output file) Writes to filename instead of to standard output. You can use a

device name for filename to write to that device.

seek=numblocks Skips numblocks blocks of output before writing any output. The size of each

block is the number of bytes specified by the bs or obs argument.

skip=numblocks Skips numblocks blocks of input before starting to copy. The size of each block

is the number of bytes specified by the bs or ibs argument.
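The way bs, count, skip, and seek interact is easiest to see on a small file. The following is a minimal sketch; the file name sample.txt and its contents are hypothetical, chosen so that each 4-byte block is easy to pick out.

```shell
# Build a 16-byte sample file: four blocks of 4 bytes each.
printf 'abcdefghijklmnop' > sample.txt

# skip=1 passes over the first input block; count=2 copies the next two.
# 2>/dev/null hides the "records in/records out" report dd sends to standard error.
middle=$(dd if=sample.txt bs=4 skip=1 count=2 2>/dev/null)
echo "$middle"

# seek=1 passes over the first output block before writing.
# conv=notrunc keeps the bytes that follow the block being overwritten.
printf 'WXYZ' | dd of=sample.txt bs=4 seek=1 conv=notrunc 2>/dev/null
cat sample.txt; echo
```

Both skip and seek count in blocks, not bytes, so changing bs changes which bytes they select.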

Table V-11 Conversion types

type Meaning

ascii Converts EBCDIC-encoded characters to ASCII, allowing you to read tapes

written on IBM mainframe and similar computers.


Notes Under Linux, you can use the standard multiplicative suffixes to make it easier to

specify large block sizes. See Table V-1 on page 603. Under Mac OS X, you can use

some of the standard multiplicative suffixes; however, OS X uses lowercase letters

in place of the uppercase letters shown in the table. In addition, under OS X, dd

supports b (block; multiply by 512) and w (word; multiply by the number of bytes in

an integer).

Examples You can use dd to create a file filled with pseudo-random bytes.

$ dd if=/dev/urandom of=randfile bs=1 count=100

The preceding command reads from the /dev/urandom file (an interface to the ker-

nel’s random number generator) and writes to the file named randfile. The block

size is 1 and the count is 100, so randfile is 100 bytes long. For bytes that are more

random, you can read from /dev/random. See the urandom and random man pages

for more information. Under OS X, urandom and random behave identically.

You can also use dd to make an exact copy of a disk partition. Be careful however—

the following command wipes out anything that was on the /dev/sdb1 partition.

# dd if=/dev/sda1 of=/dev/sdb1

Wiping a file You can use a similar technique to wipe data from a file before deleting it, making it

almost impossible to recover data from the deleted file. You might want to wipe a

file for security reasons; wipe a file several times for added security.

Table V-11 Conversion types (continued)

type Meaning

block Each time a line of input is read (that is, a sequence of characters terminated with a NEWLINE character), outputs a block of text without the NEWLINE. Each output block has the size given by the cbs argument and is created by adding trailing SPACE characters to the text until it is the proper size.

ebcdic Converts ASCII-encoded characters to EBCDIC, allowing you to write tapes for use on IBM mainframe and similar computers.

lcase Converts uppercase letters to lowercase while copying data.

noerror If a read error occurs, dd normally terminates. This conversion allows dd to continue processing data and is useful when you are trying to recover data from bad media.

notrunc Does not truncate the output file before writing to it.

ucase Converts lowercase letters to uppercase while copying data.

unblock Performs the opposite of the block conversion.


In the following example, ls shows the size of the file named secret; dd, with a block

size of 1 and a count corresponding to the number of bytes in secret, then wipes the

file. The conv=notrunc argument ensures that dd writes over the data in the file and

not another place on the disk.

$ ls -l secret

-rw-rw-r-- 1 max max 2494 Feb 6 00:56 secret

$ dd if=/dev/urandom of=secret bs=1 count=2494 conv=notrunc

2494+0 records in

2494+0 records out

$ rm secret
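The same steps can be scripted so that the byte count does not have to be copied by hand from the ls listing. This is a sketch only: the file name secret.tmp and its contents are hypothetical, and wc –c stands in for reading the size from ls.

```shell
# Create a throwaway file to wipe (hypothetical name and contents).
printf 'account 1234: top secret\n' > secret.tmp

# Arithmetic expansion strips the leading spaces some versions of wc print.
size=$(( $(wc -c < secret.tmp) ))

# Overwrite every byte in place; conv=notrunc keeps dd from truncating the file first.
dd if=/dev/urandom of=secret.tmp bs=1 count="$size" conv=notrunc 2>/dev/null

# The size is unchanged, but the original text has been overwritten.
newsize=$(( $(wc -c < secret.tmp) ))
echo "before: $size bytes, after: $newsize bytes"
rm secret.tmp
```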

Copying a diskette You can use dd to make an exact copy of a floppy diskette. First copy the contents

of the diskette to a file on the hard drive and then copy the file from the hard disk to

a formatted diskette. This technique works regardless of what is on the floppy dis-

kette. The next example copies a DOS-formatted diskette. (The filename for the

floppy on the local system may differ from that in the example.) The mount, ls,

umount sequences at the beginning and end of the example verify that the original

diskette and the copy hold the same files. You can use the floppy.copy file to make

multiple copies of the diskette.

# mount -t msdos /dev/fd0H1440 /mnt

# ls /mnt

abprint.dat bti.ini setup.ins supfiles.z wbt.z

adbook.z setup.exe setup.pkg telephon.z

# umount /mnt

# dd if=/dev/fd0 ibs=512 > floppy.copy

2880+0 records in

2880+0 records out

# ls -l floppy.copy

-rw-rw-r-- 1 max speedy 1474560 Oct 11 05:43 floppy.copy

# dd if=floppy.copy bs=512 of=/dev/fd0

2880+0 records in

2880+0 records out

# mount -t msdos /dev/fd0H1440 /mnt

# ls /mnt

abprint.dat bti.ini setup.ins supfiles.z wbt.z

adbook.z setup.exe setup.pkg telephon.z

# umount /mnt
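When no diskette drive is available, the same copy-and-verify sequence can be tried on ordinary files. The sketch below assumes a file of zeros (floppy.img, a hypothetical name) as a stand-in for the diskette and uses cmp (page 634) to confirm the copy; it does not require root privileges.

```shell
# Stand-in for the diskette: 2,880 blocks of 512 bytes (1,474,560 bytes),
# the same counts dd reports in the example above.
dd if=/dev/zero of=floppy.img bs=512 count=2880 2>/dev/null

# Copy the image block by block, as in the diskette example.
dd if=floppy.img of=floppy.copy bs=512 2>/dev/null

# cmp is silent and returns 0 when the two files are byte-for-byte identical.
if cmp floppy.img floppy.copy; then
    echo "copy verified"
fi
copysize=$(( $(wc -c < floppy.copy) ))
echo "$copysize bytes copied"
```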

df

Displays disk space usage

df [options] [filesystem-list]

The df (disk free) utility reports on the total space and the free space on each mounted device.

Arguments When you call df without an argument, it reports on the free space on each of the

devices mounted on the local system.

The filesystem-list is an optional list of one or more pathnames that specify the file-

systems you want the report to cover. This argument works on Mac OS X and some

Linux systems. You can refer to a mounted filesystem by its device pathname or by

the pathname of the directory it is mounted on.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––all –a Reports on filesystems with a size of 0 blocks, such as /proc. Normally df

does not report on these filesystems.

––block-size=sz –B sz

The sz specifies the units that the report uses (the default is 1-kilobyte blocks).

The sz is a multiplicative suffix from Table V-1 on page 603. See also the –h

(––human-readable) and –H (––si) options. L

–g (gigabyte) Displays sizes in 1-gigabyte blocks. O

––si –H Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as is

appropriate. Uses powers of 1,000.

––human-readable

–h Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as is

appropriate. Uses powers of 1,024.

––inodes –i Reports the number of inodes (page 959) that are used and free instead of

reporting on blocks.

–k (kilobyte) Displays sizes in 1-kilobyte blocks.

––local –l Displays local filesystems only.

–m (megabyte) Displays sizes in 1-megabyte blocks. O

––type=fstype –t fstype

Reports information only about the filesystems of type fstype, such as DOS or

NFS. Repeat this option to report on several types of filesystems. L

––exclude-type=fstype

–x fstype

Reports information only about the filesystems not of type fstype. L


Notes Under Mac OS X, the df utility supports the BLOCKSIZE environment variable

(page 603) and ignores block sizes smaller than 512 bytes or larger than 1 gigabyte.

Under Mac OS X, the count of used and free inodes (–i option) is meaningless on

HFS+ filesystems. On these filesystems, new files can be created as long as free space

is available in the filesystem.

Examples In the following example, df displays information about all mounted filesystems on

the local system:

$ df

Filesystem 1k-blocks Used Available Use% Mounted on

/dev/hda12 1517920 53264 1387548 4% /

/dev/hda1 15522 4846 9875 33% /boot

/dev/hda8 1011928 110268 850256 11% /free1

/dev/hda9 1011928 30624 929900 3% /free2

/dev/hda10 1130540 78992 994120 7% /free3

/dev/hda5 4032092 1988080 1839188 52% /home

/dev/hda7 1011928 60 960464 0% /tmp

/dev/hda6 2522048 824084 1569848 34% /usr

zach:/c 2096160 1811392 284768 86% /zach_c

zach:/d 2096450 1935097 161353 92% /zach_d

Next df is called with the –l and –h options, generating a human-readable list of

local filesystems. The sizes in this listing are given in terms of megabytes and

gigabytes.

$ df -lh

Filesystem Size Used Avail Use% Mounted on

/dev/hda12 1.4G 52M 1.3G 4% /

/dev/hda1 15M 4.7M 9.6M 33% /boot

/dev/hda8 988M 108M 830M 11% /free1

/dev/hda9 988M 30M 908M 3% /free2

/dev/hda10 1.1G 77M 971M 7% /free3

/dev/hda5 3.8G 1.9G 1.8G 52% /home

/dev/hda7 988M 60k 938M 0% /tmp

/dev/hda6 2.4G 805M 1.5G 34% /usr

The next example, which runs under Linux only, displays information about the

/free2 partition in megabyte units:

$ df -BM /free2

Filesystem 1M-blocks Used Available Use% Mounted on

/dev/hda9 988 30 908 3% /free2

The final example, which runs under Linux only, displays information about NFS

filesystems in human-readable terms:

$ df -ht nfs

Filesystem Size Used Avail Use% Mounted on

zach:/c 2.0G 1.7G 278M 86% /zach_c

zach:/d 2.0G 1.8G 157M 92% /zach_d
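The columns df prints lend themselves to simple scripted checks. The following sketch reports filesystems that are more than a given percentage full. It assumes the local df accepts –P, a POSIX option that is not described above and that keeps each filesystem on a single output line; the threshold of 80 is a hypothetical value.

```shell
# Usage threshold in percent (hypothetical value chosen for illustration).
threshold=80

# -P requests the one-line-per-filesystem POSIX format; -k reports 1-kilobyte blocks.
# Column 5 holds the Use% figure and column 6 holds the mount point.
df -Pk | awk -v limit="$threshold" '
    NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the percent sign
        if (use + 0 > limit)
            print $6 " is " $5 " full"
    }'

# The same parsing applied to a single filesystem: the one that holds /.
root_use=$(df -Pk / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
echo "root filesystem: ${root_use}% used"
```

The column numbers hold only when device and mount-point names contain no SPACEs; a name with embedded SPACEs shifts the fields.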


diff

Displays the differences between two text files

diff [options] file1 file2

diff [options] file1 directory

diff [options] directory file2

diff [options] directory1 directory2

The diff utility displays line-by-line differences between two text files. By default diff displays the differ-

ences as instructions, which you can then use to edit one of the files to make it the same as the other.

Arguments The file1 and file2 are pathnames of ordinary text files that diff works on. When the

directory argument is used in place of file2, diff looks for a file in directory with the

same name as file1. It works similarly when directory replaces file1. When you

specify two directory arguments, diff compares the files in directory1 with the files

that have the same simple filenames in directory2.
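A short sketch of the two-directory form follows. The directory and file names are hypothetical, and the –q (––brief) option, described under "Options," keeps the report to one line per file.

```shell
# Two small directories: one identical file, one changed file, one file unique to dir1.
mkdir -p dir1 dir2
printf 'alpha\n' > dir1/same.txt
printf 'alpha\n' > dir2/same.txt
printf 'one\n'   > dir1/changed.txt
printf 'two\n'   > dir2/changed.txt
printf 'extra\n' > dir1/only1.txt

# diff returns a nonzero exit status when it finds differences,
# so || true keeps a script running under set -e from stopping here.
report=$(diff -q dir1 dir2) || true
echo "$report"
```

The report mentions changed.txt and only1.txt but says nothing about same.txt, because the two copies of that file are identical.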

Options The diff utility accepts the common options described on page 603, with one excep-

tion: When one of the arguments is a directory and the other is an ordinary file, you

cannot compare to standard input.

––ignore-blank-lines

–B Ignores differences that involve only blank lines.

––ignore-space-change

–b Ignores whitespace (SPACEs and TABs) at the ends of lines and considers other

strings of whitespace to be equal.

––context[=lines] –C [lines]

Displays the sections of the two files that differ, including lines lines (the

default is 3) around each line that differs to show the context. Each line in file1

that is missing from file2 is preceded by a hyphen (–); each extra line in file2 is

preceded by a plus sign (+); and lines that have different versions in the two

files are preceded by an exclamation point (!). When lines that differ are within

lines lines of each other, they are grouped together in the output.

––ed –e Creates and sends to standard output a script for the ed editor, which will edit

file1 to make it the same as file2. You must add w (Write) and q (Quit) instruc-

tions to the end of the script if you plan to redirect input to ed from the script.

When you use ––ed, diff displays the changes in reverse order: Changes to the end

of the file are listed before changes to the top, preventing early changes from

affecting later changes when the script is used as input to ed. For example, if a line

near the top were deleted, subsequent line numbers in the script would be wrong.

The Mac OS X version of diff accepts long options

tip Options for diff preceded by a double hyphen (––) work under Mac OS X as well as under Linux.


––ignore-case –i Ignores differences in case when comparing files.

––new-file –N When comparing directories, when a file is present in one of the directories

only, considers it to be present and empty in the other directory.

––show-c-function

–p Shows which C function, bash control structure, Perl subroutine, and so forth

each change affects.

––brief –q Does not display the differences between lines in the files. Instead, diff reports

only that the files differ.

––recursive –r When using diff to compare the files in two directories, causes the comparisons

to descend through the directory hierarchies.

––unified[=lines] –U lines

Uses the easier-to-read unified output format. See the discussion of diff on

page 54 for more detail and an example. The lines argument is the number of

lines of context; the default is three.

––ignore-all-space

–w (whitespace) Ignores whitespace when comparing lines.

––width=n –W n

Sets the width of the columns that diff uses to display the output to n char-

acters. This option is useful with the ––side-by-side option. The sdiff utility

(see the “Notes” section) uses a lowercase w to perform the same function:

–w n.

––side-by-side –y Displays the output in a side-by-side format. This option generates the same

output as sdiff. Use the ––width=columns option with this option.

Notes The sdiff utility is similar to diff but its output may be easier to read. The diff ––side-

by-side option produces the same output as sdiff. See the “Examples” section and

refer to the diff and sdiff man and info pages for more information.

Use the diff3 utility to compare three files.

Use cmp (page 634) to compare nontext (binary) files.

Discussion When you use diff without any options, it produces a series of lines containing Add

(a), Delete (d), and Change (c) instructions. Each of these lines is followed by the

lines from the file you need to add to, delete from, or change, respectively, to make

the files the same. A less than symbol (<) precedes lines from file1. A greater than

symbol (>) precedes lines from file2. The diff output appears in the format shown in

Table V-12. A pair of line numbers separated by a comma represents a range of

lines; a single line number represents a single line.

The diff utility assumes you will convert file1 to file2. The line numbers to the left of
each of the a, c, or d instructions always pertain to file1; the line numbers to the
right of the instructions apply to file2. To convert file2 to file1, run diff again,
reversing the order of the arguments.

Examples The first example shows how diff displays the differences between two short, similar

files:

$ cat m

aaaaa

bbbbb

ccccc

$ cat n

aaaaa

ccccc

$ diff m n

2d1

< bbbbb

The difference between files m and n is that the second line of file m (bbbbb) is

missing from file n. The first line that diff displays (2d1) indicates that you need to

delete the second line from file1 (m) to make it the same as file2 (n). The next line

diff displays starts with a less than symbol (<), indicating that this line of text is from

file1. In this example, you do not need this information—all you need to know is

the line number so that you can delete the line.
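The sketch below recreates files m and n and runs diff in both directions, showing how reversing the arguments turns the d (Delete) instruction into an a (Append) instruction. It also captures the exit status, which is not discussed above: diff returns 0 when the files are the same and 1 when they differ.

```shell
# Recreate the two files from the example.
printf 'aaaaa\nbbbbb\nccccc\n' > m
printf 'aaaaa\nccccc\n' > n

# Convert m to n: delete line 2 of m.
status=0
forward=$(diff m n) || status=$?
echo "$forward"
echo "exit status: $status"

# Convert n to m: append line 2 of m after line 1 of n.
reverse=$(diff n m) || true
echo "$reverse"
```

In the reversed run the text line begins with a greater than symbol because bbbbb now comes from the second file named on the command line.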

The ––side-by-side option and the sdiff utility, both with the output width set to 30

characters, display the same output. In the output, a less than symbol points to the

extra line in file m; diff/sdiff leaves a blank line in file n where the extra line would go

to make the files the same.

$ diff --side-by-side --width=30 m n

aaaaa aaaaa

bbbbb <

ccccc ccccc

$ sdiff -w 30 m n

aaaaa aaaaa

bbbbb <

ccccc ccccc

Table V-12 diff output

Instruction                    Meaning (to change file1 to file2)

line1 a line2,line3            Append lines line2 through line3 from file2 after
> lines from file2             line1 in file1.

line1,line2 d line3            Delete line1 through line2 from file1.
< lines from file1

line1,line2 c line3,line4      Change line1 through line2 in file1 to line3
< lines from file1             through line4 from file2.
---
> lines from file2


The next example uses the same m file and a new file, p, to show diff issuing an a

(Append) instruction:

$ cat p

aaaaa

bbbbb

rrrrr

ccccc

$ diff m p

2a3

> rrrrr

In the preceding example, diff issues the instruction 2a3 to indicate you must append a

line to file m, after line 2, to make it the same as file p. The second line diff displays

indicates the line is from file p (the line begins with >, indicating file2). In this example,

you need the information on this line; the appended line must contain the text rrrrr.

The next example uses file m again, this time with file r, to show how diff indicates a

line that needs to be changed:

$ cat r

aaaaa

-q

ccccc

$ diff m r

2c2

< bbbbb

---

> -q

The difference between the two files appears in line 2: File m contains bbbbb, and

file r contains –q. The diff utility displays 2c2 to indicate that you need to change

line 2. After indicating a change is needed, diff shows you must change line 2 in file

m (bbbbb) to line 2 in file r (–q) to make the files the same. The three hyphens indi-

cate the end of the text in file m that needs to be changed and the beginning of the

text in file r that is to replace it.

Comparing the same files using the side-by-side and width options (–y and –W)

yields an easier-to-read result. The pipe symbol (|) indicates the line on one side

must replace the line on the other side to make the files the same:

$ diff -y -W 30 m r

aaaaa aaaaa

bbbbb | -q

ccccc ccccc

The next examples compare the two files q and v:

$ cat q
Monday
Tuesday
Wednesday
Thursday
Saturday
Sunday

$ cat v
Monday
Wednesday
Thursday
Thursday
Friday
Saturday
Sundae


Running in side-by-side mode, diff shows Tuesday is missing from file v, there is only
one Thursday in file q (there are two in file v), and Friday is missing from file q. The
last line is Sunday in file q and Sundae in file v; diff indicates these lines are different.

You can change file q to be the same as file v by removing Tuesday, adding one

Thursday and Friday, and substituting Sundae from file v for Sunday from file q.

Alternatively, you can change file v to be the same as file q by adding Tuesday,

removing one Thursday and Friday, and substituting Sunday from file q for Sundae

from file v.

$ diff -y -W 30 q v

Monday Monday

Tuesday <

Wednesday Wednesday

Thursday Thursday

> Thursday

> Friday

Saturday Saturday

Sunday | Sundae

Context diff With the ––context option (called a context diff), diff displays output that tells you

how to turn the first file into the second file. The top two lines identify the files and

show that q is represented by asterisks, whereas v is represented by hyphens. Fol-

lowing a row of asterisks that indicates the beginning of a hunk of text is a row of

asterisks with the numbers 1,6 in the middle. This line indicates that the instruc-

tions in the first section tell you what to remove from or change in file q—namely,

lines 1 through 6 (that is, all the lines of file q; in a longer file it would mark the first

hunk). The hyphen on the second subsequent line indicates you need to remove the

line with Tuesday. The line with an exclamation point indicates you need to replace

the line with Sunday with the corresponding line from file v. The row of hyphens

with the numbers 1,7 in the middle indicates that the next section tells you which

lines from file v—lines 1 through 7—you need to add or change in file q. You need

to add a second line with Thursday and a line with Friday, and you need to change

Sunday in file q to Sundae (from file v).

$ diff --context q v

*** q Mon Aug 24 18:26:45 2009

--- v Mon Aug 24 18:27:55 2009

***************

*** 1,6 ****

Monday

- Tuesday

Wednesday

Thursday

Saturday

! Sunday

--- 1,7 ----

Monday

Wednesday

Thursday

+ Thursday

+ Friday

Saturday

! Sundae
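For comparison, the same two files can be run through the unified format selected by –U. The sketch below recreates q and v and drops the two header lines, which carry file names and timestamps that vary from run to run.

```shell
# Recreate the two files, one day per line.
printf '%s\n' Monday Tuesday Wednesday Thursday Saturday Sunday > q
printf '%s\n' Monday Wednesday Thursday Thursday Friday Saturday Sundae > v

# -U 3 requests three lines of context; tail -n +3 skips the two header lines.
# || true guards against the nonzero exit status diff returns when files differ.
body=$(diff -U 3 q v | tail -n +3) || true
echo "$body"
```

The line that starts with @@ gives the line ranges of the hunk in each file. Lines that start with a hyphen appear only in q, and lines that start with a plus sign appear only in v. Unlike the context format, the unified format has no exclamation-point marker; it shows a changed line as a removal followed by an addition.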


diskutil O

Checks, modifies, and repairs local volumes

diskutil action [arguments]

The diskutil utility mounts, unmounts, and displays information about disks and partitions (vol-

umes). It can also format and repair filesystems and divide a disk into partitions.

Arguments The action specifies what diskutil is to do. Table V-13 lists common actions along

with the argument each takes.

Table V-13 diskutil actions

Action         Argument           Description

eraseVolume    type name device   Reformats device using the format type and the label
                                  name. The name specifies the name of the volume;
                                  alphanumeric names are the easiest to work with.
                                  The filesystem type is typically HFS+, but can also
                                  be UFS or MS-DOS. You can specify additional options
                                  as part of the type. For example, a FAT32 filesystem
                                  (as used in Windows 98 and later) would have a type
                                  of MS-DOS FAT32. A journaled, case-sensitive, HFS+
                                  filesystem would have a type of Case-sensitive
                                  Journaled HFS+.

info           device             Displays information about device. Does not require
                                  ownership of device.

list           [device]           Lists partitions on device. Without device lists
                                  partitions on all devices. Does not require
                                  ownership of device.

mount          device             Mounts device.

mountDisk      device             Mounts all devices on the disk containing device.

reformat       device             Reformats device using its current name and format.

repairVolume   device             Repairs the filesystem on device.

unmount        device             Unmounts device.

unmountDisk    device             Unmounts all devices on the disk containing device.

verifyVolume   device             Verifies the filesystem on device. Does not require
                                  ownership of device.


Notes The diskutil utility provides access to the Disk Management framework, the support

code used by the Disk Utility application. It allows some choices that are not sup-

ported from the graphical interface.

You must own device, or be working with root privileges, when you specify an

action that modifies or changes the state of a volume.

fsck The diskutil verifyVolume and repairVolume actions are analogous to the fsck utility

on Linux systems. Under OS X, the fsck utility is deprecated except when the system

is in single-user mode.

disktool Some of the functions performed by diskutil were handled by disktool in the past.

Examples The first example displays a list of disk devices and volumes available on the local

system:

$ diskutil list

/dev/disk0

#: type name size identifier

0: Apple_partition_scheme *152.7 GB disk0

1: Apple_partition_map 31.5 KB disk0s1

2: Apple_HFS Eva01 30.7 GB disk0s3

3: Apple_HFS Users 121.7 GB disk0s5

/dev/disk1

#: type name size identifier

0: Apple_partition_scheme *232.9 GB disk1

1: Apple_partition_map 31.5 KB disk1s1

2: Apple_Driver43 28.0 KB disk1s2

3: Apple_Driver43 28.0 KB disk1s3

4: Apple_Driver_ATA 28.0 KB disk1s4

5: Apple_Driver_ATA 28.0 KB disk1s5

6: Apple_FWDriver 256.0 KB disk1s6

7: Apple_Driver_IOKit 256.0 KB disk1s7

8: Apple_Patches 256.0 KB disk1s8

9: Apple_HFS Spare 48.8 GB disk1s9

10: Apple_HFS House 184.1 GB disk1s10

The next example displays information about one of the mounted volumes:

$ diskutil info disk1s9

Device Node: /dev/disk1s9

Device Identifier: disk1s9

Mount Point: /Volumes/Spare

Volume Name: Spare

File System: HFS+

Owners: Enabled

Partition Type: Apple_HFS

Bootable: Is bootable

Media Type: Generic


Protocol: FireWire

SMART Status: Not Supported

UUID: C77BB3DC-EFBB-30B0-B191-DE7E01D8A563

Total Size: 48.8 GB

Free Space: 48.8 GB

Read Only: No

Ejectable: Yes

The next example formats the partition at /dev/disk1s8 as an HFS+ Extended (HFSX)

filesystem and labels it Spare2. This command erases all data on the partition:

# diskutil eraseVolume 'Case-sensitive HFS+' Spare2 disk1s8

Started erase on disk disk1s10

Erasing

Mounting Disk

Finished erase on disk disk1s10

The final example shows the output of a successful verifyVolume operation:

$ diskutil verifyVolume disk1s9

Started verify/repair on volume disk1s9 Spare

Checking HFS Plus volume.

Checking Extents Overflow file.

Checking Catalog file.

Checking Catalog hierarchy.

Checking volume bitmap.

Checking volume information.

The volume Spare appears to be OK.

Mounting Disk

Verify/repair finished on volume disk1s9 Spare


ditto O

Copies files and creates and unpacks archives

ditto [options] source-file destination-file

ditto [options] source-file-list destination-directory

ditto –c [options] source-directory destination-archive

ditto –x [options] source-archive-list destination-directory

The ditto utility copies files and their ownership, timestamps, and other attributes, including

extended attributes (page 928). It can copy to and from cpio and zip archive files, as well as copy

ordinary files and directories.

Arguments The source-file is the pathname of the file that ditto is to make a copy of. The

destination-file is the pathname that ditto assigns to the resulting copy of the file.

The source-file-list specifies one or more pathnames of files and directories that ditto

makes copies of. The destination-directory is the pathname of the directory that

ditto copies the files and directories into. When you specify a destination-directory,

ditto gives each of the copied files the same simple filename as its source-file.

The source-directory is a single directory that ditto copies into the destination-archive.

The resulting archive holds copies of the contents of source-directory, but not the

directory itself.

The source-archive-list specifies one or more pathnames of archives that ditto

extracts into destination-directory.

Using a hyphen (–) in place of a filename or a directory name causes ditto to read

from standard input or write to standard output instead of reading from or writing

to that file or directory.

Options You cannot use the –c and –x options together.

–c (create archive) Creates an archive file.

––help Displays a help message.

–k (pkzip) Uses the zip format, instead of the default cpio (page 644) format, to

create or extract archives. For more information on zip, see the tip on page 62.

––norsrc (no resource) Ignores extended attributes. This option causes ditto to copy only

data forks (the default behavior under Mac OS X 10.3 and earlier).

––rsrc (resource) Copies extended attributes, including resource forks (the default

behavior under Mac OS X 10.4 and later). Also –rsrc and –rsrcFork.

–V (very verbose) Sends a line to standard error for each file, symbolic link, and

device node ditto copies.

–v (verbose) Sends a line to standard error for each directory ditto copies.


–X (exclude) Prevents ditto from searching directories in filesystems other than the

filesystems that hold the files it was explicitly told to copy.

–x (extract archive) Extracts files from an archive file.

–z (compress) Uses gzip (page 724) or gunzip to compress or decompress cpio

archives.

Notes The ditto utility does not copy the locked attribute flag (page 931). The utility also

does not copy ACLs.

By default ditto creates and reads archives (page 941) in the cpio (page 644) format.

The ditto utility cannot list the contents of archive files; it can only create or extract

files from archives. Use pax or cpio to list the contents of cpio archives, and use unzip

with the –l option to list the contents of zip files.

Examples The following examples show three ways to back up a user’s home directory, including

extended attributes (except as mentioned in “Notes”), preserving timestamps and per-

missions. The first example copies Zach’s home directory to the volume (filesystem)

named Backups; the copy is a new directory named zach.0228:

$ ditto /Users/zach /Volumes/Backups/zach.0228

The next example copies Zach’s home directory into a single cpio-format archive file

on the volume named Backups:

$ ditto -c /Users/zach /Volumes/Backups/zach.0228.cpio

The next example copies Zach’s home directory into a zip archive:

$ ditto -c -k /Users/zach /Volumes/Backups/zach.0228.zip

Each of the next three examples restores the corresponding backup archive into

Zach’s home directory, overwriting any files that are already there:

$ ditto /Volumes/Backups/zach.0228 /Users/zach

$ ditto -x /Volumes/Backups/zach.0228.cpio /Users/zach

$ ditto -x -k /Volumes/Backups/zach.0228.zip /Users/zach

The following example copies the Scripts directory to a directory named Scripts-

Backups on the remote host bravo. It uses an argument of a hyphen in place of

source-directory locally to write to standard output and in place of destination-

directory on the remote system to read from standard input:

$ ditto -c Scripts - | ssh bravo ditto -x - ScriptsBackups

The final example copies the local startup disk (the root filesystem) to the volume

named Backups.root. Because some of the files can be read only by root, the script

must be run by a user with root privileges. The –X option keeps ditto from trying to

copy other volumes (filesystems) that are mounted under /.

# ditto -X / /Volumes/Backups.root


dmesg

Displays kernel messages

dmesg [options]

The dmesg utility displays messages stored in the kernel ring buffer.

Options –c Clears the kernel ring buffer after running dmesg. L

–M core

The core is the name of the (core dump) file to process (defaults to

/dev/kmem). O

–N kernel

The kernel is the pathname of a kernel file (defaults to /mach). If you are dis-

playing information about a core dump, kernel should be the kernel that was

running at the time the core file was created. O

Discussion When the system boots, the kernel fills its ring buffer with messages regarding hard-

ware and module initialization. Messages in the kernel ring buffer are often useful

for diagnosing system problems.

Notes Under Mac OS X, you must run this utility while working with root privileges.

As a ring buffer, the kernel message buffer keeps the most recent messages it

receives, discarding the oldest messages once it fills up. To save a list of kernel boot

messages, give the following command immediately after booting the system and

logging in:

$ dmesg > dmesg.boot

This command saves the kernel messages in the dmesg.boot file. This list can be

educational and quite useful when you are having a problem with the boot process.

Under most Linux systems, after the system boots, the system records much of the

same information as dmesg displays in /var/log/messages or a similar file.

Examples The following command displays kernel messages in the ring buffer with the string

serial in them, regardless of case:

$ dmesg | grep -i serial

Apple16X50PCI2: Identified 4 Serial channels at PCI SLOT-2 Bus=5 Dev=2 Func=0


dscl O

Displays and manages Directory Service information

dscl [options] [datasource [command]]

The dscl (Directory Service command line) utility enables you to work with Directory Service directory

nodes. When you call dscl without arguments, it runs interactively.

Arguments The datasource is a node name or a Mac OS X Server host specified by a hostname

or IP address. A period (.) specifies the local domain.

Options –p (prompt) Prompts for a password as needed.

–q (quiet) Does not prompt.

–u user

Authenticates as user.

Commands Refer to the “Notes” section for definitions of some of the terms used here.

The hyphen (–) before a command is optional.

–list path [key] (also –ls) Lists subdirectories in path, one per line. If you specify key, this command

lists subdirectories that match key.

–read [path [key]]

(also –cat and .) Displays a directory, one property per line.

–readall [path [key]]

Displays properties with a given key.

–search path key value

Displays properties where key matches value.

Notes When discussing Directory Service, the term directory refers to a collection of data (a

database), not to a filesystem directory. Each directory holds one or more properties.

Each property comprises a key/value pair, where there may be more than one value

for a given key. In general, dscl displays a property with the key first, followed by a

colon, and then the value. If there is more than one value, the values are separated by

SPACEs. If a value contains SPACEs, dscl displays the value on the line following the key.

Under Mac OS X and Mac OS X Server, Open Directory stores information for

the local system in key/value-formatted XML files in the /var/db/dslocal directory

hierarchy.


The dscl utility is the command-line equivalent of NetInfo Manager (available on

versions of Mac OS X prior to 10.5) or of Workgroup Manager on Mac OS X

Server.

Examples The dscl –list command displays a list of top-level directories when you specify a

path of /:

$ dscl . -list /

AFPServer

AFPUserAliases

Aliases

AppleMetaRecord

Augments

Automount

...

SharePoints

SMBServer

Users

WebServer

The period as the first argument to dscl specifies the local domain as the data

source. The next command displays a list of Users directories:

$ dscl . -list /Users

_amavisd

_appowner

_appserver

_ard

...

_www

_xgridagent

_xgridcontroller

daemon

max

nobody

root

You can use the dscl –read command to display information about a specific user:

$ dscl . -read /Users/root

AppleMetaNodeLocation: /Local/Default

GeneratedUID: FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000

NFSHomeDirectory: /var/root

Password: *

PrimaryGroupID: 0

RealName:

System Administrator

RecordName: root

RecordType: dsRecTypeStandard:Users

UniqueID: 0

UserShell: /bin/sh

The following dscl –readall command lists all usernames and user IDs on the local

system. The command looks for the RecordName and UniqueID keys in the /Users

directory and displays the associated values. The dscl utility separates multiple values

with SPACEs. See page 926 for an example of a shell script that calls dscl with the

–readall command.

$ dscl . -readall /Users RecordName UniqueID

RecordName: _amavisd amavisd

UniqueID: 83

RecordName: _appowner appowner

UniqueID: 87

...

RecordName: daemon

UniqueID: 1

RecordName: sam

UniqueID: 501

RecordName: nobody

UniqueID: -2

RecordName: root

UniqueID: 0
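The two-lines-per-record output shown above can be folded into one line per user. The following is a minimal sketch, assuming the output has the format shown (a RecordName line followed by a UniqueID line). The function name pair_records is hypothetical; it reads standard input, so you can try it on sample text without running dscl.

```shell
# pair_records: hypothetical helper, not part of dscl.
# Reads "dscl . -readall /Users RecordName UniqueID"-style text on standard
# input and prints "name uid" for each record, using the first RecordName value.
pair_records() {
    awk '
        $1 == "RecordName:" { name = $2 }        # remember the first name
        $1 == "UniqueID:"   { print name, $2 }   # emit the pair
    '
}

# Sample lines copied from the listing above. On Mac OS X you could instead run
#   dscl . -readall /Users RecordName UniqueID | pair_records
printf '%s\n' \
    'RecordName: _amavisd amavisd' 'UniqueID: 83' \
    'RecordName: root' 'UniqueID: 0' | pair_records
```

Because awk splits on whitespace, a record with several names contributes only its first name to the output.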

The following example uses the dscl –search command to display all properties

where the key RecordName equals sam:

$ dscl . -search / RecordName sam

Users/sam RecordName = (

sam

)

du

Displays information on disk usage by directory hierarchy and/or file

du [options] [path-list]

The du (disk usage) utility reports how much disk space is occupied by a directory hierarchy or a file.

By default du displays the number of 1,024-byte blocks occupied by the directory hierarchy or file.

Arguments Without any arguments, du displays information about the working directory and

its subdirectories. The path-list specifies the directories and files you want information

on.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

Without any options, du displays the total storage used for each argument in path-list.

For directories, du displays this total after recursively listing the totals for each

subdirectory.

––all –a Displays the space used by all ordinary files along with the total for each directory.

––block-size=sz –B sz

The sz argument specifies the units the report uses. It is a multiplicative suffix

from Table V-1 on page 603. See also the –H (––si) and –h (––human-readable)

options. L

––total –c Displays a grand total at the end of the output.

––dereference-args

–D (partial dereference) For each file that is a symbolic link, reports on the file the

link points to, not the symbolic link itself. This option affects files specified on

the command line; it does not affect files found while descending a directory

hierarchy. This option treats files that are not symbolic links normally. L

–d depth

Displays information for subdirectories to a level of depth directories. O

––si –H (human readable) Displays sizes in K (kilobyte), M (megabyte), and G

(gigabyte) blocks, as appropriate. Uses powers of 1,000. In the future, the –H

option will change to be the equivalent of –D. L

–H (partial dereference) For each file that is a symbolic link, reports on the file the

link points to, not the symbolic link itself. This option affects files specified on

the command line; it does not affect files found while descending a directory

hierarchy. This option treats files that are not symbolic links normally. See

page 623 for an example of the use of the –H versus –L options. O

––human-readable

–h Displays sizes in K (kilobyte), M (megabyte), and G (gigabyte) blocks, as

appropriate. Uses powers of 1,024.

–k Displays sizes in 1-kilobyte blocks.

––dereference –L For each file that is a symbolic link, reports on the file the link points to, not

the symbolic link itself. This option affects all files and treats files that are not

symbolic links normally. The default is –P (––no-dereference). See page 623 for

an example of the use of the –H versus –L options.

–m Displays sizes in 1-megabyte blocks.

––no-dereference –P For each file that is a symbolic link, reports on the symbolic link, not the file

the link points to. This option affects all files and treats files that are not symbolic

links normally. This behavior is the default. See page 625 for an example

of the use of the –P option.

––summarize –s Displays only the total size for each directory or file you specify on the command

line; subdirectory totals are not displayed.

––one-file-system –x Reports only on files and directories on the same filesystem as that of the argument

being processed.

Examples In the first example, du displays size information about subdirectories in the working

directory. The last line contains the grand total for the working directory and its

subdirectories.

$ du

26 ./Postscript

4 ./RCS

47 ./XIcon

4 ./Printer/RCS

12 ./Printer

105 .

The total (105) is the number of blocks occupied by all plain files and directories

under the working directory. All files are counted, even though du displays only the

sizes of directories.

If you do not have read permission for a file or directory that du encounters, du

sends a warning to standard error and skips that file or directory. Next, using the –s

(summarize) option, du displays the total for each of the directories in /usr but does

not display information for subdirectories:

$ du -s /usr/*

4 /usr/X11R6

260292 /usr/bin

10052 /usr/games

7772 /usr/include

1720468 /usr/lib

105240 /usr/lib32

0 /usr/lib64

du: cannot read directory `/usr/local/lost+found': Permission denied

...

130696 /usr/src

Add to the previous example the –c (total) option and du displays the same listing

with a grand total at the end:

$ du -sc /usr/*

4 /usr/X11R6

260292 /usr/bin

...

130696 /usr/src

3931436 total

The following example uses the –s (summarize), –h (human-readable), and –c (total)

options:

$ du -shc /usr/*

4.0K /usr/X11R6

255M /usr/bin

9.9M /usr/games

7.6M /usr/include

1.7G /usr/lib

103M /usr/lib32

...

128M /usr/src

3.8G total

The final example displays, in human-readable format, the total size of all files the

user can read in the /usr filesystem. Redirecting standard error to /dev/null discards

all warnings about files and directories that are unreadable.

$ du -hs /usr 2>/dev/null

3.8G /usr
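The –s and –k options combine well with sort when you want to know which entries in a directory use the most space. The following is a minimal sketch; the function name largest_entries is hypothetical, and sort and head are separate utilities, not part of du.

```shell
# largest_entries DIR [N]: hypothetical helper that lists the N largest
# entries directly under DIR (default 5), biggest first, with sizes in
# 1-kilobyte blocks.
largest_entries() {
    dir=$1
    n=${2:-5}
    # -s: one total per argument; -k: 1-kilobyte blocks.
    # Warnings about unreadable files are discarded, as in the last example.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Example call (may take a while on a large hierarchy):
#   largest_entries /usr 3
```

The shell expands the asterisk, so filenames that begin with a period are not included in the report.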

echo

Displays a message

echo [options] message

The echo utility copies its arguments, followed by a NEWLINE, to standard output. Both the Bourne

Again and TC Shells have their own echo builtin that works similarly to the echo utility.

Arguments The message consists of one or more arguments, which can include quoted strings,

ambiguous file references, and shell variables. A SPACE separates each argument from

the next. The shell recognizes unquoted special characters in the arguments. For

example, the shell expands an asterisk into a list of filenames in the working directory.

Options You can configure the tcsh echo builtin to treat backslash escape sequences and the

–n option in different ways. Refer to echo_style in the tcsh man page. The typical

tcsh configuration recognizes the –n option, enables backslash escape sequences,

and ignores the –E and –e options.

–E Suppresses the interpretation of backslash escape sequences such as \n. Available

with the bash builtin version of echo only.

–e Enables the interpretation of backslash escape sequences such as \n. Available

with the bash builtin version of echo only.

––help Gives a short summary of how to use echo. The summary includes a list of the

backslash escape sequences interpreted by echo. This option works only with

the echo utility; it does not work with the echo builtins. L

–n Suppresses the NEWLINE terminating the message.

Notes Suppressing the interpretation of backslash escape sequences is the default behavior

of the bash builtin version of echo and of the echo utility.

You can use echo to send messages to the screen from a shell script. See page 138

for a discussion of how to use echo to display filenames using wildcard characters.

The echo utility and builtins provide an escape notation to represent certain nonprinting

characters in message (Table V-14). You must use the –e option for these

backslash escape sequences to work with the echo utility and the bash echo builtin.

Typically you do not need the –e option with the tcsh echo builtin.

Table V-14 Backslash escape sequences

Sequence Meaning

\a Bell

\c Suppress trailing NEWLINE

Examples Following are some echo commands. These commands will work with the echo utility

(/bin/echo) and the bash and tcsh echo builtins, except for the last, which may

not need the –e option under tcsh.

$ echo "This command displays a string."

This command displays a string.

$ echo -n "This displayed string is not followed by a NEWLINE."

This displayed string is not followed by a NEWLINE.$ echo hi

$ echo -e "This message contains\v a vertical tab."

This message contains

a vertical tab.

The following examples contain messages with the backslash escape sequence \c. In

the first example, the shell processes the arguments before calling echo. When the

shell sees the \c, it replaces the \c with the character c. The next three examples

show how to quote the \c so that the shell passes it to echo, which then does not

append a NEWLINE to the end of the message. The first four examples are run under

bash and require the –e option. The final example runs under tcsh, which may not

need this option.

$ echo -e There is a newline after this line.\c

There is a newline after this line.c

$ echo -e 'There is no newline after this line.\c'

There is no newline after this line.$

$ echo -e "There is no newline after this line.\c"

There is no newline after this line.$

$ echo -e There is no newline after this line.\\c

There is no newline after this line.$

$ tcsh

tcsh $ echo -e 'There is no newline after this line.\c'

There is no newline after this line.$

You can use the –n option in place of –e and \c.
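One way to confirm that –n suppresses the trailing NEWLINE is to count the bytes echo writes. The following is a small sketch, assuming an echo that recognizes the –n option; the variable names are arbitrary.

```shell
# The first echo writes only the 5 characters of "Name:"; the second also
# writes a trailing NEWLINE, for a total of 6 bytes. wc -c counts the bytes.
with_n=$(echo -n "Name:" | wc -c)
without_n=$(echo "Name:" | wc -c)

echo "with -n: $with_n bytes; without -n: $without_n bytes"
```

The same technique is useful for prompts in shell scripts, where the cursor should remain on the line that displays the prompt.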

Table V-14 Backslash escape sequences (continued)

Sequence Meaning

\n NEWLINE

\t HORIZONTAL TAB

\v VERTICAL TAB

\\ BACKSLASH

expr

Evaluates an expression

expr expression

The expr utility evaluates an expression and sends the result to standard output. It evaluates character

strings that represent either numeric or nonnumeric values. Operators are used with the strings to

form expressions.

Arguments The expression is composed of strings interspersed with operators. Each string and

operator constitute a distinct argument that you must separate from other arguments

with a SPACE. You must quote operators that have special meanings to the shell

(for example, the multiplication operator, *).

The following list of expr operators is given in order of decreasing precedence. Each

operator within a group of operators has the same precedence. You can change the

order of evaluation by using parentheses.

: (comparison) Compares two strings, starting with the first character in each

string and ending with the last character in the second string. The second string

is a regular expression with an implied caret (^) as its first character. If expr

finds a match, it displays the number of characters that the regular expression

matched. If expr does not find a match, it displays a zero.

* (multiplication)

/ (division)

% (remainder)

Work only on strings that contain the numerals 0 through 9 and optionally a

leading minus sign. Convert strings to integer numbers, perform the specified

arithmetic operation on numbers, and convert the result back to a string before

sending it to standard output.

+ (addition)

– (subtraction)

Function in the same manner as the preceding group of operators.

< (less than)

<= (less than or equal to)

= or == (equal to)

!= (not equal to)

>= (greater than or equal to)

> (greater than)

Relational operators work on both numeric and nonnumeric arguments. If

one or both of the arguments are nonnumeric, the comparison is nonnumeric,

using the machine collating sequence (typically ASCII). If both arguments are

numeric, the comparison is numeric. The expr utility displays a 1 (one) if the

comparison is true and a 0 (zero) if it is false.

& (AND) Evaluates both of its arguments. If neither is 0 or a null string, expr displays

the value of the first argument. Otherwise, it displays a 0 (zero). You

must quote this operator.

| (OR) Evaluates the first argument. If it is neither 0 nor a null string, expr displays

the value of the first argument. Otherwise, it displays the value of the

second argument. You must quote this operator.

Notes The expr utility returns an exit status of 0 (zero) if the expression evaluates to anything

other than a null string or the number 0, a status of 1 if the expression is null

or 0, and a status of 2 if the expression is invalid.
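Because expr reports success or failure through its exit status, you can use it as the test in an if statement. The following is a minimal sketch that combines the exit status with the : (comparison) operator; the function name is_integer is hypothetical. The leading X on both sides is an assumption-free but unofficial idiom that keeps expr from mistaking the string for an operator and keeps a match on the string 0 from producing a false result.

```shell
# is_integer STRING: hypothetical helper built on the expr exit status.
# The : operator anchors the pattern at the start of the string; the
# trailing $ anchors it at the end, so the whole string must be digits
# with an optional leading minus sign.
is_integer() {
    # expr displays the match length; only the exit status is needed here.
    expr "X$1" : 'X-\{0,1\}[0-9][0-9]*$' > /dev/null
}

for s in 42 -7 5.3
do
    if is_integer "$s"
    then
        echo "$s: integer"
    else
        echo "$s: not an integer"
    fi
done
```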

Although expr and this discussion distinguish between numeric and nonnumeric arguments,

all arguments to expr are nonnumeric (character strings). When applicable,

expr attempts to convert an argument to a number (for example, when using the +

operator). If a string contains characters other than 0 through 9 and optionally a leading

minus sign, expr cannot convert it. Specifically, if a string contains a plus sign or a

decimal point, expr considers it to be nonnumeric. If both arguments are numeric, the

comparison is numeric. If one is nonnumeric, the comparison is lexicographic.

Examples In the following examples, expr evaluates constants. You can also use expr to evaluate

variables in a shell script. The fourth command displays an error message

because of the illegal decimal point in 5.3:

$ expr 17 + 40

57

$ expr 10 - 24

-14

$ expr -17 + 20

3

$ expr 5.3 + 4

expr: non-numeric argument
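As a sketch of how expr can evaluate variables in a shell script, the following loop uses expr to increment a counter. The variable name count and the message text are arbitrary.

```shell
# Count from 1 to 3, using expr to increment a shell variable.
count=0
while [ "$count" -lt 3 ]
do
    # Each operand and operator is a separate argument to expr.
    count=$(expr "$count" + 1)
    echo "pass $count"
done
```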

The multiplication (*), division (/), and remainder (%) operators provide additional

arithmetic power. You must quote the multiplication operator (precede it

with a backslash) so that the shell will not treat it as a special character (an ambiguous

file reference). You cannot put quotation marks around the entire expression

because each string and operator must be a separate argument.

$ expr 5 \* 4

20

$ expr 21 / 7

3

$ expr 23 % 7

2

The next two examples show how parentheses change the order of evaluation. You

must quote each parenthesis and surround the backslash/parenthesis combination

with SPACEs:

$ expr 2 \* 3 + 4

10

$ expr 2 \* \( 3 + 4 \)

14

You can use relational operators to determine the relationship between numeric or

nonnumeric arguments. The following commands compare two strings to see if they

are equal; expr displays a 0 when the relationship is false and a 1 when it is true.

$ expr fred == sam

0

$ expr sam == sam

1

In the following examples, the relational operators, which must be quoted, establish

order between numeric or nonnumeric arguments. Again, if a relationship is true,

expr displays a 1.

$ expr fred \> sam

0

$ expr fred \< sam

1

$ expr 5 \< 7

1

The next command compares 5 with m. When one of the arguments expr is comparing

with a relational operator is nonnumeric, expr considers the other to be nonnumeric.

In this case, because m is nonnumeric, expr treats 5 as a nonnumeric

argument. The comparison is between the ASCII (on many systems) values of m and

5. The ASCII value of m is 109 and that of 5 is 53, so expr evaluates the relationship

as true.

$ expr 5 \< m

1

In the next example, the matching operator determines that the four characters in

the second string match the first four characters in the first string. The expr utility

displays the number of matching characters (4).

$ expr abcdefghijkl : abcd

4

The & operator displays a 0 if one or both of its arguments are 0 or a null string;

otherwise, it displays the first argument:

$ expr '' \& book

0

$ expr magazine \& book

magazine

$ expr 5 \& 0

0

$ expr 5 \& 6

5

The | operator displays the first argument if it is not 0 or a null string; otherwise, it

displays the second argument:

$ expr '' \| book

book

$ expr magazine \| book

magazine

$ expr 5 \| 0

5

$ expr 0 \| 5

5

$ expr 5 \| 6

5
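One use of the | operator is to supply a fallback when a value is null. The following is a minimal sketch; the function name value_or_default is hypothetical. It assumes the value is not itself something expr treats as an operator (such as + or a parenthesis), and it treats the value 0 the same as a null string, as described above.

```shell
# value_or_default VALUE DEFAULT: hypothetical helper.
# Displays VALUE unless it is null or 0, in which case it displays DEFAULT.
value_or_default() {
    expr "$1" \| "$2"
}

editor=$(value_or_default "" vi)
echo "$editor"
```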

file

Displays the classification of a file

file [option] file-list

The file utility classifies files according to their contents.

Arguments The file-list is a list of the pathnames of one or more files that file classifies. You can

specify any kind of file, including ordinary, directory, and special files, in the file-list.

Options

––files-from=file –f file

Takes the names of files to be examined from file rather than from file-list on

the command line. The names of the files must be listed one per line in file.

––no-dereference –h For each file that is a symbolic link, reports on the symbolic link, not the file

the link points to. This option treats files that are not symbolic links normally.

This behavior is the default on systems where the environment variable

POSIXLY_CORRECT is not defined (typical).

––help Displays a help message.

––mime –I Displays MIME (page 966) type strings. O

––mime –i Displays MIME (page 966) type strings. L

––dereference –L For each file that is a symbolic link, reports on the file the link points to, not

the symbolic link itself. This option treats files that are not symbolic links normally.

This behavior is the default on systems where the environment variable

POSIXLY_CORRECT is defined.

––uncompress –z (zip) Attempts to classify files within a compressed file.

Notes The file utility can classify more than 5,000 file types. Some of the more common

file types found on Linux systems, as displayed by file, follow:

directory

ELF 32-bit LSB executable

empty

English text

executable

The file utility uses a maximum of three tests in its attempt to classify a file: filesystem,

magic number, and language tests. When file identifies the type of a file, it

ceases testing. The filesystem test examines the return from a stat() system call to see

whether the file is empty or a special file. The magic number (page 964) test looks

for data in particular fixed formats near the beginning of the file. The language test,

if needed, determines whether the file is a text file, which encoding it uses, and

which language it is written in. Refer to the file man page for a more detailed

description of how file works. The results of file are not always correct.

Examples Some examples of file identification follow:

/etc/Muttrc: ASCII English text

/etc/Muttrc.d: directory

/etc/adjtime: ASCII text

/etc/aliases.db: Berkeley DB (Hash, version 9, native byte-order)

/etc/at.deny: writable, regular file, no read permission

/etc/bash_completion: ASCII Pascal program text

/etc/blkid.tab.old: Non-ISO extended-ASCII text, with CR, LF line terminators

/etc/brltty.conf: UTF-8 Unicode C++ program text

/etc/chatscripts: setgid directory

/etc/magic: magic text file for file(1) cmd

/etc/motd: symbolic link to `/var/run/motd'

/etc/qemu-ifup: POSIX shell script text executable

/usr/bin/4xml: a python script text executable

/usr/bin/Xorg: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), stripped

/usr/bin/debconf: perl script text executable

/usr/bin/locate: symbolic link to `/etc/alternatives/locate'

/usr/share/man/man7/hier.7.gz: gzip compressed data, was "hier.7", from Unix, last modified: Thu Jan 31 03:06:00 2008, max compression

find

Finds files based on criteria

find [directory-list] [option] [expression]

The find utility selects files that are located in specified directory hierarchies and that meet specified

criteria.

Arguments The directory-list specifies the directory hierarchies that find is to search. When you

do not specify a directory-list, find searches the working directory hierarchy.

The option controls whether find dereferences symbolic links as it descends directory

hierarchies. By default find does not dereference symbolic links (it works with the

symbolic link, not the file the link points to). Under Mac OS X, you can use the –x

option to prevent find from searching directories in filesystems other than those specified

in directory-list. Under Linux, the –xdev criterion performs the same function.

The expression contains criteria, as described in the “Criteria” section. The find utility

tests each of the files in each of the directories in the directory-list to see whether

it meets the criteria described by the expression. When you do not specify an

expression, the expression defaults to –print.

A SPACE separating two criteria is a Boolean AND operator: The file must meet both

criteria to be selected. A –or or –o separating the criteria is a Boolean OR operator:

The file must meet one or the other (or both) of the criteria to be selected.

You can negate any criterion by preceding it with an exclamation point. The find

utility evaluates criteria from left to right unless you group them using parentheses.

Within the expression you must quote special characters so the shell does not interpret

them but rather passes them to find. Special characters that are frequently used with

find include parentheses, brackets, question marks, and asterisks.

Each element within the expression is a separate argument. You must separate

arguments from each other with SPACEs. A SPACE must appear on both sides of each

parenthesis, exclamation point, criterion, or other element.

Options –H (partial dereference) For each file that is a symbolic link, works with the file

the link points to, not the symbolic link itself. This option affects files specified

on the command line; it does not affect files found while descending a directory

hierarchy. This option treats files that are not symbolic links normally. See

page 623 for an example of the use of the –H versus –L options.

–L (dereference) For each file that is a symbolic link, works with the file the link

points to, not the symbolic link itself. This option affects all files and treats

files that are not symbolic links normally. See page 623 for an example of the

use of the –H versus –L options.

–P (no dereference) For each file that is a symbolic link, works with the symbolic

link, not the file the link points to. This option affects all files and treats files

that are not symbolic links normally. This behavior is the default. See page 625

for an example of the use of the –P option.

–x Causes find not to search directories in filesystems other than the one(s) specified

by directory-list. Under Linux, use the –xdev criterion. O

Criteria You can use the following criteria within the expression. As used in this list, ±n is a

decimal integer that can be expressed as +n (more than n), –n (fewer than n), or n

(exactly n).

–anewer filename

(accessed newer) The file being evaluated meets this criterion if it was accessed

more recently than filename.

–atime ±n

(access time) The file being evaluated meets this criterion if it was last accessed

±n days ago. When you use this option, find changes the access times of directories

it searches.

–depth

The file being evaluated always meets this action criterion. It causes find to take

action on entries in a directory before it acts on the directory itself. When you

use find to send files to the cpio utility, the –depth criterion enables cpio to preserve

the modification times of directories when you restore files (assuming you

use the ––preserve–modification–time option to cpio). See the “Discussion” and

“Examples” sections under cpio on page 647.

–exec command \;

The file being evaluated meets this action criterion if the command returns a 0

(zero [true]) exit status. You must terminate the command with a quoted semicolon.

A pair of braces ({}) within the command represents the name of the file

being evaluated. You can use the –exec action criterion at the end of a group of

other criteria to execute the command if the preceding criteria are met. Refer

to the following “Discussion” section for more information. See the section on

xargs on page 881 for a more efficient way of doing what this option does.

–group name

The file being evaluated meets this criterion if it is associated with the group

named name. You can use a numeric group ID in place of name.

–inum n

The file being evaluated meets this criterion if its inode number is n.

–links ±n

The file being evaluated meets this criterion if it has ±n links.

–mtime ±n

(modify time) The file being evaluated meets this criterion if it was last modified

±n days ago.

–name filename

The file being evaluated meets this criterion if the pattern filename matches its

name. The filename can include wildcard characters (*, ?, and []) but these

characters must be quoted.

–newer filename

The file being evaluated meets this criterion if it was modified more recently

than filename.

–nogroup

The file being evaluated meets this criterion if it does not belong to a group

known on the local system.

–nouser

The file being evaluated meets this criterion if it does not belong to a user

known on the local system.

–ok command \;

This action criterion is the same as –exec except it displays each command to

be executed enclosed in angle brackets as a prompt and executes the command

only if it receives a response that starts with a y or Y from standard input.

–perm [±]mode

The file being evaluated meets this criterion if it has the access permissions

given by mode. If mode is preceded by a minus sign (–), the file access permissions

must include all the bits in mode. For example, if mode is 644, a file with

755 permissions will meet this criterion. If mode is preceded by a plus sign (+),

the file access permissions must include at least one of the bits in mode. If no

plus or minus sign precedes mode, the mode of the file must exactly match

mode. You may use either a symbolic or octal representation for mode (see

chmod on page 626).

–print

The file being evaluated always meets this action criterion. When evaluation of

the expression reaches this criterion, find displays the pathname of the file it is

evaluating. If –print is the only criterion in the expression, find displays the

names of all files in the directory-list. If this criterion appears with other criteria,

find displays the name only if the preceding criteria are met. If no action

criteria appear in the expression, –print is assumed. (Refer to the following

“Discussion” and “Notes” sections.)

–size ±n[c|k|M|G]

The file being evaluated meets this criterion if it is the size specified by ±n,

measured in 512-byte blocks. Follow n with the letter c to measure files in

characters, k to measure files in kilobytes, M to measure files in megabytes, or

G to measure files in gigabytes.

–type filetype

The file being evaluated meets this criterion if its file type is specified by filetype.

Select a filetype from the following list:

b Block special file

c Character special file

d Directory file

f Ordinary file

l Symbolic link

p FIFO (named pipe)

s Socket

–user name

The file being evaluated meets this criterion if it belongs to the user with the

username name. You can use a numeric user ID in place of name.

–xdev

The file being evaluated always meets this action criterion. It prevents find from

searching directories in filesystems other than the one specified by directory-list.

Also –mount. Under Mac OS X, use the –x option. L
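The difference between an exact mode and a mode preceded by a minus sign, as described under the –perm criterion, can be seen with two scratch files. The following is a minimal sketch; the directory comes from mktemp (a separate utility) and the filenames script and notes are arbitrary.

```shell
# Build a scratch directory holding one file with 755 permissions and
# one with 644 permissions.
dir=$(mktemp -d)
touch "$dir/script" "$dir/notes"
chmod 755 "$dir/script"
chmod 644 "$dir/notes"

# No sign: only files whose mode is exactly 644.
exact=$(find "$dir" -type f -perm 644 | sort)

# Leading minus sign: files whose mode includes all the bits in 644.
at_least=$(find "$dir" -type f -perm -644 | sort)

echo "exactly 644:"
echo "$exact"
echo "at least 644:"
echo "$at_least"

rm -r "$dir"
```

Only notes meets the exact criterion, whereas both files meet the criterion with the minus sign because 755 includes every bit set in 644.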

Discussion Assume x and y are criteria. The following command line never tests whether the

file meets criterion y if it does not meet criterion x. Because the criteria are separated

by a SPACE (the Boolean AND operator), once find determines that criterion x is

not met, the file cannot meet the criteria so find does not continue testing. You can

read the expression as “(test to see whether) the file meets criterion x and [SPACE

means and] criterion y.”

$ find dir x y

The next command line tests the file against criterion y if criterion x is not met. The

file can still meet the criteria so find continues the evaluation. You can read the

expression as “(test to see whether) the file meets criterion x or criterion y.” If the

file meets criterion x, find does not evaluate criterion y as there is no need to do so.

$ find dir x -or y

Action criteria Certain “criteria” do not select files but rather cause find to take action. The action is

triggered when find evaluates one of these action criteria. Therefore, the position of

an action criterion on the command line—not the result of its evaluation—determines

whether find takes the action.

The –print action criterion causes find to display the pathname of the file it is testing.

The following command line displays the names of all files in the dir directory (and

all its subdirectories), regardless of whether they meet the criterion x:

$ find dir -print x

The following command line displays only the names of the files in the dir directory

that meet criterion x:

$ find dir x -print

This use of –print after the testing criteria is the default action criterion. The following

command line generates the same output as the previous one:

$ find dir x
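To see the effect of the position of –print, you can count the lines find displays in each case. The following is a minimal sketch in which –name 'a*' stands in for criterion x; the scratch directory and filenames are arbitrary.

```shell
dir=$(mktemp -d)
touch "$dir/alpha" "$dir/beta"

# -print comes first, so find displays every pathname (the directory and
# both files) before it tests the name.
print_first=$(find "$dir" -print -name 'a*' | wc -l)

# -name comes first, so -print runs only for files that pass the test.
print_last=$(find "$dir" -name 'a*' -print | wc -l)

echo "-print first: $print_first lines"
echo "-print last: $print_last lines"

rm -r "$dir"
```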

Notes You can use the –a operator between criteria to improve clarity. This operator is a

Boolean AND operator, just as the SPACE is.

You may want to consider using pax (page 786) in place of cpio.

Examples The simplest find command has no arguments and lists the files in the working directory

and all subdirectories:

$ find

...

The following command finds the files in the working directory and subdirectories

that have filenames beginning with a. The command uses a period to designate the

working directory. To prevent the shell from interpreting the a* as an ambiguous

file reference, it is enclosed within single quotation marks.

$ find . -name 'a*'

The –print criterion is implicit in the preceding command. If you omit the directory-list

argument, find searches the working directory. The next command performs the same

function as the preceding one without explicitly specifying the working directory:

$ find -name 'a*'

The next command sends a list of selected filenames to the cpio utility, which writes

them to the device mounted on /dev/sde1. The first part of the command line ends

with a pipe symbol, so the shell expects another command to follow and displays a

secondary prompt (>) before accepting the rest of the command line. You can read

this find command as “find, in the root directory and all subdirectories (/), ordinary

files (–type f) that have been modified within the past day (–mtime –1), with the

exception of files whose names are suffixed with .o (! –name '*.o').” (An object file

carries a .o suffix and usually does not need to be preserved because it can be

re-created from the corresponding source file.)

$ find / -type f -mtime -1 ! -name '*.o' -print |

> cpio -oB > /dev/sde1

The following command finds, displays the filenames of, and deletes the files named

core or junk in the working directory and its subdirectories:

$ find . \( -name core -o -name junk \) -print -exec rm {} \;

...

The parentheses and the semicolon following –exec are quoted so the shell does not

treat them as special characters. SPACEs separate the quoted parentheses from other

elements on the command line. Read this find command as “find, in the working

directory and subdirectories (.), files named core (–name core) or (–o) junk (–name

junk) [if a file meets these criteria, continue] and (SPACE) print the name of the file

(–print) and (SPACE) delete the file (–exec rm {}).”
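The quoted parentheses matter because the implied AND (SPACE) binds more tightly than –o. The following is a minimal sketch that compares the grouped and ungrouped forms in a scratch directory; it uses –print alone, without –exec rm, so nothing is deleted by find. The filename keep is arbitrary.

```shell
dir=$(mktemp -d)
touch "$dir/core" "$dir/junk" "$dir/keep"

# Grouped: -print applies to files named core or junk.
grouped=$(find "$dir" \( -name core -o -name junk \) -print | sort)

# Ungrouped: find reads this as "-name core" OR "(-name junk AND -print)",
# so it displays only junk.
ungrouped=$(find "$dir" -name core -o -name junk -print | sort)

echo "grouped:"
echo "$grouped"
echo "ungrouped:"
echo "$ungrouped"

rm -r "$dir"
```

With an action such as –exec rm in place of –print, the ungrouped form would act on junk only and leave core in place.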

The following shell script uses find in conjunction with grep to identify files that

contain a particular string. This script enables you to look for a file when you

remember its contents but cannot remember its filename. The finder script locates

files in the working directory and subdirectories that contain the string specified on

the command line. The –type f criterion causes find to pass to grep only the names of

ordinary files, not directory files.

$ cat finder

find . -type f -exec grep -l "$1" {} \;

$ finder "Executive Meeting"

./january/memo.0102

./april/memo.0415

When called with the string Executive Meeting, finder locates two files containing

that string: ./january/memo.0102 and ./april/memo.0415. The period (.) in the

pathnames represents the working directory; january and april are subdirectories of

the working directory. The grep utility with the ––recursive option performs the

same function as the finder script.
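
The equivalence between the two approaches can be checked with a short experiment. This sketch assumes a scratch directory whose memo files and contents are made up for illustration, and it adds the –l option to the recursive grep so that it lists only filenames, as finder does.

```shell
# Throwaway tree with hypothetical memo files.
scratch=$(mktemp -d)
mkdir "$scratch/january" "$scratch/april"
echo "Agenda: Executive Meeting at noon" > "$scratch/january/memo.0102"
echo "Notes from the Executive Meeting" > "$scratch/april/memo.0415"
echo "Lunch menu" > "$scratch/april/memo.0416"

# One grep process per ordinary file, as in the finder script.
by_find=$(cd "$scratch" && find . -type f -exec grep -l "Executive Meeting" {} \; | sort)

# A single recursive grep; -l lists only the names of matching files.
by_grep=$(cd "$scratch" && grep -rl "Executive Meeting" . | sort)

echo "$by_find"
rm -rf "$scratch"
```

Both commands should report the same two pathnames; the recursive grep starts one process instead of one per file.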

The next command looks in two user directories for files that are larger than 100

blocks (–size +100) and have not been accessed within the past five

days (–atime +5). This find

command then asks whether you want to delete the file (–ok rm {}). You must

respond to each query with y (for yes) or n (for no). The rm command works only if

you have write and execute access permissions to the directory.

$ find /home/max /home/hls -size +100 -atime +5 -ok rm {} \;

< rm ... /home/max/notes >? y

< rm ... /home/max/letter >? n

...

In the next example, /home/max/memos is a symbolic link to Sam’s directory named

/home/sam/memos. When you use the –follow option with find, the symbolic link is

followed, and that directory is searched.

$ ls -l /home/max

lrwxrwxrwx 1 max pubs 17 Aug 19 17:07 memos -> /home/sam/memos

-rw-r--r-- 1 max pubs 5119 Aug 19 17:08 report

$ find /home/max -print

/home/max

/home/max/memos

/home/max/report

/home/max/.profile

$ find /home/max -follow -print

/home/max

/home/max/memos

/home/max/memos/memo.817

/home/max/memos/memo.710

/home/max/report

/home/max/.profile
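
A similar experiment can be run without touching real home directories. The sketch below uses a hypothetical scratch tree that mimics the layout above and counts the pathnames find reports with and without –follow. Current versions of find also accept –L before the directory-list for the same purpose.

```shell
# Scratch layout (hypothetical): max/memos is a symbolic link to sam/memos.
scratch=$(mktemp -d)
mkdir -p "$scratch/sam/memos" "$scratch/max"
touch "$scratch/sam/memos/memo.817" "$scratch/max/report"
ln -s "$scratch/sam/memos" "$scratch/max/memos"

# Without -follow the link itself is listed but not searched.
plain=$(( $(find "$scratch/max" -print | wc -l) ))

# With -follow the linked directory is searched too.
followed=$(( $(find "$scratch/max" -follow -print | wc -l) ))

echo "without -follow: $plain  with -follow: $followed"
rm -rf "$scratch"
```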

finger

Displays information about users

finger [options] [user-list]

The finger utility displays the usernames of users, together with their full names, terminal device

numbers, times they logged in, and other information. The options control how much information

finger displays, and the user-list specifies which users finger displays information about. The finger

utility can retrieve information from both local and remote systems.

Arguments Without any arguments, finger provides a short (–s) report on users who are logged

in on the local system. When you specify a user-list, finger provides a long (–l)

report on each user in the user-list. Names in the user-list are not case sensitive.

If the name includes an at sign (@), the finger utility interprets the name following

the @ as the name of a remote host to contact over the network. If a username

appears in front of the @ sign, finger provides information about that user on the

remote system.

Options –l (long) Displays detailed information (the default display when user-list is

specified).

–m (match) If a user-list is specified, displays entries only for those users whose

username matches one of the names in user-list. Without this option the user-list

names match usernames and full names.

–p (no plan, project, or pgpkey) Does not display the contents of .plan, .project,

and .pgpkey files for users. Because these files may contain backslash escape

sequences that can change the behavior of the screen, you may not wish to

view them. Normally the long listing displays the contents of these files if they

exist in the user’s home directory.

–s (short) Provides a short report for each user (the default display when user-list

is not specified).

Discussion The long report provided by the finger utility includes the user’s username, full

name, home directory location, and login shell, plus information about when the

user last logged in and how long it has been since the user last typed on the keyboard

and read her email. After extracting this information from system files, finger

displays the contents of the ~/.plan, ~/.project, and ~/.pgpkey files in the user’s

home directory. It is up to each user to create and maintain these files, which usually

provide more information about the user (such as telephone number, postal mail

address, schedule, interests, and PGP key).

The short report generated by finger is similar to that provided by the w utility; it

includes the user’s username, his full name, the device number of the user’s terminal,

the amount of time that has elapsed since the user last typed on the terminal

keyboard, the time the user logged in, and the location of the user’s terminal. If the

user logged in over the network, finger displays the name of the remote system.

Notes Not all Linux distributions install finger by default.

When you specify a network address, the finger utility queries a standard network

service that runs on the remote system. Although this service is supplied with most

Linux systems, some administrators choose not to run it (so as to minimize the load

on their systems, eliminate possible security risks, or simply maintain privacy). If

you try to use finger to get information on someone at such a site, the result may be

an error message or nothing at all. The remote system determines how much information

to share with the local system and in which format. As a result the report

displayed for any given system may differ from the examples shown here. See also

“finger: Lists Users on the System” on page 68.

A file named ~/.nofinger causes finger to deny the existence of the person whose

home directory it appears in. For this subterfuge to work, the finger query must

originate from a system other than the local host and the fingerd daemon must be

able to see the .nofinger file (generally the home directory must have its execute bit

for other users set).

Examples The first example displays information on the users logged in on the local system:

$ finger

max Max Wild tty1 13:29 Jun 25 21:03

hls Helen Simpson *pts/1 13:29 Jun 25 21:02 (:0)

sam Sam the Great pts/2 Jun 26 07:47 (bravo.example.com)

The asterisk (*) in front of the name of Helen’s terminal (TTY) line indicates she

has blocked others from sending messages directly to her terminal (see mesg on

page 71). A long report displays the string messages off for users who have disabled

messages.

The next two examples cause finger to contact the remote system named kudos over

the network for information:

$ finger @kudos

[kudos]

max Max Wild tty1 23:15 Jun 25 11:22

roy Roy Wong pts/2 Jun 25 11:22

$ finger watson@kudos

[kudos]

Directory: /home/max Shell: /bin/zsh

On since Sat Jun 25 11:22 (PDT) on tty1, idle 23:22

Last login Sun Jun 26 06:20 (PDT) on ttyp2 from speedy

Mail last read Thu Jun 23 08:10 2005 (PDT)

Plan:

For appointments contact Sam the Great, x1963.

fmt

Formats text very simply

fmt [option] [file-list]

The fmt utility does simple text formatting by attempting to make all nonblank lines nearly the same

length.

Arguments The fmt utility reads the files in file-list and sends a formatted version of their con-

tents to standard output. If you do not specify a filename or if you specify a hyphen

(–) in place of a filename, fmt reads from standard input.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––split-only –s Splits long lines but does not fill short lines. (Linux)

–s Replaces multiple adjacent SPACE characters with a single SPACE. (OS X)

––tagged-paragraph –t Indents all but the first line of each paragraph. (Linux)

–t n Specifies n as the number of SPACEs per TAB stop. The default is eight. (OS X)

––uniform-spacing –u Changes the formatted output so that one SPACE appears between words and

two SPACEs appear between sentences. (Linux)

––width=n –w n Changes the output line length to n characters. Without this option, fmt keeps

output lines close to 75 characters wide. You can also specify this option as –n.

Notes The fmt utility works by moving NEWLINE characters. The indention of lines, as well as

the spacing between words, is left intact.

You can use this utility to format text while you are using an editor, such as vim. For

example, you can format a paragraph with the vim editor in command mode by

positioning the cursor at the top of the paragraph and then entering !}fmt –60. This

command replaces the paragraph with the output generated by feeding it through

fmt, specifying a width of 60 characters. Type u immediately if you want to undo the

formatting.

Examples The following example shows how fmt attempts to make all lines the same length.

The –w 50 option gives a target line length of 50 characters.

$ cat memo

One factor that is important to remember while administering the dietary

intake of Charcharodon carcharias is that there is, at least from

the point of view of the subject,

very little

differentiating the prepared morsels being proffered from your digits.

In other words, don't feed the sharks!

$ fmt -w 50 memo

One factor that is important to remember while

administering the dietary intake of Charcharodon

carcharias is that there is, at least from the

point of view of the subject, very little

differentiating the prepared morsels being

proffered from your digits.

In other words, don't feed the sharks!

The next example demonstrates the ––split-only option. Long lines are broken so that

none is longer than 50 characters; this option prevents fmt from filling short lines.

$ fmt -w 50 --split-only memo

One factor that is important to remember while

administering the dietary

intake of Charcharodon carcharias is that there

is, at least from

the point of view of the subject,

very little

differentiating the prepared morsels being

proffered from your digits.

In other words, don't feed the sharks!
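
Because fmt reads standard input when no file is named, it can also be used in a pipeline. The following sketch feeds it a single long line (sample text chosen for illustration) and then measures the longest output line to confirm that it stays within the width given by –w. The exact line breaks depend on the fmt implementation, so no particular output is shown.

```shell
# One long input line; the wording is only sample text.
text="One factor that is important to remember while administering the dietary intake is that there is very little differentiating the morsels from your digits."

# Format to a maximum width of 30 characters.
formatted=$(printf '%s\n' "$text" | fmt -w 30)
echo "$formatted"

# Measure the longest output line and count the lines produced.
longest=$(printf '%s\n' "$formatted" | awk 'length($0) > max { max = length($0) } END { print max + 0 }')
lines=$(( $(printf '%s\n' "$formatted" | wc -l) ))
echo "longest line: $longest characters in $lines lines"
```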

fsck

Checks and repairs a filesystem

fsck [options] [filesystem-list]

The fsck utility verifies the integrity of a filesystem and reports on and optionally repairs problems it

finds. It is a front end for filesystem checkers, each of which is specific to a certain filesystem type.

The fsck utility is available under Linux only; under OS X, use diskutil.

Arguments Without the –A option and with no filesystem-list, fsck checks the filesystems listed

in the /etc/fstab file one at a time (serially). With the –A option and with no filesystem-list,

fsck checks all the filesystems listed in the /etc/fstab file in parallel if

possible. See the –s option for a discussion of checking filesystems in parallel.

The filesystem-list specifies the filesystems to be checked. It can either specify the

name of the device that holds the filesystem (for example, /dev/hda2) or, if the filesystem

appears in /etc/fstab, specify the mount point (for example, /usr) for the

filesystem. The filesystem-list can also specify the label for the filesystem from

/etc/fstab (for example, LABEL=home).

Options When you run fsck, you can specify both global options and options specific to the

filesystem type that fsck is checking (for example, ext2, ext3, msdos, reiserfs). Global

options must precede type-specific options.

Global Options

–A (all) Processes all filesystems listed in the /etc/fstab file, in parallel if possible.

See the –s option for a discussion of checking filesystems in parallel. Do not

specify a filesystem-list when you use this option; you can specify filesystem

types to be checked with the –t option. Use this option with the –a, –p, or –n

option so fsck does not attempt to process filesystems in parallel interactively

(in which case you would have no way of responding to its multiple prompts).

–N (no) Assumes a no response to any questions that arise while processing a filesystem.

This option generates the messages you would normally see but causes

fsck to take no action.

–R (root-skip) With the –A option, does not check the root filesystem. This option

is useful when the system boots, because the root filesystem may be mounted

with read-write access.

–s (serial) Causes fsck to process filesystems one at a time. Without this option,

fsck processes multiple filesystems that reside on separate physical disk drives

in parallel. Parallel processing enables fsck to process multiple filesystems more

quickly. This option is required if you want to process filesystems interactively.

See the –a, –p, or –N (or –n, on some filesystems) option to turn off interactive

processing.

–T (title) Causes fsck not to display its title.

–t fstype

(filesystem type) A comma-separated list that specifies the filesystem type(s) to

process. With the –A option, fsck processes all the filesystems in /etc/fstab that

are of type fstype. Common filesystem types are ext2, ext3, ext4, msdos, and

reiserfs. You do not typically check remote NFS filesystems.

–V (verbose) Displays more output, including filesystem type-specific commands.

Filesystem Type-Specific Options

The following command lists the filesystem checking utilities available on the local

system. Files with the same inode numbers are linked (page 107).

$ ls -i /sbin/*fsck*

63801 /sbin/dosfsck 63856 /sbin/fsck.cramfs 63801 /sbin/fsck.msdos

63763 /sbin/e2fsck 63763 /sbin/fsck.ext2 63801 /sbin/fsck.vfat

63780 /sbin/fsck 63763 /sbin/fsck.ext3

Review the man page or give the pathname of the filesystem checking utility to

determine which options the utility accepts:

$ /sbin/fsck.ext3

Usage: /sbin/fsck.ext3 [-panyrcdfvstDFSV] [-b superblock] [-B blocksize]

[-I inode_buffer_blocks] [-P process_inode_size]

[-l|-L bad_blocks_file] [-C fd] [-j ext-journal]

[-E extended-options] device

Emergency help:

-p Automatic repair (no questions)

-n Make no changes to the filesystem

-y Assume "yes" to all questions

-c Check for bad blocks and add them to the badblock list

-f Force checking even if filesystem is marked clean

...

The following options apply to many filesystem types, including ext2 and ext3:

–a (automatic) Same as the –p option; kept for backward compatibility.

–f (force) Forces fsck to check filesystems even if they are clean. A clean filesystem

is one that was just successfully checked with fsck or was successfully

unmounted and has not been mounted since then. Clean filesystems are

skipped by fsck, which greatly speeds up system booting under normal conditions.

For information on setting up periodic, automatic filesystem checking on

ext2, ext3, and ext4 filesystems, see tune2fs (page 868).

–n (no) Same as the –N global option. Does not work with all filesystems.

–p (preen) Attempts to repair all minor inconsistencies it finds when processing a

filesystem. If any problems are not repaired, fsck terminates with a nonzero

exit status. This option runs fsck in batch mode; as a consequence, it does not

ask whether to correct each problem it finds. The –p option is commonly used

with the –A option when checking filesystems while booting Linux.

–r (interactive) Asks whether to correct or ignore each problem that is found. For

many filesystem types, this behavior is the default. This option is not available

on all filesystems.

–y (yes) Assumes a yes response to any questions that fsck asks while processing a

filesystem. Use this option with caution, as it gives fsck free rein to do what it

thinks is best to clean up a filesystem.

Notes The fsck and fsck_hfs utilities are deprecated under Mac OS X version 10.5 and

later. Apple suggests using diskutil (page 668) instead.

You can run fsck from a live or rescue CD.

When a filesystem is consistent, fsck displays a report such as the following:

# fsck -f /dev/sdb1

fsck 1.40.8 (13-Mar-2008)

e2fsck 1.40.8 (13-Mar-2008)

Pass 1: Checking inodes, blocks, and sizes

Pass 2: Checking directory structure

Pass 3: Checking directory connectivity

Pass 4: Checking reference counts

Pass 5: Checking group summary information

/dev/sdb1: 710/4153408 files (10.1% non-contiguous), 455813/8303589 blocks

Interactive mode You can run fsck either interactively or in batch mode. For many filesystems, unless

you use one of the –a, –p, –y, or –n options, fsck runs in interactive mode. In interactive

mode, if fsck finds a problem with a filesystem, it reports the problem and

allows you to choose whether to repair or ignore it. If you repair a problem you

may lose some data; however, that is often the most reasonable alternative.

Although it is technically feasible to repair files that are damaged and that fsck says

you should remove, this action is rarely practical. The best insurance against significant

loss of data is to make frequent backups.

Order of checking The fsck utility looks at the sixth column in the /etc/fstab file to determine if, and in

which order, it should check filesystems. A 0 (zero) in this position indicates the filesystem

should not be checked. A 1 (one) indicates that it should be checked first;

this status is usually reserved for the root filesystem. A 2 (two) indicates that the

filesystem should be checked after those marked with a 1.
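
To preview the order in which filesystems would be considered, you can sort the entries of an fstab file by the sixth column. The following is only a sketch: it works on a small sample file whose device names and mount points are invented for illustration; point awk at /etc/fstab to inspect a real system.

```shell
# Sample fstab contents (hypothetical devices) written to a scratch file.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
# device     mount   type  options   dump pass
/dev/sda1    /       ext3  defaults  1    1
/dev/sda2    /home   ext3  defaults  1    2
/dev/sda3    swap    swap  defaults  0    0
proc         /proc   proc  defaults  0    0
EOF

# Skip comment lines and entries whose sixth field is 0,
# then list pass number and mount point in checking order.
order=$(awk '$1 !~ /^#/ && $6 > 0 { print $6, $2 }' "$fstab" | sort -n)
echo "$order"
rm -f "$fstab"
```

Entries with a 0 in the sixth column (swap and proc in this sample) do not appear because fsck does not check them.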

fsck is a front end Similar to mkfs (page 764), fsck is a front end that calls other utilities to handle various

types of filesystems. For example, fsck calls e2fsck to check the widely used

ext2 and ext3 filesystems. Refer to the e2fsck man page for more information. Other

utilities that fsck calls are typically named fsck.type, where type is the filesystem

type. By splitting fsck in this manner, filesystem developers can provide programs to

check their filesystems without affecting the development of other filesystems or

changing how system administrators use fsck.

Boot time Run fsck on filesystems that are unmounted or are mounted readonly. When Linux

is booting, the root filesystem is first mounted readonly to allow it to be processed

by fsck. If fsck finds no problems with the root filesystem, it is then remounted

(using the remount option to the mount utility) read-write and fsck is typically run

with the –A, –R, and –p options.

lost+found When it encounters a file that has lost its link to its filename, fsck asks whether to

reconnect it. If you choose to reconnect it, the file is put in a directory named

lost+found in the root directory of the filesystem where the file was found. The

reconnected file is given its inode number as a name. For fsck to restore files in this

way, a lost+found directory must be present in the root directory of each filesystem.

For example, if a system uses the /, /usr, and /home filesystems, you should have

these three lost+found directories: /lost+found, /usr/lost+found, and

/home/lost+found. Each lost+found directory must have unused entries in which

fsck can store the inode numbers for files that have lost their links. When you create

an ext2/ext3/ext4 filesystem, mkfs (page 764) creates a lost+found directory with

the required unused entries. Alternatively, you can use the mklost+found utility to

create this directory in ext2/ext3/ext4 filesystems if needed. On other types of filesystems,

you can create the unused entries by adding many files to the directory and

then removing them. Try using touch (page 862) to create 500 entries in the

lost+found directory and then using rm to delete them.
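
The touch-and-remove technique might look like the following sketch. It runs against a stand-in directory under a scratch path rather than a real lost+found directory, and the slot.N filenames are arbitrary; whether the emptied directory retains its unused entries depends on the filesystem type.

```shell
# Stand-in for a real lost+found directory (hypothetical scratch path).
dir=$(mktemp -d)/lost+found
mkdir "$dir"

# Create 500 empty files to force the directory to grow.
i=1
while [ "$i" -le 500 ]; do
    touch "$dir/slot.$i"
    i=$((i + 1))
done
created=$(( $(ls "$dir" | wc -l) ))

# Remove the files, leaving an empty directory behind.
rm -f "$dir"/slot.*
left=$(( $(ls "$dir" | wc -l) ))
echo "created $created entries, $left remain"

rm -rf "${dir%/*}"
```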

Messages Table V-15 lists fsck’s common messages. In general fsck suggests the most logical

way of dealing with a problem in the file structure. Unless you have information

that suggests another response, respond to the prompts with yes. Use the system

backup tapes or disks to restore data that is lost as a result of this process.

Table V-15 Common fsck messages

Phase (message)                                 What fsck checks
Phase 1 - Checking inodes, blocks, and sizes    Checks inode information.
Phase 2 - Checking directory structure          Looks for directories that point to bad inodes that fsck found in Phase 1.
Phase 3 - Checking directory connectivity       Looks for unreferenced directories and a nonexistent or full lost+found directory.
Phase 4 - Checking reference counts             Checks for unreferenced files, a nonexistent or full lost+found directory, bad link counts, bad blocks, duplicated blocks, and incorrect inode counts.
Phase 5 - Checking group summary information    Checks whether the free list and other filesystem structures are OK. If any problems are found with the free list, Phase 6 is run.
Phase 6 - Salvage free list                     If Phase 5 found any problems with the free list, Phase 6 fixes them.

Cleanup

Once it has repaired the filesystem, fsck informs you about the status of the filesystem.

The fsck utility displays the following message after it repairs a filesystem:

*****File System Was Modified*****

On ext2/ext3/ext4 filesystems, fsck displays the following message when it has finished

checking a filesystem:

filesys: used/maximum files (percent non-contiguous), used/maximum blocks

This message tells you how many files and disk blocks are in use as well as how

many files and disk blocks the filesystem can hold. The percent non-contiguous tells

you how fragmented the disk is.

ftp

Transfers files over a network

ftp [options] [remote-system]

The ftp utility is a user interface to the standard File Transfer Protocol (FTP), which transfers files

between systems that can communicate over a network. To establish an FTP connection, you must

have access to an account (personal, guest, or anonymous) on the remote system.

Arguments The remote-system is the name or network address of the server, running an FTP

daemon (e.g., ftpd, vsftpd, or sshd), you want to exchange files with.

Options –i (interactive) Turns off prompts during file transfers with mget and mput. See

also prompt.

–n (no automatic login) Disables automatic logins.

–p (passive mode) Starts ftp in passive mode (page 706).

–v (verbose) Tells you more about how ftp is working. Displays responses from

the remote-system and reports transfer times and speeds.

Discussion The ftp utility is interactive. After you start it, ftp prompts you to enter commands to

set parameters and transfer files. You can use a number of commands in response to

the ftp> prompt; following are some of the more common ones.

![command] Escapes to (spawns) a shell on the local system; use CONTROL-D or exit to return to ftp

when you are finished using the local shell. Follow the exclamation point with a

command to execute that command only; ftp returns to the ftp> prompt when the

command completes executing. Because the shell that ftp spawns with this command

is a child of the shell that is running ftp, no changes you make in this shell are

preserved when you return to ftp. Specifically, when you want to copy files to a local

directory other than the directory that you started ftp from, you need to use the ftp

lcd command to change the local working directory: Issuing a cd command in the

spawned shell will not make the change you desire. See page 709 for an example.

ascii Sets the file transfer type to ASCII. This command allows you to transfer text files

from systems that end lines with a RETURN/LINEFEED combination and automatically strip

off the RETURN. Such a transfer is useful when the remote computer is a DOS or MS

Windows machine. The cr command must be ON for ascii to work.

Use FTP only to download public information

security FTP is not a secure protocol. The ftp utility sends your password over the network as cleartext,

which is not a secure practice. You can use sftp as a secure replacement for ftp if the server is

running OpenSSH. You can also use scp (page 810) for many FTP functions other than allowing

anonymous users to download information. Because scp uses an encrypted connection, user

passwords and data cannot be sniffed.

binary Sets the file transfer type to binary. This command allows you to transfer files that

contain non-ASCII (unprintable) characters correctly. It also works for ASCII files

that do not require changes to the ends of lines.

bye Closes the connection to a remote system and terminates ftp. Same as quit.

cd remote-directory

Changes to the working directory named remote-directory on the remote system.

close Closes the connection with the remote system without exiting from ftp.

cr (carriage return) Toggles RETURN stripping when you retrieve files in ASCII mode. See

ascii.

dir [directory [file]]

Displays a listing of the directory named directory from the remote system. When

you do not specify directory, the working directory is displayed. When you specify

a file, the listing is saved on the local system in a file named file.

get remote-file [local-file]

Copies remote-file to the local system under the name local-file. Without local-file,

ftp uses remote-file as the filename on the local system. The remote-file and local-file

names can be pathnames.

glob Toggles filename expansion for the mget and mput commands and displays the

current state (Globbing on or Globbing off).

help Displays a list of commands recognized by the ftp utility on the local system.

lcd [local_directory]

(local change directory) Changes the working directory on the local system to

local_directory. Without an argument, this command changes the working directory

on the local system to your home directory (just as cd does without an

argument).

ls [directory [file]] Similar to dir but produces a more concise listing on some remote systems.

mget remote-file-list

(multiple get) Unlike the get command, allows you to retrieve multiple files from

the remote system. You can name the remote files literally or use wildcards (see

glob). See also prompt.

mput local-file-list

(multiple put) The mput command allows you to copy multiple files from the local

system to the remote system. You can name the local files literally or use wildcards

(see glob). See also prompt.

open Interactively specifies the name of the remote system. This command is useful if you

did not specify a remote system on the command line or if the attempt to connect to

the system failed.

passive Toggles between the passive (PASV, the default) and active (PORT) transfer modes

and displays the transfer mode. See “Passive versus active connections” under the

“Notes” section.

prompt When using mget or mput to receive or send multiple files, ftp asks for verification

(by default) before transferring each file. This command toggles that behavior and

displays the current state (Interactive mode off or Interactive mode on).

put local-file [remote-file]

Copies local-file to the remote system under the name remote-file. Without remote-file,

ftp uses local-file as the filename on the remote system. The remote-file and

local-file names can be pathnames.

pwd Causes ftp to display the pathname of the remote working directory. Use !pwd to

display the name of the local working directory.

quit Closes the connection to a remote system and terminates ftp. Same as bye.

reget remote-file Attempts to resume an aborted transfer. This command is similar to get, but instead

of overwriting an existing local file, ftp appends new data to it. Not all servers support

reget.

user [username] If the ftp utility did not log you in automatically, you can specify your account name

as username. If you omit username, ftp prompts you for a username.

Notes A Linux or Mac OS X system running ftp can exchange files with any of the many

operating systems that support the FTP protocol. Many sites offer archives of free

information on an FTP server, although many of these FTP sites are merely alternatives

to an easier-to-access Web site (for example, ftp://ftp.ibiblio.org/pub/Linux and

http://www.ibiblio.org/pub/Linux). Most browsers can connect to and download

files from FTP servers.

The ftp utility makes no assumptions about filesystem naming or structure because

you can use ftp to exchange files with non-UNIX/Linux systems (whose filename

conventions may be different).

Anonymous FTP Many systems, most notably those from which you can download free software,

allow you to log in as anonymous. Most systems that support anonymous

logins accept the name ftp as an easier-to-spell and quicker-to-enter synonym for

anonymous. An anonymous user is usually restricted to a portion of a filesystem set

aside to hold files that are to be shared with remote users. When you log in as an

anonymous user, the server prompts you to enter a password. Although any password

may be accepted, by convention you are expected to supply your email

address. Many systems that permit anonymous access store interesting files in the

pub directory.

Passive versus active connections A client can ask an FTP server to establish either a PASV (passive, the default) or

PORT (active) connection for data transfer. Some servers are limited to one type of

connection. The difference between passive and active FTP connections lies in

whether the client or server initiates the data connection. In passive mode, the client

initiates the data connection to a port that the server chooses and reports in its reply

to the PASV command; in active mode, the server initiates the data connection (from port 20

by default) to a port that the client specifies. Neither type of

connection is inherently more secure. Passive connections are more common

because a client behind a NAT (page 967) firewall can connect to a passive server

and because it is simpler to program a scalable passive server.
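
In passive mode the server announces where it is listening in a 227 reply such as the ones shown in the examples later in this section. According to the FTP specification (this detail comes from the protocol, not from the text above), the six numbers in parentheses are the four bytes of the IP address followed by the high and low bytes of the port number. The following sketch decodes one such reply; the variable names are arbitrary.

```shell
# A 227 reply of the kind a server sends when it enters passive mode.
reply='227 Entering Passive Mode (192,168,0,6,79,105)'

# Keep only the six comma-separated numbers between the parentheses.
fields=$(printf '%s\n' "$reply" | sed 's/.*(\(.*\)).*/\1/')

# Split on commas: four address bytes, then two port bytes.
IFS=, read -r h1 h2 h3 h4 p1 p2 <<EOF
$fields
EOF

host="$h1.$h2.$h3.$h4"
port=$((p1 * 256 + p2))   # high byte times 256 plus low byte
echo "$host $port"
```

For this reply the client would open its data connection to port 20329 on 192.168.0.6.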

Automatic login You can store server-specific FTP username and password information so that you

do not have to enter it each time you visit an FTP site. Each line of the ~/.netrc file

identifies a server. When you connect to an FTP server, ftp reads ~/.netrc to determine

whether you have an automatic login set up for that server. The format of a

line in ~/.netrc is

machine server login username password passwd

where server is the name of the server, username is your username, and passwd is

your password on server. Replace machine with default on the last line of the file to

specify a username and password for systems not listed in ~/.netrc. The default line

is useful for logging in on anonymous servers. A sample ~/.netrc file follows:

$ cat ~/.netrc

machine bravo login max password mypassword

default login anonymous password max@example.com

To protect the account information in .netrc, make it readable by only the user whose

home directory it appears in. Refer to the netrc man page for more information.
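
One way to set up such a file with restrictive permissions is sketched below. To avoid overwriting a real ~/.netrc, the sketch writes to a scratch file that stands in for it; the entries repeat the sample shown above.

```shell
# Scratch file standing in for ~/.netrc (hypothetical location).
netrc=$(mktemp)
cat > "$netrc" <<'EOF'
machine bravo login max password mypassword
default login anonymous password max@example.com
EOF

# Allow only the owner to read and write the file.
chmod 600 "$netrc"

# Show the permission string that ls reports.
perms=$(ls -l "$netrc" | cut -c1-10)
echo "$perms"
rm -f "$netrc"
```

On a real system you would apply the same chmod command to ~/.netrc itself.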

Examples Following are two ftp sessions wherein Max transfers files from and to an FTP

server named bravo. When Max gives the command ftp bravo, the local ftp client

connects to the server, which asks for a username and a password. Because he is

logged in on his local system as max, ftp suggests that he log in on bravo as max. To

log in as max, he could simply press RETURN. His username on bravo is watson, however,

so he types watson in response to the Name (bravo:max): prompt. Max

responds to the Password: prompt with his normal (remote) system password, and

the FTP server greets him and informs him that it is Using binary mode to transfer

files. With ftp in binary mode, Max can transfer ASCII and binary files.

Connect and log in

$ ftp bravo

Connected to bravo.

220 (vsFTPd 2.0.7)

530 Please login with USER and PASS.

KERBEROS_V4 rejected as an authentication type

Name (bravo:max): watson

331 Please specify the password.

Password:

230 Login successful.

Remote system type is UNIX.

Using binary mode to transfer files.

ftp>

After logging in, Max uses the ftp ls command to display the contents of his remote

working directory, which is his home directory on bravo. Then he cds to the memos

directory and displays the files there.

ls and cd

ftp> ls

227 Entering Passive Mode (192,168,0,6,79,105)

150 Here comes the directory listing.

drwxr-xr-x 2 500 500 4096 Oct 10 23:52 expenses

drwxr-xr-x 2 500 500 4096 Oct 10 23:59 memos

drwxrwxr-x 22 500 500 4096 Oct 10 23:32 tech

226 Directory send OK.

ftp> cd memos

250 Directory successfully changed.

ftp> ls

227 Entering Passive Mode (192,168,0,6,114,210)

150 Here comes the directory listing.

-rw-r--r-- 1 500 500 4770 Oct 10 23:58 memo.0514

-rw-r--r-- 1 500 500 7134 Oct 10 23:58 memo.0628

-rw-r--r-- 1 500 500 9453 Oct 10 23:58 memo.0905

-rw-r--r-- 1 500 500 3466 Oct 10 23:59 memo.0921

-rw-r--r-- 1 500 500 1945 Oct 10 23:59 memo.1102

226 Directory send OK.

Next Max uses the ftp get command to copy memo.1102 from the server to the

local system. His use of binary mode ensures that he will get a good copy of the file

regardless of whether it is in binary or ASCII format. The server confirms that the

file was copied successfully and notes the size of the file and the time it took to copy.

Max then copies the local file memo.1114 to the remote system. The file is copied

into his remote working directory, memos.

get and put

ftp> get memo.1102

local: memo.1102 remote: memo.1102

227 Entering Passive Mode (192,168,0,6,194,214)

150 Opening BINARY mode data connection for memo.1102 (1945 bytes).

226 File send OK.

1945 bytes received in 7.1e-05 secs (2.7e+04 Kbytes/sec)

ftp> put memo.1114

local: memo.1114 remote: memo.1114

227 Entering Passive Mode (192,168,0,6,174,97)

150 Ok to send data.

226 File receive OK.

1945 bytes sent in 2.8e-05 secs (6.8e+04 Kbytes/sec)

After a while Max decides he wants to copy all the files in the memos directory on

bravo to a new directory on the local system. He gives an ls command to make sure

he is going to copy the right files, but ftp has timed out. Instead of exiting from ftp

and giving another ftp command from the shell, Max gives ftp an open bravo com-

mand to reconnect to the server. After logging in, he uses the ftp cd command to

change directories to memos on the server.


Timeout and open

ftp> ls

421 Timeout.

Passive mode refused.

ftp> open bravo

Connected to bravo (192.168.0.6).

220 (vsFTPd 1.1.3)

...

ftp> cd memos

250 Directory successfully changed.

lcd (local cd) At this point, Max realizes he has not created the new directory to hold the files he

wants to download. Giving an ftp mkdir command would create a new directory on

the server, but Max wants a new directory on the local system. He uses an exclama-

tion point (!) followed by a mkdir memos.hold command to invoke a shell and run

mkdir on the local system, thereby creating a directory named memos.hold in his

working directory on the local system. (You can display the name of your working

directory on the local system with !pwd.) Next, because he wants to copy files from

the server to the memos.hold directory on his local system, Max has to change his

working directory on the local system. Giving the command !cd memos.hold will

not accomplish what Max wants to do because the exclamation point spawns a new

shell on the local system and the cd command would be effective only in the new

shell, which is not the shell that ftp is running under. For this situation, ftp provides

the lcd (local cd) command, which changes the working directory for ftp and reports

on the new local working directory.
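The reason !cd has no lasting effect can be seen outside of ftp as well. The following sketch, run from an ordinary shell, starts a child shell that changes directories and then exits; the scratch directory and variable names are assumptions made for the example, and sh -c stands in for what the exclamation point does.

```shell
# Start from a scratch directory that contains a subdirectory named memos.hold.
start_dir=$(mktemp -d)
cd "$start_dir" || exit 1
mkdir memos.hold

before=$(pwd)
# The child shell changes ITS working directory and then exits.
sh -c 'cd memos.hold'
after=$(pwd)

# The parent shell never moved, so both lines show the same directory.
echo "before: $before"
echo "after:  $after"
```

Because a working directory belongs to each process separately, only a command built into ftp itself, such as lcd, can change the directory that ftp uses.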

ftp> !mkdir memos.hold

ftp> lcd memos.hold

Local directory now /home/max/memos.hold

Max uses the ftp mget (multiple get) command followed by the asterisk (*) wildcard

to copy all the files from the remote memos directory to the memos.hold directory

on the local system. When ftp prompts him for the first file, he realizes that he forgot

to turn off the prompts, so he responds with n and presses CONTROL-C to stop copying

files in response to the second prompt. The ftp client then asks whether he wants to continue

with his mget command.

Next Max gives the ftp prompt command, which toggles the prompt action (turns it

off if it is on and turns it on if it is off). Now when he gives an mget * command, ftp

copies all the files without prompting him.

After getting the files he wants, Max gives a quit command to close the connection

with the server, exit from ftp, and return to the local shell prompt.

mget and prompt

ftp> mget *

mget memo.0514? n

mget memo.0628? CONTROL-C

Continue with mget? n

ftp> prompt

Interactive mode off.


ftp> mget *

local: memo.0514 remote: memo.0514

227 Entering Passive Mode (192,168,0,6,53,55)

150 Opening BINARY mode data connection for memo.0514 (4770 bytes).

226 File send OK.

4770 bytes received in 8.8e-05 secs (5.3e+04 Kbytes/sec)

local: memo.0628 remote: memo.0628

227 Entering Passive Mode (192,168,0,6,65,102)

150 Opening BINARY mode data connection for memo.0628 (7134 bytes).

226 File send OK.

...

150 Opening BINARY mode data connection for memo.1114 (1945 bytes).

226 File send OK.

1945 bytes received in 3.9e-05 secs (4.9e+04 Kbytes/sec)

ftp> quit

221 Goodbye.


gawk

Searches for and processes patterns in a file

gawk [options] [program] [file-list]

gawk [options] –f program-file [file-list]

AWK is a pattern-scanning and processing language that searches one or more files for records

(usually lines) that match specified patterns. It processes lines by performing actions, such as writ-

ing the record to standard output or incrementing a counter, each time it finds a match. As opposed

to procedural languages, AWK is data driven: You describe the data you want to work with and tell

AWK what to do with the data once it finds it.

See Chapter 12 for information on gawk

tip See Chapter 12 starting on page 531 for information on the awk, gawk, and mawk implementations

of the AWK language.


gcc

Compiles C and C++ programs

gcc [options] file-list [–larg]

g++ [options] file-list [–larg]

The Linux and Mac OS X operating systems use the GNU C compiler, gcc, to preprocess, compile,

assemble, and link C language source files. The same compiler with a different front end, g++, pro-

cesses C++ source code. The gcc and g++ compilers can also assemble and link assembly language

source files, link object files only, or build object files for use in shared libraries.

These compilers take input from files you specify on the command line. Unless you use the –o

option, they store the executable program in a.out.

The gcc and g++ compilers are part of GCC, the GNU Compiler Collection, which includes front

ends for C, C++, Objective C, Fortran, Java, and Ada as well as libraries for these languages. Go to

gcc.gnu.org for more information.

Arguments The file-list is a list of files gcc is to process.

Options Without any options gcc accepts C language source files, assembly language files,

object files, and other files described in Table V-16 on page 715. The gcc utility pre-

processes, compiles, assembles, and links these files as appropriate, producing an

executable file named a.out. If gcc is used to create object files without linking them

to produce an executable file, each object file is named by adding the extension .o to

the basename of the corresponding source file. If gcc is used to create an executable

file, it deletes the object files after linking.

Some of the most commonly used options are listed here. When certain filename

extensions are associated with an option, you can assume gcc adds the extension to

the basename of the source file.

–c (compile) Suppresses the linking step of compilation. Compiles and/or assem-

bles source code files and leaves the object code in files with the extension .o.

–Dname[=value]

Usually #define preprocessor directives are given in header, or include, files.

You can use this option to define symbolic names on the command line

instead. For example, –DLinux is equivalent to placing the line #define Linux

in an include file, and –DMACH=i586 is the same as #define MACH i586.

gcc and g++

tip Although this section specifies the gcc compiler, most of it applies to g++ as well.


–E (everything) For source code files, suppresses all steps of compilation except

preprocessing and writes the result to standard output. By convention the

extension .i is used for preprocessed C source and .ii for preprocessed C++

source.

–fpic

Causes gcc to produce position-independent code, which is suitable for installing

into a shared library.

–fwritable-strings

By default the GNU C compiler places string constants into protected memory,

where they cannot be changed. Some (usually older) programs assume you can

modify string constants. This option changes the behavior of gcc so that string

constants can be modified. GCC 4.0 and later releases no longer accept this option.

–g (gdb) Embeds diagnostic information in the object files. This information is

used by symbolic debuggers, such as gdb. Although this option is necessary

only if you later use a debugger, it is a good practice to include it as a matter of

course.

–Idirectory

Looks for include files in directory before looking in the standard locations.

Give this option multiple times to look in more than one directory.

–larg

(lowercase “l”) Searches the directories /lib and /usr/lib for a library file

named libarg.a. If the file is found, gcc then searches this library for any

required functions. Replace arg with the name of the library you want to

search. For example, the –lm option normally links the standard math library

libm.a. The position of this option is significant: It generally needs to appear at

the end of the command line but can be repeated multiple times to search dif-

ferent libraries. Libraries are searched in the order in which they appear on the

command line. The linker uses the library only to resolve undefined symbols

from modules that precede the library option on the command line. You can

add other library paths to search for libarg.a using the –L option.

–Ldirectory

Adds directory to the list of directories to search for libraries given with the –l

option. Directories that are added to the list with –L are searched before gcc

looks in the standard locations for libraries.

–o file

(output) Names the executable program that results from linking file instead

of a.out.

–On

(optimize) Attempts to improve (optimize) the object code produced by the

compiler. The value of n may be 0, 1, 2, or 3 (or 06 if you are compiling code

for the Linux kernel). When you give –O without n, gcc uses level 1; without any

–O option, gcc does not optimize. Larger values of n result in

better optimization but may increase both the size of the object file and the

time it takes gcc to run. Specify –O0 to turn off optimization. Many related

options control precisely the types of optimizations attempted by gcc when

you use –O. Refer to the gcc info page for details.

–pedantic

The C language accepted by the GNU C compiler includes features that are not

part of the ANSI standard for the C language. Using this option forces gcc to

reject these language extensions and accept only standard C programming lan-

guage features.

–Q Displays the names of functions as gcc compiles them. This option also displays

statistics about each pass.

–S (suppress) Suppresses the assembling and linking steps of compilation on

source code files. The resulting assembly language files have .s filename

extensions.

–traditional

Causes gcc to accept only C programming language features that existed in the

traditional Kernighan and Ritchie C programming language. With this option,

older programs written using the traditional C language (which existed before

the ANSI standard C language was defined) can be compiled correctly.

–Wall

Causes gcc to warn you about questionable code in the source code files. Many

related options control warning messages more precisely.

Notes The preceding list of options represents only a small fraction of the full set of

options available with the GNU C compiler. See the gcc info page for a complete list.

Although the –o option is generally used to specify a filename in which to store

object code, this option also allows you to name files resulting from other compila-

tion steps. In the following example, the –o option causes the assembly language

produced by the gcc command to be stored in the file acode instead of pgm.s, the

default:

$ gcc -S -o acode pgm.c

The lint utility found in many UNIX systems is not available on Linux or Mac OS X.

However, the –Wall option performs many of the same checks and can be used in

place of lint.

Table V-16 summarizes the conventions used by the C compiler for assigning file-

name extensions.


Examples The first example compiles, assembles, and links a single C program, compute.c.

The executable output is stored in a.out. The gcc utility deletes the object file.

$ gcc compute.c

The next example compiles the same program using the C optimizer (–O option). It

assembles and links the optimized code. The –o option causes gcc to store the exe-

cutable output in compute.

$ gcc -O -o compute compute.c

Next a C source file, an assembly language file, and an object file are compiled,

assembled, and linked. The executable output is stored in progo.

$ gcc -o progo procom.c profast.s proout.o

In the next example, gcc searches the standard math library found in /lib/libm.a

when it is linking the himath program and stores the executable output in a.out:

$ gcc himath.c -lm

In the following example, the C compiler compiles topo.c with options that check

the code for questionable source code practices (–Wall option) and violations of the

ANSI C standard (–pedantic option). The –g option embeds debugging support in

the executable file, which is saved in topo with the –o topo option. Full optimiza-

tion is enabled with the –O3 option.

The warnings produced by the C compiler are sent to standard error. In this

example the first and last warnings result from the –pedantic option; the other

warnings result from the –Wall option.

Table V-16 Filename extensions

Extension          Type of file

.a                 Library of object modules
.c                 C language source file
.C, .cc, or .cxx   C++ language source file
.i                 Preprocessed C language source file
.ii                Preprocessed C++ language source file
.o                 Object file
.s                 Assembly language source file
.S                 Assembly language source file that needs preprocessing


$ gcc -Wall -g -O3 -pedantic -o topo topo.c

In file included from topo.c:2:

/usr/include/ctype.h:65: warning: comma at end of enumerator list

topo.c:13: warning: return-type defaults to 'int'

topo.c: In function 'main':

topo.c:14: warning: unused variable 'c'

topo.c: In function 'getline':

topo.c:44: warning: 'c' might be used uninitialized in this function

When compiling programs that rely on the X11 include files and libraries, you may

need to use the –I and –L options to tell gcc where to locate those include files and

libraries. The next example uses those options and instructs gcc to link the program

with the basic X11 library:

$ gcc -I/usr/X11R6/include plot.c -L/usr/X11R6/lib -lX11
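The –D option described under "Options" can be tried with a small program. The following is a sketch under stated assumptions: level.c and the symbol LEVEL are hypothetical names invented for this example, and the script falls back to a message when gcc is not installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# level.c is a made-up program that prints the value of the LEVEL symbol.
cat > level.c <<'EOF'
#include <stdio.h>

#ifndef LEVEL
#define LEVEL 1   /* used only when -DLEVEL=... is not given */
#endif

int main(void)
{
    printf("LEVEL is %d\n", LEVEL);
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # -DLEVEL=3 acts like "#define LEVEL 3" at the top of the source file;
    # -o names the executable level instead of a.out.
    gcc -DLEVEL=3 -o level level.c
    level_out=$(./level)
else
    level_out="gcc not installed"
fi
echo "$level_out"
```

Compiling the same file without –DLEVEL=3 would leave the fallback #define in effect, so the program would report 1 instead of 3.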


GetFileInfo O

Displays file attributes

GetFileInfo [option] file

The GetFileInfo utility displays file attributes (page 931), including the file’s type and creator code,

creation and last modification times, and attribute flags such as the invisible and locked flags.

Arguments The file specifies a single file or a directory that GetFileInfo displays information

about.

Options The options for GetFileInfo correspond to the options for SetFile (page 813).

Without an option, GetFileInfo reports on the metadata of file, indicating the flags

that are set, the file’s type and creator codes, and its creation and modification

dates. Missing data is omitted. When you specify an option, GetFileInfo displays the

information specified by that option only. This utility accepts a single option; it

silently ignores additional options.

–aflag

(attribute) Reports the status of a single attribute flag named flag. This option

displays 1 if flag is set and 0 if flag is not set. The flag must follow the –a

immediately, without any intervening SPACEs. See Table D-2 on page 931 for a

list of attribute flags.

–c (creator) Displays the creator code of file. If file is a directory and has no creator

code, this option displays an error message.

–d (date) Displays the creation date of file as mm/dd/yyyy hh:mm:ss, using a

24-hour clock.

–m (modification) Displays the modification date of file as mm/dd/yyyy

hh:mm:ss, using a 24-hour clock.

–P (no dereference) For each file that is a symbolic link, displays information

about the symbolic link, not the file the link points to. This option affects all

files and treats files that are not symbolic links normally. See page 625 for an

example of the –P option.

–t (type) Displays the type code of file. If file is a directory and has no type code,

this option displays an error message.

Discussion Without an option, GetFileInfo displays flags as the string avbstclinmedz, with

uppercase letters denoting which flags are set. See page 931 for a discussion of

attribute flags.


Notes You can use the SetFile utility (page 813) to set file attributes. You can set Mac OS X

permissions and ownership (page 93) using chmod (page 626) or chown (page 631),

and you can display this information using ls (page 745) or stat (page 835).

Directories do not have type or creator codes, and they may not have all flags. The

GetFileInfo utility cannot read special files such as device files.

Examples The first example shows the output from GetFileInfo when you call it without an

option.

$ GetFileInfo picture.jpg

file: "/private/tmp/picture.jpg"

type: "JPEG"

creator: "GKON"

attributes: avbstClinmedz

created: 07/18/2005 15:15:26

modified: 07/18/2005 15:15:26

The only uppercase letter on the attributes line is C, indicating that this flag is set.

The c flag tells the Finder to look for a custom icon for this file. See Table D-2 on

page 931 for a list of flags.

The next example uses the –a flag to display the attribute flags for a file:

$ GetFileInfo -a /Applications/Games/Alchemy/Alchemy

avBstclInmedz

The output shows that the b and i flags are set.

The GetFileInfo utility can process only one file each time you call it. The following

multiline bash command uses a for loop (page 411) to display the creator codes of

multiple files. The echo command displays the name of the file being examined

because GetFileInfo does not always display the name of the file:

$ for i in *

> do echo -n "$i: "; GetFileInfo -c "$i"

> done

Desktop: Desktop is a directory and has no creator

Documents: Documents is a directory and has no creator

...

aa: ""

ab: ""

...


grep

Searches for a pattern in files

grep [options] pattern [file-list]

The grep utility searches one or more text files, line by line, for a pattern, which can be a simple

string or another form of a regular expression. The grep utility takes various actions, specified by

options, each time it finds a line that contains a match for the pattern. This utility takes its input

either from files you specify on the command line or from standard input.

Arguments The pattern is a regular expression, as defined in Appendix A. You must quote

regular expressions that contain special characters, SPACEs, or TABs. An easy way to quote

these characters is to enclose the entire expression within single quotation marks.

The file-list is a list of the pathnames of ordinary text files that grep searches. With

the –r option, file-list may contain directories whose contents are searched.

Options Without any options grep sends lines that contain a match for pattern to standard

output. When you specify more than one file on the command line, grep precedes

each line it displays with the name of the file it came from, followed by a colon.

Major Options

You can use only one of the following three options at a time. Normally you do not

need to use any, because grep defaults to –G, which is regular grep.

–E (extended) Interprets pattern as an extended regular expression (page 895).

The command grep –E is the same as egrep. See the “Notes” section.

–F (fixed) Interprets pattern as a fixed string of characters. The command grep –F

is the same as fgrep.

–G (grep) Interprets pattern as a basic regular expression. This is the default major

option if you do not specify a major option.
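The difference between the default –G behavior and –F shows up as soon as the pattern contains a character that is special in a regular expression. The following sketch uses a made-up file named dots.txt; the filename and its contents are assumptions for illustration only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# dots.txt is a hypothetical sample file with three short lines.
printf '%s\n' 'a.c' 'abc' 'a-c' > dots.txt

# Default (-G): the period matches any single character, so all lines match.
regex_hits=$(grep -c 'a.c' dots.txt)

# -F: the period is an ordinary character, so only the first line matches.
fixed_hits=$(grep -c -F 'a.c' dots.txt)

echo "regex: $regex_hits  fixed: $fixed_hits"
```

The –F option is a convenient choice when the string you are searching for comes from a filename, a version number, or other text that happens to contain periods, asterisks, or brackets.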

Other Options

The grep utility accepts the common options described on page 603.

––count –c Displays only the number of lines that contain a match in each file.

––context=n –C n

Displays n lines of context around each matching line.

The Mac OS X version of grep accepts long options

tip Options for grep preceded by a double hyphen (––) work under Mac OS X as well as under Linux.


––file=file –f file

Reads file, which contains one pattern per line, and finds lines in the input that

match each of the patterns.

––no-filename –h Does not display the filename at the beginning of each line when searching

through multiple files.

––ignore-case –i Causes lowercase letters in the pattern to match uppercase letters in the file,

and vice versa. Use this option when you are searching for a word that may

appear at the beginning of a sentence (that is, the word may or may not start

with an uppercase letter).

––files-with-matches

–l (lowercase “l”; list) Displays only the name of each file that contains one or

more matches. A filename is displayed only once, even if the file contains more

than one match.

––max-count=n –m n

Stops reading each file, or standard input, after displaying n lines containing

matches.

––line-number –n Precedes each line by its line number in the file. The file does not need to contain

line numbers.

––quiet or

––silent

–q Does not write anything to standard output; only sets the exit code.

––recursive –r or –R

Recursively descends directories in the file-list and processes files within these

directories.

––no-messages –s (silent) Does not display an error message if a file in the file-list does not exist

or is not readable.

––invert-match –v Causes lines not containing a match to satisfy the search. When you use this

option by itself, grep displays all lines that do not contain a match for the pattern.

––word-regexp –w With this option, the pattern must match a whole word. This option is helpful

if you are searching for a specific word that may also appear as a substring of

another word in the file.

––line-regexp –x The pattern matches whole lines only.

Notes The grep utility returns an exit status of 0 if it finds a match, 1 if it does not find a

match, and 2 if the file is not accessible or the grep command contains a syntax

error.
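The three exit status values are what make grep useful in shell scripts, usually together with the –q option. The following sketch records each value; it recreates the file testa used in the examples that follow, and the variable names are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' aaabb bbbcc ff-ff cccdd dddaa > testa

# Each variable keeps 0 unless grep returns a nonzero exit status.
found=0;   grep -q bb testa || found=$?                     # a line matches
missing=0; grep -q zz testa || missing=$?                   # no line matches
failed=0;  grep -q bb nosuchfile 2>/dev/null || failed=$?   # file cannot be read

echo "$found $missing $failed"

# Typical use: let the exit status drive a control structure.
if grep -q bb testa; then
    echo "testa contains bb"
fi
```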

egrep and fgrep Two utilities perform functions similar to that of grep. The egrep utility (same as

grep –E) allows you to use extended regular expressions (page 895), which include a

different set of special characters than basic regular expressions (page 893). The


fgrep utility (same as grep –F) is fast and compact but processes only simple strings,

not regular expressions.

GNU grep, which runs under Linux and Mac OS X, supports the same features in basic

and extended regular expressions; the two differ only in which characters must be

preceded by a backslash to be special. Thus egrep is virtually the same as grep. Refer

to the grep info page for a minimal distinction.

Examples The following examples assume that the working directory contains three files:

testa, testb, and testc.

File testa    File testb    File testc

aaabb         aaaaa         AAAAA
bbbcc         bbbbb         BBBBB
ff-ff         ccccc         CCCCC
cccdd         ddddd         DDDDD
dddaa

The grep utility can search for a pattern that is a simple string of characters. The fol-

lowing command line searches testa and displays each line containing the string bb:

$ grep bb testa

aaabb

bbbcc

The –v option reverses the sense of the test. The following example displays the

lines in testa without bb:

$ grep -v bb testa

ff-ff

cccdd

dddaa

The –n option displays the line number of each displayed line:

$ grep -n bb testa

1:aaabb

2:bbbcc

The grep utility can search through more than one file. Here it searches through

each file in the working directory. The name of the file containing the string pre-

cedes each line of output.

$ grep bb *

testa:aaabb

testa:bbbcc

testb:bbbbb

When the search for the string bb is done with the –w option, grep produces no output

because none of the files contains the string bb as a separate word:

$ grep -w bb *
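For contrast, the following sketch shows lines that the –w option does accept. The file words.txt and its contents are made up for this example. A match counts as a word when the characters on either side of it are not letters, digits, or underscores, or when it sits at the beginning or end of the line.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# words.txt is a hypothetical sample file.
printf '%s\n' 'aaabb' 'bb cc' 'x-bb-y' 'bbb' > words.txt

# Only lines 2 and 3 hold bb delimited by nonword characters or line edges.
word_hits=$(grep -w bb words.txt)
echo "$word_hits"
```

The lines aaabb and bbb are rejected because every occurrence of bb in them has a letter next to it.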


The search grep performs is case sensitive. Because the previous examples specified

lowercase bb, grep did not find the uppercase string BBBBB in testc. The –i option

causes both uppercase and lowercase letters to match either case of letter in the pattern:

$ grep -i bb *

testa:aaabb

testa:bbbcc

testb:bbbbb

testc:BBBBB

$ grep -i BB *

testa:aaabb

testa:bbbcc

testb:bbbbb

testc:BBBBB

The –c option displays the number of lines in each file that contain a match:

$ grep -c bb *

testa:2

testb:1

testc:0

The –f option finds matches for each pattern in a file of patterns. In the next

example, gfile holds two patterns, one per line, and grep searches for matches to

the patterns in gfile:

$ cat gfile

aaa

bbb

$ grep -f gfile test*

testa:aaabb

testa:bbbcc

testb:aaaaa

testb:bbbbb

The following command line searches text2 and displays lines that contain a string

of characters starting with st, followed by zero or more characters (.* represents

zero or more characters in a regular expression—see Appendix A), and ending in ing:

$ grep 'st.*ing' text2

...
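The contents of text2 are not shown, so the following sketch applies the same pattern to a made-up file named sample.txt. The filename and its lines are assumptions for illustration; they do not reproduce text2.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# sample.txt is a hypothetical stand-in for text2.
printf '%s\n' 'string' 'sting' 'sing' 'resting place' 'stop' > sample.txt

# st, then zero or more of any character, then ing.
st_hits=$(grep 'st.*ing' sample.txt)
echo "$st_hits"
```

The line sting matches because .* can match zero characters; sing and stop do not match because each lacks one of the two fixed parts of the pattern.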

The ^ regular expression, which matches the beginning of a line, can be used alone

to match every line in a file. Together with the –n option, ^ can be used to display

the lines in a file, preceded by their line numbers:

$ grep -n '^' testa

1:aaabb

2:bbbcc

3:ff-ff

4:cccdd

5:dddaa


The next command line counts the number of times #include statements appear in

C source files in the working directory. The –h option causes grep to suppress the

filenames from its output. The input to sort consists of all lines from *.c that match

#include. The output from sort is an ordered list of lines that contains many dupli-

cates. When uniq with the –c option processes this sorted list, it outputs repeated

lines only once, along with a count of the number of repetitions in its input.

$ grep -h '#include' *.c | sort | uniq -c

9 #include "buff.h"

2 #include "poly.h"

1 #include "screen.h"

6 #include "window.h"

2 #include "x2.h"

2 #include "x3.h"

2 #include <math.h>

3 #include <stdio.h>

The final command calls the vim editor with a list of files in the working directory

that contain the string Sampson. The $(...) command substitution construct

(page 340) causes the shell to execute grep in place and supply vim with a list of file-

names that you want to edit:

$ vim $(grep -l 'Sampson' *)

...

The single quotation marks are not necessary in this example, but they are required

if the regular expression you are searching for contains special characters or SPACEs. It

is generally a good habit to quote the pattern so the shell does not interpret any spe-

cial characters the pattern may contain.


gzip

Compresses or decompresses files

gzip [options] [file-list]

gunzip [options] [file-list]

zcat [file-list]

The gzip utility compresses files, the gunzip utility restores files compressed with gzip, and the zcat

utility displays files compressed with gzip.

Arguments The file-list is a list of the names of one or more files that are to be compressed or

decompressed. If a directory appears in file-list with no ––recursive option, gzip/gun-

zip issues an error message and ignores the directory. With the ––recursive option,

gzip/gunzip recursively compresses/decompresses files within the directory hierarchy.

If file-list is empty or if the special option – (hyphen) is present, gzip reads from stan-

dard input. The ––stdout option causes gzip and gunzip to write to standard output.

The information in this section also applies to gunzip, a link to gzip.

Options The gzip, gunzip, and zcat utilities accept the common options described on page 603.

––stdout –c Writes the results of compression or decompression to standard output instead

of to filename.gz.

––decompress or

––uncompress

–d Decompresses a file compressed with gzip. This option with gzip is equivalent

to the gunzip command.

––force –f Overwrites an existing output file on compression/decompression.

––list –l For each compressed file in file-list, displays the file’s compressed and decom-

pressed sizes, the compression ratio, and the name of the file before compression.

Use this option with ––verbose to display additional information.

––fast or

––best

–n Controls the tradeoff between the speed of compression and the amount of com-

pression. The n argument is a digit from 1 to 9; level 1 is the fastest (least) com-

pression and level 9 is the best (slowest and most) compression. The default level

is 6. The options ––fast and ––best are synonyms for –1 and –9, respectively.

––quiet –q Suppresses warning messages.

––recursive –r Recursively descends directories in file-list and compresses/decompresses files

within these directories.

The Mac OS X versions of gzip, gunzip, and zcat accept long options

tip Options for gzip, gunzip, and zcat preceded by a double hyphen (––) work under Mac OS X as

well as under Linux.


––test –t Verifies the integrity of a compressed file. This option displays nothing if the

file is OK.

––verbose –v Displays the name of the file, the name of the compressed file, and the amount

of compression as each file is processed.

Discussion Compressing files reduces disk space requirements and shortens the time needed to

transmit files between systems. When gzip compresses a file, it adds the extension

.gz to the filename. For example, compressing the file fname creates the file

fname.gz and, unless you use the ––stdout (–c) option, deletes the original file. To

restore fname, use the command gunzip with the argument fname.gz.
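The following sketch walks through one compress-and-restore cycle while keeping the original file in place by means of the –c option. The file contents, the name fname.restored, and the variable names are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a made-up, repetitive text file named fname.
i=0
while [ "$i" -lt 200 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > fname

# -c writes the compressed data to standard output, so fname is not removed.
gzip -c fname > fname.gz

# -t checks the integrity of the compressed file; it is silent when all is well.
gzip -t fname.gz

# Decompress to a new name and compare with the original.
gunzip -c fname.gz > fname.restored
if cmp -s fname fname.restored; then
    roundtrip=identical
else
    roundtrip=different
fi

ls -l fname fname.gz
echo "$roundtrip"
```

Giving the command gzip fname without –c would produce the same fname.gz but would remove fname, as described above.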

Almost all files become much smaller when compressed with gzip. On rare occa-

sions a file will become larger, but only by a slight amount. The type of a file and its

contents (as well as the –n option) determine how much smaller a file becomes; text

files are often reduced by 60 to 70 percent.

The attributes of a file, such as its owner, permissions, and modification and access

times, are left intact when gzip compresses and gunzip decompresses a file.

If the compressed version of a file already exists, gzip reports that fact and asks for

your confirmation before overwriting the existing file. If a file has multiple links to

it, gzip issues an error message and exits. The ––force option overrides the default

behavior in both of these situations.

Notes Without the ––stdout (–c) option, gzip removes the files in file-list.

The bzip2 utility (page 615) compresses files more efficiently than does gzip.

In addition to the gzip format, gunzip recognizes several other compression formats,

enabling gunzip to decompress files compressed with compress.

To see an example of a file that becomes larger when compressed with gzip, com-

pare the size of a file that has been compressed once with the same file compressed

with gzip again. Because gzip complains when you give it an argument with the

extension .gz, you need to rename the file before compressing it a second time.

The tar utility with the –z modifier (page 848) calls gzip.

The following related utilities display and manipulate compressed files. None of

these utilities changes the files it works on.

zcat file-list Works like cat except that it uses gunzip to decompress file-list as it copies files to

standard output.

zdiff [options] file1 [file2]

Works like diff (page 663) except file1 and file2 are decompressed with gunzip as

needed. The zdiff utility accepts the same options as diff. If you omit file2, zdiff compares

file1 with the compressed version of file1.

zless file-list Works like less except that it uses gunzip to decompress file-list as it displays files.


Examples In the first example, gzip compresses two files. Next gunzip decompresses one of the

files. When a file is compressed and decompressed, its size changes but its modifica-

tion time remains the same:

$ ls -l

total 175

-rw-rw-r-- 1 max group 33557 Jul 20 17:32 patch-2.0.7

-rw-rw-r-- 1 max group 143258 Jul 20 17:32 patch-2.0.8

$ gzip *

$ ls -l

total 51

-rw-rw-r-- 1 max group 9693 Jul 20 17:32 patch-2.0.7.gz

-rw-rw-r-- 1 max group 40426 Jul 20 17:32 patch-2.0.8.gz

$ gunzip patch-2.0.7.gz

$ ls -l

total 75

-rw-rw-r-- 1 max group 33557 Jul 20 17:32 patch-2.0.7

-rw-rw-r-- 1 max group 40426 Jul 20 17:32 patch-2.0.8.gz
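The following sketch walks through the same round trip in a scratch directory and also shows the effect of the –c option mentioned in the Notes. The filenames notes, reference, and peek are hypothetical, and the script assumes gzip, gunzip, and cmp are installed.

```shell
# Sketch of a gzip round trip (all filenames are hypothetical).
tmpdir=$(mktemp -d)
printf 'some text\nsome more text\n' > "$tmpdir/notes"
cp "$tmpdir/notes" "$tmpdir/reference"          # untouched copy for comparison

gzip "$tmpdir/notes"                            # replaces notes with notes.gz
if [ -e "$tmpdir/notes" ]; then original_kept=yes; else original_kept=no; fi

gunzip -c "$tmpdir/notes.gz" > "$tmpdir/peek"   # -c writes to standard output and keeps notes.gz
if [ -e "$tmpdir/notes.gz" ]; then archive_kept=yes; else archive_kept=no; fi

gunzip "$tmpdir/notes.gz"                       # restores notes and removes notes.gz
if cmp -s "$tmpdir/notes" "$tmpdir/reference"; then
    roundtrip=identical
else
    roundtrip=different
fi

echo "original_kept=$original_kept archive_kept=$archive_kept roundtrip=$roundtrip"
rm -r "$tmpdir"
```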

In the next example, the files in Sam’s home directory are archived using cpio

(page 644). The archive is compressed with gzip before it is written to the device

mounted on /dev/sde1.

$ find ~sam -depth -print | cpio -oBm | gzip >/dev/sde1


head

Displays the beginning of a file

head [options] [file-list]

The head utility displays the beginning (head) of a file. This utility takes its input either from one or

more files you specify on the command line or from standard input.

Arguments The file-list is a list of the pathnames of the files that head displays. When you spec-

ify more than one file, head displays the filename before displaying the first few lines

of each file. When you do not specify a file, head takes its input from standard input.

Options Under Linux, head accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

––bytes=n[u]  –c n[u]

Displays the first n bytes (characters) of a file. Under Linux only, the u argu-

ment is an optional multiplicative suffix as described on page 602, except that

head uses a lowercase k for kilobyte (1,024-byte blocks) and accepts b for

512-byte blocks. If you include a multiplicative suffix, head counts by this

unit instead of by bytes.

––lines=n  –n n

Displays the first n lines of a file. You can also give the count directly after a hyphen (for example, –3) to specify n lines without using the lines keyword or the –n option. If you specify a negative value for n, head displays all but the last n lines of the file.

––quiet –q Suppresses header information when you specify more than one filename on the command line. (Linux only)

Notes The head utility displays the first ten lines of a file by default.

Examples The examples in this section are based on the following file:

$ cat eleven

line one

line two

line three

line four

line five

line six

line seven

line eight

line nine

line ten

line eleven


Without any arguments head displays the first ten lines of a file:

$ head eleven

line one

line two

line three

line four

line five

line six

line seven

line eight

line nine

line ten

The next example displays the first three lines (–n 3) of the file:

$ head -n 3 eleven

line one

line two

line three

The following example is equivalent to the preceding one:

$ head -3 eleven

line one

line two

line three

The next example displays the first six characters (–c 6) in the file:

$ head -c 6 eleven

line o$

The final example displays all but the last seven lines of the file:

$ head -n -7 eleven

line one

line two

line three

line four
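Two behaviors described under "Arguments" are not covered by the preceding examples: the filename header that head prints when given more than one file, and reading from standard input. The following sketch illustrates both. The files a and b and the scratch directory are hypothetical.

```shell
# Sketch: head with several files and with standard input.
tmpdir=$(mktemp -d)
printf 'a1\na2\na3\n' > "$tmpdir/a"
printf 'b1\nb2\nb3\n' > "$tmpdir/b"

# With more than one file, head prints a "==> name <==" header before each excerpt.
with_headers=$(cd "$tmpdir" && head -n 1 a b)

# Without a file argument, head reads standard input (here, the output of printf).
from_stdin=$(printf 'one\ntwo\nthree\n' | head -n 2)

echo "$with_headers"
echo "$from_stdin"
rm -r "$tmpdir"
```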


kill

Terminates a process by PID

kill [option] PID-list

kill –l [signal-name | signal-number]

The kill utility sends a signal to one or more processes. Typically this signal terminates the processes.

For kill to work, the processes must belong to the user executing kill, with one exception: A user

working with root privileges can terminate any process. The –l (lowercase “l”) option lists informa-

tion about signals.

Arguments The PID-list is a list of process identification (PID) numbers of processes that kill is

to terminate.

Options –l (list) Without an argument, displays a list of signals. With an argument of a

signal-name, displays the corresponding signal-number. With an argument of

a signal-number, displays the corresponding signal-name.

–signal-name | –signal-number

Sends the signal specified by signal-name or signal-number to PID-list. You

can specify a signal-name preceded by SIG or not (e.g., SIGKILL or KILL).

Without this option, kill sends a software termination signal (SIGTERM; signal

number 15).

Notes See also killall on page 731.

See Table 10-5 on page 453 for a list of signals. The command kill –l displays a

complete list of signal numbers and names.

In addition to the kill utility, a kill builtin is available in the Bourne Again and TC

Shells. The builtins work similarly to the utility described here. Give the command

/bin/kill to use the kill utility and the command kill to use the builtin. It does not

usually matter which version you call.

The shell displays the PID number of a background process when you initiate the

process. You can also use the ps utility to determine PID numbers.

If the software termination signal does not terminate a process, try sending a KILL

signal (signal number 9). A process can choose to ignore any signal except KILL.

The kill utility/builtin accepts job identifiers in place of the PID-list. Job identifiers

consist of a percent sign (%) followed by either a job number or a string that

uniquely identifies the job.


To terminate all processes that the current login process initiated and have the oper-

ating system log you out, give the command kill –9 0.

Examples The first example shows a command line executing the file compute as a back-

ground process and the kill utility terminating it:

$ compute &

[2] 259

$ kill 259

$ RETURN

[2]+ Terminated compute

The next example shows the ps utility determining the PID number of the back-

ground process running a program named xprog and the kill utility terminating

xprog with the TERM signal:

$ ps

PID TTY TIME CMD

7525 pts/1 00:00:00 bash

14668 pts/1 00:00:00 xprog

14699 pts/1 00:00:00 ps

$ kill -TERM 14668
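The advice in the Notes (send TERM first and fall back to KILL only if needed) can be sketched in a script as follows. This is an illustration under stated assumptions: sleep 300 stands in for a real job, the one-second grace period is arbitrary, and the check with signal 0 (which sends no signal and only reports whether the PID can be signaled) is a standard feature of kill that this section does not otherwise cover.

```shell
# Sketch: ask a background job to exit, and escalate only if it is still there.
sleep 300 &                 # stand-in for a long-running job
pid=$!                      # the shell stores the PID of the last background process in $!

kill -TERM "$pid"           # same as plain "kill $pid": software termination, signal 15
sleep 1                     # arbitrary grace period
if kill -0 "$pid" 2>/dev/null; then   # signal 0 sends nothing; it only checks the PID
    kill -KILL "$pid"       # signal 9 cannot be caught or ignored
fi

status=0
wait "$pid" || status=$?    # for a signaled child the shell reports 128 + signal number
signame=$(kill -l 15)       # translate a signal number to its name
echo "exit status $status, signal 15 is $signame"
```

Collecting the status with wait is optional, but it lets a script confirm which signal ended the job.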

caution: root: do not run kill with arguments of –9 0 or KILL 0

If you run the command kill –9 0 while you are working with root privileges, you will bring the system down.


killall

Terminates a process by name

killall [option] name-list

The killall utility sends a signal to one or more processes executing specified commands. Typically this

signal terminates the processes. For killall to work, the processes must belong to the user executing killall,

with one exception: A user working with root privileges can terminate any process.

Arguments The name-list is a SPACE-separated list of names of programs that are to receive signals.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––interactive –i Prompts for confirmation before killing a process. (Linux only)

––list –l Displays a list of signals (but kill –l displays a better list). With this option killall does not accept a name-list.

––quiet –q Does not display a message if killall fails to terminate a process. (Linux only)

–signal-name | –signal-number

Sends the signal specified by signal-name or signal-number to name-list. You

can specify a signal-name preceded by SIG or not (e.g., SIGKILL or KILL).

Without this option, killall sends a software termination signal (SIGTERM; signal

number 15).

Notes See also kill on page 729.

See Table 10-5 on page 453 for a list of signals. The command kill –l displays a

complete list of signal numbers and names.

If the software termination signal does not terminate the process, try sending a KILL

signal (signal number 9). A process can choose to ignore any signal except KILL.

You can use ps (page 796) to determine the name of the program you want to terminate.

Examples You can give the following commands to experiment with killall:

$ sleep 60 &

[1] 23274

$ sleep 50 &

[2] 23275

$ sleep 40 &

[3] 23276


$ sleep 120 &

[4] 23277

$ killall sleep

$ RETURN

[1] Terminated sleep 60

[2] Terminated sleep 50

[3]- Terminated sleep 40

[4]+ Terminated sleep 120

The next command, run by a user with root privileges, terminates all instances of

the Firefox browser:

# killall firefox


launchctl O

Controls the launchd daemon

launchctl [command [options] [arguments]]

The launchctl utility controls the launchd daemon.

Arguments The command is the command that launchctl sends to launchd. Table V-17 lists

some of the commands and the options and arguments each command accepts.

Without a command, launchctl reads commands, options, and arguments from stan-

dard input, one set per line. Without a command, when standard input comes from

the keyboard, launchctl runs interactively.

Option Only the load and unload commands take an option.

–w (write) When loading a file, removes the Disabled key and saves the modified

configuration file. When unloading a file, adds the Disabled key and saves the

modified configuration file.

Discussion The launchctl utility is the user interface to launchd, which manages system daemons

and background tasks (called jobs). Each job is described by a job configuration

file, which is a property list file in the format defined by the launchd.plist man page.

For security reasons, users not working with root privileges cannot communicate

with the system’s primary launchd process, PID 1. When such a user loads jobs, OS

X creates a new instance of launchd for that user. When all its jobs are unloaded,

that instance of launchd quits.

Notes The launchctl utility and launchd daemon were introduced in Mac OS X version 10.4.

Under version 10.3 and earlier, system jobs were managed by init, xinetd, and cron.

Table V-17 launchctl commands

Command       Argument                 Description
help          None                     Displays a help message
list          None                     Lists jobs loaded into launchd
load [–w]     Job configuration file   Loads the job named by the argument
shutdown      None                     Prepares for shutdown by removing all jobs
start         Job name                 Starts the job named by the argument
stop          Job name                 Stops the job named by the argument
unload [–w]   Job configuration file   Unloads the job named by the argument


Examples The first example, which is run by a user with root privileges, uses the list command

to list launchd jobs running on the local system:

# launchctl list

PID Status Label

51479 - 0x109490.launchctl

50515 - 0x10a780.bash

50514 - 0x10a680.sshd

50511 - 0x108d20.sshd

22 - 0x108bc0.securityd

- 0 com.apple.launchctl.StandardIO

37057 - [0x0-0x4e84e8].com.apple.ScreenSaver.Engine

27860 - 0x10a4e0.DiskManagementTo

27859 - [0x0-0x3a23a2].com.apple.SoftwareUpdate

...

The next example enables the ntalk service. Looking at the ntalk.plist file before

and after the launchctl command is executed shows that launchctl has modified the

file by removing the Disabled key.

# cat /System/Library/LaunchDaemons/ntalk.plist

...

<dict>

<key>Disabled</key>

<true/>

<key>Label</key>

<string>com.apple.ntalkd</string>

...

# launchctl load -w /System/Library/LaunchDaemons/ntalk.plist

# cat /System/Library/LaunchDaemons/ntalk.plist

...

<dict>

<key>Label</key>

<string>com.apple.ntalkd</string>

...

Without any arguments, launchctl prompts for commands on standard input. Give a

quit command or press CONTROL-D to exit from launchctl. In the last example, a user

running with root privileges causes launchctl to display a list of jobs and then to stop

the job that would launch airportd:

# launchctl

launchd% list

PID Status Label

8659 - 0x10ba10.cron

1 - 0x10c760.launchd

...

- 0 com.apple.airport.updateprefs

- 0 com.apple.airportd

- 0 com.apple.AirPort.wps

- 0 0x100670.dashboardadvisoryd

- 0 com.apple.launchctl.System

launchd% stop com.apple.airportd

launchd% quit


less

Displays text files, one screen at a time

less [options] [file-list]

The less utility displays text files, one screen at a time.

Arguments The file-list is the list of files you want to view. If there is no file-list, less reads from

standard input.

Options The less utility accepts the common options described on page 603.

––clear-screen –c Repaints the screen from the top line down instead of scrolling.

––QUIT-AT-EOF –E (exit) Normally less requires you to enter q to terminate. This option exits

automatically the first time less reads the end of file.

––quit-at-eof –e (exit) Similar to –E, except that less exits automatically the second time it

reads the end of file.

––quit-if-one-screen

–F Displays the file and quits if the file can be displayed on a single screen.

––ignore-case –i Causes a search for a string of lowercase letters to match both uppercase and

lowercase letters. This option is ignored if you specify a pattern that includes

uppercase letters.

––IGNORE-CASE

–I Causes a search for a string of letters of any case to match both uppercase and

lowercase letters, regardless of the case of the search pattern.

––long-prompt –m Each prompt reports the percentage of the file you have viewed. This option

causes less to display a prompt similar to the prompt used by more. It reports

byte numbers when less reads from standard input because less has no way of

determining the size of the input file.

––LINE-NUMBERS

–N Displays a line number at the beginning of each line.

––prompt=prompt

–Pprompt

Changes the short prompt string (the prompt that appears at the bottom of each screen of output) to prompt. Enclose prompt in quotation marks if it contains SPACEs. You can include special symbols in prompt, which less will replace with other values when it displays the prompt. For example, less displays the current filename in place of %f in prompt. See the less man page for a list of these special symbols and descriptions of other prompts. Custom prompts are useful if you are running less from within another program and want to give instructions or information to the person using the program. The default prompt is the name of the file displayed in reverse video.

tip: The Mac OS X version of less accepts long options

Options for less preceded by a double hyphen (––) work under Mac OS X as well as under Linux.

––squeeze-blank-lines

–s Displays multiple, adjacent blank lines as a single blank line. When you use

less to display text that has been formatted for printing with blank space at the

top and bottom of each page, this option shortens these headers and footers to

a single line.

––tabs=n  –xn

Sets tab stops n characters apart. The default is eight characters.

––window=n  –[z]n

Sets the scrolling size to n lines. The default is the height of the display in lines.

Each time you move forward or backward a page, you move n lines. The z part

of the option maintains compatibility with more and can be omitted.

+command

Any command you can give less while it is running can also be given as an

option by preceding it with a plus sign (+) on the command line. See the

“Commands” section. A command preceded by a plus sign on the command

line is executed as soon as less starts and applies to the first file only.

++command

Similar to +command except that command is applied to every file in file-list,

not just the first file.

Notes The phrase “less is more” explains the origin of the name of this utility. The more

utility is the original Berkeley UNIX pager (also available under Linux). The less

utility is similar to more but includes many enhancements. (Under OS X, less and

more are copies of the same file.) After displaying a screen of text, less displays a

prompt and waits for you to enter a command. You can skip forward and back-

ward in the file, invoke an editor, search for a pattern, or perform a number of

other tasks.

See the v command in the next section for information on how you can edit the file

you are viewing with less.

You can set the options to less either from the command line when you call less or

by setting the LESS environment variable. For example, you can use the following

command from bash to use less with the –x4 and –s options:

$ export LESS="-x4 -s"


Normally you would set LESS in ~/.bash_profile if you are using bash or in ~/.login

if you are using tcsh. Once you have set the LESS variable, less is invoked with the

specified options each time you call it. Any options you give on the command line

override the settings in the LESS variable. The LESS variable is used both when you

call less from the command line and when less is invoked by another program, such

as man. To specify less as the pager to use with man and other programs, set the

environment variable PAGER to less. For example, with bash you can add the fol-

lowing line to ~/.bash_profile:

export PAGER=less

Commands Whenever less pauses, you can enter any of a large number of commands. This sec-

tion describes some commonly used commands. Refer to the less man page for the

full list of commands. The optional numeric argument n defaults to 1, with the

exceptions noted. You do not need to follow these commands with a RETURN.

nb or nCONTROL-B (backward) Scrolls backward n lines. The default value of n is the height of the screen in lines.

nd or nCONTROL-D (down) Scrolls forward n lines. The default value of n is one-half the height of the screen in lines. When you specify n, it becomes the new default value for this command.

F (forward) Scrolls forward. If the end of the input is reached, this command waits for more input and then continues scrolling. This command allows you to use less in a manner similar to tail –f (page 843), except that less paginates the output as it appears.

ng (go) Goes to line number n. This command may not work if the file is read from standard input and you have moved too far into the file. The default value of n is 1.

h or H (help) Displays a summary of all available commands. The summary is displayed using less, as the list of commands is quite long.

RETURN or nj (jump) Scrolls forward n lines. The default value of n is 1.

q or :q Terminates less.

nu or nCONTROL-U Scrolls backward n lines. The default value of n is one-half the height of the screen in lines. When you specify n, it becomes the default value for this command.

v Brings the current file into an editor with the cursor on the current line. The less utility uses the editor specified in the EDITOR environment variable. If EDITOR is not set, less uses vi (which is typically linked to vim).

nw Scrolls backward like nb, except that the value of n becomes the new default value for this command.

ny or nk Scrolls backward n lines. The default value of n is 1.

nz Displays the next n lines like nSPACE except that the value of n, if present, becomes the new default value for the z and SPACE commands.

nSPACE Displays the next n lines. Pressing the SPACE bar by itself displays the next screen of text.

/regular-expression

Skips forward in the file, looking for lines that contain a match for regular-expression.

If you begin regular-expression with an exclamation point (!), this command looks for

lines that do not contain a match for regular-expression. If regular-expression begins

with an asterisk (*), this command continues the search through file-list. (A normal

search stops at the end of the current file.) If regular-expression begins with an at sign

(@), this command begins the search at the beginning of file-list and continues to the

end of file-list.

?regular-expression

This command is similar to the previous one but searches backward through the file

(and file-list). An asterisk (*) as the first character in regular-expression causes the

search to continue backward through file-list to the beginning of the first file. An at

sign (@) causes the search to start with the last line of the last file in file-list and

progress toward the first line of the first file.

{ or ( or [ If one of these characters appears in the top line of the display, this command scrolls

forward to the matching right brace, parenthesis, or bracket. For example, typing {

causes less to move the cursor forward to the matching }.

} or ) or ] Similar to the preceding commands, these commands move the cursor backward to

the matching left brace, parenthesis, or bracket.

CONTROL-L

Redraws the screen. This command is useful if the text on the screen has become

garbled.

[n]:n Skips to the next file in file-list. If n is given, skips to the nth next file in file-list.

![command line] Executes command line under the shell specified by the SHELL environment variable or under sh (usually linked to or a copy of bash) by default. A percent sign (%) in command line is replaced by the name of the current file. If you omit command line, less starts an interactive shell.

Examples The following example displays the file memo.txt. To see more of the file, the user

presses the SPACE bar in response to the less prompt at the lower-left corner of the

screen:

$ less memo.txt

...

memo.txt SPACE

...

In the next example, the user changes the prompt to a more meaningful message

and uses the –N option to display line numbers. Finally the user instructs less to

skip forward to the first line containing the string procedure.


$ less -Ps"Press SPACE to continue, q to quit" -N +/procedure ncut.icn

28 procedure main(args)

29 local filelist, arg, fields, delim

31 filelist:=[]

...

45 # Check for real field list

46 #

47 if /fields then stop("-fFIELD_LIST is required.")

49 # Process the files and output the fields

Press SPACE to continue, q to quit

ln

Makes a link to a file

ln [options] existing-file [new-link]

ln [options] existing-file-list directory

The ln utility creates hard or symbolic links to one or more files. You can create a symbolic link, but

not a hard link, to a directory.

Arguments In the first format the existing-file is the pathname of the file you want to create a

link to. The new-link is the pathname of the new link. When you are creating a

symbolic link, the existing-file can be a directory. If you omit new-link, ln creates a

link to existing-file in the working directory, and uses the same simple filename as

existing-file.

In the second format the existing-file-list is a list of the pathnames of the ordinary files

you want to create links to. The ln utility establishes the new links in the directory. The

simple filenames of the entries in the directory are the same as the simple filenames of

the files in the existing-file-list.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––backup –b If the ln utility will remove a file, this option makes a backup by appending a

tilde (~) to the filename. This option works only with ––force. (Linux only)

––force –f Normally ln does not create the link if new-link already exists. This option

removes new-link before creating the link. When you use ––force and ––backup

together (Linux only), ln makes a copy of new-link before removing it.

––interactive –i If new-link already exists, this option prompts you before removing new-link.

If you enter y or yes, ln removes new-link before creating the link. If you

answer n or no, ln does not remove new-link and does not make a new link.

––symbolic –s Creates a symbolic link. When you use this option, the existing-file and the

new-link may be directories and may reside on different filesystems. Refer to

“Symbolic Links” on page 108.

Notes For more information refer to “Links” on page 104. The ls utility with the –l option

displays the number of hard links to a file (Figure 4-12, page 94).

Hard links By default ln creates hard links. A hard link to a file is indistinguishable from the

original file. All hard links to a file must be in the same filesystem. For more infor-

mation refer to “Hard Links” on page 106.

Symbolic links You can also use ln to create symbolic links. Unlike a hard link, a symbolic link can

exist in a different filesystem from the linked-to file. Also, a symbolic link can point

to a directory. For more information refer to “Symbolic Links” on page 108.


If new-link is the name of an existing file, ln does not create the link unless you use the

––force option (Linux only) or answer yes when using the –i (––interactive) option.

Examples The following command creates a link between memo2 in the literature subdirec-

tory of Zach’s home directory and the working directory. The file appears as

memo2 (the simple filename of the existing file) in the working directory:

$ ln ~zach/literature/memo2 .

You can omit the period that represents the working directory from the preceding

command. When you give a single argument to ln, it creates a link in the working

directory.

The next command creates a link to the same file. This time the file appears as

new_memo in the working directory:

$ ln ~zach/literature/memo2 new_memo

The following command creates a link that causes the file to appear in Sam’s home

directory:

$ ln ~zach/literature/memo2 ~sam/new_memo

You must have write and execute access permissions to the other user’s directory for

this command to work. If you own the file, you can use chmod to give the other user

write access permission to the file.

The next command creates a symbolic link to a directory. The ls –ld command

shows the link:

$ ln -s /usr/local/bin bin

$ ls -ld bin

lrwxrwxrwx 1 zach zach 14 Feb 10 13:26 bin -> /usr/local/bin

The final example attempts to create a symbolic link named memo1 to the file

memo2. Because the file memo1 exists, ln refuses to make the link. When you use

the –i (––interactive) option, ln asks whether you want to replace the existing

memo1 file with the symbolic link. If you enter y or yes, ln creates the link and the

old memo1 disappears.

$ ls -l memo?

-rw-rw-r-- 1 zach group 224 Jul 31 14:48 memo1

-rw-rw-r-- 1 zach group 753 Jul 31 14:49 memo2

$ ln -s memo2 memo1

ln: memo1: File exists

$ ln -si memo2 memo1

ln: replace 'memo1'? y

$ ls -l memo?

lrwxrwxrwx 1 zach group 5 Jul 31 14:49 memo1 -> memo2

-rw-rw-r-- 1 zach group 753 Jul 31 14:49 memo2

Under Linux you can also use the ––force option to cause ln to overwrite a file.
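The difference between hard and symbolic links described in the Notes can be observed with a short experiment. The sketch below is one way to do it, assuming a scratch directory and the hypothetical names hard_memo and soft_memo. It compares inode numbers with ls –i and reads the target of the symbolic link with readlink, a utility not covered in this section.

```shell
# Sketch: a hard link shares the file's inode; a symbolic link only stores a pathname.
tmpdir=$(mktemp -d)
echo "draft" > "$tmpdir/memo2"

ln "$tmpdir/memo2" "$tmpdir/hard_memo"      # hard link: a second name for the same file
ln -s memo2 "$tmpdir/soft_memo"             # symbolic link: holds the relative path "memo2"

inode_orig=$(ls -i "$tmpdir/memo2" | awk '{print $1}')
inode_hard=$(ls -i "$tmpdir/hard_memo" | awk '{print $1}')
target=$(readlink "$tmpdir/soft_memo")      # the pathname stored in the symbolic link

rm "$tmpdir/memo2"                          # remove the original name
hard_content=$(cat "$tmpdir/hard_memo")     # the data is still reachable through the hard link
if [ -e "$tmpdir/soft_memo" ]; then         # -e follows the link, which now points nowhere
    dangling=no
else
    dangling=yes
fi

echo "same inode: $inode_orig / $inode_hard; target: $target; dangling: $dangling"
rm -r "$tmpdir"
```

After the original name is removed, the hard link still reaches the data, whereas the symbolic link is left pointing at a pathname that no longer exists.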


lpr

Sends files to printers

lpr [options] [file-list]

lpq [options] [job-identifiers]

lprm [options] [job-identifiers]

The lpr utility places one or more files into a print queue, providing orderly access to printers for sev-

eral users or processes. This utility can work with printers attached to remote systems. You can use

the lprm utility to remove files from the print queues and the lpq utility to check the status of files in

the queues. Refer to “Notes” later in this section.

Arguments The file-list is a list of one or more filenames for lpr to print. Often these files are

text files, but many systems are configured so lpr can accept and properly print a

variety of file types including PostScript and PDF files. Without a file-list, lpr

accepts input from standard input.

The job-identifiers is a list of job numbers or usernames. If you do not know the

job number, use lpq to display a list of print jobs.

Options Some of the following options depend on which type of file is being printed as well

as on how the system is configured for printing.

–h (no header) Suppresses printing of the header (burst) page. This page is useful

for identifying the owner of the output in a multiuser setup, but printing it is a

waste of paper when this identification is not needed.

–l (lowercase “l”) Specifies that lpr should not preprocess (filter) the file being

printed. Use this option when the file is already formatted for the printer.

–P printer

Routes the print jobs to the queue for the printer named printer. If you do not

use this option, print jobs are routed to the default printer for the local system.

The acceptable values for printer are found in the Linux file /etc/printcap and

can be displayed by an lpstat –t command. These values vary from system to

system.

–r (remove) Deletes the files in file-list after calling lpr.

–# n

Prints n copies of each file. Depending on which shell you are using, you may

need to escape the # by preceding it with a backslash to keep the shell from

interpreting it as a special character.


Discussion The lpr utility takes input either from files you specify on the command line or from

standard input; it adds these files to the print queue as print jobs. The utility assigns

a unique identification number to each print job. The lpq utility displays the job

numbers of the print jobs that lpr has set up; you can use the lprm utility to remove a

job from the print queue.

lpq The lpq utility displays information about jobs in a print queue. When called with-

out any arguments, lpq lists all print jobs queued for the default printer. Use lpr’s –P

printer option with lpq to look at other print queues—even those for printers con-

nected to remote systems. With the –l option, lpq displays more information about

each job. If you give a username as an argument, lpq displays only the printer jobs

belonging to that user.

lprm One item displayed by lpq is the job number for each print job in the queue. To

remove a job from the print queue, give the job number as an argument to lprm.

Unless you are working with root privileges, you can remove only your own jobs.

Even a user working with root privileges may not be able to remove a job from a

queue for a remote printer. If you do not give any arguments to lprm, it removes the

active printer job (that is, the job that is now printing) from the queue, if you own

that job.

Notes If you normally use a printer other than the system default printer, you can set up lpr

to use another printer as your personal default by assigning the name of this printer to

the environment variable PRINTER. For example, if you use bash, you can add the

following line to ~/.bash_profile to set your default printer to the printer named ps:

export PRINTER=ps

LPD and LPR Traditionally, UNIX had two printing systems: the BSD Line Printer Daemon (LPD)

and the System V Line Printer system (LPR). Linux adopted those systems at first,

and both UNIX and Linux have seen modifications to and replacements for these

systems. Today CUPS is the default printing system under Linux and OS X.

CUPS CUPS is a cross-platform print server built around the Internet Printing Protocol

(IPP), which is based on HTTP. CUPS provides a number of printer drivers and can

print different types of files, including PostScript files. CUPS provides System V and

BSD command-line interfaces and, in addition to IPP, supports LPD/LPR, HTTP,

SMB, and JetDirect (socket) protocols, among others.

This section describes the LPD command-line interface that runs under CUPS and

also in native mode on older systems.

Examples The first command sends the file named memo2 to the default printer:

$ lpr memo2


Next a pipe sends the output of ls to the printer named deskjet:

$ ls | lpr -Pdeskjet

The next example paginates and sends the file memo to the printer:

$ pr -h "Today's memo" memo | lpr

The next example shows a number of print jobs queued for the default printer. Max

owns all the jobs, and the first one is being printed (it is active). Jobs 635 and 639

were created by sending input to lpr’s standard input; job 638 was created by giving

ncut.icn as an argument to the lpr command. The last column gives the size of each

print job.

$ lpq

deskjet is ready and printing

Rank Owner Job Files Total Size

active max 635 (stdin) 38128 bytes

1st max 638 ncut.icn 3587 bytes

2nd max 639 (stdin) 3960 bytes

The next command removes job 638 from the default print queue:

$ lprm 638

ls

Displays information about one or more files

ls [options] [file-list]

The ls utility displays information about one or more files. It lists the information alphabetically by

filename unless you use an option that changes the order.

Arguments When you do not provide an argument, ls displays the names of the visible files

(those with filenames that do not begin with a period) in the working directory.

The file-list is a list of one or more pathnames of any ordinary, directory, or device

files. It can include ambiguous file references.

When the file-list includes a directory, ls displays the contents of the directory. It

displays the name of the directory only when needed to avoid ambiguity, such as

when the listing includes more than one directory. When you specify an ordinary

file, ls displays information about that one file.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

The options determine the type of information ls displays, the manner in which it

displays the information, and the order in which the information is displayed. When

you do not use an option, ls displays a short list that contains just the names of files,

in alphabetical order.

––almost-all –A The same as –a but does not list the . and .. directory entries.

––all –a Includes hidden filenames (those filenames that begin with a period; page 82)

in the listing. Without this option ls does not list information about hidden

files unless you include the name of a hidden file in the file-list. The * ambiguous

file reference does not match a leading period in a filename (page 138), so

you must use this option or explicitly specify a filename (ambiguous or not)

that begins with a period to display hidden files.

––escape –b Displays nonprinting characters in a filename, using backslash escape

sequences similar to those used in C language strings (Table V-18). Other non-

printing characters are displayed as a backslash followed by an octal number.

Table V-18 Backslash escape sequences

Sequence Meaning

\b BACKSPACE

\n NEWLINE


––color[=when] The ls utility can display various types of files in different colors but normally

does not use colors (the same result as when you specify when as none). If you

do not specify when or if you specify when as always, ls uses colors. When you

specify when as auto, ls uses colors only when the output goes to a screen. See

the “Notes” section for more information. L

––directory –d Displays directories without displaying their contents. This option does not

dereference symbolic links; that is, for each file that is a symbolic link, this

option lists the symbolic link, not the file the link points to.

–e Displays ACLs (page 934). O

––format=word By default ls displays files sorted vertically. This option sorts files based on

word: across or horizontal (also –x), separated by commas (also –m), long

(also –l), or single-column (also –1). L

––classify –F Displays a slash (/) after each directory, an asterisk (*) after each executable

file, and an at sign (@) after a symbolic link.

––dereference-command-line

–H (partial dereference) For each file that is a symbolic link, lists the file the link

points to, not the symbolic link itself. This option affects files specified on the

command line; it does not affect files found while descending a directory hier-

archy. This option treats files that are not symbolic links normally. See

page 623 for an example of the use of the –H versus –L options.

––human-readable

–h With the –l option, displays sizes in K (kilobyte), M (megabyte), and G

(gigabyte) blocks, as appropriate. This option works with the –l option only. It

displays powers of 1,024. Under Mac OS X, it displays B (bytes) in addition to

the preceding suffixes. See also ––si.

––inode –i Displays the inode number of each file. With the –l option, this option displays

the inode number in column 1 and shifts other items one column to the right.

––dereference –L (dereference) For each file that is a symbolic link, lists the file the link points

to, not the symbolic link itself. This option affects all files and treats files that

are not symbolic links normally. See page 623 for an example of the use of the

–H versus –L options.

Table V-18 Backslash escape sequences (continued)

Sequence Meaning

\r RETURN

\t HORIZONTAL TAB

\v VERTICAL TAB

\\ BACKSLASH


––format=long –l (lowercase “l”) Lists more information about each file. This option does not

dereference symbolic links; that is, for each file that is a symbolic link, this

option lists the symbolic link, not the file the link points to. If standard output

for a directory listing is sent to the screen, this option displays the number of

blocks used by all files in the listing on a line before the listing. Use this option

with –h to make file sizes more readable. See the “Discussion” section for

more information.

––format=commas

–m Displays a comma-separated list of files that fills the width of the screen.

–P (no dereference) For each file that is a symbolic link, lists the symbolic link, not

the file the link points to. This option affects all files and treats files that are

not symbolic links normally. See page 625 for an example of the use of the –P

option. O

––hide-control-chars

–q Displays nonprinting characters in a filename as question marks. When stan-

dard output is sent to the screen, this behavior is the default. Without this

option, when standard output is sent to a filter or a file, nonprinting characters

are displayed as themselves.

––recursive –R Recursively lists directory hierarchies.

––reverse –r Displays the list of filenames in reverse sorted order.

––size –s Displays the number of 1,024-byte (Linux) or 512-byte (Mac OS X) blocks

allocated to the file. The size precedes the filename. With the –l option this

option displays the size in column 1 and shifts other items one column to the

right. If standard output for a directory listing is sent to the screen, this option

displays the number of blocks used by all files in the listing on a line before the

listing. You can include the –h option to make the file sizes easier to read.

Under Mac OS X, you can use the BLOCKSIZE environment variable

(page 603) to change the size of the blocks this option reports on.

––si With the –l option, displays sizes in K (kilobyte), M (megabyte), and G

(gigabyte) blocks, as appropriate. This option works with the –l option only.

This option displays powers of 1,000. See also ––human-readable. L

––sort=time –t Displays files sorted by the time they were last modified.

––sort=word By default ls displays files in ASCII order. This option sorts the files based on

word: filename extension (–X; Linux only), none (–U; Linux only), file size (–S),

access time (–u), or modification time (–t). See ––time for an exception. L

––time=word By default ls with the –l option displays the modification time of a file. Set

word to atime (–u) to display the access time or to ctime (–c) to display the time the

file’s status (inode) was last changed. The list is sorted by word when you also give the

––sort=time option. L


––sort=access –u Displays files sorted by the time they were last accessed.

––sort=extension

–X Displays files sorted by filename extension. Files with no filename extension

are listed first. L

––format=across –x Displays files sorted by lines (the default display is sorted by columns).

––format=single-column

–1 (one) Displays one file per line. This type of display is the default when you

redirect the output from ls.

Discussion The ls long listing (–l or ––format=long options) displays the columns shown in

Figure 4-12 on page 94. The first column, which contains 10 or 11 characters, is

divided as described in the following paragraphs. The character in the first position

describes the type of file, as shown in Table V-19.

The next nine characters of the first column describe the access permissions

associated with the file. These characters are divided into three sets of three

characters each.

The first three characters represent the owner’s access permissions. If the owner has

read access permission to the file, r appears in the first character position. If the

owner is not permitted to read the file, a hyphen appears in this position. The next

two positions represent the owner’s write and execute access permissions. If w

appears in the second position, the owner is permitted to write to the file; if x

appears in the third position, the owner is permitted to execute the file. An s in the

third position indicates the file has both setuid and execute permissions. An S in the

third position indicates setuid permission without execute permission. A hyphen

indicates that the owner does not have the access permission associated with the

character position.

In a similar manner the second set of three characters represents the access permissions

for the group the file is associated with. An s in the third position indicates the

file has setgid permission with execute permission, and an S indicates setgid permission

with no execute permission.

Table V-19 First character in a long ls display

Character Meaning

– Ordinary

b Block device

c Character device

d Directory

p FIFO (named pipe)

l Symbolic link

The third set of three characters represents the access permissions for other users. A

t in the third position indicates that the file has the sticky bit (page 980) set.

Refer to chmod on page 626 for information on changing access permissions.

If ACLs (page 99) are enabled and a listed file has an ACL, ls –l displays a plus sign

(+) following the third set of three characters.

Still referring to Figure 4-12 on page 94, the second column indicates the number of

hard links to the file. Refer to page 104 for more information on links.

The third and fourth columns display the name of the owner of the file and the

name of the group the file is associated with, respectively.

The fifth column indicates the size of the file in bytes or, if information about a

device file is being displayed, the major and minor device numbers. In the case of a

directory, this number is the size of the directory file, not the size of the files that are

entries within the directory. (Use du [page 677] to display the sum of the sizes of all

files in a directory.) Use the –h option to display the size of files in kilobytes, mega-

bytes, or gigabytes.

The last two columns display the date and time the file was last modified and the

filename, respectively.
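As a rough illustration of the first column, the following sketch builds three files in a scratch directory and uses cut to pull out the type character and the nine permission characters. The directory and filenames are invented for the example, and the chmod modes are chosen only to show the difference between s and S.

```shell
# Scratch directory; every name below is illustrative, not taken from the text.
dir=$(mktemp -d)
touch "$dir/plain"
mkdir "$dir/sub"
ln -s plain "$dir/link"

# -d lists the directory itself rather than its contents.
# cut -c1 keeps the file-type character from the first column.
type_plain=$(ls -ld "$dir/plain" | cut -c1)
type_sub=$(ls -ld "$dir/sub" | cut -c1)
type_link=$(ls -ld "$dir/link" | cut -c1)

# cut -c2-10 keeps the nine permission characters.
chmod 4750 "$dir/plain"                      # setuid with owner execute
with_x=$(ls -l "$dir/plain" | cut -c2-10)
chmod 4640 "$dir/plain"                      # setuid without owner execute
without_x=$(ls -l "$dir/plain" | cut -c2-10)

echo "types: $type_plain $type_sub $type_link"
echo "setuid with execute: $with_x"
echo "setuid without execute: $without_x"
```

The ordinary file, directory, and symbolic link should report –, d, and l, and the third permission character should change from s to S when owner execute permission is removed.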

Notes By default, ls dereferences symbolic links: For each file that is a symbolic link, ls lists

the file the link points to, not the symbolic link itself. Use the –l or –d option to ref-

erence symbolic links (display information on symbolic links, not the files the links

point to).

For other than long listings (displayed by the –l option), when standard output goes

to the screen, ls displays output in columns based on the width of the screen. When

you redirect standard output to a filter or file, ls displays a single column.
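The single-column behavior is what makes ls usable in pipes. The sketch below, which assumes a scratch directory holding three made-up files, counts files by piping ls to wc and then uses the –x option to ask for a multicolumn listing even though the output is not going to the screen.

```shell
# Scratch directory with three illustrative files.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# Sent to a pipe, ls writes one name per line, so wc -l counts the files.
count=$(ls "$dir" | wc -l)

# -x requests the across (horizontal) format explicitly, so the three short
# names share one line even when standard output is a pipe.
across=$(ls -x "$dir")
across_lines=$(echo "$across" | wc -l)

echo "files: $count"
echo "$across"
```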

Refer to page 136 for examples of using ls with ambiguous file references.

With the ––color option, ls displays the filenames of various types of files in dif-

ferent colors. By default executable files are green, directory files are blue, sym-

bolic links are cyan, archives and compressed files are red, and ordinary text files

are black. The manner in which ls colors the various file types is specified in the

/etc/DIR_COLORS file. If this file does not exist on the local system, ls will not

color filenames. You can modify /etc/DIR_COLORS to alter the default

color/filetype mappings on a systemwide basis. For your personal use, you can

copy /etc/DIR_COLORS to the ~/.dir_colors file in your home directory and

modify it. For your login, ~/.dir_colors overrides the systemwide colors estab-

lished in /etc/DIR_COLORS. Refer to the dir_colors and dircolors man pages for

more information.


Examples The first example shows ls, without any options or arguments, listing the names of

the files in the working directory in alphabetical order. The listing is sorted in columns

(vertically):

$ ls

bin calendar letters

c execute shell

The next example shows the ls utility with the –x option, which sorts the files

horizontally:

$ ls -x

bin c calendar

execute letters shell

The –F option appends a slash (/) to files that are directories, an asterisk to files

that are executable, and an at sign (@) to files that are symbolic links:

$ ls -Fx

bin/ c/ calendar

execute* letters/ shell@
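To try the –F markers without touching real files, you can build a small scratch directory along the following lines. The filenames mirror the ones in the example above, but the directory and the target of the symbolic link are assumptions made for the sketch.

```shell
# Scratch directory; the layout only imitates the example listing.
dir=$(mktemp -d)
mkdir "$dir/letters"
touch "$dir/calendar" "$dir/execute"
chmod +x "$dir/execute"          # give the file execute permission
ln -s /bin/sh "$dir/shell"       # the link target is arbitrary

# Captured output is one name per line because it does not go to the screen.
marked=$(ls -F "$dir")
echo "$marked"
```

Each name is followed by the marker for its type; the ordinary file calendar gets no marker.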

Next the –l (long) option displays a long list. The files are still in alphabetical order:

$ ls -l

total 20

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

drwxr-xr-x 2 sam pubs 4096 Mar 26 11:59 c

-rw-r--r-- 1 sam pubs 104 Jan 9 14:44 calendar

-rwxr-xr-x 1 sam pubs 85 May 6 08:27 execute

drwxr-xr-x 2 sam pubs 4096 Apr 4 18:56 letters

lrwxrwxrwx 1 sam sam 9 May 21 11:35 shell -> /bin/bash

The –a (all) option lists all files, including those with hidden names:

$ ls -a

. bin execute

.. c letters

.profile calendar shell
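The difference between –a and –A (described under “Options”) is easy to check in a scratch directory. The following sketch assumes two made-up files, one hidden and one visible, and counts the entries each form of ls reports.

```shell
# Scratch directory holding one hidden and one visible file (illustrative names).
dir=$(mktemp -d)
touch "$dir/.profile" "$dir/memo"

visible=$(ls "$dir")                    # hidden files are skipped
almost_all=$(ls -A "$dir" | wc -l)      # .profile and memo
all=$(ls -a "$dir" | wc -l)             # . and .. as well

echo "visible: $visible"
echo "-A lists $almost_all entries, -a lists $all"
```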

Combining the –a and –l options displays a long listing of all files, including invisible

files, in the working directory. This list is still in alphabetical order:

$ ls -al

total 32

drwxr-xr-x 5 sam sam 4096 May 21 11:50 .

drwxrwxrwx 3 sam sam 4096 May 21 11:50 ..

-rw-r--r-- 1 sam sam 160 May 21 11:45 .profile

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

drwxr-xr-x 2 sam pubs 4096 Mar 26 11:59 c

-rw-r--r-- 1 sam pubs 104 Jan 9 14:44 calendar

-rwxr-xr-x 1 sam pubs 85 May 6 08:27 execute

drwxr-xr-x 2 sam pubs 4096 Apr 4 18:56 letters

lrwxrwxrwx 1 sam sam 9 May 21 11:35 shell -> /bin/bash


When you add the –r (reverse) option to the command line, ls produces a list in

reverse alphabetical order:

$ ls -ral

total 32

lrwxrwxrwx 1 sam sam 9 May 21 11:35 shell -> /bin/bash

drwxr-xr-x 2 sam pubs 4096 Apr 4 18:56 letters

-rwxr-xr-x 1 sam pubs 85 May 6 08:27 execute

-rw-r--r-- 1 sam pubs 104 Jan 9 14:44 calendar

drwxr-xr-x 2 sam pubs 4096 Mar 26 11:59 c

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

-rw-r--r-- 1 sam sam 160 May 21 11:45 .profile

drwxrwxrwx 3 sam sam 4096 May 21 11:50 ..

drwxr-xr-x 5 sam sam 4096 May 21 11:50 .

Use the –t and –l options to list files so the most recently modified file appears at the

top of the list:

$ ls -tl

total 20

lrwxrwxrwx 1 sam sam 9 May 21 11:35 shell -> /bin/bash

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

-rwxr-xr-x 1 sam pubs 85 May 6 08:27 execute

drwxr-xr-x 2 sam pubs 4096 Apr 4 18:56 letters

drwxr-xr-x 2 sam pubs 4096 Mar 26 11:59 c

-rw-r--r-- 1 sam pubs 104 Jan 9 14:44 calendar

Together the –r and –t options cause the file you modified least recently to appear at

the top of the list:

$ ls -trl

total 20

-rw-r--r-- 1 sam pubs 104 Jan 9 14:44 calendar

drwxr-xr-x 2 sam pubs 4096 Mar 26 11:59 c

drwxr-xr-x 2 sam pubs 4096 Apr 4 18:56 letters

-rwxr-xr-x 1 sam pubs 85 May 6 08:27 execute

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

lrwxrwxrwx 1 sam sam 9 May 21 11:35 shell -> /bin/bash
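If you want to experiment with the –t and –r options, you can give files known modification times with touch –t. The sketch below is one way to do so; the filenames and dates are arbitrary choices for the example.

```shell
# Scratch directory; touch -t sets the modification time as [[CC]YY]MMDDhhmm.
dir=$(mktemp -d)
touch -t 202001010000 "$dir/oldest"
touch -t 202106150000 "$dir/middle"
touch -t 202212310000 "$dir/newest"

# -t sorts by modification time, most recent first; adding -r reverses the order.
top_of_t=$(ls -t "$dir" | head -n 1)
top_of_tr=$(ls -tr "$dir" | head -n 1)

echo "ls -t starts with:  $top_of_t"
echo "ls -tr starts with: $top_of_tr"
```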

The next example shows ls with a directory filename as an argument. The ls utility

lists the contents of the directory in alphabetical order:

$ ls bin

c e lsdir

To display information about the directory file itself, use the –d (directory) option.

This option lists information about the directory only:

$ ls -dl bin

drwxr-xr-x 2 sam pubs 4096 May 20 09:17 bin

You can use the following command to display a list of all hidden filenames (those

starting with a period) in your home directory. It is a convenient way to list the

startup (initialization) files in your home directory.


$ ls -d ~/.*

/home/sam/.

/home/sam/..

/home/sam/.AbiSuite

/home/sam/.Azureus

/home/sam/.BitTornado

...

A plus sign (+) to the right of the permissions in a long listing denotes the presence

of an ACL for a file:

$ ls -l memo

-rw-r--r--+ 1 sam pubs 19 Jul 19 21:59 memo

Under Mac OS X you can use the –le option to display an ACL:

$ ls -le memo

-rw-r--r-- + 1 sam pubs 19 Jul 19 21:59 memo

0: user:jenny allow read

See page 934 for more examples of using ls under Mac OS X to display ACLs.


make

Keeps a set of programs current

make [options] [target-files] [arguments]

The GNU make utility keeps a set of executable programs current, based on differences in the

modification times of the programs and the source files that each program is dependent on.

Arguments The target-files refer to targets on dependency lines in the makefile. When you do

not specify a target-file, make updates the target on the first dependency line in the

makefile. Command-line arguments of the form name=value set the variable name

to value inside the makefile. See the “Discussion” section for more information.
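A minimal sketch of a name=value argument follows. It assumes GNU make is installed and writes a throwaway makefile in a scratch directory; the variable GREETING and the target show are invented for the illustration.

```shell
# Work in a scratch directory so no existing makefile is disturbed.
dir=$(mktemp -d)
cd "$dir" || exit 1

# printf supplies the TAB that must begin the construction line.
printf 'GREETING=hello\nshow:\n\t@echo $(GREETING)\n' > Makefile

from_makefile=$(make -s show)                      # uses the value set in the makefile
from_cmdline=$(make -s show GREETING=goodbye)      # command-line value takes precedence

echo "$from_makefile / $from_cmdline"
```

A value given on the command line overrides an ordinary assignment to the same variable in the makefile.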

Options If you do not use the –f option, make takes its input from a file named GNUmakefile,

makefile, or Makefile (in that order) in the working directory. In this section, this

input file is referred to as makefile. Many users prefer to use the name Makefile

because it shows up earlier in directory listings.

––directory=dir –C dir

Changes directories to dir before starting.

––debug –d Displays information about how make decides what to do.

––file=file –f file

(input file) Uses file as input instead of makefile.

––jobs[=n] –j [n]

(jobs) Runs up to n commands at the same time instead of the default of one

command. Running multiple commands simultaneously is especially effective

if you are working on a multiprocessor system. If you omit n, make does not

limit the number of simultaneous jobs.

––keep-going –k Continues with the next file from the list of target-files instead of quitting

when a construction command fails.

––just-print or

––dry-run

–n (no execution) Displays, but does not execute, the commands that make would

execute to bring the target-files up-to-date.

––silent or

––quiet

–s Does not display the names of the commands being executed.

––touch –t Updates the modification times of target files but does not execute any con-

struction commands. Refer to touch on page 862.

The Mac OS X version of make accepts long options

tip Options for make preceded by a double hyphen (––) work under Mac OS X as well as under

Linux.


Discussion The make utility bases its actions on the modification times of the programs and the

source files that each program depends on. Each of the executable programs, or

target-files, depends on one or more prerequisite files. The relationships between

target-files and prerequisites are specified on dependency lines in a makefile. Con-

struction commands follow the dependency line, specifying how make can update

the target-files. See page 756 for examples of makefiles.

Documentation Refer to www.gnu.org/software/make/manual/make.html and to the make info page

for more information about make and makefiles.

Although the most common use of make is to build programs from source code, this

general-purpose build utility is suitable for a wide range of applications. Anywhere

you can define a set of dependencies to get from one state to another represents a

candidate for using make.

Much of make’s power derives from the features you can set up in a makefile. For

example, you can define variables using the same syntax found in the Bourne Again

Shell. Always define the variable SHELL in a makefile; set it to the pathname of the

shell you want to use when running construction commands. To define the variable

and assign it a value, place the following line near the top of a makefile:

SHELL=/bin/sh

Assigning the value /bin/sh to SHELL allows you to use a makefile on other com-

puter systems. On Linux systems, /bin/sh is generally linked to /bin/bash or

/bin/dash. Under Mac OS X, /bin/sh is a copy of bash that attempts to emulate the

original Bourne Shell. Some versions of make use the value of the environment variable

SHELL if you do not set SHELL in a makefile; GNU make ignores the environment value and

runs construction commands with /bin/sh by default. If make would otherwise pick a shell

other than the one you intended, the construction commands may fail.

Following is a list of additional features associated with make:

• You can run specific construction commands silently by preceding them

with an at sign (@). For example, the following lines will display a short

help message when you run the command make help:

help:
	@echo "You can make the following:"
	@echo " "
	@echo "libbuf.a -- the buffer library"
	@echo "bufdisplay -- display any-format buffer"
	@echo "buf2ppm -- convert buffer to pixmap"

Without the @ signs in the preceding example, make would display each of

the echo commands before executing it. This way of displaying a message

works because no file is named help in the working directory. As a result

make runs the construction commands in an attempt to build this file.

Because the construction commands display messages but do not build the

file help, you can run make help repeatedly with the same result.


• You can cause make to ignore the exit status of a command by preceding

the command with a hyphen (–). For example, the following line allows

make to continue regardless of whether the call to /bin/rm is successful

(the call to /bin/rm fails if libbuf.a does not exist):

-/bin/rm libbuf.a

• You can use special variables to refer to information that might change

from one use of make to the next. Such information might include files that

need updating, files that are newer than the target, and files that match a

pattern. For example, you can use the variable $? in a construction com-

mand to identify all prerequisite files that are newer than the target file.

This variable allows you to print any files that have changed since the last

time you printed those files out:

list: .list
.list: Makefile buf.h xtbuff_ad.h buff.c buf_print.c xtbuff.c
	pr $? | lpr
	date >.list

The target list depends on the source files that might be printed. The con-

struction command pr $? | lpr prints only those source files that are newer

than the file .list. The line date > .list modifies the .list file so it is newer

than any of the source files. The next time you run the command make list,

only the files that have been changed are printed.

• You can include other makefiles as if they were part of the current makefile.

The following line causes make to read Make.config and treat the contents

of that file as though it were part of the current makefile, allowing you to

put information common to more than one makefile in a single place:

include Make.config
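The @ prefix, the hyphen prefix, and the $? variable from the preceding list can be tried together without a compiler or a printer. The following sketch assumes GNU make and uses a made-up target named stamp in a scratch directory; it is a simplified stand-in for the list example, with echo in place of pr and lpr.

```shell
# Work in a scratch directory with two illustrative source files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt

# Build the makefile with printf so each construction line starts with a TAB.
printf 'stamp: a.txt b.txt\n' > Makefile
printf '\t-rm no-such-file 2>/dev/null\n' >> Makefile   # failure is ignored
printf '\t@echo changed: $?\n' >> Makefile              # $? = newer prerequisites
printf '\t@date > stamp\n' >> Makefile

# First run: stamp does not exist, so both prerequisites are reported.
# make notes the ignored rm error on standard error.
first=$(make -s)

sleep 1
touch b.txt            # only b.txt is now newer than stamp
second=$(make -s)

echo "$first"
echo "$second"
```

Because rm is preceded by a hyphen, its failure does not stop make, and the second run reports only the file that changed after stamp was written.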

Note Under Mac OS X, the make utility is part of the Developer Tools optional install.

Examples The first example causes make to bring the target-file named analysis up-to-date by

issuing three cc commands. It uses a makefile named GNUmakefile, makefile, or

Makefile in the working directory.

$ make analysis

cc -c analy.c

cc -c stats.c

cc -o analysis analy.o stats.o

The following example also updates analysis but uses a makefile named analysis.mk

in the working directory:

$ make -f analysis.mk analysis

'analysis' is up to date.


The next example lists the commands make would execute to bring the target-file

named credit up-to-date. Because of the –n (no-execution) option, make does not

execute the commands.

$ make -n credit

cc -c -O credit.c

cc -c -O accounts.c

cc -c -O terms.c

cc -o credit credit.o accounts.o terms.o

The next example uses the –t option to update the modification time of the target-file

named credit. After you use this option, make thinks that credit is up-to-date.

$ make -t credit

$ make credit

'credit' is up to date.
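You can watch the effect of –n and –t with a makefile that copies a file instead of compiling one. The sketch below assumes GNU make; the files in.txt and out.txt are hypothetical names chosen for the example.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1
echo data > in.txt

# One rule: out.txt is built by copying in.txt (TAB supplied by printf).
printf 'out.txt: in.txt\n\tcp in.txt out.txt\n' > Makefile

# -n displays the construction command without running it.
dry=$(make -n)
if [ -e out.txt ]; then built_by_n=yes; else built_by_n=no; fi

# -t touches the target instead of running cp, so out.txt is created empty.
make -t > /dev/null
size=$(wc -c < out.txt)

# make now considers the target current.
status=$(make)

echo "dry run showed: $dry (target created: $built_by_n)"
echo "size after -t: $size"
echo "$status"
```

The empty out.txt shows why –t should be used with care: make regards the target as up-to-date even though the construction command never ran.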

Example makefiles Following is a very simple makefile named Makefile. This makefile compiles a pro-

gram named morning (the target file). The first line is a dependency line that shows

morning depends on morning.c. The next line is the construction line: It shows how

to create morning using the gcc C compiler. The construction line must be indented

using a TAB, not SPACEs.

$ cat Makefile

morning: morning.c

TAB gcc -o morning morning.c

When you give the command make, make compiles morning.c if it has been modified

more recently than morning.

The next example is a simple makefile for building a utility named ff. Because the cc

command needed to build ff is complex, using a makefile allows you to rebuild ff

easily, without having to remember and retype the cc command.

$ cat Makefile

# Build the ff command from the fastfind.c source

SHELL=/bin/sh

ff:

	gcc -traditional -O2 -g -DBIG=5120 -o ff fastfind.c myClib.a

$ make ff

gcc -traditional -O2 -g -DBIG=5120 -o ff fastfind.c myClib.a

In the next example, a makefile keeps the file named compute up-to-date. The make

utility ignores comment lines (lines that begin with a pound sign [#]); the first three

lines of the following makefile are comment lines. The first dependency line shows

that compute depends on two object files: compute.o and calc.o. The corresponding

construction line gives the command make needs to produce compute. The second

dependency line shows that compute.o depends not only on its C source file but also

on the compute.h header file. The construction line for compute.o uses the C com-

piler optimizer (–O3 option). The third set of dependency and construction lines is

not required. In their absence, make infers that calc.o depends on calc.c and pro-

duces the command line needed for the compilation.

$ cat Makefile
#
# Makefile for compute
#
compute: compute.o calc.o
	gcc -o compute compute.o calc.o

compute.o: compute.c compute.h
	gcc -c -O3 compute.c

calc.o: calc.c
	gcc -c calc.c

clean:
	rm *.o *core* *~

There are no prerequisites for clean, the last target. This target is often used to

remove extraneous files that may be out-of-date or no longer needed, such as .o

files.

The next example shows a much more sophisticated makefile that uses features not

discussed in this section. Refer to the sources cited under “Documentation” on

page 754 for information about these and other advanced features.

$ cat Makefile

###########################################################
## build and maintain the buffer library
###########################################################
SHELL=/bin/sh

###########################################################
## Flags and libraries for compiling. The XLDLIBS are needed
# whenever you build a program using the library. The CCFLAGS
# give maximum optimization.
CC=gcc
CCFLAGS=-O2 $(CFLAGS)
XLDLIBS= -lXaw3d -lXt -lXmu -lXext -lX11 -lm
BUFLIB=libbuf.a

###########################################################
## Miscellaneous
INCLUDES=buf.h
XINCLUDES=xtbuff_ad.h
OBJS=buff.o buf_print.o xtbuff.o

###########################################################
## Just a 'make' generates a help message
help:
	@echo "You can make the following:"
	@echo " "
	@echo " libbuf.a -- the buffer library"
	@echo " bufdisplay -- display any-format buffer"
	@echo " buf2ppm -- convert buffer to pixmap"

###########################################################
## The main target is the library
libbuf.a: $(OBJS)
	-/bin/rm libbuf.a
	ar rv libbuf.a $(OBJS)
	ranlib libbuf.a

###########################################################
## Secondary targets -- utilities built from the library
bufdisplay: bufdisplay.c libbuf.a
	$(CC) $(CCFLAGS) bufdisplay.c -o bufdisplay $(BUFLIB) $(XLDLIBS)

buf2ppm: buf2ppm.c libbuf.a
	$(CC) $(CCFLAGS) buf2ppm.c -o buf2ppm $(BUFLIB)

###########################################################
## Build the individual object units
buff.o: $(INCLUDES) buff.c
	$(CC) -c $(CCFLAGS) buff.c

buf_print.o: $(INCLUDES) buf_print.c
	$(CC) -c $(CCFLAGS) buf_print.c

xtbuff.o: $(INCLUDES) $(XINCLUDES) xtbuff.c
	$(CC) -c $(CCFLAGS) xtbuff.c

The make utility can be used for tasks other than compiling code. As a final exam-

ple, assume you have a database that lists IP addresses and the corresponding host-

names in two columns and that the database dumps these values to a file named

hosts.tab. You need to extract only the hostnames from this file and generate a Web

page named hosts.html containing these names. The following makefile is a simple

report writer:

$ cat makefile

SHELL=/bin/bash

hosts.html: hosts.tab
	@echo "<HTML><BODY>" > hosts.html
	@awk '{print $$2, "<br>"}' hosts.tab >> hosts.html
	@echo "</BODY></HTML>" >> hosts.html
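The doubled dollar sign in the awk command above matters: make expands a single $ itself, so $$2 is how a makefile passes a literal $2 through to the shell and on to awk. The following sketch assumes GNU make and awk are installed; it uses a two-line hosts.tab with made-up addresses and a hypothetical target called names.

```shell
# Work in a scratch directory with a tiny two-column hosts.tab.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '10.0.0.1 alpha\n10.0.0.2 beta\n' > hosts.tab

# The makefile line reads:  @awk '{print $$2}' hosts.tab > names
# (the '\'' sequences only embed single quotes in the printf argument).
printf 'names: hosts.tab\n\t@awk '\''{print $$2}'\'' hosts.tab > names\n' > Makefile

make -s
names=$(cat names)
echo "$names"
```

With a single $ in the makefile, make would try to expand $2 as one of its own variables before awk ever saw it.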


man

Displays documentation for commands

man [options] [section] command

man –k keyword

The man (manual) utility provides online documentation for Linux and Mac OS X commands. In

addition to user commands, documentation is available for many other commands and details that

relate to Linux and OS X. Because many Linux and OS X commands come from GNU, the GNU info

utility (page 36) frequently provides more complete information about them.

A one-line header is associated with each manual page. This header consists of a command name, the

section of the manual in which the command is found, and a brief description of what the command

does. These headers are stored in a database; thus you can perform quick searches on keywords

associated with each man page.

Arguments The section argument tells man to limit its search to the specified section of the man-

ual (see page 34 for a listing of manual sections). Without this argument man searches

the sections in numerical order and displays the first man page it finds. In the second

form of the man command, the –k option searches for the keyword in the database of

man page headers; man displays a list of headers that contain the keyword. A man –k

command performs the same function as apropos (page 35).

Options Options preceded by a double hyphen (––) work under Linux only. Not all options

preceded by a double hyphen work under all Linux distributions. Options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

––all –a Displays man pages for all sections of the manual. Without this option man dis-

plays only the first page it finds. Use this option when you are not sure which

section contains the desired information.

–K keyword

Searches for keyword in all man pages. This option can take a long time to run.

It is not available under some Linux distributions.

––apropos –k keyword

Displays manual page headers that contain the string keyword. You can scan

this list for commands of interest. This option is equivalent to the apropos com-

mand (page 35).

––manpath=path –M path

Searches the directories in path for man pages, where path is a colon-separated

list of directories. See “Discussion.”


––troff –t Formats the page for printing on a PostScript printer. The output goes to standard

output.

Discussion The manual pages are organized in sections, each pertaining to a separate aspect of

the Linux system. Section 1 contains user-callable commands and is the section

most likely to be accessed by users who are not system administrators or program-

mers. Other sections of the manual describe system calls, library functions, and

commands used by system administrators. See page 34 for a listing of the manual

sections.

Pager The man utility uses less to display manual pages that fill more than one screen. To

use another pager, set the environment variable PAGER to the pathname of that

pager. For example, adding the following line to the ~/.bash_profile file allows a

bash user to use more instead of less:

export PAGER=/usr/bin/more

Under OS X, less and more are copies of the same file. Because of the way each is

called, they work slightly differently.

MANPATH You can tell man where to look for man pages by setting the environment variable

MANPATH to a colon-separated list of directories. For example, bash users run-

ning Linux can add the following line to ~/.bash_profile to cause man to search the

/usr/man, /usr/local/man, and /usr/local/share/man directories:

export MANPATH=/usr/man:/usr/local/man:/usr/local/share/man

Working as a privileged user, you can edit /etc/manpath.config or /etc/man.config

(Linux) or /etc/man.conf (OS X) to further configure man. Refer to the man man

page for more information.

Notes See page 33 for another discussion of man.

The argument to man does not have to be a command name. For example, the com-

mand man ascii lists the ASCII characters and their various representations; the

command man –k postscript lists man pages that pertain to PostScript.

The man pages are commonly stored in an unformatted, compressed form. When

you request a man page, it has to be decompressed and formatted before being dis-

played. To speed up subsequent requests for that man page, man attempts to save

the formatted version of the page.

Some utilities described in the manual pages have the same name as shell builtin

commands. The behavior of the shell builtin may differ slightly from the behavior

of the utility as described in the manual page. For information about shell builtins,

see the man page for builtin or the man page for a specific shell.

References to man pages frequently use section numbers in parentheses. For example,

write(2) refers to the man page for write in section 2 of the manual (page 34).


The first of the following commands uses the col utility to generate a simple text man

page that does not include bold or underlined text. The second command generates a

PostScript version of the man page.

$ man ls | col -b > ls.txt

$ man -t ls > ls.ps

Under Linux you can use ps2pdf to convert the PostScript file to a PDF file.

Examples The following example uses man to display the documentation for the command

write, which sends messages to another user’s terminal:

$ man write

WRITE(1) BSD General Commands Manual WRITE(1)

NAME

write - send a message to another user

SYNOPSIS

write user [ttyname]

DESCRIPTION

The write utility allows you to communicate with other users, by copying

lines from your terminal to theirs.

When you run the write command, the user you are writing to gets a mes-

sage of the form:

Message from yourname@yourhost on yourtty at hh:mm ...

...

The next example displays the man page for another command—the man command

itself, which is a good starting place for someone learning about the system:

$ man man

MAN(1) Manual pager utils MAN(1)

NAME

man - an interface to the on-line reference manuals

SYNOPSIS

man [-c|-w|-tZ] [-H[browser]] [-T[device]] [-X[dpi]] [-adhu7V] [-i|-I]

[-m system[,...]] [-L locale] [-p string] [-C file] [-M path] [-P

pager] [-r prompt] [-S list] [-e extension] [--warnings [warnings]]

[[section] page ...] ...

...

DESCRIPTION

man is the system's manual pager. Each page argument given to man is

normally the name of a program, utility or function. The manual page

associated with each of these arguments is then found and displayed. A

...


You can also use the man utility to find the man pages that pertain to a certain topic.

In the next example, man –k displays man page headers containing the string latex.

The apropos utility functions similarly to man –k.

$ man -k latex

elatex (1) [latex] - structured text formatting and typesetting

latex (1) - structured text formatting and typesetting

mkindex (1) - script to process LaTeX index and glossary files

pdflatex (1) - PDF output from TeX

pod2latex (1) - convert pod documentation to latex format

Pod::LaTeX (3perl) - Convert Pod data to formatted Latex

...

The search for the keyword entered with the –k option is not case sensitive.

Although the keyword entered on the command line is all lowercase, it matches the

last header, which contains the string LaTeX (uppercase and lowercase). The 3perl

entry on the last line indicates the man page is from Section 3 (Subroutines) of the

Linux System Manual and comes from the Perl Programmers Reference Guide (it is

a Perl subroutine; see Chapter 11 for more information on the Perl programming

language).


mkdir

Creates a directory

mkdir [option] directory-list

The mkdir utility creates one or more directories.

Arguments The directory-list is a list of pathnames of directories that mkdir creates.

Options Under Linux, mkdir accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Options named with a

single letter and preceded by a single hyphen work under Linux and OS X.

––mode=mode –m mode

Sets the permission to mode. You can represent the mode absolutely using an

octal number (Table V-7 on page 627) or symbolically (Table V-4 on

page 626).

––parents –p Creates directories that do not exist in the path to the directory you wish to

create.

––verbose –v Displays the name of each directory created. This option is helpful when used

with the –p option.

Notes You must have permission to write to and search (execute permission) the parent

directory of the directory you are creating. The mkdir utility creates directories that

contain the standard hidden entries (. and ..).

Examples The following command creates the accounts directory as a subdirectory of the

working directory and the prospective directory as a subdirectory of accounts:

$ mkdir -p accounts/prospective

Without changing working directories, the same user now creates another subdirectory

within the accounts directory:

$ mkdir accounts/existing

Next the user changes the working directory to the accounts directory and creates

one more subdirectory:

$ cd accounts

$ mkdir closed

The last example shows the user creating another subdirectory. This time the

––mode option removes all access permissions for the group and others:

$ mkdir -m go= accounts/past_due
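
As a minimal sketch of how the –p, –v, and –m options combine, the following commands repeat the idea of the preceding examples in a scratch directory. The scratch directory created with mktemp is an assumption made for illustration, so nothing in your own directories is affected; the directory names mirror those used above.

```shell
# Work in a scratch directory so the sketch does not touch existing files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# -p creates the missing parent (accounts) as well as the target;
# -v reports each directory as it is created.
mkdir -p -v accounts/prospective

# -m sets the mode when the directory is created: go= removes all
# permissions for group and others.
mkdir -m go= accounts/past_due

# Display the resulting permissions.
ls -ld accounts/past_due
```

The ls –ld listing should begin with drwx------, showing that only the owner can access the new directory.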


mkfs

Creates a filesystem on a device

mkfs [options] device

The mkfs utility creates a filesystem on a device such as a floppy diskette or a partition of a hard disk.

It acts as a front end for programs that create filesystems, each specific to a filesystem type. The mkfs

utility is available under Linux only. L

Arguments The device is the name of the device that you want to create the filesystem on. If the

device name is in /etc/fstab, you can use the mount point of the device instead of the

device name (e.g., /home in place of /dev/sda2).

Options When you run mkfs, you can specify both global options and options specific to the

filesystem type that mkfs is creating (e.g., ext2, ext3, ext4, msdos, reiserfs). Global

options must precede type-specific options.

Global Options

–t fstype

(type) The fstype is the type of filesystem you want to create—for example,

ext2, ext3, msdos, or reiserfs. The default filesystem varies.

–v (verbose) Displays more output. Use –V for filesystem-specific information.

Filesystem Type-Specific Options

The options described in this section apply to many common filesystem types,

including ext2 and ext3. The following command lists the filesystem creation utili-

ties available on the local system:

$ ls /sbin/mkfs.*

/sbin/mkfs.bfs /sbin/mkfs.ext2 /sbin/mkfs.minix /sbin/mkfs.reiserfs

/sbin/mkfs.cramfs /sbin/mkfs.ext3 /sbin/mkfs.msdos /sbin/mkfs.vfat

There is frequently a link to /sbin/mkfs.ext2 at /sbin/mke2fs. Review the man page

or give the pathname of the filesystem creation utility to determine which options

the utility accepts.

$ /sbin/mkfs.ext3

Usage: mkfs.ext3 [-c|-l filename] [-b block-size] [-f fragment-size]

[-i bytes-per-inode] [-I inode-size] [-J journal-options]

[-N number-of-inodes] [-m reserved-blocks-percentage] [-o creator-os]

[-g blocks-per-group] [-L volume-label] [-M last-mounted-directory]

[-O feature[,...]] [-r fs-revision] [-E extended-option[,...]]

[-T fs-type] [-jnqvFSV] device [blocks-count]

mkfs destroys all data on a device

caution Be careful when using mkfs: It destroys all data on a device.


–b size

(block) Specifies the size of blocks in bytes. On ext2, ext3, and ext4 filesystems,

valid block sizes are 1,024, 2,048, and 4,096 bytes.

–c (check) Checks for bad blocks on the device before creating a filesystem. Specify

this option twice to perform a slow, destructive, read-write test.

Discussion Before you can write to and read from a hard disk or floppy diskette in the usual

fashion, there must be a filesystem on it. Typically a hard disk is divided into parti-

tions (page 970), each with a separate filesystem. A floppy diskette normally holds a

single filesystem. Refer to Chapter 4 for more information on filesystems.

Notes Under Mac OS X, use diskutil (page 668) to create a filesystem.

You can use tune2fs (page 868) with the –j option to change an existing ext2 filesystem

into a journaling filesystem (page 961) of type ext3. (See the “Examples” section.)

You can also use tune2fs to change how often fsck (page 699) checks a filesystem.

mkfs is a front end Much like fsck, mkfs is a front end that calls other utilities to handle various types of

filesystems. For example, mkfs calls mke2fs (which is typically linked to mkfs.ext2

and mkfs.ext3) to create the widely used ext2 and ext3 filesystems. Refer to the

mke2fs man page for more information. Other utilities that mkfs calls are typically

named mkfs.type, where type is the filesystem type. By splitting mkfs in this manner,

filesystem developers can provide programs to create their filesystems without

affecting the development of other filesystems or changing how system administra-

tors use mkfs.

Examples In the following example, mkfs creates a filesystem on the device at /dev/sda5. In

this case the default filesystem type is ext2.

# /sbin/mkfs /dev/sda5

mke2fs 1.41.4 (27-Jan-2009)

Filesystem label=

OS type: Linux

Block size=1024 (log=0)

Fragment size=1024 (log=0)

122880 inodes, 489948 blocks

24497 blocks (5.00%) reserved for the super user

First data block=1

Maximum filesystem blocks=67633152

60 block groups

8192 blocks per group, 8192 fragments per group

2048 inodes per group

Superblock backups stored on blocks:

8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Writing inode tables: done

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or

180 days, whichever comes first. Use tune2fs -c or -i to override.


Next the administrator uses tune2fs to convert the ext2 filesystem to an ext3 jour-

naling filesystem:

# /sbin/tune2fs -j /dev/sda5

tune2fs 1.41.4 (27-Jan-2009)

Creating journal inode: done

This filesystem will be automatically checked every 37 mounts or

180 days, whichever comes first. Use tune2fs -c or -i to override.
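
To experiment with filesystem creation without risking a real partition, you can point a filesystem creation utility at an ordinary file instead of a device. The following sketch assumes that mkfs.ext2 (part of e2fsprogs) is installed and in your PATH; the image file comes from mktemp and is purely illustrative. The –F option tells mke2fs to proceed even though the target is not a block device, and –q suppresses the usual report.

```shell
# Create an 8-megabyte file of zeros to stand in for a device.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=8192 2>/dev/null

# Build an ext2 filesystem inside the file. Only the file is overwritten;
# no partition or floppy diskette is touched.
if command -v mkfs.ext2 >/dev/null 2>&1; then
    mkfs.ext2 -q -F "$img"
else
    echo "mkfs.ext2 not found in PATH (it often lives in /sbin)" >&2
fi

# The image keeps its size; the filesystem lives inside it.
ls -l "$img"
```

Because the target is a plain file, a mistake costs nothing more than the file itself, which makes this a convenient way to try out type-specific options before using them on a device.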


Mtools

Uses DOS-style commands on files and directories

mcd [directory]

mcopy [options] file-list target

mdel file-list

mdir [–w] directory

mformat [options] device

mtype [options] file-list

These utilities mimic DOS commands and manipulate Linux, Mac OS X, or DOS files. The mcopy

utility provides an easy way to move files between a Linux/OS X filesystem and a DOS disk. The

default drive for all commands is /dev/fd0 or A:.

Utilities Table V-20 lists some of the utilities in the Mtools collection.

Arguments The directory, which is used with mcd and mdir, must be the name of a directory on

a DOS disk. The file-list, which is used with mcopy and mtype, is a SPACE-separated

list of filenames. The target, which is used with mcopy, is the name of a regular file

or a directory. If you give mcopy a file-list with more than one filename, target must

be the name of a directory. The device, which is used with mformat, is the DOS drive

letter containing the disk to be formatted (for example, A:).

Options mcopy

–n Automatically replaces existing files without asking. Normally mcopy asks for

verification before overwriting a file.

–p (preserve) Preserves the attributes of files when they are copied.

–s (recursive) Copies directory hierarchies.

Table V-20 The Mtools collection

Utility Function

mcd Changes the working directory on the DOS disk

mcopy Copies DOS files from one directory to another

mdel Deletes DOS files

mdir Lists the contents of DOS directories

mformat Adds DOS formatting information to a disk

mtype Displays the contents of DOS files


–t (text) Converts DOS text files for use on a Linux/OS X system, and vice versa.

Lines in DOS text files are terminated with the character pair RETURN–NEWLINE;

lines in Linux/OS X text files end in NEWLINE. This option removes the RETURN

character when copying from a DOS file and adds it when copying from a

Linux file.

mdir

–w (wide) Displays only filenames and fits as many as possible on each line. By

default mdir lists information about each file on a separate line, showing the

filename, size, and creation time.

mformat

–f 1440

Specifies a 1,440K 3.5-inch HD floppy diskette.

–f 2880

Specifies a 2,880K 3.5-inch ED floppy diskette.

–v vol

(label) Puts vol as the volume label on the newly formatted DOS disk.

mtype

–t (text) Similar to the –t option for mcopy, this option replaces each RETURN–NEWLINE

character pair in the DOS file with a single NEWLINE character before displaying

the file.
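
The effect of the –t options on line endings can be illustrated without a DOS disk. The following sketch is not Mtools itself: it uses tr to remove the RETURN characters, which is assumed here to approximate what mcopy –t and mtype –t do when they read a DOS text file. The filenames are hypothetical.

```shell
scratch=$(mktemp -d)

# Build a two-line file with DOS line endings (RETURN-NEWLINE).
printf 'line one\r\nline two\r\n' > "$scratch/dos.txt"

# Remove each RETURN, leaving the NEWLINE that Linux/OS X expects.
tr -d '\r' < "$scratch/dos.txt" > "$scratch/unix.txt"

# od -c makes the difference visible: \r \n versus \n alone.
od -c "$scratch/dos.txt"
od -c "$scratch/unix.txt"
```

The converted file is two bytes shorter than the original, one byte for each RETURN that was removed.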

Discussion Although these utilities mimic their DOS counterparts, they do not attempt to

match those tools exactly. In most cases the restrictions imposed by DOS are

removed. For example, the ambiguous file reference represented by the asterisk (*)

matches all filenames (as it does under Linux/OS X), including those filenames that

DOS would require *.* to match.

Notes In this discussion, the term DOS disk refers to either a DOS partition on a hard disk

or a DOS floppy diskette.

If Mtools is not available in a Linux distribution repository, you can download

Mtools from the Mtools home page (www.gnu.org/software/mtools). Under Mac

OS X, follow these steps to download and compile Mtools:

1. Install MacPorts.

2. Exit from any running terminals.

3. Start a terminal and run sudo port install Mtools.


If the local kernel is configured to support DOS filesystems, you can mount DOS

disks on a Linux/OS X filesystem and manipulate the files using Linux/OS X utili-

ties. Although this feature is handy and reduces the need for Mtools, it may not be

practical or efficient to mount and unmount DOS filesystems each time you need to

access a DOS file. These tasks can be time-consuming, and some systems are set up

so regular users cannot mount and unmount filesystems.

Use caution when using Mtools. Its utilities may not warn you if you are about to

overwrite a file. Using explicit pathnames—not ambiguous file references—reduces

the chance of overwriting a file.

The Mtools utilities are most commonly used to examine files on DOS floppy

diskettes (mdir) and to copy files between a DOS floppy diskette and the Linux

filesystem (mcopy). You can identify DOS disks by using the usual DOS drive let-

ters: A: for the first floppy drive, C: for the first hard disk, and so on. To separate

filenames in paths use either the Linux forward slash (/) or the DOS backslash (\).

You need to escape backslashes to prevent the shell from interpreting them before

passing the pathname to the utility you are using.
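
The following sketch shows two ways of protecting a DOS-style pathname from the shell. It uses printf only to display the argument that a utility would receive; the pathname a:\docs\readme.txt is a made-up example.

```shell
# Escape each backslash with another backslash ...
printf '%s\n' a:\\docs\\readme.txt

# ... or enclose the whole pathname within single quotation marks.
printf '%s\n' 'a:\docs\readme.txt'

# Unprotected, the shell removes the backslashes before the utility runs.
printf '%s\n' a:\docs\readme.txt
```

The first two commands display a:\docs\readme.txt; the third displays a:docsreadme.txt, which is not the pathname you intended. Using forward slashes avoids the issue entirely.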

Each of the Mtools utilities returns an exit code of 0 on success, 1 on complete fail-

ure, and 2 on partial failure.

Examples In the first example, mdir displays the contents of a DOS floppy diskette in /dev/fd0:

$ mdir

Volume in drive A is DOS UTY

Directory for A:/

ACAD LIF 419370 5-10-09 1:29p

CADVANCE LIF 40560 2-08-08 10:36a

CHIPTST EXE 2209 4-26-09 4:22p

DISK ID 31 12-27-09 4:49p

GENERIC LIF 20983 2-08-08 10:37a

INSTALL COM 896 7-05-09 10:23a

INSTALL DAT 45277 12-27-09 4:49p

KDINSTAL EXE 110529 8-13-09 10:50a

LOTUS LIF 44099 1-18-09 3:36p

PCAD LIF 17846 5-01-09 3:46p

READID EXE 17261 5-07-09 8:26a

README TXT 9851 4-30-09 10:32a

UTILITY LIF 51069 5-05-09 9:13a

WORD LIF 16817 7-01-09 9:58a

WP LIF 57992 8-29-09 4:22p

15 File(s) 599040 bytes free

The next example uses mcopy to copy the *.TXT files from the DOS floppy diskette

to the working directory on the local filesystem. Because only one file has the exten-

sion .TXT, only one file is copied. Because .TXT files are usually text files under


DOS, the –t option strips the unnecessary RETURN characters at the end of each line.

The ambiguous file reference * is escaped on the command line to prevent the shell

from attempting to expand it before passing the argument to mcopy. The mcopy util-

ity locates the file README.TXT when given the pattern *.txt because DOS does

not differentiate between uppercase and lowercase letters in filenames.

$ mcopy -t a:\*.txt .

Copying README.TXT

Finally, the DOS floppy diskette is reformatted using mformat, which wipes all data

from the diskette. If the diskette has not been low-level formatted, you need to use

fdformat before giving the following command:

$ mformat a:

A check with mdir shows the floppy diskette is empty after formatting:

$ mdir a:

Can't open /dev/fd0: No such device or address

Cannot initialize 'A:'

mv

Renames or moves a file

mv [options] existing-file new-filename

mv [options] existing-file-list directory

mv [options] existing-directory new-directory

The mv utility, which renames or moves one or more files, has three formats. The first renames a sin-

gle file with a new filename that you supply. The second renames one or more files so that they

appear in a specified directory. The third renames a directory. The mv utility physically moves the file

if it is not possible to rename it (that is, if you move the file from one filesystem to another).

Arguments In the first form, the existing-file is a pathname that specifies the ordinary file you

want to rename. The new-filename is the new pathname of the file.

In the second form, the existing-file-list is a list of the pathnames of the files you

want to rename and the directory specifies the new parent directory for the files.

The files you rename will have the same simple filenames as each of the files in the

existing-file-list but new absolute pathnames.

The third form renames the existing-directory with the new-directory name. This

form works only when the new-directory does not already exist.

Options Under Linux, mv accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––backup –b Makes a backup copy (by appending ~ to the filename) of any file that would

be overwritten. L

––force –f Causes mv not to prompt you if a move would overwrite an existing file that

you do not have write permission for. You must have write permission for the

directory holding the existing file.

––interactive –i Prompts for confirmation if mv would overwrite a file. If your response begins

with a y or Y, mv overwrites the file; otherwise, mv does not move the file.

––update –u If a move would overwrite an existing file—not a directory—this option causes

mv to compare the modification times of the source and target files. If the tar-

get file has a more recent modification time (the target is newer than the

source), mv does not replace it. L

––verbose –v Lists files as they are moved.

772 mv

Notes When GNU mv copies a file from one filesystem to another, mv is implemented as

cp (with the –a option) and rm: It first copies the existing-file to the new-file and

then deletes the existing-file. If the new-file already exists, mv may delete it before

copying.
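
The copy-then-delete behavior described in this note can be sketched with cp and rm directly. This is only an illustration of the idea, not the actual implementation of mv; the two scratch directories stand in for two different filesystems, and the filename is hypothetical.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/fs1" "$scratch/fs2"
printf 'quarterly figures\n' > "$scratch/fs1/report"

# Copy the file, preserving its attributes (-a), and remove the
# original only if the copy succeeded.
cp -a "$scratch/fs1/report" "$scratch/fs2/report" && rm "$scratch/fs1/report"

ls "$scratch/fs1" "$scratch/fs2"
```

Chaining the commands with && ensures that the original is not deleted when the copy fails, for example because the target filesystem is full.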

As with rm, you must have write and execute access permissions to the parent direc-

tory of the existing-file, but you do not need read or write access permission to the

file itself. If the move would overwrite a file that you do not have write permission

for, mv displays the file’s access permissions and waits for a response. If you enter y

or Y, mv overwrites the file; otherwise, it does not move the file. If you use the –f

option, mv does not prompt you for a response but simply overwrites the file.

Although earlier versions of mv could move only ordinary files between filesystems,

mv can now move any type of file, including directories and device files.

Examples The first command renames letter, a file in the working directory, as letter.1201:

$ mv letter letter.1201

The next command renames the file so it appears with the same simple filename in

the user’s ~/archives directory:

$ mv letter.1201 ~/archives

The following command moves all files in the working directory whose names begin

with memo so they appear in the /p04/backup directory:

$ mv memo* /p04/backup

Using the –u option prevents mv from replacing a newer file with an older one. After

the mv –u command shown below is executed, the newer file (memo2) has not been

overwritten. The mv command without the –u option overwrites the newer file

(memo2’s modification time and size have changed to those of memo1).

$ ls -l

-rw-rw-r-- 1 sam sam 22 Mar 25 23:34 memo1

-rw-rw-r-- 1 sam sam 19 Mar 25 23:40 memo2

$ mv -u memo1 memo2

$ ls -l

-rw-rw-r-- 1 sam sam 22 Mar 25 23:34 memo1

-rw-rw-r-- 1 sam sam 19 Mar 25 23:40 memo2

$ mv memo1 memo2

$ ls -l

-rw-rw-r-- 1 sam sam 22 Mar 25 23:34 memo2
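
You can reproduce the –u behavior in a scratch directory. The sketch below assumes GNU mv (the –u option is available under Linux only) and uses touch –t to backdate one file; the file contents and the timestamp are arbitrary choices made for illustration.

```shell
scratch=$(mktemp -d)
printf 'older\n' > "$scratch/memo1"
printf 'newer\n' > "$scratch/memo2"

# Give memo1 a modification time well in the past.
touch -t 202001010000 "$scratch/memo1"

# With -u, mv leaves the newer target alone and memo1 stays in place.
mv -u "$scratch/memo1" "$scratch/memo2"

ls -l "$scratch"
cat "$scratch/memo2"
```

After the command runs, both files are still present and memo2 retains its original contents.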


nice

Changes the priority of a command

nice [option] [command-line]

The nice utility reports the priority of the shell or alters the priority of a command. An ordinary user

can decrease the priority of a command. Only a user working with root privileges can increase the

priority of a command. The nice builtin in the TC Shell has a different syntax. Refer to the “Notes”

section for more information.

Arguments The command-line is the command line you want to execute at a different priority.

Without any options or arguments, nice displays the priority of the shell running nice.

Options Without an option, nice defaults to an adjustment of 10, lowering the priority of the

command by 10—typically from 0 to 10. As you raise the priority value, the com-

mand runs at a lower priority.

The option preceded by a double hyphen (––) works under Linux only. The option

named with a single letter and preceded by a single hyphen works under Linux and

OS X.

––adjustment=value

–n value

Changes the priority by the increment (or decrement) specified by value. The

priorities range from –20 (the highest priority) to 19 (the lowest priority). A

positive value lowers the priority, whereas a negative value raises the priority.

Only a user working with root privileges can specify a negative value. When

you specify a value outside this range, the priority is set to the limit of the range.

Notes You can use renice (page 802) or top’s r command (page 860) to change the priority

of a running process.

Higher (more positive) priority values mean that the kernel schedules a job less

often. Lower (more negative) values cause the job to be scheduled more often.

When a user working with root privileges schedules a job to run at the highest pri-

ority, this change can affect the performance of the system for all other jobs,

including the operating system itself. For this reason you should be careful when

using nice with negative values.

The TC Shell includes a nice builtin. Under tcsh, use the following syntax to change

the priority at which command-line is run. The default priority is 4. You must

include the plus sign for positive values.

nice [±value] command line


The tcsh nice builtin works differently from the nice utility: When you use the

builtin, nice –5 decrements the priority at which command-line is run. When you

use the utility, nice –n –5 increments the priority at which command-line is run.

Examples The following command executes find in the background at the lowest possible priority.

The ps –l command displays the nice value of the command in the NI column:

# nice -n 19 find / -name core -print > corefiles.out &

[1] 422

# ps -l

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD

4 R 0 389 8657 0 80 0 - 4408 - pts/4 00:00:00 bash

4 D 0 422 389 28 99 19 - 1009 - pts/4 00:00:04 find

0 R 0 433 389 0 80 0 - 1591 - pts/4 00:00:00 ps

The next command finds very large files and runs at a high priority (–15):

# nice -n -15 find / -size +50000k
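
Because nice without arguments reports the current priority, you can run nice under nice to observe an adjustment without root privileges. The following sketch assumes the nice utility (not the tcsh builtin) and a shell whose starting nice value is low enough that adding 5 does not reach the limit of 19; the variable names are arbitrary.

```shell
# Nice value inherited from the current shell, typically 0.
base=$(nice)

# Run a second nice with an adjustment of 5; it reports its own value.
lowered=$(nice -n 5 nice)

echo "shell: $base  adjusted command: $lowered"
```

A positive adjustment such as this one lowers the priority and is available to any user; a negative adjustment would require root privileges.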


nohup

Runs a command that keeps running after you log out

nohup command line

The nohup utility executes a command line such that the command keeps running after you log out.

In other words, nohup causes a process to ignore a SIGHUP signal. Depending on how the local shell

is configured, it may kill your background processes when you log out. The TC Shell includes a

nohup builtin. Refer to the “Notes” section for more information.

Arguments The command line is the command line you want to execute.

Notes Under Linux, nohup accepts the common options described on page 603.

If you do not redirect the output from a command you execute using nohup, both

standard output and standard error are sent to the file named nohup.out in the

working directory. If you do not have write permission for the working directory,

nohup sends output to ~/nohup.out.

Unlike the nohup utility, the TC Shell’s nohup builtin does not send output to

nohup.out. Background jobs started from tcsh continue to run after you log out.

Examples The following command executes find in the background, using nohup:

$ nohup find / -name core -print > corefiles.out &

[1] 14235
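
The next sketch redirects standard input, standard output, and standard error explicitly, so nohup has no reason to create nohup.out. The command being run and the log filename are placeholders; wait appears only so the sketch can display the result once the job finishes.

```shell
scratch=$(mktemp -d)

# Start a short job that ignores SIGHUP; all three standard streams
# are redirected, so nothing is written to nohup.out.
nohup sh -c 'sleep 1; echo finished' > "$scratch/job.log" 2>&1 < /dev/null &

# Wait for the background job, then look at what it wrote.
wait $!
cat "$scratch/job.log"
```

Collecting standard output and standard error in a file you name yourself makes it easier to find the results of a job after you log back in.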

od

Dumps the contents of a file

od [options] [file-list]

The od (octal dump) utility dumps the contents of a file. The dump is useful for viewing executable (object)

files and text files with embedded nonprinting characters. This utility takes its input from the file you spec-

ify on the command line or from standard input. The od utility is available under Linux only. L

Arguments The file-list specifies the pathnames of the files that od displays. When you do not

specify a file-list, od reads from standard input.

Options The od utility accepts the common options described on page 603.

––address-radix=base

–A base

Specifies the base used when displaying the offsets shown for positions in the

file. By default offsets are given in octal. Possible values for base are d (decimal),

o (octal), x (hexadecimal), and n (no offsets printed).

––skip-bytes=n –j n

Skips n bytes before displaying data.

––read-bytes=n –N n

Reads a maximum of n bytes and then quits.

––strings=n –S n

Outputs from the file only those bytes that contain runs of n or more printable

ASCII characters that are terminated by a NULL byte.

––format=type[n]

–t type[n]

Specifies the output format for displaying data from a file. You can repeat this

option with different format types to see the file in several different formats.

Table V-21 lists the possible values for type.

By default od dumps a file as 2-byte octal numbers. You can specify the num-

ber of bytes od uses to compose each number by specifying a length indicator,

n. You can specify a length indicator for all types except a and c. Table V-23

lists the possible values of n.

Table V-21 Output formats

type Type of output

a Named character. Displays nonprinting control characters using their official

ASCII names. For example, FORMFEED is displayed as ff.

c ASCII character. Displays nonprinting control characters as backslash escape

sequences (Table V-22) or three-digit octal numbers.

d Signed decimal.

f Floating point.

o Octal (default).

u Unsigned decimal.

x Hexadecimal.

Table V-22 Output format type c backslash escape sequences

Sequence Meaning

\0 NULL

\a BELL

\b BACKSPACE

\f FORMFEED

\n NEWLINE

\r RETURN

\t HORIZONTAL TAB

\v VERTICAL TAB

Table V-23 Length indicators

Indicator Number of bytes to use

Integers (types d, o, u, and x)

C (character) Uses single characters for each decimal value

S (short integer) Uses 2 bytes

I (integer) Uses 4 bytes

L (long) Uses 4 bytes on 32-bit machines and 8 bytes on 64-bit machines

Floating point (type f)

F (float) Uses 4 bytes

D (double) Uses 8 bytes

L (long double) Typically uses 8 bytes

Notes To retain backward compatibility with older, non-POSIX versions of od, the od

utility includes the options listed in Table V-24 as shorthand versions of many of the

preceding options.

Table V-24 Shorthand format specifications

Shorthand Equivalent specification

–a –t a

–b –t oC

–c –t c

–d –t u2

–f –t fF

–h –t x2

–i –t d2

–l –t d4

–o –t o2

–x –t x2

Examples The file ac, which is used in the following examples, contains all of the ASCII

characters. In the first example, the bytes in this file are displayed as named characters.

The first column shows the offset of the first byte on each line of output from the

start of the file. The offsets are displayed as octal values.

$ od -t a ac

0000000 nul soh stx etx eot enq ack bel bs ht nl vt ff cr so si

0000020 dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us

0000040 sp ! " # $ % & ' ( ) * + , - . /

0000060 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

0000100 @ A B C D E F G H I J K L M N O

0000120 P Q R S T U V W X Y Z [ \ ] ^ _

0000140 ` a b c d e f g h i j k l m n o

0000160 p q r s t u v w x y z { | } ~ del

0000200 nul soh stx etx eot enq ack bel bs ht nl vt ff cr so si

0000220 dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us

0000240 sp ! " # $ % & ' ( ) * + , - . /

0000260 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

0000300 @ A B C D E F G H I J K L M N O

0000320 P Q R S T U V W X Y Z [ \ ] ^ _

0000340 ` a b c d e f g h i j k l m n o

0000360 p q r s t u v w x y z { | } ~ del

0000400 nl

0000401

In the next example, the bytes are displayed as octal numbers, ASCII characters, or

printing characters preceded by a backslash (refer to Table V-22 on page 777):

$ od -t c ac

0000000 \0 001 002 003 004 005 006 \a \b \t \n \v \f \r 016 017

0000020 020 021 022 023 024 025 026 027 030 031 032 033 034 035 036 037

0000040 ! " # $ % & ' ( ) * + , - . /

0000060 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

0000100 @ A B C D E F G H I J K L M N O

0000120 P Q R S T U V W X Y Z [ \ ] ^ _

0000140 ` a b c d e f g h i j k l m n o

0000160 p q r s t u v w x y z { | } ~ 177

0000200 200 201 202 203 204 205 206 207 210 211 212 213 214 215 216 217

0000220 220 221 222 223 224 225 226 227 230 231 232 233 234 235 236 237

0000240 240 241 242 243 244 245 246 247 250 251 252 253 254 255 256 257

0000260 260 261 262 263 264 265 266 267 270 271 272 273 274 275 276 277

0000300 300 301 302 303 304 305 306 307 310 311 312 313 314 315 316 317

0000320 320 321 322 323 324 325 326 327 330 331 332 333 334 335 336 337

0000340 340 341 342 343 344 345 346 347 350 351 352 353 354 355 356 357

0000360 360 361 362 363 364 365 366 367 370 371 372 373 374 375 376 377

0000400 \n

0000401
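
To compare the a and c output formats on something smaller than the ac file, you can pipe a few bytes into od. In the following sketch the input string is arbitrary, and –A n suppresses the offset column so that only the data appear.

```shell
# Named characters: TAB is shown as ht and NEWLINE as nl.
printf 'a\tb\n' | od -A n -t a

# ASCII characters: the same bytes appear as backslash escape sequences.
printf 'a\tb\n' | od -A n -t c
```

The first command displays a, ht, b, and nl; the second displays a, \t, b, and \n. The spacing between the fields depends on the version of od.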

The final example finds in the file /usr/bin/who all strings that are at least three

characters long (the default) and are terminated by a null byte. See strings on

page 837 for another way of displaying a similar list. The offset positions are given

as decimal offsets instead of octal offsets.

$ od -A d -S 3 /usr/bin/who

...

0028455 /usr/share/locale

0028473 Michael Stone

0028487 David MacKenzie

0028503 Joseph Arceneaux

0028520 6.10

0028525 GNU coreutils

0028539 who

0028543 abdlmpqrstuwHT

0028558 %Y-%m-%d %H:%M

0028573 %b %e %H:%M

0028585 extra operand %s

0028602 all

0028606 count

0028612 dead

0028617 heading

0028625 ips

0028629 login

...
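
The –A, –j, and –N options can be tried on a short string before you apply them to a large binary file. The following sketch uses arbitrary input; it assumes an od that accepts these options, as the version described here does.

```shell
# Skip the first 2 bytes (-j 2), then read at most 3 bytes (-N 3).
printf 'abcdef' | od -A n -t c -j 2 -N 3

# Display offsets in decimal (-A d) for 20 bytes of input. The second
# line of output starts at offset 16 and the final offset is 20.
printf 'abcdefghijklmnopqrst' | od -A d -t c
```

The first command displays only c, d, and e. With the default octal offsets, the second command would label its second line 0000020 and end with 0000024 instead.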


open O

Opens files, directories, and URLs

open [option] [file-list]

The open utility opens one or more files, directories, or URLs.

Arguments The file-list specifies the pathnames of the files, directories, or URLs that open is to

open.

Options Without any options, open opens the files in file-list as though you had double-

clicked each of the files’ icons in the Finder.

–a application

Opens file-list using application. This option is equivalent to dragging file-list

to the application’s icon in the Finder.

–b bundle

Opens file-list using the application with bundle identifier bundle. A bundle

identifier is a string, registered with the system, that identifies an application

that can open files. For example, the bundle identifier com.apple.TextEdit

specifies the TextEdit editor.

–e (edit) Opens file-list using the TextEdit application.

–f (file) Opens standard input as a file in the default text editor. This option does

not accept file-list.

–t (text) Opens file-list using the default text editor (see the “Discussion” section).

Discussion Opening a file brings up the application associated with that file. For example,

opening a disk image file mounts it. The open command returns immediately, with-

out waiting for the application to launch.

LaunchServices is a system framework that identifies applications that can open

files. It maintains lists of available applications and user preferences about which

application to use for each file type. LaunchServices also keeps track of the default

text editor used by the –t and –f options.

Notes When a file will be opened by a GUI application, you must run open from Terminal

or another terminal emulator that is running under a GUI. Otherwise, the operation

will fail.

Examples The first example mounts the disk image file backups.dmg. The disk is mounted in

/Volumes, using the name it was formatted with.

$ ls /Volumes

House Spare Leopard

$ open backups.dmg

$ ls /Volumes

Backups House Spare Leopard

The next command opens the file picture.jpg. You must run this and the following

example from a textual window within a GUI (e.g., Terminal). The application

selected depends on the file attributes. If the file’s type and creator code specify a

particular application, open opens the file using that application. Otherwise, open

uses the system’s default program for handling .jpg files.

$ open picture.jpg

The final example opens the /usr/bin directory in the Finder. The /usr directory is

normally hidden from the Finder because its invisible file attribute flag (page 931) is

set. However, the open command can open any file you can access from the shell,

even if it is not normally accessible from the Finder.

$ open /usr/bin

otool O

Displays object, library, and executable files

otool options file-list

The otool utility displays information about, or part of, object, library, and executable files.

Arguments The file-list specifies the pathnames of files that otool is to display.

Options You must use at least one of the –L, –M, –t, or –T options to specify which part of

each file in file-list otool is to display.

–L (libraries) Displays the names and version numbers of the shared libraries an

object file uses.

–M (module) Displays the module table of a shared library.

–p name

(print) Begins output at the symbol named name. This option requires the –t

option and either the –v or –V option.

–T (table of contents) Displays the table of contents of a shared library.

–t (text) Displays the TEXT section of an object file.

–V (very verbose) Displays even more data (than the –v option) symbolically.

When displaying code, this option causes otool to display the names of called

routines instead of their addresses.

–v (verbose) Displays data symbolically. When displaying code, this option causes

otool to display the names of instructions instead of numeric codes.

Discussion The otool utility displays information about the contents and dependencies of object

files. This information can prove helpful when you are debugging a program. For

example, when you are setting up a chroot jail, otool can report which libraries are

needed to run a given program.

Some options are useful only with certain types of object modules. For example, the

–T option does not report anything for a typical executable file.

Notes The otool utility is part of the Developer Tools optional install.

An otool –L command performs a function similar to the ldd utility on systems

using the ELF binary format.

Examples The examples in this section use the compiled version of the following C program:

$ cat myname.c

#include <stdio.h>

int main(void) {

    printf("My name is Sam.\n");

    return 0;

}

In the first example, otool displays the libraries the program is linked with:

$ otool -L myname

myname:

/usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0)

/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

In some cases, a library used by a program will depend on other libraries. You can

use otool -L on a library to see whether it uses other libraries:

$ otool -L /usr/lib/libmx.A.dylib /usr/lib/libSystem.B.dylib

/usr/lib/libmx.A.dylib:

/usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0)

/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

/usr/lib/libSystem.B.dylib:

/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

/usr/lib/system/libmathCommon.A.dylib (compatibility version 1.0.0, current ...

The next example disassembles the code for the main function. When compiling

programs, the compiler sometimes modifies symbol names. The compiler gives

functions, such as main, a leading underscore; thus the symbol name is _main.

$ otool -Vt -p _main myname

myname:

(__TEXT,__text) section

_main:

00002ac0 mfspr r0,lr

00002ac4 stmw r30,0xfff8(r1)

00002ac8 stw r0,0x8(r1)

00002acc stwu r1,0xffb0(r1)

00002ad0 or r30,r1,r1

00002ad4 bcl 20,31,0x2ad8

00002ad8 mfspr r31,lr

00002adc addis r2,r31,0x0

00002ae0 addi r3,r2,0x4b8

00002ae4 bl _printf$LDBLStub

00002ae8 li r0,0x0

00002aec or r3,r0,r0

00002af0 lwz r1,0x0(r1)

00002af4 lwz r0,0x8(r1)

00002af8 mtspr lr,r0

00002afc lmw r30,0xfff8(r1)

00002b00 blr

...

paste

Joins corresponding lines from files

paste [option] [file-list]

The paste utility reads lines from the file-list and joins corresponding lines in its output. By default

output lines are separated by a TAB character.

Arguments The file-list is a list of ordinary files. When you omit the file-list, paste reads from

standard input.

Options Under Linux, paste accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Options named with a

single letter and preceded by a single hyphen work under Linux and OS X.

––delimiter=dlist –d dlist

The dlist is a list of characters used to separate output fields. If dlist contains a

single character, paste uses that character instead of the default TAB character to

separate fields. If dlist contains more than one character, the characters are

used in turn to separate fields and are then reused from the beginning of the

list as necessary.

––serial –s Processes one file at a time; pastes horizontally. See the “Examples” section.

Notes The paste utility is often used to rearrange the columns of a table. A utility, such as

cut, can place the desired columns in separate files, and then paste can join them in

any order.
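The column-rearranging technique described in the preceding note can be sketched as follows. The table contents and the filenames (table, names, homes) are hypothetical; the sketch assumes a colon-separated input file.

```shell
dir=$(mktemp -d)
printf 'sam:401:/home/sam\nmax:402:/home/max\n' > "$dir/table"

# Split the columns of interest into separate files ...
cut -d: -f1 "$dir/table" > "$dir/names"
cut -d: -f3 "$dir/table" > "$dir/homes"

# ... then join them again in the opposite order, still colon-separated.
swapped=$(paste -d: "$dir/homes" "$dir/names")
printf '%s\n' "$swapped"

rm -r "$dir"
```

The output lists each home directory first, followed by the username, with the numeric column dropped.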

Examples The following example uses the files fnames and acctinfo. You can create these files

using cut (page 652) and the /etc/passwd file. The paste command puts the full-

name field first, followed by the remaining user account information. A TAB charac-

ter separates the two output fields. Although this example works under Mac OS X,

/etc/passwd does not contain information about most users; see “Open Directory”

on page 926 for more information.

$ cat fnames

Sam the Great

Max Wild

Zach Brill

Helen Simpson

$ cat acctinfo

sam:x:401:50:/home/sam:/bin/zsh

max:x:402:50:/home/max:/bin/bash

zach:x:504:500:/home/zach:/bin/tcsh

hls:x:505:500:/home/hls:/bin/bash

$ paste fnames acctinfo

Sam the Great sam:x:401:50:/home/sam:/bin/zsh

Max Wild max:x:402:50:/home/max:/bin/bash

Zach Brill zach:x:504:500:/home/zach:/bin/tcsh

Helen Simpson hls:x:505:500:/home/hls:/bin/bash

The next examples use the files p1, p2, p3, and p4. In the last example in this group,

the –d option gives paste a list of characters to use to separate output fields:

$ cat p1

1

one

ONE

$ cat p2

2

two

TWO

extra

$ cat p3

3

three

THREE

$ cat p4

4

four

FOUR

$ paste p4 p3 p2 p1

4 3 2 1

four three two one

FOUR THREE TWO ONE

extra

$ paste -d "+-=" p3 p2 p1 p4

3+2-1=4

three+two-one=four

THREE+TWO-ONE=FOUR

+extra-=
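The way paste reuses a short delimiter list can be checked with a small sketch. The files c1 through c4 and their contents are made up for illustration; with two delimiters and four files, the third separator wraps around to the first delimiter.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/c1"
printf '1\n2\n' > "$dir/c2"
printf 'x\ny\n' > "$dir/c3"
printf 'p\nq\n' > "$dir/c4"

# Two delimiters, three gaps per line: ':' then ',' then ':' again.
cycled=$(paste -d ':,' "$dir/c1" "$dir/c2" "$dir/c3" "$dir/c4")
printf '%s\n' "$cycled"

rm -r "$dir"
```

The first output line is a:1,x:p, showing the colon used for both the first and third gaps.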

The final example uses the ––serial option to paste the files one at a time:

$ paste --serial p1 p2 p3 p4

1 one ONE

2 two TWO extra

3 three THREE

4 four FOUR
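A common use of serial mode is to join all lines of a stream into a single line. The following sketch assumes input arriving on standard input (named by the - argument) and uses a plus sign as an arbitrary delimiter.

```shell
# Join the lines of standard input with + signs.
joined=$(printf '1\n2\n3\n' | paste -s -d '+' -)
printf '%s\n' "$joined"
```

The result is the single line 1+2+3.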

pax

Creates an archive, restores files from an archive, or copies a directory hierarchy

pax [options] [pattern-list]

pax –w [options] [source-files]

pax –r [options] [pattern-list]

pax –rw [options] [source-files] destination-directory

The pax utility has four modes of operation: List mode displays the list of files in an archive, create

mode places multiple files into a single archive file, extract mode restores files from an archive, and

copy mode copies a directory hierarchy. The archive files created by pax can be saved on disk, remov-

able media, or a remote system.

A major option determines the mode pax runs in. List mode (absence of a major option) reads an

archive from standard input and displays a list of files stored in the archive. Create mode (–w for

write) reads a list of ordinary or directory filenames from the command line or standard input and

writes the resulting archive file to standard output. Extract mode (–r for read) reads an archive from

standard input and extracts files from that archive; you can restore all files from the archive or only

those files whose names match a pattern. Copy mode (–rw) reads a list of ordinary or directory file-

names from the command line or standard input and copies the files to an existing directory.

Arguments In create mode, pax reads the names of files that will be included in the archive from

source-files on the command line. If source-files is not present, pax reads the names

from standard input, one filename per line.

By default, in extract mode pax extracts, and in list mode pax displays the names of,

all files in the archive it reads from standard input. You can choose to extract or dis-

play the names of files selectively by supplying a pattern-list. If the name of a file in

the archive matches a pattern in the pattern-list, pax extracts that file or displays that

filename; otherwise, it ignores the file. The pax patterns are similar to shell wildcards

(page 136) except they match slashes (/) and a leading period (.) in a filename. See

the fnmatch man page for more information about pax patterns (OS X only).

In copy mode, pax does not create destination-directory. Thus destination-directory

must exist before you give pax a copy mode command.

Options A major option specifies the mode in which pax operates: create, extract, or copy.

Major Options

Three options determine the mode in which pax operates. You must include zero or

one of these options. Without a major option, pax runs in list mode.

–r (read) Reads the archive from standard input and extracts files. Without a

pattern-list on the command line, pax extracts all files from the archive. With a

pattern-list, pax extracts only those files whose names match one of the patterns

in the pattern-list. The following example extracts from the root.pax

archive file on an external drive only those files whose names end in .c:

$ pax -r \*.c < /Volumes/Backups/root.pax

The backslash prevents the shell from expanding the * before it passes the

argument to pax.

–w (write) Constructs an archive from the files named on the command line or

standard input. These files may be ordinary or directory files. When the files

come from standard input, each must appear on a separate line. The archive is

written to standard output. The find utility frequently generates the filenames

that pax uses. The following command builds an archive of the /Users direc-

tory and writes it to the archive file named Users.pax on /Volumes/Backups:

# find -d /Users -print | pax -w > /Volumes/Backups/Users.pax

The –d option causes find to search for files in a depth-first manner, reducing

the likelihood of permissions problems when you restore the files from the

archive. See the discussion of this find option on page 647.

–rw (copy) Copies files named on the command line or standard input from one

place on the system to another. Instead of constructing an archive file containing

the files named on standard input, pax copies them to the destination-directory

(the last argument on the pax command line). The effect is the same as if you had

created an archive in create mode and then extracted the files in extract mode,

except using copy mode avoids creating an archive.

The following example copies the working directory hierarchy to the

/Users/max/code directory. The code directory must exist before you give this

command. Make sure that ~max/code is not a subdirectory of the working

directory or you will perform a recursive copy.

$ pax -rw . ~max/code
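The quoting used in the –r example (\*.c) can be examined without running pax. This sketch, which uses hypothetical filenames a.c and b.c, counts the arguments the shell would pass with and without the backslash.

```shell
dir=$(mktemp -d)
touch "$dir/a.c" "$dir/b.c"

# Unquoted: the shell expands *.c to the matching filenames.
unquoted=$(cd "$dir" && set -- *.c && echo "$#")

# Quoted: the shell passes the literal string *.c as one argument,
# leaving the pattern matching to the program that receives it.
quoted=$(cd "$dir" && set -- \*.c && echo "$# $1")

echo "unquoted: $unquoted argument(s); quoted: $quoted"
rm -r "$dir"
```

Without the backslash the command receives two filenames from the working directory; with it, the command receives the pattern itself and can match it against names stored in the archive.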

Other Options

The following options alter the behavior of pax. These options work with one or

more of the major options.

–c (complement) Reverses the sense of the test performed on pattern-list. Files are

listed or put in the archive only if they do not match any of the patterns in

pattern-list.

–f archive

(file) Uses archive as the name of the archive file. In list and extract modes, this

option reads from archive instead of standard input. In create mode, it writes

to archive instead of standard output. You can use this option to access a

device on another system on a network; see the ––file option to tar (page 847)

for more information.

–H (partial dereference) For each file that is a symbolic link, copies the file the link

points to, not the symbolic link itself. This option affects files specified on the

command line; it does not affect files found while descending a directory hier-

archy. This option treats files that are not symbolic links normally. See

page 623 for an example of the use of the –H versus –L options.

–L (dereference) For each file that is a symbolic link, copies the file the link points

to, not the symbolic link itself. This option affects all files and treats files that

are not symbolic links normally. See page 623 for an example of the use of the

–H versus –L options.

–l (link) In copy mode, when possible, makes hard links instead of copying files.

–P (no dereference) For each file that is a symbolic link, copies the symbolic link,

not the file the link points to. This option affects all files and treats files that

are not symbolic links normally. This behavior is the default for pax. See

page 625 for an example of the use of the –P option.

–p preserve-list

Preserves or discards the file attributes specified by preserve-list. The preserve-

list is a string of one or more letters as shown in Table V-25. By default pax

preserves file access and modification times but does not preserve ownership

or file permissions. This option works in extract and copy modes only.

–s subcmd

Executes subcmd, a substitution command, on filenames while storing them in

any of the modes. The subcmd has the following format:

/search-string/replacement-string/[gp]

The pax utility replaces occurrences of the regular expression search-string

with replacement-string. A trailing g indicates a global replacement; without it

pax replaces only the first instance of search-string in each filename. A trailing

p (for print) causes pax to display each substitution it makes. The subcmd is

similar to vim’s search and replace feature described on page 178, except it

lacks an address.

Table V-25 Preserve flags

Letter Meaning

a Discard the access time

e Preserve everything

m Discard the modification time

o Preserve ownership

p Preserve permissions

–v (verbose) In list mode, displays output similar to that produced by ls –l. In

other modes, displays a list of files being processed.

–X In copy and create modes, prevents pax from searching directories in filesys-

tems other than those holding source-files.

–x format

In create mode, writes the archive in format format, as shown in Table V-26. If

you do not specify a format, pax writes a (POSIX) tar format file (ustar in the

table).

–z (gzip) In create mode, compresses archives using gzip. In extract mode, decom-

presses archives using gunzip.

Discussion The pax utility is a general replacement for tar, cpio, and other archive programs.

In create and copy modes, pax processes specified directories recursively. In list and

extract modes, if a pattern in the pattern-list matches a directory name, pax lists or

extracts the files from the named directory.

Notes There is no native pax format. Instead, the pax utility can read and write archives in

a number of formats. By default it creates POSIX tar format archives. See Table V-

26 for the list of formats pax supports.

The pax utility determines the format of an archive file; it does not allow you to

specify an archive format (–x option) in list or extract mode.

Under Mac OS X version 10.4 and later, pax copies extended attributes (page 928).

Examples In the first example, pax creates an archive named corres.0901.pax. This archive

stores the contents of the corres directory.

$ pax -w corres > corres.0901.pax

The –w option puts pax in create mode, where the list of files to be put in an archive

is supplied by command-line arguments (corres in the preceding example) or from

standard input. The pax utility sends the archive to standard output. The preceding

example redirects that output to a file named corres.0901.pax.

Table V-26 Archive formats

format Description

cpio The format specified for cpio archives in POSIX

sv4cpio The format used for cpio archives under UNIX System V, release 4

tar The historical Berkeley tar format

ustar The POSIX tar format (default)

Next, without any options, pax displays a list of files in the archive it reads from

standard input:

$ pax < corres.0901.pax

corres

corres/hls

corres/max

corres/max/0823

corres/max/0828

corres/max/0901

corres/memo1

corres/notes

The following command uses the –f option to name the input file and performs the

same function as the preceding command:

$ pax -f corres.0901.pax

When pax reads an archive file, as in the previous examples, it determines the format

of the file. You do not have to (nor are you allowed to) specify the format.
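The create, list, and extract modes shown so far can be exercised together in a scratch directory. The following is a minimal sketch, assuming pax is installed (it falls back to a message otherwise); the directory layout, file contents, and variable names are made up for illustration.

```shell
if command -v pax >/dev/null 2>&1; then
    work=$(mktemp -d)
    mkdir -p "$work/corres/max" "$work/restore"
    echo 'first memo' > "$work/corres/memo1"
    echo 'note to max' > "$work/corres/max/0823"

    # Create mode: archive the corres directory using relative pathnames.
    (cd "$work" && pax -w -f corres.pax corres)

    # List mode: no major option; -f names the archive to read.
    listing=$(pax -f "$work/corres.pax")

    # Extract mode: restore only members that match the quoted pattern.
    (cd "$work/restore" && pax -r -f ../corres.pax 'corres/max*')
    restored=$(cd "$work/restore" && find . -type f | sort)

    printf '%s\n' "$listing" "$restored"
    rm -r "$work"
else
    listing='pax not installed'
    restored=$listing
    echo "$listing"
fi
```

Because the archive stores relative pathnames, the files are restored under the directory from which pax –r is run, leaving the original corres directory untouched.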

The next example, run under Mac OS X, attempts to create an archive of the /etc

directory hierarchy in the default tar format. In this case pax is run by a user with

root privileges because some of the files in this hierarchy cannot be read by ordinary

users. Because pax does not follow (dereference) symbolic links by default, and

because under Mac OS X /etc is a symbolic link, pax copies the link and not the

directory that the link points to. The first command creates the archive, the second

command shows the link in the archive, and the third command uses the –v option

to show that it is a link:

# pax -w -f /tmp/etc.tar /etc

# pax -f /tmp/etc.tar

/etc

# pax -v -f /tmp/etc.tar

lrwxr-xr-x 1 root admin 0 May 21 01:48 /etc => private/etc

pax: ustar vol 1, 1 files, 10240 bytes read, 0 bytes written.

The –L option follows (dereferences) links. In the next example, pax creates the

desired archive of /etc:

# pax -wLf /tmp/etc.tar /etc

# pax -f /tmp/etc.tar

/etc

/etc/6to4.conf

/etc/AFP.conf

/etc/afpovertcp.cfg

/etc/aliases

/etc/aliases.db

/etc/amavisd.conf

/etc/amavisd.conf.personal

/etc/appletalk.cfg

...

The next example uses pax to create a backup of the memos directory, preserving

ownership and file permissions. The destination directory must exist before you

give pax a copy mode command.

$ mkdir memos.0625

$ pax -rw -p e memos memos.0625

$ ls memos.0625

memos

The preceding example copies the memos directory into the destination directory.

You can use pax to make a copy of a directory in the working directory without put-

ting it in a subdirectory. In the next example, the –s option causes pax to replace the

name of the memos directory with the name . (the name of the working directory)

as it writes files to the memos.0625 directory:

$ pax -rw -p e -s /memos/./ memos memos.0625
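The same –s mechanism can rename the top-level directory instead of removing it. This sketch assumes pax is installed and uses a hypothetical replacement name, notes; the ^ anchors the search string to the start of each pathname.

```shell
if command -v pax >/dev/null 2>&1; then
    work=$(mktemp -d)
    mkdir -p "$work/memos" "$work/dest"
    echo 'draft' > "$work/memos/memo1"

    # Copy mode with a substitution: a leading "memos" becomes "notes".
    (cd "$work" && pax -rw -s '/^memos/notes/' memos dest)
    copied=$(cd "$work/dest" && find . -type f)

    echo "$copied"
    rm -r "$work"
else
    copied='pax not installed'
    echo "$copied"
fi
```

Quoting the substitution command keeps the shell from interpreting any special characters in the search string.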

The final example uses find to build a list of files that begin with the string memo

and that are located in the working directory hierarchy. The pax utility writes these

files to an archive in cpio format. The output from pax goes to standard output,

which is sent through a pipe to bzip2, which compresses it before writing it to the

archive file.

$ find . -type f -name "memo*" | pax -w -x cpio | bzip2 > memos.cpio.bz2

plutil O

Manipulates property list files

plutil [options] file-list

The plutil utility converts property list files between formats and checks their syntax.

Arguments The file-list specifies the pathnames of one or more files that plutil is to manipulate.

Options –convert format

Converts file-list to format, which must be either xml1 or binary1.

–e extension

Gives output files a filename extension of extension.

–help

Displays a help message.

–lint

Checks the syntax of files (default).

–o file

(output) Names the converted file file.

–s (silent) Does not display any messages for successfully converted files.

Discussion The plutil utility converts files from the XML property list format to binary format,

and vice versa. The plutil utility can read—but not write—the older plain-text property

list format.

Notes The plutil utility accepts a double hyphen (––) to mark the end of the options on the

command line. For more information refer to “Common Options” on page 603.

Examples The following example checks the syntax of the file named java.plist:

$ plutil java.plist

java.plist: OK

The next example shows the output of plutil as it checks a damaged property list file:

$ plutil broken.plist

broken.plist:

XML parser error:

Encountered unexpected element at line 2 (plist can only include one object)

Old-style plist parser error:

Malformed data byte group at line 1; invalid hex

The next example converts the file StartupParameters.plist to binary format, over-

writing the original file:

$ plutil -convert binary1 StartupParameters.plist

The final example converts the binary format property list file loginwindow.plist in

the working directory to XML format and, because of the –o option, puts the con-

verted file in /tmp/lw.p:

$ plutil -convert xml1 -o /tmp/lw.p loginwindow.plist
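The conversions above can be combined into a round trip that never overwrites the original file. The following is a sketch only: it assumes the OS X plutil utility is available (and prints a message otherwise), and the property list contents and filenames are invented for illustration.

```shell
if command -v plutil >/dev/null 2>&1; then
    work=$(mktemp -d)
    cat > "$work/demo.plist" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Name</key>
    <string>Sam</string>
</dict>
</plist>
EOF
    # Convert to binary format in a new file, then back to XML.
    plutil -convert binary1 -o "$work/demo.bin" "$work/demo.plist"
    plutil -convert xml1 -o "$work/roundtrip.plist" "$work/demo.bin"

    # Check the syntax of the round-tripped file.
    status=$(plutil -lint "$work/roundtrip.plist")
    echo "$status"
    rm -r "$work"
else
    status='plutil not installed'
    echo "$status"
fi
```

Using –o for each conversion leaves demo.plist unchanged, which is safer than converting a file in place when you are experimenting.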

pr

Paginates files for printing

pr [options] [file-list]

The pr utility breaks files into pages, usually in preparation for printing. Each page has a header with

the name of the file, date, time, and page number.

The pr utility takes its input from files you specify on the command line or from standard input. The

output from pr goes to standard output and is frequently redirected through a pipe to a printer.

Arguments The file-list is a list of the pathnames of text files that you want pr to paginate.

When you omit the file-list, pr reads from standard input.

Options Under Linux, pr accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

You can embed options within the file-list. An embedded option affects only those

files following it on the command line.

––show-control-chars

–c Displays control characters with a caret (^; for example, ^H). Displays other

nonprinting characters as octal numbers preceded by a backslash. L

––columns=col –col

Displays output in col columns with a default of one. This option may truncate

lines and cannot be used with the –m (––merge) option.

––double-space –d Double-spaces the output.

––form-feed –F Uses a FORMFEED character to skip to the next page rather than filling the current

page with NEWLINE characters.

––header=head –h head

Displays head at the top of each page instead of the filename. If head contains

SPACEs, you must enclose it within quotation marks.

––length=lines –l lines

Sets the page length to lines lines. The default is 66 lines.

––merge –m Displays all specified files simultaneously in multiple columns. This option

cannot be used with the –col (––columns) option.

––number-lines=[c[num]]

–n[c[num]]

Numbers the lines of output. The pr utility appends the character c to the

number to separate it from the contents of the file (the default is a TAB). The

num specifies the number of digits in each line number (the default is 5).

––indent=spaces –o spaces

Indents the output by spaces characters (specifies the left margin).

––separator=c –s[c]

Separates columns with the single character c (defaults to TAB when you omit c).

By default pr uses TABs as separation characters to align columns unless you use

the –w option, in which case nothing separates the columns.

––omit-header –t Causes pr not to display its five-line page header and trailer. The header that pr

normally displays includes the name of the file, the date, time, and page num-

ber. The trailer is five blank lines.

––width=num –w num

Sets the page width to num columns. This option is effective only with multi-

column output (the –m [––merge] or –col [––columns] option).

––pages=firstpage[:lastpage]

+firstpage[:lastpage]

Output begins with the page numbered firstpage and ends with lastpage.

Without lastpage, pr outputs through the last page of the document. The short

version of this option begins with a plus sign, not a hyphen. Mac OS X does

not accept the lastpage argument; printing always continues through the end

of the document.

Notes When you use the –col (––columns) option to display the output in multiple columns,

pr displays the same number of lines in each column (with the possible exception of

the last).
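The column balancing described in this note can be observed with a short input. This sketch uses six made-up input lines, two columns, and an arbitrary page width of 20; the –t option suppresses the header and trailer so that only the columns appear.

```shell
# Six input lines arranged in two columns, filled top to bottom.
two_col=$(printf '%s\n' 1 2 3 4 5 6 | pr -t -2 -w 20)
printf '%s\n' "$two_col"
```

The six lines are split evenly, so the first output line holds 1 and 4, the second 2 and 5, and the third 3 and 6.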

Examples The first command shows pr paginating a file named memo and sending its output

through a pipe to lpr for printing:

$ pr memo | lpr

Next memo is sent to the printer again, this time with a special heading at the top of

each page. The job is run in the background.

$ pr -h 'MEMO RE: BOOK' memo | lpr &

[1] 4904

Finally pr displays the memo file on the screen, without any header, starting with

page 3:

$ pr -t +3 memo

...
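The –n option can be tried the same way. In the following sketch the input lines are invented, the separator character is a colon, and each line number is three digits wide; all of these are arbitrary choices for illustration.

```shell
# Number each line, using ':' as the separator and a 3-digit field.
numbered=$(printf 'alpha\nbeta\n' | pr -t -n:3)
printf '%s\n' "$numbered"
```

Each output line consists of a right-justified line number, the colon, and the original text.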

ps

Displays process status

ps [options] [process-list]

The ps utility displays status information about processes running on the local system.

Arguments The process-list is a comma- or SPACE-separated list of PID numbers. When you specify

a process-list, ps reports on just the processes in that list.

Options Under Linux, the ps utility accepts three types of options, each preceded by a different

prefix. You can intermix the options. See the ps man page for details.

Two hyphens: GNU (long) options

One hyphen: UNIX98 (short) options

No hyphens: BSD options

Options preceded by a double hyphen (––) or no hyphen work under Linux only.

Options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

–A (all) Reports on all processes. Also –e.

–e (everything) Reports on all processes. Also –A.

–f (full) Displays a listing with more columns of information.

––forest f Displays the process tree (no hyphen before the f). L

–l (long) Produces a long listing showing more information about each pro-

cess. See the “Discussion” section for a description of the columns this

option displays.

––no-headers Omits the header. This option is useful if you are sending the output to another

program. L

––user usernames –u usernames

Reports on processes being run by usernames, a comma-separated list of the

names or UIDs of one or more users on the local system.

–w (wide) Without this option ps truncates output lines at the right side of the

screen. This option extends the display to 132 columns; it wraps around one

more line, if needed. Use this option twice to extend the display to an unlimited

number of columns, which is the default if you redirect the output of ps.
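As an illustration of sending ps output to another program, the following sketch counts the processes owned by the current user. It assumes the Linux-only ––no-headers option is available and reports "unavailable" otherwise; the variable name count is arbitrary.

```shell
# Count processes owned by the current UID (Linux ps options).
if command -v ps >/dev/null 2>&1 &&
   ps --no-headers -u "$(id -u)" >/dev/null 2>&1; then
    # Without the header line, each line of output is one process.
    count=$(ps --no-headers -u "$(id -u)" | wc -l)
else
    count='unavailable'
fi
echo "processes owned by UID $(id -u): $count"
```

Omitting the header means wc counts only process lines, so no adjustment to the total is needed.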

Discussion Without any options, ps displays the statuses of all active processes controlled by

your terminal or screen. Table V-27 lists the heading and content of each of the four

columns ps displays.

The columns that ps displays depend on your choice of options. Table V-28 lists the

headings and contents of the most common columns.

Table V-27 Column headings I

Heading Meaning

PID The process identification number.

TTY (terminal) The name of the terminal that controls the process.

TIME The number of hours, minutes, and seconds the process has

been running.

CMD The command line the process was called with. The command

is truncated to fit on one line. Use the –w option to see more of

the command line.

Table V-28 Column headings II

The column titles differ, depending on the type of option you use. This table shows the headings

for UNIX98 (one-hyphen) options.

Heading Meaning

%CPU The percentage of total CPU time the process is using. Because

of the way Linux/OS X handles process accounting, this figure

is approximate, and the total of %CPU values for all processes

may exceed 100%.

%MEM (memory) The percentage of RAM the process is using.

COMMAND or CMD The command line the process was called with. The command

is truncated to fit on one line. Use the –w option to see more of

the command line. This column is always displayed last on a

line.

F (flags) The flags associated with the process.

NI (nice) The nice value of the process (page 773).

PID The process identification number.

PPID (parent PID) The process identification number of the parent process.

PRI (priority) The priority of the process.

RSS (resident set size) The number of blocks of memory the process is using.

SIZE or SZ The size, in blocks, of the core image of the process.

STIME or START The date the process started.

Notes Use top (page 858) to display process status information dynamically.

Examples The first example shows ps, without any options, displaying the user’s active pro-

cesses. The first process is the shell (bash), and the second is the process executing

the ps utility.

$ ps

PID TTY TIME CMD

2697 pts/0 00:00:02 bash

3299 pts/0 00:00:00 ps

With the –l (long) option, ps displays more information about the processes:

$ ps -l

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD

000 S 500 2697 2696 0 75 0 - 639 wait4 pts/0 00:00:02 bash

000 R 500 3300 2697 0 76 0 - 744 - pts/0 00:00:00 ps

Table V-28 Column headings II (continued)

Heading Meaning

STAT or S (status) The status of the process as specified by one or more letters

from the following list:

< High priority

D Sleeping and cannot be interrupted

L Pages locked in memory (real-time and custom I/O)

N Low priority

R Available for execution (in the run queue)

S Sleeping

T Either stopped or being traced

W Has no pages resident in RAM

X Dead

Z Zombie process: it has terminated, but its parent has not yet

collected its exit status

TIME The number of minutes and seconds that the process has been

running.

TTY (terminal) The name of the terminal controlling the process.

USER or UID The username of the user who owns the process.

WCHAN (wait channel) If the process is waiting for an event, the address of the kernel

function that caused the process to wait. This value is 0 for processes

that are not waiting or sleeping.

The –u option shows information about the specified user:

$ ps -u root

PID TTY TIME CMD

1 ? 00:00:01 init

2 ? 00:00:00 kthreadd

3 ? 00:00:00 migration/0

4 ? 00:00:01 ksoftirqd/0

5 ? 00:00:00 watchdog/0

...

The ––forest option causes ps to display what the man page describes as an “ASCII

art process tree.” Processes that are children of other processes appear indented

under their parents, making the process hierarchy, or tree, easier to see.

$ ps -ef --forest

UID PID PPID C STIME TTY TIME CMD

root 1 0 0 Jul22 ? 00:00:03 init

root 2 1 0 Jul22 ? 00:00:00 [keventd]

...

root 785 1 0 Jul22 ? 00:00:00 /usr/sbin/apmd -p 10 -w 5 -W -P

root 839 1 0 Jul22 ? 00:00:01 /usr/sbin/sshd

root 3305 839 0 Aug01 ? 00:00:00 \_ /usr/sbin/sshd

max 3307 3305 0 Aug01 ? 00:00:00 \_ /usr/sbin/sshd

max 3308 3307 0 Aug01 pts/1 00:00:00 \_ -bash

max 3774 3308 0 Aug01 pts/1 00:00:00 \_ ps -ef --forest

...

root 1040 1 0 Jul22 ? 00:00:00 login -- root

root 3351 1040 0 Aug01 tty2 00:00:00 \_ -bash

root 3402 3351 0 Aug01 tty2 00:00:00 \_ make modules

root 3416 3402 0 Aug01 tty2 00:00:00 \_ make -C drivers CFLA

root 3764 3416 0 Aug01 tty2 00:00:00 \_ make -C scsi mod

root 3773 3764 0 Aug01 tty2 00:00:00 \_ ld -m elf_i3

ps and kill The next sequence of commands shows how to use ps to determine the PID number

of a process running in the background and how to terminate that process using kill.

In this case it is not necessary to use ps because the shell displays the PID number of

the background processes. The ps utility verifies the PID number.

The first command executes find in the background. The shell displays the job and

PID numbers of the process, followed by a prompt.

$ find ~ -name memo -print > memo.out &

[1] 3343

Next ps confirms the PID number of the background task. If you did not already know this number, ps would be a convenient way to obtain it (the jobs –l builtin also displays the PID numbers of background jobs).

$ ps

PID TTY TIME CMD

3308 pts/1 00:00:00 bash

3343 pts/1 00:00:00 find

3344 pts/1 00:00:00 ps

Finally kill (page 729) terminates the process:

$ kill 3343

$ RETURN

[1]+ Terminated find ~ -name memo -print >memo.out
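The same sequence can be sketched as a script. This is a minimal illustration, not part of the original example: sleep stands in for find, the variable names pid and status are hypothetical, and the script relies on the shell's $! parameter to capture the PID number rather than reading it from the display.

```shell
#!/bin/bash
# Start a long-running background job; sleep stands in for the find command.
sleep 300 &
pid=$!        # $! expands to the PID of the most recent background job

# Verify the PID with ps: -p selects one process, -o chooses the columns.
ps -p "$pid" -o pid= -o comm=

# Terminate the job, then wait for it so the shell collects its exit status.
kill "$pid"
status=0
wait "$pid" || status=$?
echo "background job ended with status $status"
```

A status greater than 128 indicates the job was ended by a signal; with the default TERM signal that kill sends, bash reports 128 plus the signal number.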


rcp

Copies one or more files to or from a remote system

rcp [options] source-file destination-file

rcp [options] source-file-list destination-directory

The rcp utility copies one or more ordinary files between two systems that can communicate over a

network.

Arguments The source-file, source-file-list, and destination-file are pathnames of the ordinary files, and the destination-directory is the pathname of a directory file. A pathname that does not contain a colon (:) is the name of a file on the local system. A pathname of the form name@host:path names a file on the remote system named host. The path is relative to the home directory of the user name (unless path is an absolute pathname). When you omit the name@ portion of the destination, a relative pathname is relative to the home directory on the host of the user giving the rcp command.

Like cp, rcp has two modes of operation: The first copies one file to another, and the second copies one or more files to a directory. The source-file[-list] is a list of the name(s) of the file(s) that rcp will copy; destination-file is the name rcp assigns to the resulting copy of the file, or destination-directory is the name of the directory rcp puts the copied files in. When rcp copies files to a destination-directory, the files maintain their original simple filenames.

Options –p (preserve) Sets the modification times and file access permissions of each copy to match those of the source-file. When you do not use –p, rcp uses the file-creation mask (umask; see page 870) on the remote system to modify the access permissions.

–r (recursive) When a file in the source-file-list is a directory, copies the contents

of that directory and any subdirectories into the destination-directory. You

can use this option only when the destination is a directory.

Notes You must have an account on the remote system to copy files to or from it using rcp.

The rcp utility does not prompt for a password but rather uses several alternative

methods to verify you have the authority to read or write files on the remote system.

rcp is not secure

security The rcp utility uses host-based trust, which is not secure, to authorize files to be copied. Use scp

(page 810) when it is available.


One method requires that the name of the local system be specified in the

/etc/hosts.equiv file on the remote system. If the name is there, rcp allows you to

copy files if your usernames are the same on both systems and your account on the

remote system has the necessary permissions to access files there.

Authorization can also be specified on a per-user basis. With this method the remote

user’s home directory must contain a file named ~/.rhosts that lists trusted remote

systems and users. Your local and remote user names do not have to match but your

local username must appear on the line in the remote ~/.rhosts file that starts with

the name of the local system. See page 803 for a sample .rhosts file.

If you use a wildcard (such as *) in a remote pathname, you must quote the wildcard character or pathname so the wildcard is interpreted by the shell on the remote system and not by the local shell. As with cp, if the destination-file exists before you execute rcp, rcp overwrites the file.
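The effect of quoting can be observed without contacting a remote system. The following sketch is a local stand-in only: echo takes the place of rcp, and the scratch directory and the filenames a.c and b.c are hypothetical. It shows what the local shell hands to a command when a wildcard is and is not quoted.

```shell
#!/bin/bash
workdir=$(mktemp -d)          # scratch directory with hypothetical files
cd "$workdir" || exit 1
touch a.c b.c

unquoted=$(echo *.c)          # the local shell expands the pattern first
quoted=$(echo '*.c')          # quoting passes the pattern through unchanged

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

cd / && rm -r "$workdir"      # clean up the scratch directory
```

The unquoted form delivers the names of local files, whereas the quoted form delivers the pattern itself, which is what a remote shell needs to receive to perform its own expansion.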

Examples The first example copies the files with filenames ending in .c into the archives directory on the remote system named bravo. Because the command does not specify a username, rcp uses the local user’s username on the remote system. Because the command does not specify the full pathname of the archives directory, rcp assumes it is a subdirectory of the user’s home directory on bravo. Each of the copied files retains its simple filename.

$ rcp *.c bravo:archives

The next example copies memo from the /home/sam directory on bravo to the

working directory on the local system:

$ rcp bravo:/home/sam/memo .

Next rcp copies the files named memo.new and letter to Sam’s home directory on

the remote system bravo. The absolute pathnames of the copied files on bravo are

/home/sam/memo.new and /home/sam/letter, respectively:

$ rcp memo.new letter bravo:/home/sam

The final command copies all the files in Sam’s reports directory on bravo to the

oldreports directory on the local system, preserving the original modification dates

and file access permissions on the copies:

$ rcp -p 'bravo:reports/*' oldreports


renice

Changes the priority of a process

renice priority process-list

renice –n increment process-list

The renice utility alters the priority of a running process. An ordinary user can decrease the priority of

a process that he owns. Only a user running with root privileges can increase the priority of a process

or alter the priority of another user’s process.

Arguments The process-list specifies the PID numbers of the processes that are to have their priorities altered. Each process has its priority set to priority or, using the second format, has its priority incremented by a value of increment (which can be negative).

Options The options, which you can specify throughout the process-list, change the interpretation of the arguments that follow them on the command line.

–p (process) Interprets the following arguments as process ID (PID) numbers

(default).

–u (user) Interprets the following arguments as usernames or user ID numbers.

Notes The range of priorities is from –20 (the highest priority) to +20 (the lowest priority); Linux limits the lowest priority to +19.

Higher (more positive) priority values mean the kernel schedules a job less often.

Lower (more negative) values cause the job to be scheduled more often.

When a user running with root privileges schedules a job to run at the highest priority, this change can affect the performance of the system for all other jobs, including the operating system itself. For this reason you should be careful when using renice with negative values.

See nice (page 773) if you want to start a process with a nondefault priority.

Examples The first example decreases the priority of all tasks owned by Zach:

$ renice -n 5 -u zach

In the following example, root uses ps to check the priority of the process running

find. The NI (nice) column shows a value of 19 and the administrator decides to

increase the priority by 5:

# ps -l

UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND

501 9705 9701 0 31 0 27792 856 - Ss p1 0:00.15 -bash

501 10548 9705 0 12 19 27252 516 - RN p1 0:00.62 find /

# renice -n -5 10548


rlogin

Logs in on a remote system

rlogin [option] remote-system

The rlogin utility establishes a login session on a remote system over a network.

Arguments The remote-system is the name of a system the local system can reach over a network.

Options –l username

(login) Logs you in on the remote system as the user specified by username.

Notes If the file named /etc/hosts.equiv located on the remote system specifies the name of

the local system, the remote system will not prompt you to enter your password.

Systems that are listed in the /etc/hosts.equiv file are considered to be just as secure

as the local system.

An alternative way to specify a trusted relationship is on a per-user basis. Each

user’s home directory can contain a file named ~/.rhosts that holds a list of trusted

remote systems and users. See the “Examples” section for a sample .rhosts file.

Examples The following example illustrates the use of rlogin. On the local system, Max’s username is max; on the remote system bravo, his username is watson. The remote system prompts Max to enter a password because he is logging in using a username different from the one he uses on the local system.

$ who am i

max tty06 Oct 14 13:26

$ rlogin -l watson bravo

Password:

~/.rhosts file If the local system is named hurrah, the following .rhosts file on bravo allows the

user max to log in as the user watson without entering a password:

$ cat ~watson/.rhosts

hurrah max

rlogin is not secure

security The rlogin utility uses host-based trust, which is not secure, to authorize your login. Alternatively,

it sends your password over the network as cleartext, which is not a secure practice. Use ssh

(page 828) when it is available.

rm

Removes a file (deletes a link)

rm [options] file-list

The rm utility removes hard and/or symbolic links to one or more files. When you remove the last

hard link to a file, the file is deleted.

Arguments The file-list is a list of the files rm will delete.

Options Under Linux, rm accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Options named with a

single letter and preceded by a single hyphen work under Linux and OS X.

––force –f Without asking for your consent, removes files for which you do not have

write access permission. This option also suppresses informative messages if a

file does not exist.

––interactive –i Asks before removing each file. If you use ––recursive with this option, rm also

asks you before examining each directory.

––recursive –r Deletes the contents of the specified directory, including all its subdirectories,

and the directory itself. Use this option with caution.

––verbose –v Displays the name of each file as it is removed.

Notes To delete a file, you must have execute and write access permission to the parent directory of the file, but you do not need read or write access permission to the file itself. If you are running rm interactively (that is, if rm’s standard input is coming from the keyboard) and you do not have write access permission to the file, rm displays your access permission and waits for you to respond. If your response starts with a y or Y, rm deletes the file; otherwise, it takes no action. If standard input is not coming from a keyboard, rm deletes the file without querying you.

Refer to page 106 for information on hard links and page 108 for information on

symbolic links. Page 110 includes a discussion about removing links. Refer to the

rmdir utility (page 806) if you need to remove an empty directory.

Be careful when you use rm with wildcards

caution Because this utility enables you to remove a large number of files with a single command, use rm

cautiously, especially when you are working with ambiguous file references. If you have any

doubts about the effect of an rm command with an ambiguous file reference, first use echo with

the same file reference and evaluate the list of files the reference generates. Alternatively, you can

use the –i (––interactive) option.
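The echo check described in this caution can be sketched as follows. The scratch directory and the filenames memo1, memo2, memo.old, and letter are hypothetical and exist only for the illustration.

```shell
#!/bin/bash
workdir=$(mktemp -d)      # scratch directory so no real files are touched
cd "$workdir" || exit 1
touch memo1 memo2 memo.old letter

preview=$(echo memo?)     # list what the pattern matches without deleting
echo "would remove: $preview"

rm memo?                  # same ambiguous file reference, now with rm
remaining=$(echo *)
echo "left behind: $remaining"

cd / && rm -r "$workdir"  # clean up the scratch directory
```

Because ? matches exactly one character, the preview lists memo1 and memo2 but not memo.old, so you can confirm the pattern before rm acts on it.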


When you want to remove a file that begins with a hyphen, you must prevent rm

from interpreting the filename as an option. One way to do so is to give the special

option –– (double hyphen) before the name of the file. This option tells rm that no

more options follow: Any arguments that come after it are filenames, even if they

look like options.
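A short sketch of this technique follows. The filenames -draft and -notes and the scratch directory are hypothetical. The second form, which prefixes the filename with ./, is an alternative not described above; it works because the argument no longer begins with a hyphen.

```shell
#!/bin/bash
workdir=$(mktemp -d)      # scratch directory so no real files are touched
cd "$workdir" || exit 1

touch -- -draft           # -- also tells touch that -draft is a filename
rm -- -draft              # rm stops looking for options after --

touch ./-notes
rm ./-notes               # a leading ./ is another way to hide the hyphen

remaining=$(ls -A)        # empty when both files were removed
cd / && rmdir "$workdir"
```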

Examples The following commands delete files both in the working directory and in another

directory:

$ rm memo

$ rm letter memo1 memo2

$ rm /home/sam/temp

The next example asks the user before removing each file in the working directory

and its subdirectories:

$ rm -ir .

This command is useful for removing filenames that contain special characters, especially SPACEs, TABs, and NEWLINEs. (You should not create filenames containing these characters on purpose, but it may happen accidentally.)

Use shred to remove a file securely

security Using rm does not securely delete a file—it is possible to recover a file deleted with rm. Use the

shred utility to delete files more securely. See the example “Wiping a file” on page 659 for another

method of deleting files more securely.


rmdir

Removes directories

rmdir directory-list

The rmdir utility deletes empty directories.

Arguments The directory-list is a list of pathnames of empty directories that rmdir removes.

Options Under Linux, rmdir accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––ignore-fail-on-non-empty

Suppresses the message rmdir normally displays when it fails because a directory is not empty. With the ––parents option, rmdir does not quit when it finds a directory that is not empty. L

––parents –p Removes a hierarchy of empty directories.

––verbose –v Displays the names of directories as they are removed. L

Notes Use the rm utility with the –r option if you need to remove directories that are not

empty, together with their contents.

Examples The following command deletes the empty literature directory from the working

directory:

$ rmdir literature

The next command removes the letters directory, using an absolute pathname:

$ rmdir /home/sam/letters

The final command removes the letters, march, and 05 directories, assuming the directories are empty except for other directories named in the path:

$ rmdir -p letters/march/05
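The behavior of –p can be checked in a scratch directory. In this sketch the file keep.txt and the variable names are hypothetical; the second half shows what happens when one directory in the path is not empty.

```shell
#!/bin/bash
workdir=$(mktemp -d)      # scratch directory so no real files are touched
cd "$workdir" || exit 1

# Case 1: every directory in the path is empty, so all three are removed.
mkdir -p letters/march/05
rmdir -p letters/march/05
if [ -e letters ]; then first="still there"; else first="gone"; fi

# Case 2: letters holds a file, so rmdir removes 05 and march, then stops.
mkdir -p letters/march/05
touch letters/keep.txt
rmdir -p letters/march/05 2>/dev/null || echo "stopped at a non-empty directory"
second=$(ls letters)

echo "case 1: letters is $first"
echo "case 2: letters contains $second"
cd / && rm -r "$workdir"  # clean up the scratch directory
```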


rsh

Executes commands on a remote system

rsh [option] host [command-line]

The rsh utility runs command-line on host by starting a shell on the remote system. Without a command-line, rsh calls rlogin, which logs you in on the remote system.

Arguments The host is the name of the remote system. The rsh utility runs command-line on

the remote system. You must quote special characters in command-line so the local

shell does not expand them prior to passing them to rsh.

Options –l username

(login) Logs you in on the remote system as the user specified by username.

Notes If the file named /etc/hosts.equiv located on the remote system specifies the name of

the local system, the remote system will not prompt you to enter your password.

Systems that are listed in the /etc/hosts.equiv file are considered as secure as the

local system.

An alternative way to specify a trusted relationship is on a per-user basis. Each

user’s home directory can contain a file named ~/.rhosts that holds a list of trusted

remote systems and users. See “Examples” under rlogin (page 803) for a sample

.rhosts file.

Examples In the first example, Max uses rsh to obtain a listing of the files in his home directory

on bravo:

$ rsh bravo ls

cost_of_living

info

preferences

work

Next the output of the previous command is redirected to the file bravo_ls. Because the redirection character (>) is not escaped, it is interpreted by the local shell and the file bravo_ls is created on the local system.

rsh is not secure

security The rsh utility uses host-based trust, which is not secure, to authorize your login. Alternatively, it

sends your password over the network as cleartext, which is not a secure practice. Use ssh

(page 828) when it is available.


$ rsh bravo ls > bravo_ls

$ cat bravo_ls

cost_of_living

info

preferences

work

The next example quotes the redirection character (>) so the file bravo.ls is created

on the remote system (bravo), as shown by ls run on bravo:

$ rsh bravo ls ">" bravo.ls

$ rsh bravo ls

bravo.ls

cost_of_living

info

preferences

work

In the final example, rsh without command-line logs in on the remote system. Max

has used the –l watson option to log in on bravo as watson. The ~watson/.rhosts

file must be configured to allow Max to log in on the account in this manner. See

“Examples” under rlogin (page 803) for a sample .rhosts file.

$ rsh -l watson bravo

Last login: Sat Jul 30 16:13:53 from kudos

$ hostname

bravo

$ exit

rlogin: connection closed.


rsync

Copies files and directory hierarchies securely over a network

rsync [options] [[user@]from-host:]source-file [[user@]to-host:][destination-file]

The rsync (remote synchronization) utility copies an ordinary file or directory hierarchy locally or from the local system to or from another system on a network. By default, this utility uses OpenSSH to transfer files and the same authentication mechanism as OpenSSH; as a consequence, it provides the same security as OpenSSH. The rsync utility prompts for a password when it needs one. Alternatively you can use the rsyncd daemon as a transfer agent.

See Chapter 14 for information on rsync

tip See Chapter 14 starting on page 583 for information on rsync.


scp

Securely copies one or more files to or from a remote system

scp [[user@]from-host:]source-file [[user@]to-host:][destination-file]

The scp (secure copy) utility copies an ordinary or directory file from one system to another on a

network. This utility uses ssh to transfer files and the same authentication mechanism as ssh; as a

consequence, it provides the same security as ssh. The scp utility prompts for a password when it

needs one.

Arguments The from-host is the name of the system you are copying files from and the to-host

is the system you are copying to. When you do not specify a host, scp assumes the

local system. The user on either system defaults to the user on the local system who

is giving the command; you can specify a different user with user@. The scp utility

permits you to copy between two remote systems.

The source-file is the file you are copying, and the destination-file is the resulting copy. You can specify plain or directory files as relative or absolute pathnames. A relative pathname is relative to the specified or implicit user’s home directory. When the source-file is a directory, you must use the –r option to copy its contents. When the destination-file is a directory, each of the source files maintains its simple filename. If source-file is a single file, you can omit destination-file and the copied file will have the same simple filename as source-file.

Options –P port

Connects to port port on the remote system. Be aware that ssh uses (lowercase)

–p to specify a port.

–p (preserve) Preserves the modification and access times as well as the permissions

of the original file.

–q (quiet) Does not display the progress meter.

–r (recursive) Recursively copies a directory hierarchy.

–v (verbose) Displays debugging messages about the connection and transfer. This

option is useful if things are not going as expected.

Notes The scp utility belongs to the OpenSSH suite of secure network connectivity tools.

See the “Discussion” section on page 829 for a discussion of OpenSSH security.

Refer to “First-time authentication” on page 829 for information about a message

you may get when using scp to connect to a remote system for the first time.


Using this utility, you can copy files from or to the local system or between two

remote systems. Make sure you have read permission for the file you are copying

and write permission for the directory you are copying it into.

You must quote a wildcard character (such as *) in a remote pathname so it is

interpreted by the shell on the remote system and not by the local shell.

As with cp, if the destination-file exists before you run scp, scp overwrites the file.

Examples The first example copies the files with filenames ending in .c from the working directory on the local system into the ~sam/archives directory on bravo. The wildcard character is not quoted so the local shell will expand it. Because archives is a relative pathname, scp assumes it is a subdirectory of Sam’s home directory on bravo. Each of the copied files retains its simple filename.

$ scp *.c sam@bravo:archives

Next Max copies the directory structure under ~max/memos on the system named bravo to ~sam/max.memos.bravo on kudos. He must have the necessary permissions to write to Sam’s home directory on kudos.

$ scp -r bravo:memos sam@kudos:max.memos.bravo

Finally Max copies the files with filenames ending in .c from Sam’s archives directory on bravo to the sam.c.bravo directory in his working directory. The wildcard character is quoted to protect it from expansion by the local shell; it will be interpreted by the remote system, bravo.

$ scp -r 'sam@bravo:archives/*.c' sam.c.bravo

Whenever you copy multiple files or directories, the destination—either local or

remote—must be an existing directory and not an ordinary or nonexistent file.


sed

Edits a file noninteractively

sed [–n] program [file-list]

sed [–n] –f program-file [file-list]

The sed (stream editor) utility is a batch (noninteractive) editor. It transforms an input stream that

can come from a file or standard input. It is frequently used as a filter in a pipe. Because it makes

only one pass through its input, sed is more efficient than an interactive editor such as ed. Most

Linux distributions provide GNU sed; Mac OS X supplies BSD sed. Chapter 13 applies to both

versions.

See Chapter 13 for information on sed

tip See Chapter 13 starting on page 565 for information on sed.


SetFile O

Sets file attributes

SetFile [options] file-list

The SetFile utility sets file attributes (page 931), including the file’s type and creator codes, creation

and last modification times, and attribute flags such as the invisible and locked flags.

Arguments The file-list specifies the pathnames of one or more files that SetFile works on.

Options The options for SetFile correspond to the options for GetFileInfo (page 717).

–a flags

(attribute) Sets the attribute flags specified by flags. An uppercase letter for a

flag sets that flag and a lowercase letter unsets the flag. The values of unspecified

flags are not changed. See Table D-2 on page 931 or the SetFile man page for a

list of attribute flags.

–c creator

Sets the creator code to creator.

–d date

Sets the creation date to date. You must format the date as mm/dd/[yy]yy [hh:mm[:ss] [AM | PM]]. If you do not specify AM or PM, SetFile assumes a 24-hour clock. You must enclose a date string that contains SPACEs within quotation marks.

–m date

(modification) Sets the modification date to date. The format of date is the

same as that used with the –d option.

–P (no dereference) For each file that is a symbolic link, sets information about

the symbolic link, not the file the link points to. This option affects all files and

treats files that are not symbolic links normally. See page 625 for an example

of the –P option.

–t type

Sets the type code to type.

Notes The SetFile utility is part of the Developer Tools optional install.

The options to SetFile and the corresponding options to GetFileInfo have minor differences. For example, you can specify multiple attribute flags with the –a option to SetFile but only a single flag with GetFileInfo. Also, SetFile requires a SPACE between the –a option and the list of flags; GetFileInfo does not allow a SPACE there.


Examples The first example sets the type and creator codes of the file named arch to SIT5 and

SIT!, respectively, indicating that it is a StuffIt archive. The GetFileInfo utility displays

these codes.

$ SetFile -t SIT5 -c SIT! arch

$ GetFileInfo -c arch

"SIT!"

$ GetFileInfo -t arch

"SIT5"

The next example marks the file named secret as invisible and locked. The file will

not be visible in the Finder, and most OS X applications will be unable to overwrite

it.

$ SetFile -a VL secret

The final example clears the invisible attribute flag from every file in the working

directory:

$ SetFile -a v *


sleep

Creates a process that sleeps for a specified interval

sleep time

sleep time-list L

The sleep utility causes the process executing it to go to sleep for the time specified.

Arguments Traditionally the amount of time a process sleeps is given as a single integer argument, time, which denotes a number of seconds. The time does not have to be an integer, however: You can specify a decimal fraction. Under Linux you can also append a unit specification to time: s (seconds), m (minutes), h (hours), or d (days).

Under Linux you can construct a time-list by including several times on the command line: The total time the process sleeps is the sum of these times. For example, if you specify 1h 30m 100s, the process will sleep for 91 minutes and 40 seconds.
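The arithmetic behind that total can be worked out with shell arithmetic expansion. This sketch only computes the sum and does not call sleep; the variable names are hypothetical.

```shell
#!/bin/bash
# Convert the time-list 1h 30m 100s to seconds by hand.
total=$(( 1*3600 + 30*60 + 100 ))   # hours and minutes expressed in seconds
minutes=$(( total / 60 ))           # whole minutes in the total
seconds=$(( total % 60 ))           # seconds left over
echo "$total seconds = $minutes minutes and $seconds seconds"
# prints: 5500 seconds = 91 minutes and 40 seconds
```

On a system whose sleep does not accept unit suffixes or a time-list, the single value 5500 is the equivalent argument.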

Examples You can use sleep from the command line to execute a command after a period of

time. The following example executes in the background a process that reminds you

to make a phone call in 20 minutes (1,200 seconds):

$ (sleep 1200; echo "Remember to make call.") &

[1] 4660

Alternatively, under Linux, you could give the following command to get the same

reminder:

$ (sleep 20m; echo "Remember to make call.") &

[2] 4667

You can also use sleep within a shell script to execute a command at regular intervals. For example, the per shell script executes a program named update every 90 seconds:

$ cat per
#!/bin/bash
# Run update every 90 seconds until the script is killed.
while true
do
    update
    sleep 90
done

If you execute a shell script such as per in the background, you can terminate it only

by using kill.

The final shell script accepts the name of a file as an argument and waits for that file

to appear on the disk. If the file does not exist, the script sleeps for 1 minute and 45

seconds before checking for the file again.


$ cat wait_for_file
#!/bin/bash
if [ $# != 1 ]; then
    echo "Usage: wait_for_file filename"
    exit 1
fi
while true
do
    if [ -f "$1" ]; then
        echo "$1 is here now"
        exit 0
    fi
    sleep 1m 45
done

Under Mac OS X replace 1m 45 with 105.


sort

Sorts and/or merges files

sort [options] [file-list]

The sort utility sorts and/or merges one or more text files.

Arguments The file-list is a list of pathnames of one or more ordinary files that contain the text to be sorted. If the file-list is omitted, sort takes its input from standard input. Without the –o option, sort sends its output to standard output. This utility sorts and merges files unless you use the –m (merge only) or –c (check only) option.

Options When you do not specify an option, sort orders the file in the machine collating

sequence (usually ASCII). Without a ––key option, sort orders a file based on full

lines. Use ––key to specify sort fields within a line. You can follow a ––key option

with additional options without a leading hyphen; see the “Discussion” section for

more information.

––ignore-leading-blanks

–b Blanks (TAB and SPACE characters) normally mark the beginning of fields in the

input file. Without this option, sort considers leading blanks to be part of the

field they precede. This option ignores leading blanks within a field, so sort

does not consider these characters in sort comparisons.

––check –c Checks whether the file is properly sorted. The sort utility does not display

anything if everything is in order. It displays a message if the file is not in

sorted order and returns an exit status of 1.

––dictionary-order

–d Ignores all characters that are not alphanumeric characters or blanks. For

example, sort does not consider punctuation with this option.

––ignore-case –f (fold) Considers all lowercase letters to be uppercase letters. Use this option

when you are sorting a file that contains both uppercase and lowercase letters.

––ignore-nonprinting

–i Ignores nonprinting characters. This option is overridden by the ––dictionary-order option.

––key=start[,stop]

–k start[,stop]

Specifies a sort field within a line. Without this option sort orders a file based on full lines. The sort field starts at the position on the line specified by start and ends at stop, or the end of the line if stop is omitted. The start and stop positions are in the format f[.c], where f is the field number and c is the optional character within the field. Numbering starts with 1. When c is omitted from start, it defaults to the first character in the field; when c is omitted from stop, it defaults to the last character in the field. See the “Discussion” section for further explanation of sort fields and the “Examples” section for illustrations of their use.

The Mac OS X version of sort accepts long options

tip Options for sort preceded by a double hyphen (––) work under Mac OS X as well as under Linux.

––merge –m Assumes that each of the multiple input files is in sorted order and merges

them without verifying they are sorted.

––numeric-sort –n Sorts in arithmetic sequence; does not order lines or sort fields in the machine

collating sequence. With this option, minus signs and decimal points take on

their arithmetic meaning.

––output=filename

–o filename

Sends output to filename instead of standard output; filename can be the same

as one of the names in the file-list.

––reverse –r Reverses the sense of the sort (for example, z precedes a).

––field-separator=x

–t x

Specifies x as the field separator. See the “Discussion” section for more infor-

mation on field separators.

––unique –u Outputs repeated lines only once. When you use this option with ––check, sort displays a message if the same line appears more than once in the input file, even if the file is in sorted order.
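Several of these options can be compared side by side. The following sketch is an illustration only: it feeds a few hypothetical values to sort through a pipe, sets LC_ALL=C as an assumption so the machine collating sequence is ASCII, and uses paste to join each result onto one line.

```shell
#!/bin/bash
export LC_ALL=C     # assumption: force the ASCII collating sequence

# Default ordering compares characters, so 100 sorts before 9.
plain=$(printf '%s\n' 10 9 100 | sort | paste -sd ' ' -)

# -n compares the values arithmetically.
numeric=$(printf '%s\n' 10 9 100 | sort -n | paste -sd ' ' -)

# -u outputs each repeated line only once.
unique=$(printf '%s\n' b a b | sort -u | paste -sd ' ' -)

# -c only checks the order; a nonzero exit status means not sorted.
if printf '%s\n' b a | sort -c 2>/dev/null; then
    check="sorted"
else
    check="not sorted"
fi

echo "plain:   $plain"
echo "numeric: $numeric"
echo "unique:  $unique"
echo "check:   $check"
```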

Discussion Without any options, sort bases its ordering on full lines.

In the following description, a field is a sequence of characters in a line of input. Without the ––field-separator option, fields are bounded by the empty string preceding a group of one or more blanks (TAB and SPACE characters). You cannot see the empty string that delimits the fields; it is an imaginary point between two fields. Fields are also bounded by the beginning and end of the line. The line shown in Figure V-1 holds the fields Toni, SPACEBarnett, and SPACESPACESPACESPACE55020. These fields define sort fields. Sometimes fields and sort fields are the same.

Sort field A sort field is a sequence of characters that sort uses to put lines in order. A sort field can contain all or part of one or more fields (Figure V-1).

The ––key option specifies pairs of pointers that define subsections of each line

(sort fields) for comparison. See the ––key option (page 817) for details.

Leading blanks The –b option causes sort to ignore leading blanks in a sort field. If you do not use

this option, sort considers each leading blank to be a character in the sort field and

includes it in the sort comparison.

Options You can specify options that pertain only to a given sort field by immediately fol-

lowing the stop pointer (or the start pointer if there is no stop pointer) with one

of the options b, d, f, i, n, or r. In this case you must not precede the option with

a hyphen.

Multiple sort fields When you specify more than one sort field, sort examines them in the order you

specify them on the command line. If the first sort field of two lines is the same, sort

examines the second sort field. If these are again the same, sort looks at the third

field. This process continues for all the sort fields you specify. If all the sort fields

are the same, sort examines the entire line.
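
The interaction of multiple sort fields with per-field options can be sketched on a small, hypothetical three-line input. The first sort field is the count, compared numerically; the second is the name, in reverse order, and it matters only when two counts are equal.

```shell
# Hypothetical input: a name followed by a count
data=$(printf '%s\n' 'pear 10' 'apple 9' 'fig 10')

# First sort field: field 2, compared numerically (n).
# Second sort field: field 1, in reverse order (r); used only to break ties.
result=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2n -k 1,1r)
printf '%s\n' "$result"
```

The line for apple comes first because 9 is the smallest count; pear precedes fig because the tie on 10 is broken in reverse alphabetical order.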

Examples The examples in this section demonstrate some of the features and uses of the sort

utility. The examples assume the following list file is in the working directory:

$ cat list

Tom Winstrom 94201

Janet Dempsey 94111

Alice MacLeod 94114

David Mack 94114

Toni Barnett 95020

Jack Cooper 94072

Richard MacDonald 95510

This file contains a list of names and ZIP codes. Each line of the file contains three

fields: the first name field, the last name field, and the ZIP code field. For the exam-

ples to work, the blanks in the file must be SPACEs, and not TABs.

The first example demonstrates sort without any options—the only argument is the

name of the input file. In this case sort orders the file on a line-by-line basis. If the

first characters on two lines are the same, sort looks at the second characters to

determine the proper order. If the second characters are the same, sort looks at the

third characters. This process continues until sort finds a character that differs

between the lines. If the lines are identical, it does not matter which one sort puts

first. In this example, sort needs to examine only the first three characters (at most)

of each line. It displays a list that is in alphabetical order by first name.

$ sort list

Alice MacLeod 94114

David Mack 94114

Jack Cooper 94072

Janet Dempsey 94111

Richard MacDonald 95510

Tom Winstrom 94201

Toni Barnett 95020

Figure V-1 Fields and sort fields (diagram of the line Toni Barnett 55020 showing its three fields and possible sort fields)

You can instruct sort to skip any number of fields and characters on a line before

beginning its comparison. Blanks normally mark the beginning of a field. The next

example sorts the same list by last name, the second field. The ––key=2 argument

instructs sort to begin its comparison with this field. Because there is no second

pointer, the sort field extends to the end of the line. Now the list is almost in last-

name order, but there is a problem with Mac.

$ sort --key=2 list

Toni Barnett 95020

Jack Cooper 94072

Janet Dempsey 94111

Richard MacDonald 95510

Alice MacLeod 94114

David Mack 94114

Tom Winstrom 94201

In the preceding example, MacLeod comes before Mack. After finding that the sort

fields of these two lines were the same through the third letter (Mac), sort put L

before k because it arranges lines based on ASCII character codes, in which upper-

case letters come before lowercase ones.

The ––ignore-case option makes sort treat uppercase and lowercase letters as equals

and fixes the problem with MacLeod and Mack:

$ sort --ignore-case --key=2 list

Toni Barnett 95020

Jack Cooper 94072

Janet Dempsey 94111

Richard MacDonald 95510

David Mack 94114

Alice MacLeod 94114

Tom Winstrom 94201
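
The two orderings can be reproduced on just the two names. This sketch assumes the C locale, selected here with LC_ALL=C, because sort follows the collating sequence of the current locale and other locales may order uppercase and lowercase letters differently.

```shell
names=$(printf '%s\n' Mack MacLeod)

# In the C locale, L (uppercase) sorts before k (lowercase)
c_order=$(printf '%s\n' "$names" | LC_ALL=C sort)
printf '%s\n' "$c_order"

# With -f, lowercase letters are folded to uppercase first, so K precedes L
folded=$(printf '%s\n' "$names" | LC_ALL=C sort -f)
printf '%s\n' "$folded"
```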

The next example attempts to sort list on the third field, the ZIP code. In this case

sort does not put the numbers in order but rather puts the shortest name first in the

sorted list and the longest name last. The ––key=3 argument instructs sort to begin

its comparison with the third field, the ZIP code. A field starts with a blank and

includes subsequent blanks. In the case of the list file, the blanks are SPACEs. The

ASCII value of a SPACE character is less than that of any other printable character, so

sort puts the ZIP code that is preceded by the most SPACEs first and the ZIP code that

is preceded by the fewest SPACEs last.

$ sort --key=3 list

David Mack 94114

Jack Cooper 94072

Tom Winstrom 94201

Toni Barnett 95020

Janet Dempsey 94111

Alice MacLeod 94114

Richard MacDonald 95510

The –b (––ignore-leading-blanks) option causes sort to ignore leading SPACEs within a

field. With this option, the ZIP codes come out in the proper order. When sort determines

that MacLeod and Mack have the same ZIP codes, it compares the entire lines, putting

Alice MacLeod before David Mack (because A comes before D).

$ sort -b --key=3 list

Jack Cooper 94072

Janet Dempsey 94111

Alice MacLeod 94114

David Mack 94114

Tom Winstrom 94201

Toni Barnett 95020

Richard MacDonald 95510

To sort alphabetically by last name when ZIP codes are the same, sort needs to make

a second pass that sorts on the last name field. The next example shows how to make

this second pass by specifying a second sort field and uses the –f (––ignore-case)

option to keep the Mack/MacLeod problem from cropping up again:

$ sort -b -f --key=3 --key=2 list

Jack Cooper 94072

Janet Dempsey 94111

David Mack 94114

Alice MacLeod 94114

Tom Winstrom 94201

Toni Barnett 95020

Richard MacDonald 95510

The next example shows a sort command that skips not only fields but also charac-

ters. The –k 3.4 option (equivalent to ––key=3.4) causes sort to start its comparison

with the fourth character of the third field. Because the command does not define

an end to the sort field, it defaults to the end of the line. The sort field is the last two

digits in the ZIP code.

$ sort -fb -k 3.4 list

Tom Winstrom 94201

Richard MacDonald 95510

Janet Dempsey 94111

Alice MacLeod 94114

David Mack 94114

Toni Barnett 95020

Jack Cooper 94072

The problem of how to sort by last name within the last two digits of the ZIP code

is solved by a second pass covering the last name field. The f option following the

–k 2 affects the second pass, which orders lines by last name only.

$ sort -b -k 3.4 -k 2f list

Tom Winstrom 94201

Richard MacDonald 95510

Janet Dempsey 94111

David Mack 94114

Alice MacLeod 94114

Toni Barnett 95020

Jack Cooper 94072

The next set of examples uses the cars data file. From left to right, the columns in

this file contain each car’s make, model, year of manufacture, mileage, and price:

$ cat cars

plym fury 1970 73 2500

chevy malibu 1999 60 3000

ford mustang 1965 45 10000

volvo s80 1998 102 9850

ford thundbd 2003 15 10500

chevy malibu 2000 50 3500

bmw 325i 1985 115 450

honda accord 2001 30 6000

ford taurus 2004 10 17000

toyota rav4 2002 180 750

chevy impala 1985 85 1550

ford explor 2003 25 9500

Without any options sort displays a sorted copy of the file:

$ sort cars

bmw 325i 1985 115 450

chevy impala 1985 85 1550

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

ford explor 2003 25 9500

ford mustang 1965 45 10000

ford taurus 2004 10 17000

ford thundbd 2003 15 10500

honda accord 2001 30 6000

plym fury 1970 73 2500

toyota rav4 2002 180 750

volvo s80 1998 102 9850

The objective of the next example is to sort by manufacturer and by price within

manufacturer. Unless you specify otherwise, a sort field extends to the end of the

line. The –k 1 sort field specifier sorts from the beginning of the line. The command

line instructs sort to sort on the entire line and then make a second pass, sorting on

the fifth field all lines whose first-pass sort fields were the same (–k 5):

$ sort -k 1 -k 5 cars

bmw 325i 1985 115 450

chevy impala 1985 85 1550

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

ford explor 2003 25 9500

ford mustang 1965 45 10000

ford taurus 2004 10 17000

ford thundbd 2003 15 10500

honda accord 2001 30 6000

plym fury 1970 73 2500

toyota rav4 2002 180 750

volvo s80 1998 102 9850

Because no two lines are the same, sort makes only one pass, sorting on each entire

line. (If two lines differed only in the fifth field, they would be sorted properly on

the first pass anyway, so the second pass would be unnecessary.) Look at the lines

containing taurus and thundbd. They are sorted by the second field rather than by

the fifth field, demonstrating that sort never made a second pass and so never sorted

on the fifth field.

The next example forces the first-pass sort to stop at the end of the first field. The

–k 1,1 option specifies a start pointer of the first character of the first field and a

stop pointer of the last character of the first field. When you do not specify a char-

acter within a start pointer, it defaults to the first character; when you do not spec-

ify a character within a stop pointer, it defaults to the last character. Now taurus

and thundbd are properly sorted by price. But look at explor: It is less expensive

than the other Fords, but sort has it positioned as the most expensive. The sort util-

ity put the list in ASCII collating sequence order, not in numeric order: 9500 comes

after 10000 because 9 comes after 1.

$ sort -k 1,1 -k 5 cars

bmw 325i 1985 115 450

chevy impala 1985 85 1550

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

ford mustang 1965 45 10000

ford thundbd 2003 15 10500

ford taurus 2004 10 17000

ford explor 2003 25 9500

honda accord 2001 30 6000

plym fury 1970 73 2500

toyota rav4 2002 180 750

volvo s80 1998 102 9850

The –n (numeric) option on the second pass puts the list in the proper order:

$ sort -k 1,1 -k 5n cars

bmw 325i 1985 115 450

chevy impala 1985 85 1550

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

ford explor 2003 25 9500

ford mustang 1965 45 10000

ford thundbd 2003 15 10500

ford taurus 2004 10 17000

honda accord 2001 30 6000

plym fury 1970 73 2500

toyota rav4 2002 180 750

volvo s80 1998 102 9850

The next example again demonstrates that, unless you instruct it otherwise, sort

orders a file starting with the field you specify and continuing to the end of the line.

It does not make a second pass unless two of the first sort fields are the same.

Because there is no stop pointer on the first sort field specifier, the sort field for the

first pass includes the third field through the end of the line. Although this example

sorts the cars by years, it does not sort the cars by model within manufacturer within

years (ford thundbd comes before ford explor; these lines should be reversed).

$ sort -k 3 -k 1 cars

ford mustang 1965 45 10000

plym fury 1970 73 2500

bmw 325i 1985 115 450

chevy impala 1985 85 1550

volvo s80 1998 102 9850

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

honda accord 2001 30 6000

toyota rav4 2002 180 750

ford thundbd 2003 15 10500

ford explor 2003 25 9500

ford taurus 2004 10 17000

Specifying an end to the sort field for the first pass allows sort to perform its secondary

sort properly:

$ sort -k 3,3 -k 1 cars

ford mustang 1965 45 10000

plym fury 1970 73 2500

bmw 325i 1985 115 450

chevy impala 1985 85 1550

volvo s80 1998 102 9850

chevy malibu 1999 60 3000

chevy malibu 2000 50 3500

honda accord 2001 30 6000

toyota rav4 2002 180 750

ford explor 2003 25 9500

ford thundbd 2003 15 10500

ford taurus 2004 10 17000

The next examples demonstrate important sorting techniques: putting a list in

alphabetical order, merging uppercase and lowercase entries, and eliminating

duplicates. The unsorted list follows:

$ cat short

Pear

apple

pear

Apple

Following is a plain sort:

$ sort short

Apple

Pear

apple

pear

The following folded sort is a good start, but it does not eliminate duplicates:

$ sort -f short

Apple

apple

Pear

pear

The –u (unique) option eliminates duplicates, but without –f the uppercase

entries come first:

$ sort -u short

Apple

Pear

apple

pear

When you attempt to use both –u and –f, some of the entries get lost:

$ sort -uf short

apple

Pear

Two passes is the answer. Both passes are unique sorts, and the first folds lowercase

letters onto uppercase ones:

$ sort -u -k 1f -k 1 short

Apple

apple

Pear

pear

split

Divides a file into sections

split [options] [filename [prefix]]

The split utility breaks its input into 1,000-line sections named xaa, xab, xac, and so on. The last section

may be shorter. Options can change the sizes of the sections and lengths of the names.

Arguments The filename is the pathname of the file that split processes. If you do not specify an

argument or if you specify a hyphen (–) instead of the filename, split reads from

standard input. The prefix is one or more characters that split uses to prefix the

names of the files it creates; the default is x.

Options Under Linux, split accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Options named with a

single letter and preceded by a single hyphen work under Linux and OS X.

––suffix-length=len

–a len

Specifies that the filename suffix is len characters long (the default is 2).

––bytes=n[u] –b n[u]

Breaks the input into files that are n bytes long. The u is an optional unit of

measure that can be k (kilobyte or 1,024-byte blocks) or m (megabyte or

1,048,576-byte blocks). If you include the unit of measure, split counts by this

unit in place of bytes. Under Linux, u can be b (512-byte blocks) or any of the

suffixes listed in Table V-1 on page 603.

––numeric-suffixes

–d Specifies numeric suffixes instead of alphabetic suffixes.

––lines=num –l num

Breaks the input into files that are num lines long (the default is 1,000).
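
As a brief sketch of the –b option with a unit of measure, the following commands split a 2,500-byte file into 1-kilobyte pieces and report the size of each piece. The filename data and the prefix piece. are hypothetical.

```shell
tmp=$(mktemp -d)
# Create a 2,500-byte sample file (hypothetical name: data)
head -c 2500 /dev/zero > "$tmp/data"

# Break it into 1-kilobyte (1,024-byte) pieces named piece.aa, piece.ab, ...
split -b 1k "$tmp/data" "$tmp/piece."

# Report the size in bytes of each piece
sizes=$(for f in "$tmp"/piece.*; do wc -c < "$f" | tr -d ' '; done)
printf '%s\n' "$sizes"

rm -r "$tmp"
```

The first two pieces hold 1,024 bytes each and the last holds the remaining 452 bytes.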

Discussion By default split names the first file it creates xaa. The x is the default prefix. You can

change the prefix with the prefix argument on the command line. You can change the

number of characters in each filename following the prefix with the ––suffix-length

option.
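
The naming rules can be seen on a small scale in the following sketch, which assumes a hypothetical 25-line file named numbers and the prefix part. It also checks that concatenating the pieces in name order reproduces the original file.

```shell
tmp=$(mktemp -d)
seq 1 25 > "$tmp/numbers"    # 25-line sample file (hypothetical)

# 10 lines per piece, three-letter suffixes, prefix "part."
split -l 10 -a 3 "$tmp/numbers" "$tmp/part."
pieces=$(cd "$tmp" && ls part.*)
printf '%s\n' "$pieces"

# Concatenating the pieces in name order reproduces the original file
if cat "$tmp"/part.* | cmp -s - "$tmp/numbers"; then
    rejoined=yes
else
    rejoined=no
fi
echo "rejoined: $rejoined"

rm -r "$tmp"
```

Because the shell expands part.* in alphabetical order, the pieces are concatenated in the order split created them.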

Examples By default split breaks a file into 1,000-line sections with the names xaa, xab, xac,

and so on. The wc utility with the –l option shows the number of lines in each file.

The last file, xdu, is smaller than the rest.

$ split /usr/share/dict/words

$ wc -l *

1000 xaa

1000 xab

1000 xac

...

1000 xdt

569 xdu

98569 total

The next example uses the prefix argument to specify a filename prefix of SEC and

uses –a (––suffix-length) to change the number of letters in the filename suffix to 3:

$ split -a 3 /usr/share/dict/words SEC

$ ls

SECaaa SECaak SECaau SECabe SECabo SECaby SECaci SECacs SECadc SECadm

SECaab SECaal SECaav SECabf SECabp SECabz SECacj SECact SECadd SECadn

...

SECaaj SECaat SECabd SECabn SECabx SECach SECacr SECadb SECadl

ssh

Securely executes commands on a remote system

ssh [option] [user@]host [command-line]

The ssh utility logs in on a remote system and starts a shell. Optionally, ssh executes a command on

the remote system and logs off. The ssh utility, which can replace rsh and rlogin, provides secure,

encrypted communication between two systems over an unsecure network.

Arguments The host is the system you want to log in or run a command on. Unless you have

one of several kinds of authentication established, ssh prompts you for a username

and password for the remote system. When ssh is able to log in automatically, it

logs in as the user running the ssh command or as user if you specify user@ on the

ssh command line.

The command-line runs on the remote system. Without command-line, ssh logs in

on the remote system. You must quote special characters in command-line if you do

not want the local shell to expand them.
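
The effect of quoting can be tried without a remote system. In the following sketch, remote is a hypothetical stand-in for ssh host: like ssh, it joins its arguments with SPACEs and passes the result to a shell, but it runs that shell on the local system. The variable side exists only in the calling shell.

```shell
# Stand-in for "ssh host" (hypothetical helper): like ssh, it joins its
# arguments with SPACEs and hands the result to a shell. Nothing is sent
# over a network.
remote() { sh -c "$*"; }

side=local    # a variable of the calling shell; it is not exported

# Double quotation marks: the calling shell expands the variable first
expanded_here=$(remote "echo ${side:-remote}")

# Single quotation marks: the text reaches the other shell unexpanded
expanded_there=$(remote 'echo ${side:-remote}')

echo "$expanded_here $expanded_there"
```

The first command displays local because the expansion happened before the stand-in ran; the second displays remote because the other shell performed the expansion and found side unset.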

Options –f (not foreground) Sends ssh to the background after asking for a password and

before executing command-line. This option is useful when you want to run

the command-line in the background but must supply a password. Its use

implies –n.

–l user

(login) Attempts to log in as user. This option is equivalent to using user@ on

the command line.

–n (null) Redirects standard input to ssh to come from /dev/null. See –f.

–p port

Connects to port port on the remote system. Be aware that scp uses (uppercase)

–P to specify a port.

–q (quiet) Suppresses warning and diagnostic messages.

–t (tty) Allocates a pseudo-tty to the ssh process on the remote system. Without

this option, when you run a command on a remote system, ssh does not allo-

cate a tty (terminal) to the process. Instead, ssh attaches standard input and

standard output of the remote process to the ssh session—that is normally, but

not always, what you want. This option forces ssh to allocate a tty on the

remote system so programs that require a tty will work.

–v (verbose) Displays debugging messages about the connection and transfer. This

option is useful if things are not going as expected.

–X (X11) Turns on X11 forwarding. You may not need this option—X11 for-

warding may be turned on in a configuration file.

–x (X11) Turns off X11 forwarding.

Discussion

OpenSSH Using public key encryption, OpenSSH provides two levels of authentication: server

and client/user. First the client verifies that it is connected to the correct server. Then

OpenSSH encrypts communication between the systems. Once a secure, encrypted

connection has been established, OpenSSH makes sure the user is authorized to log

in on or copy files to and from the server. After verifying the system and user,

OpenSSH can allow various services to pass through the connection, including

interactive shell sessions (ssh), remote command execution (ssh), file copying (scp),

FTP services (sftp), X11 client/server connections, and TCP/IP port tunneling.

ssh The ssh utility allows you to log in on a remote system over a network. For

example, you might choose to use a remote system to access a special-purpose

application or to take advantage of a device that is available only on that sys-

tem, or you might use a remote system because you know it is faster or less busy

than the local system. While they are traveling, many businesspeople use ssh on

a laptop to log in on a system at company headquarters. From a GUI you can

use several systems simultaneously by logging in on each one from a different

terminal emulator window.

X11 forwarding With trusted X11 forwarding turned on, it is a simple matter to run an X11 pro-

gram over an ssh connection: Run ssh from a terminal emulator running on an X11

server and give an X11 command such as xclock; the graphical output appears on

the local display. To turn on trusted X11 forwarding, set ForwardX11Trusted to yes

in the /etc/ssh/ssh_config configuration file. To increase security (and in some cases

reduce usability) set ForwardX11Trusted to no.
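
As a sketch of the corresponding client configuration, the relevant lines might look like the following. The Host * pattern, which applies the settings to every host, is an assumption made for illustration, and ForwardX11 is the keyword that enables forwarding itself (the equivalent of the –X option).

```
# /etc/ssh/ssh_config (system-wide client configuration)
Host *
    ForwardX11 yes
    ForwardX11Trusted yes
```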

known_hosts,

ssh_known_hosts

Two files list the hosts the local system has connected to and positively identified:

~/.ssh/known_hosts (user) and /etc/ssh/ssh_known_hosts (global). No one except

the owner (a user working with root privileges in the case of the second file) should

be able to write to either of these files. No one except the owner should have any

access to a ~/.ssh directory.

First-time

authentication

When you connect to an OpenSSH server for the first time, the OpenSSH client

prompts you to confirm you are connected to the right system. This check can help

prevent a man-in-the-middle attack:

The authenticity of host 'plum (192.168.0.10)' can't be established.

RSA key fingerprint is d1:9d:1b:5b:97:5c:80:e9:4b:41:9a:b7:bc:1a:ea:a1.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'plum,192.168.0.10' (RSA) to the list of

known hosts.

Before you respond to the preceding query, make sure you are logging in on the cor-

rect system and not on an imposter. If you are not sure, a telephone call to someone

who logs in on that system locally can help verify you have accessed the right sys-

tem. When you answer yes (you must spell it out), the client appends the server’s

public host key (the single line in the /etc/ssh/ssh_host_rsa_key.pub or

/etc/ssh/ssh_host_dsa_key.pub file on the server) to the user’s ~/.ssh/known_hosts file

on the local system, creating the ~/.ssh directory if necessary. To ensure that it can keep

track of which line in known_hosts applies to which server, OpenSSH prepends the

name of the server and the server’s IP address to the line.

When you subsequently use OpenSSH to connect to that server, the client verifies it

is connected to the correct server by comparing this key to the one supplied by the

server.

After a remote system’s public key is stored in one of the known-hosts files, if the

remote system supplies a different fingerprint when the systems connect, OpenSSH

displays the following message and does not complete the connection:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

Someone could be eavesdropping on you right now (man-in-the-middle attack)!

It is also possible that the RSA host key has just been changed.

The fingerprint for the RSA key sent by the remote host is

f1:6f:ea:87:bb:1b:df:cd:e3:45:24:60:d3:25:b1:0a.

Please contact your system administrator.

Add correct host key in /home/sam/.ssh/known_hosts to get rid of this message.

Offending key in /home/sam/.ssh/known_hosts:1

RSA host key for plum has changed and you have requested strict checking.

Host key verification failed.

If you see this message, you may be the subject of a man-in-the-middle attack. More

likely, however, something on the remote system has changed, causing it to supply a

new fingerprint. Check with the remote system’s administrator. If all is well, remove

the offending key from the specified file (the third line from the bottom in the pre-

ceding example points to the line you need to remove) and try connecting again.

You can use ssh-keygen with the –R option followed by the name of a host to

remove a hashed entry. In this scenario you will be subject to first-time authentica-

tion (page 829) again as OpenSSH verifies you are connecting to the correct system.

Follow the same steps as when you initially connected to the remote host.

Authorized keys:

automatic login

You can configure OpenSSH so you do not have to enter a password each time you

connect to a server (remote system). To set up this feature, you need to generate a

personal authentication key on the client (local system), place the public part of the

key on the server, and keep the private part of the key on the client. When you con-

nect to the server, it issues a challenge based on the public part of the key. The private

part of the key must then respond properly to this challenge. If the client provides the

appropriate response, the server logs you in.

The first step in setting up an automatic login is to generate your personal authen-

tication keys. First check whether these authentication keys already exist on the

local system (client) by looking in ~/.ssh for either id_dsa and id_dsa.pub or id_rsa

and id_rsa.pub. If one of these pairs of files is present, skip the next step (i.e., do not

create a new key).

On the client, the ssh-keygen utility creates the public and private parts of an RSA

key. The key’s randomart image is a visual representation of the public key; it is

designed to be easy to recall. Display of the randomart image by a client is con-

trolled by the VisualHostKey declaration in the ssh_config file.

ssh-keygen

$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/sam/.ssh/id_rsa):RETURN

Enter passphrase (empty for no passphrase):RETURN

Enter same passphrase again:RETURN

Your identification has been saved in /home/sam/.ssh/id_rsa.

Your public key has been saved in /home/sam/.ssh/id_rsa.pub.

The key fingerprint is:

f2:eb:c8:fe:ed:fd:32:98:e8:24:5a:76:1d:0e:fd:1d sam@peach

The key's randomart image is:

+--[ RSA 2048]----+

| oE|

| o . . o|

| . o + ..|

| . + o o |

...

Replace rsa with dsa to generate DSA keys. In this example, the user pressed RETURN

in response to each query. You have the option of specifying a passphrase (10–30

characters is a good length) to encrypt the private part of the key. There is no way

to recover a lost passphrase. See the following security tip for more information

about the passphrase.

When you encrypt your personal key

security The private part of the key is kept in a file that only you can read. If a malicious user compromises

your account, an account that can use sudo to gain root privileges, or the root account on the

local system, that user then has access to your account on the remote system because she can

read the private part of your personal key.

Encrypting the private part of your personal key protects the key and, therefore, restricts access to

the remote system should someone compromise your local account. However, if you encrypt your

personal key, you must supply the passphrase you used to encrypt the key each time you use the

key, negating the benefit of not having to type a password when logging in on the remote system.

Also, most passphrases that you can easily remember can be cracked quite quickly by a powerful

computer.

A better idea is to store the private keys on a removable medium, such as a USB flash drive, and

use your ~/.ssh directory as the mount point for the filesystem stored on this drive. You may want

to encrypt these keys with a passphrase in case you lose the flash drive.

The ssh-keygen utility generates two keys: a private key or identification in

~/.ssh/id_rsa and a public key in ~/.ssh/id_rsa.pub. No one except the owner

should be able to write to either of these files, and only the owner should be able to

read from the private key file.

authorized_keys To enable you to log in on or copy files to and from another system without supplying

a password, first create a ~/.ssh directory with permissions set to 700 on the server

(remote system). Next copy ~/.ssh/id_rsa.pub from the client (local system) to a file

named ~/.ssh/authorized_keys on the server (remote system). Set its permissions to

600 so that no one except the owner can read from or write to this file. Now when

you run ssh or scp to access the server, you do not have to supply a password. To

make the server even more secure, you can disable password authentication by setting

PasswordAuthentication to no in /etc/ssh/sshd_config (remove the # from the begin-

ning of the PasswordAuthentication line and change the yes to no).
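
The permission settings described above can be rehearsed in a scratch directory before you apply them on a server. In this sketch the directory created by mktemp stands in for the home directory on the server, and the key line is placeholder text, not a real key.

```shell
home=$(mktemp -d)    # stand-in for the home directory on the server

mkdir "$home/.ssh"
chmod 700 "$home/.ssh"

# Append a public key (placeholder text) to authorized_keys
printf '%s\n' 'ssh-rsa AAAA...placeholder... sam@peach' >> "$home/.ssh/authorized_keys"
chmod 600 "$home/.ssh/authorized_keys"

# Show the resulting permissions (first 10 characters of the ls -l line)
dir_mode=$(ls -ld "$home/.ssh" | cut -c1-10)
file_mode=$(ls -l "$home/.ssh/authorized_keys" | cut -c1-10)
echo "$dir_mode $file_mode"

rm -r "$home"
```

Appending with >> rather than overwriting preserves any keys already listed in an existing authorized_keys file.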

Examples In the first example, Max uses ssh to display a list of the files in his home directory

on kudos:

$ ssh kudos ls

max@kudos's password:

Work

code

graphs

reports

Next the output of the previous command is redirected to the file kudos_ls. Because

the redirection character (>) is not escaped, it is interpreted by the local shell, and

the file kudos_ls is created on the local system.

$ ssh kudos ls > kudos_ls

max@kudos's password:

$ cat kudos_ls

Work

code

graphs

reports

The next example quotes the entire command that will run on the remote system.

As a result, the local shell does not interpret the redirection character (>) but rather

passes it to the remote shell. The file kudos.ls is created on the remote system

(kudos), as shown by ls run on kudos:

$ ssh kudos "ls > kudos.ls"

max@kudos's password:

$ ssh kudos ls

max@kudos's password:

Work

code

graphs

kudos.ls

reports

The next command does not quote the pipe symbol (|). As a result the pipe is inter-

preted by the local shell, which sends the output of the remote ls to standard input

of less on the local system:

$ ssh kudos ls | less

Next ssh executes a series of commands, connected with pipes, on a remote system.

The commands are enclosed within single quotation marks so the local shell does

not interpret the pipe symbols and all the commands are run on the remote system.

$ ssh kudos 'ps -ef | grep nmbd | grep -v grep | cut -c10-15 |xargs kill -1'

The output of ps is piped through grep, which passes all lines containing the string

nmbd to another invocation of grep. The second grep passes all lines not containing

the string grep to cut (page 652). The cut utility extracts the process ID numbers and

passes them to xargs (page 881), which kills the listed processes by calling kill to

issue a HUP signal (kill –1).
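
Because the column in which ps displays the process ID can vary from system to system, a variation on the preceding pipeline selects the PID by field instead of by character position. The following sketch runs the filtering stages on two hypothetical lines in the style of ps –ef output, not on a live system, and it uses awk in place of cut; that substitution is not part of the original command.

```shell
# Two hypothetical lines in the style of ps -ef output
sample='root      1234     1  0 10:00 ?        00:00:00 nmbd -D
max       5678  4321  0 10:05 pts/1    00:00:00 grep nmbd'

# Keep lines containing nmbd, drop the grep process itself,
# and print the second whitespace-separated field (the PID)
pids=$(printf '%s\n' "$sample" | grep nmbd | grep -v grep | awk '{print $2}')
echo "$pids"
```

Only the PID of the nmbd process (1234 in the sample data) remains to be passed to xargs.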

In the following example, ssh without command-line logs in on the remote system.

Here Max has used watson@kudos to log in on kudos as watson:

$ ssh watson@kudos

watson@kudos's password:

Last login: Sat Sep 17 06:51:59 from bravo

$ hostname

kudos

$ exit

Max now decides to change the password for his watson login on kudos:

$ ssh watson@kudos passwd

watson@kudos's password:

(current) UNIX password: por

Max stops as soon as he sees passwd (running on kudos) displaying his password:

He knows something is wrong. For passwd to work, it must run with a tty (terminal)

so it can turn off character echo (stty –echo); then it will not display passwords

as the user enters them. The –t option solves the problem by associating a pseudo-tty

with the process running passwd on the remote system:

$ ssh -t watson@kudos passwd

watson@kudos's password:

Changing password for watson

(current) UNIX password:

New UNIX password:

Retype new UNIX password:

passwd: all authentication tokens updated successfully

Connection to kudos closed.

The –t option is also useful when you are running a program that uses a character-

based/pseudographical interface.

The following example uses tar (page 846) to create an archive file of the contents of

the working directory hierarchy. The f – option causes tar to send its output to stan-

dard output. A pipe sends the output of tar running on the local system, via ssh, to

dd (page 658) running on the remote system.

$ cat buwd

#! /bin/bash

# back up the working directory to the user's

# home directory on the remote system specified

# by $machine

# remote system:

machine=speedy

dir=$(basename $(pwd))

filename=$$.$dir.tar

echo Backing up $(pwd) to your home directory on $machine

tar -cf - . | ssh $machine "dd obs=256k of=$filename"

echo done. Name of file on $machine is $filename
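
The pipeline at the heart of buwd can be tried on a single system by leaving out ssh. The following sketch, which uses two scratch directories and a hypothetical file named note, writes the archive with dd and then lists it with tar –t to confirm that the archive is readable.

```shell
src=$(mktemp -d)     # directory to back up
dest=$(mktemp -d)    # stands in for the home directory on the remote system
printf 'hello\n' > "$src/note"

# Same pipeline as in buwd, minus ssh; dd statistics on stderr are discarded
(cd "$src" && tar -cf - .) | dd obs=256k of="$dest/backup.tar" 2>/dev/null

# List the archive's table of contents
listing=$(tar -tf "$dest/backup.tar")
printf '%s\n' "$listing"

rm -r "$src" "$dest"
```

Running tar in a subshell after cd keeps the pathnames in the archive relative to the directory being backed up, as they are in buwd.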

stat

Displays information about files

stat [options] [file-list]

The stat utility displays information about files.

Arguments The file-list specifies the pathnames of one or more files that stat displays information

about. Without a file-list, stat displays information about standard input.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

Without any options, stat displays all available information about each file it

processes.

––format=fmt –c fmt

Formats output using fmt. See the stat man page for more information. L

–F (file type) Displays a slash (/) after each directory, an asterisk (*) after each

executable file, an at sign (@) after a symbolic link, an equal sign (=) after a

socket, and a pipe (|) after a FIFO. O

–f (filesystem) Displays filesystem information instead of file information. L

–f format

Formats output using format. The format string is similar to that used by printf.

See the stat man page for more information. O

––dereference –L For each file that is a symbolic link, displays information about the file the link

points to, not the symbolic link itself. This option treats files that are not sym-

bolic links normally.

–l (long) Uses the same format as ls –l (page 93). O

–q (quiet) Suppresses error messages. O

–s (shell) Displays information in a format that can be used to initialize shell vari-

ables. See the “Examples” section. O

–x (Linux) Displays a more verbose format that is compatible with the version of

stat found on Linux. O

Examples The Linux and Mac OS X versions of stat display different information. The exam-

ples show the output from the Linux version.

The first example displays information about the /bin/bash file:

$ stat /bin/bash

File: `/bin/bash'

Size: 813912 Blocks: 1600 IO Block: 4096 regular file

Device: 812h/2066d Inode: 146674 Links: 1

Access: (0755/-rwxr-xr-x) Uid: ( 1003/ sam) Gid: ( 1002/ pubs)

Access: 2009-05-28 11:36:01.000000000 -0700

Modify: 2008-05-12 11:36:49.000000000 -0700

Change: 2009-05-21 11:34:41.000000000 -0700

The next example displays information about the root filesystem:

$ stat -f /

File: "/"

ID: 586ac700c664ac38 Namelen: 255 Type: ext2/ext3

Block size: 4096 Fundamental block size: 4096

Blocks: Total: 960602 Free: 703655 Available: 654858

Inodes: Total: 488640 Free: 467723
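
A format string passed with –c makes the output of stat easy to use in a script. The following sketch assumes the Linux (GNU) version of stat, where %s is the file size in bytes and %a is the access permissions in octal; the OS X version uses –f with different format codes. The size_and_mode function name and the temporary file are made up for this illustration.

```shell
#!/bin/bash
# Sketch only: assumes GNU stat (Linux). size_and_mode is a hypothetical helper.
size_and_mode() {
    # %s = size in bytes, %a = access permissions in octal
    stat -c '%s %a' "$1"
}

demo=$(mktemp)
printf 'hello' > "$demo"     # five bytes, no trailing NEWLINE
chmod 640 "$demo"
size_and_mode "$demo"        # displays: 5 640
rm -f "$demo"
```

Wrapping the call in a function keeps the format string in one place, which helps if a script must later be adapted to the OS X format codes.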

strings

Displays strings of printable characters

strings [options] file-list

The strings utility displays strings of printable characters from object and other nontext files.

Arguments The file-list is a list of files that strings processes.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––all –a Processes whole files. Without this option strings processes only the initialized

and loaded parts of an object file.

––print-file-name –f Precedes each string with the name of the file that the string comes from. L

––bytes=min –min

Displays strings of characters that are at least min characters long (the default

is 4).

Discussion The strings utility can help you determine the contents of nontext files. One notable

application for strings is determining the owner of files in a lost+found directory.

Examples The following example displays strings of four or more printable characters in the

executable file for the man utility. If you did not know what this file was, these

strings could help you determine that it was the man executable.

$ strings /usr/bin/man

...

--Man-- next: %s [ view (return) | skip (Ctrl-D) | quit (Ctrl-C) ]

format: %d, save_cat: %d, found: %d

cannot write to %s in catman mode

creating temporary cat for %s

can't write to temporary cat for %s

can't create temporary cat for %s

cat-saver exited with status %d

found ultimate source file %s

...

stty

Displays or sets terminal parameters

stty [options] [arguments]

Without any arguments, stty displays parameters that affect the operation of the terminal or terminal

emulator. For a list of some of these parameters and an explanation of each, see the “Arguments”

section. The arguments establish or change parameters.

Options Under Linux, stty accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––all –a Reports on all parameters. This option does not accept arguments.

––file=device –F device

Affects device. Without this option stty affects the device attached to standard

input. You can change the characteristics of a device only if you own its device

file or if you are working with root privileges. L

–f device

Affects device. Performs the same function as –F. O

––save –g Generates a report of the current settings in a format you can use as arguments

to another stty command. This option does not accept arguments.

Arguments The arguments to stty specify which terminal parameters stty is to alter. To turn on

each of the parameters that is preceded by an optional hyphen (indicated in the follow-

ing list as [–]) specify the parameter without the hyphen. To turn it off, use the hyphen.

Unless specified otherwise, this section describes the parameters in their on states.

Special Keys and Characteristics

columns n Sets the line width to n columns.

ek (erase kill) Sets the erase and line kill keys to their default values. Many systems use

DELETE and CONTROL-U, respectively, as the defaults.

erase x Sets the erase key to x. To specify a control character, precede x with CONTROL-V (for

example, use CONTROL-V CONTROL-H to indicate CONTROL-H) or use the notation ^h, where ^ is

a caret (SHIFT-6 on most keyboards).

intr x Sets the interrupt key to x. See erase x for conventions.

kill x Sets the line kill key to x. See erase x for conventions.

rows n Sets the number of screen rows to n.

sane Sets the terminal parameters to values that are usually acceptable. The sane argu-

ment is useful when several stty parameters have changed, making it difficult to use

the terminal to run stty to set things right. If sane does not appear to work, try

entering the following characters:

CONTROL-J stty sane CONTROL-J

susp x (suspend) Sets the suspend (terminal stop) key to x. See erase x for conventions.

werase x (word erase) Sets the word erase key to x. See erase x for conventions.

Modes of Data Transmission

[–]cooked See raw. L

cooked See sane. O

[–]cstopb (stop bits) Selects two stop bits (–cstopb specifies one stop bit).

[–]parenb (parity enable) Enables parity on input and output. When you specify –parenb, the

system does not use or expect a parity bit when communicating with the terminal.

[–]parodd (parity odd) Selects odd parity (–parodd selects even parity).

[–]raw The normal state is –raw. When the system reads input in its raw form, it does not

interpret the following special characters: erase (usually DELETE), line kill (usually

CONTROL-U), interrupt execution (CONTROL-C), and EOF (CONTROL-D). In addition, the system

does not use parity bits. Reflecting the humor that is typical of Linux’s heritage,

under Linux you can also specify –raw as cooked.

Treatment of Characters

[–]echo Echoes characters as they are typed (full-duplex operation). If a terminal is half-

duplex and displays two characters for each one it should display, turn the echo

parameter off (–echo). Use –echo when the user is entering passwords.

[–]echoe (echo erase) The normal setting is echoe, which causes the kernel to echo the charac-

ter sequence BACKSPACE SPACE BACKSPACE when you use the erase key to delete a character.

The effect is to move the cursor backward across the line, removing characters as

you delete them.

[–]echoke (echo kill erase) The normal setting is echoke. When you use the kill character to

delete a line while this option is set, all characters back to the prompt are erased on

the current line. When this option is negated, pressing the kill key moves the cursor

to the beginning of the next line.

[–]echoprt (echo print) The normal setting is –echoprt, which causes characters to disappear as

you erase them. When you set echoprt, characters you erase are displayed between a

backslash (\) and a slash (/). For example, if you type the word sort and then erase

it by pressing BACKSPACE four times, Linux displays sort\tros/ when echoprt is set. If

you use the kill character to delete the entire line, having echoprt set causes the

entire line to be displayed as if you had BACKSPACEd to the beginning of the line.

[–]lcase For uppercase-only terminals, translates all uppercase characters into lowercase as

they are entered (also [–]LCASE). L

[–]nl Accepts only a NEWLINE character as a line terminator. With –nl in effect, the system

accepts a RETURN character from the terminal as a NEWLINE but sends a RETURN followed

by a NEWLINE to the terminal in place of a NEWLINE.

[–]tabs Transmits each TAB character to the terminal as a TAB character. When tabs is turned

off (–tabs), the kernel translates each TAB character into the appropriate number of

SPACEs and transmits them to the terminal (also [–]tab3).

Job Control Parameters

[–]tostop Stops background jobs if they attempt to send output to the terminal (–tostop

allows background jobs to send output to the terminal).

Notes The name stty is an abbreviation for set teletypewriter, or set tty (page 867), the

first terminal UNIX was run on. Today stty is commonly thought of as meaning set

terminal.

The shells retain some control over standard input when you use them interactively.

As a consequence, a number of the options available with stty appear to have no

effect. For example, the command stty –echo appears to have no effect under tcsh:

tcsh $ stty -echo

tcsh $ date

Thu May 28 16:53:01 PDT 2009

While stty –echo does work when you are using bash interactively, stty –echoe does

not. However, you can still use these options to affect shell scripts and other utilities.

$ cat testit

#!/bin/bash

stty -echo

echo -n "Enter a value: "

read a

echo

echo "You entered: $a"

stty echo

$ ./testit

Enter a value:

You entered: 77

The preceding example does not display the user’s response to the Enter a value:

prompt. The value is retained by the a variable and is displayed by the echo "You

entered: $a" statement.
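
A script that turns echo off should put the terminal back the way it found it, not simply turn echo on. The following sketch combines –echo with the –g option, which reports the current settings in a form that another stty command accepts. The read_secret name is made up for this example, and the [ -t 0 ] test is an added assumption so that the function also works when standard input is not a terminal.

```shell
#!/bin/bash
# Sketch only: read_secret is a hypothetical helper, not part of stty.
read_secret() {
    local saved="" value=""
    if [ -t 0 ]; then            # standard input is a terminal
        saved=$(stty -g)         # remember the current settings
        stty -echo               # stop echoing typed characters
    fi
    IFS= read -r value
    if [ -n "$saved" ]; then
        stty "$saved"            # restore the saved settings
        echo >&2                 # move to a new line after the hidden input
    fi
    printf '%s\n' "$value"
}

# Usage from an interactive script:
#   printf 'Enter a value: ' >&2
#   secret=$(read_secret)
```

If a script like this is interrupted while echo is off, giving the command stty echo or stty sane at the shell prompt restores echoing.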

Examples The first example shows that stty without any arguments displays several terminal

operation parameters. (Your system may display more or different parameters.) The

character following the erase = is the erase key. A ^ preceding a character indicates

a CONTROL key. In the example the erase key is set to CONTROL-H. If stty does not display

the erase character, it is set to its default value of DELETE. If it does not display a kill

character, it is set to its default of ^U.

$ stty

speed 38400 baud; line = 0;

erase = ^H;

Next the ek argument returns the erase and line kill keys to their default values:

$ stty ek

The next display verifies the change. The stty utility does not display either the erase

character or the line kill character, indicating both are set to their default values:

$ stty

speed 38400 baud; line = 0;

The next example sets the erase key to CONTROL-H. The CONTROL-V quotes the CONTROL-H so

that the shell does not interpret it and it is passed to stty unchanged:

$ stty erase CONTROL–V CONTROL–H

$ stty

speed 38400 baud; line = 0;

erase = ^H;

Next stty sets the line kill key to CONTROL-X. This time the user entered a caret (^) fol-

lowed by an x to represent CONTROL-X. You can use either a lowercase or uppercase letter.

$ stty kill ^X

$ stty

speed 38400 baud; line = 0;

erase = ^H; kill = ^X;

Now stty changes the interrupt key to CONTROL-C:

$ stty intr CONTROL–V CONTROL–C

In the following example, stty turns off TABs so the appropriate number of SPACEs is

sent to the terminal in place of a TAB. Use this command if a terminal does not auto-

matically expand TABs.

$ stty -tabs

If you log in under Linux and everything appears on the terminal in uppercase let-

ters, give the following command and then check the CAPS LOCK key. If it is set, turn it

off.

$ STTY -LCASE

Turn on lcase if you are using a very old terminal that cannot display lowercase

characters.

Although no one usually changes the suspend key from its default of CONTROL-Z, you

can. Give the following command to change the suspend key to CONTROL-T:

$ stty susp ^T

sysctl O

Displays and alters kernel variables

sysctl [options] [variable-list]

sysctl [options] –w [var=value ... ]

The sysctl utility displays and alters kernel variables, including kernel tuning parameters.

Arguments The variable-list is a list of kernel variables whose values sysctl displays. In the

second format, each value is assigned to the variable named var.

Options –a (all) Displays all kernel variables.

–b (binary) Displays kernel variables as binary data without terminating NEWLINEs.

–n (no label) Displays variables without labels.

Discussion The sysctl utility provides access to a number of kernel variables, including the max-

imum number of processes the kernel will run at one time, the filename used for

core files, and the system security level. Some variables cannot be altered or can be

altered only in certain ways. For example, you can never lower the security level.

Examples The sysctl utility is commonly used for tuning the kernel. The process limits can be

displayed by anyone, but can be altered only by a user who is working with root

privileges. The following example shows a user changing the maximum number of

processes per UID:

$ sysctl -a | grep proc

kern.maxproc = 532

kern.maxfilesperproc = 10240

kern.maxprocperuid = 266

kern.aioprocmax = 16

kern.proc_low_pri_io = 0

...

# sysctl -w kern.maxprocperuid=200

kern.maxprocperuid: 266 -> 200

tail

Displays the last part (tail) of a file

tail [options] [file-list]

The tail utility displays the last part, or end, of a file.

Arguments The file-list is a list of pathnames of the files that tail displays. When you specify

more than one file, tail displays the filename of each file before displaying lines from

that file. If you do not specify an argument or, under Linux, if you specify a hyphen

(–) instead of a filename, tail reads from standard input.

Options Under Linux, tail accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

–b [+]n

Counts by 512-byte blocks instead of lines. The n argument is an integer that

specifies the number of blocks. Thus the command tail –b 5 displays the last

five blocks of a file. See the note about using a plus sign (+) in the next

option. O

––bytes=[+]n[u] –c [+]n[u]

Counts by bytes (characters) instead of lines. The n argument is an integer

that specifies the number of bytes. Thus the command tail –c 5 displays the

last five bytes of a file. Under Linux only, the u is an optional multiplicative

suffix as described on page 602, except that tail uses a lowercase k for kilobyte

(1,024-byte blocks). If you include a multiplicative suffix, tail counts by this

unit instead of by bytes.

If you put a plus sign (+) in front of n, tail counts from the start of the file

instead of the end. The tail utility still displays characters through the end of

the file, even though it starts counting from the beginning. Thus tail –c +5

causes tail to display from the fifth character through the end of the file.

––follow –f After copying the last line of the file, tail enters an endless loop, waiting and

copying additional lines from the file if the file grows. If you specify multiple

files in file-list with this option, tail includes a new header each time it displays

output from a different file so you know which file is being added to. This

option is useful for tracking the progress of a process that is running in the

background and sending its output to a file. The tail utility continues to wait

indefinitely, so you must use the interrupt key to terminate it. See also the –s

option.

––lines=[+]n[u] –n [+]n[u]

Counts by lines (the default). The n argument is an integer that specifies the

number of lines. Under Linux, the u is an optional unit of measure; see the –c

(––bytes) option for an explanation of its use. Although it is not documented,

you can use ±n to specify a number of lines without using this option.

If you put a plus sign (+) in front of n, tail counts from the start of the file

instead of the end. The tail utility still displays lines through the end of the file,

even though it starts counting from the beginning. Thus tail –n +5 causes tail to

display from the fifth line through the last line of the file.

––quiet –q Suppresses header information when you specify multiple files in file-list. L

––sleep-interval=n –s n

When used with –f, causes tail to sleep for n seconds between checks for addi-

tional output. L

Notes The tail utility displays the last ten lines of its input by default.

Examples The examples are based on the eleven file:

$ cat eleven

line one

line two

line three

line four

line five

line six

line seven

line eight

line nine

line ten

line eleven

First tail displays the last ten lines of the eleven file (no options):

$ tail eleven

line two

line three

line four

line five

line six

line seven

line eight

line nine

line ten

line eleven

Next it displays the last three lines (–n 3 or ––lines 3) of the file:

$ tail -n 3 eleven

line nine

line ten

line eleven

The following example displays the file starting at line 8 (+8):

$ tail -n +8 eleven

line eight

line nine

line ten

line eleven

The next example displays the last six characters in the file (–c 6 or ––bytes 6). Only

five characters are evident (leven); the sixth is a NEWLINE.

$ tail -c 6 eleven

leven

The final example demonstrates the –f option. Here tail tracks the output of a make

command, which is being sent to the file accounts.out:

$ make accounts > accounts.out &

$ tail -f accounts.out

cc -c trans.c

cc -c reports.c

...

CONTROL-C

In the preceding example, running tail with –f displays the same information as run-

ning make in the foreground and not redirecting its output to a file. However, using

tail offers some advantages. First, the output of make is saved in a file. (The output

would not be saved if you did not redirect its output.) Second, if you decide to do

something else while make is running, you can kill tail and the screen will be free for

you to use while make continues running in the background. When you are running

a large job such as compiling a large program, you can use tail with the –f option to

check on its progress periodically.
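
The difference between counting from the end of the input and counting from the start (the plus sign) is easy to check with a short input. This sketch uses printf to generate a few lines; the skip_header function name is made up for the illustration.

```shell
#!/bin/bash
# Sketch only: skip_header is a hypothetical helper.
# tail -n +2 starts displaying at line 2, so it drops a one-line header.
skip_header() {
    tail -n +2
}

printf '%s\n' one two three four five | tail -n 2    # displays: four, five
printf '%s\n' one two three four five | tail -n +4   # displays: four, five
printf '%s\n' name alice bob | skip_header           # displays: alice, bob
```

On a five-line input, –n 2 and –n +4 select the same lines; the two forms differ as soon as the length of the input changes.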

tar

Stores or retrieves files to/from an archive file

tar option [modifiers] [file-list]

The tar (tape archive) utility creates, adds to, lists, and retrieves files from an archive file.

Arguments The file-list is a list of pathnames of the files that tar archives or extracts.

Options Use only one of the following options to indicate which action you want tar to take.

You can alter the action of the option by following it with one or more modifiers.

––create –c Creates an archive. This option stores the files named in file-list in a new

archive. If the archive already exists, tar destroys it before creating the new

archive. If a file-list argument is a directory, tar copies the directory hierarchy

into the archive. Without the ––file option, tar writes the archive to standard

output.

––compare or ––diff –d Compares an archive with the corresponding disk files and reports on the

differences.

––help Displays a list of options and modifiers, along with short descriptions of each.

––append –r Writes the files named in file-list to the end of the archive. This option leaves

files that are already in the archive intact, so duplicate copies of files may

appear in the archive after tar finishes. When tar extracts the files, the most

recent copy of a file in the archive is the one that ends up on the disk.

––list –t (table of contents) Without a file-list, this option produces a table of contents

listing all files in an archive. With a file-list, it displays the name of each file in

the file-list each time it occurs in the archive. You can use this option with the

––verbose option to display detailed information about each file in an archive.

––update –u Adds the files from file-list if they are not already in the archive or if they have

been modified since they were last written to the archive. Because of the addi-

tional checking required, tar runs more slowly when you specify this option.

––extract –x Extracts file-list from the archive and writes it to the disk, overwriting existing

files with the same names. Without a file-list, this option extracts all files from

the archive. If the file-list includes a directory, tar extracts the directory hierar-

chy. The tar utility attempts to keep the owner, modification time, and access

privileges the same as those of the original file. If tar reads the same file more

than once, the last version read will appear on the disk when tar is finished.

The Mac OS X version of tar accepts long options

tip Options for tar preceded by a double hyphen (––) work under Mac OS X as well as under Linux.

Modifiers

––blocking-factor n

–b n

Uses n as the blocking factor for creating an archive. Use this option only

when tar is creating an archive directly to removable media. (When tar reads an

archive, it automatically determines the blocking factor.) The value of n is the

number of 512-byte blocks to write as a single block on the removable media.

––directory dir –C dir

Changes the working directory to dir before processing.

––checkpoint Displays periodic messages. This option lets you know tar is running without

forcing you to view all the messages displayed by ––verbose.

––exclude=file Does not process the file named file. If file is a directory, no files or directories

in that directory hierarchy are processed. The file can be an ambiguous file ref-

erence; quote special characters as needed.

––file filename –f filename

Uses filename as the name of the file the archive is created in or extracted

from. The filename can be the name of an ordinary file or a device (e.g., a CD

or USB flash drive). You can use a hyphen (–) instead of the filename to refer

to standard output when creating an archive and to standard input when

extracting files from an archive. The following two commands are equivalent

ways of creating a compressed archive of the files in the /home directory

hierarchy on /dev/sde1:

$ tar -zcf /dev/sde1 /home

$ tar -cf - /home | gzip > /dev/sde1

––dereference –h For each file that is a symbolic link, archives the file the link points to, not the

symbolic link itself.

––ignore-failed-read

When creating an archive, tar normally quits with a nonzero exit status if any of

the files in file-list is unreadable. This option causes tar to continue processing,

skipping unreadable files.

––bzip2 –j Uses bzip2 (pages 60 and 615) to compress/decompress files when creating an

archive and extracting files from an archive.

––tape-length n –L n

Asks for a new medium after writing n*1,024 bytes to the current medium.

This feature is useful when you are building archives that are too big to fit on a

single USB flash drive, partition, CD, or other storage device.

––touch –m Sets the modification time of the extracted files to the time of extraction.

Without this option tar attempts to maintain the modification time of the

original file.

––one-file-system When a directory name appears in file-list while it is creating an archive, tar

recursively processes the files and directories in the named directory hierarchy.

With this option tar stays in the filesystem that contains the named directory

and does not process directories in other filesystems. Under OS X, you can use

–l (lowercase “l”) in place of ––one-file-system. Under Linux, –l is used for a

different purpose.

––absolute-names –P The default behavior of tar is to force all pathnames to be relative by stripping

leading slashes from them. This option disables this feature, so absolute path-

names remain absolute.

––sparse –S Linux allows you to create sparse files (large, mostly empty files). The empty

sections of sparse files do not take up disk space. When tar extracts a sparse file

from an archive, it normally expands the file to its full size. As a result, when

you restore a sparse file from a tar backup, the file takes up its full space and

may no longer fit in the same disk space as the original. This option causes tar

to handle sparse files efficiently so they do not take up unnecessary space

either in the archive or when they are extracted.

––verbose –v Lists each file as tar reads or writes it. When combined with the –t option, –v

causes tar to display a more detailed listing of the files in the archive, showing

their ownership, permissions, size, and other information.

––interactive –w Asks you for confirmation before reading or writing each file. Respond with y

if you want tar to take the action. Any other response causes tar not to take the

action.

––exclude-from filename

–X filename

Similar to the ––exclude option, except filename specifies a file that contains a

list of files to exclude from processing. Each file listed in filename must appear

on a separate line.

––compress or ––uncompress –Z Uses compress when creating an archive and uncompress when extracting files

from an archive.

––gzip or ––gunzip –z Causes tar to use gzip to compress an archive while it is being created and to

decompress an archive when extracting files from it. This option also works to

extract files from archives that have been compressed with the compress utility.

Notes The ––help option displays all tar options and modifiers; the ––usage option pro-

vides a brief summary of the same information. The info page on tar provides exten-

sive information, including a tutorial for this utility.

You can use ambiguous file references in file-list when you create an archive but not

when you extract files from an archive.

The name of a directory file within the file-list references the directory hierarchy (all

files and directories within that directory).

The file that tar sends its output to by default is compilation specific; typically it

goes to standard output. Use the –f option to specify a different filename or device

to hold the archive.

When you create an archive using a simple filename in file-list, the file appears in

the working directory when you extract it. If you use a relative pathname when you

create an archive, the file appears with that relative pathname, starting from the

working directory when you extract it. If you use the –P option and an absolute

pathname when you create an archive, and if you use the –P option when you

extract files from the archive, tar extracts the file with the same pathname.
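
Because a hyphen can stand in for the archive file, two tar processes can be connected with a pipe to copy a directory hierarchy without creating an intermediate archive. The following sketch assumes a version of tar that accepts –C, as described under "Modifiers"; the copy_tree function name and the temporary directories are made up for the example.

```shell
#!/bin/bash
# Sketch only: copy_tree is a hypothetical helper.
# Usage: copy_tree source-dir destination-dir
copy_tree() {
    mkdir -p "$2" || return 1
    # The first tar writes the archive to standard output (-f -);
    # the second reads it from standard input and extracts it in $2.
    tar -cf - -C "$1" . | tar -xf - -C "$2"
}

src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/memos"
echo "first memo" > "$src/memos/memo1"
copy_tree "$src" "$dest"
cat "$dest/memos/memo1"      # displays: first memo
rm -rf "$src" "$dest"
```

Archiving the directory . after –C stores relative pathnames, so the files land directly under the destination directory.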

Leading hyphens The tar utility does not require leading hyphens on options and modifiers. However,

it behaves slightly differently with regard to the order of options and modifiers it

requires with and without the hyphen.

You can specify one or more modifiers following an option. With a leading hyphen,

the following tar command generates an error:

$ tar -cbf 10 /dev/sde1 memos

tar: f: Invalid blocking factor

Try `tar --help' or `tar --usage' for more information.

The error occurs because the –b modifier takes an argument but is not the last modi-

fier in a group. The same command works correctly if you omit the leading hyphen.

You must use leading hyphens if you separate options:

$ tar -cb 10 -f /dev/sde1 memos
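
The two styles can be compared on an ordinary file instead of a device. In this sketch the memos directory and the archive names are made up; without a leading hyphen, the arguments to b and f follow the group of letters in the same order as the letters themselves.

```shell
#!/bin/bash
# Sketch only: directory and archive names are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir memos
echo "memo text" > memos/memo1

# No leading hyphen: 10 goes with b, old.tar goes with f.
tar cbf 10 old.tar memos

# Leading hyphens: each modifier that takes an argument is given separately.
tar -cb 10 -f new.tar memos

# Both archives hold the same files.
tar -tf old.tar
tar -tf new.tar

cd / && rm -rf "$work"
```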

Examples The following example makes a copy of the /home/max directory hierarchy on a

USB flash drive mounted at /dev/sde1. The v modifier causes the command to list

the files it writes to the device. This command erases anything that was already on

the device. The message from tar explains that the default action is to store all path-

names as relative paths instead of absolute paths, thereby allowing you to extract

the files into a different directory on the disk.

$ tar -cvf /dev/sde1 /home/max

tar: Removing leading '/' from member names.

/home/max/

/home/max/.bash_history

/home/max/.bash_profile

...

In the next example, the same directory is saved on the device mounted on

/dev/sde1 with a blocking factor of 100. Without the v modifier, tar does not dis-

play the list of files it is writing to the device. The command runs in the background

and displays any messages after the shell issues a new prompt.

$ tar -cb 100 -f /dev/sde1 /home/max &

[1] 4298

$ tar: Removing leading '/' from member names.

The next command displays the table of contents of the archive on the device

mounted on /dev/sde1:

$ tar -tvf /dev/sde1

drwxrwxrwx max/group 0 Jun 30 21:39 2008 home/max/

-rw-r--r-- max/group 678 Aug 6 14:12 2009 home/max/.bash_history

-rw-r--r-- max/group 571 Aug 6 14:06 2009 home/max/.bash_profile

drwx------ max/group 0 Nov 6 22:34 2009 home/max/mail/

-rw------- max/group 2799 Nov 6 22:34 2009 home/max/mail/sent-mail

...

Next, Max creates a gzipped tar archive in /tmp/max.tgz. This approach is a popular

way to bundle files that you want to transfer over a network or otherwise share with

others. Ending a filename with .tgz is one convention for identifying gzipped tar

archives. Another convention is to end the filename with .tar.gz.

$ tar -czf /tmp/max.tgz literature

The final command lists the files in the compressed archive max.tgz:

$ tar -tzvf /tmp/max.tgz

...
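
The create, list, and extract steps can be rehearsed in a scratch directory before you write to a real device. This sketch assumes gzip is installed; the directory and file names are made up, and the argument to ––exclude is quoted so the shell does not expand the pattern.

```shell
#!/bin/bash
# Sketch only: all names are hypothetical; everything happens in a scratch directory.
work=$(mktemp -d)
mkdir -p "$work/literature" "$work/restore"
echo "chapter one" > "$work/literature/ch1"
echo "scratch"     > "$work/literature/ch1.bak"

# Create a gzipped archive, skipping backup files.
tar -czf "$work/lit.tgz" -C "$work" --exclude='*.bak' literature

# List the archive, then extract it into a different directory.
tar -tzf "$work/lit.tgz"
tar -xzf "$work/lit.tgz" -C "$work/restore"

cat "$work/restore/literature/ch1"    # displays: chapter one
rm -rf "$work"
```

Because the archive stores the relative pathname literature/ch1, the file can be restored under any directory named with –C.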

tee

Copies standard input to standard output and one or more files

tee [options] file-list

The tee utility copies standard input to standard output and to one or more files.

Arguments The file-list is a list of the pathnames of files that receive output from tee. If a file in

file-list does not exist, tee creates it.

Options Options preceded by a double hyphen (––) work under Linux only. Options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

Without any options, tee overwrites the output files if they exist and responds to

interrupts.

––append –a Appends output to existing files rather than overwriting them.

–i Causes tee not to respond to the SIGINT interrupt. O

––ignore-interrupts

–i Causes tee not to respond to interrupts. L

Examples In the following example, a pipe sends the output from make to tee, which copies

that information to standard output and to the file accounts.out. The copy that goes

to standard output appears on the screen. The cat utility displays the copy that was

sent to the file.

$ make accounts | tee accounts.out

cc -c trans.c

cc -c reports.c

...

$ cat accounts.out

cc -c trans.c

cc -c reports.c

...

Refer to page 845 for a similar example that uses tail –f rather than tee.
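
The difference between overwriting and appending can be checked with a short pipeline. In this sketch the log file is a temporary file made up for the example; the copy that tee sends to standard output is discarded so that only the contents of the file are displayed.

```shell
#!/bin/bash
# Sketch only: the log file is a hypothetical temporary file.
log=$(mktemp)

echo "first run"  | tee "$log" > /dev/null       # overwrites the file
echo "second run" | tee -a "$log" > /dev/null    # appends to the file
cat "$log"                                       # displays: first run, second run

echo "third run"  | tee "$log" > /dev/null       # overwrites again
cat "$log"                                       # displays: third run
rm -f "$log"
```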

telnet

Connects to a remote system over a network

telnet [options] [remote-system]

The telnet utility implements the TELNET protocol to connect to a remote system over a network.

Arguments The remote-system is the name or IP address of the remote system that telnet connects

to. When you do not specify a remote-system, telnet works interactively and

prompts you to enter one of the commands described in this section.

Options –a Initiates automatic login (the default behavior under Mac OS X).

–e c

(escape) Changes the escape character from CONTROL-] to the character c.

–K Prevents automatic login. This option is available under Mac OS X and some

Linux distributions (it is the default behavior under Linux).

–l username

Attempts an automatic login on the remote system using username. If the

remote system understands how to handle automatic login with telnet, it

prompts for a password.

Discussion After telnet connects to a remote system, you can put telnet into command mode by

typing the escape character (usually CONTROL-]). The remote system typically reports

the escape character it recognizes. To leave command mode, type RETURN on a line by

itself.

In command mode, telnet displays the telnet> prompt. You can use the following

commands in command mode:

? (help) Displays a list of commands recognized by the telnet utility on the local system.

close Closes the connection to the remote system. If you specified the name of a system

on the command line when you started telnet, close has the same effect as quit: The

telnet program quits, and the shell displays a prompt. If you used the open com-

mand instead of specifying a remote system on the command line, close returns

telnet to command mode.

logout Logs you out of the remote system; similar to close.

telnet is not secure

security The telnet utility is not secure. It sends your username and password over the network as clear-

text, which is not a secure practice. Use ssh (page 828) when it is available.

open remote-computer

If you did not specify a remote system on the command line or if the attempt to con-

nect to the system failed, you can specify the name of a remote system interactively

with the open command.

quit Quits the telnet session.

z Suspends the telnet session. When you suspend a session, you return to the login shell

on the local system. To resume the suspended telnet session, type fg at a shell prompt.

Notes Under Linux, telnet does not attempt to log in automatically. Under OS X, telnet

attempts to log in automatically. When telnet attempts to log in automatically, it

uses your username on the local system unless you specify a different name using

the –l option.

Many computers, including non-Linux/OS X systems, support the TELNET proto-

col. The telnet utility is a user interface to this protocol for Linux/OS X systems that

allows you to connect to many different types of systems. Although you typically

use telnet to log in, the remote computer may offer other services through telnet,

such as access to special databases.

Examples In the following example, the user connects to the remote system named bravo.

After running a few commands, the user escapes to command mode and uses the z

command to suspend the telnet session so he can run a few commands on the local

system. The user gives an fg command to the shell to resume using telnet. The logout

command on the remote system ends the telnet session, and the local shell displays a

prompt.

kudos% telnet bravo

Trying 192.168.0.55 ...

Connected to bravo.

Escape character is '^]'.

Password:

Last login: Wed Jul 31 10:37:16 from kudos

bravo $

...

bravo $ CONTROL-]

telnet> z

[1]+ Stopped telnet bravo

kudos $

...

kudos $ fg

telnet bravo

bravo $ logout

Connection closed by foreign host.

kudos $

test

Evaluates an expression

test expression

[ expression ]

The test utility evaluates an expression and returns a condition code indicating the expression is

either true (0) or false (not 0). You can place brackets ([]) around the expression instead of using the

word test (second format).

Arguments The expression contains one or more criteria (see the following list) that test evalu-

ates. A –a separating two criteria is a Boolean AND operator: Both criteria must be

true for test to return a condition code of true. A –o is a Boolean OR operator.

When –o separates two criteria, one or the other (or both) of the criteria must be

true for test to return a condition code of true.

You can negate any criterion by preceding it with an exclamation point (!). You can

group criteria with parentheses. If there are no parentheses, –a takes precedence

over –o, and test evaluates operators of equal precedence from left to right.

Within the expression you must quote special characters, such as parentheses, so

the shell does not interpret them but rather passes them to test unchanged.

Because each element, such as a criterion, string, or variable within the expression,

is a separate argument, you must separate each element from other elements with a

SPACE. Table V-29 lists the criteria you can use within the expression. Table V-30 on

page 856 lists test’s relational operators.
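As a minimal sketch of how these operators combine, the following fragment defines two helper functions. The names usable_file and not_a_directory and the scratch files are assumptions made for illustration; they do not come from the text above. The backslashes quote the parentheses so the shell passes them to test unchanged, and a SPACE separates every element of each expression.

```shell
# usable_file is a hypothetical helper, not part of the original text.
# It succeeds (returns 0) when "$1" is an ordinary file that is not empty
# and that the user can read or write.
usable_file() {
    # Backslashes quote the parentheses so the shell passes them to test.
    [ -f "$1" -a -s "$1" -a \( -r "$1" -o -w "$1" \) ]
}

# not_a_directory shows negation with an exclamation point.
not_a_directory() {
    [ ! -d "$1" ]
}

workdir=$(mktemp -d)              # scratch directory for the demonstration
echo "some text" > "$workdir/full"
: > "$workdir/empty"              # creates an empty file

usable_file "$workdir/full"  && echo "full: usable"
usable_file "$workdir/empty" || echo "empty: not usable"
not_a_directory "$workdir"   || echo "$workdir: is a directory"
rm -rf "$workdir"
```

Because –a takes precedence over –o, leaving out the parentheses would change the meaning of the first expression: test would then succeed for any writable file, even an empty one.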

Table V-29 Criteria

Criterion Meaning

string True if string has a length greater than zero. L

string True if string is not a null string. O

–n string True if string has a length greater than zero.

–z string True if string has a length of zero.

string1 = string2 True if string1 is equal to string2.

string1 != string2 True if string1 is not equal to string2.

int1 relop int2 True if integer int1 has the specified algebraic relationship to integer int2. The relop is a relational operator from Table V-30. As a special case, –l string, which gives the length of string, may be used for int1 or int2.

file1 –ef file2 True if file1 and file2 have the same device and inode numbers.

file1 –nt file2 True if file1 was modified after file2 (the modification time of file1 is newer than that of file2).

file1 –ot file2 True if file1 was modified before file2 (the modification time of file1 is older than that of file2).

–b filename True if the file named filename exists and is a block special file.

–c filename True if the file named filename exists and is a character special file.

–d filename True if the file named filename exists and is a directory.

–e filename True if the file named filename exists.

–f filename True if the file named filename exists and is an ordinary file.

–g filename True if the file named filename exists and its setgid bit (page 96) is set.

–G filename True if the file named filename exists and is associated with the group that is the primary group of the user running the command (same effective group ID).

–k filename True if the file named filename exists and its sticky bit (page 980) is set.

–L filename True if the file named filename exists and is a symbolic link.

–O filename True if the file named filename exists and is owned by the user running the command (same effective user ID).

–p filename True if the file named filename exists and is a named pipe.

–r filename True if the file named filename exists and the user running the command has read permission for it.

–s filename True if the file named filename exists and contains information (has a size greater than 0 bytes).

–t file-descriptor True if file-descriptor is open and associated with the screen or keyboard. The file-descriptor for standard input is 0, for standard output is 1, and for standard error is 2.

–u filename True if the file named filename exists and its setuid bit (page 96) is set.

–w filename True if the file named filename exists and you have write permission for it.

–x filename True if the file named filename exists and you have execute/search permission for it.

Notes The test command is built into the Bourne Again and TC Shells.

Examples The following examples demonstrate the use of the test utility in Bourne Again Shell

scripts. Although test works from a command line, it is more commonly employed

in shell scripts to test input or verify access to a file.

The first example prompts the user, reads a line of input into a variable, and uses

the synonym for test, [ ], to see whether the user entered yes:

$ cat user_in

echo -n "Input yes or no: "

read user_input

if [ "$user_input" = "yes" ]

    then

        echo You input yes.

fi

The next example prompts for a filename and then uses the synonym for test, [ ], to

see whether the user has read access permission (–r) for the file and (–a) whether the

file contains information (–s):

$ cat validate

echo -n "Enter filename: "

read filename

if [ -r "$filename" -a -s "$filename" ]

    then

        echo File $filename exists and contains information.

        echo You have read access permission to the file.

fi

The –t 1 criterion checks whether the process running test is sending standard out-

put to the screen. If it is, the test utility returns a value of true (0). The shell stores

the exit status of the last command it ran in the $? variable. The following script

tests whether its output is going to a terminal:

Table V-30 Relational operators

Relational operator Meaning

–eq Equal to

–ge Greater than or equal to

–gt Greater than

–le Less than or equal to

–lt Less than

–ne Not equal to

$ cat term

test -t 1

echo "This program is (=0) or is not (=1)

sending its output to a terminal:" $?

First term is run with the output going to the terminal:

$ term

This program is (=0) or is not (=1)

sending its output to a terminal: 0

The next example runs term and redirects its output to a file. The contents of the file

temp show that test returned 1, indicating its output was not going to a terminal.

$ term > temp

$ cat temp

This program is (=0) or is not (=1)

sending its output to a terminal: 1
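The relational operators in Table V-30 compare their operands as integers, whereas = compares strings. The following sketch illustrates the difference; the variable names, the values, and the in_range function are hypothetical and were chosen only for this illustration.

```shell
# The variable names and values below are illustrative assumptions.
count=7
limit=10

# -lt compares the two values as integers.
if [ "$count" -lt "$limit" ]
    then
        echo "$count is less than $limit"
fi

# in_range is a hypothetical helper: it succeeds when $1 lies between
# $2 and $3, inclusive. The -a operator requires both criteria to be true.
in_range() {
    [ "$1" -ge "$2" -a "$1" -le "$3" ]
}

in_range 5 1 10  && echo "5 is between 1 and 10"
in_range 42 1 10 || echo "42 is not between 1 and 10"

# Integer comparison (-eq) and string comparison (=) can disagree.
[ 007 -eq 7 ] && echo "007 and 7 are equal as integers"
[ 007 = 7 ]   || echo "007 and 7 are not equal as strings"
```

Quoting the variables, as in "$count", keeps test from receiving too few arguments when a variable is empty.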

top

Dynamically displays process status

top [options]

The top utility displays information about the status of the local system, including information about

current processes.

Options Although top does not require the use of hyphens with options, it is a good idea to

include them for clarity and consistency with other utilities. You can cause top to run

as though you had specified any of the options by giving commands to the utility

while it is running. See the “Discussion” section for more information.

–ca Causes top to run in accumulative mode. In this mode, times and events are

counted cumulatively since top started. See the top man page if you want to use

the –c option to run top in another mode. O

–d ss.tt

(delay) Specifies ss.tt as the number of seconds and tenths of seconds of delay

from one display update to the next. The default is 3 seconds. L

–i Ignores idle and zombie processes (a zombie is a process that has terminated but whose parent has not yet collected its exit status). L

–n n

(number) Specifies the number of iterations: top updates the display n times

and exits. L

–p n

(PID) Monitors the process with a PID of n. You can use this option up to 20

times on a command line or specify n as a comma-separated list of up to

20 PID numbers. L

–S (sum) Causes top to run in cumulative mode. In cumulative mode, the CPU

times reported for processes include forked processes and CPU times accumu-

lated by child processes that are now dead. L

–s (secure) Runs top in secure mode, restricting the commands you can use while

top is running to those that pose less of a security risk. L

–s ss

(seconds) Specifies ss as the number of seconds of delay from one display

update to the next. The default is 1 second. O

Discussion The first few lines top displays summarize the status of the local system. You can turn

each of these lines on or off with the toggle switches (interactive command keys)

specified in the following descriptions. The first line is the same as the output of the

uptime utility and shows the current time, the amount of time the local system has

been running since it was last booted, the number of users logged in, and the load

averages from the last 1, 5, and 15 minutes (toggle l [lowercase “l”]). The second line

indicates the number of running processes (toggle t). The next three lines report on

CPU (also toggle t), memory (toggle m), and swap space (also toggle m) use.

The rest of the display reports on individual processes, which are listed in order by

descending CPU usage (i.e., the most CPU-intensive process is listed first). By

default top displays the number of processes that fit on the screen.

Table V-31 describes the meanings of the fields displayed for each process.

While top is running, you can issue the following commands to modify its behavior:

h (help) Displays a summary of the commands you can use while top is running. L

? (help) Displays a summary of the commands you can use while top is running. O

k (kill) Allows you to kill a process. Unless you are working with root privileges,

you can kill only processes you own. When you use this command, top prompts

you for the PID of the process and the signal to send to the process. You can

enter either a signal number or a name. (See Table 10-5 on page 453 for a list of

signals.) This command is disabled when you are working in secure mode. L

Table V-31 Field names

Name Meaning

PID Process identification number

USER Username of the owner of the process

PR Priority of the process

NI nice value (page 773)

VIRT Number of kilobytes of virtual memory used by the process

RES Number of kilobytes of physical (nonswapped) memory used by the process

SHR Number of kilobytes of shared memory used by the process

S Status of the process (see STAT on page 798)

%CPU Percentage of the total CPU time the process is using

%MEM Percentage of physical memory the process is using

TIME[+] Total CPU time used by the process

COMMAND Command line that started the process or name of the program (toggle with c)

n (number) When you give this command, top asks you to enter the number of

processes you want it to display. If you enter 0 (the default), top shows as many

processes as fit on the screen.

q (quit) Terminates top.

r (renice) Changes the priority of a running process (refer to renice on page 802).

Unless you are working with root privileges, you can change the priority of

only your own processes and even then only to lower the priority by entering a

positive value. A user working with root privileges can enter a negative value,

increasing the priority of the process. This command is disabled when you are

working in secure mode. L

S (switch) Toggles top between cumulative mode and regular mode. See the –S

option for details. L

s (seconds) Prompts for the number of seconds to delay between updates to the

display (3 is the default). You may enter an integer, a fraction, or 0 (for contin-

uous updates). Under Linux, this command is disabled when you are working

in secure mode.

W (write) Writes top’s current configuration to your personal configuration file

(~/.toprc). L

SPACE

Refreshes the screen.

Notes The Linux and OS X versions of top are very different. Although it applies to both

versions, this coverage of top is oriented toward the Linux version. Refer to the

OS X top man page for more information about the OS X version.

The top utility is similar to ps but periodically updates the display, enabling you to

monitor the behavior of the local system over time.

This utility shows only as much of the command line for each process as fits on a

line. If a process is swapped out, top replaces the command line with the name of

the command in parentheses.

Under Linux, the top utility uses the proc filesystem. When proc is not mounted, top

does not work.

Requesting continuous updates is almost always a mistake. The display is updated

too quickly and the system load increases dramatically.

Example The following display is the result of a typical execution of top:

top - 17:48:53 up 6:21, 1 user, load average: 0.71, 1.00, 0.80

Tasks: 178 total, 1 running, 176 sleeping, 0 stopped, 1 zombie

Cpu(s): 16.2%us, 3.0%sy, 0.0%ni, 80.4%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st

Mem: 8188664k total, 8122636k used, 66028k free, 127200k buffers

Swap: 11912112k total, 0k used, 11912112k free, 7217728k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

7841 mark 20 0 108m 29m 12m S 15 0.4 5:14.62 npviewer.bin

7810 mark 20 0 675m 175m 27m S 8 2.2 24:07.87 firefox

12084 mark 20 0 568m 70m 32m S 7 0.9 5:14.39 amarokapp

13428 mark 10 -10 1715m 74m 62m S 5 0.9 6:25.19 vmware-vmx

7147 root 20 0 449m 102m 14m S 4 1.3 57:57.59 Xorg

7676 mark 20 0 159m 17m 11m S 1 0.2 0:05.14 konsole

13437 root 20 0 300m 18m 10m S 1 0.2 0:21.14 firestarter

16542 mark 20 0 18992 1332 940 R 1 0.0 0:00.16 top

7672 mark 20 0 323m 21m 10m S 0 0.3 1:09.47 artsd

7747 mark 20 0 306m 45m 25m S 0 0.6 1:20.28 vmware

1 root 20 0 4020 876 592 S 0 0.0 0:01.24 init

2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd

3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0

4 root 15 -5 0 0 0 S 0 0.0 0:03.04 ksoftirqd/0

...

touch

Creates a file or changes a file’s access and/or modification time

touch [options] file-list

The touch utility changes the access and/or modification time of a file to the current time or a time

you specify. You can also use touch to create a file.

Arguments The file-list is a list of the pathnames of the files that touch will create or update.

Options Under Linux, touch accepts the common options described on page 603. Options

preceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X. Without any options, touch changes the access and modification times to the

current time. When you do not specify the –c (––no-create) option, touch creates

files that do not exist.

–a Updates the access time only, leaving the modification time unchanged.

––no-create –c Does not create files that do not exist.

––date=datestring –d datestring

Updates times with the date specified by datestring. Most familiar formats are

permitted for datestring. Components of the date and time not included in

datestring are assumed to be the current date and time. This option may not be

used with –t. L

–m Updates the modification time only, leaving the access time unchanged.

––reference=file –r file

Updates times with the times of file.

–t [[cc]yy]nnddhhmm[.ss]

Changes times to the date specified by the argument. The nn argument is the

number of the month (01–12), dd is the day of the month (01–31), hh is the

hour based on a 24-hour clock (00–23), and mm is the minutes (00–59). You

must specify at least these fields. You can specify the number of seconds past

the start of the minute with .ss.

The optional cc specifies the first two digits of the year (the value of the cen-

tury minus 1), and yy specifies the last two digits of the year. When you do not

specify a year, touch assumes the current year. When you do not specify cc,

touch assumes 20 for yy in the range 0–68 and 19 for yy in the range 69–99.

This option may not be used with –d.

Examples The first three commands show touch updating an existing file. The ls utility with

the –l option displays the modification time of the file. The last three commands

show touch creating a file.

$ ls -l program.c

-rw-r--r-- 1 max group 5860 Apr 21 09:54 program.c

$ touch program.c

$ ls -l program.c

-rw-r--r-- 1 max group 5860 Aug 13 19:01 program.c

$ ls -l read.c

ls: read.c: No such file or directory

$ touch read.c

$ ls -l read.c

-rw-rw-r-- 1 max group 0 Aug 13 19:01 read.c

The first of the following ls commands displays the file modification times; the second

ls (with the –lu options) displays the file access times:

$ ls -l

-rw-r--r-- 1 max group 552 Jan 10 19:44 cases

-rw-r--r-- 1 max group 2209 Apr 18 04:24 excerpts

$ ls -lu

-rw-r--r-- 1 max group 552 May 29 15:28 cases

-rw-r--r-- 1 max group 2209 May 29 15:28 excerpts

The next example works on the two files shown above and demonstrates the use of

the –a option to change the access time only and the –t option to specify a date for

touch to use instead of the current date and time. After the touch command is exe-

cuted, ls shows that the access times of the files cases and excerpts have been

updated but the modification times remain the same.

$ touch -at 02040608 cases excerpts

$ ls -l

-rw-r--r-- 1 max group 552 Jan 10 19:44 cases

-rw-r--r-- 1 max group 2209 Apr 18 04:24 excerpts

$ ls -lu

-rw-r--r-- 1 max group 552 Feb 4 06:08 cases

-rw-r--r-- 1 max group 2209 Feb 4 06:08 excerpts
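The following sketch exercises the –t, –r, and –c options together. It is an illustration only: the filenames (older, newer, copy, and missing) are hypothetical, and the work is done in a scratch directory so that no existing file is changed. The –nt criterion of test (page 854) confirms the results.

```shell
# All filenames here are hypothetical; a scratch directory keeps the
# demonstration away from existing files.
scratch=$(mktemp -d)

# -t sets an explicit time: 6:08 AM on February 4, 2009.
touch -t 200902040608 "$scratch/older"
touch "$scratch/newer"                   # stamped with the current time

# -r copies the times of an existing file to another file.
touch -r "$scratch/older" "$scratch/copy"

# -c does not create a file that does not exist.
touch -c "$scratch/missing"

[ "$scratch/newer" -nt "$scratch/older" ] && echo "newer was modified after older"
[ "$scratch/copy" -nt "$scratch/older" ]  || echo "copy is not newer than older"
[ -e "$scratch/missing" ]                 || echo "missing was not created"
```

Giving all four digits of the year in the argument to –t avoids relying on the rule touch uses to guess the century.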

tr

Replaces specified characters

tr [options] string1 [string2]

The tr utility reads standard input and, for each input character, either maps it to an alternate character,

deletes the character, or leaves the character as is. It writes the result to

standard output.

Arguments The tr utility is typically used with two arguments, string1 and string2. The position

of each character in the two strings is important: Each time tr finds a character from

string1 in its input, it replaces that character with the corresponding character from

string2.

With one argument, string1, and the –d (––delete) option, tr deletes the characters

specified in string1. The option –s (––squeeze-repeats) replaces multiple sequential

occurrences of characters in string1 with single occurrences (for example, abbc

becomes abc).

Ranges

A range of characters is similar in function to a character class within a regular

expression (page 889). GNU tr does not support ranges (character classes) enclosed

within brackets. You can specify a range of characters by following the character

that appears earlier in the collating sequence with a hyphen and the character that

comes later in the collating sequence. For example, 1–6 expands to 123456.

Although the range A–Z expands as you would expect in ASCII, this approach does

not work when you use the EBCDIC collating sequence, as these characters are not

sequential in EBCDIC. See “Character Classes” for a solution to this issue.

Character Classes

A tr character class is not the same as the character class described elsewhere in this

book. (GNU documentation uses the term list operator for what this book calls a

character class.) You specify a character class as '[:class:]', where class is one of the

character classes from Table V-32. You must specify a character class in string1

(and not string2) unless you are performing case conversion (see the “Examples”

section) or you use the –d and –s options together.

Table V-32 Character classes

Class Meaning

alnum Letters and digits

alpha Letters

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

––complement –c Complements string1, causing tr to match all characters except those in string1.

––delete –d Deletes characters that match those specified in string1. If you use this option

with the –s (––squeeze-repeats) option, you must specify both string1 and

string2 (see “Notes”).

––help Summarizes how to use tr, including the special symbols you can use in string1

and string2. L

––squeeze-repeats –s Replaces multiple sequential occurrences of a character in string1 with a single

occurrence of the character when you call tr with only one string argument. If

you use both string1 and string2, the tr utility first translates the characters in

string1 to those in string2; it then replaces multiple sequential occurrences of a

character in string2 with a single occurrence of the character.

––truncate-set1 –t Truncates string1 so it is the same length as string2 before processing input. L

Notes When string1 is longer than string2, GNU tr extends string2 to the length of string1

by repeating the last character of string2. In this case tr departs from the POSIX

standard, which does not define a result. Use the –t (––truncate-set1) option to

truncate string1 to the length of string2 instead. When string1 is shorter than

string2, tr ignores the extra characters at the end of string2.

Table V-32 Character classes (continued)

Class Meaning

blank Whitespace

cntrl CONTROL characters

digit Digits

graph Printable characters but not SPACEs

lower Lowercase letters

print Printable characters including SPACEs

punct Punctuation characters

space Horizontal or vertical whitespace

upper Uppercase letters

xdigit Hexadecimal digits

If you use the –d (––delete) and –s (––squeeze-repeats) options at the same time, tr

first deletes the characters in string1 and then replaces multiple sequential occur-

rences of a character in string2 with a single occurrence of the character.
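A short sketch of the –d and –s options working together follows. The sample strings and the variable names result and tidy are made up for illustration. In each command, string1 names the characters to delete and string2 names the characters whose repeats are squeezed.

```shell
# Delete every a, then squeeze each run of c into a single c.
result=$(echo aabbccdd | tr -ds 'a' 'c')
echo "$result"                  # displays bbcdd

# Delete exclamation points, then squeeze runs of SPACEs.
tidy=$(echo 'hello   world!!' | tr -ds '!' ' ')
echo "$tidy"                    # displays hello world
```

The b and d characters appear in neither string, so tr passes their repeated occurrences through unchanged.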

Examples You can use a hyphen to represent a range of characters in string1 or string2. The

two command lines in the following example produce the same result:

$ echo abcdef | tr 'abcdef' 'xyzabc'

xyzabc

$ echo abcdef | tr 'a-f' 'x-za-c'

xyzabc

The next example demonstrates a popular method for disguising text, often

called ROT13 (rotate 13) because it rotates each letter 13 positions through the

alphabet: It replaces the first letter with the fourteenth, the second with the

fifteenth, and so forth.

$ echo The punchline of the joke is ... |

> tr 'A-M N-Z a-m n-z' 'N-Z A-M n-z a-m'

Gur chapuyvar bs gur wbxr vf ...

To make the text intelligible again, reverse the order of the arguments to tr:

$ echo Gur chapuyvar bs gur wbxr vf ... |

> tr 'N-Z A-M n-z a-m' 'A-M N-Z a-m n-z'

The punchline of the joke is ...

The ––delete option causes tr to delete selected characters:

$ echo If you can read this, you can spot the missing vowels! |

> tr --delete 'aeiou'

If y cn rd ths, y cn spt th mssng vwls!

In the following example, tr replaces characters and reduces pairs of identical

characters to single characters:

$ echo tennessee | tr -s 'tnse' 'srne'

serene

The next example replaces each sequence of nonalphabetic characters (the comple-

ment of all the alphabetic characters as specified by the character class alpha) in the

file draft1 with a single NEWLINE character. The output is a list of words, one per line.

$ tr -c -s '[:alpha:]' '\n' < draft1

The next example uses character classes to upshift the string hi there:

$ echo hi there | tr '[:lower:]' '[:upper:]'

HI THERE
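The word-list and case-conversion techniques shown above can be combined with sort (page 817) and uniq (page 872) to count how often each word occurs. The following is a sketch under stated assumptions: the function name word_freq and the sample sentence are hypothetical and do not appear in the original examples.

```shell
# word_freq is a hypothetical function name. It reads text from standard
# input and writes a count of each word, most frequent first.
word_freq() {
    tr -c -s '[:alpha:]' '\n' |       # one word per line
    tr '[:upper:]' '[:lower:]' |      # fold uppercase to lowercase
    sort |                            # bring identical words together
    uniq -c |                         # count each group of identical lines
    sort -rn                          # largest count first
}

echo "The cat and the hat and the bat" | word_freq
```

The sort that precedes uniq is needed because uniq compares only successive lines.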

tty

Displays the terminal pathname

tty [option]

The tty utility displays the pathname of standard input if it is a terminal and displays not a tty if it is

not a terminal. The exit status is 0 if standard input is a terminal and 1 if it is not.

Arguments There are no arguments.

Options Under Linux, tty accepts the common options described on page 603. The option

preceded by a double hyphen (––) works under Linux only. The option named with

a single letter and preceded by a single hyphen works under Linux and OS X.

––silent or ––quiet –s Causes tty not to print anything. The exit status of tty is set.

Notes The term tty is short for teletypewriter, the terminal device on which UNIX was

first run. This command appears in UNIX, and Linux has kept it for the sake of

consistency and tradition.

Examples The following example illustrates the use of tty:

$ tty

/dev/pts/11

$ echo $?

0

$ tty < memo

not a tty

$ echo $?

1
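In a script, the –s option lets you test the exit status of tty without displaying anything. The following sketch assumes a helper function named stdin_is_terminal; the name is hypothetical and is used only for this illustration.

```shell
# stdin_is_terminal is a hypothetical helper built on tty -s, which
# prints nothing and reports its result through the exit status only.
stdin_is_terminal() {
    tty -s
}

if stdin_is_terminal
    then
        echo "Reading from the terminal $(tty)"
    else
        echo "Standard input is not a terminal"
fi
```

A script might use such a check to decide whether to prompt the user or to read its input silently from a pipe or file.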

tune2fs

Changes parameters on an ext2, ext3, or ext4 filesystem

tune2fs [options] device

The tune2fs utility displays and modifies filesystem parameters on ext2, ext3, and ext4 filesystems.

This utility can also set up journaling on an ext2 filesystem, turning it into an ext3 filesystem. With

typical filesystem permissions, tune2fs must be run by a user working with root privileges. The

tune2fs utility is available under Linux only. L

Arguments The device is the name of the device, such as /dev/sda8, that holds the filesystem

whose parameters you want to display or modify.

Options –C n

(count) Sets the number of times the filesystem has been mounted without

being checked to n. This option is useful for staggering filesystem checks (see

“Discussion”) and for forcing a check the next time the system boots.

–c n

(max count) Sets the maximum number of times the filesystem can be mounted

between filesystem checks to n. Set n to 0 (zero) to disregard this parameter.

–e behavior

(error) Specifies what the kernel will do when it detects an error. Set behavior

to continue (continues execution), remount-ro (remounts the filesystem read-

only), or panic (causes a kernel panic). Regardless of how you set this option,

an error will cause fsck to check the filesystem next time the system boots.

–i n[u]

(interval) Sets the maximum time between filesystem checks to n time periods.

Without u or with u set to d, the time period is days. Set u to w to set the time

period to weeks; use m for months. Set n to 0 (zero) to disregard this parame-

ter. Because a filesystem check is forced only when the system is booted, the

time specified by this option may be exceeded.

–j (journal) Adds an ext3 journal to an ext2 filesystem. For more information on

journaling filesystems see page 961.

–l (list) Lists information about the filesystem.

–T date

(time) Sets the time the filesystem was last checked to date. The date is the time

and date in the format yyyynndd[hh[mm[ss]]]. Here yyyy is the year, nn is the

number of the month (01–12), and dd is the day of the month (01–31). You

must specify at least these fields. The hh is the hour based on a 24-hour clock

(00–23), mm is the minutes (00–59), and ss is the number of seconds past the

start of the minute. You can also specify date as now.

Discussion Checking a large filesystem can take a long time. When all filesystem checks occur

at the same time, the system may boot slowly. Use the –C and/or –T options to stag-

ger filesystem checks so they do not all happen at the same time.

Example Following is the output of tune2fs run with the –l option on a typical ext3 filesystem:

# /sbin/tune2fs -l /dev/sda1

tune2fs 1.41.4 (27-Jan-2009)

Filesystem volume name: <none>

Last mounted on: <not available>

Filesystem UUID: b6d9714e-ed5d-45b8-8023-716a669c16d8

Filesystem magic number: 0xEF53

Filesystem revision #: 1 (dynamic)

Filesystem features: has_journal ext_attr resize_inode dir_index

filetype needs_recovery sparse_super large_file

Filesystem flags: signed_directory_hash

Default mount options: (none)

Filesystem state: clean

Errors behavior: Continue

Filesystem OS type: Linux

Inode count: 624624

Block count: 2498099

Reserved block count: 124904

Free blocks: 1868063

Free inodes: 509355

First block: 0

Block size: 4096

Fragment size: 4096

Reserved GDT blocks: 609

Blocks per group: 32768

Fragments per group: 32768

Inodes per group: 8112

Inode blocks per group: 507

Filesystem created: Mon Apr 27 09:41:43 2009

Last mount time: Tue May 5 03:54:59 2009

Last write time: Tue May 5 03:54:59 2009

Mount count: 4

Maximum mount count: 31

Last checked: Mon Apr 27 09:41:43 2009

Check interval: 15552000 (6 months)

Next check after: Sat Oct 24 09:41:43 2009

Reserved blocks uid: 0 (user root)

Reserved blocks gid: 0 (group root)

First inode: 11

Inode size: 256

Required extra isize: 28

Desired extra isize: 28

Journal inode: 8

First orphan inode: 308701

Default directory hash: half_md4

Directory Hash Seed: bceae349-a46f-4d45-a8f1-a21b1ae8d2bd

Journal backup: inode blocks
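The Check interval line reports the interval in seconds. As a quick check on the sample output above, the following sketch converts that value to days; the variable names are assumptions chosen for illustration.

```shell
# Value copied from the "Check interval" line of the sample output.
check_interval=15552000              # seconds

seconds_per_day=$((60 * 60 * 24))    # 86400
days=$((check_interval / seconds_per_day))

echo "Check interval: $days days"    # displays Check interval: 180 days
```

The result, 180 days, matches the span between the Last checked date (April 27) and the Next check after date (October 24) in the output.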

umask

Establishes the file-creation permissions mask

umask [mask]

The umask builtin specifies the mask the system uses to set access permissions when you create a file.

This builtin works slightly differently in each of the shells.

Arguments The mask can be a three-digit octal number (bash and tcsh) or a symbolic value

(bash) such as you would use with chmod (page 626). The mask specifies the per-

missions that are not allowed.

When mask is an octal number, the digits correspond to the permissions for the

owner of the file, members of the group the file is associated with, and everyone

else. Because the mask specifies the permissions that are not allowed, the system

subtracts each of these digits from 7 when you create a file. The resulting three octal

numbers specify the access permissions for the file (the numbers you would use with

chmod). A mask you specify as a symbolic value also specifies the permissions that

are not allowed. See the “Notes” section.

Without any arguments, umask displays the file-creation permissions mask.

Option –S (symbolic) Displays the file-creation permissions mask symbolically.

Notes Most utilities and applications do not attempt to create files with execute permis-

sions, regardless of the value of mask; they assume you do not want an executable

file. As a result, when a utility or application such as touch creates a file, the system

subtracts each of the digits in mask from 6. An exception occurs with mkdir, which

does assume that you want the execute (access in the case of a directory) bit set. See

the “Examples” section.

The umask program is a builtin in bash and tcsh and generally goes in the initialization

file for your shell (~/.bash_profile [bash] or ~/.login [tcsh]).

Under bash the argument u=rwx,go=r turns off all bits in the mask for the owner

and turns off the read bit in the mask for groups and other users (the mask is 0033),

causing those bits to be on in file permissions (744 or 644). Refer to chmod on

page 626 for more information about symbolic permissions.

Examples The following commands set the file-creation permissions mask and display the

mask and its effect when you create a file and a directory. The mask of 022 gives

permissions of 644 (rw–r––r––) for a file (666 less the mask) and 755

(rwxr–xr–x) for a directory (777 less the mask):

$ umask 022

$ umask

0022

$ touch afile

$ mkdir adirectory

$ ls -ld afile adirectory

drwxr-xr-x 2 max max 4096 Jul 24 11:25 adirectory

-rw-r--r-- 1 max max 0 Jul 24 11:25 afile

The next example sets the same mask using symbolic values. The –S option displays

the mask symbolically:

$ umask u=rwx,g=rx,o=rx

$ umask

0022

$ umask -S

u=rwx,g=rx,o=rx
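The subtraction described in this section is shorthand: the system clears from the requested mode every bit that is set in the mask, which equals digit-by-digit subtraction only when no digit would go negative. The following sketch reproduces that arithmetic with shell integer expressions. The perms_for function is a hypothetical helper written for this illustration, not a standard command.

```shell
# perms_for MODE MASK: print the permissions left after MASK is applied to MODE.
# Both arguments are octal numbers; the helper name is made up for this example.
perms_for() {
    printf '%03o\n' $(( 0$1 & ~0$2 ))
}

perms_for 666 022   # mode requested by touch, mask 022     -> 644
perms_for 777 022   # mode requested by mkdir, mask 022     -> 755
perms_for 666 033   # mask set by umask u=rwx,go=r          -> 644
perms_for 777 033   # same mask applied to a directory      -> 744
```

With a mask of 033, subtracting 3 from 6 would suggest 633, but clearing the mask bits gives 644, which agrees with the value quoted in the "Notes" section.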

uniq

Displays unique lines

uniq [options] [input-file] [output-file]

The uniq utility displays its input, removing all but one copy of successive repeated lines. If the file

has been sorted (see sort on page 817), uniq ensures that no two lines it displays are the same.

Arguments When you do not specify the input-file, uniq reads from standard input. When you

do not specify the output-file, uniq writes to standard output.

Options Under Linux, uniq accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and OS

X. A field is a sequence of characters bounded by SPACEs, TABs, NEWLINEs, or a combination

of these characters.

––count –c Precedes each line with the number of occurrences of the line in the input file.

––repeated –d Displays one copy of lines that are repeated; does not display lines that are not

repeated.

––skip-fields=nfield

–f nfield

Ignores the first nfield blank-separated fields of each line. The uniq utility bases

its comparison on the remainder of the line, including the leading blanks of the

next field on the line (see the –s [––skip-chars] option).

––ignore-case –i Ignores case when comparing lines.

––skip-chars=nchar

–s nchar

Ignores the first nchar characters of each line. If you also use the –f (––skip-fields)

option, uniq ignores the first nfield fields followed by nchar characters. You can

use this option to skip over leading blanks of a field.

––unique –u Displays only lines that are not repeated.

––check-chars=nchar

–w nchar

Compares up to nchar characters on a line after honoring the –f (––skip-fields)

and –s (––skip-chars) options. By default uniq compares the entire line. L

Examples These examples assume the file named test in the working directory contains the

following text:

$ cat test

boy took bat home

boy took bat home

girl took bat home

dog brought hat home

dog brought hat home

dog brought hat home

Without any options, uniq displays only one copy of successive repeated lines:

$ uniq test

boy took bat home

girl took bat home

dog brought hat home

The –c (––count) option displays the number of consecutive occurrences of each

line in the file:

$ uniq -c test

2 boy took bat home

1 girl took bat home

3 dog brought hat home

The –d (––repeated) option displays only lines that are consecutively repeated in

the file:

$ uniq -d test

boy took bat home

dog brought hat home

The –u (––unique) option displays only lines that are not consecutively repeated in

the file:

$ uniq -u test

girl took bat home

In the next example, the –f (––skip-fields) option skips the first field in each line,

causing the lines that begin with boy and the one that begins with girl to appear to

be consecutive repeated lines. The uniq utility displays only one occurrence of these

lines:

$ uniq -f 1 test

boy took bat home

dog brought hat home

The next example uses both the –f (––skip-fields) and –s (––skip-chars) arguments

first to skip two fields and then to skip two characters. The two characters this

command skips include the SPACE that separates the second and third fields and the

first character of the third field. Ignoring these characters, all the lines appear to be

consecutive repeated lines containing the string at home. The uniq utility displays

only the first of these lines:

$ uniq -f 2 -s 2 test

boy took bat home
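Because uniq compares only adjacent lines, repeated lines that are separated by other lines all survive unless the input is sorted first. The following sketch shows the difference; the fruit_lines function and its sample data are invented for this illustration.

```shell
# fruit_lines prints three lines; the two "apple" lines are not adjacent.
fruit_lines() { printf '%s\n' apple pear apple; }

fruit_lines | uniq          # apple, pear, apple: nothing is removed
fruit_lines | sort | uniq   # apple, pear: sorting makes the duplicates adjacent
```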

w

Displays information about system users

w [options] [username]

The w utility displays the names of users who are currently logged in, together with their terminal

device numbers, the times they logged in, the commands they are running, and other information.

Arguments The username restricts the display to information about that user.

Options –f (from) Removes the FROM column. For users who are directly connected, this

field contains a hyphen. L

–h (no header) Suppresses the header line.

–s (short) Displays less information: username, terminal device, idle time, and

command. L

Discussion The first line w displays is the same as that displayed by uptime. This line includes

the time of day, how long the system has been running (in days, hours, and min-

utes), how many users are logged in, and how busy the system is (load average).

From left to right, the load averages indicate the average number of processes that were

waiting to run over the past 1 minute, 5 minutes, and 15 minutes.

The columns of information that w displays have the following headings:

USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT

The USER is the username of the user. The TTY is the device name for the line that

the user logged in on. The FROM is the system name that a remote user is logged in

from; it is a hyphen for a local user. The LOGIN@ gives the date and time when the

user logged in. The IDLE indicates how many minutes have elapsed since the user

last used the keyboard. The JCPU is the CPU time used by all processes attached to

the user’s tty, not including completed background jobs. The PCPU is the time used

by the process named in the WHAT column. The WHAT is the command that user

is running.

Examples The first example shows the full list produced by the w utility:

$ w

10:26am up 1 day, 55 min, 6 users, load average: 0.15, 0.03, 0.01

USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT

max tty1 - Fri 9am 20:39m 0.22s 0.01s vim td

max tty2 - Fri 5pm 17:16m 0.07s 0.07s -bash

root pts/1 - Fri 4pm 14:28m 0.20s 0.07s -bash

sam pts/2 - Fri 5pm 3:23 0.08s 0.08s /bin/bash

hls pts/3 potato 10:07am 0.00s 0.08s 0.02s w

In the next example, the –s option produces an abbreviated listing:

$ w -s

10:30am up 1 day, 58 min, 6 users, load average: 0.15, 0.03, 0.01

USER TTY FROM IDLE WHAT

max tty1 - 20:43m vim td

max tty2 - 17:19m -bash

root pts/1 - 14:31m -bash

sam pts/2 - 0.20s vim memo.030125

hls pts/3 potato 0.00s w -s

The final example requests information only about Max:

$ w max

10:35am up 1 day, 1:04, 6 users, load average: 0.06, 0.01, 0.00

USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT

max tty1 - Fri 9am 20:48m 0.22s 0.01s vim td

max tty2 - Fri 5pm 17:25m 0.07s 0.07s -bash

wc

Displays the number of lines, words, and bytes

wc [options] [file-list]

The wc utility displays the number of lines, words, and bytes in its input. When you specify more

than one file on the command line, wc displays totals for each file as well as totals for all files.

Arguments The file-list is a list of the pathnames of one or more files that wc analyzes. When

you omit file-list, wc takes its input from standard input.

Options Under Linux, wc accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––bytes –c Displays only the number of bytes in the input.

––max-line-length –L Displays the length of the longest line in the input. L

––lines –l (lowercase “l”) Displays only the number of lines (that is, NEWLINE characters) in

the input.

––chars –m Displays only the number of characters in the input.

––words –w Displays only the number of words in the input.

Notes A word is a sequence of characters bounded by SPACEs, TABs, NEWLINEs, or a combination

of these characters.

Examples The following command analyzes the file named memo. The numbers in the output

represent the number of lines, words, and bytes in the file.

$ wc memo

5 31 146 memo

The next command displays the number of lines and words in three files. The line

at the bottom, with the word total in the right column, contains the sum of each

column.

$ wc -lw memo1 memo2 memo3

10 62 memo1

12 74 memo2

12 68 memo3

34 204 total
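When you omit file-list, wc counts standard input and displays no filename. The following sketch feeds two short lines to wc through a pipe; the sample function is invented for this illustration. Some versions of wc pad the counts with leading SPACEs.

```shell
# sample prints two lines holding three words and 14 bytes in total.
sample() { printf 'one two\nthree\n'; }

sample | wc -l   # 2 (NEWLINE characters)
sample | wc -w   # 3 (words)
sample | wc -c   # 14 (bytes)
```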

which

Shows where in PATH a command is located

which command-list

For each command in command-list, the which utility searches the directories in the PATH variable

(page 297) and displays the absolute pathname of the first file it finds whose simple filename is the

same as the command.

Arguments The command-list is a list of one or more commands (utilities) that which searches

for. For each command, which searches the directories listed in the PATH environment

variable, in order, and displays the full pathname of the first command (executable

file) it finds. If which does not locate a command, it displays a message.

Options Options preceded by a double hyphen (––) work under Linux only; not all of these

options are available under all Linux distributions. Except as noted, options named

with a single letter and preceded by a single hyphen work under Linux and OS X.

––all –a Displays all matching executable files in PATH, not just the first.

––read-alias –i Reads aliases from standard input and reports on matching aliases in addition

to executable files in PATH (turn off with ––skip-alias). L

––read-functions Reads shell functions from standard input and reports on matching functions

in addition to executable files in PATH (turn off with ––skip-functions). L

––show-dot Displays ./ in place of the absolute pathname when a directory in PATH starts

with a period and a matching executable file is in that directory (turn off with

––skip-dot). L

––show-tilde Displays a tilde (~) in place of the absolute pathname of the user’s home

directory where appropriate. This option is ignored when a user working

with root privileges runs which. L

––tty-only Does not process more options (to the right of this option) if the process run-

ning which is not attached to a terminal. L

Notes Some distributions define an alias for which such as the following:

$ alias which

alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'

If which is not behaving as you would expect, verify that you are not running an

alias. The preceding alias causes which to be effective only when it is run interac-

tively (––tty-only) and to display aliases, display the working directory as a period

when appropriate, and display the name of the user’s home directory as a tilde.

The TC Shell includes a which builtin (see the tcsh man page) that works slightly

differently from the which utility (/usr/bin/which). Without any options, the which

utility does not locate aliases, functions, and shell builtins because these do not

appear in PATH. In contrast, the tcsh which builtin locates aliases, functions, and

shell builtins.

Examples The first example quotes the first letter of the command (\which) to prevent the

shell from invoking the alias (page 326) for which:

$ \which vim dir which

/usr/bin/vim

/bin/dir

/usr/bin/which

The next example, which works on some Linux systems only, is the same as the first

but uses the alias for which (which it displays):

$ which vim dir which

alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'

/usr/bin/which

/usr/bin/vim

/usr/bin/dir

The final example is the same as the previous one except it is run from tcsh. The tcsh

which builtin is used instead of the which utility:

tcsh $ which vim dir which

/usr/bin/vim

/bin/dir

which: shell built-in command.
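The search that which performs can be approximated in a few lines of shell. The following is a minimal sketch, not the implementation of which: first_in_path is a hypothetical function that walks the directories in PATH in order and displays the first executable file whose simple filename matches its argument. Like the which utility run without options, it knows nothing about aliases, functions, or builtins.

```shell
# first_in_path NAME: print the first executable file named NAME found in PATH.
# Hypothetical helper for illustration; returns nonzero when nothing is found.
first_in_path() {
    old_ifs=$IFS
    IFS=:                          # split PATH on colons
    for dir in $PATH; do
        dir=${dir:-.}              # an empty PATH entry means the working directory
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

first_in_path sh    # displays the pathname of the sh the shell would run
```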

who

Displays information about logged-in users

who [options]

who am i

The who utility displays information about users who are logged in on the local system. This informa-

tion includes each user’s username, terminal device, login time, and, if applicable, the corresponding

remote hostname.

Arguments When given two arguments (traditionally, am i), who displays information about the

user giving the command. If applicable, the username is preceded by the hostname

of the system running who (as in kudos!max).

Options Under Linux, who accepts the common options described on page 603. Options pre-

ceded by a double hyphen (––) work under Linux only. Except as noted, options

named with a single letter and preceded by a single hyphen work under Linux and

OS X.

––all –a Displays a lot of information. L

––boot –b Displays the date and time when the system was last booted. L

––heading –H Displays a header.

––login –l (lowercase “l”) Lists devices waiting for a user to log in. L

––count –q (quick) Lists the usernames only, followed by the number of users logged in on

the system.

––mesg –T Appends after each user’s username a character that shows whether that user

has messages enabled. A plus (+) means that messages are enabled, a hyphen

(–) means that they are disabled, and a question mark (?) indicates that who

cannot find the device. If messages are enabled, you can use write to communi-

cate with the user. Refer to “mesg: Denies or Accepts Messages” on page 71.

––users –u Includes each user’s idle time in the display. If the user has typed on her termi-

nal in the past minute, a period (.) appears in this field. If no input has

occurred for more than a day, the word old appears. In addition, this option

includes the PID number and comment fields. See the “Discussion” section.

Discussion The line who displays has the following syntax:

user [messages] line login-time [idle] [PID] comment

The user is the username of the user. The messages argument indicates whether mes-

sages are enabled or disabled (see the –T [––mesg] option). The line is the device

name associated with the line the user is logged in on. The login-time is the date

and time when the user logged in. The idle argument is the length of time since the

terminal was last used (the idle time; see the –u [––users] option). The PID is the

process identification number. The comment is the name of the remote system that

the user is logged in from (it is blank for local users).

Notes The finger utility (pages 68 and 695) provides information similar to that given by

who.

Examples The following examples demonstrate the use of the who utility:

$ who

hls tty1 Jul 30 06:01

sam tty2 Jul 30 06:02

max ttyp3 Jul 30 14:56 (bravo)

$ who am i

bravo!max ttyp3 Jul 30 14:56 (bravo)

$ who -HTu

USER LINE TIME IDLE PID COMMENT

hls - tty1 Jul 30 06:01 03:53 1821

sam + tty2 Jul 30 06:02 14:47 2235

max + ttyp3 Jul 30 14:56 . 14777 (bravo)

xargs

Converts standard input to command lines

xargs [options] [command]

The xargs utility is a convenient, efficient way to convert standard output of one command into

arguments for another command. This utility reads from standard input, keeps track of the maxi-

mum allowable length of a command line, and avoids exceeding that limit by repeating command as

needed. Finally xargs executes the constructed command line(s).

Arguments The command is the command line you want xargs to use as a base for the com-

mand it constructs. If you omit command, it defaults to echo. The xargs utility

appends to command the arguments it receives from standard input. If any argu-

ments should precede the arguments from standard input, you must include them as

part of command.

Options Options preceded by a double hyphen (––) work under Linux only. Except as noted,

options named with a single letter and preceded by a single hyphen work under

Linux and OS X.

–I [marker]

(replace) Allows you to place arguments from standard input anywhere within

command. All occurrences of marker in command for xargs are replaced by

the arguments generated from standard input of xargs. With this option, xargs

executes command for each input line. The –l (––max-lines) option is ignored

when you use this option.

––max-lines[=num]

–l [num]

(lowercase “l”) Executes command once for every num lines of input (num

defaults to 1). L

––max-args=num –n num

Executes command once for every num arguments in the input line.

––max-procs=num

–P num

Allows xargs to run up to num instances of command simultaneously.

(The default is 1, which runs commands sequentially.) This option may

improve the throughput if xargs is running on a multiprocessor system.

––interactive –p Prompts prior to each execution of command.

––no-run-if-empty

–r Causes xargs not to execute command if standard input is empty. Ordinarily

xargs executes command at least once, even if standard input includes only

blanks. L

Discussion The xargs utility reads arguments to command from standard input, interpreting each

whitespace-delimited string as a separate argument. It then constructs a command line

from command and a series of arguments. When the maximum command line length

would be exceeded by adding another argument, xargs runs the command line it has

built. If there is more input, xargs repeats the process of building a command line and

running it. This process continues until all input has been read.
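A short input makes the batching visible. In the following sketch the letters function is invented for this illustration; the first command relies on the default command (echo), and the second uses –n to force a new command line after every two arguments.

```shell
# letters prints five one-word lines for xargs to turn into arguments.
letters() { printf '%s\n' a b c d e; }

letters | xargs          # one echo receives all five arguments: a b c d e
letters | xargs -n 2     # three echo commands: "a b", "c d", and "e"
```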

Notes The xargs utility is often used as an efficient alternative to the –exec option of find

(page 689). If you call find with the –exec option to run a command, it runs the

command once for each file it processes. Each execution of the command creates a

new process, which can drain system resources when you are processing many files.

By accumulating as many arguments as possible, xargs can greatly reduce the num-

ber of processes needed. The first example in the “Examples” section shows how to

use xargs with find.

Examples To locate and remove all files with names ending in .o from the working directory

and its subdirectories, you can use the find –exec option:

$ find . -name \*.o -exec rm --force {} \;

This approach calls the rm utility once for each .o file that find locates. Each invoca-

tion of rm requires a new process. If a lot of .o files exist, the system must spend a

significant amount of time creating, starting, and then cleaning up these processes.

You can reduce the number of processes by allowing xargs to accumulate as many

filenames as possible before calling rm:

$ find . -name \*.o -print | xargs rm --force
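Because xargs splits its input at whitespace, the preceding pipeline mishandles filenames that contain SPACEs. One common workaround, sketched here, is to separate the names with null characters using the –print0 criterion of find and the –0 option of xargs. Both are available in the GNU and OS X versions of these utilities but may be missing from other implementations. The temporary directory and filenames are invented for this illustration.

```shell
# Build a scratch directory holding two .o files (one with a SPACE in its name).
workdir=$(mktemp -d)
touch "$workdir/plain.o" "$workdir/with space.o" "$workdir/keep.c"

# Null-separated names survive the trip through xargs intact.
find "$workdir" -name '*.o' -print0 | xargs -0 rm -f

remaining=$(ls "$workdir")
echo "$remaining"        # keep.c
rm -r "$workdir"         # clean up the scratch directory
```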

In the next example, the contents of all *.txt files located by find are searched for

lines containing the word login. All filenames that contain login are displayed by

grep.

$ find . -name \*.txt -print | xargs grep -w -l login

The next example shows how to use the –I option to cause xargs to embed standard

input within command instead of appending it to command. This option also

causes command to be executed each time a NEWLINE character is encountered in standard

input; –l (––max-lines) does not override this behavior.

$ cat names

Tom,

Dick,

and Harry

$ xargs echo "Hello," < names

Hello, Tom, Dick, and Harry

$ xargs -I xxx echo "Hello xxx. Join me for lunch?" < names

Hello Tom,. Join me for lunch?

Hello Dick,. Join me for lunch?

Hello and Harry. Join me for lunch?

The final example uses the same input file as the previous examples as well as the –n

(––max-args) and –l (––max-lines) options:

$ xargs -n 1 echo "Hi there" < names

Hi there Tom,

Hi there Dick,

Hi there and

Hi there Harry

$ xargs -l 2 echo "Hi there" < names

Hi there Tom, Dick,

Hi there and Harry

See page 833 for another example of the use of xargs.

A regular expression defines a set of one or more strings of

characters. A simple string of characters is a regular expression

that defines one string of characters: itself. A more complex

regular expression uses letters, numbers, and special characters

to define many different strings of characters. A regular

expression is said to match any string it defines.

This appendix describes the regular expressions used by ed,

vim, emacs, grep, awk/mawk/gawk, sed, Perl, and many other

utilities. Refer to page 517 for more information on Perl reg-

ular expressions. The regular expressions used in shell

ambiguous file references are different and are described in

“Filename Generation/Pathname Expansion” on page 136.

In This Appendix

Characters

Delimiters

Simple Strings

Special Characters

Rules

Bracketing Expressions

The Replacement String

Extended Regular Expressions

Regular Expressions

888 Appendix A Regular Expressions

Characters

As used in this appendix, a character is any character except a NEWLINE. Most characters

represent themselves within a regular expression. A special character, also

called a metacharacter, is one that does not represent itself. If you need to use a spe-

cial character to represent itself, you must quote it as explained on page 891.

Delimiters

A character called a delimiter usually marks the beginning and end of a regular

expression. The delimiter is always a special character for the regular expression it

delimits (that is, it does not represent itself but marks the beginning and end of the

expression). Although vim permits the use of other characters as a delimiter and grep

does not use delimiters at all, the regular expressions in this appendix use a forward

slash (/) as a delimiter. In some unambiguous cases, the second delimiter is not

required. For example, you can sometimes omit the second delimiter when it would

be followed immediately by RETURN.

Simple Strings

The most basic regular expression is a simple string that contains no special char-

acters except the delimiters. A simple string matches only itself (Table A-1). In

the examples in this appendix, the strings that are matched are underlined and

look like this.

Table A-1 Simple strings

Regular expression    Matches     Examples
/ring/                ring        ring, spring, ringing, stringing
/Thursday/            Thursday    Thursday, Thursday’s
/or not/              or not      or not, poor nothing

Special Characters

You can use special characters within a regular expression to cause the regular

expression to match more than one string. A regular expression that includes a

special character always matches the longest possible string, starting as far toward

the beginning (left) of the line as possible.

Periods

A period (.) matches any character (Table A-2).

Brackets

Brackets ([]) define a character class (see footnote 1) that matches any single character within

the brackets (Table A-3). If the first character following the left bracket is a caret

(^), the brackets define a character class that matches any single character not

within the brackets. You can use a hyphen to indicate a range of characters. Within

a character-class definition, backslashes and asterisks (described in the following

sections) lose their special meanings. A right bracket (appearing as a member of the

character class) can appear only as the first character following the left bracket. A

caret is special only if it is the first character following the left bracket. A dollar sign

is special only if it is followed immediately by the right bracket.

Table A-2 Periods

Regular expression    Matches                                         Examples
/ .alk/               All strings consisting of a SPACE followed by   will talk, may balk
                      any character followed by alk
/.ing/                All strings consisting of any character         sing song, ping,
                      preceding ing                                   before inglenook

Table A-3 Brackets

Regular expression    Matches                                         Examples
/[bB]ill/             Member of the character class b and B           bill, Bill, billed
                      followed by ill
/t[aeiou].k/          t followed by a lowercase vowel, any            talkative, stink, teak, tanker
                      character, and a k
/# [6-9]/             # followed by a SPACE and a member of the       # 60, # 8:, get # 9
                      character class 6 through 9
/[^a-zA-Z]/           Any character that is not a letter (ASCII       1, 7, @, ., }, Stop!
                      character set only)

1. GNU documentation calls these List Operators and defines Character Class operators as expressions

that match a predefined group of characters, such as all numbers (see Table V-32 on page 864).
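You can try character classes from the command line with grep, which displays each input line that contains a match. The words function and its sample data in the following sketch are invented for this illustration.

```shell
# words prints one sample string per line.
words() { printf '%s\n' bill Bill hill talk stink 'item 7'; }

words | grep '[bB]ill'       # bill, Bill
words | grep 't[aeiou].k'    # talk, stink
words | grep '[^a-zA-Z]'     # item 7 (it contains a SPACE and a digit)
```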

Asterisks

An asterisk can follow a regular expression that represents a single character

(Table A-4). The asterisk represents zero or more occurrences of a match of the

regular expression. An asterisk following a period matches any string of charac-

ters. (A period matches any character, and an asterisk matches zero or more

occurrences of the preceding regular expression.) A character-class definition

followed by an asterisk matches any string of characters that are members of the

character class.
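The following sketch uses the –o option of grep, which displays only the matched strings, one per line, to show what an asterisk matches. The –o option is available in the GNU and OS X versions of grep but is not required by POSIX. The star_demo function and its input are invented for this illustration.

```shell
# star_demo lists every string in the sample line that /ab*c/ matches.
star_demo() { echo 'ac abc abbc adc' | grep -o 'ab*c'; }

star_demo    # ac, abc, and abbc on separate lines; adc does not match
```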

Carets and Dollar Signs

A regular expression that begins with a caret (^) can match a string only at the

beginning of a line. In a similar manner, a dollar sign ($) at the end of a regular

expression matches the end of a line. The caret and dollar sign are called

anchors because they force (anchor) a match to the beginning or end of a line

(Table A-5).
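A brief sketch with grep shows the anchors at work. The anchor_lines function and its sample lines are invented for this illustration.

```shell
# anchor_lines prints four sample lines.
anchor_lines() { printf '%s\n' 'This line' 'In Time' 'see below:' 'a:b'; }

anchor_lines | grep '^T'    # This line (In Time has a T, but not at the beginning)
anchor_lines | grep ':$'    # see below: (the colon in a:b does not end the line)
```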

Table A-4 Asterisks

Regular expression    Matches                                         Examples
/ab*c/                a followed by zero or more b’s followed by      ac, abc, abbc, debbcaabbbc
                      a c
/ab.*c/               ab followed by zero or more characters          abc, abxc, ab45c,
                      followed by c                                   xab 756.345 x cat
/t.*ing/              t followed by zero or more characters           thing, ting, I thought of going
                      followed by ing
/[a-zA-Z ]*/          A string composed only of letters and           1. any string without
                      SPACEs                                          numbers or punctuation!
/(.*)/                As long a string as possible between ( and )    Get (this) and (that);
/([^)]*)/             The shortest string possible that starts        (this), Get (this and that)
                      with ( and ends with )

Table A-5 Carets and dollar signs

Regular expression    Matches                                         Examples
/^T/                  A T at the beginning of a line                  This line...,
                                                                      That Time...,
                                                                      In Time
/^+[0-9]/             A plus sign followed by a digit at the          +5 +45.72,
                      beginning of a line                             +759 Keep this...
/:$/                  A colon that ends a line                        ...below:

Quoting Special Characters

You can quote any special character (but not parentheses [except in Perl; page 521]

or a digit) by preceding it with a backslash (Table A-6). Quoting a special character

makes it represent itself.
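The following sketch contrasts a quoted period with an unquoted one and shows a quoted asterisk. The samples function and its data are invented for this illustration.

```shell
# samples prints four sample strings, one per line.
samples() { printf '%s\n' 'The end.' 'endless' 'and/or' 'a*b'; }

samples | grep 'end\.'   # The end. (the quoted period matches only a period)
samples | grep 'end.'    # The end., endless (the period matches any character)
samples | grep '\*'      # a*b (the quoted asterisk matches an asterisk)
```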

Rules

The following rules govern the application of regular expressions.

Longest Match Possible

A regular expression always matches the longest possible string, starting as far

toward the beginning of the line as possible. Perl calls this type of match a greedy

match (page 520). For example, given the string

This (rug) is not what it once was (a long time ago), is it?

the expression /Th.*is/ matches

This (rug) is not what it once was (a long time ago), is

and /(.*)/ matches

(rug) is not what it once was (a long time ago)

However, /([^)]*)/ matches

(rug)

Given the string

singing songs, singing more and more

the expression /s.*ing/ matches

singing songs, singing

and /s.*ing song/ matches

singing song
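You can reproduce these results with the –o option of grep (available in the GNU and OS X versions), which displays only the matched part of each line. This is a sketch for experimentation; the rug variable simply holds the sample string from the text.

```shell
rug='This (rug) is not what it once was (a long time ago), is it?'

# Longest match: runs from the first ( to the last ).
echo "$rug" | grep -o '(.*)'

# Excluding ) from the middle stops each match at the first ).
# sed -n 1p keeps only the first of the two matches grep reports.
echo "$rug" | grep -o '([^)]*)' | sed -n 1p
```

The first command displays (rug) is not what it once was (a long time ago) and the second displays (rug).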

Table A-6 Quoted special characters

Regular expression    Matches                                         Examples
/end\./               All strings that contain end followed by a      The end., send., pretend.mail
                      period
/\\/                  A single backslash                              \
/\*/                  An asterisk                                     *.c, an asterisk (*)
/\[5\]/               [5]                                             it was five [5]
/and\/or/             and/or                                          and/or

Empty Regular Expressions

Within some utilities, such as vim and less (but not grep), an empty regular expres-

sion represents the last regular expression you used. For example, suppose you give

vim the following Substitute command:

:s/mike/robert/

If you then want to make the same substitution again, you can use the following

command:

:s//robert/

Alternatively, you can use the following commands to search for the string mike and

then make the substitution

/mike/

:s//robert/

The empty regular expression (//) represents the last regular expression you used

(/mike/).
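The sed editor treats an empty regular expression the same way, so you can experiment with this rule from the command line. In the following sketch, which uses an invented input line, the address /mike/ selects the line and the empty search string in the Substitute command reuses that regular expression.

```shell
# empty_demo replaces the first mike on each line that contains mike.
empty_demo() { echo 'mike met mike' | sed '/mike/s//robert/'; }

empty_demo    # robert met mike
```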

Bracketing Expressions

You can use quoted parentheses, \( and \), to bracket a regular expression. (However,

Perl uses unquoted parentheses to bracket regular expressions; page 521.) The

string that the bracketed regular expression matches can be recalled, as explained in

“Quoted Digit.” A regular expression does not attempt to match quoted parentheses.

Thus a regular expression enclosed within quoted parentheses matches what

the same regular expression without the parentheses would match. The expression

/\(rexp\)/ matches what /rexp/ would match; /a\(b*\)c/ matches what /ab*c/

would match.

You can nest quoted parentheses. The bracketed expressions are identified only by

the opening \(, so no ambiguity arises in identifying them. The expression

/\([a-z]\([A-Z]*\)x\)/ consists of two bracketed expressions, one nested within the

other. In the string 3 t dMNORx7 l u, the preceding regular expression matches

dMNORx, with the first bracketed expression matching dMNORx and the second

matching MNOR.
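The following sketch uses sed to display what each bracketed expression matched in the sample string from the text. The replacement string, which is chosen only for this illustration, places the first match and the second match between square brackets, separated by a vertical bar.

```shell
# nested_demo marks what the outer (\1) and inner (\2) bracketed expressions matched.
nested_demo() {
    echo '3 t dMNORx7 l u' | sed 's/\([a-z]\([A-Z]*\)x\)/[\1|\2]/'
}

nested_demo    # 3 t [dMNORx|MNOR]7 l u
```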

The Replacement String

The vim and sed editors use regular expressions as search strings within Substitute

commands. You can use the ampersand (&) and quoted digits (\n) special charac-

ters to represent the matched strings within the corresponding replacement string.

Ampersand

Within a replacement string, an ampersand (&) takes on the value of the string that

the search string (regular expression) matched. For example, the following vim Sub-

stitute command surrounds a string of one or more digits with NN. The ampersand

in the replacement string matches whatever string of digits the regular expression

(search string) matched:

:s/[0-9][0-9]*/NN&NN/

Two character-class definitions are required because the regular expression [0–9]*

matches zero or more occurrences of a digit, and any character string constitutes

zero or more occurrences of a digit.
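The same Substitute command works in sed, which makes the need for two character-class definitions easy to see. The input line in the following sketch is invented for this illustration.

```shell
digits='room 42 floor 7'

# One or more digits: the first run of digits is wrapped.
echo "$digits" | sed 's/[0-9][0-9]*/NN&NN/'    # room NN42NN floor 7

# Zero or more digits: the empty string at the start of the line matches first.
echo "$digits" | sed 's/[0-9]*/NN&NN/'         # NNNNroom 42 floor 7
```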

Quoted Digit

Within the search string, a bracketed regular expression, \(xxx\) [(xxx) in Perl],

matches what the regular expression would have matched without the quoted

parentheses, xxx. Within the replacement string, a quoted digit, \n, represents the

string that the bracketed regular expression (portion of the search string) beginning

with the nth \( matched. (Perl accepts a quoted digit for this purpose, but Perl’s pre-

ferred style is to precede the digit with a dollar sign [$n; page 521]). For example,

you can take a list of people in the form

last-name, first-name initial

and put it in the form

first-name initial last-name

with the following vim command:

:1,$s/\([^,]*\), \(.*\)/\2 \1/

This command addresses all lines in the file (1,$). The Substitute command (s) uses

a search string and a replacement string delimited by forward slashes. The first

bracketed regular expression within the search string, \([^,]*\), matches what the

same unbracketed regular expression, [^,]*, would match: zero or more characters

not containing a comma (the last-name). Following the first bracketed regular

expression are a comma and a SPACE that match themselves. The second bracketed

expression, \(.*\), matches any string of characters (the first-name and initial).

The replacement string consists of what the second bracketed regular expression

matched (\2), followed by a SPACE and what the first bracketed regular expression

matched (\1).
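The same search and replacement strings work in sed, which applies a Substitute command to every line by default. The names in the following sketch are invented sample data.

```shell
# people prints names in the form last-name, first-name initial.
people() { printf '%s\n' 'Smith, Jane Q' 'Jones, Sam T'; }

people | sed 's/\([^,]*\), \(.*\)/\2 \1/'    # Jane Q Smith, Sam T Jones
```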

Extended Regular Expressions

This section covers patterns that use an extended set of special characters. These patterns

are called full regular expressions or extended regular expressions. In addition

to ordinary regular expressions, Perl and vim provide extended regular expressions.

The three utilities egrep, grep when run with the –E option (similar to egrep), and

mawk/gawk provide all the special characters included in ordinary regular expressions,

except for \( and \), as well as those included in extended regular expressions.

Two of the additional special characters are the plus sign (+) and the question mark

(?). They are similar to *, which matches zero or more occurrences of the previous

character. The plus sign matches one or more occurrences of the previous character,

whereas the question mark matches zero or one occurrence. You can use any one of

the special characters *, +, and ? following parentheses, causing the special
character to apply to the string surrounded by the parentheses. Unlike the parentheses in

bracketed regular expressions, these parentheses are not quoted (Table A-7).

In full regular expressions, the vertical bar (|) special character is a Boolean OR

operator. Within vim, you must quote the vertical bar by preceding it with a back-

slash to make it special (\|). A vertical bar between two regular expressions causes a

match with strings that match the first expression, the second expression, or both.

You can use the vertical bar with parentheses to separate from the rest of the regular

expression the two expressions that are being ORed (Table A-8).

Table A-7 Extended regular expressions

Regular expression   Matches                                          Examples
/ab+c/               a followed by one or more b's followed by a c    yabcw, abbc57
/ab?c/               a followed by zero or one b followed by c        back, abcdef
/(ab)+c/             One or more occurrences of the string ab         zabcd, ababc!
                     followed by c
/(ab)?c/             Zero or one occurrence of the string ab          xc, abcc
                     followed by c

Table A-8 Full regular expressions

Regular expression   Meaning                              Examples
/ab|ac/              Either ab or ac                      ab, ac, abac (abac is two matches of
                                                          the regular expression)
/^Exit|^Quit/        Lines that begin with Exit or Quit   Exit, Quit, No Exit (No Exit does
                                                          not match)
/(D|N)\. Jones/      D. Jones or N. Jones                 P.D. Jones, N. Jones
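You can check expressions such as those in Tables A-7 and A-8 from the command line. The following sketch assumes grep with the -E option, as described above; the sample lines are hypothetical and chosen to echo the examples in the tables.

```shell
# Hypothetical sample lines, one candidate string per line.
sample='yabcw
ac
ababc!
P.D. Jones
No Exit
Quit now'

# ab+c: an a, one or more b's, then a c.
plus=$(printf '%s\n' "$sample" | grep -E 'ab+c')

# (D|N)\. Jones: the vertical bar ORs the two alternatives in the parentheses.
either=$(printf '%s\n' "$sample" | grep -E '(D|N)\. Jones')

# ^Exit|^Quit: lines that begin with Exit or Quit.
anchored=$(printf '%s\n' "$sample" | grep -E '^Exit|^Quit')

printf 'ab+c:\n%s\n\n(D|N)\\. Jones:\n%s\n\n^Exit|^Quit:\n%s\n' \
    "$plus" "$either" "$anchored"
```

The parentheses and the vertical bar are not quoted here because grep -E treats them as special; within vim you would write \| instead.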


Appendix Summary

A regular expression defines a set of one or more strings of characters. A regular

expression is said to match any string it defines.

In a regular expression, a special character is one that does not represent itself.

Table A-9 lists special characters.

Table A-10 lists ways of representing character classes and bracketed regular

expressions.

In addition to the preceding special characters and strings (excluding quoted paren-

theses, except in vim), the characters in Table A-11 are special within full, or

extended, regular expressions.

Table A-9 Special characters

Character   Meaning
.           Matches any single character
*           Matches zero or more occurrences of a match of the preceding character
^           Forces a match to the beginning of a line
$           Forces a match to the end of a line
\           Quotes special characters
\<          Forces a match to the beginning of a word
\>          Forces a match to the end of a word

Table A-10 Character classes and bracketed regular expressions

Class     Defines
[xyz]     Defines a character class that matches x, y, or z
[^xyz]    Defines a character class that matches any character except x, y, or z
[x-z]     Defines a character class that matches any character x through z inclusive
\(xyz\)   Matches what xyz matches (a bracketed regular expression; not Perl)
(xyz)     Matches what xyz matches (a bracketed regular expression; Perl only)

Table A-11 Extended regular expressions

Expression   Matches
+            Matches one or more occurrences of the preceding character
?            Matches zero or one occurrence of the preceding character
(xyz)+       Matches one or more occurrences of what xyz matches
(xyz)?       Matches zero or one occurrence of what xyz matches
(xyz)*       Matches zero or more occurrences of what xyz matches
xyz|abc      Matches either what xyz or what abc matches (use \| in vim)
(xy|ab)c     Matches either what xyc or what abc matches (use \| in vim)

Table A-12 lists characters that are special within a replacement string in sed, vim,
and Perl.

Table A-12 Replacement strings

String   Represents
&        Represents what the regular expression (search string) matched
\n       A quoted number, n, represents what the nth bracketed regular expression in
         the search string matched
$n       A number preceded by a dollar sign, n, represents what the nth bracketed
         regular expression in the search string matched (Perl only)

Appendix B Help

You need not be a user or system administrator in isolation. A
large community of Linux and OS X experts is willing to assist
you in learning about, helping you solve problems with, and
getting the most out of a Linux or OS X system. Before you
ask for help, however, make sure you have done everything
you can to solve the problem yourself. No doubt, someone has
experienced the same problem before you and the answer to
your question exists somewhere on the Internet. Your job is to
find it. This appendix lists resources and describes methods
that can help you in that task.

In This Appendix

Solving a Problem                            898
The Apple Web Site                           899
Finding Linux and OS X–Related Information   899
Documentation                                900
Useful Linux and OS X Sites                  901
Linux and OS X Newsgroups                    902
Mailing Lists                                903
Words                                        903
Software                                     904
Office Suites and Word Processors            906
Specifying a Terminal                        906

Solving a Problem

Following is a list of steps that can help you solve a problem without asking someone

for help. Depending on your understanding of and experience with the hardware and

software involved, these steps may lead to a solution.

1. Linux and OS X come with extensive documentation. Read the documen-

tation on the specific hardware or software you are having a problem

with. If it is a GNU product, use info; otherwise, use man to find local

information. Also look in /usr/share/doc for documentation on specific

tools. For more information refer to “Where to Find Documentation” on

page 33.

2. When the problem involves some type of error or other message, use a

search engine, such as Google (www.google.com) or Google Groups

(groups.google.com), to look up the message on the Web. If the message is

long, pick a unique part of the message to search for; 10 to 20 characters

should be enough. Enclose the search string within double quotation

marks. See “Using the Internet to Get Help” on page 39 for an example of

this kind of search.

3. Check whether the Linux Documentation Project (www.tldp.org) has

a HOWTO or mini-HOWTO on the subject in question. Search its

site for keywords that relate directly to the product and problem.

Read the FAQs.

4. See Table B-1 on page 900 for other sources of documentation.

5. Use Google or Google Groups to search on keywords that relate directly

to the product and problem.

6. When all else fails (or perhaps before you try anything else), examine the

system logs in /var/log. First look at the end of the messages (Linux) or

system.log (OS X) file using one of the following commands:

# tail -n 20 /var/log/messages

# tail -n 20 /var/log/system.log

If messages or system.log contain nothing useful, run the following com-

mand. It displays the names of the log files in chronological order, with the

most recently modified files appearing at the bottom of the list:

$ ls -ltr /var/log

Look at the files at the bottom of the list first. If the problem involves a

network connection, review the secure or auth.log file (some systems use a

different name) on the local and remote systems. Also look at messages or

system.log on the remote system.


7. The /var/spool directory contains subdirectories with useful information:

cups holds the print queues; postfix, mail, or exim4 holds the user’s mail

files. Other subdirectories of /var contain similarly useful files.
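The log-checking step in the list above can be sketched as a pair of shell functions. This is only an illustration: the function names newest_logs and show_latest_log are hypothetical, not standard utilities, and the sketch assumes log filenames without embedded NEWLINEs.

```shell
# Print the names of the n most recently modified entries in a directory,
# oldest first, so the newest name is on the last line (as with ls -ltr).
newest_logs() {
    ls -tr "$1" | tail -n "$2"
}

# Print the last 20 lines of the most recently modified file in a directory.
# Defaults to /var/log; reading some logs may require root privileges.
show_latest_log() {
    dir=${1:-/var/log}
    latest=$(newest_logs "$dir" 1)
    tail -n 20 "$dir/$latest"
}
```

After defining the functions, give the command newest_logs /var/log 5 to list the five most recently modified logs, or show_latest_log to display the end of the newest one.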

If you are unable to solve a problem yourself, a thoughtful question to an appro-

priate newsgroup (page 902) or mailing list (page 903) can often elicit useful

information. When you send or post a question, make sure you describe the prob-

lem and identify the local system carefully. Include the version numbers of the

operating system and any software packages that relate to the problem. Describe

the hardware, if appropriate. There is an etiquette to posting questions—see

www.catb.org/~esr/faqs/smart-questions.html for a good paper by Eric S. Ray-

mond and Rick Moen titled “How To Ask Questions the Smart Way.”

The author’s home page (www.sobell.com) contains corrections to this book,

answers to selected chapter exercises, and pointers to other Linux and OS X sites.

The Apple Web Site

The Apple Web site is a rich source of Mac OS X information. Following is a list of

some locations that may be of interest:

• Manuals for Apple operating systems and products are available at

www.info.apple.com/support/manuals.html.

• General support material is available at www.apple.com/support.

• Apple support forums are online discussions about Apple-related issues.

Forums are dedicated to specific applications or OS versions, specific

Macintosh models, and so on. Go to discussions.info.apple.com to browse

through the forums.

• Mailing lists are hosted by Apple; see lists.apple.com for a list of mailing lists.

• Even if you are not using Mac OS X Server, the documentation for Server

has a lot of material on OS X command-line usage.

Finding Linux and OS X–Related Information

Linux and OS X come with reference pages stored online. You can read these docu-

ments by using the man (page 33) or info (page 36) utility. Under Mac OS X, you can

use the documentation browser in Xcode or the Mac Help application. You can read

man and info pages to get more information about specific topics while reading this

book or to determine which features are available with Linux or OS X. To search for

topics, use the apropos utility (see page 35 or give the command man apropos). The

Apple support site (www.apple.com/support) has many useful Mac OS X links.


Documentation

Good books are available on various aspects of using and managing UNIX systems

in general and Linux and OS X systems in particular. In addition, you may find the

sites listed in Table B-1 useful. The right-hand columns of most of the tables in this
appendix show Internet addresses (URLs). All sites have an implicit http:// prefix
unless ftp:// or https:// is shown. Refer to “URLs (Web addresses)” on page 25.

Table B-1 Documentation

Site              About the site                          URL
Apple Manuals     Apple manuals (look under Software      www.apple.com/support/manuals
                  Manuals/Mac OS X in a box on the
                  right)
freedesktop.org   Creates standards for interoperability  freedesktop.org
                  between open-source desktop
                  environments
GNOME             GNOME home page                         www.gnome.org
GNU Manuals       GNU manuals                             www.gnu.org/manual
info              Instructions for using the info         www.gnu.org/software/texinfo/manual/info
                  utility

Internet FAQ

directory hierarchy A directory, called the root of the directory hierarchy, and all the
directory and ordinary files below it (its children).

directory service A structured repository of information on people and resources within
an organization, facilitating management and communication. (FOLDOC)

disk partition See partition on page 970.

diskless A computer, usually a workstation, that has no disk and must contact another com-

puter (a server) to boot a copy of the operating system and access the necessary sys-

tem files.

distributed computing A style of computing in which tasks or services are performed by a
network of cooperating systems, some of which may be specialized.

DMZ Demilitarized zone. A host or small network that is a neutral zone between a LAN

and the Internet. It can serve Web pages and other data to the Internet and allow

local systems access to the Internet while preventing LAN access to unauthorized

Internet users. Even if a DMZ is compromised, it holds no data that is private and

none that cannot be easily reproduced.

DNS Domain Name Service. A distributed service that manages the correspondence of

full hostnames (those that include a domain name) to IP addresses and other system

characteristics.

DNS domain name See domain name on page 952.

952 Glossary

document object model See DOM.

DOM Document Object Model. A platform-/language-independent interface that enables

a program to update the content, structure, and style of a document dynamically.

The changes can then be made part of the displayed document. Go to

www.w3.org/DOM for more information.

domain name A name associated with an organization, or part of an organization, to help identify

systems uniquely. Technically, the part of the FQDN (page 955) to the right of the

leftmost period. Domain names are assigned hierarchically. The domain

berkeley.edu refers to the University of California at Berkeley, for example; it is part

of the top-level edu (education) domain. Also DNS domain name. Different from
NIS domain name (page 968).

Domain Name Service See DNS.

door An evolving filesystem-based RPC (page 976) mechanism.

DoS attack Denial of service attack. An attack that attempts to make the target host or network

unusable by flooding it with spurious traffic.

DPMS Display Power Management Signaling. A standard that can extend the life of CRT

monitors and conserve energy. DPMS supports four modes for a monitor: Normal,

Standby (power supply on, monitor ready to display images almost

instantly), Suspend (power supply off, monitor takes up to ten seconds to display an

image), and Off.

drag The motion part of drag-and-drop.

drag-and-drop To move an object from one position or application to another within a GUI. To

drag an object, the user clicks a mouse button (typically the left one) while the

mouse pointer hovers (page 958) over the object. Then, without releasing the mouse

button, the user drags the object, which stays attached to the mouse pointer, to a

different location. The user can then drop the object at the new location by releas-

ing the mouse button.

drop-down list A widget (page 987) that displays a static list for a user to choose from. When the

list is not active, it appears as text in a box, displaying the single selected entry.

When a user clicks the box, a list appears; the user can move the mouse cursor to

select an entry from the list. Different from a list box (page 963).

druid In role-playing games, a character that represents a magical user. Red Hat uses the

term druid at the ends of names of programs that guide you through a task-driven

chain of steps. Other operating systems call these types of programs wizards.

DSA Digital Signature Algorithm. A public key cipher used to generate digital signatures.


DSL Digital Subscriber Line/Loop. Provides high-speed digital communication over a

specialized, conditioned telephone line. See also xDSL (page 988).

Dynamic Host Configuration Protocol See DHCP on page 951.

editor A utility, such as vim or emacs, that creates and modifies text files.

EEPROM Electrically erasable, programmable, readonly memory. A PROM (page 972) that

can be written to.

effective user ID The user ID that a process appears to have; usually the same as the
user ID. For example, while you are running a setuid program, the effective user ID of
the process running the program is that of the owner of the program.

element One thing; usually a basic part of a group of things. An element of a numeric array

is one of the numbers stored in the array.

emoticon See smiley on page 978.

encapsulation See tunneling on page 984.

environment See calling environment on page 945.

EOF End of file.

EPROM Erasable programmable readonly memory. A PROM (page 972) that can be written

to by applying a higher than normal voltage.

escape See quote on page 973.

Ethernet A type of LAN (page 962) capable of transfer rates as high as 1,000 megabits per

second.

event An occurrence, or happening, of significance to a task or program—for example,

the completion of an asynchronous input/output operation, such as a keypress or

mouse click. (FOLDOC)

exabyte 2^60 bytes or about 10^18 bytes. See also large number (page 962).

exit status The status returned by a process; either successful (usually 0) or unsuccessful (usu-

ally 1).
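As a brief illustration, a shell stores the exit status of the most recently executed command in the $? parameter. The sketch below uses the true and false utilities, which do nothing except succeed and fail, respectively; the variable names are arbitrary.

```shell
# true always succeeds; record its exit status.
true && ok_status=$?

# false always fails; record its exit status.
false || fail_status=$?

# ok_status is 0; fail_status is nonzero (1 on most systems).
printf 'true: %s, false: %s\n' "$ok_status" "$fail_status"
```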

exploit A security hole or an instance of taking advantage of a security hole. (FOLDOC)

expression See logical expression (page 964) and arithmetic expression (page 941).

extranet A network extension for a subset of users (such as students at a particular school or

engineers working for the same company). An extranet limits access to private

information even though it travels on the public Internet.


failsafe session A session that allows you to log in on a minimal desktop in case your standard login

does not work well enough to allow you to log in to fix a login problem.

FDDI Fiber Distributed Data Interface. A type of LAN (page 962) designed to transport

data at the rate of 100 million bits per second over fiberoptic cable.

file A collection of related information referred to with a filename and frequently stored

on a disk. Text files typically contain memos, reports, messages, program source

code, lists, or manuscripts. Binary or executable files contain utilities or programs

that you can run. Refer to “Directory Files and Ordinary Files” on page 78.

filename The name of a file. A filename refers to a file.

filename completion Automatic completion of a filename after you specify a unique prefix.

filename extension The part of a filename following a period.

filename generation What occurs when the shell expands ambiguous file references. See
ambiguous file reference on page 940.

filesystem A data structure (page 950) that usually resides on part of a disk. All Linux systems

have a root filesystem, and many have other filesystems. Each filesystem is com-

posed of some number of blocks, depending on the size of the disk partition that

has been assigned to the filesystem. Each filesystem has a control block, named the

superblock, that contains information about the filesystem. The other blocks in a

filesystem are inodes, which contain control information about individual files, and

data blocks, which contain the information in the files.

filling A variant of maximizing in which window edges are pushed out as far as they can

go without overlapping another window.

filter A command that can take its input from standard input and send its output to stan-

dard output. A filter transforms the input stream of data and sends it to standard

output. A pipe usually connects a filter’s input to standard output of one command,

and a second pipe connects the filter’s output to standard input of another com-

mand. The grep and sort utilities are commonly used as filters.
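For instance, the following sketch connects grep and sort with pipes so that each acts as a filter. The word list is hypothetical sample input.

```shell
# Hypothetical input: a short, unsorted word list.
words='pear
apple
grape
apricot'

# grep acts as a filter, passing along only lines that begin with "a";
# sort is a second filter that reads grep's output from a pipe.
result=$(printf '%s\n' "$words" | grep '^a' | sort)

printf '%s\n' "$result"
```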

firewall A device for policy-based traffic management used to keep a network secure. A fire-

wall can be implemented in a single router that filters out unwanted packets, or it

can rely on a combination of routers, proxy servers, and other devices. Firewalls are

widely used to give users access to the Internet in a secure fashion and to separate a

company’s public WWW server from its internal network. They are also employed

to keep internal network segments more secure.

Recently the term has come to be defined more loosely to include a simple packet

filter running on an endpoint machine.

directory

The directory that is the working directory when you first log in. The pathname of

this directory is stored in the HOME shell variable.

hover To leave the mouse pointer stationary for a moment over an object. In many cases

hovering displays a tooltip (page 984).

HTML Hypertext Markup Language. A hypertext document format used on the World

Wide Web. Tags, which are embedded in the text, consist of a less than sign (<), a

directive, zero or more parameters, and a greater than sign (>). Matched pairs of

directives, such as <TITLE> and </TITLE>, delimit text that is to appear in a special

place or style. (FOLDOC) For more information on HTML, go to
www.htmlhelp.com/faq/html/all.html.

HTTP Hypertext Transfer Protocol. The client/server TCP/IP protocol used on the World

Wide Web for the exchange of HTML documents.

hub A multiport repeater. A hub rebroadcasts all packets it receives on all ports. This

term is frequently used to refer to small hubs and switches, regardless of the device’s

intelligence. It is a generic term for a layer 2 shared-media networking device.

Today the term hub is sometimes used to refer to small intelligent devices, although

that was not its original meaning. Contrast with network switch (page 968).

Table G-1 Decimal, octal, and hexadecimal numbers

Decimal   Octal   Hex      Decimal   Octal   Hex
1         1       1        17        21      11
2         2       2        18        22      12
3         3       3        19        23      13
4         4       4        20        24      14
5         5       5        21        25      15
6         6       6        31        37      1F
7         7       7        32        40      20
8         10      8        33        41      21
9         11      9        64        100     40
10        12      A        96        140     60
11        13      B        100       144     64
12        14      C        128       200     80
13        15      D        254       376     FE
14        16      E        255       377     FF
15        17      F        256       400     100
16        20      10       257       401     101
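You can reproduce rows of this table with the printf utility. The following is a small sketch; to_bases is a hypothetical helper name, not a standard command.

```shell
# Print a decimal number followed by its octal and hexadecimal forms.
# %o converts to octal and %X converts to uppercase hexadecimal.
to_bases() {
    printf '%d %o %X\n' "$1" "$1" "$1"
}

# Display a few of the rows that appear in the table.
for n in 8 17 255 256; do
    to_bases "$n"
done
```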


hypertext A collection of documents/nodes containing (usually highlighted or underlined)

cross-references or links, which, with the aid of an interactive browser program,

allow the reader to move easily from one document to another. (FOLDOC)

Hypertext Markup Language See HTML.

Hypertext Transfer Protocol See HTTP.

i/o device Input/output device. See device on page 951.

IANA Internet Assigned Numbers Authority. A group that maintains a database of all per-

manent, registered system services (www.iana.org).

ICMP Internet Control Message Protocol. A type of network packet that carries only mes-

sages, no data.

icon In a GUI, a small picture representing a file, directory, action, program, and so on.

When you click an icon, an action, such as opening a window and starting a pro-

gram or displaying a directory or Web site, takes place. From miniature religious

statues. (FOLDOC)

iconify The process of changing a window into an icon. Contrast with restore (page 975).

ignored window A state in which a window has no decoration and therefore no buttons or
titlebar to control it with.

indentation See indention.

indention The blank space between the margin and the beginning of a line that is set in from

the margin.

inode A data structure (page 950) that contains information about a file. An inode for a

file contains the file’s length, the times the file was last accessed and modified, the

time the inode was last modified, owner and group IDs, access privileges, number of

links, and pointers to the data blocks that contain the file itself. Each directory entry

associates a filename with an inode. Although a single file may have several file-

names (one for each link), it has only one inode.
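The following sketch shows two filenames sharing one inode. It assumes a filesystem that supports hard links; the scratch directory and the filenames original and alias are hypothetical.

```shell
# Create a scratch directory holding one file and a second (hard) link to it.
tmp=$(mktemp -d)
printf 'some data\n' > "$tmp/original"
ln "$tmp/original" "$tmp/alias"

# ls -i prints the inode number before each filename.
inode_a=$(ls -i "$tmp/original" | awk '{print $1}')
inode_b=$(ls -i "$tmp/alias" | awk '{print $1}')

# Both names report the same inode number.
printf 'original: %s\nalias:    %s\n' "$inode_a" "$inode_b"

rm -rf "$tmp"
```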

input Information that is fed to a program from a terminal or other file. See standard

input on page 980.

installation A computer at a specific location. Some aspects of the Linux system are installation

dependent. Also site.

interactive A program that allows ongoing dialog with the user. When you give commands in

response to shell prompts, you are using the shell interactively. Also, when you give

commands to utilities, such as vim and mail, you are using the utilities interactively.


interface The meeting point of two subsystems. When two programs work together, their

interface includes every aspect of either program that the other deals with. The user

interface (page 985) of a program includes every program aspect the user comes

into contact with: the syntax and semantics involved in invoking the program, the

input and output of the program, and its error and informational messages. The

shell and each of the utilities and built-in commands have a user interface.

International Organization for Standardization See ISO on page 961.

internet A large network that encompasses other, smaller networks.

Internet The largest internet in the world. The Internet (uppercase “I”) is a multilevel

hierarchy composed of backbone networks (ARPANET, NSFNET, MILNET, and

others), midlevel networks, and stub networks. These include commercial (.com

or .co), university (.ac or .edu), research (.org or .net), and military (.mil) net-

works and span many different physical networks around the world with various

protocols, including the Internet Protocol (IP). Outside the United States, country

code domains are popular (.us, .es, .mx, .de, and so forth), although you will see

them used within the United States as well.

Internet Protocol See IP.

Internet service provider See ISP.

intranet An inhouse network designed to serve a group of people such as a corporation or

school. The general public on the Internet does not have access to the intranet.

invisible file See hidden filename on page 957.

IP Internet Protocol. The network layer for TCP/IP. IP is a best-effort, packet-

switching, connectionless protocol (page 948) that provides packet routing, frag-

mentation, and reassembly through the data link layer. IPv4 is slowly giving way

to IPv6. (FOLDOC)

IP address Internet Protocol address. A four-part address associated with a particular network

connection for a system using the Internet Protocol (IP). A system that is attached to

multiple networks that use the IP will have a different IP address for each network

interface.

IP multicast See multicast on page 967.


IP spoofing A technique used to gain unauthorized access to a computer. The would-be intruder

sends messages to the target machine. These messages contain an IP address indicat-

ing that the messages are coming from a trusted host. The target machine responds

to the messages, giving the intruder (privileged) access to the target.

IPC Interprocess communication. A method to communicate specific information

between programs.

IPv4 IP version 4. See IP and IPv6.

IPv6 IP version 6. The next generation of Internet Protocol, which provides a much

larger address space (128-bit addresses, or 2^128 possible addresses, versus 32-bit
addresses, or 2^32, for IPv4) that is designed to accommodate the rapidly growing
number of Internet addressable devices. IPv6 also has

built-in autoconfiguration, enhanced security, better multicast support, and many

other features.

ISDN Integrated Services Digital Network. A set of communications standards that allows

a single pair of digital or standard telephone wires to carry voice, data, and video at

a rate of 64 kilobits per second.

ISO International Organization for Standardization. A voluntary, nontreaty organiza-

tion founded in 1946. It is responsible for creating international standards in many

areas, including computers and communications. Its members are the national stan-

dards organizations of 89 countries, including the American National Standards

Institute. (FOLDOC)

ISO9660 The ISO standard defining a filesystem for CD-ROMs.

ISP Internet service provider. Provides Internet access to its customers.

job control A facility that enables you to move commands from the foreground to the back-

ground and vice versa. Job control enables you to stop commands temporarily.

journaling filesystem A filesystem that maintains a noncached log file, or journal, which
records all transactions involving the filesystem. When a transaction is complete, it is
marked as complete in the log file.

The log file results in greatly reduced time spent recovering a filesystem after a

crash, making it particularly valuable in systems where high availability is an issue.

JPEG Joint Photographic Experts Group. This committee designed the standard image-

compression algorithm. JPEG is intended for compressing either full-color or gray-scale

digital images of natural, real-world scenes and does not work as well on nonrealistic

images, such as cartoons or line drawings. Filename extensions: .jpg, .jpeg. (FOLDOC)

justify To expand a line of type in the process of formatting text. A justified line has even

margins. A line is justified by increasing the space between words and sometimes

between letters on the line.


Kerberos An MIT-developed security system that authenticates users and machines. It does

not provide authorization to services or databases; it establishes identity at logon,

which is used throughout the session. Once you are authenticated, you can open as

many terminals, windows, services, or other network accesses as you like until your

session expires.

kernel The part of the operating system that allocates machine resources, including mem-

ory, disk space, and CPU (page 949) cycles, to all other programs that run on a

computer. The kernel includes the low-level hardware interfaces (drivers) and man-

ages processes (page 972), the means by which Linux executes programs. The ker-

nel is the part of the Linux system that Linus Torvalds originally wrote (see the

beginning of Chapter 1).

kernelspace The part of memory (RAM) where the kernel resides. Code running in kernelspace

has full access to hardware and all other processes in memory. See the

KernelAnalysis-HOWTO.

key binding A keyboard key is said to be bound to the action that results from pressing it.
Typically keys are bound to the letters that appear on the keycaps: When you press A, an

A appears on the screen. Key binding usually refers to what happens when you

press a combination of keys, one of which is CONTROL, ALT, META, or SHIFT, or when you

press a series of keys, the first of which is typically ESCAPE.

keyboard A hardware input device consisting of a number of mechanical buttons (keys) that

the user presses to input characters to a computer. By default a keyboard is
connected to standard input of a shell. (FOLDOC)

kilo- In the binary system, the prefix kilo- multiplies by 2^10 (i.e., 1,024). Kilobit and kilobyte

are common uses of this prefix. Abbreviated as k.

Korn Shell ksh. A command processor, developed by David Korn at AT&T Bell Laboratories,

that is compatible with the Bourne Shell but includes many extensions. See also

shell on page 978.

LAN Local area network. A network that connects computers within a localized area

(such as a single site, building, or department).

large number Visit mathworld.wolfram.com/LargeNumber.html for a comprehensive list.

LDAP Lightweight Directory Access Protocol. A simple protocol for accessing online

directory services. LDAP is a lightweight alternative to the X.500 Directory Access

Protocol (DAP). It can be used to access information about people, system users,

network devices, email directories, and systems. In some cases, it can be used as an

alternative for services such as NIS. Given a name, many mail clients can use LDAP

to discover the corresponding email address. See directory service on page 951.

leaf In a tree structure, the end of a branch that cannot support other branches. When

the Linux filesystem hierarchy is conceptualized as a tree, files that are not directo-

ries are leaves. See node on page 969.


least privilege, concept of Mistakes made by a user working with root privileges can be
much more devastating than those made by an ordinary user. When you are working on the
computer, especially when you are working as the system administrator, always perform
any task using the least privilege possible. If you can perform a task logged in as an
ordinary user, do so. If you must work with root privileges, do as much as you can as an
ordinary user, log in as root or give an su or sudo command so you are working with root
privileges, do the part of the task that has to be done with root privileges, and revert
to being an ordinary user as soon as you can.

Because you are more likely to make a mistake when you are rushing, this concept

becomes more important when you have less time to apply it.
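
A script can apply the same idea by checking whether it is running with root privileges before it goes on. The following is a minimal sketch, assuming the conventional root UID of 0; the function name is_root is hypothetical and not a standard utility.

```shell
# is_root [UID]: succeed when the given UID (default: the current
# effective UID reported by id -u) belongs to root, whose UID is 0.
# "is_root" is a hypothetical helper name, not a standard utility.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

if is_root; then
    echo "Working with root privileges: finish the task, then exit." >&2
else
    echo "Working as an ordinary user."
fi
```

The optional argument exists only so the check can be exercised without actually becoming root.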

working

directory

The directory that you are associated with at any given time. The relative path-

names you use are relative to the working directory. Also current directory.

workspace A subdivision of a desktop (page 950) that occupies the entire display.

workstation A small computer, typically designed to fit in an office and be used by one person

and usually equipped with a bit-mapped graphical display, keyboard, and mouse.

Differentiated from a terminal (page 983) by its intelligence. A workstation runs

Linux on itself while a terminal connects to a computer that runs Linux.

worm A program that propagates itself over a network, reproducing itself as it goes.

Today the term has negative connotations, as it is assumed that only crackers

(page 949) write worms. Compare to virus (page 986) and Trojan horse

(page 984). From Tapeworm in John Brunner’s novel, The Shockwave Rider,

Ballantine Books, 1990 (via XEROX PARC). FOLDOC

WYSIWYG What You See Is What You Get. A graphical application, such as a word processor,

whose display is similar to its printed output.

X server The X server is the part of the X Window System that runs the mouse, keyboard,

and display. (The application program is the client.)

X terminal A graphics terminal designed to run the X Window System.

X Window

System

A design and set of tools for writing flexible, portable windowing applications, cre-

ated jointly by researchers at MIT and several leading computer manufacturers.

XDMCP X Display Manager Control Protocol. XDMCP allows the login server to accept

requests from network displays. XDMCP is built into many X terminals.

xDSL Different types of DSL (page 953) are identified by a prefix, for example, ADSL,

HDSL, SDSL, and VDSL.

Xinerama An extension to X.org. Xinerama allows window managers and applications to

use two or more physical displays as one large virtual display. Refer to the

Xinerama-HOWTO.

XML Extensible Markup Language. A universal format for structured documents and

data on the Web. Developed by W3C (page 986), XML is a pared-down version of

SGML. See www.w3.org/XML and www.w3.org/XML/1999/XML-in-10-points.

XSM X Session Manager. This program allows you to create a session that includes cer-

tain applications. While the session is running, you can perform a checkpoint (saves

the application state) or a shutdown (saves the state and exits from the session).

When you log back in, you can load your session so that everything in your session

is running just as it was when you logged off.

Z Shell zsh. A shell (page 978) that incorporates many of the features of the Bourne Again

Shell (page 944), Korn Shell (page 962), and TC Shell (page 982), as well as many

original features.

Zulu time See UTC on page 986.

File Tree Index

A light page number such as 456 indicates a brief mention.

Symbols

/bin 91

echo 439

/boot 91

/dev 91, 124

kmem (OS X) 673

null 131, 417

random 659

tty 452

urandom 659

/etc 92

apt

sources.list 920

aspell.conf 608

at.allow 612

at.deny 612

bashrc 272

cron.allow 650

cron.daily 650

cron.deny 650

cron.hourly 650

cron.monthly 650

cron.weekly 650

crontab 650

csh.cshrc 352

csh.login 352

csh.logout 353

DIR_COLORS 749

fstab 100

group 526, 926

hosts.equiv 801, 803, 807

issue 26

man.conf 760

man.config 760

manpath.config 760

motd 27

opt 92

passwd 351, 926

printcap 742

profile 272

ssh_known_hosts 829

termcap 982

X11 92

yum.conf 914

yum.repos.d 915

/home 92

/lib 92, 713

modules 92

/lost+found 702

/mnt 92

/opt 92

/proc 92

/root 92

/sbin 92

/sys 92

/tmp 92, 427

/Users (OS X) 92

/usr 92

bin 92

test 400

games 92

include 92

lib 92, 713

cron (OS X) 650

terminfo 983

local 92

ascii 941

sbin 93

share 93

doc 39, 93, 898

info 93

magic 964

man 93

src 93

Documentation 39

/var 93

at (OS X) 612

at.allow (OS X) 612

at.deny (OS X) 612

cron.allow (OS X) 650

cron.deny (OS X) 650

jobs (OS X) 612

spool (OS X) 612

dslocal (OS X) 674

log 93, 898

auth.log 898

messages 673, 898

system.log (OS X) 898

name 299

spool 93, 899

at 612

spool 612

cron 649

atjobs 612

atspool 612

crontabs 649

/Volumes (OS X) 928

~/ (a user’s home directory)

.aspell 607

.bash_history 309

.bash_login 272

.bash_logout 272

.bash_profile 184, 185, 272, 309, 329

.bashrc 272

.cshrc 949

.dir_colors 749

.emacs 206, 207, 250, 253

.inputrc 321

.login 352, 964

.logout 353, 964

.netrc 707

.nofinger 696

.pgpkey 69, 695

.plan 68, 695

.profile 272, 972

.project 69, 695

.rhosts 801, 803, 807

.ssh

known_hosts 829

.tcshrc 184, 185, 352

.toprc 860

.vimrc 184, 185

Utility Index

A light page number such as 456 indicates a brief mention. Page numbers followed by the letter t

refer to tables.

Symbols

% 387

. (dot) 274, 451

: (null) 455

: (null) builtin 446

@ 366, 368, 370, 387

[[...]] 462

alias 324, 357, 387

alloc 387

apropos 35

apt-get 916–920

aspell 415, 607–610

at 611–614

atq 611–614

atrm 611–614

atrun (OS X) 612

autoconf 639

awk see gawk

basename 426

bash see bash in the Main index (page 995)

batch 611–614

bg 135, 287, 387

bind 322

bindkey 363, 364, 387

btdownloadcurses 922

btshowmetainfo 923

builtins 459t

builtins 388

bunzip2 61, 615–616

bzcat 61, 615–616

bzip2 60, 615–616

bzip2recover 62, 615–616

cal 617

cat 48, 125, 127, 404, 618–619

cd 87, 109, 302, 388, 620–621

chdir 388

chgrp 622–623

chmod 94, 96, 279, 626–630

chmod (OS X) 933–935

chown 631–633

chsh 271, 351

cmp 634–635

col 761

comm 636–637

compress 62, 81, 725, 848

configure 638–639

cp 49, 90, 640–643

cpio 644–648, 689, 726

cron 649–651

crontab 649–651

cut 339, 652–654

dash 270

date 58, 353, 443, 655–657

dd 658–660

declare 295–296, 435

df 661–662

diff 54, 663–667

diff3 664

dircolors 749

dirs 288, 388

disktool (OS X) 669

diskutil (OS X) 668–670

ditto (OS X) 671–672

dmesg 673

dscl (OS X) 674–676

du 677–679

e2fsck 701

echo 57, 350, 388, 424, 439, 680–681

echo builtin 424t

ed 52, 150, 663

egrep 719, 720

elvis 150

emacs 205–262

env 329

eval 329, 388

ex 150, 159

exec 385, 388, 431, 450–453

exit 352, 377, 388, 402, 440

export 296, 436

expr 682–685

false 466

fc 310–313

fdformat 770

fg 135, 388

fgrep 719, 721

file 56, 686–687

filetest 380, 388

find 360, 410, 688–693

finger 68, 70t, 695–696

fmt 697–698

fromdos 59

fsaclctl (OS X) 933

fsck 699–703

fsck (OS X) 669

ftp 704–709

g++ 712–716

gawk 414, 532–564

gcc 712–716

gdb 713

getfacl 100–104

GetFileInfo (OS X) 717–718

getopts 456–459

getty 306

glob 389

grep 52, 417, 429, 486, 719–723

gunzip 62, 724

gzip 62, 724–726, 848

hashstat 389

head 52, 727–728

history 308, 309, 354, 389

hostname 49

info 36–38

init 306

ispell see aspell

jobs 30, 136, 285, 286, 389

kill 30, 136, 389, 453, 454, 456, 729–730, 799

killall 731–732

launchctl (OS X) 613, 733–734

ldd 782

less 34, 48, 404, 735–738

let 339, 460

limit 389

ln 106, 109, 740–741

locate 66

log 389

logout 352, 377, 389

lpq 51, 742–744

lpr 51, 132, 742–744

lprm 51, 742–744

lpstat 51

ls 47, 93, 745–751

ls–F 377, 389

make 524, 753–758

man 33–35, 759–762

mawk see gawk

mcd 767–770

mcopy 767–770

mdel 767–770

mdir 767–770

mesg 71

mformat 767–770

mingetty 306

mkdir 86–87, 763

mkfs 764–766

mklost+found 702

more 48, 736

Mtools 767–770

mtype 767–770

mv 50, 90, 771–772

ncal 617

nice 389, 773–774

nohup 389, 775

notify 389

od 776–779

onintr 381, 389

open (OS X) 780–781

otool (OS X) 782–783

paste 784–785

pax 786–791

perldoc 487

pinfo 37

plutil (OS X) 792–793

popd 290, 389

pr 794–795

printenv 389

ps 136, 280, 306, 439, 796–799

ps2pdf 761

pstree 307

pushd 289, 377, 390

pwck 556

pwd 82, 109

pwgen 42

rcp 800–801

read 418, 447–450

read builtin 449t

readonly 295, 296

rehash 390

renice 802

repeat 390

rlogin 803

rm 48, 110, 378, 804–805

rmdir 88, 806

rsh 807–808

rsync 583–594

run-parts 650

sched 390

scp 810–811

script 58

sdiff 664, 665

sed 565–581

set 331, 366, 370, 390, 407, 409, 442

setenv 366, 390

setfacl 100–104

SetFile (OS X) 813–814

sh 270, 944

shift 390, 403, 442

shopt 331

shred 805

sleep 440, 815–816

slocate 66

sort 54, 132, 343, 433, 817–825

source 274, 390

split 826–827

ssh 27, 828–834

ssh-keygen 831

stat 835–836

stop 390

strings 837

stty 29, 838–841

su 32

sudo 32

suspend 390

sysctl (OS X) 842

tac 579, 618

tail 53, 360, 843–845

tar 62–65, 285, 846–850

tee 134, 851

telnet 852–853

Terminal (OS X) 935

test 399–401, 401, 405, 408, 409, 412, 414, 420, 854–857

test builtin 401t

time 374, 375t, 390

todos 59

top 858–860

touch 89, 616, 862–863

tput 420

tr 60, 132, 276, 653, 864–866

trap 419, 453–456

true 455, 466

tty 867

tune2fs 766, 868–869

type 447

typeset 295–296, 438

udev 91

umask 391, 870–871

unalias 324, 327, 391

uncompress 81, 848

unhash 391

uniq 54, 872–873

unix2dos 59

unlimit 391

unset 294, 391

unsetenv 367, 391

updatedb 66

uptime 70, 859, 874

vi 150 see also vim

vile 150

vim 149–201

vimtutor 151

w 69, 70t, 874–875

wait 391

wc 56, 339, 876

whatis 36

where 391

whereis 65

which 65, 391, 877–878

who 67, 70t, 879–880

write 70, 879

xargs 881–883

yum 910–916

yumdownloader 914

zcat 62, 724–726

zdiff 725

zless 725

Main Index

An italic page number such as 123 indicates a definition. A light page number such as 456 indicates a

brief mention. Page numbers followed by the letter t refer to tables. Only variables that must always

appear with a leading dollar sign are indexed with a leading dollar sign. Other variables are indexed

without a leading dollar sign.

Symbols

! (NOT) Boolean operator 468, 574

!! reexecutes the previous event 313

# comment 756

# comment symbol 281, 408

#! specifies a script shell 280, 407

$ bash parameters 438–443

$ in regular expressions 890

$ in variable names 292

$ prompt 27

$! parameter 440

$# parameter 441

$# variable 371, 372

$#argv variable 372

$$ parameter 427, 439

$$ variable 376

$((...)) see arithmetic, expansion

$(...) see command, substitution

$* parameter 443

$< reads user input 371

$? parameter 440

$? variable 371

$@ parameter 413, 414, 444

${} expands variables 445

$0 parameter 441

$n parameters 441

% builtin 387

& (AND) bitwise operator 467

& background process 135, 308, 942

& command separator 283

& in replacement strings (regular expressions) 893

&& (AND) Boolean operator 383, 461, 466–467,

468, 534

((...)) see arithmetic, evaluation

() command grouping 284

* in regular expressions 890

* special character 138

+ in extended regular expressions 894

, (comma) operator (bash) 465

. (dot) builtin 274, 451

. directory 88

. in regular expressions 889

./ executes a file in the working directory 279, 298

.. directory 88

..namedfork (OS X) 930

.jpg filename extension 961

/ (root) directory 83, 91

: (null) builtin 446, 455

:– substitutes default values for a variable 445

:= assigns default values for a variable 445

:? displays an error message for a variable 446

; command separator 282

< redirects input 128, 940

<< Here document 429–431, 940

> redirects output 126, 940

>& duplicates output file descriptor 276, 277, 402

>> appends output 130, 940

>| redirects output without clobber 130

>1 redirects standard output 275

? in extended regular expressions 894

? special character 137

@ builtin 366, 368, 370, 387

[] character class (regular expressions) 889, 946

[] special characters 139

[...] see test utility

[[...]] builtin 462

\ escape character 46, 282, 292

\( in regular expressions 892

\) in regular expressions 892

^ in regular expressions 890

^ quick substitution character (bash) 316

| (OR) bitwise operator 468

| (OR) Boolean operator 894

| command separator 283

| in extended regular expressions 894

| see also pipe

|| (OR) Boolean operator 461, 466–467, 468

|| Boolean operator 534

~– synonym for OLDPWD 338

~ completion 361

~ expansion 84, 297, 337

~ in directory stack manipulation 338

~ see also home directory

~+ synonym for PWD 338

`...` see command, substitution

Numerics

0< redirects standard input 275

10.0.0.0 (IP address) 940

172.16.0.0 (IP address) 940

192.168.0.0 (IP address) 940

2> redirects standard error 275

802.11 wireless specification 940

–a (AND) Boolean operator 405, 461

.a filename extension 715

a.out file 712

aborting execution 30

absolute mode specifications (chmod) 627t, 628t

absolute pathnames 83, 122, 940

access 940

Access Control Lists see ACLs

access permissions 93–104, 940

change using chmod 94–96, 626–630

directory 98–99

display using ls 94

establish mask using umask 870–871

execute 278–280

setgid see setgid

setuid see setuid

ACLs 99–104, 940

access rules 100–104

default rules 103

effective rights mask 102

enabling 100

enabling (OS X) 933

getfacl utility 100–104

OS X 932–935

setfacl utility 100–104

work with using chmod (OS X) 933–935

active window 940

addbanner shell script 455

addition operators 464

address mask see subnet, mask

address see the type of address you are looking for

(e.g., MAC address, IP address) or see the

specific address (e.g., 127.0.0.1)

algorithm 469

alias 324–327, 940

examples 326–327

quotation marks in 325

recursion 324

recursive plunge 327

substitution 334

alias builtin 324, 357, 387

aliases

special 357

alloc builtin 387

alphanumeric character 940

ALT key 216

ambiguous file references 137, 377, 940

AND bitwise operator 467

AND Boolean operator 405, 461

angle bracket 940

animate 940

anonymous FTP 706

ANSI 10

ansi terminal name 906

antialiasing 941

API 941

append 941

append standard output using >> 130

Apple Human Interface Guidelines 936

applet 941

apropos utility 35

apt-get utility 916–920

archive 941

archive files 60–65 see also backing up files

archive, shell 430

arguments 118, 941

command line 443

convert standard input into using xargs 881–883

testing 400, 408

argv variable 371

arithmetic, perform using test 854–857

arithmetic

evaluation (bash) 338, 415, 438, 460–461

expansion (bash) 338–340, 465

expression 941

array 367, 489, 941

ASCII 941

EBCDIC conversion 658

man page 760

terminal 941

ascii file 941

ASP 941

.aspell file 607

aspell utility 415, 607–610

aspell.conf file 608

ASPELL_CONF variable 607

assembly language 10

assignment operators 465

associative arrays (gawk) 538, 551–556

asterisk special character 138, 890

asynchronous event 942

at directory 612

at directory (OS X) 650

at utility 611–614

AT&T Bell Laboratories 3, 270

at.allow file 612

at.deny file 612

atd daemon 612

atjobs directory 612

atq utility 611–614

atrm utility 611–614

atrun utility (OS X) 612

atspool directory 612

attachments 942

attributes, extended (OS X) 928

attributes, file (OS X) 931

auth.log file 898

authentication 942

authentication, OpenSSH 829

autoconf utility 639

autocorrect variable 376

autolist variable 362, 372

autologout variable 372

automatic mounting 942

avoided 942

awk utility see gawk utility

AWK see gawk utility

B language 10

back door 942

back ticks see command, substitution 947

background

command, running in the 134

foreground versus 134

jobs 134–136

processes 308, 942

backing up files 60–65

using cp 640

using cpio 644–648

using pax 786–791

using rsync 591

using tar 846–850

BACKSLASH escape character 46, 282, 292

BACKSLASH in replacement strings 893

BACKSPACE key (erase character) 29

base operators 467

basename 83, 943

basename utility 426

bash 270, 944

see also alias; bash history; bash variables;

command; command line; commands;

operators; shell scripts

alias see alias

archive 430

arguments 443

arithmetic evaluation 338, 415, 438, 460–461

arrays see bash variables, array

background process 308

builtins see builtins

calling program, name of 441

command line see command line

command not found error message 122, 279, 297

command substitution 409, 426

commands see command

conditional expressions 461

control structures see control structures

debugging prompt 301, 410

directory stack 288–290

expressions 460–468

features 330–331, 332t

features shared with tcsh 353

file descriptors 431, 431–434

functions 327–330, 432, 437–438

globbing 341

history see bash history

logical evaluation 461

menu 427

operators see operators

options, command line 330

pathname expansion 293

Permission denied error message 122, 278, 297

prompt (PS1) 299

prompt (PS2) 300

prompt (PS3) 301, 428

prompt (PS4) 301, 410

quiz shell script 476

quotation mark removal 335

recursion 469

redirection operators 277t

set, turn features on and off using 331

shopt, turn features on and off using 331

special characters 46, 304, 304t

standard error see standard error

standard input see standard input

standard output see standard output

startup files 271–274

string operators 462t

tcsh, features shared with 353

variables see bash variables

word splitting 301

–x option 410, 470

bash history 308, 308–324

bind builtin 322

C Shell mechanism, classic 313–317

commands

editing 312–313, 318–324

reexecuting 310–317

viewing 310–311

event 308

designators 314t

modifiers 317t

numbers 309, 313

reference using ! 313–317

expansion 334

history builtin 309

INPUTRC variable 321

quick substitution 316

Readline Library 318–324

Readline variables 322t

substitute modifier 316

tcsh history 308

variables 308, 309t

word designators 314, 316t

bash parameters 290, 290–303
