Perl Programmers Reference Guide Version 5.005 02

Perl Programmers Reference Guide
Version 5.005_02
18−Oct−1998

"There’s more than one way to do it."
−− Larry Wall, Author of the Perl Programming Language

Author: Perl5−Porters


INSTALL

Perl Programmers Reference Guide

INSTALL

NAME
Install − Build and Installation guide for perl5.
SYNOPSIS
The basic steps to build and install perl5 on a Unix system are:
rm -f config.sh Policy.sh
sh Configure
make
make test
make install
# You may also wish to add these:
(cd /usr/include && h2ph *.h sys/*.h)
(installhtml --help)
(cd pod && make tex && <process the latex files>)
Each of these is explained in further detail below.
For information on non−Unix systems, see the section on "Porting information" below.
For information on what‘s new in this release, see the pod/perldelta.pod file. For more detailed information
about specific changes, see the Changes file.
DESCRIPTION
This document is written in pod format as an easy way to indicate its structure. The pod format is described
in pod/perlpod.pod, but you can read it as is with any pager or editor. Headings and items are marked by
lines beginning with ‘=’. The other mark−up used is
B<text>    embolden text, used for switches, programs or commands
C<code>    literal code
L<name>    A link (cross reference) to name

You should probably at least skim through this entire document before proceeding.
If you‘re building Perl on a non−Unix system, you should also read the README file specific to your
operating system, since this may provide additional or different instructions for building Perl.
If there is a hint file for your system (in the hints/ directory) you should also read that hint file for specific
information for your system. (Unixware users should use the svr4.sh hint file.)
WARNING: This version is not binary compatible with Perl 5.004.
Starting with Perl 5.004_50 there were many deep and far-reaching changes to the language internals. If
you have dynamically loaded extensions that you built under perl 5.003 or 5.004, you can continue to use
them with 5.004, but you will need to rebuild and reinstall those extensions to use them with 5.005. See the
discussions below on "Coexistence with earlier versions of perl5" and "Upgrading from 5.004 to 5.005" for
more details.
The standard extensions supplied with Perl will be handled automatically.
In a related issue, old extensions may possibly be affected by the changes in the Perl language in the current
release. Please see pod/perldelta.pod for a description of what‘s changed.
Space Requirements
The complete perl5 source tree takes up about 10 MB of disk space. The complete tree after completing
make takes roughly 20 MB, though the actual total is likely to be quite system−dependent. The installation
directories need something on the order of 10 MB, though again that value is system−dependent.

Start with a Fresh Distribution
If you have built perl before, you should clean out the build directory with the command
make distclean
or
make realclean
The only difference between the two is that make distclean also removes your old config.sh and Policy.sh
files.
The results of a Configure run are stored in the config.sh and Policy.sh files. If you are upgrading from a
previous version of perl, or if you change systems or compilers or make other significant changes, or if you
are experiencing difficulties building perl, you should probably not re−use your old config.sh. Simply
remove it or rename it, e.g.
mv config.sh config.sh.old
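The renaming step can be wrapped in a small helper so that it is safe to run whether or not an old config.sh exists. This is only a sketch: the function name stash_old_config and the .old suffix are illustrative choices, not part of the perl distribution. Policy.sh is deliberately left alone so that site-wide choices survive.

```shell
# stash_old_config DIR  (hypothetical helper, not shipped with perl)
# Rename DIR/config.sh to DIR/config.sh.old so that the next Configure
# run starts from scratch. Policy.sh is not touched.
stash_old_config() {
    dir=${1:-.}
    if [ -f "$dir/config.sh" ]; then
        mv "$dir/config.sh" "$dir/config.sh.old"
        echo "saved $dir/config.sh as $dir/config.sh.old"
    else
        echo "no config.sh in $dir, nothing to do"
    fi
}
```

Run it as stash_old_config . from the top of the perl source directory before invoking sh Configure.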
If you wish to use your old config.sh, be especially attentive to the version and architecture−specific
questions and answers. For example, the default directory for architecture−dependent library modules
includes the version name. By default, Configure will reuse your old name (e.g.
/opt/perl/lib/i86pc−solaris/5.003) even if you‘re running Configure for a different version, e.g. 5.004. Yes,
Configure should probably check and correct for this, but it doesn‘t, presently. Similarly, if you used a
shared libperl.so (see below) with version numbers, you will probably want to adjust them as well.
Also, be careful to check your architecture name. Some Linux systems (such as Debian) use i386, while
others may use i486, i586, or i686. If you pick up a precompiled binary, it might not use the same name.
In short, if you wish to use your old config.sh, I recommend running Configure interactively rather than
blindly accepting the defaults.
If your reason to reuse your old config.sh is to save your particular installation choices, then you can
probably achieve the same effect by using the new Policy.sh file. See the section on
"Site−wide Policy settings" below.
Run Configure
Configure will figure out various things about your system. Some things Configure will figure out for itself,
other things it will ask you about. To accept the default, just press RETURN. The default is almost always
okay. At any Configure prompt, you can type &−d and Configure will use the defaults from then on.
After it runs, Configure will perform variable substitution on all the *.SH files and offer to run make depend.
Configure supports a number of useful options. Run Configure −h to get a listing. See the Porting/Glossary
file for a complete list of Configure variables you can set and their definitions.
To compile with gcc, for example, you should run
sh Configure -Dcc=gcc
This is the preferred way to specify gcc (or another alternative compiler) so that the hints files can set
appropriate defaults.
If you want to use your old config.sh but override some of the items with command line options, you need to
use Configure −O.
By default, for most systems, perl will be installed in /usr/local/{bin, lib, man}. You can specify a different
'prefix' for the default installation directory, when Configure prompts you or by using the Configure
command line option -Dprefix='/some/directory', e.g.
sh Configure -Dprefix=/opt/perl

If your prefix contains the string "perl", then the directories are simplified. For example, if you use
prefix=/opt/perl, then Configure will suggest /opt/perl/lib instead of /opt/perl/lib/perl5/.
NOTE: You must not specify an installation directory that is below your perl source directory. If you do,
installperl will attempt infinite recursion.
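A quick check before running Configure can catch this mistake. The sketch below is not part of the distribution: the helper name prefix_below_src and the example source path are made up for illustration. It compares the two paths purely as text, so symlinks and relative paths are not resolved.

```shell
# prefix_below_src PREFIX SRCDIR  (hypothetical helper)
# Succeeds (exit status 0) when PREFIX is SRCDIR itself or lies beneath it.
prefix_below_src() {
    case "$1/" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

src=/home/you/perl5.005_02      # assumed location of the perl sources
for prefix in /opt/perl "$src/inst"; do
    if prefix_below_src "$prefix" "$src"; then
        echo "$prefix: inside the source tree, choose another prefix"
    else
        echo "$prefix: ok"
    fi
done
```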
It may seem obvious to say, but Perl is useful only when users can easily find it. It‘s often a good idea to
have both /usr/bin/perl and /usr/local/bin/perl be symlinks to the actual binary. Be especially careful,
however, of overwriting a version of perl supplied by your vendor. In any case, system administrators are
strongly encouraged to put (symlinks to) perl and its accompanying utilities, such as perldoc, into a directory
typically found along a user‘s PATH, or in another obvious and convenient place.
By default, Configure will compile perl to use dynamic loading if your system supports it. If you want to
force perl to be compiled statically, you can either choose this when Configure prompts you or you can use
the Configure command line option -Uusedl.
If you are willing to accept all the defaults, and you want terse output, you can run
sh Configure -des
For my Solaris system, I usually use
sh Configure -Dprefix=/opt/perl -Doptimize='-xpentium -xO4' -des
GNU−style configure
If you prefer the GNU−style configure command line interface, you can use the supplied configure.gnu
command, e.g.
CC=gcc ./configure.gnu
The configure.gnu script emulates a few of the more common configure options. Try
./configure.gnu --help
for a listing.
Cross compiling is not supported.
(The file is called configure.gnu to avoid problems on systems that would not distinguish the files
"Configure" and "configure".)
Extensions
By default, Configure will offer to build every extension which appears to be supported. For example,
Configure will offer to build GDBM_File only if it is able to find the gdbm library. (See examples below.)
B, DynaLoader, Fcntl, IO, and attrs are always built by default. Configure does not contain code to test for
POSIX compliance, so POSIX is always built by default as well. If you wish to skip POSIX, you can set the
Configure variable useposix=false either in a hint file or from the Configure command line. Similarly, the
Opcode extension is always built by default, but you can skip it by setting the Configure variable
useopcode=false either in a hint file or from the command line.
You can learn more about each of these extensions by consulting the documentation in the individual .pm
modules, located under the ext/ subdirectory.
Even if you do not have dynamic loading, you must still build the DynaLoader extension; you should just
build the stub dl_none.xs version. (Configure will suggest this as the default.)
In summary, here are the Configure command−line variables you can set to turn off each extension:
B             (Always included by default)
DB_File       i_db
DynaLoader    (Must always be included as a static extension)
Fcntl         (Always included by default)
GDBM_File     i_gdbm
IO            (Always included by default)
NDBM_File     i_ndbm
ODBM_File     i_dbm
POSIX         useposix
SDBM_File     (Always included by default)
Opcode        useopcode
Socket        d_socket
Threads       usethreads
attrs         (Always included by default)

Thus to skip the NDBM_File extension, you can use
sh Configure -Ui_ndbm
Again, this is taken care of automatically if you don't have the ndbm library.
Of course, you may always run Configure interactively and select only the extensions you want.
Note: The DB_File module will only work with version 1.x of Berkeley DB or with newer releases of version 2.
Configure will automatically detect this for you and refuse to try to build DB_File with earlier releases of
version 2.
If you re−use your old config.sh but change your system (e.g. by adding libgdbm) Configure will still offer
your old choices of extensions for the default answer, but it will also point out the discrepancy to you.
Finally, if you have dynamic loading (most modern Unix systems do) remember that these extensions do not
increase the size of your perl executable, nor do they impact start−up time, so you probably might as well
build all the ones that will work on your system.
Including locally−installed libraries
Perl5 comes with interfaces to a number of database extensions, including dbm, ndbm, gdbm, and Berkeley
db. For each extension, if Configure can find the appropriate header files and libraries, it will automatically
include that extension. The gdbm and db libraries are not included with perl. See the library documentation
for how to obtain the libraries.
Note: If your database header (.h) files are not in a directory normally searched by your C compiler, then
you will need to include the appropriate −I/your/directory option when prompted by Configure. If your
database library (.a) files are not in a directory normally searched by your C compiler and linker, then you
will need to include the appropriate −L/your/directory option when prompted by Configure. See the
examples below.
Examples
gdbm in /usr/local
Suppose you have gdbm and want Configure to find it and build the GDBM_File extension. This
example assumes you have gdbm.h installed in /usr/local/include/gdbm.h and libgdbm.a installed in
/usr/local/lib/libgdbm.a. Configure should figure all the necessary steps out automatically.
Specifically, when Configure prompts you for flags for your C compiler, you should include
-I/usr/local/include.
When Configure prompts you for linker flags, you should include -L/usr/local/lib.
If you are using dynamic loading, then when Configure prompts you for linker flags for dynamic
loading, you should again include -L/usr/local/lib.
Again, this should all happen automatically. If you want to accept the defaults for all the questions and
have Configure print out only terse messages, then you can just run
sh Configure -des
and Configure should include the GDBM_File extension automatically.
This should actually work if you have gdbm installed in any of (/usr/local, /opt/local, /usr/gnu,
/opt/gnu, /usr/GNU, or /opt/GNU).

gdbm in /usr/you
Suppose you have gdbm installed in some place other than /usr/local/, but you still want Configure to
find it. To be specific, assume you have /usr/you/include/gdbm.h and /usr/you/lib/libgdbm.a. You still
have to add −I/usr/you/include to cc flags, but you have to take an extra step to help Configure find
libgdbm.a. Specifically, when Configure prompts you for library directories, you have to add
/usr/you/lib to the list.
It is possible to specify this from the command line too (all on one line):
sh Configure -des \
-Dlocincpth="/usr/you/include" \
-Dloclibpth="/usr/you/lib"
locincpth is a space-separated list of include directories to search. Configure will automatically add
the appropriate -I directives.
loclibpth is a space-separated list of library directories to search. Configure will automatically add the
appropriate -L directives. If you have some libraries under /usr/local/ and others under /usr/you, then
you have to include both, namely
sh Configure -des \
-Dlocincpth="/usr/you/include /usr/local/include" \
-Dloclibpth="/usr/you/lib /usr/local/lib"
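To make the relationship between the space-separated lists and the resulting compiler flags concrete, here is a rough sketch of the transformation. It is not Configure's actual code, which may apply further checks; the function name dirs_to_flags is invented for this illustration.

```shell
# dirs_to_flags FLAG "DIR1 DIR2 ..."  (hypothetical helper)
# Print FLAG prepended to every directory in the space-separated list.
dirs_to_flags() {
    flag=$1
    out=
    for d in $2; do             # unquoted on purpose: split on spaces
        out="$out${out:+ }$flag$d"
    done
    printf '%s\n' "$out"
}

locincpth="/usr/you/include /usr/local/include"
loclibpth="/usr/you/lib /usr/local/lib"
dirs_to_flags -I "$locincpth"   # prints -I/usr/you/include -I/usr/local/include
dirs_to_flags -L "$loclibpth"   # prints -L/usr/you/lib -L/usr/local/lib
```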
Installation Directories
The installation directories can all be changed by answering the appropriate questions in Configure. For
convenience, all the installation questions are near the beginning of Configure.
I highly recommend running Configure interactively to be sure it puts everything where you want it. At any
point during the Configure process, you can answer a question with &−d and Configure will use the
defaults from then on.
By default, Configure will use the following directories for library files for 5.005 (archname is a string like
sun4−sunos, determined by Configure).
Configure variable    Default value
$archlib              /usr/local/lib/perl5/5.005/archname
$privlib              /usr/local/lib/perl5/5.005
$sitearch             /usr/local/lib/perl5/site_perl/5.005/archname
$sitelib              /usr/local/lib/perl5/site_perl/5.005

Some users prefer to append a "/share" to $privlib and $sitelib to emphasize that those directories
can be shared among different architectures.
By default, Configure will use the following directories for manual pages:
Configure variable    Default value
$man1dir              /usr/local/man/man1
$man3dir              /usr/local/lib/perl5/man/man3

(Actually, Configure recognizes the SVR3−style /usr/local/man/l_man/man1 directories, if present, and uses
those instead.)
The module man pages are stuck in that strange spot so that they don't collide with other man pages stored in
/usr/local/man/man3, and so that Perl's man pages don't hide system man pages. On some systems, man
less would end up calling up Perl's less.pm module man page, rather than the less program. (This default
location will likely change to /usr/local/man/man3 in a future release of perl.)
Note: Many users prefer to store the module man pages in /usr/local/man/man3. You can do this from the
command line with

sh Configure -Dman3dir=/usr/local/man/man3
Some users also prefer to use a .3pm suffix. You can do that with
sh Configure -Dman3ext=3pm
If you specify a prefix that contains the string "perl", then the directory structure is simplified. For example,
if you Configure with −Dprefix=/opt/perl, then the defaults for 5.005 are
Configure variable    Default value
$archlib              /opt/perl/lib/5.005/archname
$privlib              /opt/perl/lib/5.005
$sitearch             /opt/perl/lib/site_perl/5.005/archname
$sitelib              /opt/perl/lib/site_perl/5.005
$man1dir              /opt/perl/man/man1
$man3dir              /opt/perl/man/man3

The perl executable will search the libraries in the order given above.
The directories under site_perl are empty, but are intended to be used for installing local or site−wide
extensions. Perl will automatically look in these directories.
In order to support using things like #!/usr/local/bin/perl5.005 after a later version is released,
architecture-dependent libraries are stored in a version-specific directory, such as
/usr/local/lib/perl5/5.005/archname/.
Further details about the installation directories, maintenance and development subversions, and about
supporting multiple versions are discussed in "Coexistence with earlier versions of perl5" below.
Again, these are just the defaults, and can be changed as you run Configure.
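The rule behind the two sets of library defaults listed above can be summarized in a few lines of shell. This is only an illustration of the documented defaults, not an excerpt from Configure; the function name default_lib_dirs is invented here, and sun4-sunos is the sample archname mentioned earlier.

```shell
# default_lib_dirs PREFIX VERSION ARCHNAME  (hypothetical helper)
# Print the default library directories described in this section.
default_lib_dirs() {
    prefix=$1
    version=$2
    archname=$3
    case "$prefix" in
        *perl*) lib="$prefix/lib" ;;         # prefix already contains "perl"
        *)      lib="$prefix/lib/perl5" ;;
    esac
    echo "archlib=$lib/$version/$archname"
    echo "privlib=$lib/$version"
    echo "sitearch=$lib/site_perl/$version/$archname"
    echo "sitelib=$lib/site_perl/$version"
}

default_lib_dirs /usr/local 5.005 sun4-sunos
default_lib_dirs /opt/perl 5.005 sun4-sunos
```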
Changing the installation directory
Configure distinguishes between the directory in which perl (and its associated files) should be installed and
the directory in which it will eventually reside. For most sites, these two are the same; for sites that use AFS,
this distinction is handled automatically. However, sites that use software such as depot to manage software
packages may also wish to install perl into a different directory and use that management software to move
perl to its final destination. This section describes how to do this. Someday, Configure may support an
option −Dinstallprefix=/foo to simplify this.
Suppose you want to install perl under the /tmp/perl5 directory. You can edit config.sh and change all the
install* variables to point to /tmp/perl5 instead of /usr/local/wherever. Or, you can automate this process by
placing the following lines in a file config.over before you run Configure (replace /tmp/perl5 by a directory
of your choice):
installprefix=/tmp/perl5
test -d $installprefix || mkdir $installprefix
test -d $installprefix/bin || mkdir $installprefix/bin
installarchlib=`echo $installarchlib | sed "s!$prefix!$installprefix!"`
installbin=`echo $installbin | sed "s!$prefix!$installprefix!"`
installman1dir=`echo $installman1dir | sed "s!$prefix!$installprefix!"`
installman3dir=`echo $installman3dir | sed "s!$prefix!$installprefix!"`
installprivlib=`echo $installprivlib | sed "s!$prefix!$installprefix!"`
installscript=`echo $installscript | sed "s!$prefix!$installprefix!"`
installsitelib=`echo $installsitelib | sed "s!$prefix!$installprefix!"`
installsitearch=`echo $installsitearch | sed "s!$prefix!$installprefix!"`
Then, you can Configure and install in the usual way:
sh Configure -des
make
make test
make install
Beware, though, that if you later install new add-on extensions, they too will be installed under
'/tmp/perl5' if you follow this example. The next section shows one way of dealing with that problem.
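To see what each of those config.over substitutions does, it can help to try one by hand on a sample value. The values below are examples only; in a real run, Configure has already set $prefix and the install* variables by the time config.over is loaded. The sed command uses ! as its delimiter because the paths themselves contain slashes.

```shell
prefix=/usr/local                           # sample value
installprefix=/tmp/perl5
installprivlib=/usr/local/lib/perl5/5.005   # sample value

# Replace the leading prefix with the temporary installation prefix.
installprivlib=`echo $installprivlib | sed "s!$prefix!$installprefix!"`
echo "$installprivlib"                      # prints /tmp/perl5/lib/perl5/5.005
```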
Creating an installable tar archive
If you need to install perl on many identical systems, it is convenient to compile it once and create an archive
that can be installed on multiple systems. Here‘s one way to do that:
# Set up config.over to install perl into a different directory,
# e.g. /tmp/perl5 (see previous part).
sh Configure -des
make
make test
make install
cd /tmp/perl5
# Edit $archlib/Config.pm to change all the
# install* variables back to reflect where everything will
# really be installed.
# Edit any of the scripts in $scriptdir to have the correct
# #!/wherever/perl line.
tar cvf ../perl5-archive.tar .
# Then, on each machine where you want to install perl,
cd /usr/local # Or wherever you specified as $prefix
tar xvf perl5-archive.tar
Site−wide Policy settings
After Configure runs, it stores a number of common site−wide "policy" answers (such as installation
directories and the local perl contact person) in the Policy.sh file. If you want to build perl on another
system using the same policy defaults, simply copy the Policy.sh file to the new system and Configure will
use it along with the appropriate hint file for your system.
Alternatively, if you wish to change some or all of those policy answers, you should
rm -f Policy.sh
to ensure that Configure doesn't re-use them.
Further information is in the Policy_sh.SH file itself.
Configure−time Options
There are several different ways to Configure and build perl for your system. For most users, the defaults are
sensible and will work. Some users, however, may wish to further customize perl. Here are some of the
main things you can change.
Threads
On some platforms, perl5.005 can be compiled to use threads. To enable this, read the file
README.threads, and then try
sh Configure -Dusethreads
Currently, you need to specify -Dusethreads on the Configure command line so that the hint files can make
appropriate adjustments.
The default is to compile without thread support.
Selecting File IO mechanisms
Previous versions of perl used the standard IO mechanisms as defined in stdio.h. Versions 5.003_02 and
later of perl allow alternate IO mechanisms via a "PerlIO" abstraction, but the stdio mechanism is still the
default and is the only supported mechanism.

This PerlIO abstraction can be enabled either on the Configure command line with
sh Configure -Duseperlio
or interactively at the appropriate Configure prompt.
If you choose to use the PerlIO abstraction layer, there are two (experimental) possibilities for the underlying
IO calls. These have been tested to some extent on some platforms, but are not guaranteed to work
everywhere.
1.  AT&T's "sfio". This has superior performance to stdio.h in many cases, and is extensible by the use
of "discipline" modules. Sfio currently only builds on a subset of the UNIX platforms perl supports.
Because the data structures are completely different from stdio, perl extension modules or external
libraries may not work. This configuration exists to allow these issues to be worked on.
This option requires the 'sfio' package to have been built and installed. A (fairly old) version of sfio is
in CPAN.
You select this option by
sh Configure -Duseperlio -Dusesfio
If you have already selected -Duseperlio, and if Configure detects that you have sfio, then sfio will be
the default suggested by Configure.
Note: On some systems, sfio's iffe configuration script fails to detect that you have an atexit function
(or equivalent). Apparently, this is a problem at least for some versions of Linux and SunOS 4.
You can test if you have this problem by trying the following shell script. (You may have to add some
extra cflags and libraries. A portable version of this may eventually make its way into Configure.)
#!/bin/sh
# Build a tiny program against sfio and check that its output is flushed at exit.
cat > try.c <<'EOCP'
#include <stdio.h>
main() { printf("42\n"); }
EOCP
cc -o try try.c -lsfio
val=`./try`
if test X$val = X42; then
    echo "Your sfio looks ok"
else
    echo "Your sfio has the exit problem."
fi
If you have this problem, the fix is to go back to your sfio sources and correct iffe's guess about atexit.
There also might be a more recent release of Sfio that fixes your problem.
2.  Normal stdio IO, but with all IO going through calls to the PerlIO abstraction layer. This configuration
can be used to check that perl and extension modules have been correctly converted to use the PerlIO
abstraction.
This configuration should work on all platforms (but might not).
You select this option via:
sh Configure -Duseperlio -Uusesfio
If you have already selected -Duseperlio, and if Configure does not detect sfio, then this will be the
default suggested by Configure.

Building a shared libperl.so Perl library
Currently, for most systems, the main perl executable is built by linking the "perl library" libperl.a with
perlmain.o, your static extensions (usually just DynaLoader.a) and various extra libraries, such as −lm.
On some systems that support dynamic loading, it may be possible to replace libperl.a with a shared
libperl.so. If you anticipate building several different perl binaries (e.g. by embedding libperl into different
programs, or by using the optional compiler extension), then you might wish to build a shared libperl.so so
that all your binaries can share the same library.
The disadvantages are that there may be a significant performance penalty associated with the shared
libperl.so, and that the overall mechanism is still rather fragile with respect to different versions and
upgrades.
In terms of performance, on my test system (Solaris 2.5_x86) the perl test suite took roughly 15% longer to
run with the shared libperl.so. Your system and typical applications may well give quite different results.
The default name for the shared library is typically something like libperl.so.3.2 (for Perl 5.003_02) or
libperl.so.302 or simply libperl.so. Configure tries to guess a sensible naming convention based on your C
library name. Since the library gets installed in a version−specific architecture−dependent directory, the
exact name isn‘t very important anyway, as long as your linker is happy.
For some systems (mostly SVR4), building a shared libperl is required for dynamic loading to work, and
hence is already the default.
You can elect to build a shared libperl by
sh Configure -Duseshrplib
To actually build perl, you must add the current working directory to your LD_LIBRARY_PATH
environment variable before running make. You can do this with
LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
for Bourne-style shells, or
setenv LD_LIBRARY_PATH `pwd`
for Csh-style shells. You *MUST* do this before running make. Folks running NeXT OPENSTEP must
substitute DYLD_LIBRARY_PATH for LD_LIBRARY_PATH above.
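For Bourne-style shells, a slightly more careful variant adds the colon separator only when LD_LIBRARY_PATH already has a value, so that no empty path component is left behind. This is a sketch rather than anything the distribution provides; the function name prepend_lib_path is an invention for this example.

```shell
# prepend_lib_path DIR  (hypothetical helper)
# Put DIR at the front of LD_LIBRARY_PATH, inserting ':' only when the
# variable was already set to a non-empty value.
prepend_lib_path() {
    LD_LIBRARY_PATH=$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH
}

prepend_lib_path "`pwd`"        # run this in the perl build directory
echo "$LD_LIBRARY_PATH"
```

On NeXT OPENSTEP the same idea would apply to DYLD_LIBRARY_PATH instead.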
There is also a potential problem with the shared perl library if you want to have more than one "flavor" of
the same version of perl (e.g. with and without -DDEBUGGING). For example, suppose you build and
install a standard Perl 5.004 with a shared library. Then, suppose you try to build Perl 5.004 with
-DDEBUGGING enabled, but everything else the same, including all the installation directories. How can
you ensure that your newly built perl will link with your newly built libperl.so.4 rather than with the installed
libperl.so.4? The answer is that you might not be able to. The installation directory is encoded in the perl
binary with the LD_RUN_PATH environment variable (or equivalent ld command-line option). On Solaris,
you can override that with LD_LIBRARY_PATH; on Linux you can't. On Digital Unix, you can override
LD_LIBRARY_PATH by setting the _RLD_ROOT environment variable to point to the perl build directory.
The only reliable answer is that you should specify a different directory for the architecture-dependent
library for your -DDEBUGGING version of perl. You can do this by changing all the *archlib* variables in
config.sh, namely archlib, archlib_exp, and installarchlib, to point to your new architecture-dependent
library.
Malloc Issues
Perl relies heavily on malloc(3) to grow data structures as needed, so perl‘s performance can be noticeably
affected by the performance of the malloc function on your system.
The perl source is shipped with a version of malloc that is very fast but somewhat wasteful of space. On the
other hand, your system's malloc function may be a bit slower but also a bit more frugal. However, as of
5.004_68, perl's malloc has been optimized for the typical requests from perl, so there's a chance that it may
be both faster and use less memory.
For many uses, speed is probably the most important consideration, so the default behavior (for most
systems) is to use the malloc supplied with perl. However, if you will be running very large applications
(e.g. Tk or PDL) or if your system already has an excellent malloc, or if you are experiencing difficulties
with extensions that use third−party libraries that call malloc, then you might wish to use your system‘s
malloc. (Or, you might wish to explore the malloc flags discussed below.)
To build without perl's malloc, you can use the Configure command
sh Configure -Uusemymalloc
or you can answer 'n' at the appropriate interactive Configure prompt.
Malloc Performance Flags
If you are using Perl's malloc, you may add one or more of the following items to your ccflags config.sh
variable to change its behavior. You can find out more about these and other flags by reading the
commentary near the top of the malloc.c source. The defaults should be fine for nearly everyone.
-DNO_FANCY_MALLOC
Undefined by default. Defining it returns malloc to the version used in Perl 5.004.
-DPLAIN_MALLOC
Undefined by default. Defining it in addition to NO_FANCY_MALLOC returns malloc to the version
used in Perl version 5.000.
Building a debugging perl
You can run perl scripts under the perl debugger at any time with perl -d your_script. If, however, you
want to debug perl itself, you probably want to do
sh Configure -Doptimize='-g'
This will do two independent things: First, it will force compilation to use cc -g so that you can use your
system's debugger on the executable. (Note: Your system may actually require something like cc -g2.
Check your man pages for cc(1) and also any hint file for your system.) Second, it will add
-DDEBUGGING to your ccflags variable in config.sh so that you can use perl -D to access perl's internal
state. (Note: Configure will only add -DDEBUGGING by default if you are not reusing your old config.sh.
If you want to reuse your old config.sh, then you can just edit it and change the optimize and ccflags
variables by hand and then propagate your changes as shown in "Propagating your changes to config.sh"
below.)
You can actually specify -g and -DDEBUGGING independently, but usually it's convenient to have both.
If you are using a shared libperl, see the warnings about multiple versions of perl under
Building a shared libperl.so Perl library.
Other Compiler Flags
For most users, all of the Configure defaults are fine. However, you can change a number of factors in the
way perl is built by adding appropriate -D directives to your ccflags variable in config.sh.
For example, you can replace the rand() and srand() functions in the perl source by any other random
number generator by a trick such as the following (this should all be on one line):
sh Configure -Dccflags='-Dmy_rand=random -Dmy_srand=srandom' \
-Drandbits=31
or you can use the drand48 family of functions with
sh Configure -Dccflags='-Dmy_rand=lrand48 -Dmy_srand=srand48' \
-Drandbits=31

or by adding the -D flags to your ccflags at the appropriate Configure prompt. (Read pp.c to see how this
works.)
You should also run Configure interactively to verify that a hint file doesn't inadvertently override your
ccflags setting. (Hints files shouldn't do that, but some might.)
What if it doesn't work?
Running Configure Interactively
If Configure runs into trouble, remember that you can always run Configure interactively so that you
can check (and correct) its guesses.
All the installation questions have been moved to the top, so you don't have to wait for them. Once
you've handled them (and your C compiler and flags) you can type &-d at the next Configure prompt
and Configure will use the defaults from then on.
If you find yourself trying obscure command line incantations and config.over tricks, I recommend you
run Configure interactively instead. You'll probably save yourself time in the long run.
Hint files
The perl distribution includes a number of system−specific hints files in the hints/ directory. If one of
them matches your system, Configure will offer to use that hint file.
Several of the hint files contain additional important information. If you have any problems, it is a
good idea to read the relevant hint file for further information. See hints/solaris_2.sh for an extensive
example. More information about writing good hints is in the hints/README.hints file.
*** WHOA THERE!!! ***
Occasionally, Configure makes a wrong guess. For example, on SunOS 4.1.3, Configure incorrectly
concludes that tzname[] is in the standard C library. The hint file is set up to correct for this. You will
see a message:
*** WHOA THERE!!! ***
The recommended value for $d_tzname on this machine was "undef"!
Keep the recommended value? [y]
You should always keep the recommended value unless, after reading the relevant section of the hint
file, you are sure you want to try overriding it.
If you are re−using an old config.sh, the word "previous" will be used instead of "recommended".
Again, you will almost always want to keep the previous value, unless you have changed something on
your system.
For example, suppose you have added libgdbm.a to your system and you decide to reconfigure perl to
use GDBM_File. When you run Configure again, you will need to add −lgdbm to the list of libraries.
Now, Configure will find your gdbm include file and library and will issue a message:
*** WHOA THERE!!! ***
The previous value for $i_gdbm on this machine was "undef"!
Keep the previous value? [y]
In this case, you do not want to keep the previous value, so you should answer ‘n’. (You‘ll also have
to manually add GDBM_File to the list of dynamic extensions to build.)
Changing Compilers
If you change compilers or make other significant changes, you should probably not re−use your old
config.sh. Simply remove it or rename it, e.g. mv config.sh config.sh.old. Then rerun Configure with
the options you want to use.
This is a common source of problems. If you change from cc to gcc, you should almost always
remove your old config.sh.
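One way to guard against reusing a stale config.sh is to compare the compiler recorded in it with the one you now intend to use. The following is a minimal sketch, not part of the perl distribution: the function name stash_stale_config and the .old suffix are illustrative, and it assumes config.sh records the compiler on a line of the form cc='...'.

```shell
# Hypothetical helper: move config.sh aside when it was generated for a
# different compiler than the one you now want to use.
# Usage: stash_stale_config NEW_CC [CONFIG_FILE]
stash_stale_config() (
    new_cc=$1
    file=${2:-config.sh}
    [ -f "$file" ] || exit 0            # nothing to reuse, nothing to do
    # Assumes config.sh holds a line such as: cc='gcc'
    old_cc=$(sed -n "s/^cc='\(.*\)'\$/\1/p" "$file" | sed -n '1p')
    if [ "$old_cc" != "$new_cc" ]; then
        mv "$file" "$file.old"
        echo "moved $file to $file.old (was built for cc='$old_cc')"
    fi
)

# Example: stash_stale_config gcc && sh Configure -Dcc=gcc
```

If the recorded compiler already matches, the file is left alone and Configure can reuse it as usual.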


Propagating your changes to config.sh
If you make any changes to config.sh, you should propagate them to all the .SH files by running
sh Configure -S
You will then have to rebuild by running
make depend
make
config.over
You can also supply a shell script config.over to over−ride Configure‘s guesses. It will get loaded up
at the very end, just before config.sh is created. You have to be careful with this, however, as
Configure does no checking that your changes make sense. See the section on
"Changing the installation directory" for an example.
config.h
Many of the system dependencies are contained in config.h. Configure builds config.h by running the
config_h.SH script. The values for the variables are taken from config.sh.
If there are any problems, you can edit config.h directly. Beware, though, that the next time you run
Configure, your changes will be lost.
cflags
If you have any additional changes to make to the C compiler command line, they can be made in
cflags.SH. For instance, to turn off the optimizer on toke.c, find the line in the switch structure for
toke.c and put the command optimize=‘−g’ before the ;; . You can also edit cflags directly, but beware
that your changes will be lost the next time you run Configure.
To explore various ways of changing ccflags from within a hint file, see the file hints/README.hints.
To change the C flags for all the files, edit config.sh and change either $ccflags or $optimize,
and then re−run
sh Configure -S
make depend
No sh
If you don‘t have sh, you‘ll have to copy the sample file Porting/config_H to config.h and edit the
config.h to reflect your system‘s peculiarities. You‘ll probably also have to extensively modify the
extension building mechanism.
Porting information
Specific information for the OS/2, Plan9, VMS and Win32 ports is in the corresponding README
files and subdirectories. Additional information, including a glossary of all those config.sh variables,
is in the Porting subdirectory.
Ports for other systems may also be available. You should check out http://www.perl.com/CPAN/ports
for current information on ports to various other operating systems.
make depend
This will look for all the includes. The output is stored in makefile. The only difference between Makefile
and makefile is the dependencies at the bottom of makefile. If you have to make any changes, you should
edit makefile, not Makefile since the Unix make command reads makefile first. (On non−Unix systems, the
output may be stored in a different file. Check the value of $firstmakefile in your config.sh if in
doubt.)
Configure will offer to do this step for you, so it isn‘t listed explicitly above.
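Because config.sh is itself a shell script made of name='value' assignments, one way to look up a value such as $firstmakefile is to source the file in a subshell. This is only a sketch under that assumption; the helper name config_value is hypothetical and not something the distribution provides.

```shell
# Hypothetical helper: print the value of one variable recorded in config.sh.
# The file is sourced inside a subshell, so the calling shell's own
# variables are left untouched.
# Usage: config_value NAME [CONFIG_FILE]
config_value() (
    cv_name=$1
    cv_file=${2:-./config.sh}
    case $cv_file in
        */*) ;;
        *)   cv_file=./$cv_file ;;      # keep "." from searching PATH
    esac
    [ -r "$cv_file" ] || exit 1
    . "$cv_file" >/dev/null 2>&1
    eval "printf '%s\n' \"\${$cv_name-}\""
)

# Example: open the file that make actually reads.
# ${EDITOR:-vi} "$(config_value firstmakefile)"
```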


make
This will attempt to make perl in the current directory.
If you can‘t compile successfully, try some of the following ideas. If none of them help, and careful reading
of the error message and the relevant manual pages on your system doesn‘t help, you can send a message to
either the comp.lang.perl.misc newsgroup or to perlbug@perl.com with an accurate description of your
problem. See "Reporting Problems" below.
hints
If you used a hint file, try reading the comments in the hint file for further tips and information.
extensions
If you can successfully build miniperl, but the process crashes during the building of extensions, you
should run
make minitest
to test your version of miniperl.
locale
If you have any locale−related environment variables set, try unsetting them. I have some reports that
some versions of IRIX hang while running ./miniperl configpm with locales other than the C locale.
See the discussion under "make test" below about locales and the whole "Locale problems" section in
the file pod/perllocale.pod. The latter is especially useful if you see something like this
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LC_ALL = "En_US",
LANG = (unset)
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
at Perl startup.
malloc duplicates
If you get duplicates upon linking for malloc et al, add −DEMBEDMYMALLOC to your ccflags
variable in config.sh.
varargs
If you get varargs problems with gcc, be sure that gcc is installed correctly and that you are not passing
−I/usr/include to gcc. When using gcc, you should probably have i_stdarg=‘define’ and
i_varargs=‘undef’ in config.sh. The problem is usually solved by running fixincludes correctly. If you
do change config.sh, don't forget to propagate your changes (see
"Propagating your changes to config.sh" above). See also the "vsprintf" item below.
util.c
If you get error messages such as the following (the exact line numbers and function name may vary in
different versions of perl):
util.c: In function ‘Perl_form’:
util.c:1107: number of arguments doesn’t match prototype
proto.h:125: prototype declaration
it might well be a symptom of the gcc "varargs problem". See the previous "varargs" item.
Solaris and SunOS dynamic loading
If you have problems with dynamic loading using gcc on SunOS or Solaris, and you are using GNU as
and GNU ld, you may need to add −B/bin/ (for SunOS) or −B/usr/ccs/bin/ (for Solaris) to your
$ccflags, $ldflags, and $lddlflags so that the system‘s versions of as and ld are used.


Note that the trailing ‘/’ is required. Alternatively, you can use the GCC_EXEC_PREFIX environment
variable to ensure that Sun‘s as and ld are used. Consult your gcc documentation for further
information on the −B option and the GCC_EXEC_PREFIX variable.
One convenient way to ensure you are not using GNU as and ld is to invoke Configure with
sh Configure -Dcc='gcc -B/usr/ccs/bin/'
for Solaris systems. For a SunOS system, you must use −B/bin/ instead.
Alternatively, recent versions of GNU ld reportedly work if you include −Wl,−export−dynamic
in the ccdlflags variable in config.sh.
ld.so.1: ./perl: fatal: relocation error:
If you get this message on SunOS or Solaris, and you‘re using gcc, it‘s probably the GNU as or GNU
ld problem in the previous item "Solaris and SunOS dynamic loading".
LD_LIBRARY_PATH
If you run into dynamic loading problems, check your setting of the LD_LIBRARY_PATH
environment variable. If you‘re creating a static Perl library (libperl.a rather than libperl.so) it should
build fine with LD_LIBRARY_PATH unset, though that may depend on details of your local set−up.
dlopen: stub interception failed
The primary cause of the ‘dlopen: stub interception failed’ message is that the LD_LIBRARY_PATH
environment variable includes a directory which is a symlink to /usr/lib (such as /lib).
The reason this causes a problem is quite subtle. The file libdl.so.1.0 actually *only* contains
functions which generate 'stub interception failed' errors! The runtime linker intercepts links to
"/usr/lib/libdl.so.1.0" and links in an internal implementation of those functions instead. [Thanks to Tim
Bunce for this explanation.]
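To find candidate entries, you can walk LD_LIBRARY_PATH and report every directory that is a symbolic link, together with the directory it resolves to. The sketch below assumes a POSIX shell; the function name symlinked_lib_dirs is made up for illustration.

```shell
# Hypothetical helper: list the entries of a colon-separated path list that
# are symbolic links, and show where each one really points.
# Usage: symlinked_lib_dirs [PATH_LIST]   (defaults to $LD_LIBRARY_PATH)
symlinked_lib_dirs() (
    list=${1-${LD_LIBRARY_PATH-}}
    IFS=:
    set -f                          # no globbing while splitting the list
    for dir in $list; do
        dir=${dir%/}                # a trailing slash would hide the link
        [ -n "$dir" ] || continue
        if [ -L "$dir" ]; then
            target=$(cd "$dir" 2>/dev/null && pwd -P)
            printf '%s -> %s\n' "$dir" "${target:-?}"
        fi
    done
)
```

Any entry reported as resolving to /usr/lib is a likely cause of the message; removing it from LD_LIBRARY_PATH is the first thing to try.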
nm extraction
If Configure seems to be having trouble finding library functions, try not using nm extraction. You
can do this from the command line with
sh Configure -Uusenm
or by answering the nm extraction question interactively. If you have previously run Configure, you
should not reuse your old config.sh.
umask not found
If the build process encounters errors relating to umask(), the problem is probably that Configure
couldn‘t find your umask() system call. Check your config.sh. You should have d_umask=‘define’.
If you don‘t, this is probably the "nm extraction" problem discussed above. Also, try reading the hints
file for your system for further information.
vsprintf
If you run into problems with vsprintf in compiling util.c, the problem is probably that Configure failed
to detect your system‘s version of vsprintf(). Check whether your system has vprintf().
(Virtually all modern Unix systems do.) Then, check the variable d_vprintf in config.sh. If your
system has vprintf, it should be:
d_vprintf='define'
If Configure guessed wrong, it is likely that Configure guessed wrong on a number of other common
functions too. This is probably the "nm extraction" problem discussed above.
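To judge whether Configure guessed wrong on many functions at once, it can help to list every d_* variable that config.sh records as 'undef'. This is a rough sketch; the function name undef_features is hypothetical, and it assumes each variable sits on its own line in the form d_name='undef'.

```shell
# Hypothetical helper: list the d_* variables that config.sh records as
# 'undef', i.e. the features Configure decided were missing.
# Usage: undef_features [CONFIG_FILE]
undef_features() {
    sed -n "s/^\(d_[A-Za-z0-9_]*\)='undef'\$/\1/p" "${1:-config.sh}"
}

# Example: does Configure believe vprintf() is missing?
# undef_features | grep -x d_vprintf
```

A long list that includes functions you know your system provides points toward the nm extraction problem.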
do_aspawn
If you run into problems relating to do_aspawn or do_spawn, the problem is probably that Configure
failed to detect your system‘s fork() function. Follow the procedure in the previous item on
"nm extraction".


__inet_* errors
If you receive unresolved symbol errors during Perl build and/or test referring to __inet_* symbols,
check to see whether BIND 8.1 is installed. It installs a /usr/local/include/arpa/inet.h that refers to
these symbols. Versions of BIND later than 8.1 do not install inet.h in that location and avoid the
errors. You should probably update to a newer version of BIND. If you can‘t, you can either link with
the updated resolver library provided with BIND 8.1 or rename /usr/local/include/arpa/inet.h during the
Perl build and test process to avoid the problem.
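If you choose the renaming route, a small wrapper can make sure the header is put back once the build finishes. The following is a sketch only: the function name with_file_hidden and the .hidden suffix are illustrative, and the file is not restored if the shell is killed by a signal.

```shell
# Hypothetical helper: rename a file out of the way, run a command, and put
# the file back afterwards, whether or not the command succeeded.
# Usage: with_file_hidden FILE COMMAND [ARGS...]
with_file_hidden() (
    file=$1
    shift
    mv "$file" "$file.hidden" || exit 1
    status=0
    "$@" || status=$?               # remember the command's exit status
    mv "$file.hidden" "$file" || exit 1
    exit "$status"
)

# Example (needs write permission on the directory):
# with_file_hidden /usr/local/include/arpa/inet.h sh -c 'make && make test'
```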
Optimizer
If you can‘t compile successfully, try turning off your compiler‘s optimizer. Edit config.sh and change
the line
optimize='-O'
to
optimize=' '
then propagate your changes with sh Configure −S and rebuild with make depend; make.
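The edit itself can be scripted. The sketch below rewrites a single name='value' line and keeps a backup; the function name set_config_var and the .bak suffix are illustrative, and the value is assumed to contain no '|', '&' or backslash characters.

```shell
# Hypothetical helper: rewrite one name='value' line in config.sh, keeping
# the previous contents in CONFIG_FILE.bak. Lines for other variables are
# copied through unchanged.
# Usage: set_config_var NAME VALUE [CONFIG_FILE]
set_config_var() (
    name=$1
    value=$2
    file=${3:-config.sh}
    cp "$file" "$file.bak" || exit 1
    sed "s|^$name=.*|$name='$value'|" "$file.bak" > "$file"
)

# Example: turn the optimizer off, then propagate and rebuild.
# set_config_var optimize ' ' && sh Configure -S && make depend && make
```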
CRIPPLED_CC
If you still can‘t compile successfully, try adding a −DCRIPPLED_CC flag. (Just because you get no
errors doesn‘t mean it compiled right!) This simplifies some complicated expressions for compilers
that get indigestion easily.
Missing functions
If you have missing routines, you probably need to add some library or other, or you need to undefine
some feature that Configure thought was there but is defective or incomplete. Look through config.h
for likely suspects. If Configure guessed wrong on a number of functions, you might have the
"nm extraction" problem discussed above.
toke.c
Some compilers will not compile or optimize the larger files (such as toke.c) without some extra
switches to use larger jump offsets or allocate larger internal tables. You can customize the switches
for each file in cflags. It‘s okay to insert rules for specific files into makefile since a default rule only
takes effect in the absence of a specific rule.
Missing dbmclose
SCO prior to 3.2.4 may be missing dbmclose(). An upgrade to 3.2.4 that includes libdbm.nfs
(which includes dbmclose()) may be available.
Note (probably harmless): No library found for −lsomething
If you see such a message during the building of an extension, but the extension passes its tests anyway
(see "make test" below), then don‘t worry about the warning message. The extension Makefile.PL
goes looking for various libraries needed on various systems; few systems will need all the possible
libraries listed. For example, a system may have −lcposix or −lposix, but it‘s unlikely to have both, so
most users will see warnings for the one they don‘t have. The phrase ‘probably harmless’ is intended
to reassure you that nothing unusual is happening, and the build process is continuing.
On the other hand, if you are building GDBM_File and you get the message
Note (probably harmless): No library found for −lgdbm
then it‘s likely you‘re going to run into trouble somewhere along the line, since it‘s hard to see how
you can use the GDBM_File extension without the −lgdbm library.
It is true that, in principle, Configure could have figured all of this out, but Configure and the extension
building process are not quite that tightly coordinated.


sh: ar: not found
This is a message from your shell telling you that the command ‘ar’ was not found. You need to check
your PATH environment variable to make sure that it includes the directory with the ‘ar’ command.
This is a common problem on Solaris, where ‘ar’ is in the /usr/ccs/bin directory.
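A quick check before running make can catch this early. The sketch below looks for a command on PATH and, if it is missing, appends a fallback directory that contains it; the function name ensure_command is hypothetical.

```shell
# Hypothetical helper: make sure COMMAND can be found, appending FALLBACK_DIR
# to PATH when it is missing. Call it directly (not in a subshell or a
# pipeline) so that the PATH change persists.
# Usage: ensure_command COMMAND FALLBACK_DIR
ensure_command() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    if [ -x "$2/$1" ]; then
        PATH=$PATH:$2
        export PATH
        return 0
    fi
    echo "$1 not found in PATH or in $2" >&2
    return 1
}

# Example for Solaris, where ar lives in /usr/ccs/bin:
# ensure_command ar /usr/ccs/bin && make
```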
db−recno failure on tests 51, 53 and 55
Old versions of the DB library (including the DB library which comes with FreeBSD 2.1) had broken
handling of recno databases with modified bval settings. Upgrade your DB library or OS.
Bad arg length for semctl, is XX, should be ZZZ
If you get this error message from the lib/ipc_sysv test, your System V IPC may be broken. The XX
typically is 20, and that is what ZZZ also should be. Consider upgrading your OS, or reconfiguring
your OS to include the System V semaphores.
lib/ipc_sysv........semget: No space left on device
Either your account or the whole system has run out of semaphores. Or both. Either list the
semaphores with "ipcs" and remove the unneeded ones (which ones these are depends on your system
and applications) with "ipcrm -s SEMAPHORE_ID_HERE" or configure more semaphores into your
system.
Miscellaneous
Some additional things that have been reported for either perl4 or perl5:
Genix may need to use libc rather than libc_s, or #undef VARARGS.
NCR Tower 32 (OS 2.01.01) may need −W2,−Sl,2000 and #undef MKDIR.
UTS may need one or more of −DCRIPPLED_CC, −K or −g, and undef LSTAT.
FreeBSD can fail the lib/ipc_sysv.t test if SysV IPC has not been configured into the kernel. Perl tries to
detect this, though, and you will get a message telling what to do.
If you get syntax errors on ‘(‘, try −DCRIPPLED_CC.
Machines with half-implemented dbm routines will need to #undef I_ODBM.
make test
This will run the regression tests on the perl you just made. (You should run plain 'make' before 'make test';
otherwise you won't have a complete build.) If 'make test' doesn't say "All tests successful" then something
went wrong. See the file t/README in the t subdirectory.
Note that you can‘t run the tests in background if this disables opening of /dev/tty. You can use ‘make
test−notty’ in that case but a few tty tests will be skipped.
What if make test doesn‘t work?
If make test bombs out, just cd to the t directory and run ./TEST by hand to see if it makes any difference. If
individual tests bomb, you can run them by hand, e.g.,
./perl op/groups.t
Another way to get more detailed information about failed tests and individual subtests is to cd to the t
directory and run
./perl harness
(this assumes that most basic tests succeed, since harness uses complicated constructs).
You should also read the individual tests to see if there are any helpful comments that apply to your system.
locale
Note: One possible reason for errors is that some external programs may be broken due to the
combination of your environment and the way make test exercises them. For example, this may
happen if you have one or more of these environment variables set: LC_ALL LC_CTYPE
LC_COLLATE LANG. In some versions of UNIX, the non−English locales are known to cause
programs to exhibit mysterious errors.
If you have any of the above environment variables set, please try
setenv LC_ALL C
(for C shell) or
LC_ALL=C;export LC_ALL
(for Bourne or Korn shell) from the command line and then retry make test. If the tests then succeed,
you may have a broken program that is confusing the testing. Please run the troublesome test by hand
as shown above and see whether you can locate the program. Look for things like: exec, `backquoted
command`, system, open("|...") or open("...|"). All these mean that Perl is trying to run some external
program.
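Before retrying, it can be useful to see which of the four variables are actually set in your shell. The following is a small sketch for Bourne-compatible shells; the function name locale_vars_set is made up for illustration.

```shell
# Hypothetical helper: print each locale-related variable that is currently
# set, in NAME=value form. Variables that are unset produce no output.
locale_vars_set() (
    for name in LC_ALL LC_CTYPE LC_COLLATE LANG; do
        if eval "[ \"\${$name+set}\" = set ]"; then
            eval "printf '%s=%s\n' \"$name\" \"\${$name}\""
        fi
    done
)

# Example: rerun the tests in the C locale without changing your login shell.
# ( LC_ALL=C; export LC_ALL; make test )
```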
Out of memory
On some systems, particularly those with smaller amounts of RAM, some of the tests in t/op/pat.t may
fail with an "Out of memory" message. Specifically, in perl5.004_64, tests 74 and 78 have been
reported to fail on some systems. On my SparcStation IPC with 8 MB of RAM, test 78 will fail if the
system is running any other significant tasks at the same time.
Try stopping other jobs on the system and then running the test by itself:
cd t; ./perl op/pat.t
to see if you have any better luck. If your perl still fails this test, it does not necessarily mean you have
a broken perl. This test tries to exercise the regular expression subsystem quite thoroughly, and may
well be far more demanding than your normal usage.
make install
This will put perl into the public directory you specified to Configure; by default this is /usr/local/bin. It will
also try to put the man pages in a reasonable place. It will not nroff the man pages, however. You may need
to be root to run make install. If you are not root, you must own the directories in question and you should
ignore any messages about chown not working.
Installing perl under different names
If you want to install perl under a name other than "perl" (for example, when installing perl with special
features enabled, such as debugging), indicate the alternate name on the "make install" line, such as:
make install PERLNAME=myperl
Installed files
If you want to see exactly what will happen without installing anything, you can run
./perl installperl -n
./perl installman -n
make install will install the following:
perl,
    perl5.nnn       where nnn is the current release number. This
                    will be a link to perl.
suidperl,
    sperl5.nnn      If you requested setuid emulation.
a2p                 awk-to-perl translator
cppstdin            This is used by perl -P, if your cc -E can't
                    read from stdin.
c2ph, pstruct       Scripts for handling C structures in header files.
s2p                 sed-to-perl translator
find2perl           find-to-perl translator
h2ph                Extract constants and simple macros from C headers
h2xs                Converts C .h header files to Perl extensions.
perlbug             Tool to report bugs in Perl.
perldoc             Tool to read perl's pod documentation.
pl2pm               Convert Perl 4 .pl files to Perl 5 .pm modules
pod2html,           Converters from perl's pod documentation format
    pod2latex,      to other useful formats.
    pod2man, and
    pod2text
splain              Describe Perl warnings and errors

library files       in $privlib and $archlib specified to
                    Configure, usually under /usr/local/lib/perl5/.
man pages           in the location specified to Configure, usually
                    something like /usr/local/man/man1.
module              in the location specified to Configure, usually
    man pages       under /usr/local/lib/perl5/man/man3.
pod/*.pod           in $privlib/pod/.

Installperl will also create the library directories $sitelib and $sitearch listed in config.sh. Usually,
these are something like
/usr/local/lib/perl5/site_perl/5.005
/usr/local/lib/perl5/site_perl/5.005/archname
where archname is something like sun4−sunos. These directories will be used for installing extensions.
Perl‘s *.h header files and the libperl.a library are also installed under $archlib so that any user may later
build new extensions, run the optional Perl compiler, or embed the perl interpreter into another program even
if the Perl source is no longer available.
Coexistence with earlier versions of perl5
WARNING: The upgrade from 5.004_0x to 5.005 is going to be a bit tricky. See
"Upgrading from 5.004 to 5.005" below.
In general, you can usually safely upgrade from one version of Perl (e.g. 5.004_04) to another similar version
(e.g. 5.004_05) without re−compiling all of your add−on extensions. You can also safely leave the old
version around in case the new version causes you problems for some reason. For example, if you want to be
sure that your script continues to run with 5.004_04, simply replace the ‘#!/usr/local/bin/perl’ line at the top
of the script with the particular version you want to run, e.g. #!/usr/local/bin/perl5.00404.
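Rewriting the #! line can be done by hand or with a one-line sed command. The sketch below wraps that command; the function name pin_perl and the .orig suffix are illustrative, and it assumes the #! line has no space between "#!" and the path.

```shell
# Hypothetical helper: point a script's #! line at a specific perl binary.
# The original is kept as SCRIPT.orig.
# Usage: pin_perl SCRIPT PERL_PATH
pin_perl() (
    script=$1
    perl=$2
    cp -p "$script" "$script.orig" || exit 1
    # Only line 1 is touched, and only if it is a #! line naming perl;
    # any switches after the interpreter path (such as -w) are preserved.
    sed "1s|^#![^ ]*perl[^ ]*|#!$perl|" "$script.orig" > "$script"
)

# Example: pin_perl myscript /usr/local/bin/perl5.00404
```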
Most extensions will probably not need to be recompiled to use with a newer version of perl. Here is how it
is supposed to work. (These examples assume you accept all the Configure defaults.)
The directories searched by version 5.005 will be
Configure variable   Default value
$archlib             /usr/local/lib/perl5/5.005/archname
$privlib             /usr/local/lib/perl5/5.005
$sitearch            /usr/local/lib/perl5/site_perl/5.005/archname
$sitelib             /usr/local/lib/perl5/site_perl/5.005

while the directories searched by version 5.005_01 will be

$archlib             /usr/local/lib/perl5/5.00501/archname
$privlib             /usr/local/lib/perl5/5.00501
$sitearch            /usr/local/lib/perl5/site_perl/5.005/archname
$sitelib             /usr/local/lib/perl5/site_perl/5.005

When you install an add−on extension, it gets installed into $sitelib (or $sitearch if it is
architecture−specific). This directory deliberately does NOT include the sub−version number (01) so that
both 5.005 and 5.005_01 can use the extension. Only when a perl version changes to break backwards
compatibility will the default suggestions for the $sitearch and $sitelib version numbers be
increased.
However, if you do run into problems, and you want to continue to use the old version of perl along with
your extension, move those extension files to the appropriate version directory, such as $privlib (or
$archlib). (The extension‘s .packlist file lists the files installed with that extension. For the Tk
extension, for example, the list of files installed is in $sitearch/auto/Tk/.packlist.) Then use
your newer version of perl to rebuild and re−install the extension into $sitelib. This way, Perl 5.005
will find your files in the 5.005 directory, and newer versions of perl will find your newer extension in the
$sitelib directory. (This is also why perl searches the site−specific libraries last.)
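Before moving anything, it may help to print the plan and review it. The sketch below reads a .packlist and prints, without running them, the mv commands that would relocate files from one directory tree to another. It is an illustration only: the function name plan_relocation is hypothetical, it assumes one file name per line with no embedded whitespace, and it does not create any missing target directories.

```shell
# Hypothetical helper: print (but do not run) the mv commands that would
# relocate the files named in a .packlist from FROM_DIR to TO_DIR.
# Entries outside FROM_DIR, such as man pages, are skipped.
# Usage: plan_relocation PACKLIST FROM_DIR TO_DIR
plan_relocation() (
    packlist=$1
    from=$2
    to=$3
    while read -r path rest; do
        case $path in
            "$from"/*)
                rel=${path#"$from"/}
                printf 'mv %s %s\n' "$path" "$to/$rel"
                ;;
        esac
    done < "$packlist"
)

# Example, with the default 5.005 directories:
# plan_relocation /usr/local/lib/perl5/site_perl/5.005/archname/auto/Tk/.packlist \
#     /usr/local/lib/perl5/site_perl/5.005 /usr/local/lib/perl5/5.005
```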
Alternatively, if you are willing to reinstall all your extensions every time you upgrade perl, then you can
include the subversion number in $sitearch and $sitelib when you run Configure.
Maintaining completely separate versions
Many users prefer to keep all versions of perl in completely separate directories. One convenient way to do
this is by using a separate prefix for each version, such as
sh Configure -Dprefix=/opt/perl5.004
and adding /opt/perl5.004/bin to the shell PATH variable. Such users may also wish to add a symbolic link
/usr/local/bin/perl so that scripts can still start with #!/usr/local/bin/perl.
Others might share a common directory for maintenance sub−versions (e.g. 5.004 for all 5.004_0x versions),
but change directory with each major version.
If you are installing a development subversion, you probably ought to seriously consider using a separate
directory, since development subversions may not have all the compatibility wrinkles ironed out yet.
Upgrading from 5.004 to 5.005
Extensions built and installed with versions of perl prior to 5.004_50 will need to be recompiled to be used
with 5.004_50 and later. You will, however, be able to continue using 5.004 even after you install 5.005.
The 5.004 binary will still be able to find the extensions built under 5.004; the 5.005 binary will look in the
new $sitearch and $sitelib directories, and will not find them.
Coexistence with perl4
You can safely install perl5 even if you want to keep perl4 around.
By default, the perl5 libraries go into /usr/local/lib/perl5/, so they don‘t override the perl4 libraries in
/usr/local/lib/perl/.
In your /usr/local/bin directory, you should have a binary named perl4.036. That will not be touched by the
perl5 installation process. Most perl4 scripts should run just fine under perl5. However, if you have any
scripts that require perl4, you can replace the #! line at the top of them by #!/usr/local/bin/perl4.036 (or
whatever the appropriate pathname is). See pod/perltrap.pod for possible problems running perl4 scripts
under perl5.
cd /usr/include; h2ph *.h sys/*.h
Some perl scripts need to be able to obtain information from the system header files. This command will
convert the most commonly used header files in /usr/include into files that can be easily interpreted by perl.
These files will be placed in the architecture−dependent library ($archlib) directory you specified to
Configure.
Note: Due to differences in the C and perl languages, the conversion of the header files is not perfect. You
will probably have to hand−edit some of the converted files to get them to parse correctly. For example,
h2ph breaks spectacularly on type casting and certain structures.


installhtml --help
Some sites may wish to make perl documentation available in HTML format. The installhtml utility can be
used to convert pod documentation into linked HTML files and install them.
The following command−line is an example of one used to convert perl documentation:
./installhtml                   \
    --podroot=.                 \
    --podpath=lib:ext:pod:vms   \
    --recurse                   \
    --htmldir=/perl/nmanual     \
    --htmlroot=/perl/nmanual    \
    --splithead=pod/perlipc     \
    --splititem=pod/perlfunc    \
    --libpods=perlfunc:perlguts:perlvar:perlrun:perlop \
    --verbose
See the documentation in installhtml for more details. It can take many minutes to execute a large
installation and you should expect to see warnings like "no title", "unexpected directive" and "cannot
resolve" as the files are processed. We are aware of these problems (and would welcome patches for them).
You may find it helpful to run installhtml twice. That should reduce the number of "cannot resolve"
warnings.
cd pod && make tex && (process the latex files)
Some sites may also wish to make the documentation in the pod/ directory available in TeX format. Type
(cd pod && make tex && <process the latex files>)
Reporting Problems
If you have difficulty building perl, and none of the advice in this file helps, and careful reading of the error
message and the relevant manual pages on your system doesn‘t help either, then you should send a message
to either the comp.lang.perl.misc newsgroup or to perlbug@perl.com with an accurate description of your
problem.
Please include the output of the ./myconfig shell script that comes with the distribution. Alternatively, you
can use the perlbug program that comes with the perl distribution, but you need to have perl compiled before
you can use it. (If you have not installed it yet, you need to run ./perl −Ilib utils/perlbug
instead of a plain perlbug.)
You might also find helpful information in the Porting directory of the perl distribution.
DOCUMENTATION
Read the manual entries before running perl. The main documentation is in the pod/ subdirectory and should
have been installed during the build process. Type man perl to get started. Alternatively, you can type
perldoc perl to use the supplied perldoc script. This is sometimes useful for finding things in the library
modules.
Under UNIX, you can produce a documentation book in postscript form, along with its table of contents, by
going to the pod/ subdirectory and running (either):
./roffitall -groff      # If you have GNU groff installed
./roffitall -psroff     # If you have psroff

This will leave you with two postscript files ready to be printed. (You may need to fix the roffitall command
to use your local troff set−up.)
Note that you must have performed the installation already before running the above, since the script collects
the installed files to generate the documentation.


AUTHOR
Original author: Andy Dougherty <doughera@lafayette.edu>, borrowing very heavily from the original
README by Larry Wall, with lots of helpful feedback and additions from the perl5−porters@perl.org folks.
If you have problems, corrections, or questions, please see "Reporting Problems" above.
REDISTRIBUTION
This document is part of the Perl package and may be distributed under the same terms as perl itself.
If you are distributing a modified version of perl (perhaps as part of a larger package) please do modify these
installation instructions and the contact information to match your distribution.
LAST MODIFIED
$Id: INSTALL,v 1.42 1998/07/15 18:04:44 doughera Released $


perlfaq

Perl Programmers Reference Guide

perlfaq

NAME
perlfaq − frequently asked questions about Perl ($Date: 1998/08/05 12:09:32 $)
DESCRIPTION
This document is structured into the following sections:
perlfaq: Structural overview of the FAQ.
This document.

perlfaq1: General Questions About Perl
Very general, high−level information about Perl.
perlfaq2: Obtaining and Learning about Perl
Where to find source and documentation to Perl, support, and related matters.
perlfaq3: Programming Tools
Programmer tools and programming support.
perlfaq4: Data Manipulation
Manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues.
perlfaq5: Files and Formats
I/O and the "f" issues: filehandles, flushing, formats and footers.
perlfaq6: Regexps
Pattern matching and regular expressions.
perlfaq7: General Perl Language Issues
General Perl language issues that don‘t clearly fit into any of the other sections.
perlfaq8: System Interaction
Interprocess communication (IPC), control over the user−interface (keyboard, screen and pointing
devices).
perlfaq9: Networking
Networking, the Internet, and a few on the web.
Where to get this document
This document is posted regularly to comp.lang.perl.announce and several other related newsgroups. It is
available in a variety of formats from CPAN in the /CPAN/doc/FAQs/FAQ/ directory, or on the web at
http://www.perl.com/perl/faq/.
How to contribute to this document
You may mail corrections, additions, and suggestions to perlfaq-suggestions@perl.com. This alias should
not be used to ask FAQs. It's for fixing the current FAQ.
What will happen if you mail your Perl programming problems to the authors
Your questions will probably go unread, unless they're suggestions of new questions to add to the FAQ, in
which case they should have gone to perlfaq-suggestions@perl.com instead.
You should have read section 2 of this faq. There you would have learned that comp.lang.perl.misc is the
appropriate place to go for free advice. If your question is really important and you require a prompt and
correct answer, you should hire a consultant.
Credits
When I first began the Perl FAQ in the late 80s, I never realized it would grow to over a hundred
pages, nor that Perl would ever become so popular and widespread. This document could not have been
written without the tremendous help provided by Larry Wall and the rest of the Perl Porters.

Author and Copyright Information
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
Bundled Distributions
When included as part of the Standard Version of Perl, or as part of its complete documentation whether
printed or otherwise, this work may be distributed only under the terms of Perl‘s Artistic License. Any
distribution of this file or derivatives thereof outside of that package requires that special arrangements be
made with the copyright holder.
Irrespective of its distribution, all code examples in these files are hereby placed into the public domain.
You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit.
A simple comment in the code giving credit would be courteous but is not required.
Disclaimer
This information is offered in good faith and in the hope that it may be of use, but is not guaranteed to be
correct, up to date, or suitable for any particular purpose whatsoever. The authors accept no liability in
respect of this information or its use.
Changes
22/June/98
Significant changes throughout in preparation for the 5.005 release.
24/April/97
Style and whitespace changes from Chip, new question on reading one character at a time from a
terminal using POSIX from Tom.
23/April/97
Added http://www.oasis.leo.org/perl/ to perlfaq2. Style fix to perlfaq3. Added floating point
precision, fixed complex number arithmetic, cross−references, caveat for Text::Wrap, alternative
answer for initial capitalizing, fixed incorrect regexp, added example of Tie::IxHash to perlfaq4.
Added example of passing and storing filehandles, added commify to perlfaq5. Restored variable
suicide, and added mass commenting to perlfaq7. Added Net::Telnet, fixed backticks, added
reader/writer pair to telnet question, added FindBin, grouped module questions together in perlfaq8.
Expanded caveats for the simple URL extractor, gave LWP example, added CGI security question,
expanded on the mail address answer in perlfaq9.
25/March/97
Added more info to the binary distribution section of perlfaq2. Added Net::Telnet to perlfaq6. Fixed
typos in perlfaq8. Added mail sending example to perlfaq9. Added Merlyn‘s columns to perlfaq2.
18/March/97
Added the DATE to the NAME section, indicating which sections have changed.
Mentioned SIGPIPE and perlipc in the forking open answer in perlfaq8.
Fixed description of a regular expression in perlfaq4.
17/March/97 Version
Various typos fixed throughout.
Added new question on Perl BNF on perlfaq7.
Initial Release: 11/March/97
This is the initial release of version 3 of the FAQ; consequently there have been no changes since its
initial release.

NAME
perlfaq1 − General Questions About Perl ($Revision: 1.15 $, $Date: 1998/08/05 11:52:24 $)
DESCRIPTION
This section of the FAQ answers very general, high−level questions about Perl.
What is Perl?
Perl is a high−level programming language with an eclectic heritage written by Larry Wall and a cast of
thousands. It derives from the ubiquitous C programming language and to a lesser extent from sed, awk, the
Unix shell, and at least a dozen other tools and languages. Perl‘s process, file, and text manipulation facilities
make it particularly well−suited for tasks involving quick prototyping, system utilities, software tools,
system management tasks, database access, graphical programming, networking, and world wide web
programming. These strengths make it especially popular with system administrators and CGI script authors,
but mathematicians, geneticists, journalists, and even managers also use Perl. Maybe you should, too.
Who supports Perl? Who develops it? Why is it free?
The original culture of the pre−populist Internet and the deeply−held beliefs of Perl‘s author, Larry Wall,
gave rise to the free and open distribution policy of perl. Perl is supported by its users. The core, the
standard Perl library, the optional modules, and the documentation you‘re reading now were all written by
volunteers. See the personal note at the end of the README file in the perl source distribution for more
details. See perlhist (new as of 5.005) for Perl‘s milestone releases.
In particular, the core development team (known as the Perl Porters) are a rag−tag band of highly altruistic
individuals committed to producing better software for free than you could hope to purchase for money.
You may snoop on pending developments via news://genetics.upenn.edu/perl.porters−gw/ and
http://www.frii.com/~gnat/perl/porters/summary.html.
While the GNU project includes Perl in its distributions, there‘s no such thing as "GNU Perl". Perl is not
produced nor maintained by the Free Software Foundation. Perl‘s licensing terms are also more open than
GNU software‘s tend to be.
You can get commercial support of Perl if you wish, although for most users the informal support will more
than suffice. See the answer to "Where can I buy a commercial version of perl?" for more information.
Which version of Perl should I use?
You should definitely use version 5. Version 4 is old, limited, and no longer maintained; its last patch
(4.036) was in 1992. The most recent production release is 5.005_01. Further references to the Perl
language in this document refer to this production release unless otherwise specified. There may be one or
more official bug fixes for 5.005_01 by the time you read this, and also perhaps some experimental versions
on the way to the next release.
What are perl4 and perl5?
Perl4 and perl5 are informal names for different versions of the Perl programming language. It‘s easier to
say "perl5" than it is to say "the 5(.004) release of Perl", but some people have interpreted this to mean
there‘s a language called "perl5", which isn‘t the case. Perl5 is merely the popular name for the fifth major
release (October 1994), while perl4 was the fourth major release (March 1991). There was also a perl1 (in
January 1988), a perl2 (June 1988), and a perl3 (October 1989).
The 5.0 release is, essentially, a complete rewrite of the perl source code from the ground up. It has been
modularized, object−oriented, tweaked, trimmed, and optimized until it almost doesn‘t look like the old
code. However, the interface is mostly the same, and compatibility with previous releases is very high.
To avoid the "what language is perl5?" confusion, some people prefer to simply use "perl" to refer to the
latest version of perl and avoid using "perl5" altogether. It‘s not really that big a deal, though.
See perlhist for a history of Perl revisions.

How stable is Perl?
Production releases, which incorporate bug fixes and new functionality, are widely tested before release.
Since the 5.000 release, we have averaged only about one production release per year.
Larry and the Perl development team occasionally make changes to the internal core of the language, but all
possible efforts are made toward backward compatibility. While not quite all perl4 scripts run flawlessly
under perl5, an update to perl should nearly never invalidate a program written for an earlier version of perl
(barring accidental bug fixes and the rare new keyword).
Is Perl difficult to learn?
No, Perl is easy to start learning — and easy to keep learning. It looks like most programming languages
you're likely to have experience with, so if you've ever written a C program, an awk script, a shell script, or
even a BASIC program, you're already part way there.
Most tasks only require a small subset of the Perl language. One of the guiding mottos for Perl development
is "there‘s more than one way to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl‘s learning
curve is therefore shallow (easy to learn) and long (there‘s a whole lot you can do if you really want).
Finally, Perl is (frequently) an interpreted language. This means that you can write your programs and test
them without an intermediate compilation step, allowing you to experiment and test/debug quickly and
easily. This ease of experimentation flattens the learning curve even more.
Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an
understanding of regular expressions, and the ability to understand other people‘s code. If there‘s something
you need to do, then it‘s probably already been done, and a working example is usually available for free.
Don‘t forget the new perl modules, either. They‘re discussed in Part 3 of this FAQ, along with the CPAN,
which is discussed in Part 2.
How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?
Favorably in some areas, unfavorably in others. Precisely which areas are good and bad is often a personal
choice, so asking this question on Usenet runs a strong risk of starting an unproductive Holy War.
Probably the best thing to do is try to write equivalent code to do a set of tasks. These languages have their
own newsgroups in which you can learn about (but hopefully not argue about) them.
Can I do [task] in Perl?
Perl is flexible and extensible enough for you to use on almost any task, from one−line file−processing tasks
to complex systems. For many people, Perl serves as a great replacement for shell scripting. For others, it
serves as a convenient, high−level replacement for most of what they‘d program in low−level languages like
C or C++. It‘s ultimately up to you (and possibly your management ...) which tasks you‘ll use Perl for and
which you won‘t.
If you have a library that provides an API, you can make any component of it available as just another Perl
function or variable using a Perl extension written in C or C++ and dynamically linked into your main perl
interpreter. You can also go the other direction, and write your main program in C or C++, and then link in
some Perl code on the fly, to create a powerful application.
That said, there will always be small, focused, special−purpose languages dedicated to a specific problem
domain that are simply more convenient for certain kinds of problems. Perl tries to be all things to all
people, but nothing special to anyone. Examples of specialized languages that come to mind include prolog
and matlab.
When shouldn‘t I program in Perl?
When your manager forbids it — but do consider replacing them :−).
Actually, one good reason is when you already have an existing application written in another language
that‘s all done (and done well), or you have an application language specifically designed for a certain task
(e.g. prolog, make).

For various reasons, Perl is probably not well−suited for real−time embedded systems, low−level operating
systems development work like device drivers or context−switching code, complex multithreaded
shared−memory applications, or extremely large applications. You‘ll notice that perl is not itself written in
Perl.
The new native−code compiler for Perl may reduce the limitations given in the previous statement to some
degree, but understand that Perl remains fundamentally a dynamically typed language, and not a statically
typed one. You certainly won't be chastised if you don't trust nuclear-plant or brain-surgery monitoring
code to it. And Larry will sleep easier, too, Wall Street programs notwithstanding. :-)
What‘s the difference between "perl" and "Perl"?
One bit. Oh, you weren‘t talking ASCII? :−) Larry now uses "Perl" to signify the language proper and "perl"
the implementation of it, i.e. the current interpreter. Hence Tom‘s quip that "Nothing but perl can parse
Perl." You may or may not choose to follow this usage. For example, parallelism means "awk and perl" and
"Python and Perl" look ok, while "awk and Perl" and "Python and perl" do not.
Is it a Perl program or a Perl script?
It doesn‘t matter.
In "standard terminology" a program has been compiled to physical machine code once, and can then be
run multiple times, whereas a script must be translated by a program each time it's used. Perl programs,
however, are usually neither strictly compiled nor strictly interpreted. They can be compiled to a byte code
form (something of a Perl virtual machine) or to completely different languages, like C or assembly
language. You can‘t tell just by looking whether the source is destined for a pure interpreter, a parse−tree
interpreter, a byte code interpreter, or a native−code compiler, so it‘s hard to give a definitive answer here.
What is a JAPH?
These are the "just another perl hacker" signatures that some people sign their postings with. About 100 of
the earlier ones are available from http://www.perl.com/CPAN/misc/japh .
Where can I get a list of Larry Wall witticisms?
Over a hundred quips by Larry, from postings of his or source code, can be found at
http://www.perl.com/CPAN/misc/lwall−quotes .
How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of
some other language)?
If your manager or employees are wary of unsupported software, or software which doesn‘t officially ship
with your Operating System, you might try to appeal to their self−interest. If programmers can be more
productive using and utilizing Perl constructs, functionality, simplicity, and power, then the typical
manager/supervisor/employee may be persuaded. Regarding using Perl in general, it‘s also sometimes
helpful to point out that delivery times may be reduced using Perl, as compared to other languages.
If you have a project which has a bottleneck, especially in terms of translation or testing, Perl almost
certainly will provide a viable and quick solution. In conjunction with any persuasion effort, you should not
fail to point out that Perl is used, quite extensively, and with extremely reliable and valuable results, at many
large computer software and/or hardware companies throughout the world. In fact, many Unix vendors now
ship Perl by default, and support is usually just a news−posting away, if you can‘t find the answer in the
comprehensive documentation, including this FAQ.
If you face reluctance to upgrading from an older version of perl, then point out that version 4 is utterly
unmaintained and unsupported by the Perl Development Team. Another big sell for Perl5 is the large
number of modules and extensions which greatly reduce development time for any given task. Also mention
that the difference between version 4 and version 5 of Perl is like the difference between awk and C++.
(Well, ok, maybe not quite that distinct, but you get the idea.) If you want support and a reasonable
guarantee that what you‘re developing will continue to work in the future, then you have to run the supported
version. That probably means running the 5.005 release, although 5.004 isn‘t that bad (it‘s just one year and
one release behind). Several important bugs were fixed from the 5.000 through 5.003 versions, though, so
try upgrading past them if possible.

Of particular note is the massive bughunt for buffer overflow problems that went into the 5.004 release. All
releases prior to that, including perl4, are considered insecure and should be upgraded as soon as possible.
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or
otherwise), this work is covered under Perl's Artistic Licence. For separate distributions of all or part of
this FAQ outside of that, see perlfaq.
Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged
to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit to the FAQ would be courteous but is not required.

NAME
perlfaq2 − Obtaining and Learning about Perl ($Revision: 1.25 $, $Date: 1998/08/05 11:47:25 $)
DESCRIPTION
This section of the FAQ answers questions about where to find source and documentation for Perl, support,
and related matters.
What machines support Perl? Where do I get it?
The standard release of Perl (the one maintained by the perl development team) is distributed only in source
code form. You can find this at http://www.perl.com/CPAN/src/latest.tar.gz, which is in standard Internet
format (a gzipped archive in POSIX tar format).
Perl builds and runs on a bewildering number of platforms. Virtually all known and current Unix derivatives
are supported (Perl‘s native platform), as are proprietary systems like VMS, DOS, OS/2, Windows, QNX,
BeOS, and the Amiga. There are also the beginnings of support for MPE/iX.
Binary distributions for some proprietary platforms, including Apple systems, can be found in the
http://www.perl.com/CPAN/ports/ directory. Because these are not part of the standard distribution, they
may and in fact do differ from the base Perl port in a variety of ways. You‘ll have to check their respective
release notes to see just what the differences are. These differences can be either positive (e.g. extensions for
the features of the particular platform that are not supported in the source release of perl) or negative (e.g.
might be based upon a less current source release of perl).
A useful FAQ for Win32 Perl users is
http://www.endcontsw.com/people/evangelo/Perl_for_Win32_FAQ.html
How can I get a binary version of Perl?
If you don't have a C compiler because, for whatever reason, your vendor did not include one with your
system, the best thing to do is grab a binary version of gcc from the net and use that to compile perl.
CPAN only has binaries for systems that are terribly hard to get free compilers for, not for Unix systems.
Your first stop should be http://www.perl.com/CPAN/ports to see what information is already available. A
simple installation guide for MS−DOS is available at http://www.cs.ruu.nl/~piet/perl5dos.html , and
similarly for Windows 3.1 at http://www.cs.ruu.nl/~piet/perlwin3.html .
I don‘t have a C compiler on my system. How can I compile perl?
Since you don‘t have a C compiler, you‘re doomed and your vendor should be sacrificed to the Sun gods.
But that doesn‘t help you.
What you need to do is get a binary version of gcc for your system first. Consult the Usenet FAQs for your
operating system for information on where to get such a binary version.
I copied the Perl binary from one machine to another, but scripts don‘t work.
That‘s probably because you forgot libraries, or library paths differ. You really should build the whole
distribution on the machine it will eventually live on, and then type make install. Most other
approaches are doomed to failure.
One simple way to check that things are in the right place is to print out the hard−coded @INC which perl is
looking for.
perl -e 'print join("\n",@INC)'
If this command lists any paths which don‘t exist on your system, then you may need to move the
appropriate libraries to these locations, or create symlinks, aliases, or shortcuts appropriately.
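The check described above can also be scripted. What follows is only a minimal sketch, not part of the standard distribution: the helper name check_dirs is made up for this illustration, and the script assumes that every entry in @INC is a plain directory name.

```perl
#!/usr/bin/perl -w
use strict;

# check_dirs is a hypothetical helper written for this illustration.
# It returns one "STATUS<TAB>path" string per directory given, where
# STATUS is "ok" if the directory exists and "MISSING" otherwise.
sub check_dirs {
    my @report;
    foreach my $dir (@_) {
        my $status = (-d $dir) ? "ok" : "MISSING";
        push @report, "$status\t$dir";
    }
    return @report;
}

# Check the search path compiled into the perl running this script.
foreach my $line (check_dirs(@INC)) {
    print "$line\n";
}
```

Any line marked MISSING names a directory that this perl expects but that is absent on the current machine, which is the usual symptom of a binary copied from somewhere else.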
You might also want to check out How do I keep my own module/library directory? in perlfaq8.
I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed.
How do I make it work?
Read the INSTALL file, which is part of the source distribution. It describes in detail how to cope with most
idiosyncrasies that the Configure script can't work around for any given system or architecture.
What modules and extensions are available for Perl? What is CPAN? What does CPAN/src/...
mean?
CPAN stands for Comprehensive Perl Archive Network, a huge archive replicated on dozens of machines all
over the world. CPAN contains source code, non−native ports, documentation, scripts, and many
third−party modules and extensions, designed for everything from commercial database interfaces to
keyboard/screen control to web walking and CGI scripts. The master machine for CPAN is
ftp://ftp.funet.fi/pub/languages/perl/CPAN/, but you can use the address
http://www.perl.com/CPAN/CPAN.html to fetch a copy from a "site near you". See
http://www.perl.com/CPAN (without a slash at the end) for how this process works.
CPAN/path/... is a naming convention for files available on CPAN sites. CPAN indicates the base directory
of a CPAN mirror, and the rest of the path is the path from that directory to the file. For instance, if you‘re
using ftp://ftp.funet.fi/pub/languages/perl/CPAN as your CPAN site, the file CPAN/misc/japh is
downloadable as ftp://ftp.funet.fi/pub/languages/perl/CPAN/misc/japh .
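The naming convention amounts to a simple string substitution, which the following sketch illustrates. The function name cpan_url is hypothetical and exists only for this example. The mirror address is the one used in the text above; substitute your own mirror's base directory.

```perl
#!/usr/bin/perl -w
use strict;

# cpan_url is a hypothetical helper written for this illustration.
# It turns a "CPAN/..." name into a full URL on the given mirror.
sub cpan_url {
    my ($mirror, $name) = @_;
    $mirror =~ s{/+$}{};      # drop any trailing slash on the mirror base
    $name   =~ s{^CPAN/}{};   # "CPAN" stands for the mirror's base directory
    return "$mirror/$name";
}

print cpan_url("ftp://ftp.funet.fi/pub/languages/perl/CPAN",
               "CPAN/misc/japh"), "\n";
```

Run as written, this prints ftp://ftp.funet.fi/pub/languages/perl/CPAN/misc/japh, matching the example in the text.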
Considering that there are hundreds of existing modules in the archive, one probably exists to do nearly
anything you can think of. Current categories under CPAN/modules/by−category/ include perl core modules;
development support; operating system interfaces; networking, devices, and interprocess communication;
data type utilities; database interfaces; user interfaces; interfaces to other languages; filenames, file systems,
and file locking; internationalization and locale; world wide web support; server and daemon utilities;
archiving and compression; image manipulation; mail and news; control flow utilities; filehandle and I/O;
Microsoft Windows modules; and miscellaneous modules.
Is there an ISO or ANSI certified version of Perl?
Certainly not. Larry expects that he‘ll be certified before Perl is.
Where can I get information on Perl?
The complete Perl documentation is available with the perl distribution. If you have perl installed locally,
you probably have the documentation installed as well: type man perl if you‘re on a system resembling
Unix. This will lead you to other important man pages, including how to set your $MANPATH. If you‘re not
on a Unix system, access to the documentation will be different; for example, it might be only in HTML
format. But all proper perl installations have fully−accessible documentation.
You might also try perldoc perl in case your system doesn‘t have a proper man command, or it‘s been
misinstalled. If that doesn‘t work, try looking in /usr/local/lib/perl5/pod for documentation.
If all else fails, consult the CPAN/doc directory, which contains the complete documentation in various
formats, including native pod, troff, html, and plain text. There‘s also a web page at
http://www.perl.com/perl/info/documentation.html that might help.
Many good books have been written about Perl — see the section below for more details.
What are the Perl newsgroups on USENET? Where do I post questions?
The now defunct comp.lang.perl newsgroup has been superseded by the following groups:
comp.lang.perl.announce              Moderated announcement group
comp.lang.perl.misc                  Very busy group about Perl in general
comp.lang.perl.moderated             Moderated discussion group
comp.lang.perl.modules               Use and development of Perl modules
comp.lang.perl.tk                    Using Tk (and X) from Perl

comp.infosystems.www.authoring.cgi   Writing CGI scripts for the Web.

Actually, the moderated group hasn‘t passed yet, but we‘re keeping our fingers crossed.
There is also a USENET gateway to the mailing list used by the crack Perl development team (perl5-porters)
at news://news.perl.com/perl.porters-gw/ .

Where should I post source code?
You should post source code to whichever group is most appropriate, but feel free to cross−post to
comp.lang.perl.misc. If you want to cross−post to alt.sources, please make sure it follows their posting
standards, including setting the Followup−To header line to NOT include alt.sources; see their FAQ for
details.
If you‘re just looking for software, first use Alta Vista, Deja News, and search CPAN. This is faster and
more productive than just posting a request.
Perl Books
A number of books on Perl and/or CGI programming are available. A few of these are good, some are ok,
but many aren‘t worth your money. Tom Christiansen maintains a list of these books, some with extensive
reviews, at http://www.perl.com/perl/critiques/index.html.
The incontestably definitive reference book on Perl, written by the creator of Perl, is now in its second
edition:
Programming Perl (the "Camel Book"):
Authors: Larry Wall, Tom Christiansen, and Randal Schwartz
ISBN 1-56592-149-6 (English)
ISBN 4-89052-384-7 (Japanese)
URL: http://www.oreilly.com/catalog/pperl2/
(French, German, Italian, and Hungarian translations also available)
The companion volume to the Camel containing thousands of real−world examples, mini−tutorials, and
complete programs (first premiering at the 1998 Perl Conference), is:
The Perl Cookbook (the "Ram Book"):
Authors: Tom Christiansen and Nathan Torkington,
with Foreword by Larry Wall
ISBN: 1−56592−243−3
URL: http://perl.oreilly.com/cookbook/
If you‘re already a hard−core systems programmer, then the Camel Book might suffice for you to learn Perl
from. But if you‘re not, check out:
Learning Perl (the "Llama Book"):
Authors: Randal Schwartz and Tom Christiansen
with Foreword by Larry Wall
ISBN: 1−56592−284−0
URL: http://www.oreilly.com/catalog/lperl2/
Despite the picture at the URL above, the second edition of the "Llama Book" really has a blue cover, and is
updated for the 5.004 release of Perl. Various foreign language editions are available, including Learning
Perl on Win32 Systems (the Gecko Book).
If you‘re not an accidental programmer, but a more serious and possibly even degreed computer scientist
who doesn‘t need as much hand−holding as we try to provide in the Llama or its defurred cousin the Gecko,
please check out the delightful book, Perl: The Programmer‘s Companion, written by Nigel Chapman.
You can order O‘Reilly books directly from O‘Reilly & Associates, 1−800−998−9938. Local/overseas is
1−707−829−0515. If you can locate an O‘Reilly order form, you can also fax to 1−707−829−0104. See
http://www.ora.com/ on the Web.
What follows is a list of the books that the FAQ authors found personally useful. Your mileage may (but, we
hope, probably won‘t) vary.
Recommended books on (or muchly on) Perl follow; those marked with a star may be ordered from
O‘Reilly.

References
*Programming Perl
by Larry Wall, Tom Christiansen, and Randal L. Schwartz
*Perl 5 Desktop Reference
by Johan Vromans
Tutorials
*Learning Perl [2nd edition]
by Randal L. Schwartz and Tom Christiansen
with foreword by Larry Wall
*Learning Perl on Win32 Systems
by Randal L. Schwartz, Erik Olson, and Tom Christiansen,
with foreword by Larry Wall
Perl: The Programmer’s Companion
by Nigel Chapman
Cross−Platform Perl
by Eric F. Johnson
MacPerl: Power and Ease
by Vicki Brown and Chris Nandor, foreword by Matthias Neeracher
Task−Oriented
*The Perl Cookbook
by Tom Christiansen and Nathan Torkington
with foreword by Larry Wall
Perl5 Interactive Course [2nd edition]
by Jon Orwant
*Advanced Perl Programming
by Sriram Srinivasan
Effective Perl Programming
by Joseph Hall
Special Topics
*Mastering Regular Expressions
by Jeffrey Friedl
How to Set up and Maintain a World Wide Web Site [2nd edition]
by Lincoln Stein
Perl in Magazines
The first and only periodical devoted to All Things Perl, The Perl Journal contains tutorials, demonstrations,
case studies, announcements, contests, and much more. TPJ has columns on web development, databases,
Win32 Perl, graphical programming, regular expressions, and networking, and sponsors the Obfuscated Perl
Contest. It is published quarterly under the gentle hand of its editor, Jon Orwant. See http://www.tpj.com/
or send mail to subscriptions@tpj.com.
Beyond this, magazines that frequently carry high−quality articles on Perl are Web Techniques (see
http://www.webtechniques.com/), Performance Computing (http://www.performance−computing.com/), and
Usenix's newsletter/magazine to its members, login:, at http://www.usenix.org/. Randal's Web Techniques
columns are available on the web at http://www.stonehenge.com/merlyn/WebTechniques/.

Perl on the Net: FTP and WWW Access
To get the best (and possibly cheapest) performance, pick a site from the list below and use it to grab the
complete list of mirror sites. From there you can find the quickest site for you. Remember, the following list
is not the complete list of CPAN mirrors.
http://www.perl.com/CPAN     (redirects to another mirror)
http://www.perl.org/CPAN
ftp://ftp.funet.fi/pub/languages/perl/CPAN/
http://www.cs.ruu.nl/pub/PERL/CPAN/
ftp://ftp.cs.colorado.edu/pub/perl/CPAN/
What mailing lists are there for perl?
Most of the major modules (tk, CGI, libwww−perl) have their own mailing lists. Consult the documentation
that came with the module for subscription information. The following is a list of mailing lists related to
perl itself.
If you subscribe to a mailing list, it behooves you to know how to unsubscribe from it. Strident pleas to the
list itself to get you off will not be favorably received.
MacPerl
There is a mailing list for discussing Macintosh Perl. Contact "mac−perl−request@iis.ee.ethz.ch".
Also see Matthias Neeracher‘s (the creator and maintainer of MacPerl) webpage at
http://www.iis.ee.ethz.ch/~neeri/macintosh/perl.html for many links to interesting MacPerl sites, and
the applications/MPW tools, precompiled.
Perl5−Porters
The core development team have a mailing list for discussing fixes and changes to the language. Send
mail to "perl5−porters−request@perl.org" with help in the body of the message for information on
subscribing.
NTPerl
This list is used to discuss issues involving Win32 Perl 5 (Windows NT and Win95). Subscribe by
mailing ListManager@ActiveWare.com with the message body:
subscribe Perl−Win32−Users
The list software, also written in perl, will determine your address and subscribe you
automatically. To unsubscribe, mail the following in the message body to the same address:
unsubscribe Perl−Win32−Users
You can also check http://www.activeware.com/ and select "Mailing Lists" to join or leave this list.
Perl−Packrats
Discussion related to archiving of perl materials, particularly the Comprehensive Perl Archive
Network (CPAN). Subscribe by emailing majordomo@cis.ufl.edu:
subscribe perl−packrats
The list software, also written in perl, will determine your address and subscribe you
automatically. To unsubscribe, simply prepend "un" to the same command, and mail it to the same
address like so:
unsubscribe perl−packrats
Archives of comp.lang.perl.misc
Have you tried Deja News or Alta Vista?
ftp.cis.ufl.edu:/pub/perl/comp.lang.perl.*/monthly has an almost complete collection dating back to 12/89
(missing 08/91 through 12/93). They are kept as one large file for each month.

You'll probably want a more sophisticated query and retrieval mechanism than a file listing, preferably one
that allows you to retrieve articles using fast-access indices, keyed on at least author, date, subject, thread
(as in "trn") and probably keywords. The best solution the FAQ authors know of is the MH pick command,
but it is very slow to select on 18000 articles.
If you have, or know where can be found, the missing sections, please let perlfaq−suggestions@perl.com
know.
Where can I buy a commercial version of Perl?
In a sense, Perl already is commercial software: It has a licence that you can grab and carefully read to your
manager. It is distributed in releases and comes in well−defined packages. There is a very large user
community and an extensive literature. The comp.lang.perl.* newsgroups and several of the mailing lists
provide free answers to your questions in near real−time. Perl has traditionally been supported by Larry,
dozens of software designers and developers, and thousands of programmers, all working for free to create a
useful thing to make life better for everyone.
However, these answers may not suffice for managers who require a purchase order from a company whom
they can sue should anything go wrong. Or maybe they need very serious hand−holding and contractual
obligations. Shrink−wrapped CDs with perl on them are available from several sources if that will help.
Or you can purchase a real support contract. Although Cygnus historically provided this service, they no
longer sell support contracts for Perl. Instead, the Paul Ingram Group will be taking up the slack through The
Perl Clinic. The following is a commercial from them:
"Do you need professional support for Perl and/or Oraperl? Do you need a support contract with defined
levels of service? Do you want to pay only for what you need?
"The Paul Ingram Group has provided quality software development and support services to some of the
world‘s largest corporations for ten years. We are now offering the same quality support services for Perl at
The Perl Clinic. This service is led by Tim Bunce, an active perl porter since 1994 and well known as the
author and maintainer of the DBI, DBD::Oracle, and Oraperl modules and author/co−maintainer of The Perl
5 Module List. We also offer Oracle users support for Perl5 Oraperl and related modules (which Oracle is
planning to ship as part of Oracle Web Server 3). 20% of the profit from our Perl support work will be
donated to The Perl Institute."
For more information, contact The Perl Clinic:
Tel:    +44 1483 424424
Fax:    +44 1483 419419
Web:    http://www.perl.co.uk/
Email:  perl−support−info@perl.co.uk or Tim.Bunce@ig.co.uk

See also www.perl.com for updates on training and support.
Where do I send bug reports?
If you are reporting a bug in the perl interpreter or the modules shipped with perl, use the perlbug program in
the perl distribution or mail your report to perlbug@perl.com.
If you are posting a bug with a non−standard port (see the answer to "What platforms is Perl available for?"),
a binary distribution, or a non−standard module (such as Tk, CGI, etc), then please see the documentation
that came with it to determine the correct place to post bugs.
Read the perlbug(1) man page (perl5.004 or later) for more information.
What is perl.com? perl.org? The Perl Institute?
The perl.com domain is managed by Tom Christiansen, who created it as a public service long before
perl.org came about. Despite the name, it‘s a pretty non−commercial site meant to be a clearinghouse for
information about all things Perlian, accepting no paid advertisements, bouncy happy gifs, or silly java
applets on its pages. The Perl Home Page at http://www.perl.com/ is currently hosted on a T3 line courtesy
of Songline Systems, a software−oriented subsidiary of O‘Reilly and Associates.

perl.org is the official vehicle for The Perl Institute. The motto of TPI is "helping people help Perl help
people" (or something like that). It‘s a non−profit organization supporting development, documentation, and
dissemination of perl.
How do I learn about object−oriented Perl programming?
perltoot (distributed with 5.004 or later) is a good place to start. Also, perlobj, perlref, and perlmod are
useful references, while perlbot has some excellent tips and tricks.
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or
otherwise), this work is covered under Perl‘s Artistic Licence. For separate distributions of all or part of
this FAQ outside of that, see perlfaq.
Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged
to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit to the FAQ would be courteous but is not required.

NAME
perlfaq3 − Programming Tools ($Revision: 1.29 $, $Date: 1998/08/05 11:57:04 $)
DESCRIPTION
This section of the FAQ answers questions related to programmer tools and programming support.
How do I do (anything)?
Have you looked at CPAN (see perlfaq2)? The chances are that someone has already written a module that
can solve your problem. Have you read the appropriate man pages? Here‘s a brief index:
Basics           perldata, perlvar, perlsyn, perlop, perlsub
Execution        perlrun, perldebug
Functions        perlfunc
Objects          perlref, perlmod, perlobj, perltie
Data Structures  perlref, perllol, perldsc
Modules          perlmod, perlmodlib, perlsub
Regexps          perlre, perlfunc, perlop, perllocale
Moving to perl5  perltrap, perl
Linking w/C      perlxstut, perlxs, perlcall, perlguts, perlembed
Various          http://www.perl.com/CPAN/doc/FMTEYEWTK/index.html
                 (not a man−page but still useful)

perltoc provides a crude table of contents for the perl man page set.
How can I use Perl interactively?
The typical approach uses the Perl debugger, described in the perldebug(1) man page, on an ‘‘empty‘’
program, like this:
    perl -de 42
Now just type in any legal Perl code, and it will be immediately evaluated. You can also examine the
symbol table, get stack backtraces, check variable values, set breakpoints, and perform other operations
typically found in symbolic debuggers.
Is there a Perl shell?
In general, no. The Shell.pm module (distributed with perl) makes perl try commands which aren‘t part of
the Perl language as shell commands. perlsh from the source distribution is simplistic and uninteresting, but
may still be what you want.
How do I debug my Perl programs?
Have you used −w? It enables warnings for dubious practices.
Have you tried use strict? It prevents you from using symbolic references, makes you predeclare any
subroutines that you call as bare words, and (probably most importantly) forces you to predeclare your
variables with my or use vars.
Did you check the returns of each and every system call? The operating system (and thus Perl) tells you
whether they worked or not, and if not why.
open(FH, "> /etc/cantwrite")
or die "Couldn’t write to /etc/cantwrite: $!\n";
Did you read perltrap? It‘s full of gotchas for old and new Perl programmers, and even has sections for
those of you who are upgrading from languages like awk and C.
Have you tried the Perl debugger, described in perldebug? You can step through your program and see what
it‘s doing and thus work out why what it‘s doing isn‘t what it should be doing.
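The first three suggestions above can be combined in one small script. This is only a sketch: the path is hypothetical and chosen so that open() is expected to fail, and the eval is there purely so the demonstration can keep running after strict rejects the misspelled variable.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical path, chosen so that the open() is expected to fail.
my $path = "/no/such/dir/example.txt";

# Check the return value of the system call; $! says why it failed.
my $opened = open(FH, "< $path");
if ($opened) {
    close(FH);
} else {
    print "Couldn't read $path: $!\n";
}

# Under use strict, a misspelled (undeclared) variable is a compile-time
# error. The eval only lets this demonstration continue afterwards.
my $compiled = eval q{ $mispelled_total = 1; 1 };
print "strict says: $@" unless $compiled;
```

In a real program you would simply let the compile-time error stop the script and fix the typo.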

How do I profile my Perl programs?
You should get the Devel::DProf module from CPAN, and also use Benchmark.pm from the standard
distribution. Benchmark lets you time specific portions of your code, while Devel::DProf gives detailed
breakdowns of where your code spends its time.
Here‘s a sample use of Benchmark:
    use Benchmark;
    @junk = `cat /etc/motd`;
    $count = 10_000;
    timethese($count, {
        'map' => sub { my @a = @junk;
                       map { s/a/b/ } @a;
                       return @a
                     },
        'for' => sub { my @a = @junk;
                       local $_;
                       for (@a) { s/a/b/ };
                       return @a },
    });
This is what it prints (on one machine—your results will be dependent on your hardware, operating system,
and the load on your machine):
Benchmark: timing 10000 iterations of for, map...
for: 4 secs ( 3.97 usr 0.01 sys = 3.98 cpu)
map: 6 secs ( 4.97 usr 0.00 sys = 4.97 cpu)
How do I cross−reference my Perl programs?
The B::Xref module, shipped with the new, alpha−release Perl compiler (not the general distribution prior to
the 5.005 release), can be used to generate cross−reference reports for Perl programs.
    perl -MO=Xref[,OPTIONS] scriptname.plx
Is there a pretty−printer (formatter) for Perl?
There is no program that will reformat Perl as much as indent(1) does for C. The complex feedback between
the scanner and the parser (this feedback is what confuses the vgrind and emacs programs) makes it
challenging at best to write a stand−alone Perl parser.
Of course, if you simply follow the guidelines in perlstyle, you shouldn‘t need to reformat. The habit of
formatting your code as you write it will help prevent bugs. Your editor can and should help you with this.
The perl−mode for emacs can provide a remarkable amount of help with most (but not all) code, and even
less programmable editors can provide significant assistance.
If you are used to using the vgrind program for printing out nice code to a laser printer, you can take a stab at
this using http://www.perl.com/CPAN/doc/misc/tips/working.vgrind.entry, but the results are not particularly
satisfying for sophisticated code.
Is there a ctags for Perl?
There‘s a simple one at http://www.perl.com/CPAN/authors/id/TOMC/scripts/ptags.gz which may do the
trick.
Where can I get Perl macros for vi?
For a complete version of Tom Christiansen‘s vi configuration file, see
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc, the standard benchmark file for vi
emulators. This runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built
with an embedded Perl interpreter — see http://www.perl.com/CPAN/src/misc.

Where can I get perl−mode for emacs?
Since Emacs version 19 patchlevel 22 or so, there have been both a perl−mode.el and support for the perl
debugger built in. These should come with the standard Emacs 19 distribution.
In the perl source directory, you‘ll find a directory called "emacs", which contains a cperl−mode that
color−codes keywords, provides context−sensitive help, and other nifty things.
Note that the perl−mode of emacs will have fits with "main‘foo" (single quote), and mess up the
indentation and highlighting. You should be using "main::foo" in new Perl code anyway, so this shouldn‘t
be an issue.
How can I use curses with Perl?
The Curses module from CPAN provides a dynamically loadable object module interface to a curses library.
A small demo can be found at the directory
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/rep; this program repeats a command and
updates the screen as needed, rendering rep ps axu similar to top.
How can I use X or Tk with Perl?
Tk is a completely Perl−based, object−oriented interface to the Tk toolkit that doesn‘t force you to use Tcl
just to get at Tk. Sx is an interface to the Athena Widget set. Both are available from CPAN. See the
directory http://www.perl.com/CPAN/modules/by−category/08_User_Interfaces/
Invaluable for Perl/Tk programming are: the Perl/Tk FAQ at
http://w4.lns.cornell.edu/~pvhp/ptk/ptkTOC.html , the Perl/Tk Reference Guide available at
http://www.perl.com/CPAN−local/authors/Stephen_O_Lidie/ , and the online manpages at
http://www−users.cs.umn.edu/~amundson/perl/perltk/toc.html .
How can I generate simple menus without using CGI or Tk?
The http://www.perl.com/CPAN/authors/id/SKUNZ/perlmenu.v4.0.tar.gz module, which is curses−based,
can help with this.
What is undump?
See the next question, whose answer describes undump.
How can I make my Perl program run faster?
The best way to do this is to come up with a better algorithm. This can often make a dramatic difference.
Chapter 8 in the Camel has some efficiency tips in it you might want to look at. Jon Bentley‘s book
‘‘Programming Pearls‘’ (that‘s not a misspelling!) has some good tips on optimization, too. Advice on
benchmarking boils down to: benchmark and profile to make sure you‘re optimizing the right part, look for
better algorithms instead of microtuning your code, and when all else fails consider just buying faster
hardware.
A different approach is to autoload seldom−used Perl code. See the AutoSplit and AutoLoader modules in
the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in
C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C is
the use of modules that have critical sections written in C (for instance, the PDL module from CPAN).
In some cases, it may be worth it to use the backend compiler to produce byte code (saving compilation
time) or compile into C, which will certainly save compilation time and sometimes a small amount (but not
much) of execution time. See the question about compiling your Perl programs for more on the compiler; the
wins aren‘t as obvious as you‘d hope.
If you‘re currently linking your perl executable to a shared libc.so, you can often gain a 10−25%
performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl
executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the
source distribution for more information.
Unsubstantiated reports allege that Perl interpreters that use sfio outperform those that don‘t (for IO intensive
applications). To try this, see the INSTALL file in the source distribution, especially the ‘‘Selecting File IO
mechanisms‘’ section.
The undump program was an old attempt to speed up your Perl program by storing the already−compiled
form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn‘t a good
solution anyway.
How can I make my Perl program take less memory?
When it comes to time−space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in
Perl use more memory than strings in C, arrays take more than that, and hashes use even more. While there‘s still
a lot to be done, recent releases have been addressing these issues. For example, as of 5.004, duplicate hash
keys are shared amongst all hashes using them, so require no reallocation.
In some cases, using substr() or vec() to simulate arrays can be highly beneficial. For example, an
array of a thousand booleans will take at least 20,000 bytes of space, but it can be turned into one 125−byte
bit vector for a considerable memory savings. The standard Tie::SubstrHash module can also help for
certain types of data structure. If you‘re working with specialist data structures (matrices, for instance)
modules that implement these in C may use less memory than equivalent Perl modules.
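The bit-vector idea above can be sketched with vec(). The variable names and the flag positions are made up for illustration; the point is that a thousand one-bit flags fit in a 125-byte string.

```perl
use strict;

my $flags = '';                    # the whole "array" lives in one string
vec($flags, 999, 1) = 0;           # touching bit 999 extends it to 1000 bits

# Set a few hypothetical flags, one bit each.
foreach my $index (3, 42, 999) {
    vec($flags, $index, 1) = 1;
}

my $bytes_used = length($flags);            # 1000 bits / 8 = 125 bytes
my $flag_42    = vec($flags, 42, 1);        # 1
my $flag_43    = vec($flags, 43, 1);        # 0
my $set_count  = unpack("%32b*", $flags);   # number of bits set: 3

print "$bytes_used bytes, flag 42 = $flag_42, flag 43 = $flag_43, $set_count set\n";
```

The trade-off is the usual one: each access goes through vec() rather than a plain subscript, so you pay a little time to save a lot of space.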
Another thing to try is learning whether your Perl was compiled with the system malloc or with Perl‘s builtin
malloc. Whichever one it is, try using the other one and see whether this makes a difference. Information
about malloc is in the INSTALL file in the source distribution. You can find out whether you are using
perl‘s malloc by typing perl −V:usemymalloc.
Is it unsafe to return a pointer to local data?
No, Perl‘s garbage collection system takes care of this.
sub makeone {
my @a = ( 1 .. 10 );
return \@a;
}
for $i ( 1 .. 10 ) {
push @many, makeone();
}
print $many[4][5], "\n";
print "@many\n";
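As a further illustration (a sketch with made-up names, not part of the original answer), each call below creates a fresh lexical array, and the returned reference keeps that array alive after the subroutine has returned. Changing one returned array leaves the other untouched.

```perl
use strict;

sub make_list {
    my @items = (1 .. 3);      # a fresh @items on every call
    return \@items;            # the reference keeps it alive after return
}

my $first  = make_list();
my $second = make_list();
push @$first, 4;               # changes only the first array

my $first_count  = scalar(@$first);    # 4
my $second_count = scalar(@$second);   # 3
print "first has $first_count items, second has $second_count\n";
print "distinct arrays\n" if $first != $second;
```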
How can I free an array or hash so my program shrinks?
You can‘t. On most operating systems, memory allocated to a program can never be returned to the system.
That‘s why long−running programs sometimes re−exec themselves. Some operating systems (notably,
FreeBSD) allegedly reclaim large chunks of memory that are no longer used, but it doesn‘t appear to happen
with Perl (yet). The Mac appears to be the only platform that will reliably (albeit, slowly) return memory to
the OS.
However, judicious use of my() on your variables will help make sure that they go out of scope so that Perl
can free up their storage for use in other parts of your program. A global variable, of course, never goes out
of scope, so you can‘t get its space automatically reclaimed, although undef()ing and/or delete()ing it
will achieve the same effect. In general, memory allocation and de−allocation isn‘t something you can or
should be worrying about much in Perl, but even this capability (preallocation of data types) is in the works.
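A minimal sketch of the two approaches just described, assuming hypothetical names throughout: a lexical array whose storage becomes reusable when its subroutine returns, and package globals that have to be released by hand with undef() and delete().

```perl
use strict;
use vars qw(@cache %seen);     # hypothetical package globals

sub summarize {
    my @scratch = (1 .. 50_000);   # lexical: goes out of scope on return
    my $sum = 0;
    foreach my $n (@scratch) { $sum += $n; }
    return $sum;                   # @scratch's storage can now be reused
}

@cache = (1 .. 50_000);
%seen  = (apple => 1, pear => 1);

my $total = summarize();
undef @cache;                  # release the global array's storage
delete $seen{apple};           # release one hash entry

print "total=$total cache=", scalar(@cache), " seen=", scalar(keys %seen), "\n";
```

Neither step returns memory to the operating system; it only makes the space available to the rest of the same program.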
How can I make my CGI script more efficient?
Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has
additional issues. It may be run several times per second. Given that each time it runs it will need to be
re−compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling
into C isn‘t going to help you because the process start−up overhead is where the bottleneck is.
There are two popular ways to avoid this overhead. One solution involves running the Apache HTTP server
(available from http://www.apache.org/) with either of the mod_perl or mod_fastcgi plugin modules.

With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an
embedded Perl interpreter which pre−compiles your script and then executes it within the same address
space without forking. The Apache extension also gives Perl access to the internal server API, so modules
written in Perl can do just about anything a module written in C can. For more on mod_perl, see
http://perl.apache.org/
With the FCGI module (from CPAN), a Perl executable compiled with sfio (see the INSTALL file in the
distribution) and the mod_fastcgi module (available from http://www.fastcgi.com/) each of your perl scripts
becomes a permanent CGI daemon process.
Both of these solutions can have far−reaching effects on your system and on the way you write your CGI
scripts, so investigate them with care.
See http://www.perl.com/CPAN/modules/by−category/15_World_Wide_Web_HTML_HTTP_CGI/ .
A non−free, commercial product, ‘‘The Velocity Engine for Perl‘’, (http://www.binevolve.com/ or
http://www.binevolve.com/bine/vep) might also be worth looking at. It will allow you to increase the
performance of your perl scripts, up to 25 times faster than normal CGI perl by running in persistent perl
mode, or 4 to 5 times faster without any modification to your existing CGI scripts. Fully functional
evaluation copies are available from the web site.
How can I hide the source for my Perl program?
Delete it. :−) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of
‘‘security‘’.
First of all, however, you can‘t take away read permission, because the source code has to be readable in
order to be compiled and interpreted. (That doesn‘t mean that a CGI script‘s source is readable by people on
the web, though, only by people with access to the filesystem.) So you have to leave the permissions at the
socially friendly 0755 level.
Some people regard this as a security problem. If your program does insecure things, and relies on people
not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine
the insecure things and exploit them without viewing the source. Security through obscurity, the name for
hiding your bugs instead of fixing them, is little security indeed.
You can try using encryption via source filters (Filter::* from CPAN), but crackers might be able to decrypt
it. You can try using the byte code compiler and interpreter described below, but crackers might be able to
de−compile it. You can try using the native−code compiler described below, but crackers might be able to
disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can
definitively conceal it (this is true of every language, not just Perl).
If you‘re concerned about people profiting from your code, then the bottom line is that nothing but a
restrictive licence will give you legal security. License your software and pepper it with threatening
statements like ‘‘This is unpublished proprietary software of XYZ Corp. Your access to it does not give you
permission to use it blah blah blah.‘’ We are not lawyers, of course, so you should see a lawyer if you want
to be sure your licence‘s wording will stand up in court.
How can I compile my Perl program into byte code or C?
Malcolm Beattie has written a multifunction backend compiler, available from CPAN, that can do both these
things. It is included in the perl5.005 release, but is still considered experimental. This means it‘s fun to play
with if you‘re a programmer but not really for people looking for turn−key solutions.
Merely compiling into C does not in and of itself guarantee that your code will run very much faster. That‘s
because except for lucky cases where a lot of native type inferencing is possible, the normal Perl run time
system is still present and so your program will take just as long to run and be just as big. Most programs
save little more than compilation time, leaving execution no more than 10−30% faster. A few rare programs
actually benefit significantly (like several times faster), but this takes some tweaking of your code.
You‘ll probably be astonished to learn that the current version of the compiler generates a compiled form of
your script whose executable is just as big as the original perl executable, and then some. That‘s because as
currently written, all programs are prepared for a full eval() statement. You can tremendously reduce this
cost by building a shared libperl.so library and linking against that. See the INSTALL podfile in the perl
source distribution for details. If you link your main perl binary with this, it will make it minuscule. For
example, on one author‘s system, /usr/bin/perl is only 11k in size!
In general, the compiler will do nothing to make a Perl program smaller, faster, more portable, or more
secure. In fact, it will usually hurt all of those. The executable will be bigger, your VM system may take
longer to load the whole thing, the binary is fragile and hard to fix, and compilation never stopped software
piracy in the form of crackers, viruses, or bootleggers. The real advantage of the compiler is merely
packaging, and once you see the size of what it makes (well, unless you use a shared libperl.so), you‘ll
probably want a complete Perl install anyway.
How can I get #!perl to work on [MS−DOS,NT,...]?
For OS/2 just use
extproc perl −S −your_switches
as the first line in *.cmd file (−S due to a bug in cmd.exe‘s ‘extproc’ handling). For DOS one should first
invent a corresponding batch file, and codify it in ALTERNATIVE_SHEBANG (see the INSTALL file in the
source distribution for more information).
The Win95/NT installation, when using the ActiveState port of Perl, will modify the Registry to associate the
.pl extension with the perl interpreter. If you install another port (Gurusamy Sarathy‘s is the
recommended Win95/NT port), or (eventually) build your own Win95/NT Perl using WinGCC, then you‘ll
have to modify the Registry yourself.
Macintosh perl scripts will have the appropriate Creator and Type, so that double−clicking them will
invoke the perl application.
IMPORTANT!: Whatever you do, PLEASE don‘t get frustrated, and just throw the perl interpreter into your
cgi−bin directory, in order to get your scripts working for a web server. This is an EXTREMELY big
security risk. Take the time to figure out how to do it correctly.
Can I write useful perl programs on the command line?
Yes. Read perlrun for more information. Some examples follow. (These assume standard Unix shell
quoting rules.)
    # sum first and last fields
    perl -lane 'print $F[0] + $F[-1]' *
    # identify text files
    perl -le 'for(@ARGV) {print if -f && -T _}' *
    # remove (most) comments from C program
    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c
    # make file a month younger than today, defeating reaper daemons
    perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *
    # find first unused uid
    perl -le '$i++ while getpwuid($i); print $i'
    # display reasonable manpath
    echo $PATH | perl -nl -072 -e '
        s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'
Ok, the last one was actually an obfuscated perl entry. :−)
Why don‘t perl one−liners work on my DOS/Mac/VMS system?
The problem is usually that the command interpreters on those systems have rather different ideas about
quoting than the Unix shells under which the one−liners were created. On some systems, you may have to
change single−quotes to double ones, which you must NOT do on Unix or Plan9 systems. You might also
have to change a single % to a %%.
For example:
    # Unix
    perl -e 'print "Hello world\n"'
    # DOS, etc.
    perl -e "print \"Hello world\n\""
    # Mac
    print "Hello world\n"
     (then Run "Myscript" or Shift-Command-R)
    # VMS
    perl -e "print ""Hello world\n"""
The problem is that none of this is reliable: it depends on the command interpreter. Under Unix, the first two
often work. Under DOS, it‘s entirely possible neither works. If 4DOS was the command shell, you‘d
probably have better luck like this:
    perl -e "print "Hello world\n""
Under the Mac, it depends which environment you are using. The MacPerl shell, or MPW, is much like
Unix shells in its support for several quoting variants, except that it makes free use of the Mac‘s non−ASCII
characters as control characters.
There is no general solution to all of this. It is a mess, pure and simple. Sucks to be away from Unix, huh?
:−)
[Some of this answer was contributed by Kenneth Albanowski.]
Where can I learn about CGI or Web programming in Perl?
For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to
web stuff in the question on books. For problems and questions related to the web, like ‘‘Why do I get 500
Errors‘’ or ‘‘Why doesn‘t it run from the browser right when it runs fine on the command line‘’, see these
sources:
WWW Security FAQ
http://www.w3.org/Security/Faq/
Web FAQ
http://www.boutell.com/faq/
CGI FAQ
http://www.webthing.com/page.cgi/cgifaq
HTTP Spec
http://www.w3.org/pub/WWW/Protocols/HTTP/
HTML Spec
http://www.w3.org/TR/REC−html40/
http://www.w3.org/pub/WWW/MarkUp/
CGI Spec
http://www.w3.org/CGI/
CGI Security FAQ
http://www.go2net.com/people/paulp/cgi−security/safe−cgi.txt
Where can I learn about object−oriented Perl programming?
perltoot is a good place to start, and you can use perlobj and perlbot for reference. Perltoot didn‘t come out
until the 5.004 release, but you can get a copy (in pod, html, or postscript) from
http://www.perl.com/CPAN/doc/FMTEYEWTK/ .

Where can I learn about linking C with Perl? [h2xs, xsubpp]
If you want to call C from Perl, start with perlxstut, moving on to perlxs, xsubpp, and perlguts. If you want
to call Perl from C, then read perlembed, perlcall, and perlguts. Don‘t forget that you can learn a lot from
looking at how the authors of existing extension modules wrote their code and solved their problems.
I‘ve read perlembed, perlguts, etc., but I can‘t embed perl in my C program, what am I doing wrong?
Download the ExtUtils::Embed kit from CPAN and run ‘make test’. If the tests pass, read the pods again
and again and again. If they fail, see perlbug and send a bugreport with the output of make test
TEST_VERBOSE=1 along with perl −V.
When I tried to run my script, I got this message. What does it mean?
perldiag has a complete list of perl‘s error messages and warnings, with explanatory text. You can also use
the splain program (distributed with perl) to explain the error messages:
    perl program 2>diag.out
    splain [-v] [-p] diag.out
or change your program to explain the messages for you:
    use diagnostics;
or
    use diagnostics -verbose;
What‘s MakeMaker?
This module (part of the standard perl distribution) is designed to write a Makefile for an extension module
from a Makefile.PL. For more information, see ExtUtils::MakeMaker.
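For orientation, a minimal Makefile.PL might look like the sketch below. The module name and version number are hypothetical; NAME and VERSION are ordinary WriteMakefile() parameters, and a real extension will usually set more of them.

```perl
# Makefile.PL for a hypothetical extension called My::Module
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'My::Module',   # hypothetical module name
    VERSION => '0.01',         # hypothetical version number
);
```

Running perl Makefile.PL then writes the Makefile, after which the usual make, make test, and make install steps apply.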
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or
otherwise), this work is covered under Perl‘s Artistic Licence. For separate distributions of all or part of
this FAQ outside of that, see perlfaq.
Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged
to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit to the FAQ would be courteous but is not required.

NAME
perlfaq4 − Data Manipulation ($Revision: 1.26 $, $Date: 1998/08/05 12:04:00 $)
DESCRIPTION
This section of the FAQ answers questions related to the manipulation of data as numbers, dates, strings,
arrays, hashes, and miscellaneous data issues.
Data: Numbers
Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
The infinite set that a mathematician thinks of as the real numbers can only be approximated on a computer,
since the computer only has a finite number of bits to store an infinite number of, um, numbers.
Internally, your computer represents floating−point numbers in binary. Floating−point numbers read in from
a file or appearing as literals in your program are converted from their decimal floating−point representation
(eg, 19.95) to the internal binary representation.
However, 19.95 can‘t be precisely represented as a binary floating−point number, just like 1/3 can‘t be
exactly represented as a decimal floating−point number. The computer‘s binary representation of 19.95,
therefore, isn‘t exactly 19.95.
When a floating−point number gets printed, the binary floating−point representation is converted back to
decimal. These decimal numbers are displayed in either the format you specify with printf(), or the
current output format for numbers (see $# in perlvar) if you use print. $# has a different default value
in Perl5 than it did in Perl4. Changing $# yourself is deprecated.
This affects all computer languages that represent decimal floating−point numbers in binary, not just Perl.
Perl provides arbitrary−precision decimal numbers with the Math::BigFloat module (part of the standard Perl
distribution), but mathematical operations are consequently slower.
To get rid of the superfluous digits, just use a format (eg, printf("%.2f", 19.95)) to get the required
precision. See Floating−point Arithmetic in perlop.
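The effect is easy to reproduce. The sketch below (variable names are illustrative) prints 19.95 with more digits than the default output format shows, then with a two-place format, and finally shows why sums of decimal fractions are better compared against a small tolerance than with ==. The tolerance of 1e-9 is an arbitrary choice for this example.

```perl
use strict;

my $price = 19.95;
printf "%.17g\n", $price;      # many digits: exposes the stored approximation
printf "%.2f\n",  $price;      # two places: prints 19.95

# Adding 0.1 ten times accumulates the representation error.
my $sum = 0;
foreach my $step (1 .. 10) { $sum += 0.1; }

my $formatted    = sprintf("%.2f", $price);
my $close_enough = abs($sum - 1) < 1e-9;     # compare with a tolerance
print "exactly 1? ", ($sum == 1 ? "yes" : "no"), "\n";
print "close to 1? ", ($close_enough ? "yes" : "no"), "\n";
```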
Why isn‘t my octal data interpreted correctly?
Perl only understands octal and hex numbers as such when they occur as literals in your program. If they are
read in from somewhere and assigned, no automatic conversion takes place. You must explicitly use oct()
or hex() if you want the values converted. oct() interprets both hex ("0x350") numbers and octal ones
("0350" or even without the leading "0", like "377"), while hex() only converts hexadecimal ones, with or
without a leading "0x", like "0x255", "3A", "ff", or "deadbeef".
This problem shows up most often when people try using chmod(), mkdir(), umask(), or
sysopen(), which all want permissions in octal.
chmod(644, $file); # WRONG −− perl −w catches this
chmod(0644, $file); # right
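A short sketch of the conversions described above, using made-up variable names. The same permission value is read as text in three spellings and converted with oct(); the last lines show what the wrong chmod() call really passes.

```perl
use strict;

# The same permission value arriving as text in three spellings.
my %value;
foreach my $text ("0644", "644", "0x1a4") {
    $value{$text} = oct($text);                  # 420 each time
    printf "%-6s -> %d\n", $text, $value{$text};
}

my $from_hex = hex("ff");          # 255; hex() never treats input as octal
my $literal  = 0644;               # an octal literal in source is already 420

# What chmod(644, $file) would really ask for: decimal 644 is octal 1204.
my $mistake = sprintf("%o", 644);
print "hex ff = $from_hex, literal = $literal, decimal 644 = octal $mistake\n";
```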
Does perl have a round function? What about ceil() and floor()? Trig functions?
Remember that int() merely truncates toward 0. For rounding to a certain number of digits, sprintf()
or printf() is usually the easiest route.
    printf("%.3f", 3.1415926535);       # prints 3.142

The POSIX module (part of the standard perl distribution) implements ceil(), floor(), and a number of
other mathematical and trigonometric functions.
use POSIX;
$ceil  = ceil(3.5);     # 4
$floor = floor(3.5);    # 3

In 5.000 to 5.003 Perls, trigonometry was done in the Math::Complex module. With 5.004, the Math::Trig
module (part of the standard perl distribution) implements the trigonometric functions. Internally it uses the
Math::Complex module and some functions can break out from the real axis into the complex plane, for
example the inverse sine of 2.
Rounding in financial applications can have serious implications, and the rounding method used should be
specified precisely. In these cases, it probably pays not to trust whichever system rounding is being used by
Perl, but to instead implement the rounding function you need yourself.
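One way to write such a function yourself is sketched here. It rounds halves away from zero, which is only one of several possible rules; the name round_half_away and its interface are made up for the example, so substitute whatever rule your application specifies.

```perl
use strict;
use warnings;

# Round to a given number of decimal places, with halves going away
# from zero.  This is only one possible rule.
sub round_half_away {
    my ($value, $places) = @_;
    my $scale   = 10 ** ($places || 0);
    my $scaled  = $value * $scale;
    my $rounded = $scaled < 0 ? -int(-$scaled + 0.5) : int($scaled + 0.5);
    return $rounded / $scale;
}

print round_half_away(2.5), "\n";
print round_half_away(-2.5), "\n";
print round_half_away(3.14159, 2), "\n";
```

Because of the binary representation discussed earlier, a value such as 1.005 is stored slightly below the half and rounds down here; keeping money as an integer number of cents avoids that.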
How do I convert bits into ints?
To turn a string of 1s and 0s like 10110110 into a scalar containing its binary value, use the pack()
function (documented in pack in perlfunc):
$decimal = ord(pack('B8', '10110110'));
Here's an example of going the other way:
$binary_string = join('', unpack('B*', "\x29"));
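Note that pack() on its own returns a one-byte string rather than a number, so a round trip looks something like the following sketch (variable names are illustrative), which wraps it in ord():

```perl
use strict;
use warnings;

# pack('B8', ...) yields a one-byte string; ord() turns that byte
# into its numeric value.
my $number = ord(pack('B8', '10110110'));

# And back again: unpack('B8', ...) gives the bit string for one byte.
my $bits = unpack('B8', chr($number));

print "$number\n";   # 182
print "$bits\n";     # 10110110
```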
How do I multiply matrices?
Use the Math::Matrix or Math::MatrixReal modules (available from CPAN) or the PDL extension (also
available from CPAN).
How do I perform an operation on a series of integers?
To call a function on each element in an array, and collect the results, use:
@results = map { my_func($_) } @array;
For example:
@triple = map { 3 * $_ } @single;
To call a function on each element of an array, but ignore the results:
foreach $iterator (@array) {
&my_func($iterator);
}
To call a function on each integer in a (small) range, you can use:
@results = map { &my_func($_) } (5 .. 25);
but you should be aware that the .. operator creates an array of all integers in the range. This can take a lot
of memory for large ranges. Instead use:
@results = ();
for ($i=5; $i < 500_005; $i++) {
push(@results, &my_func($i));
}
How can I output Roman numerals?
Get the http://www.perl.com/CPAN/modules/by−module/Roman module.
Why aren‘t my random numbers random?
The short explanation is that you‘re getting pseudorandom numbers, not random ones, because computers
are good at being predictable and bad at being random (despite appearances caused by bugs in your programs
:−). A longer explanation is available on http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy
of Tom Phoenix. John von Neumann said, ‘‘Anyone who attempts to generate random numbers by
deterministic means is, of course, living in a state of sin.‘’
You should also check out the Math::TrulyRandom module from CPAN. It uses the imperfections in your
system‘s timer to generate random numbers, but this takes quite a while. If you want a better pseudorandom
generator than comes with your operating system, look at ‘‘Numerical Recipes in C‘’ at
http://nr.harvard.edu/nr/bookc.html .

Data: Dates
How do I find the week−of−the−year/day−of−the−year?
The day of the year is in the array returned by localtime() (see localtime in perlfunc):
$day_of_year = (localtime(time()))[7];
or more legibly (in 5.004 or higher):
use Time::localtime;
$day_of_year = localtime(time())−>yday;
You can find the week of the year by dividing this by 7:
$week_of_year = int($day_of_year / 7);
Of course, this believes that weeks start at zero. The Date::Calc module from CPAN has a lot of date
calculation functions, including day of the year, week of the year, and so on. Note that not all businesses
consider "week 1" to be the same; for example, American businesses often consider the first week with a
Monday in it to be Work Week #1, despite ISO 8601, which considers WW1 to be the first week with a
Thursday in it.
How can I compare two dates and find the difference?
If you‘re storing your dates as epoch seconds then simply subtract one from the other. If you‘ve got a
structured date (distinct year, day, month, hour, minute, seconds values) then use one of the Date::Manip and
Date::Calc modules from CPAN.
How can I take a string and turn it into epoch seconds?
If it‘s a regular enough string that it always has the same format, you can split it up and pass the parts to
timelocal in the standard Time::Local module. Otherwise, you should look into the Date::Calc and
Date::Manip modules from CPAN.
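As a sketch of the split-and-pass approach, assume timestamps of the form "YYYY-MM-DD hh:mm:ss" in UTC. The format and the string_to_epoch name are invented for this example, and timegm (also from Time::Local) is used instead of timelocal so the result doesn't depend on your time zone:

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Assumed input format for this sketch: "YYYY-MM-DD hh:mm:ss" in UTC.
sub string_to_epoch {
    my ($string) = @_;
    my ($year, $mon, $mday, $hour, $min, $sec) =
        $string =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or return;                    # not in the expected format

    # Months are numbered 0..11, as in the list returned by gmtime().
    return timegm($sec, $min, $hour, $mday, $mon - 1, $year);
}

my $epoch = string_to_epoch("2001-11-13 01:00:00");
print "$epoch\n";
print scalar(gmtime($epoch)), "\n";
```

Swap in timelocal if your strings are in local time.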
How can I find the Julian Day?
Neither Date::Manip nor Date::Calc deal with Julian days. Instead, there is an example of Julian date
calculation that should help you in
http://www.perl.com/CPAN/authors/David_Muir_Sharnoff/modules/Time/JulianDay.pm.gz .
Does Perl have a year 2000 problem? Is Perl Y2K compliant?
Short answer: No, Perl does not have a Year 2000 problem. Yes, Perl is Y2K compliant. The programmers
you've hired to use it, however, probably are not.
Long answer: Perl is just as Y2K compliant as your pencil—no more, and no less. The date and time
functions supplied with perl (gmtime and localtime) supply adequate information to determine the year well
beyond 2000 (2038 is when trouble strikes for 32−bit machines). The year returned by these functions when
used in an array context is the year minus 1900. For years between 1910 and 1999 this happens to be a
2−digit decimal number. To avoid the year 2000 problem simply do not treat the year as a 2−digit number. It
isn‘t.
When gmtime() and localtime() are used in scalar context they return a timestamp string that
contains a fully−expanded year. For example, $timestamp = gmtime(1005613200) sets
$timestamp to "Tue Nov 13 01:00:00 2001". There‘s no year 2000 problem here.
That doesn‘t mean that Perl can‘t be used to create non−Y2K compliant programs. It can. But so can your
pencil. It‘s the fault of the user, not the language. At the risk of inflaming the NRA: ‘‘Perl doesn‘t break
Y2K, people do.‘’ See http://language.perl.com/news/y2k.html for a longer exposition.
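To make the difference concrete, here is a small sketch using the same timestamp as the example above; the variable names are illustrative:

```perl
use strict;
use warnings;

# Fixed timestamp so the sketch gives the same answer on every run.
my $when = 1005613200;

my @fields    = gmtime($when);
my $full_year = $fields[5] + 1900;      # right: 101 + 1900
my $broken    = "19" . $fields[5];      # wrong: pastes "19" onto "101"

print "correct: $full_year\n";   # 2001
print "broken:  $broken\n";      # 19101
```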
Data: Strings
How do I validate input?
The answer to this question is usually a regular expression, perhaps with auxiliary logic. See the more
specific questions (numbers, mail addresses, etc.) for details.

How do I unescape a string?
It depends just what you mean by ‘‘escape‘’. URL escapes are dealt with in perlfaq9. Shell escapes with the
backslash (\) character are removed with:
s/\\(.)/$1/g;
This won‘t expand "\n" or "\t" or any other special escapes.
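If you do want a few special escapes expanded as well, a lookup table is one way to go. This is a sketch only; the %special table and the unescape name are made up for the example:

```perl
use strict;
use warnings;

# Escapes this sketch chooses to expand; extend the table as needed.
my %special = (n => "\n", t => "\t");

sub unescape {
    my ($text) = @_;
    # A known letter becomes its control character; anything else
    # just loses its backslash.
    $text =~ s/\\(.)/exists $special{$1} ? $special{$1} : $1/ges;
    return $text;
}

print unescape('name:\tvalue with a \"quote\"'), "\n";
```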
How do I remove consecutive pairs of characters?
To turn "abbcccd" into "abccd":
s/(.)\1/$1/g;
How do I expand function calls in a string?
This is documented in perlref. In general, this is fraught with quoting and readability problems, but it is
possible. To interpolate a subroutine call (in list context) into a string:
print "My sub returned @{[mysub(1,2,3)]} that time.\n";
If you prefer scalar context, similar chicanery is also useful for arbitrary expressions:
print "That yields ${\($n + 5)} widgets\n";
Version 5.004 of Perl had a bug that gave list context to the expression in ${...}, but this is fixed in
version 5.005.
See also ‘‘How can I expand variables in text strings?‘’ in this section of the FAQ.
How do I find matching/nesting anything?
This isn‘t something that can be done in one regular expression, no matter how complicated. To find
something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1.
For multiple ones, then something more like /alpha(.*?)omega/ would be needed. But none of these
deals with nested patterns, nor can they. For that you‘ll have to write a parser.
If you are serious about writing a parser, there are a number of modules or oddities that will make your life a
lot easier. There is the CPAN module Parse::RecDescent, the standard module Text::Balanced, the byacc
program, and Mark−Jason Dominus‘s excellent py tool at http://www.plover.com/~mjd/perl/py/ .
One simple destructive, inside−out approach that you might try is to pull out the smallest nesting parts one at
a time:
while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) {
    # do something with $1
}
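Here is a small worked sketch of the inside-out idea on a made-up string. It drops the /g modifier so that each pass removes exactly one innermost pair and $1 always refers to the pair just removed:

```perl
use strict;
use warnings;

my $text = "x BEGIN a BEGIN b END c END y";
my @parts;

# Without /g, each pass removes exactly one innermost BEGIN...END pair,
# so $1 is always the body of the pair just removed.
while ($text =~ s/BEGIN((?:(?!BEGIN)(?!END).)*)END//s) {
    push @parts, $1;
}

print "[$_]\n" for @parts;   # innermost body first
print "left over: $text\n";
```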
How do I reverse a string?
Use reverse() in scalar context, as documented in reverse.
$reversed = reverse $string;
How do I expand tabs in a string?
You can do it yourself:
1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
Or you can just use the Text::Tabs module (part of the standard perl distribution).
use Text::Tabs;
@expanded_lines = expand(@lines_with_tabs);
How do I reformat a paragraph?
Use Text::Wrap (part of the standard perl distribution):

use Text::Wrap;
print wrap("\t", '  ', @paragraphs);
The paragraphs you give to Text::Wrap should not contain embedded newlines. Text::Wrap doesn‘t justify
the lines (flush−right).
How can I access/change the first N letters of a string?
There are many ways. If you just want to grab a copy, use substr():
$first_byte = substr($a, 0, 1);
If you want to modify part of a string, the simplest way is often to use substr() as an lvalue:
substr($a, 0, 3) = "Tom";
Although those with a pattern matching kind of thought process will likely prefer:
$a =~ s/^.../Tom/;
How do I change the Nth occurrence of something?
You have to keep track of N yourself. For example, let‘s say you want to change the fifth occurrence of
"whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively.
$count = 0;
s{((whom?)ever)}{
    ++$count == 5           # is it the 5th?
        ? "${2}soever"      # yes, swap
        : $1                # renege and leave it there
}igex;

In the more general case, you can use the /g modifier in a while loop, keeping count of matches.
$WANT = 3;
$count = 0;
while (/(\w+)\s+fish\b/gi) {
if (++$count == $WANT) {
print "The third fish is a $1 one.\n";
# Warning: don’t ‘last’ out of this loop
}
}
That prints out: "The third fish is a red one." You can also use a repetition count and
repeated pattern like this:
/(?:\w+\s+fish\s+){2}(\w+)\s+fish/i;
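The counting loop can be wrapped up in a small helper. This is a sketch: nth_match is a name invented here, and because it matches against its own private copy of the string, returning early does not leave a half-finished /g match behind on the caller's variable.

```perl
use strict;
use warnings;

# Return the first capture from the $want-th match of $regex in
# $string, or undef if there are fewer matches.
sub nth_match {
    my ($string, $regex, $want) = @_;
    my $count = 0;
    while ($string =~ /$regex/g) {
        return $1 if ++$count == $want;
    }
    return undef;
}

my $rhyme = "One fish two fish red fish blue fish";
my $third = nth_match($rhyme, qr/(\w+)\s+fish\b/i, 3);
print "The third fish is a $third one.\n";
```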
How can I count the number of occurrences of a substring within a string?
There are a number of ways, with varying efficiency: If you want a count of a certain single character (X)
within a string, you can use the tr/// function like so:
$string = "ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count X characters in the string";
This is fine if you are just looking for a single character. However, if you are trying to count multiple
character substrings within a larger string, tr/// won‘t work. What you can do is wrap a while() loop
around a global pattern match. For example, let‘s count negative integers:
$string = "−9 55 48 −2 23 −76 4 14 −44";
while ($string =~ /−\d+/g) { $count++ }
print "There are $count negative numbers in the string";
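If you'd rather not write the loop, a global match in list context can be counted directly. A short sketch using the same data (the variable names are illustrative):

```perl
use strict;
use warnings;

my $string = "-9 55 48 -2 23 -76 4 14 -44";

# A match with /g in list context returns every match; assigning that
# list to () and then to a scalar yields the number of matches.
my $negatives = () = $string =~ /-\d+/g;

# tr/// with an empty replacement counts without changing the string.
my $spaces = ($string =~ tr/ //);

print "$negatives negative numbers, $spaces spaces\n";
```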

How do I capitalize all the words on one line?
To make the first letter of each word upper case:
$line =~ s/\b(\w)/\U$1/g;
This has the strange effect of turning "don‘t do it" into "Don‘T Do It". Sometimes you might want
this, instead (Suggested by Brian Foy):
$string =~ s/ (
             (^\w)      # at the beginning of the line
               |        # or
             (\s\w)     # preceded by whitespace
               )
            /\U$1/xg;
$string =~ s/([\w']+)/\u\L$1/g;
To make the whole line upper case:
$line = uc($line);
To force each word to be lower case, with the first letter upper case:
$line =~ s/(\w+)/\u\L$1/g;
You can (and probably should) enable locale awareness of those characters by placing a use locale
pragma in your program. See perllocale for endless details on locales.
How can I split a [character] delimited string except when inside
[character]? (Comma−separated files)
Take the example case of trying to split a string that is comma−separated into its different fields. (We‘ll
pretend you said comma−separated, not comma−delimited, which is different and almost never what you
mean.) You can‘t use split(/,/) because you shouldn‘t split if the comma is inside quotes. For
example, take a data line like this:
SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl,
author of a highly recommended book on regular expressions, to handle these for us. He suggests (assuming
your string is contained in $text):
@new = ();
push(@new, $+) while $text =~ m{
"([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
| ([^,]+),?
| ,
}gx;
push(@new, undef) if substr($text,−1,1) eq ’,’;
If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with
backslashes (eg, "like \"this\""). Unescaping them is a task addressed earlier in this section.
Alternatively, the Text::ParseWords module (part of the standard perl distribution) lets you say:
use Text::ParseWords;
@new = quotewords(",", 0, $text);
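Applied to the sample line from this question, that looks roughly like the following (variable names are illustrative):

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

my $text = 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"';

# A second argument of 0 drops the quotes from the returned fields.
my @fields = quotewords(',', 0, $text);

printf "%d fields\n", scalar @fields;
print  "company: $fields[2]\n";
print  "message: $fields[-1]\n";
```

The commas inside the quoted fields survive, which is exactly what a plain split(/,/) gets wrong.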
How do I strip blank space from the beginning/end of a string?
Although the simplest approach would seem to be:
$string =~ s/^\s*(.*?)\s*$/$1/;

This is unnecessarily slow, destructive, and fails with embedded newlines. It is much faster to do this
in two steps:
$string =~ s/^\s+//;
$string =~ s/\s+$//;
Or more nicely written as:
for ($string) {
s/^\s+//;
s/\s+$//;
}
This idiom takes advantage of the foreach loop's aliasing behavior to factor out common code. You can
do this on several strings at once, or arrays, or even the values of a hash if you use a slice:
# trim whitespace in the scalar, the array,
# and all the values in the hash
foreach ($scalar, @array, @hash{keys %hash}) {
s/^\s+//;
s/\s+$//;
}
How do I extract selected columns from a string?
Use substr() or unpack(), both documented in perlfunc. If you prefer thinking in terms of columns
instead of widths, you can use this kind of thing:
# determine the unpack format needed to split Linux ps output
# arguments are cut columns
my $fmt = cut2fmt(8, 14, 20, 26, 30, 34, 41, 47, 59, 63, 67, 72);
sub cut2fmt {
    my(@positions) = @_;
    my $template = '';
    my $lastpos  = 1;
    for my $place (@positions) {
        $template .= "A" . ($place - $lastpos) . " ";
        $lastpos   = $place;
    }
    $template .= "A*";
    return $template;
}
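To see the template in use, here is a self-contained sketch that repeats the function and applies it to a made-up fixed-width record; the record layout and variable names are assumptions for the example:

```perl
use strict;
use warnings;

# Same logic as cut2fmt, repeated so the sketch runs on its own.
sub cut2fmt {
    my (@positions) = @_;
    my $template = '';
    my $lastpos  = 1;
    for my $place (@positions) {
        $template .= "A" . ($place - $lastpos) . " ";
        $lastpos   = $place;
    }
    $template .= "A*";
    return $template;
}

# Made-up record: columns 1-7, columns 8-13, and the rest.
my $line = sprintf("%-7s%-6s%s", "root", 1234, "/sbin/init");
my $fmt  = cut2fmt(8, 14);                 # "A7 A6 A*"
my ($user, $pid, $command) = unpack($fmt, $line);

print "$user|$pid|$command\n";
```

The "A" format strips trailing spaces, so each field comes back without its padding.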
How do I find the soundex value of a string?
Use the standard Text::Soundex module distributed with perl.
How can I expand variables in text strings?
Let‘s assume that you have a string like:
$text = ’this has a $foo in it and a $bar’;
If those were both global variables, then this would suffice:
$text =~ s/\$(\w+)/${$1}/g;
But since they are probably lexicals, or at least, they could be, you‘d have to do this:
$text =~ s/(\$\w+)/$1/eeg;
die if $@;                     # needed on /ee, not /e

It‘s probably better in the general case to treat those variables as entries in some special hash. For example:

%user_defs = (
foo => 23,
bar => 19,
);
$text =~ s/\$(\w+)/$user_defs{$1}/g;
See also ‘‘How do I expand function calls in a string?‘’ in this section of the FAQ.
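One refinement of the hash approach, offered as a sketch: leave unknown names alone instead of replacing them with an empty string. The $baz placeholder is made up for the example.

```perl
use strict;
use warnings;

my %user_defs = (foo => 23, bar => 19);
my $text = 'this has a $foo in it and a $bar, but no $baz';

# Replace only names we know about; leave unknown ones untouched
# rather than turning them into empty strings.
$text =~ s/\$(\w+)/exists $user_defs{$1} ? $user_defs{$1} : "\$$1"/ge;

print "$text\n";
```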
What‘s wrong with always quoting "$vars"?
The problem is that those double−quotes force stringification, coercing numbers and references into strings,
even when you don‘t want them to be.
If you get used to writing odd things like these:
print "$var";       # BAD
$new = "$old";      # BAD
somefunc("$var");   # BAD

You‘ll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct:
print $var;
$new = $old;
somefunc($var);
Otherwise, besides slowing you down, you‘re going to break code when the thing in the scalar is actually
neither a string nor a number, but a reference:
func(\@array);
sub func {
    my $aref = shift;
    my $oref = "$aref";     # WRONG
}

You can also get into subtle problems on those few operations in Perl that actually do care about the
difference between a string and a number, such as the magical ++ autoincrement operator or the
syscall() function.
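A short sketch of what goes wrong with the reference (variable names are illustrative): once quoted, it is just a string that looks like a reference and can no longer be dereferenced.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $aref  = \@array;
my $oref  = "$aref";          # now just a string such as "ARRAY(0x...)"

print "aref is a ", (ref($aref) || "plain scalar"), "\n";
print "oref is a ", (ref($oref) || "plain scalar"), "\n";
print "elements via aref: ", scalar(@$aref), "\n";
# @$oref would die under "use strict": the string is not a reference.
```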
Stringification also destroys arrays.
@lines = `command`;
print "@lines";     # WRONG - extra blanks
print @lines;       # right

Why don‘t my <op_ppaddr)() ) ;
@@@
TAINT_NOT;
@@@
return 0;
@@@ }
MAIN_INTERPRETER_LOOP
Or with a fixed amount of leading white space, with remaining indentation correctly preserved:
$poem = fix< 1 ? \@intersection : \@difference }, $element;
}
How do I find the first array element for which a condition is true?
You can use this if you care about the index:
for ($i=0; $i < @array; $i++) {
if ($array[$i] eq "Waldo") {
$found_index = $i;
last;
}
}
Now $found_index has what you want.
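The same loop can be packaged as a reusable function that takes the condition as a block. This is a sketch; first_index is a name chosen for the example, and it returns -1 when nothing matches.

```perl
use strict;
use warnings;

# Return the index of the first element for which the code block is
# true, or -1 if there is none.
sub first_index (&@) {
    my $test = shift;
    for my $i (0 .. $#_) {
        local $_ = $_[$i];
        return $i if $test->();
    }
    return -1;
}

my @people = qw(Carmen Waldo Wenda);
my $found_index = first_index { $_ eq "Waldo" } @people;
print "Waldo is at index $found_index\n";
```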
How do I handle linked lists?
In general, you usually don't need a linked list in Perl, since with regular arrays, you can push and pop or
shift and unshift at either end, or you can use splice to add and/or remove an arbitrary number of elements at
arbitrary points. Both pop and shift are O(1) operations on perl's dynamic arrays. In the absence of
shifts and pops, push in general needs to reallocate on the order of every log(N) times, and unshift will need to
copy pointers each time.
If you really, really wanted, you could use structures as described in perldsc or perltoot and do just what the
algorithm book tells you to do.
How do I handle circular lists?
Circular lists could be handled in the traditional fashion with linked lists, or you could just do something like
this with an array:
unshift(@array, pop(@array));  # the last shall be first
push(@array, shift(@array));   # and vice versa
How do I shuffle an array randomly?
Use this:
# fisher_yates_shuffle( \@array ) :
# generate a random permutation of @array in place
sub fisher_yates_shuffle {
    my $array = shift;
    return unless @$array;      # nothing to do for an empty array
    my $i;
    for ($i = @$array; --$i; ) {
        my $j = int rand ($i+1);
        next if $i == $j;
        @$array[$i,$j] = @$array[$j,$i];
    }
}
fisher_yates_shuffle( \@array );    # permutes @array in place

You've probably seen shuffling algorithms that work using splice, randomly picking another element to
swap the current element with:
srand;
@new = ();
@old = 1 .. 10; # just a demo
while (@old) {
push(@new, splice(@old, rand @old, 1));
}
This is bad because splice is already O(N), and since you do it N times, you just invented a quadratic
algorithm; that is, O(N**2). This does not scale, although Perl is so efficient that you probably won‘t notice
this until you have rather largish arrays.
How do I process/modify each element of an array?
Use for/foreach:
for (@lines) {
    s/foo/bar/;     # change that word
    y/XZ/ZX/;       # swap those letters
}

Here‘s another; let‘s compute spherical volumes:
for (@volumes = @radii) {       # @volumes has changed parts
    $_ **= 3;
    $_ *= (4/3) * 3.14159;      # this will be constant folded
}

If you want to do the same thing to modify the values of the hash, you may not use the values function,
oddly enough. You need a slice:
for $orbit ( @orbits{keys %orbits} ) {
($orbit **= 3) *= (4/3) * 3.14159;
}
How do I select a random element from an array?
Use the rand() function (see rand):
# at the top of the program:
srand;                  # not needed for 5.004 and later
# then later on
$index   = rand @array;
$element = $array[$index];
Make sure you only call srand once per program, if then. If you are calling it more than once (such as before
each call to rand), you‘re almost certainly doing something wrong.
How do I permute N elements of a list?
Here‘s a little program that generates all permutations of all the words on each line of input. The algorithm
embodied in the permute() function should work on any list:
#!/usr/bin/perl -n
# tsc-permute: permute each word of input
permute([split], []);
sub permute {
    my @items = @{ $_[0] };
    my @perms = @{ $_[1] };
    unless (@items) {
        print "@perms\n";
    } else {
        my(@newitems,@newperms,$i);
        foreach $i (0 .. $#items) {
            @newitems = @items;
            @newperms = @perms;
            unshift(@newperms, splice(@newitems, $i, 1));
            permute([@newitems], [@newperms]);
        }
    }
}
How do I sort an array by (anything)?
Supply a comparison function to sort() (described in sort):
@list = sort { $a <=> $b } @list;
The default sort function is cmp, string comparison, which would sort (1, 2, 10) into (1, 10, 2).
<=>, used above, is the numerical comparison operator.
If you have a complicated function needed to pull out the part you want to sort on, then don‘t do it inside the
sort function. Pull it out first, because the sort BLOCK can be called many times for the same element.
Here‘s an example of how to pull out the first word after the first number on each item, and then sort those
words case−insensitively.
@idx = ();
for (@data) {
($item) = /\d+\s*(\S+)/;
push @idx, uc($item);
}
@sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];
Which could also be written this way, using a trick that‘s come to be known as the Schwartzian Transform:
@sorted = map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  { [ $_, uc( (/\d+\s*(\S+)/)[0] ) ] } @data;
If you need to sort on several fields, the following paradigm is useful.
@sorted = sort { field1($a) <=> field1($b) ||
                 field2($a) cmp field2($b) ||
                 field3($a) cmp field3($b)
               }     @data;
This can be conveniently combined with precalculation of keys as given above.
See http://www.perl.com/CPAN/doc/FMTEYEWTK/sort.html for more about this approach.
See also the question below on sorting hashes.
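As a sketch of combining the two ideas, here is a precomputed-key sort on made-up data that orders by word (ignoring case) and breaks ties on the leading number:

```perl
use strict;
use warnings;

my @data = ("3 pear", "1 Apple", "2 banana", "4 apple");

# Compute each key once, sort on the keys, then throw the keys away.
# Ties on the word fall back to the number.
my @sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1]  ||  $a->[2] <=> $b->[2] }
    map  { my ($num, $word) = /(\d+)\s*(\S+)/; [ $_, uc($word), $num ] }
    @data;

print "$_\n" for @sorted;
```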
How do I manipulate arrays of bits?
Use pack() and unpack(), or else vec() and the bitwise operations.
For example, this sets bit N of $vec for every value N listed in @ints:
$vec = ’’;
foreach(@ints) { vec($vec,$_,1) = 1 }
And here‘s how, given a vector in $vec, you can get those bits into your @ints array:
sub bitvec_to_list {
    my $vec = shift;
    my @ints;
    # Find null-byte density then select best algorithm
    if ($vec =~ tr/\0// / length $vec > 0.95) {
        use integer;
        my $i;
        # This method is faster with mostly null-bytes
        while($vec =~ /[^\0]/g ) {
            $i = -9 + 8 * pos $vec;
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
            push @ints, $i if vec($vec, ++$i, 1);
        }
    } else {
        # This method is a fast general algorithm
        use integer;
        my $bits = unpack "b*", $vec;
        push @ints, 0 if $bits =~ s/^(\d)// && $1;
        push @ints, pos $bits while($bits =~ /1/g);
    }
    return \@ints;
}
This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.)
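For small vectors a much simpler round trip is enough. This sketch (with made-up bit positions) sets a few bits with vec() and reads them back through unpack("b*"), which lists bits in the same order vec() numbers them:

```perl
use strict;
use warnings;

my @ints = (1, 5, 9);

# Set one bit per value in @ints.
my $vec = '';
vec($vec, $_, 1) = 1 for @ints;

# unpack "b*" lists the bits in the same order vec() numbers them.
my $bits = unpack("b*", $vec);

# Recover the positions: pos() is one past each matched "1".
my @back;
push @back, pos($bits) - 1 while $bits =~ /1/g;

print "$bits\n";
print "@back\n";
```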
Why does defined() return true on empty arrays and hashes?
See defined in the 5.004 release or later of Perl.
Data: Hashes (Associative Arrays)
How do I process an entire hash?
Use the each() function (see each) if you don‘t care whether it‘s sorted:
while ( ($key, $value) = each %hash) {
print "$key = $value\n";
}

If you want it sorted, you‘ll have to use foreach() on the result of sorting the keys as shown in an earlier
question.
What happens if I add or remove keys from a hash while iterating over it?
Don‘t do that.
How do I look up a hash element by value?
Create a reverse hash:
%by_value = reverse %by_key;
$key = $by_value{$value};
That‘s not particularly efficient. It would be more space−efficient to use:
while (($key, $value) = each %by_key) {
$by_value{$value} = $key;
}
If your hash could have repeated values, the methods above will only find one of the associated keys. This
may or may not worry you.
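If it does worry you, one option is to map each value to a list of keys instead. A sketch with made-up data (%keys_for is a name chosen for the example):

```perl
use strict;
use warnings;

my %by_key = (apple => "red", cherry => "red", banana => "yellow");

# Map each value to the list of keys that carry it.
my %keys_for;
while (my ($key, $value) = each %by_key) {
    push @{ $keys_for{$value} }, $key;
}

for my $value (sort keys %keys_for) {
    print "$value: @{[ sort @{ $keys_for{$value} } ]}\n";
}
```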
How can I know how many entries are in a hash?
If you mean how many keys, then all you have to do is take the scalar sense of the keys() function:
$num_keys = scalar keys %hash;
In void context it just resets the iterator, which is faster for tied hashes.
How do I sort a hash (optionally by value instead of key)?
Internally, hashes are stored in a way that prevents you from imposing an order on key−value pairs. Instead,
you have to sort a list of the keys or values:
@keys = sort keys %hash;            # sorted by key
@keys = sort {
            $hash{$a} cmp $hash{$b}
        } keys %hash;               # and by value
Here we'll do a reverse numeric sort by value, and if two values are identical, sort by length of key, and if that
fails, by straight ASCII comparison of the keys (well, possibly modified by your locale; see perllocale).
@keys = sort {
$hash{$b} <=> $hash{$a}
||
length($b) <=> length($a)
||
$a cmp $b
} keys %hash;
How can I always keep my hash sorted?
You can look into using the DB_File module and tie() using the $DB_BTREE hash bindings as
documented in In Memory Databases in DB_File. The Tie::IxHash module from CPAN might also be
instructive.
What‘s the difference between "delete" and "undef" with hashes?
Hashes are pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string,
although the value can be any kind of scalar: string, number, or reference. If a key $key is present in the
hash %array, exists($array{$key}) will return true. The value for a given key can be undef, in which case
$array{$key} will be undef while exists($array{$key}) will return true. This corresponds to ($key,
undef) being in the hash.
Pictures help... here's the %ary table:
  keys  values
+------+------+
|  a   |  3   |
|  x   |  7   |
|  d   |  0   |
|  e   |  2   |
+------+------+
And these conditions hold
$ary{'a'}                       is true
$ary{'d'}                       is false
defined $ary{'d'}               is true
defined $ary{'a'}               is true
exists $ary{'a'}                is true (perl5 only)
grep ($_ eq 'a', keys %ary)     is true
If you now say
undef $ary{'a'}
your table now reads:
  keys  values
+------+------+
|  a   | undef|
|  x   |  7   |
|  d   |  0   |
|  e   |  2   |
+------+------+
and these conditions now hold; changes in caps:
$ary{'a'}                       is FALSE
$ary{'d'}                       is false
defined $ary{'d'}               is true
defined $ary{'a'}               is FALSE
exists $ary{'a'}                is true (perl5 only)
grep ($_ eq 'a', keys %ary)     is true
Notice the last two: you have an undef value, but a defined key!
Now, consider this:
delete $ary{'a'}
your table now reads:
  keys  values
+------+------+
|  x   |  7   |
|  d   |  0   |
|  e   |  2   |
+------+------+
and these conditions now hold; changes in caps:
$ary{'a'}                       is false
$ary{'d'}                       is false
defined $ary{'d'}               is true
defined $ary{'a'}               is false
exists $ary{'a'}                is FALSE (perl5 only)
grep ($_ eq 'a', keys %ary)     is FALSE
See, the whole entry is gone!
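The same walk-through can be checked in code. This sketch uses the %ary data from the tables; the result variable names are made up:

```perl
use strict;
use warnings;

my %ary = (a => 3, x => 7, d => 0, e => 2);

undef $ary{'a'};
my $exists_after_undef  = exists  $ary{'a'} ? 1 : 0;   # key still there
my $defined_after_undef = defined $ary{'a'} ? 1 : 0;   # value is gone

delete $ary{'a'};
my $exists_after_delete = exists $ary{'a'} ? 1 : 0;    # whole entry gone

print "after undef:  exists=$exists_after_undef defined=$defined_after_undef\n";
print "after delete: exists=$exists_after_delete\n";
```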
Why don‘t my tied hashes make the defined/exists distinction?
They may or may not implement the EXISTS() and DEFINED() methods differently. For example, there
isn‘t the concept of undef with hashes that are tied to DBM* files. This means the true/false tables above will
give different results when used on such a hash. It also means that exists and defined do the same thing with
a DBM* file, and what they end up doing is not what they do with ordinary hashes.
How do I reset an each() operation part−way through?
Using keys %hash in scalar context returns the number of keys in the hash and resets the iterator
associated with the hash. You may need to do this if you use last to exit a loop early so that when you
re−enter it, the hash iterator has been reset.
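A small sketch of that situation, with a made-up hash: the first loop bails out early, keys resets the iterator, and the second loop then sees every entry again.

```perl
use strict;
use warnings;

my %hash = (one => 1, two => 2, three => 3);

# Leave the loop early; the iterator is now part-way through the hash.
while (my ($key, $value) = each %hash) {
    last;
}

keys %hash;    # resets the iterator

# This loop starts from the beginning again and sees every entry.
my $seen = 0;
while (my ($key) = each %hash) {
    $seen++;
}
print "saw $seen entries\n";
```

Without the reset, the second loop would pick up where the first one stopped and miss an entry.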
How can I get the unique keys from two hashes?
First you extract the keys from the hashes into arrays, and then solve the problem of uniquifying the array,
described above. For example:
%seen = ();
for $element (keys(%foo), keys(%bar)) {
$seen{$element}++;
}
@uniq = keys %seen;
Or more succinctly:
@uniq = keys %{{%foo,%bar}};
Or if you really want to save space:
%seen = ();
while (defined ($key = each %foo)) {
$seen{$key}++;
}
while (defined ($key = each %bar)) {
$seen{$key}++;
}
@uniq = keys %seen;
How can I store a multidimensional array in a DBM file?
Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module
from CPAN and layer it on top of either DB_File or GDBM_File.
How can I make my hash remember the order I put elements into it?
Use the Tie::IxHash module from CPAN.
use Tie::IxHash;
tie(%myhash, 'Tie::IxHash');
for ($i=0; $i<20; $i++) {
    $myhash{$i} = 2*$i;
}
@keys = keys %myhash;       # @keys = (0,1,2,3,...)
Why does passing a subroutine an undefined element in a hash create it?
If you say something like:
somefunc($hash{"nonesuch key here"});

Then that element "autovivifies"; that is, it springs into existence whether you store something there or not.
That‘s because functions get scalars passed in by reference. If somefunc() modifies $_[0], it has to be
ready to write it back into the caller‘s version.
This has been fixed as of perl5.004.
Normally, merely accessing a key‘s value for a nonexistent key does not cause that key to be forever there.
This is different than awk‘s behavior.
How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays?
Use references (documented in perlref). Examples of complex data structures are given in perldsc and
perllol. Examples of structures and object−oriented classes are in perltoot.
How can I use a reference as a hash key?
You can't do this directly, but you could use the standard Tie::RefHash module distributed with perl.
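A minimal sketch of tying a hash this way (the %owner_of hash and its contents are made up for the example):

```perl
use strict;
use warnings;
use Tie::RefHash;

tie my %owner_of, 'Tie::RefHash';

my $list = [1, 2, 3];
$owner_of{$list} = "Fred";

# With an ordinary hash these keys would be plain strings; here they
# are still usable references.
for my $key (keys %owner_of) {
    print "key is a ", ref($key), " with ", scalar(@$key), " elements\n";
}
```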
Data: Misc
How do I handle binary data correctly?
Perl is binary clean, so this shouldn‘t be a problem. For example, this works fine (assuming the files are
found):
if (`cat /vmunix` =~ /gzip/) {
    print "Your kernel is GNU-zip enabled!\n";
}
On some systems, however, you have to play tedious games with "text" versus "binary" files. See
binmode in perlfunc.
If you‘re concerned about 8−bit ASCII data, then see perllocale.
If you want to deal with multibyte characters, however, there are some gotchas. See the section on Regular
Expressions.
How do I determine whether a scalar is a number/whole/integer/float?
Assuming that you don‘t care about IEEE notations like "NaN" or "Infinity", you probably just want to use a
regular expression.
warn "has nondigits"        if     /\D/;
warn "not a natural number" unless /^\d+$/;             # rejects -3
warn "not an integer"       unless /^-?\d+$/;           # rejects +3
warn "not an integer"       unless /^[+-]?\d+$/;
warn "not a decimal number" unless /^-?\d+\.?\d*$/;     # rejects .2
warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/;
warn "not a C float"
    unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
If you're on a POSIX system, Perl supports the POSIX::strtod function. Its semantics are somewhat
cumbersome, so here‘s a getnum wrapper function for more convenient access. This function takes a string
and returns the number it found, or undef for input that isn‘t a C float. The is_numeric function is a
front end to getnum if you just want to say, ‘‘Is this a float?‘’
sub getnum {
    use POSIX qw(strtod);
    my $str = shift;
    $str =~ s/^\s+//;
    $str =~ s/\s+$//;
    $! = 0;
    my($num, $unparsed) = strtod($str);
    if (($str eq '') || ($unparsed != 0) || $!) {
        return undef;
    } else {
        return $num;
    }
}
sub is_numeric { defined getnum($_[0]) }
Or you could check out http://www.perl.com/CPAN/modules/by−module/String/String−Scanf−1.1.tar.gz
instead. The POSIX module (part of the standard Perl distribution) provides the strtod and strtol functions for
converting strings to doubles and longs, respectively.
How do I keep persistent data across program calls?
For some specific applications, you can use one of the DBM modules. See AnyDBM_File. More generically,
you should consult the FreezeThaw, Storable, or Class::Eroot modules from CPAN.
How do I print out or copy a recursive data structure?
The Data::Dumper module on CPAN is nice for printing out data structures, and FreezeThaw for copying
them. For example:
use FreezeThaw qw(freeze thaw);
$new = thaw freeze $old;
Where $old can be (a reference to) any kind of data structure you‘d like. It will be deeply copied.
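The Storable module mentioned in the previous answer can do a similar deep copy with its dclone function. A minimal sketch, assuming Storable is installed; the contents of $old are made up for illustration:

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A nested structure: a hash holding an array reference.
my $old = { name => 'config', list => [ 1, 2, 3 ] };

# dclone() returns a deep copy, so the inner array is duplicated too.
my $new = dclone($old);
push @{ $new->{list} }, 4;

printf "old has %d items, new has %d items\n",
    scalar @{ $old->{list} }, scalar @{ $new->{list} };
```

Changing the copy leaves the original untouched, which a plain assignment of the reference would not do.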
How do I define methods for every class/object?
Use the UNIVERSAL class (see UNIVERSAL).
How do I verify a credit card checksum?
Get the Business::CreditCard module from CPAN.
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as part of the Standard Version of Perl, or as part of its complete documentation whether
printed or otherwise, this work may be distributed only under the terms of Perl‘s Artistic License. Any
distribution of this file or derivatives thereof outside of that package require that special arrangements be
made with copyright holder.
Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You
are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit would be courteous but is not required.

NAME
perlfaq5 − Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $)
DESCRIPTION
This section deals with I/O and the "f" issues: filehandles, flushing, formats, and footers.
How do I flush/unbuffer an output filehandle? Why must I do this?
The C standard I/O library (stdio) normally buffers characters sent to devices. This is done for efficiency
reasons, so that there isn't a system call for each byte. Any time you use print() or write() in Perl,
you go through this buffering. syswrite() circumvents stdio and buffering.
In most stdio implementations, the type of output buffering and the size of the buffer varies according to the
type of device. Disk files are block buffered, often with a buffer size of more than 2k. Pipes and sockets are
often buffered with a buffer size between 1/2 and 2k. Serial devices (e.g. modems, terminals) are normally
line−buffered, and stdio sends the entire line when it gets the newline.
Perl does not support truly unbuffered output (except insofar as you can syswrite(OUT, $char, 1)).
What it does instead support is "command buffering", in which a physical write is performed after every
output command. This isn‘t as hard on your system as unbuffering, but does get the output where you want
it when you want it.
If you expect characters to get to your device when you print them there, you‘ll want to autoflush its handle.
Use select() and the $| variable to control autoflushing (see $| and select):
$old_fh = select(OUTPUT_HANDLE);
$| = 1;
select($old_fh);
Or using the traditional idiom:
select((select(OUTPUT_HANDLE), $| = 1)[0]);
Or if you don't mind slowly loading several thousand lines of module code just because you're afraid of the $|
variable:
    use FileHandle;
    open(DEV, "+</dev/tty");      # ceci n'est pas une pipe
    DEV->autoflush(1);
or the newer IO::* modules:
    use IO::Handle;
    open(DEV, ">/dev/printer");   # but is this?
    DEV->autoflush(1);
or even this:
    use IO::Socket;               # this one is kinda a pipe?
    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');
    die "$!" unless $sock;
    $sock->autoflush();
    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
    $document = join('', <$sock>);
    print "DOC IS: $document\n";
Note the bizarrely hardcoded carriage return and newline in their octal equivalents. This is the ONLY way
(currently) to assure a proper flush on all platforms, including Macintosh. That's the way things work in
network programming: you really should specify the exact bit pattern of the network line terminator. In
practice, "\n\n" often works, but this is not portable.
See perlfaq9 for other examples of fetching URLs over the web.
How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to
the beginning of a file?
Although humans have an easy time thinking of a text file as being a sequence of lines that operates much
like a stack of playing cards — or punch cards — computers usually see the text file as a sequence of bytes.
In general, there‘s no direct way for Perl to seek to a particular line of a file, insert text into a file, or remove
text from a file.
(There are exceptions in special circumstances. You can add or remove at the very end of the file. Another
is replacing a sequence of bytes with another sequence of the same length. Another is using the
$DB_RECNO array bindings as documented in DB_File. Yet another is manipulating files with all lines the
same length.)
The general solution is to create a temporary copy of the text file with the changes you want, then copy that
over the original. This assumes no locking.
    $old = $file;
    $new = "$file.tmp.$$";
    $bak = "$file.bak";

    open(OLD, "< $old")         or die "can't open $old: $!";
    open(NEW, "> $new")         or die "can't open $new: $!";

    # Correct typos, preserving case
    while (<OLD>) {
        s/\b(p)earl\b/${1}erl/i;
        (print NEW $_)          or die "can't write to $new: $!";
    }

    close(OLD)                  or die "can't close $old: $!";
    close(NEW)                  or die "can't close $new: $!";

    rename($old, $bak)          or die "can't rename $old to $bak: $!";
    rename($new, $old)          or die "can't rename $new to $old: $!";

Perl can do this sort of thing for you automatically with the −i command−line switch or the closely−related
$^I variable (see perlrun for more details). Note that −i may require a suffix on some non−Unix systems;
see the platform−specific documentation that came with your port.
    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.bak', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }
If you need to seek to an arbitrary line of a file that changes infrequently, you could build up an index of byte
positions of where the line ends are in the file. If the file is large, an index of every tenth or hundredth line
end would allow you to seek and read fairly efficiently. If the file is sorted, try the look.pl library (part of the
standard perl distribution).
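The index idea described above can be sketched as follows. This is an illustration, not a tuned implementation: the helper names build_line_index and fetch_line are hypothetical, the index records where each line starts (which is where the previous line ends), and the anonymous scratch file opened with an undef filename needs a Perl newer than the one this guide describes.

```perl
use strict;
use warnings;

# Record the byte offset at which every line of the file starts.
sub build_line_index {
    my ($fh) = @_;
    my @offsets;
    seek($fh, 0, 0) or die "can't rewind: $!";
    my $start = 0;
    while (defined(my $line = <$fh>)) {
        push @offsets, $start;
        $start = tell($fh);
    }
    return \@offsets;
}

# Fetch line number $n (counting from 1) by seeking straight to it.
sub fetch_line {
    my ($fh, $index, $n) = @_;
    return undef if $n < 1 || $n > @$index;
    seek($fh, $index->[$n - 1], 0) or die "can't seek: $!";
    return scalar <$fh>;
}

# Demonstration on an anonymous scratch file.
open(my $fh, '+>', undef) or die "can't create scratch file: $!";
print {$fh} "alpha\nbeta\ngamma\n" or die "can't write scratch file: $!";

my $index = build_line_index($fh);
print fetch_line($fh, $index, 2);    # prints "beta"
```

For a large file, keeping only every tenth or hundredth offset trades a short sequential read for a much smaller index.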

In the unique case of deleting lines at the end of a file, you can use tell() and truncate(). The
following code snippet deletes the last line of a file without making a copy or reading the whole file into
memory:
    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);
Error checking is left as an exercise for the reader.
How do I count the number of lines in a file?
One fairly efficient way is to count newlines in the file. The following program uses a feature of tr///, as
documented in perlop. If your text file doesn‘t end with a newline, then it‘s not really a proper text file, so
this may report one fewer line than you expect.
    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;
This assumes no funny games with newline translations.
How do I make a temporary file name?
Use the new_tmpfile class method from the IO::File module to get a filehandle opened for reading and
writing. Use this if you don‘t need to know the file‘s name.
use IO::File;
$fh = IO::File−>new_tmpfile()
or die "Unable to make new temporary file: $!";
Or you can use the tmpnam function from the POSIX module to get a filename that you then open yourself.
Use this if you do need to know the file‘s name.
    use Fcntl;
    use POSIX qw(tmpnam);
    # try new temporary filenames until we get one that didn't already
    # exist; the check should be unnecessary, but you can't be too careful
    do { $name = tmpnam() }
        until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
    # install atexit-style handler so that when we exit or die,
    # we automatically delete this temporary file
    END { unlink($name) or die "Couldn't unlink $name : $!" }
    # now go on to use the file ...
If you‘re committed to doing this by hand, use the process ID and/or the current time−value. If you need to
have many temporary files in one process, use a counter:
    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }
How can I manipulate fixed−record−length files?
The most efficient way is using pack() and unpack(). This is faster than using substr() when it takes
many, many strings. It is slower for just a few.
Here is a sample chunk of code to break up and put back together again some fixed−format input lines, in
this case from the output of a normal, Berkeley−style ps:
    # sample input line:
    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
                "\n";
    }
We've used $$var in a way that is forbidden by use strict 'refs'. That is, we've promoted a string to
a scalar variable reference using symbolic references. This is ok in small programs, but doesn't scale well.
It also only works on global variables, not lexicals.
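One way to avoid the symbolic references is to unpack into a hash slice, which also works under use strict. A minimal sketch: the field names follow the ps example above, and the sample record is built with pack() purely so that the column widths are right.

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw(pid tt stat time command);

# Build a sample fixed-width record (made-up values) with the same template.
my $line = pack($PS_T, '15158', 'p5', 'T', '0:00',
                'perl /home/tchrist/scripts/now-what');

# Unpack into a hash slice instead of separately named variables.
my %rec;
@rec{@fields} = unpack($PS_T, $line);

print "$_: <$rec{$_}>\n" for @fields;
```

Because the data lives in a lexical hash, nothing here depends on package globals or on strict 'refs' being off.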
How can I make a filehandle local to a subroutine? How do I pass filehandles between
subroutines? How do I make an array of filehandles?
The fastest, simplest, and most direct way is to localize the typeglob of the filehandle in question:
local *TmpHandle;
Typeglobs are fast (especially compared with the alternatives) and reasonably easy to use, but they also have
one subtle drawback. If you had, for example, a function named TmpHandle(), or a variable named
%TmpHandle, you just hid it from yourself.
    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }
Here‘s how to use this in a loop to open and store a bunch of filehandles. We‘ll use as values of the hash an
ordered pair to make it easy to sort the hash in insertion order.
    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }
    # Using the filehandles in the array
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }
For passing filehandles to functions, the easiest way is to prefix them with a star, as in func(*STDIN). See
Passing Filehandles in perlfaq7 for details.
If you want to create many, anonymous handles, you should check out the Symbol, FileHandle, or
IO::Handle (etc.) modules. Here‘s the equivalent code with Symbol::gensym, which is reasonably
light−weight:
foreach $filename (@names) {
use Symbol;
my $fh = gensym();
open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
$file{$filename} = [ $i++, $fh ];
}
Or here using the semi−object−oriented FileHandle, which certainly isn‘t light−weight:
use FileHandle;
foreach $filename (@names) {
my $fh = FileHandle−>new("/etc/$filename") or die "$filename: $!";
$file{$filename} = [ $i++, $fh ];
}
Please understand that whether the filehandle happens to be a (probably localized) typeglob or an anonymous
handle from one of the modules, in no way affects the bizarre rules for managing indirect handles. See the
next question.
How can I use a filehandle indirectly?
An indirect filehandle is using something other than a symbol in a place that a filehandle is expected. Here
are ways to get those:
    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or you can use the new method from the FileHandle or IO modules to create an anonymous filehandle, store that
in a scalar variable, and use it as though it were a normal filehandle.
    use FileHandle;
    $fh = FileHandle->new();

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an
indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a
filehandle. Functions like print, open, seek, or the <FH> diamond operator will accept
either a real filehandle or a scalar variable containing one:
    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";
If you're passing a filehandle to a function, you can write the function in two ways:
sub accept_fh {
my $fh = shift;
print $fh "Sending to indirect filehandle\n";
}
Or it can localize a typeglob and use the filehandle directly:
sub accept_fh {
local *FH = shift;
print FH "Sending to localized filehandle\n";
}
Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings
under some circumstances, but this is risky.)
accept_fh(*STDOUT);
accept_fh($handle);
In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only
simple scalar variables, not expressions or subscripts into hashes or arrays, can be used with built−ins like
print, printf, or the diamond operator. These are illegal and won‘t even compile:
    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>                                     # WRONG
    print $fd[2] "What was that: $got";                 # WRONG


With print and printf, you get around this by using a block and an expression where you would place
the filehandle:
print { $fd[1] } "funny stuff\n";
printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
# Pity the poor deadbeef.
That block is a proper block like any other, so you can put more complicated code there. This sends the
message out to one of two places:
    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
This approach of treating print and printf like object method calls doesn't work for the diamond
operator. That's because it's a real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you can use the built-in function named
readline to read a record just as <> does. Given the initialization shown above for @fd, this would
work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which
might be a bug we haven't fixed yet.
$got = readline($fd[0]);
Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs,
objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help
you at all here.
How can I set up a footer format to be used with write()?
There‘s no builtin way to do this, but perlform has a couple of techniques to make it possible for the intrepid
hacker.
How can I write() into a string?
See perlform for an swrite() function.
How can I output my numbers with commas added?
This one will do it for you:
    sub commify {
        local $_ = shift;
        1 while s/^(-?\d+)(\d{3})/$1,$2/;
        return $_;
    }
    $n = 23659019423.2331;
    print "GOT: ", commify($n), "\n";
    GOT: 23,659,019,423.2331
You can't just:
    s/^(-?\d+)(\d{3})/$1,$2/g;
because you have to put the comma in and then recalculate your position.
Alternatively, this commifies all numbers in a line regardless of whether they have decimal portions, are
preceded by + or −, or whatever:
# from Andrew Johnson 
sub commify {
my $input = shift;
$input = reverse $input;
$input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
return reverse $input;
}
How can I translate tildes (~) in a filename?
Use the <> (glob()) operator, documented in perlfunc. This requires that you have a shell installed that
groks tildes, meaning csh or tcsh or (some versions of) ksh, and thus may have portability problems. The
Glob::KGlob module (available from CPAN) gives more portable glob functionality.
Within Perl, you may use this directly:
    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

How come when I open a file read−write it wipes it out?
Because you're using something like this, which truncates the file and then gives you read-write access:
    open(FH, "+> /path/name");        # WRONG (almost always)
Whoops. You should instead use "+<", which will fail if the file doesn't exist. Using ">" always clobbers or
creates. Using "<" never does either. The "+" doesn't change this.
Here are examples of many kinds of file opens. Those using sysopen() all assume
    use Fcntl;
To open file for reading:
    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;
To open file for writing, create new file if needed or else truncate old file:
    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;
To open file for writing, create new file, file must not exist:
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;
To open file for appending, create if necessary:
    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;
To open file for appending, file must exist:
    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;
To open file for update, file must exist:
    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;
To open file for update, create file if necessary:
    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;
To open file for update, file must not exist:
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;
To open a file without blocking, creating if necessary:
    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /tmp/somefile: $!";
Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That
is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't so exclusive
as you might wish.
Why do I sometimes get an "Argument list too long" when I use <*>?
The <> operator performs a globbing operation (see above). By default glob() forks csh(1) to do the actual
glob expansion, but csh can't handle more than 127 items and so gives the error message Argument list
too long. People who installed tcsh as csh won't have this problem, but their users may be surprised by
it.
To get around this, either do the glob yourself with Dirhandles and patterns, or use a module like
Glob::KGlob, one that doesn‘t use the shell to do globbing.
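Doing the glob yourself with a dirhandle and a pattern might look like the sketch below. The helper name list_matching is hypothetical, the pattern is an ordinary regular expression rather than a shell wildcard, and the qr// operator and lexical dirhandle need a Perl newer than the one this guide describes.

```perl
use strict;
use warnings;

# Return the names in $dir that match $pattern, skipping dot files the
# way a shell glob would, without forking a shell.
sub list_matching {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = sort grep { !/^\./ && /$pattern/ } readdir($dh);
    closedir($dh);
    return @names;
}

# Rough equivalent of <*.c> in the current directory.
my @c_files = list_matching('.', qr/\.c$/);
print "$_\n" for @c_files;
```

Since no shell is involved, the number of matching files is limited only by memory, not by an argument-list length.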
Is there a leak/bug in glob()?
Due to the current implementation on some operating systems, when you use the glob() function or its
angle−bracket alias in a scalar context, you may cause a leak and/or unpredictable behavior. It‘s best
therefore to use glob() only in list context.
How can I open a file with a leading ">" or trailing blanks?
Normally perl ignores trailing blanks in filenames, and interprets certain leading characters (or a trailing "|")
to mean something special. To avoid this, you might want to use a routine like this. It makes incomplete
pathnames into explicit relative ones, and tacks a trailing null byte on the name to make perl leave it alone:
    sub safe_filename {
        local $_ = shift;
        return m#^/#
            ? "$_\0"
            : "./$_\0";
    }
    $fn = safe_filename("<<<something really wicked   ");
    open(FH, "> $fn") or die "couldn't open $fn: $!";

You could also use the sysopen() function (see sysopen).
How can I reliably rename a file?
Well, usually you just use Perl‘s rename() function. But that may not work everywhere, in particular,
renaming files across file systems. If your operating system supports a mv(1) program or its moral
equivalent, this works:
rename($old, $new) or system("mv", $old, $new);
It may be more compelling to use the File::Copy module instead. You just copy the file to the new
name (checking return values), then delete the old one. This isn't really the same semantics as a real
rename(), though, which preserves metainformation like permissions, timestamps, inode info, etc.
Newer versions of File::Copy export a move() function.
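A minimal sketch of the move() approach, assuming a File::Copy recent enough to export move(). The helper name rename_anywhere is hypothetical, and the demonstration uses a scratch directory from File::Temp (also an assumption, not part of the original answer).

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical helper: move() tries a plain rename first and falls back
# to copy-then-delete when that is not possible.
sub rename_anywhere {
    my ($old, $new) = @_;
    move($old, $new) or die "can't move $old to $new: $!";
    return $new;
}

# Demonstration in a scratch directory that is removed on exit.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/old.txt") or die "can't create $dir/old.txt: $!";
print {$fh} "some data\n";
close($fh) or die "can't close $dir/old.txt: $!";

rename_anywhere("$dir/old.txt", "$dir/new.txt");
print((-e "$dir/new.txt") ? "moved\n" : "not moved\n");
```

As the text notes, a copy-based fallback does not preserve everything a true rename() would, so check timestamps and permissions if they matter.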
How can I lock a file?
Perl‘s builtin flock() function (see perlfunc for details) will call flock(2) if that exists, fcntl(2) if it doesn‘t
(on perl version 5.004 and later), and lockf(3) if neither of the two previous system calls exists. On some
systems, it may even use a different form of native locking. Here are some gotchas with Perl‘s flock():
1. Produces a fatal error if none of the three system calls (or their close equivalent) exists.
2. lockf(3) does not provide shared locking, and requires that the filehandle be open for writing (or
   appending, or read/writing).
3. Some versions of flock() can't lock files over a network (e.g. on NFS file systems), so you'd need
   to force the use of fcntl(2) when you build Perl. See the flock entry of perlfunc, and the INSTALL file
   in the source distribution for information on building Perl to do this.

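With those gotchas in mind, a typical use of flock() is to take an exclusive lock before appending to a shared file. A minimal sketch, assuming a local (non-NFS) filesystem; the helper name append_locked and the log file name are made up, and File::Temp is used only to give the demonstration a scratch directory.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempdir);

# Hypothetical helper: append $text to $path while holding an exclusive lock.
sub append_locked {
    my ($path, $text) = @_;
    open(my $fh, '>>', $path)  or die "can't open $path: $!";
    flock($fh, LOCK_EX)        or die "can't flock $path: $!";
    # Someone may have appended while we waited for the lock.
    seek($fh, 0, SEEK_END)     or die "can't seek $path: $!";
    print {$fh} $text          or die "can't write $path: $!";
    # Closing the handle flushes the data and releases the lock.
    close($fh)                 or die "can't close $path: $!";
    return 1;
}

my $dir = tempdir(CLEANUP => 1);
my $log = "$dir/demo.log";
append_locked($log, "first\n");
append_locked($log, "second\n");
```

The handle is opened for appending, which also satisfies the lockf(3) requirement that the file be open for writing.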

Why can't I just open(FH, ">file.lock")?
A common bit of code NOT TO USE is this:
    sleep(3) while -e "file.lock";    # PLEASE DO NOT USE
    open(LCK, "> file.lock");         # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something which must be done in one. That's why
computer hardware provides an atomic test-and-set instruction. In theory, this "ought" to work:
    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";
except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not
every time) over the net. Various schemes involving link() have been suggested, but these tend
to involve busy-wait, which is also subdesirable.
I still don‘t get locking. I just want to increment the number in the file. How can I do this?
Didn‘t anyone ever tell you web−page hit counters were useless? They don‘t count number of hits, they‘re a
waste of time, and they serve only to stroke the writer‘s vanity. Better to pick a random number. It‘s more
realistic.
Anyway, this is what you can do if you can‘t help yourself.
    use Fcntl;
    sysopen(FH, "numfile", O_RDWR|O_CREAT)      or die "can't open numfile: $!";
    flock(FH, 2)                                or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                              or die "can't rewind numfile: $!";
    truncate(FH, 0)                             or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                     or die "can't write numfile: $!";
    # DO NOT UNLOCK THIS UNTIL YOU CLOSE
    close FH                                    or die "can't close numfile: $!";


Here‘s a much better web−page hit counter:
    $hits = int( (time() - 850_000_000) / rand(1_000) );
If the count doesn‘t impress your friends, then the code might. :−)
How do I randomly update a binary file?
If you‘re just trying to patch a binary, in many cases something as simple as this works:
    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
However, if you have fixed sized records, then you might do something more like this:
    $RECSIZE = 220; # size of record, in bytes
    $recno   = 37;  # which record to update
open(FH, "+mtime);
print "file $file updated at $date_string\n";
Error checking is left as an exercise for the reader.
How do I set a file‘s timestamp in perl?
You use the utime() function documented in utime. By way of example, here‘s a little program that copies
the read and write times from its first argument to all the rest of them.
if (@ARGV < 2) {
die "usage: cptimes timestamp_file other_files ...\n";
}
$timestamp = shift;
($atime, $mtime) = (stat($timestamp))[8,9];
utime $atime, $mtime, @ARGV;
Error checking is left as an exercise for the reader.
Note that utime() currently doesn‘t work correctly with Win95/NT ports. A bug has been reported.
Check it carefully before using it on those platforms.
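The error checking left as an exercise above could be filled in along these lines. This is a sketch only: the helper name copy_times, the scratch files and the timestamps are made up for illustration, and File::Temp is an assumption used to keep the demonstration self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copy access and modification times from $from
# onto every file in @targets, dying if anything goes wrong.
sub copy_times {
    my ($from, @targets) = @_;
    my ($atime, $mtime) = (stat($from))[8, 9];
    defined $mtime or die "can't stat $from: $!";
    # utime() returns the number of files it changed successfully.
    my $changed = utime($atime, $mtime, @targets);
    $changed == @targets
        or die "only updated $changed of " . scalar(@targets) . " files: $!";
    return $changed;
}

# Demonstration with two scratch files.
my $dir = tempdir(CLEANUP => 1);
my ($ref, $target) = ("$dir/ref", "$dir/target");
for my $path ($ref, $target) {
    open(my $fh, '>', $path) or die "can't create $path: $!";
    close($fh)               or die "can't close $path: $!";
}
utime(1_000_000_000, 1_000_000_060, $ref) or die "can't set times on $ref: $!";
copy_times($ref, $target);
printf "target mtime is now %d\n", (stat($target))[9];
```

Comparing the return value of utime() with the number of targets catches the case where only some of the files could be updated.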
How do I print to more than one file at once?
If you only have to do this once, you can do this:
for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
To connect one filehandle to several output filehandles, it's easiest to use the tee(1) program if you
have it, and let it take care of the multiplexing:
open (FH, "| tee file1 file2 file3");
Or even:
    # make STDOUT go to three files, plus original STDOUT
    open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
    print "whatever\n"                       or die "Writing: $!\n";
    close(STDOUT)                            or die "Closing: $!\n";
Otherwise you‘ll have to write your own multiplexing print function — or your own tee program — or use
Tom Christiansen‘s, at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is written in Perl
and offers much greater functionality than the stock version.
How can I read in a file by paragraphs?
Use the $/ variable (see perlvar for details). You can either set it to "" to eliminate empty paragraphs
("abc\n\n\n\ndef", for instance, gets treated as two paragraphs and not three), or "\n\n" to accept
empty paragraphs.
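The difference between the two settings can be seen by reading the same text both ways. A minimal sketch: the helper name read_records is hypothetical, and it reads from an in-memory filehandle, which needs a Perl newer than the one this guide describes.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records using $separator as the
# input record separator.
sub read_records {
    my ($text, $separator) = @_;
    open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
    local $/ = $separator;
    my @records = <$fh>;
    close($fh);
    return @records;
}

my $sample = "abc\n\n\n\ndef\n";

my @squeezed = read_records($sample, "");      # paragraph mode
my @literal  = read_records($sample, "\n\n");  # exactly two newlines

printf "paragraph mode: %d records, literal mode: %d records\n",
    scalar @squeezed, scalar @literal;
```

Using local on $/ keeps the change confined to the helper, so the rest of the program still reads line by line.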
How can I read a single character from a file? From the keyboard?
You can use the builtin getc() function for most filehandles, but it won‘t (easily) work on a terminal
device. For STDIN, either use the Term::ReadKey module from CPAN, or use the sample code in getc.
If your system supports POSIX, you can use the following code, which you‘ll note turns off echo processing
as well.
    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    for (1..4) {
        my $got;
        print "gimme: ";
        $got = getone();
        print "--> $got\n";
    }
    exit;

    BEGIN {
        use POSIX qw(:termios_h);

        my ($term, $oterm, $echo, $noecho, $fd_stdin);

        $fd_stdin = fileno(STDIN);

        $term     = POSIX::Termios->new();
        $term->getattr($fd_stdin);
        $oterm    = $term->getlflag();

        $echo     = ECHO | ECHOK | ICANON;
        $noecho   = $oterm & ~$echo;

        sub cbreak {
            $term->setlflag($noecho);
            $term->setcc(VTIME, 1);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub cooked {
            $term->setlflag($oterm);
            $term->setcc(VTIME, 0);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub getone {
            my $key = '';
            cbreak();
            sysread(STDIN, $key, 1);
            cooked();
            return $key;
        }
    }

    END { cooked() }
The Term::ReadKey module from CPAN may be easier to use:
use Term::ReadKey;
    open(TTY, "

    % cat > fionread.c
    #include <sys/ioctl.h>
    main() {
        printf("%#08x\n", FIONREAD);
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f
And then hard−code it, leaving porting as an exercise to your successor.
    $FIONREAD = 0x4004667f;         # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size)     or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);


FIONREAD requires a filehandle connected to a stream, meaning sockets, pipes, and tty devices work, but
not files.
How do I do a tail −f in perl?
First try
seek(GWFILE, 0, 1);
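The statement seek(GWFILE, 0, 1) doesn't change the current position, but it does clear the
end-of-file condition on the handle, so that the next <GWFILE> makes Perl try again to read something.
If that doesn't work (it relies on features of your stdio implementation), then you need something more
like this:
    for (;;) {
        for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek(GWFILE, $curpos, 0);  # seek to where we had been
    }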
If this still doesn‘t work, look into the POSIX module. POSIX defines the clearerr() method, which
can remove the end of file condition on a filehandle. The method: read until end of file, clearerr(), read
some more. Lather, rinse, repeat.
How do I dup() a filehandle in Perl?
If you check open, you‘ll see that several of the ways to call open() should do the trick. For example:
open(LOG, ">>/tmp/logfile");
open(STDERR, ">&LOG");
Or even with a literal numeric descriptor:
    $fd = $ENV{MHCONTEXTFD};
    open(MHCONTEXT, "<&=$fd");     # like fdopen(3S)

Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That means if you close an aliased
handle, all aliases become inaccessible. This is not true with a copied one.
Error checking, as always, has been left as an exercise for the reader.
How do I close a file descriptor by number?
This should rarely be necessary, as the Perl close() function is to be used for things that Perl opened
itself, even if it was a dup of a numeric descriptor, as with MHCONTEXT above. But if you really have to,
you may be able to do this:
    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;
Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
Whoops! You just put a tab and a formfeed into that filename! Remember that within double quoted strings
("like\this"), the backslash is an escape character. The full list of these is in
Quote and Quote−like Operators. Unsurprisingly, you don‘t have a file called "c:(tab)emp(formfeed)oo" or
"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem.
Either single−quote your strings, or (preferably) use forward slashes. Since all DOS and Windows versions
since something like MS−DOS 2.0 or so have treated / and \ the same in a path, you might as well use the
one that doesn‘t clash with Perl — or the POSIX shell, ANSI C and C++, awk, Tcl, Java, or Python, just to
mention a few.
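The effect of the quoting is easy to check by comparing string lengths. A small sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $broken  = "C:\temp\foo";   # \t becomes a tab, \f a formfeed
my $single  = 'C:\temp\foo';   # single quotes keep the backslashes
my $forward = "C:/temp/foo";   # forward slashes need no escaping

# The double-quoted version is two characters shorter, because each
# backslash pair collapsed into a single control character.
printf "%d %d %d\n", length($broken), length($single), length($forward);
# prints "9 11 11"
```

Only the second and third strings name the file you meant.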
Why doesn‘t glob("*.*") get all the files?
Because even on non−Unix ports, Perl‘s glob function follows standard Unix globbing semantics. You‘ll
need glob("*") to get all (non−hidden) files. This makes glob() portable.
Why does Perl let me delete read−only files? Why does −i clobber protected files? Isn‘t this a
bug in Perl?
This is elaborately and painstakingly described in the "Far More Than You Ever Wanted To Know" in
http://www.perl.com/CPAN/doc/FMTEYEWTK/file−dir−perms .
The executive summary: learn how your filesystem works. The permissions on a file say what can happen to
the data in that file. The permissions on a directory say what can happen to the list of files in that directory.
If you delete a file, you‘re removing its name from the directory (so the operation depends on the
permissions of the directory, not of the file). If you try to write to the file, the permissions of the file govern
whether you‘re allowed to.
How do I select a random line from a file?
Here‘s an algorithm from the Camel Book:
srand;
rand($.) < 1 && ($line = $_) while <>;
This has a significant advantage in space over reading the whole file in. A simple proof by induction is
available upon request if you doubt its correctness.
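The same algorithm can be written out as a subroutine, which makes the reasoning easier to follow: line number n replaces the current choice with probability 1/n, so after N lines each one has been kept with probability 1/N. A sketch only; the helper name random_line is hypothetical, and the demonstration reads from an in-memory filehandle, which needs a Perl newer than the one this guide describes.

```perl
use strict;
use warnings;

# Hypothetical helper: return one line chosen uniformly at random,
# holding only a single line in memory at a time.
sub random_line {
    my ($fh) = @_;
    my $chosen;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
        # rand($count) < 1 is true with probability 1/$count.
        $chosen = $line if rand($count) < 1;
    }
    return $chosen;
}

my $text = "one\ntwo\nthree\nfour\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
print random_line($fh);
close($fh);
```

The first line is always chosen initially, since rand(1) is always less than 1, and later lines displace it with decreasing probability.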
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or
otherwise), this work is covered under Perl's Artistic Licence. For separate distributions of all or part of
this FAQ outside of that, see perlfaq.
Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged
to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit to the FAQ would be courteous but is not required.

perlfaq6

Perl Programmers Reference Guide

perlfaq6

NAME
perlfaq6 − Regexps ($Revision: 1.22 $, $Date: 1998/07/16 14:01:07 $)
DESCRIPTION
This section is surprisingly small because the rest of the FAQ is littered with answers involving regular
expressions. For example, decoding a URL and checking whether something is a number are handled with
regular expressions, but those answers are found elsewhere in this document (in the sections on Data and on
Networking, to be precise).
How can I hope to use regular expressions without creating illegible and unmaintainable code?
Three techniques can make regular expressions maintainable and understandable.
Comments Outside the Regexp
Describe what you‘re doing and how you‘re doing it, using normal Perl comments.
# turn the line into the first word, a colon, and the
# number of characters on the rest of the line
s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg;
Comments Inside the Regexp
The /x modifier causes whitespace to be ignored in a regexp pattern (except in a character class), and
also allows you to use normal comments there, too. As you can imagine, whitespace and comments
help a lot.
/x lets you turn this:
s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs;
into this:
s{ <                    # opening angle bracket
    (?:                 # Non-backreffing grouping paren
         [^>'"] *       # 0 or more things that are neither > nor ' nor "
            |           #    or else
         ".*?"          # a section between double quotes (stingy match)
            |           #    or else
         '.*?'          # a section between single quotes (stingy match)
    ) +                 #   all occurring one or more times
   >                    # closing angle bracket
}{}gsx;                 # replace with nothing, i.e. delete

It‘s still not quite so clear as prose, but it is very useful for describing the meaning of each part of the
pattern.
Different Delimiters
While we normally think of patterns as being delimited with / characters, they can be delimited by
almost any character. perlre describes this. For example, the s/// above uses braces as delimiters.
Selecting another delimiter can avoid quoting the delimiter within the pattern:
s/\/usr\/local/\/usr\/share/g;      # bad delimiter choice
s#/usr/local#/usr/share#g;          # better

I‘m having trouble matching over more than one line. What‘s wrong?
Either you don‘t have more than one line in the string you‘re looking at (probably), or else you aren‘t using
the correct modifier(s) on your pattern (possibly).
There are many ways to get multiline data into a string. If you want it to happen automatically while reading
input, you'll want to set $/ (probably to '' for paragraphs or undef for the whole file) to allow you to read
more than one line at a time.

Read perlre to help you decide which of /s and /m (or both) you might want to use: /s allows dot to
include newline, and /m allows caret and dollar to match next to a newline, not just at the end of the string.
You do need to make sure that you‘ve actually got a multiline string in there.
For example, this program detects duplicate words, even when they span line breaks (but not paragraph
ones). For this example, we don‘t need /s because we aren‘t using dot in a regular expression that we want
to cross line boundaries. Neither do we need /m because we aren‘t wanting caret or dollar to match at any
point inside the record next to newlines. But it‘s imperative that $/ be set to something other than the
default, or else we won‘t actually ever have a multiline record read in.
$/ = '';                # read in whole paragraph, not just one line
while ( <> ) {
    while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {    # word starts alpha
        print "Duplicate $1 at paragraph $.\n";
    }
}
Here's code that finds lines that begin with "From " (which would be mangled by many mailers):
$/ = '';                # read in whole paragraph, not just one line
while ( <> ) {
    while ( /^From /gm ) {    # /m makes ^ match next to \n
        print "leading from in paragraph $.\n";
    }
}
Here's code that finds everything between START and END in the input:
undef $/;               # read in whole file, not just one line or paragraph
while ( <> ) {
    while ( /START(.*?)END/sgm ) {    # /s makes . cross line boundaries; /g moves on to the next match
        print "$1\n";
    }
}
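To see what /s and /m each change, it helps to try them on a small two-line string. The following is only an
illustrative sketch; the variable names are invented for the example.

```perl
use strict;

my $text = "first line\nsecond line\n";

# Without /s, dot stops at the newline; with /s it crosses it.
my $dot_plain = ($text =~ /first.*second/)  ? 1 : 0;
my $dot_s     = ($text =~ /first.*second/s) ? 1 : 0;

# Without /m, ^ matches only at the start of the string;
# with /m it also matches just after each embedded newline.
my $caret_plain = ($text =~ /^second/)  ? 1 : 0;
my $caret_m     = ($text =~ /^second/m) ? 1 : 0;

print "$dot_plain $dot_s $caret_plain $caret_m\n";    # prints "0 1 0 1"
```

The two modifiers are independent, so a pattern that needs both behaviours simply uses /sm.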
How can I pull out lines between two patterns that are themselves on different lines?
You can use Perl‘s somewhat exotic .. operator (documented in perlop):
perl -ne 'print if /START/ .. /END/' file1 file2 ...
If you wanted text and not lines, you would use
perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...
But if you want nested occurrences of START through END, you‘ll run up against the problem described in
the question in this section on matching balanced text.
Here‘s another example of using ..:
while (<>) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof();
    # now choose between them
} continue {
    reset if eof();         # fix $.
}
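The START/END one-liner shown earlier can also be written inside a script. This sketch runs the range
operator over a made-up list of lines instead of a file, so that the result is easy to check by eye.

```perl
use strict;

my @lines = ("intro\n", "START\n", "payload\n", "END\n", "outro\n");
my @between;
foreach (@lines) {
    # The range is false until /START/ matches, then stays true
    # up to and including the line on which /END/ matches.
    push @between, $_ if /START/ .. /END/;
}
print @between;    # START, payload and END, one per line
```

Note that the two marker lines themselves are included in the output.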
I put a regular expression into $/ but it didn‘t work. What‘s wrong?
$/ must be a string, not a regular expression. Awk has to be better for something. :−)
Actually, you could do this if you don‘t mind reading the whole file into memory:

undef $/;
@records = split /your_pattern/, <FH>;
The Net::Telnet module (available from CPAN) has the capability to wait for a pattern in the input stream, or
timeout if it doesn‘t appear within a certain time.
## Create a file with three lines.
open FH, ">file";
print FH "The first line\nThe second line\nThe third line\n";
close FH;
## Get a read/write filehandle to it.
use FileHandle;
$fh = new FileHandle "+<file";
## Attach it to a "stream" object.
use Net::Telnet;
$file = new Net::Telnet (-fhopen => $fh);
## Search for the second line and print out the third.
$file->waitfor('/second line\n/');
print $file->getline;
How do I substitute case insensitively on the LHS, but preserving case on the RHS?
It depends on what you mean by "preserving case". The following script makes the substitution have the
same case, letter by letter, as the original. If the substitution has more characters than the string being
substituted, the case of the last character is used for the rest of the substitution.
# Original by Nathan Torkington, massaged by Jeffrey Friedl
#
sub preserve_case($$)
{
my ($old, $new) = @_;
my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc
my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
my ($len) = $oldlen < $newlen ? $oldlen : $newlen;
for ($i = 0; $i < $len; $i++) {
if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
$state = 0;
} elsif (lc $c eq $c) {
substr($new, $i, 1) = lc(substr($new, $i, 1));
$state = 1;
} else {
substr($new, $i, 1) = uc(substr($new, $i, 1));
$state = 2;
}
}
# finish up with any remaining new (for when new is longer than old)
if ($newlen > $oldlen) {
if ($state == 1) {
substr($new, $oldlen) = lc(substr($new, $oldlen));
} elsif ($state == 2) {
substr($new, $oldlen) = uc(substr($new, $oldlen));
}
}
return $new;
}

$a = "this is a TEsT case";
$a =~ s/(test)/preserve_case($1, "success")/gie;
print "$a\n";
This prints:
this is a SUcCESS case
How can I make \w match national character sets?
See perllocale.
How can I match a locale−smart version of /[a−zA−Z]/?
One alphabetic character would be /[^\W\d_]/, no matter what locale you‘re in. Non−alphabetics would
be /[\W\d_]/ (assuming you don‘t consider an underscore a letter).
How can I quote a variable to use in a regexp?
The Perl parser will expand $variable and @variable references in regular expressions unless the
delimiter is a single quote. Remember, too, that the right−hand side of a s/// substitution is considered a
double−quoted string (see perlop for more details). Remember also that any regexp special characters will
be acted on unless you precede the substitution with \Q. Here‘s an example:
$string = "to die?";
$lhs = "die?";
$rhs = "sleep no more";
$string =~ s/\Q$lhs/$rhs/;
# $string is now "to sleep no more"
Without the \Q, the regexp would also spuriously match "di".
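The difference is easy to demonstrate on a string that contains "di" but not "die?". This is a small sketch
with invented variable names; quotemeta() is the built-in function form of \Q.

```perl
use strict;

my $lhs = "die?";

# Unquoted, the ? makes the "e" optional, so "di" alone matches.
my $loose  = ("to dig" =~ /$lhs/)   ? 1 : 0;
# With \Q the ? is a literal question mark, so nothing matches.
my $strict = ("to dig" =~ /\Q$lhs/) ? 1 : 0;

# quotemeta() returns the string with metacharacters backslashed.
my $quoted = quotemeta($lhs);

print "$loose $strict $quoted\n";    # prints "1 0 die\?"
```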
What is /o really for?
Using a variable in a regular expression match forces a re−evaluation (and perhaps recompilation) each time
through. The /o modifier locks in the regexp the first time it‘s used. This always happens in a constant
regular expression, and in fact, the pattern was compiled into the internal format at the same time your entire
program was.
Use of /o is irrelevant unless variable interpolation is used in the pattern, and if so, the regexp engine will
neither know nor care whether the variables change after the pattern is evaluated the very first time.
/o is often used to gain an extra measure of efficiency by not performing subsequent evaluations when you
know it won‘t matter (because you know the variables won‘t change), or more rarely, when you don‘t want
the regexp to notice if they do.
For example, here‘s a "paragrep" program:
$/ = ’’; # paragraph mode
$pat = shift;
while (<>) {
print if /$pat/o;
}
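The "neither know nor care" part is worth seeing once. In this sketch (variable names invented for the
example) the pattern variable changes on the second pass through the loop, but the /o match keeps using the
pattern it compiled on the first pass.

```perl
use strict;

my (@with_o, @without_o);
for my $pat ("a", "b") {
    # /o compiles the pattern on the first pass ($pat is "a")
    # and never looks at $pat again.
    push @with_o,    (("b" =~ /$pat/o) ? 1 : 0);
    # Without /o the current value of $pat is used each time.
    push @without_o, (("b" =~ /$pat/)  ? 1 : 0);
}
print "@with_o | @without_o\n";    # prints "0 0 | 0 1"
```

In the paragrep program this is exactly what you want, since $pat is set once and never changes.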
How do I use a regular expression to strip C style comments from a file?
While this actually can be done, it‘s much harder than you‘d think. For example, this one−liner
perl −0777 −pe ’s{/\*.*?\*/}{}gs’ foo.c
will work in many but not all cases. You see, it‘s too simple−minded for certain kinds of C programs, in
particular, those with what appear to be comments in quoted strings. For that, you‘d need something like
this, created by Jeffrey Friedl:
$/ = undef;
$_ = <>;

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|\n+|.[^/"'\\]*)#$2#g;
print;
This could, of course, be more legibly written with the /x modifier, adding whitespace and comments.
Can I use Perl regular expressions to match balanced text?
Although Perl regular expressions are more powerful than "mathematical" regular expressions, because they
feature conveniences like backreferences (\1 and its ilk), they still aren‘t powerful enough. You still need to
use non−regexp techniques to parse balanced text, such as the text enclosed between matching parentheses or
braces, for example.
An elaborate subroutine (for 7-bit ASCII only) to pull out balanced and possibly nested single chars, like `
and ', { and }, or ( and ) can be found in
http://www.perl.com/CPAN/authors/id/TOMC/scripts/pull_quotes.gz .
The C::Scan module from CPAN contains such subs for internal usage, but they are undocumented.
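For the simplest case, one kind of bracket and no quoting rules, the non-regexp technique amounts to counting
nesting depth. The following is a minimal sketch under those assumptions; first_balanced() is a hypothetical
name, and this is not the pull_quotes script mentioned above.

```perl
use strict;

# first_balanced() is a hypothetical helper: it returns the first
# parenthesized group in $text, including nested groups, or undef
# if the parentheses never balance. It knows nothing about quoting
# or backslash escapes.
sub first_balanced {
    my $text  = shift;
    my $start = index($text, "(");
    return undef if $start < 0;
    my $depth = 0;
    for my $i ($start .. length($text) - 1) {
        my $c = substr($text, $i, 1);
        $depth++ if $c eq "(";
        $depth-- if $c eq ")";
        return substr($text, $start, $i - $start + 1) if $depth == 0;
    }
    return undef;    # ran out of text while still nested
}

print first_balanced("f(a, g(b, c)) + h(d)"), "\n";    # prints "(a, g(b, c))"
```

The depth counter is the piece of state that a plain regular expression has no way to carry.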
What does it mean that regexps are greedy? How can I get around it?
Most people mean that greedy regexps match as much as they can. Technically speaking, it‘s actually the
quantifiers (?, *, +, {}) that are greedy rather than the whole pattern; Perl prefers local greed and immediate
gratification to overall greed. To get non−greedy versions of the same quantifiers, use (??, *?, +?, {}?).
An example:
$s1 = $s2 = "I am very very cold";
$s1 =~ s/ve.*y //;      # I am cold
$s2 =~ s/ve.*?y //;     # I am very cold
Notice how the second substitution stopped matching as soon as it encountered "y ". The *? quantifier
effectively tells the regular expression engine to find a match as quickly as possible and pass control on to
whatever is next in line, like you would if you were playing hot potato.
How do I process each word on each line?
Use the split function:
while (<>) {
foreach $word ( split ) {
# do something with $word here
}
}
Note that this isn‘t really a word in the English sense; it‘s just chunks of consecutive non−whitespace
characters.
To work with only alphanumeric sequences, you might consider
while (<>) {
foreach $word (m/(\w+)/g) {
# do something with $word here
}
}
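The two approaches give different answers as soon as punctuation is involved. This short sketch runs both
on one sample line (the line itself is made up):

```perl
use strict;

my $line = "Hello, world! It's here.\n";

my @chunks = split ' ', $line;      # "Hello,", "world!", "It's", "here."
my @words  = $line =~ m/(\w+)/g;    # "Hello", "world", "It", "s", "here"

print scalar(@chunks), " chunks, ", scalar(@words), " words\n";
```

Note how \w+ drops the punctuation but also breaks "It's" in two at the apostrophe.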
How can I print out a word−frequency or line−frequency summary?
To do this, you have to parse out each word in the input stream. We‘ll pretend that by word you mean chunk
of alphabetics, hyphens, or apostrophes, rather than the non−whitespace chunk idea of a word given in the
previous question:
while (<>) {
    while ( /(\b[^\W_\d][\w'-]+\b)/g ) {    # misses "`sheep'"
        $seen{$1}++;
    }
}
while ( ($word, $count) = each %seen ) {
    print "$count $word\n";
}
If you wanted to do the same thing for lines, you wouldn‘t need a regular expression:
while (<>) {
$seen{$_}++;
}
while ( ($line, $count) = each %seen ) {
print "$count $line";
}
If you want these output in a sorted order, see the section on Hashes.
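As a quick illustration of one such ordering, this sketch prints the most frequent words first. The counts
in %seen are sample data standing in for the hash built by the loops above.

```perl
use strict;

my %seen = (the => 3, sheep => 1, and => 2);    # sample counts

# Highest count first; ties broken alphabetically.
my @order = sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen;
foreach my $word (@order) {
    print "$seen{$word} $word\n";
}
```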
How can I do approximate matching?
See the module String::Approx available from CPAN.
How do I efficiently match many regular expressions at once?
The following is super−inefficient:
while (<FH>) {
foreach $pat (@patterns) {
if ( /$pat/ ) {
# do something
}
}
}
Instead, you either need to use one of the experimental Regexp extension modules from CPAN (which might
well be overkill for your purposes), or else put together something like this, inspired from a routine in Jeffrey
Friedl‘s book:
sub _bm_build {
my $condition = shift;
my @regexp = @_; # this MUST not be local(); need my()
my $expr = join $condition => map { "m/\$regexp[$_]/o" } (0..$#regexp);
my $match_func = eval "sub { $expr }";
die if $@; # propagate $@; this shouldn’t happen!
return $match_func;
}
sub bm_and { _bm_build('&&', @_) }
sub bm_or { _bm_build('||', @_) }
$f1 = bm_and qw{
xterm
(?i)window
};
$f2 = bm_or qw{
\b[Ff]ree\b
\bBSD\B
(?i)sys(tem)?\s*[V5]\b
};
# feed me /etc/termcap, prolly
while ( <> ) {
print "1: $_" if &$f1;

print "2: $_" if &$f2;
}
Why don‘t word−boundary searches with \b work for me?
Two common misconceptions are that \b is a synonym for \s+, and that it‘s the edge between whitespace
characters and non−whitespace characters. Neither is correct. \b is the place between a \w character and a
\W character (that is, \b is the edge of a "word"). It‘s a zero−width assertion, just like ^, $, and all the
other anchors, so it doesn‘t consume any characters. perlre describes the behaviour of all the regexp
metacharacters.
Here are examples of the incorrect application of \b, with fixes:
"two words" =~ /(\w+)\b(\w+)/;              # WRONG
"two words" =~ /(\w+)\s+(\w+)/;             # right

" =matchless= text" =~ /\b=(\w+)=\b/;       # WRONG
" =matchless= text" =~ /=(\w+)=/;           # right

Although they may not do what you thought they did, \b and \B can still be quite useful. For an example of
the correct use of \b, see the example of matching duplicate words over multiple lines.
An example of using \B is the pattern \Bis\B. This will find occurrences of "is" on the insides of words
only, as in "thistle", but not "this" or "island".
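A quick way to convince yourself is to run the pattern over a few words. This is a sketch only; the hash
name %inside is invented for the example.

```perl
use strict;

my %inside;
foreach my $word (qw(thistle this island is)) {
    # \Bis\B needs a \w character on both sides of "is".
    $inside{$word} = ($word =~ /\Bis\B/) ? 1 : 0;
}
print join(" ", map { "$_=$inside{$_}" } sort keys %inside), "\n";
# prints "is=0 island=0 this=0 thistle=1"
```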
Why does using $&, $`, or $' slow my program down?
Because once Perl sees that you need one of these variables anywhere in the program, it has to provide them
on each and every pattern match. The same mechanism that handles these provides for the use of $1, $2,
etc., so you pay the same price for each regexp that contains capturing parentheses. But if you never use $&,
etc., in your script, then regexps without capturing parentheses won't be penalized. So avoid $&, $', and
$` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once,
use them at will, because you've already paid the price.
What good is \G in a regular expression?
The notation \G is used in a match or substitution in conjunction with the /g modifier (and ignored if there's no
/g) to anchor the regular expression to the point just past where the last match occurred, i.e. the pos()
point.
For example, suppose you had a line of text quoted in standard mail and Usenet notation (that is, with
leading > characters), and you want to change each leading > into a corresponding :. You could do so in this
way:
s/^(>+)/':' x length($1)/gem;
Or, using \G, the much simpler (and faster):
s/\G>/:/g;
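The anchor is what keeps the substitution from touching a > later in the line. The following sketch
compares the anchored and unanchored forms on one made-up line:

```perl
use strict;

my $line = ">>> a > b";

# \G forces each match to start where the previous one ended,
# so the run stops at the first character that is not ">".
(my $anchored = $line) =~ s/\G>/:/g;

# Without \G every ">" in the line is replaced.
(my $global = $line) =~ s/>/:/g;

print "$anchored\n";    # prints "::: a > b"
print "$global\n";      # prints "::: a : b"
```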
A more sophisticated use might involve a tokenizer. The following lex−like example is courtesy of Jeffrey
Friedl. It did not work in 5.003 due to bugs in that release, but does work in 5.004 or better. (Note the use of
/c, which prevents a failed match with /g from resetting the search position back to the beginning of the
string.)
while (<>) {
    chomp;
    PARSER: {
        m/ \G( \d+\b    )/gcx && do { print "number: $1\n"; redo; };
        m/ \G( \w+      )/gcx && do { print "word: $1\n";   redo; };
        m/ \G( \s+      )/gcx && do { print "space: $1\n";  redo; };
        m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n";  redo; };
    }
}

Of course, that could have been written as
while (<>) {
    chomp;
    PARSER: {
        if ( /\G( \d+\b    )/gcx ) {
            print "number: $1\n";
            redo PARSER;
        }
        if ( /\G( \w+      )/gcx ) {
            print "word: $1\n";
            redo PARSER;
        }
        if ( /\G( \s+      )/gcx ) {
            print "space: $1\n";
            redo PARSER;
        }
        if ( /\G( [^\w\d]+ )/gcx ) {
            print "other: $1\n";
            redo PARSER;
        }
    }
}
But then you lose the vertical alignment of the regular expressions.
Are Perl regexps DFAs or NFAs? Are they POSIX compliant?
While it‘s true that Perl‘s regular expressions resemble the DFAs (deterministic finite automata) of the
egrep(1) program, they are in fact implemented as NFAs (non−deterministic finite automata) to allow
backtracking and backreferencing. And they aren‘t POSIX−style either, because those guarantee worst−case
behavior for all cases. (It seems that some people prefer guarantees of consistency, even when what‘s
guaranteed is slowness.) See the book "Mastering Regular Expressions" (from O‘Reilly) by Jeffrey Friedl
for all the details you could ever hope to know on these matters (a full citation appears in perlfaq2).
What‘s wrong with using grep or map in a void context?
Both grep and map build a return list, regardless of their context. This means you‘re making Perl go to the
trouble of building up a return list that you then just ignore. That‘s no way to treat a programming language,
you insensitive scoundrel!
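If all you want is the side effect, a foreach loop (or its statement-modifier form) says so directly. A
minimal sketch, with invented variable names:

```perl
use strict;

my @nums  = (1, 2, 3);
my $total = 0;

# Void-context style to avoid:  map { $total += $_ } @nums;
$total += $_ foreach @nums;    # same side effect, no return list involved

print "$total\n";    # prints "6"
```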
How can I match strings with multibyte characters?
This is hard, and there‘s no good way. Perl does not directly support wide characters. It pretends that a byte
and a character are synonymous. The following set of approaches was offered by Jeffrey Friedl, whose
article in issue #5 of The Perl Journal talks about this very matter.
Let‘s suppose you have some weird Martian encoding where pairs of ASCII uppercase letters encode single
Martian letters (i.e. the two bytes "CV" make a single Martian letter, as do the two bytes "SG", "VS", "XX",
etc.). Other bytes represent single characters, just like ASCII.
So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine characters 'I', ' ', 'a', 'm', ' ',
'CV', 'SG', 'XX', '!'.
Now, say you want to search for the single character /GX/. Perl doesn‘t know about Martian, so it‘ll find the
two bytes "GX" in the "I am CVSGXX!" string, even though that character isn‘t there: it just looks like it is
because "SG" is next to "XX", but there‘s no real "GX". This is a big problem.
Here are a few ways, all painful, to deal with it:
$martian =~ s/([A-Z][A-Z])/ $1 /g;    # Make sure adjacent "martian" bytes
                                      # are no longer adjacent.
print "found GX!\n" if $martian =~ /GX/;
Or like this:
@chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
# above is conceptually similar to:     @chars = $text =~ m/(.)/g;
#
foreach $char (@chars) {
    # parenthesize print's arguments so that it runs before last does
    print("found GX!\n"), last if $char eq 'GX';
}
Or like this:
while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {    # \G probably unneeded
    # parenthesize print's arguments so that it runs before last does
    print("found GX!\n"), last if $1 eq 'GX';
}
Or like this:
die "sorry, Perl doesn't (yet) have Martian support )-:\n";
In addition, a sample program which converts half-width to full-width katakana (in Shift-JIS or EUC
encoding) is available from CPAN.
There are many double− (and multi−) byte encodings commonly used these days. Some versions of these
have 1−, 2−, 3−, and 4−byte characters, all mixed.
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as part of the Standard Version of Perl, or as part of its complete documentation whether
printed or otherwise, this work may be distributed only under the terms of Perl's Artistic License. Any
distribution of this file or derivatives thereof outside of that package requires that special arrangements be
made with the copyright holder.
Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You
are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit would be courteous but is not required.

perlfaq7

Perl Programmers Reference Guide

perlfaq7

NAME
perlfaq7 − Perl Language Issues ($Revision: 1.21 $, $Date: 1998/06/22 15:20:07 $)
DESCRIPTION
This section deals with general Perl language issues that don‘t clearly fit into any of the other sections.
Can I get a BNF/yacc/RE for the Perl language?
There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if
you‘re particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into
toke.c as well.
In the words of Chaim Frenkel: "Perl‘s grammar can not be reduced to BNF. The work of parsing perl is
distributed between yacc, the lexer, smoke and mirrors."
What are all these $@%* punctuation signs, and how do I know when to use them?
They are type specifiers, as detailed in perldata:
$ for scalar values (number, string or reference)
@ for arrays
% for hashes (associative arrays)
* for all types of that symbol name. In version 4 you used them like
  pointers, but in modern perls you can just use references.

While there are a few places where you don‘t actually need these type specifiers, you should always use
them.
A couple of others that you‘re likely to encounter that aren‘t really type specifiers are:
<> are used for inputting a record from a filehandle.
\ takes a reference to something.
Note that <FILE> is neither the type specifier for files nor the name of the handle. It is the <> operator
applied to the handle FILE. It reads one line (well, record − see $/) from the handle FILE in scalar context,
or all lines in list context. When performing open, close, or any other operation besides <> on files, or even
talking about the handle, do not use the brackets. These are correct: eof(FH), seek(FH, 0, 2) and
"copying from STDIN to FILE".
Do I always/never have to quote my strings or use semicolons and commas?
Normally, a bareword doesn‘t need to be quoted, but in most cases probably should be (and must be under
use strict). But a hash key consisting of a simple word (that isn‘t the name of a defined subroutine)
and the left−hand operand to the => operator both count as though they were quoted:
This                    is like this
------------            ---------------
$foo{line}              $foo{"line"}
bar => stuff            "bar" => stuff

The final semicolon in a block is optional, as is the final comma in a list. Good style (see perlstyle) says to
put them in except for one−liners:
if ($whoops) { exit 1 }
@nums = (1, 2, 3);
if ($whoops) {
exit 1;
}
@lines = (
"There Beren came from mountains cold",
"And lost he wandered under leaves",
);

How do I skip some return values?
One way is to treat the return values as a list and index into it:
$dir = (getpwnam($user))[7];
Another way is to use undef as an element on the left−hand−side:
($dev, $ino, undef, undef, $uid, $gid) = stat($file);
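A list slice can pick out several values at once, not just one. As an illustrative sketch, this pulls three
fields out of the nine that gmtime() returns:

```perl
use strict;

# gmtime(0) is the epoch, 1 January 1970; fields 3, 4 and 5 are
# day of month, month (0-based) and year minus 1900.
my ($mday, $mon, $year) = (gmtime(0))[3, 4, 5];

print "$mday $mon $year\n";    # prints "1 0 70"
```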
How do I temporarily block warnings?
The $^W variable (documented in perlvar) controls runtime warnings for a block:
{
    local $^W = 0;          # temporarily turn off warnings
    $a = $b + $c;           # I know these might be undef
}
Note that like all the punctuation variables, you cannot currently use my() on $^W, only local().
A new use warnings pragma is in the works to provide finer control over all this. The curious should
check the perl5−porters mailing list archives for details.
What‘s an extension?
A way of calling compiled C code from Perl. Reading perlxstut is a good place to learn more about
extensions.
Why do Perl operators have different precedence than C operators?
Actually, they don‘t. All C operators that Perl copies have the same precedence in Perl as they do in C. The
problem is with operators that C doesn‘t have, especially functions that give a list context to everything on
their right, eg print, chmod, exec, and so on. Such functions are called "list operators" and appear as such in
the precedence table in perlop.
A common mistake is to write:
unlink $file || die "snafu";
This gets interpreted as:
unlink ($file || die "snafu");
To avoid this problem, either put in extra parentheses or use the super low precedence or operator:
(unlink $file) || die "snafu";
unlink $file or die "snafu";
The "English" operators (and, or, xor, and not) deliberately have precedence lower than that of list
operators for just such situations as the one above.
Another operator with surprising precedence is exponentiation. It binds more tightly even than unary minus,
making -2**2 produce a negative not a positive four. It is also right-associating, meaning that 2**3**2 is
two raised to the ninth power, not eight squared.
Although it has the same precedence as in C, Perl‘s ?: operator produces an lvalue. This assigns $x to
either $a or $b, depending on the trueness of $maybe:
($maybe ? $a : $b) = $x;
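The exponentiation and ?: points are easy to check for yourself. A small sketch, using invented variable
names:

```perl
use strict;

my $neg   = -2**2;      # parsed as -(2**2)
my $tower = 2**3**2;    # parsed as 2**(3**2)

my ($x, $y) = (0, 0);
my $maybe   = 1;
($maybe ? $x : $y) = 42;    # assigns to $x because $maybe is true

print "$neg $tower $x $y\n";    # prints "-4 512 42 0"
```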
How do I declare/create a structure?
In general, you don‘t "declare" a structure. Just use a (probably anonymous) hash reference. See perlref and
perldsc for details. Here‘s an example:
$person = {};                   # new anonymous hash
$person->{AGE}  = 24;           # set field AGE to 24
$person->{NAME} = "Nat";        # set field NAME to "Nat"

If you‘re looking for something a bit more rigorous, try perltoot.
How do I create a module?
A module is a package that lives in a file of the same name. For example, the Hello::There module would
live in Hello/There.pm. For details, read perlmod. You‘ll also find Exporter helpful. If you‘re writing a C
or mixed−language module with both C and Perl, then you should study perlxstut.
Here's a convenient template you might wish to use when starting your own module. Make sure to change
the names appropriately.
package Some::Module;  # assumes Some/Module.pm

use strict;

BEGIN {
    use Exporter   ();
    use vars       qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);

    ## set the version for version checking; uncomment to use
    ## $VERSION     = 1.00;

    # if using RCS/CVS, this next line may be preferred,
    # but beware two-digit versions.
    $VERSION = do{my@r=q$Revision: 1.21 $=~/\d+/g;sprintf '%d.'.'%02d'x$#r,@r};

    @ISA         = qw(Exporter);
    @EXPORT      = qw(&func1 &func2 &func3);
    %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

    # your exported package globals go here,
    # as well as any optionally exported functions
    @EXPORT_OK   = qw($Var1 %Hashit);
}
use vars      @EXPORT_OK;

# non-exported package globals go here
use vars      qw( @more $stuff );

# initialize package globals, first exported ones
$Var1   = '';
%Hashit = ();

# then the others (which are still accessible as $Some::Module::stuff)
$stuff  = '';
@more   = ();

# all file-scoped lexicals must be created before
# the functions below that use them.

# file-private lexicals go here
my $priv_var    = '';
my %secret_hash = ();

# here's a file-private function as a closure,
# callable as &$priv_func; it cannot be prototyped.
my $priv_func = sub {
    # stuff goes here.
};

# make all your functions, whether exported or not;
# remember to put something interesting in the {} stubs
sub func1      {}    # no prototype
sub func2()    {}    # proto'd void
sub func3($$)  {}    # proto'd to 2 scalars

# this one isn't exported, but could be called!
sub func4(\%)  {}    # proto'd to 1 hash ref

END { }       # module clean-up code here (global destructor)

1;            # modules must return true
How do I create a class?
See perltoot for an introduction to classes and objects, as well as perlobj and perlbot.
How can I tell if a variable is tainted?
See Laundering and Detecting Tainted Data in perlsec. Here‘s an example (which doesn‘t use any system
calls, because the kill() is given no processes to signal):
sub is_tainted {
return ! eval { join('',@_), kill 0; 1; };
}
This is not −w clean, however. There is no −w clean way to detect taintedness − take this as a hint that you
should untaint all possibly−tainted data.
What‘s a closure?
Closures are documented in perlref.
Closure is a computer science term with a precise but hard−to−explain meaning. Closures are implemented
in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These
lexicals magically refer to the variables that were around when the subroutine was defined (deep binding).
Closures make sense in any programming language where you can have the return value of a function be
itself a function, as you can in Perl. Note that some languages provide anonymous functions but are not
capable of providing proper closures; the Python language, for example. For more information on closures,
check out any textbook on functional programming. Scheme is a language that not only supports but
encourages closures.
Here‘s a classic function−generating function:
sub add_function_generator {
return sub { shift + shift };
}
$add_sub = add_function_generator();
$sum = $add_sub->(4,5);                # $sum is 9 now.

The closure works as a function template with some customization slots left out to be filled later. The
anonymous subroutine returned by add_function_generator() isn‘t technically a closure because it
refers to no lexicals outside its own scope.
Contrast this with the following make_adder() function, in which the returned anonymous function
contains a reference to a lexical variable outside the scope of that function itself. Such a reference requires
that Perl return a proper closure, thus locking in for all time the value that the lexical had when the function
was created.
sub make_adder {
my $addpiece = shift;
return sub { shift + $addpiece };
}
$f1 = make_adder(20);
$f2 = make_adder(555);

Now &$f1($n) is always 20 plus whatever $n you pass in, whereas &$f2($n) is always 555 plus
whatever $n you pass in. The $addpiece in the closure sticks around.
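What sticks around is the variable itself, so a closure can also keep private state between calls. Here is
a minimal sketch in the same spirit as make_adder(); the name make_counter() is invented for the example.

```perl
use strict;

# make_counter() is a hypothetical example: each call creates a
# fresh $count that only the returned subroutine can see.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);

# The two counters advance independently of each other.
my @got = ($c1->(), $c1->(), $c2->(), $c1->());
print "@got\n";    # prints "10 11 100 12"
```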
Closures are often used for less esoteric purposes. For example, when you want to pass in a bit of code into
a function:
my $line;
timeout( 30, sub { $line = <STDIN> } );
If the code to execute had been passed in as a string, '$line = <STDIN>', there would have been no
way for the hypothetical timeout() function to access the lexical variable $line back in its caller's
scope.
What is variable suicide and how can I prevent it?
Variable suicide is when you (temporarily or permanently) lose the value of a variable. It is caused by
scoping through my() and local() interacting with either closures or aliased foreach() iterator
variables and subroutine arguments. It used to be easy to inadvertently lose a variable's value this way, but
now it's much harder. Take this code:
my $f = "foo";
sub T {
while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
}
T;
print "Finally $f\n";
The $f that has "bar" added to it three times should be a new $f (my $f should create a new local variable
each time through the loop). It isn‘t, however. This is a bug, and will be fixed.
How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regexp}?
With the exception of regexps, you need to pass references to these objects. See
Pass by Reference in perlsub for this particular question, and perlref for information on references.
Passing Variables and Functions
Regular variables and functions are quite easy: just pass in a reference to an existing or anonymous
variable or function:
    func( \$some_scalar );

    func( \@some_array  );
    func( [ 1 .. 10 ]   );

    func( \%some_hash   );
    func( { this => 10, that => 20 }   );

    func( \&some_func   );
    func( sub { $_[0] ** $_[1] }   );


Passing Filehandles
To pass filehandles to subroutines, use the *FH or \*FH notations. These are "typeglobs" − see
Typeglobs and Filehandles in perldata and especially Pass by Reference in perlsub for more
information.
Here‘s an excerpt:
If you're passing around filehandles, you could usually just use the bare typeglob, like *STDOUT, but
typeglob references would be better because they'll still work properly under use strict
'refs'. For example:
    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }
If you‘re planning on generating new filehandles, you could do this:
    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }
    $fh = openit('< /etc/motd');
    print <$fh>;
Passing Regexps
To pass regexps around, you'll need to either use one of the highly experimental regular expression
modules from CPAN (Nick Ing-Simmons's Regexp or Ilya Zakharevich's Devel::Regexp), pass
around strings and use an exception-trapping eval, or else be very, very clever. Here's an example
of how to pass in a string to be regexp compared:
    sub compare($$) {
        my ($val1, $regexp) = @_;
        my $retval = eval { $val1 =~ /$regexp/ };
        die if $@;
        return $retval;
    }
    $match = compare("old McDonald", q/d.*D/);
Make sure you never say something like this:
    return eval "\$val =~ /$regexp/";   # WRONG

or someone can sneak shell escapes into the regexp due to the double interpolation of the eval and the
double−quoted string. For example:
    $pattern_of_evil = 'danger ${ system("rm -rf * &") } danger';
    eval "\$string =~ /$pattern_of_evil/";
Those preferring to be very, very clever might see the O‘Reilly book, Mastering Regular Expressions,
by Jeffrey Friedl. Page 273‘s Build_MatchMany_Function() is particularly interesting. A
complete citation of this book is given in perlfaq2.
Passing Methods
To pass an object method into a subroutine, you can do this:
    call_a_lot(10, $some_obj, "methname");
    sub call_a_lot {
        my ($count, $widget, $trick) = @_;
        for (my $i = 0; $i < $count; $i++) {
            $widget->$trick();
        }
    }
Or you can use a closure to bundle up the object and its method call and arguments:


    my $whatnot = sub { $some_obj->obfuscate(@args) };
    func($whatnot);
    sub func {
        my $code = shift;
        &$code();
    }
You could also investigate the can() method in the UNIVERSAL class (part of the standard perl
distribution).
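For the can() route, a minimal sketch might look like the following. The Widget class and its obfuscate() method are hypothetical stand-ins for your own object; only can() itself comes from UNIVERSAL.

```perl
use strict;
use warnings;

# Hypothetical class, defined here only so the example is self-contained.
package Widget;
sub new       { return bless {}, shift }
sub obfuscate { my ($self, @args) = @_; return "obfuscated: @args" }

package main;

my $obj = Widget->new;

# can() returns a code reference if the method exists, a false value otherwise.
my $code    = $obj->can('obfuscate');
my $result  = $code ? $obj->$code('a', 'b') : undef;
my $missing = $obj->can('no_such_method');

print "$result\n";
```

The code reference returned by can() may be passed around like any other, but note that the object still has to be supplied when it is finally called.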
How do I create a static variable?
As with most things in Perl, TMTOWTDI. What is a "static variable" in other languages could be either a
function−private variable (visible only within a single function, retaining its value between calls to that
function), or a file−private variable (visible only to functions within the file it was declared in) in Perl.
Here‘s code to implement a function−private variable:
    BEGIN {
        my $counter = 42;
        sub prev_counter { return --$counter }
        sub next_counter { return $counter++ }
    }
Now prev_counter() and next_counter() share a private variable $counter that was initialized
at compile time.
To declare a file−private variable, you‘ll still use a my(), putting it at the outer scope level at the top of the
file. Assume this is in file Pax.pm:
package Pax;
my $started = scalar(localtime(time()));
sub begun { return $started }
When use Pax or require Pax loads this module, the variable will be initialized. It won‘t get
garbage−collected the way most variables going out of scope do, because the begun() function cares about
it, but no one else can get it. It is not called $Pax::started because its scope is unrelated to the package.
It‘s scoped to the file. You could conceivably have several packages in that same file all accessing the same
private variable, but another file with the same package couldn‘t get to it.
See Persistent Private Variables in perlsub for details.
What‘s the difference between dynamic and lexical (static) scoping? Between local() and
my()?
local($x) saves away the old value of the global variable $x, and assigns a new value for the duration
of the subroutine, which is visible in other functions called from that subroutine. This is done at run−time,
so is called dynamic scoping. local() always affects global variables, also called package variables or
dynamic variables.
my($x) creates a new variable that is only visible in the current subroutine. This is done at compile−time,
so is called lexical or static scoping. my() always affects private variables, also called lexical variables or
(improperly) static(ly scoped) variables.
For instance:
    sub visible {
        print "var has value $var\n";
    }

    sub dynamic {
        local $var = 'local';   # new temporary value for the still-global
        visible();              #   variable called $var
    }

    sub lexical {
        my $var = 'private';    # new private variable, $var
        visible();              # (invisible outside of sub scope)
    }

    $var = 'global';

    visible();                  # prints global
    dynamic();                  # prints local
    lexical();                  # prints global


Notice how at no point does the value "private" get printed. That's because $var only has that value within
the block of the lexical() function, and it is hidden from the called subroutine.
In summary, local() doesn‘t make what you think of as private, local variables. It gives a global variable
a temporary value. my() is what you‘re looking for if you want private variables.
See "Private Variables via my()" and "Temporary Values via local()" for excruciating details.
How can I access a dynamic variable while a similarly named lexical is in scope?
You can do this via symbolic references, provided you haven't set use strict "refs". So instead of
$var, use ${'var'}.
    local $var = "global";
    my    $var = "lexical";

    print "lexical is $var\n";

    no strict 'refs';
    print "global is ${'var'}\n";
If you know your package, you can just mention it explicitly, as in $Some_Pack::var. Note that the
notation $::var is not the dynamic $var in the current package, but rather the one in the main package,
as though you had written $main::var. Specifying the package directly makes you hard−code its name,
but it executes faster and avoids running afoul of use strict "refs".
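The three notations can be compared side by side in a small sketch. The package name Some_Pack follows the text; the $seen_* variables are ours and exist only to hold the results.

```perl
use strict;
use warnings;

package Some_Pack;

$Some_Pack::var = "global";     # the package (dynamic) variable
my $var         = "lexical";    # a lexical that hides the short name

my $seen_lexical = $var;
my $seen_global  = $Some_Pack::var;      # fully qualified: fine under strict

# Symbolic reference: needs strict refs switched off for this block.
my $seen_symref  = do { no strict 'refs'; ${'var'} };

# $::var means $main::var, which is a different variable altogether.
my $seen_main    = defined $::var ? $::var : "(undefined)";

package main;
print "$seen_lexical / $seen_global / $seen_symref / $seen_main\n";
```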
What‘s the difference between deep and shallow binding?
In deep binding, lexical variables mentioned in anonymous subroutines are the same ones that were in scope
when the subroutine was created. In shallow binding, they are whichever variables with the same names
happen to be in scope when the subroutine is called. Perl always uses deep binding of lexical variables (i.e.,
those created with my()). However, dynamic variables (aka global, local, or package variables) are
effectively shallowly bound. Consider this just one more reason not to use them. See the answer to
"What‘s a closure?".
Why doesn't "my($foo) = <FILE>;" work right?
my() and local() give list context to the right hand side of =. The <FH> read operation, like so many of
Perl's functions and operators, can tell which context it was called in and behaves appropriately. In general,
the scalar() function can help. This function does nothing to the data itself (contrary to popular myth) but
rather tells its argument to behave in whatever its scalar fashion is. If that function doesn't have a defined
scalar behavior, this of course doesn't help you (such as with sort()).
To enforce scalar context in this particular case, however, you need merely omit the parentheses:
    local($foo) = <FILE>;           # WRONG
    local($foo) = scalar(<FILE>);   # ok
    local $foo  = <FILE>;           # right

You should probably be using lexical variables anyway, although the issue is the same here:
    my($foo) = <FILE>;  # WRONG
    my $foo  = <FILE>;  # right

How do I redefine a builtin function, operator, or method?
Why do you want to do that? :−)
If you want to override a predefined function, such as open(), then you‘ll have to import the new definition
from a different module. See Overriding Builtin Functions in perlsub. There‘s also an example in
Class::Template in perltoot.
If you want to overload a Perl operator, such as + or **, then you‘ll want to use the use overload
pragma, documented in overload.
If you‘re talking about obscuring method calls in parent classes, see Overridden Methods in perltoot.
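To make the operator case concrete, here is a minimal sketch of the overload pragma. The Vec2 class and its method names are invented for the illustration; only the use overload interface itself is standard.

```perl
# Hypothetical two-dimensional vector class, for illustration only.
package Vec2;
use strict;
use warnings;
use overload
    '+'  => \&add,          # called for $v1 + $v2
    '""' => \&to_string;    # called when the object is used as a string

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

sub add {
    my ($left, $right) = @_;
    return Vec2->new($left->{x} + $right->{x}, $left->{y} + $right->{y});
}

sub to_string {
    my $self = shift;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum  = Vec2->new(1, 2) + Vec2->new(3, 4);
my $text = "$sum";
print "$text\n";
```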
What‘s the difference between calling a function as &foo and foo()?
When you call a function as &foo, you allow that function access to your current @_ values, and you
by−pass prototypes. That means that the function doesn‘t get an empty @_, it gets yours! While not strictly
speaking a bug (it‘s documented that way in perlsub), it would be hard to consider this a feature in most
cases.
When you call your function as &foo(), then you do get a new @_, but prototyping is still circumvented.
Normally, you want to call a function using foo(). You may only omit the parentheses if the function is
already known to the compiler because it already saw the definition (use but not require), or via a
forward reference or use subs declaration. Even in this case, you get a clean @_ without any of the old
values leaking through where they don‘t belong.
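The difference in what the callee sees can be demonstrated with a short sketch. The subroutine names count_args() and caller_demo() are ours.

```perl
use strict;
use warnings;

# Reports how many arguments it received.
sub count_args { return scalar @_ }

sub caller_demo {
    # Inside here, @_ is (1, 2, 3).
    my $leaked = &count_args;      # no parens: the callee sees our @_
    my $fresh  = &count_args();    # parens: new, empty @_ (prototypes still bypassed)
    my $normal = count_args();     # the usual form: new, empty @_
    return ($leaked, $fresh, $normal);
}

my @counts = caller_demo(1, 2, 3);
print "@counts\n";
```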
How do I create a switch or case statement?
This is explained in more depth in perlsyn. Briefly, there's no official case statement, because of the
variety of tests possible in Perl (numeric comparison, string comparison, glob comparison, regexp matching,
overloaded comparisons, ...). Larry couldn‘t decide how best to do this, so he left it out, even though it‘s
been on the wish list since perl1.
The general answer is to write a construct like this:
    for ($variable_to_test) {
        if    (/pat1/)  { }     # do something
        elsif (/pat2/)  { }     # do something else
        elsif (/pat3/)  { }     # do something else
        else            { }     # default
    }


Here‘s a simple example of a switch based on pattern matching, this time lined up in a way to make it look
more like a switch statement. We‘ll do a multi−way conditional based on the type of reference stored in
$whatchamacallit:
    SWITCH: for (ref $whatchamacallit) {

        /^$/        && die "not a reference";

        /SCALAR/    && do {
                        print_scalar($$whatchamacallit);
                        last SWITCH;
                    };

        /ARRAY/     && do {
                        print_array(@$whatchamacallit);
                        last SWITCH;
                    };

        /HASH/      && do {
                        print_hash(%$whatchamacallit);
                        last SWITCH;
                    };

        /CODE/      && do {
                        warn "can't print function ref";
                        last SWITCH;
                    };

        # DEFAULT

        warn "User defined type skipped";

    }
See perlsyn/"Basic BLOCKs and Switch Statements" for many other examples in this style.
Sometimes you should change the positions of the constant and the variable. For example, let‘s say you
wanted to test which of many answers you were given, but in a case−insensitive way that also allows
abbreviations. You can use the following technique if the strings all start with different characters, or if you
want to arrange the matches so that one takes precedence over another, as "SEND" has precedence over
"STOP" here:
    chomp($answer = <>);
    if    ("SEND"  =~ /^\Q$answer/i) { print "Action is send\n"  }
    elsif ("STOP"  =~ /^\Q$answer/i) { print "Action is stop\n"  }
    elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
    elsif ("LIST"  =~ /^\Q$answer/i) { print "Action is list\n"  }
    elsif ("EDIT"  =~ /^\Q$answer/i) { print "Action is edit\n"  }


A totally different approach is to create a hash of function references.
    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }
How can I catch accesses to undefined variables/functions/methods?
The AUTOLOAD method, discussed in Autoloading in perlsub and
AUTOLOAD: Proxy Methods in perltoot, lets you capture calls to undefined functions and methods.
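A minimal sketch of such a catch-all follows. The Lazy class and the frobnicate() call are hypothetical; the $AUTOLOAD variable and the DESTROY guard are the standard parts of the technique.

```perl
# Hypothetical class whose only real method is new().
package Lazy;
use strict;
use warnings;
our $AUTOLOAD;

sub new { return bless {}, shift }

# Perl calls this for any method it cannot find in the class.
sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # don't intercept object destruction
    return "caught call to $name(@_)";
}

package main;

my $msg = Lazy->new->frobnicate(1, 2);
print "$msg\n";
```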
When it comes to undefined variables that would trigger a warning under −w, you can use a handler to trap
the pseudo−signal __WARN__ like this:
    $SIG{__WARN__} = sub {

        for ( $_[0] ) {         # voici un switch statement

            /Use of uninitialized value/  && do {
                # promote warning to a fatal
                die $_;
            };

            # other warning cases to catch could go here;

            warn $_;
        }

    };
Why can‘t a method included in this same file be found?
Some possible reasons: your inheritance is getting confused, you‘ve misspelled the method name, or the
object is of the wrong type. Check out perltoot for details on these. You may also use print
ref($object) to find out the class $object was blessed into.
Another possible reason for problems is because you‘ve used the indirect object syntax (eg, find Guru
"Samy") on a class name before Perl has seen that such a package exists. It‘s wisest to make sure your
packages are all defined before you start using them, which will be taken care of if you use the use
statement instead of require. If not, make sure to use arrow notation (eg, Guru−>find("Samy"))
instead. Object notation is explained in perlobj.
Make sure to read about creating modules in perlmod and the perils of indirect objects in
WARNING in perlobj.
How can I find out my current package?
If you‘re just a random program, you can do this to find out what the currently compiled package is:
my $packname = __PACKAGE__;
But if you‘re a method and you want to print an error message that includes the kind of object you were
called on (which is not necessarily the same as the one in which you were compiled):
sub amethod {
my $self = shift;
my $class = ref($self) || $self;
warn "called me from a $class object";
}
How can I comment out a large block of perl code?
Use embedded POD to discard it:
    # program is here

    =for nobody
    This paragraph is commented out

    # program continues

    =begin comment text

    all of this stuff

    here will be ignored
    by everyone

    =end comment text

    =cut

This can‘t go just anywhere. You have to put a pod directive where the parser is expecting a new statement,
not just in the middle of an expression or some other arbitrary yacc grammar production.
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as part of the Standard Version of Perl, or as part of its complete documentation whether
printed or otherwise, this work may be distributed only under the terms of Perl‘s Artistic License. Any


distribution of this file or derivatives thereof outside of that package require that special arrangements be
made with copyright holder.
Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You
are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit would be courteous but is not required.


perlfaq8

Perl Programmers Reference Guide

perlfaq8

NAME
perlfaq8 − System Interaction ($Revision: 1.26 $, $Date: 1998/08/05 12:20:28 $)
DESCRIPTION
This section of the Perl FAQ covers questions involving operating system interaction. This involves
interprocess communication (IPC), control over the user−interface (keyboard, screen and pointing devices),
and most anything else not related to data manipulation.
Read the FAQs and documentation specific to the port of perl to your operating system (eg, perlvms,
perlplan9, ...). These should contain more detailed information on the vagaries of your perl.
How do I find out which operating system I‘m running under?
The $^O variable ($OSNAME if you use English) contains the operating system that your perl binary was
built for.
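A short sketch of branching on that value is shown below. The three names tested are common values of $^O, but the list is only a sample, and the $family labels are ours.

```perl
use strict;
use warnings;

# $^O names the platform the perl binary was built for, not necessarily
# the one it is running on.
my $family =
      $^O eq 'MSWin32' ? 'Windows'
    : $^O eq 'VMS'     ? 'VMS'
    : $^O eq 'MacOS'   ? 'classic Mac OS'
    :                    'probably Unix-like';

print "perl was built for $^O ($family)\n";
```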
How come exec() doesn‘t return?
Because that‘s what it does: it replaces your currently running program with a different one. If you want to
keep going (as is probably the case if you‘re asking this question) use system() instead.
How do I do fancy stuff with the keyboard/screen/mouse?
How you access/control keyboards, screens, and pointing devices ("mice") is system−dependent. Try the
following modules:
    Keyboard
        Term::Cap                   Standard perl distribution
        Term::ReadKey               CPAN
        Term::ReadLine::Gnu         CPAN
        Term::ReadLine::Perl        CPAN
        Term::Screen                CPAN

    Screen
        Term::Cap                   Standard perl distribution
        Curses                      CPAN
        Term::ANSIColor             CPAN

    Mouse
        Tk                          CPAN

Some of these specific cases are shown below.
How do I print something out in color?
In general, you don‘t, because you don‘t know whether the recipient has a color−aware display device. If
you know that they have an ANSI terminal that understands color, you can use the Term::ANSIColor module
from CPAN:
use Term::ANSIColor;
print color("red"), "Stop!\n", color("reset");
print color("green"), "Go!\n", color("reset");
Or like this:
use Term::ANSIColor qw(:constants);
print RED, "Stop!\n", RESET;
print GREEN, "Go!\n", RESET;
How do I read just one key without waiting for a return key?
Controlling input buffering is a remarkably system-dependent matter. On most systems, you can just use the
stty command as shown in getc, but as you see, that's already getting you into portability snags.


    open(TTY, "+</dev/tty") or die "no tty: $!";
    system "stty  cbreak </dev/tty >/dev/tty 2>&1";
    $key = getc(TTY);           # perhaps this works
    # OR ELSE
    sysread(TTY, $key, 1);      # probably this does
    system "stty -cbreak </dev/tty >/dev/tty 2>&1";
The Term::ReadKey module from CPAN offers an easy−to−use interface that should be more efficient than
shelling out to stty for each key. It even includes limited support for Windows.
use Term::ReadKey;
ReadMode(’cbreak’);
$key = ReadKey(0);
ReadMode(’normal’);
However, that requires that you have a working C compiler and can use it to build and install a CPAN
module. Here's a solution using the standard POSIX module, which is already on your system (assuming
your system supports POSIX).
use HotKey;
$key = readkey();
And here‘s the HotKey module, which hides the somewhat mystifying calls to manipulate the POSIX
termios structures.
    # HotKey.pm
    package HotKey;

    @ISA = qw(Exporter);
    @EXPORT = qw(cbreak cooked readkey);

    use strict;
    use POSIX qw(:termios_h);
    my ($term, $oterm, $echo, $noecho, $fd_stdin);

    $fd_stdin = fileno(STDIN);
    $term     = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm    = $term->getlflag();

    $echo     = ECHO | ECHOK | ICANON;
    $noecho   = $oterm & ~$echo;

    sub cbreak {
        $term->setlflag($noecho);  # ok, so i don't want echo either
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub readkey {
        my $key = '';
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }

    END { cooked() }

    1;
How do I check whether input is ready on the keyboard?
The easiest way to do this is to read a key in nonblocking mode with the Term::ReadKey module from
CPAN, passing it an argument of −1 to indicate not to block:
    use Term::ReadKey;

    ReadMode('cbreak');

    if (defined ($char = ReadKey(-1)) ) {
        # input was waiting and it was $char
    } else {
        # no input was waiting
    }

    ReadMode('normal');                  # restore normal tty settings

How do I clear the screen?
If you only have to do so infrequently, use system:
system("clear");
If you have to do this a lot, save the clear string so you can print it 100 times without calling a program 100
times:
    $clear_string = `clear`;
    print $clear_string;
If you're planning on doing other screen manipulations, like cursor positions, etc, you might wish to use
the Term::Cap module:
    use Term::Cap;
    $terminal = Term::Cap->Tgetent( {OSPEED => 9600} );
    $clear_string = $terminal->Tputs('cl');
How do I get the screen size?
If you have the Term::ReadKey module installed from CPAN, you can use it to fetch the width and height in
characters and in pixels:
use Term::ReadKey;
($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize();
This is more portable than the raw ioctl, but not as illustrative:
    require 'sys/ioctl.ph';
    die "no TIOCGWINSZ " unless defined &TIOCGWINSZ;
open(TTY, "+autoflush(1);


As mentioned in the previous item, this still doesn‘t work when using socket I/O between Unix and
Macintosh. You‘ll need to hardcode your line terminators, in that case.
non−blocking input
If you are doing a blocking read() or sysread(), you‘ll have to arrange for an alarm handler to
provide a timeout (see alarm). If you have a non−blocking open, you‘ll likely have a non−blocking
read, which means you may have to use a 4−arg select() to determine whether I/O is ready on that
device (see select in perlfunc).
While trying to read from his caller−id box, the notorious Jamie Zawinski &1");
# starting cu hoses /dev/tty’s stty settings, even when it has
# been opened on a pipe...
system("/bin/stty $stty");
$_ = ;
chop;
if ( !m/^Connected/ ) {
print STDERR "$0: cu printed ‘$_’ instead of ‘Connected’\n";
}
}
How do I decode encrypted password files?
You spend lots and lots of money on dedicated hardware, but this is bound to get you talked about.
Seriously, you can‘t if they are Unix password files − the Unix password system employs one−way
encryption. It‘s more like hashing than encryption. The best you can check is whether something else
hashes to the same string. You can‘t turn a hash back into the original string. Programs like Crack can
forcibly (and intelligently) try to guess passwords, but don‘t (can‘t) guarantee quick success.
If you‘re worried about users selecting bad passwords, you should proactively check when they try to change
their password (by modifying passwd(1), for example).
How do I start a process in the background?
You could use
system("cmd &")
or you could use fork as documented in fork in perlfunc, with further examples in perlipc. Some things to be
aware of, if you‘re on a Unix−like system:
STDIN, STDOUT, and STDERR are shared
Both the main process and the backgrounded one (the "child" process) share the same STDIN,
STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen.
You may want to close or reopen these for the child. You can get around this with opening a pipe
(see open in perlfunc) but on some systems this means that the child process cannot outlive the parent.
Signals
You‘ll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the
backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process
has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with
system("cmd&").


Zombies
You have to be prepared to "reap" the child process when it finishes:
    $SIG{CHLD} = sub { wait };
See Signals in perlipc for other examples of code to do this. Zombies are not an issue with
system("prog &").
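The fork route, with explicit reaping instead of a signal handler, can be sketched as follows. This assumes a Unix-like system with a working fork(); the exit value 7 is arbitrary and only there so the parent has something to inspect.

```perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do the background work here, then leave.
    exit 7;
}

# Parent: free to do other things, but must eventually reap the child
# so that it does not linger as a zombie.
my $reaped     = waitpid($pid, 0);
my $child_exit = $? >> 8;
print "reaped $reaped, exit value $child_exit\n";
```

A long-running parent would more likely install a SIGCHLD handler or call waitpid() with the WNOHANG flag from POSIX, so that it never blocks waiting for the child.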
How do I trap control characters/signals?
You don‘t actually "trap" a control character. Instead, that character generates a signal which is sent to your
terminal‘s currently foregrounded process group, which you then trap in your process. Signals are
documented in Signals in perlipc and chapter 6 of the Camel.
Be warned that very few C libraries are re−entrant. Therefore, if you attempt to print() in a handler that
got invoked during another stdio operation your internal structures will likely be in an inconsistent state, and
your program will dump core. You can sometimes avoid this by using syswrite() instead of print().
Unless you‘re exceedingly careful, the only safe things to do inside a signal handler are: set a variable and
exit. And in the first case, you should only set a variable in such a way that malloc() is not called (eg, by
setting a variable that already has a value).
For example:
    $Interrupted = 0;   # to ensure it has a value
    $SIG{INT} = sub {
        $Interrupted++;
        syswrite(STDERR, "ouch\n", 5);
    };
However, because syscalls restart by default, you'll find that if you're in a "slow" call, such as <FH>,
read(), connect(), or wait(), that the only way to terminate them is by "longjumping" out; that is, by
raising an exception. See the time−out handler for a blocking flock() in Signals in perlipc or chapter 6 of
the Camel.
How do I modify the shadow password file on a Unix system?
If perl was installed correctly, and your shadow library was written properly, the getpw*() functions
described in perlfunc should in theory provide (read−only) access to entries in the shadow password file. To
change the file, make a new shadow password file (the format varies from system to system − see passwd(5)
for specifics) and use pwd_mkdb(8) to install it (see pwd_mkdb(5) for more details).
How do I set the time and date?
Assuming you‘re running under sufficient permissions, you should be able to set the system−wide date and
time by running the date(1) program. (There is no way to set the time and date on a per−process basis.) This
mechanism will work for Unix, MS−DOS, Windows, and NT; the VMS equivalent is set time.
However, if all you want to do is change your timezone, you can probably get away with setting an
environment variable:
    $ENV{TZ} = "MST7MDT";                       # unixish
    $ENV{'SYS$TIMEZONE_DIFFERENTIAL'}="-5";     # vms
    system "trn comp.lang.perl.misc";
How can I sleep() or alarm() for under a second?
If you want finer granularity than the 1 second that the sleep() function provides, the easiest way is to use
the select() function as documented in select in perlfunc. If your system has itimers and syscall()
support, you can check out the old example in
http://www.perl.com/CPAN/doc/misc/ancient/tutorial/eg/itimers.pl .
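The select() idiom is short enough to show in full. In this sketch the quarter-second delay is an arbitrary choice, and $waited is only there to show that the call returns promptly.

```perl
use strict;
use warnings;

# With no filehandles to watch (three undefs), the 4-argument select()
# simply waits for the timeout, which may be fractional.
my $started = time;
select(undef, undef, undef, 0.25);
my $waited = time - $started;       # whole seconds only, so 0 or 1 here

print "back after about a quarter of a second\n";
```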


How can I measure time under a second?
In general, you may not be able to. The Time::HiRes module (available from CPAN) provides this
functionality for some systems.
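Where Time::HiRes is installed, a timing sketch might look like this. The loop being timed is a placeholder for your own operation; gettimeofday and tv_interval are functions the module exports on request.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my $t0 = [gettimeofday];            # [seconds, microseconds]

# Placeholder workload: replace with the operation you want to time.
my $total = 0;
$total += $_ for 1 .. 100_000;

my $elapsed = tv_interval($t0);     # seconds since $t0, as a float
printf "took %.6f seconds\n", $elapsed;
```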
Alternatively, if your system supports both the syscall() function in Perl and a system call like
gettimeofday(2), then you may be able to do something like this:
    require 'sys/syscall.ph';

    $TIMEVAL_T = "LL";

    $done = $start = pack($TIMEVAL_T, ());

    syscall( &SYS_gettimeofday, $start, 0) != -1
               or die "gettimeofday: $!";

       ##########################
       # DO YOUR OPERATION HERE #
       ##########################

    syscall( &SYS_gettimeofday, $done, 0) != -1
           or die "gettimeofday: $!";

    @start = unpack($TIMEVAL_T, $start);
    @done  = unpack($TIMEVAL_T, $done);

    # fix microseconds
    for ($done[1], $start[1]) { $_ /= 1_000_000 }

    $delta_time = sprintf "%.4f", ($done[0]  + $done[1]  )
                                            -
                                 ($start[0] + $start[1] );


How can I do an atexit() or setjmp()/longjmp()? (Exception handling)
Release 5 of Perl added the END block, which can be used to simulate atexit(). Each package‘s END
block is called when the program or thread ends (see perlmod manpage for more details).
For example, you can use this to make sure your filter program managed to finish its output without filling
up the disk:
END {
close(STDOUT) || die "stdout close failed: $!";
}
The END block isn‘t called when untrapped signals kill the program, though, so if you use END blocks you
should also use
    use sigtrap qw(die normal-signals);
Perl‘s exception−handling mechanism is its eval() operator. You can use eval() as setjmp and die()
as longjmp. For details of this, see the section on signals, especially the time−out handler for a blocking
flock() in Signals in perlipc and chapter 6 of the Camel.
If exception handling is all you‘re interested in, try the exceptions.pl library (part of the standard perl
distribution).
If you want the atexit() syntax (and an rmexit() as well), try the AtExit module available from
CPAN.
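The eval/die pairing can be sketched in a few lines. The risky() subroutine and its error message are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical operation that may fail.
sub risky {
    my $n = shift;
    die "negative input\n" if $n < 0;   # "longjmp": unwind to the nearest eval
    return sqrt $n;
}

my $result = eval { risky(-1) };        # "setjmp": establish a recovery point
my $error  = $@;                        # copy $@ before anything can reset it

if ($error) {
    chomp $error;
    print "recovered from: $error\n";
}
```

The block form of eval is used here, so the code is compiled once along with the rest of the program and only run-time exceptions are trapped.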
Why doesn‘t my sockets program work under System V (Solaris)? What does the error message
"Protocol not supported" mean?
Some Sys−V based systems, notably Solaris 2.X, redefined some of the standard socket constants. Since
these were constant across all architectures, they were often hardwired into perl code. The proper way to


deal with this is to "use Socket" to get the correct values.
Note that even though SunOS and Solaris are binary compatible, these values are different. Go figure.
How can I call my system‘s unique C functions from Perl?
In most cases, you write an external module to do it − see the answer to "Where can I learn about linking C
with Perl? [h2xs, xsubpp]". However, if the function is a system call, and your system supports
syscall(), you can use the syscall function (documented in perlfunc).
Remember to check the modules that came with your distribution, and CPAN as well − someone may
already have written a module to do it.
Where do I get the include files to do ioctl() or syscall()?
Historically, these would be generated by the h2ph tool, part of the standard perl distribution. This program
converts cpp(1) directives in C header files to files containing subroutine definitions, like
&SYS_getitimer, which you can use as arguments to your functions. It doesn‘t work perfectly, but it
usually gets most of the job done. Simple files like errno.h, syscall.h, and socket.h were fine, but the hard
ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files:
    1.  become super-user
    2.  cd /usr/include
    3.  h2ph *.h */*.h

If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use
h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See
perlxstut for how to get started with h2xs.
If your system doesn‘t support dynamic loading, you still probably ought to use h2xs. See perlxstut and
ExtUtils::MakeMaker for more information (in brief, just use make perl instead of a plain make to rebuild
perl with a new static extension).
Why do setuid perl scripts complain about kernel problems?
Some operating systems have bugs in the kernel that make setuid scripts inherently insecure. Perl gives you
a number of options (described in perlsec) to work around such systems.
How can I open a pipe both to and from a command?
The IPC::Open2 module (part of the standard perl distribution) is an easy−to−use approach that internally
uses pipe(), fork(), and exec() to do the job. Make sure you read the deadlock warnings in its
documentation, though (see IPC::Open2). See also
Bidirectional Communication with Another Process in perlipc and
Bidirectional Communication with Yourself in perlipc.
You may also use the IPC::Open3 module (part of the standard perl distribution), but be warned that it has a
different order of arguments from IPC::Open2 (see IPC::Open3).
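A minimal open2() sketch follows. To stay self-contained it uses the running perl binary ($^X) as the child command, which is our choice for the illustration; any filter program would do. Closing the write handle before reading is what sidesteps the deadlock that the documentation warns about.

```perl
use strict;
use warnings;
use IPC::Open2;

# Child command: another perl that upper-cases each line it reads.
# open2() takes the read handle first, then the write handle.
my $pid = open2(my $from_child, my $to_child, $^X, '-ne', 'print uc');

print $to_child "hello\n";
close $to_child;                # send EOF so the child can finish

my $reply = <$from_child>;      # read what the child wrote back
waitpid($pid, 0);               # reap the child

print $reply;
```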
Why can‘t I get the output of a command with system()?
You're confusing the purpose of system() and backticks (``). system() runs a command and returns
exit status information (as a 16 bit value: the low 7 bits are the signal the process died from, if any, and the
high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT.
    $exit_status   = system("mail-users");
    $output_string = `ls`;
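The two halves of the status word can be pulled apart as in the sketch below. It runs the current perl binary ($^X) as the external command purely so the example does not depend on other programs, and it assumes the path to perl contains no characters the shell would treat specially.

```perl
use strict;
use warnings;

# system() returns the wait status, not the command's output.
my $status     = system($^X, '-e', 'exit 3');
my $exit_value = $status >> 8;      # high 8 bits: the real exit value
my $signal     = $status & 127;     # low 7 bits: fatal signal, if any

# Backticks return what the command wrote to STDOUT.
my $output = `$^X -e "print 42"`;

print "exit $exit_value, signal $signal, output $output\n";
```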
How can I capture STDERR from an external command?
There are three basic ways of running external commands:
    system $cmd;                # using system()
    $output = `$cmd`;           # using backticks (``)
    open (PIPE, "cmd |");       # using open()


With system(), both STDOUT and STDERR will go to the same place as the script's versions of these,
unless the command redirects them. Backticks and open() read only the STDOUT of your command.
With any of these, you can change file descriptors before the call:
open(STDOUT, ">logfile");
system("ls");
or you can use Bourne shell file−descriptor redirection:
$output = `$cmd 2>some_file`;
open (PIPE, "cmd 2>some_file |");
You can also use file−descriptor redirection to make STDERR a duplicate of STDOUT:
$output = `$cmd 2>&1`;
open (PIPE, "cmd 2>&1 |");
Note that you cannot simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling
the shell to do the redirection. This doesn‘t work:
open(STDERR, ">&STDOUT");
$alloutput = `cmd args`; # stderr still escapes
This fails because the open() makes STDERR go to where STDOUT was going at the time of the
open(). The backticks then make STDOUT go to a string, but don‘t change STDERR (which still goes to
the old STDOUT).
Note that you must use Bourne shell (sh(1)) redirection syntax in backticks, not csh(1)! Details on why
Perl‘s system() and backtick and pipe opens all use the Bourne shell are in
http://www.perl.com/CPAN/doc/FMTEYEWTK/versus/csh.whynot . To capture a command‘s STDERR and
STDOUT together:
$output = `cmd 2>&1`;                 # either with backticks
$pid = open(PH, "cmd 2>&1 |");        # or with an open pipe
while (<PH>) { }                      #    plus a read
To capture a command‘s STDOUT but discard its STDERR:
$output = `cmd 2>/dev/null`;          # either with backticks
$pid = open(PH, "cmd 2>/dev/null |"); # or with an open pipe
while (<PH>) { }                      #    plus a read
To capture a command‘s STDERR but discard its STDOUT:
$output = `cmd 2>&1 1>/dev/null`;            # either with backticks
$pid = open(PH, "cmd 2>&1 1>/dev/null |");   # or with an open pipe
while (<PH>) { }                             #    plus a read
To exchange a command‘s STDOUT and STDERR in order to capture the STDERR but leave its STDOUT
to come out our old STDERR:
$output = `cmd 3>&1 1>&2 2>&3 3>&-`;           # either with backticks
$pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");   # or with an open pipe
while (<PH>) { }                               #    plus a read
To read both a command‘s STDOUT and its STDERR separately, it‘s easiest and safest to redirect them
separately to files, and then read from those files when the program is done:
system("program args 1>/tmp/program.stdout 2>/tmp/program.stderr");
Ordering is important in all these examples. That‘s because the shell processes file descriptor redirections in
strictly left to right order.

system("prog args 1>tmpfile 2>&1");
system("prog args 2>&1 1>tmpfile");
The first command sends both standard out and standard error to the temporary file. The second command
sends only the old standard output there, and the old standard error shows up on the old standard out.
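To see the redirections side by side, here is a small sketch that runs one command three ways. It assumes a Bourne-compatible shell and /dev/null are available; the command itself (a child perl that writes one line to each stream) is a made-up stand-in.

```perl
use strict;
use warnings;

# Hypothetical command that writes one line to each stream.
my $cmd = qq{"$^X" -e 'print STDOUT "out\\n"; print STDERR "err\\n"'};

my $both     = `$cmd 2>&1`;              # STDOUT and STDERR together
my $out_only = `$cmd 2>/dev/null`;       # STDOUT only
my $err_only = `$cmd 2>&1 1>/dev/null`;  # STDERR only; the order matters

print "both:\n$both";
print "stdout only: $out_only";
print "stderr only: $err_only";
```

Swapping the two redirections in the last command would send both streams to /dev/null, because the shell applies them left to right.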
Why doesn‘t open() return an error when a pipe open fails?
It does, but probably not how you expect it to. On systems that follow the standard fork()/exec()
paradigm (such as Unix), it works like this: open() causes a fork(). In the parent, open() returns with
the process ID of the child. The child exec()s the command to be piped to/from. The parent can‘t know
whether the exec() was successful or not − all it can return is whether the fork() succeeded or not. To
find out if the command succeeded, you have to catch SIGCHLD and wait() to get the exit status. You
should also catch SIGPIPE if you‘re writing to the child — you may not have found out the exec() failed
by the time you write. This is documented in perlipc.
On systems that follow the spawn() paradigm, open() might do what you expect − unless perl uses a
shell to start your command. In this case the fork()/exec() description still applies.
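One way to learn the command's fate without installing a SIGCHLD handler is to check close() on the piped handle, which waits for the child and sets $?. The sketch below assumes a Unix-like system with a Bourne shell; the command name is deliberately one that is assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical command name, assumed not to exist on this system.
my $cmd = "no_such_command_for_this_demo 2>/dev/null";

# The redirection forces a shell, so open() reports only whether the
# fork worked, not whether the command could be run.
my $pid = open(my $ph, "$cmd |") or die "can't fork: $!";
my @lines = <$ph>;

# close() on a piped handle waits for the child and sets $?.
my $closed_ok  = close($ph);
my $exit_value = $? >> 8;

print "command failed with exit value $exit_value\n" unless $closed_ok;
```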
What‘s wrong with using backticks in a void context?
Strictly speaking, nothing. Stylistically speaking, it's not a good way to write maintainable code because
backticks have a (potentially humongous) return value, and you're ignoring it. It may also not be very
efficient, because you have to read in all the lines of output, allocate memory for them, and then throw them
away. Too often people are lulled into writing:
`cp file file.bak`;
And now they think "Hey, I‘ll just always use backticks to run programs." Bad idea: backticks are for
capturing a program‘s output; the system() function is for running programs.
Consider this line:
`cat /etc/termcap`;
You haven't assigned the output anywhere, so it just wastes memory (for a little while). Plus you forgot to
check $? to see whether the program even ran correctly. Even if you wrote
print `cat /etc/termcap`;
this could, in most cases, and probably should be written as
system("cat /etc/termcap") == 0
or die "cat program failed!";
which will get the output quickly (as it's generated, instead of only at the end) and also check the return
value.
system() also provides direct control over whether shell wildcard processing may take place, whereas
backticks do not.
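The control over wildcard processing comes from calling system() with a list rather than a single string. In this sketch the child is a second perl (found through $^X) that exits 0 only if its argument arrived untouched; the pattern is a hypothetical piece of input.

```perl
use strict;
use warnings;

my $pattern = '*.txt';    # hypothetical argument containing a shell wildcard

# List form: no shell is involved, so the child receives '*.txt' literally.
my $rc = system($^X, '-e', 'exit($ARGV[0] eq q{*.txt} ? 0 : 1)', $pattern);

print $rc == 0 ? "argument arrived unexpanded\n" : "argument was altered\n";
```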
How can I call backticks without shell processing?
This is a bit tricky. Instead of writing
@ok = `grep @opts '$search_string' @filenames`;
You have to do this:
my @ok = ();
if (open(GREP, "-|")) {
    while (<GREP>) {
        chomp;
        push(@ok, $_);
    }
    close GREP;
} else {
    exec 'grep', @opts, $search_string, @filenames;
}
Just as with system(), no shell escapes happen when you exec() a list.
There are more examples of this in "Safe Pipe Opens" in perlipc.
Why can‘t my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS−DOS)?
Because some stdio‘s set error and eof flags that need clearing. The POSIX module defines clearerr()
that you can use. That is the technically correct way to do it. Here are some less reliable workarounds:
1. Try keeping around the seekpointer and go there, like this:
   $where = tell(LOG);
   seek(LOG, $where, 0);
2. If that doesn't work, try seeking to a different part of the file and then back.
3. If that doesn't work, try seeking to a different part of the file, reading something, and then seeking
   back.
4. If that doesn't work, give up on your stdio package and use sysread.
How can I convert my shell script to perl?
Learn Perl and rewrite it. Seriously, there‘s no simple converter. Things that are awkward to do in the shell
are easy to do in Perl, and this very awkwardness is what would make a shell−perl converter nigh−on
impossible to write. By rewriting it, you‘ll think about what you‘re really trying to do, and hopefully will
escape the shell‘s pipeline datastream paradigm, which while convenient for some matters, causes many
inefficiencies.
Can I use perl to run a telnet or ftp session?
Try the Net::FTP, TCP::Client, and Net::Telnet modules (available from CPAN).
http://www.perl.com/CPAN/scripts/netstuff/telnet.emul.shar will also help for emulating the telnet protocol,
but Net::Telnet is quite probably easier to use.
If all you want to do is pretend to be telnet but don‘t need the initial telnet handshaking, then the standard
dual−process approach will suffice:
use IO::Socket;              # new in 5.004
$handle = IO::Socket::INET->new('www.perl.com:80')
    || die "can't connect to port 80 on www.perl.com: $!";
$handle->autoflush(1);
if (fork()) {                # XXX: undef means failure
    select($handle);
    print while <STDIN>;     # everything from stdin to socket
} else {
    print while <$handle>;   # everything from socket to stdout
}
close $handle;
exit;
How can I write expect in Perl?
Once upon a time, there was a library called chat2.pl (part of the standard perl distribution), which never
really got finished. If you find it somewhere, don‘t use it. These days, your best bet is to look at the Expect
module available from CPAN, which also requires two other modules from CPAN, IO::Pty and IO::Stty.
Is there a way to hide perl‘s command line from programs such as "ps"?
First of all note that if you‘re doing this for security reasons (to avoid people seeing passwords, for example)
then you should rewrite your program so that critical information is never given as an argument. Hiding the
arguments won‘t make your program completely secure.

To actually alter the visible command line, you can assign to the variable $0 as documented in perlvar. This
won‘t work on all operating systems, though. Daemon programs like sendmail place their state there, as in:
$0 = "orcus [accepting connections]";
I {changed directory, modified my environment} in a perl script. How come the change
disappeared when I exited the script? How do I get my changes to be visible?
Unix
In the strictest sense, it can‘t be done — the script executes as a different process from the shell it was
started from. Changes to a process are not reflected in its parent, only in its own children created after
the change. There is shell magic that may allow you to fake it by eval()ing the script‘s output in
your shell; check out the comp.unix.questions FAQ for details.
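The direction of inheritance can be seen in a few lines. In this sketch the variable name DEMO_SETTING is invented, and the child is a second perl started through $^X with a list-form pipe open.

```perl
use strict;
use warnings;

# Hypothetical variable name, used only for this demonstration.
$ENV{DEMO_SETTING} = 'set-by-parent';

# A child created after the change inherits it ...
open(my $child, '-|', $^X, '-e', 'print $ENV{DEMO_SETTING}')
    or die "can't start child: $!";
my $seen_by_child = <$child>;
close $child;

# ... but nothing done here reaches the shell that launched this script.
print "child saw: $seen_by_child\n";
```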
How do I close a process‘s filehandle without waiting for it to complete?
Assuming your system supports such things, just send an appropriate signal to the process (see
kill in perlfunc). It's common to first send a TERM signal, wait a little bit, and then send a KILL signal to
finish it off.
How do I fork a daemon process?
If by daemon process you mean one that‘s detached (disassociated from its tty), then the following process is
reported to work on most Unixish systems. Non−Unix users should check their Your_OS::Process module
for other solutions.
Open /dev/tty and use the TIOCNOTTY ioctl on it. See tty(4) for details. Or better yet, you can
just use the POSIX::setsid() function, so you don't have to worry about process groups.
Change directory to /
Reopen STDIN, STDOUT, and STDERR so they‘re not connected to the old tty.
Background yourself like this:
fork && exit;
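Put together, the steps above might look like the following sketch. The subroutine name daemonize() and the choice of /dev/null for the standard handles are assumptions, and the routine is only defined here, not called.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Sketch only: the name daemonize() and the use of /dev/null are assumptions.
sub daemonize {
    chdir '/'                      or die "can't chdir to /: $!";
    open(STDIN,  '<', '/dev/null') or die "can't read /dev/null: $!";
    open(STDOUT, '>', '/dev/null') or die "can't write to /dev/null: $!";

    defined(my $pid = fork)        or die "can't fork: $!";
    exit 0 if $pid;                # parent leaves; the child carries on

    my $sid = setsid();            # detach from the controlling tty
    die "can't start a new session: $!" unless defined $sid && $sid != -1;

    open(STDERR, '>&', \*STDOUT)   or die "can't dup stdout: $!";
    return $$;
}

# A long-running program would call daemonize() once, early in startup.
```

Forking before setsid() matters because a process group leader is not allowed to create a new session.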
How do I make my program run with sh and csh?
See the eg/nih script (part of the perl source distribution).
How do I find out if I‘m running interactively or not?
Good question. Sometimes -t STDIN and -t STDOUT can give clues, sometimes not.
if (-t STDIN && -t STDOUT) {
    print "Now what? ";
}
On POSIX systems, you can test whether your own process group matches the current process group of your
controlling terminal as follows:
use POSIX qw/getpgrp tcgetpgrp/;
open(TTY, "/dev/tty") or die $!;
$tpgrp = tcgetpgrp(fileno(*TTY));   # tcgetpgrp() expects a file descriptor
$pgrp = getpgrp();
if ($tpgrp == $pgrp) {
    print "foreground\n";
} else {
    print "background\n";
}
How do I timeout a slow event?
Use the alarm() function, probably in conjunction with a signal handler, as documented in "Signals" in perlipc
and chapter 6 of the Camel. You may instead use the more flexible Sys::AlarmCall module available from
CPAN.
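A common shape for the alarm() approach is an eval block with a local ALRM handler that dies. In this sketch the one-second limit is arbitrary and a five-second select() stands in for the slow event.

```perl
use strict;
use warnings;

my $timed_out = 0;
eval {
    local $SIG{ALRM} = sub { die "alarm\n" };   # "\n" keeps the message exact
    alarm 1;                                    # allow one second
    select(undef, undef, undef, 5);             # stands in for the slow event
    alarm 0;                                    # finished in time: cancel
};
if ($@) {
    die $@ unless $@ eq "alarm\n";              # rethrow unrelated errors
    $timed_out = 1;
}

print $timed_out ? "timed out\n" : "completed\n";
```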
How do I set CPU limits?
Use the BSD::Resource module from CPAN.
How do I avoid zombies on a Unix system?
Use the reaper code from "Signals" in perlipc to call wait() when a SIGCHLD is received, or else use the
double-fork technique described under fork in perlfunc.
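A reaper along those lines might look like this sketch. It uses waitpid() with WNOHANG instead of a plain wait() so that one signal can collect several finished children; the %exit_value_of hash and the polling loop are illustrative additions.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %exit_value_of;    # hypothetical bookkeeping: pid => exit value

# Reap every child that has finished; WNOHANG keeps waitpid from blocking.
sub reaper {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_value_of{$pid} = $? >> 8;
    }
}
$SIG{CHLD} = \&reaper;

defined(my $kid = fork) or die "can't fork: $!";
if ($kid == 0) {
    POSIX::_exit(7);  # child: leave at once with a recognisable status
}

# Parent: a real program would do other work; here we just poll briefly.
for (1 .. 50) {
    last if exists $exit_value_of{$kid};
    select(undef, undef, undef, 0.1);
}

print "child $kid exited with $exit_value_of{$kid}\n"
    if exists $exit_value_of{$kid};
```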
How do I use an SQL database?
There are a number of excellent interfaces to SQL databases. See the DBD::* modules available from
http://www.perl.com/CPAN/modules/dbperl/DBD . A lot of information on this can be found at
http://www.hermetica.com/technologia/perl/DBI/index.html .
How do I make a system() exit on control−C?
You can‘t. You need to imitate the system() call (see perlipc for sample code) and then have a signal
handler for the INT signal that passes the signal on to the subprocess. Or you can check for it:
$rc = system($cmd);
if ($rc & 127) { die "signal death" }
How do I open a file without blocking?
If you‘re lucky enough to be using a system that supports non−blocking reads (most Unixish systems do),
you need only to use the O_NDELAY or O_NONBLOCK flag from the Fcntl module in conjunction with
sysopen():
use Fcntl;
sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
    or die "can't open /tmp/somefile: $!";
How do I install a CPAN module?
The easiest way is to have the CPAN module do it for you. This module comes with perl version 5.004 and
later. To manually install the CPAN module, or any well−behaved CPAN module for that matter, follow
these steps:
1. Unpack the source into a temporary area.
2. perl Makefile.PL
3. make
4. make test
5. make install
If your version of perl is compiled without dynamic loading, then you just need to replace step 3 (make)
with make perl and you will get a new perl binary with your extension linked in.
See ExtUtils::MakeMaker for more details on building extensions. See also the next question.
What‘s the difference between require and use?
Perl offers several different ways to include code from one file into another. Here are the deltas between the
various inclusion constructs:
1)  do $file is like eval `cat $file`, except the former:
    1.1: searches @INC and updates %INC.
    1.2: bequeaths an *unrelated* lexical scope on the eval'ed code.
2)  require $file is like do $file, except the former:
    2.1: checks for redundant loading, skipping already loaded files.
    2.2: raises an exception on failure to find, compile, or execute $file.
3)  require Module is like require "Module.pm", except the former:
    3.1: translates each "::" into your system's directory separator.
    3.2: primes the parser to disambiguate class Module as an indirect object.
4)  use Module is like require Module, except the former:
    4.1: loads the module at compile time, not run-time.
    4.2: imports symbols and semantics from that package to the current one.
In general, you usually want use and a proper Perl module.
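Points 4.1 and 4.2 can be spelled out by hand. The sketch below is roughly what a use statement with an import list amounts to; File::Basename is picked only because it ships with perl, and the path is made up.

```perl
use strict;
use warnings;

# Roughly what "use File::Basename qw(basename);" does, spelled out:
BEGIN {
    require File::Basename;               # load once, at compile time
    File::Basename->import('basename');   # pull the name into this package
}

my $name = basename('/u/mydir/perl/script.pl');
print "$name\n";

# require recorded the file it loaded in %INC.
print "loaded from $INC{'File/Basename.pm'}\n";
```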
How do I keep my own module/library directory?
When you build modules, use the PREFIX option when generating Makefiles:
perl Makefile.PL PREFIX=/u/mydir/perl
then either set the PERL5LIB environment variable before you run scripts that use the modules/libraries (see
perlrun) or say
use lib '/u/mydir/perl';
See Perl‘s lib for more information.
How do I add the directory my program lives in to the module/library search path?
use FindBin;
use lib "$FindBin::Bin";
use your_own_modules;
How do I add a directory to my include path at runtime?
Here are the suggested ways of modifying your include path:
the PERLLIB environment variable
the PERL5LIB environment variable
the perl -Idir command line flag
the use lib pragma, as in
    use lib "$ENV{HOME}/myown_perllib";
The last is particularly useful because it knows about machine dependent architectures. The lib.pm
pragmatic module was first included with the 5.002 release of Perl.
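Since the pragma acts at compile time, a change made while the program is running has to touch @INC directly. The following sketch contrasts the two; both directory names are placeholders and need not exist.

```perl
use strict;
use warnings;

# Compile time: the pragma prepends its directory to @INC before later
# "use" statements are resolved. The path is only an example.
use lib '/u/mydir/perl';

# Run time: @INC is an ordinary array, so it can be changed directly.
# This affects later require() calls, not "use" statements already compiled.
my $extra_dir = '/u/mydir/more_perl';    # hypothetical directory
unshift @INC, $extra_dir;

print "$_\n" for @INC[0, 1];
```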
AUTHOR AND COPYRIGHT
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as part of the Standard Version of Perl, or as part of its complete documentation whether
printed or otherwise, this work may be distributed only under the terms of Perl‘s Artistic License. Any
distribution of this file or derivatives thereof outside of that package require that special arrangements be
made with copyright holder.
Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You
are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A
simple comment in the code giving credit would be courteous but is not required.

NAME
perlfaq9 − Networking ($Revision: 1.20 $, $Date: 1998/06/22 18:31:09 $)
DESCRIPTION
This section deals with questions related to networking, the internet, and a few on the web.
My CGI script runs from the command line but not the browser. (500 Server Error)
If you can demonstrate that you‘ve read the following FAQs and that your problem isn‘t something simple
that can be easily answered, you‘ll probably receive a courteous and useful reply to your question if you post
it on comp.infosystems.www.authoring.cgi (if it‘s something to do with HTTP, HTML, or the CGI
protocols). Questions that appear to be Perl questions but are really CGI ones that are posted to
comp.lang.perl.misc may not be so well received.
The useful FAQs and related documents are:
CGI FAQ
http://www.webthing.com/page.cgi/cgifaq
Web FAQ
http://www.boutell.com/faq/
WWW Security FAQ
http://www.w3.org/Security/Faq/
HTTP Spec
http://www.w3.org/pub/WWW/Protocols/HTTP/
HTML Spec
http://www.w3.org/TR/REC−html40/
http://www.w3.org/pub/WWW/MarkUp/
CGI Spec
http://www.w3.org/CGI/
CGI Security FAQ
http://www.go2net.com/people/paulp/cgi−security/safe−cgi.txt
How can I get better error messages from a CGI program?
Use the CGI::Carp module. It replaces warn and die, plus the normal Carp module's carp, croak, and
confess functions, with more verbose and safer versions. It still sends them to the normal server error log.
use CGI::Carp;
warn "This is a complaint";
die "But this one is serious";
The following use of CGI::Carp also redirects errors to a file of your choice, placed in a BEGIN block to
catch compile−time warnings as well:
BEGIN {
    use CGI::Carp qw(carpout);
    open(LOG, ">>/var/local/cgi-logs/mycgi-log")
        or die "Unable to append to mycgi-log: $!\n";
    carpout(*LOG);
}
You can even arrange for fatal errors to go back to the client browser, which is nice for your own debugging,
but might confuse the end user.
use CGI::Carp qw(fatalsToBrowser);
die "Bad error here";

Even if the error happens before you get the HTTP header out, the module will try to take care of this to
avoid the dreaded server 500 errors. Normal warnings still go out to the server error log (or wherever you‘ve
sent them with carpout) with the application name and date stamp prepended.
How do I remove HTML from a string?
The most correct way (albeit not the fastest) is to use HTML::Parse from CPAN (part of the libwww−perl
distribution, which is a must−have module for all web hackers).
Many folks attempt a simple-minded regular expression approach, like s/<.*?>//g, but that fails in many
cases because the tags may continue over line breaks, they may contain quoted angle-brackets, or HTML
comments may be present. Plus folks forget to convert entities, like &lt; for example.
Here‘s one "simple−minded" approach, that works for most files:
#!/usr/bin/perl -p0777
s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
If you want a more complete solution, see the 3−stage striphtml program in
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz .
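To try the substitution on a string instead of a whole file, something like the following sketch can be used. The sample markup is invented for the example, and entities are left unconverted.

```perl
use strict;
use warnings;

# Invented sample: a quoted ">" inside an attribute is the interesting part.
my $html = qq{<P ALIGN="center">Hello, <B>world</B>!</P>\n}
         . qq{<IMG SRC="foo.gif" ALT="A > B">};

# Same pattern as the one-liner: a tag is "<", then any mix of unquoted
# text and quoted strings, then ">".
(my $text = $html) =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;

print $text;
```

The naive s/<.*?>//g would stop at the ">" inside the ALT attribute and leave part of the tag behind.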
Here are some tricky cases that you should think about when picking a solution:
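<IMG SRC = "foo.gif" ALT = "A > B">
<IMG SRC = "foo.gif"
     ALT = "A > B">
<!-- <A comment> -->
<script>if (a<b && a>c)</script>
<# Just data #>
<![INCLUDE CDATA [ >>>>>>>>>>> ]]>
If HTML comments include other tags, those solutions would also break on text like this:
<!-- This section commented out.
    <B>You can't see me!</B>
-->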
How do I extract URLs?
A quick but imperfect approach is
#!/usr/bin/perl -n00
# qxurl - tchrist@perl.com
print "$2\n" while m{
    < \s*
      A \s+ HREF \s* = \s* (["']) (.*?) \1
    \s* >
}gsix;
This version does not adjust relative URLs, understand alternate bases, deal with HTML comments, deal
with HREF and NAME attributes in the same tag, or accept URLs themselves as arguments. It also runs
about 100x faster than a more "complete" solution using the LWP suite of modules, such as the
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz program.
How do I download a file from the user‘s machine? How do I open a file on another machine?
In the context of an HTML form, you can use what‘s known as multipart/form−data encoding. The
CGI.pm module (available from CPAN) supports this in the start_multipart_form() method, which
isn‘t the same as the startform() method.

How do I make a pop−up menu in HTML?
Use the