://link.springer.com/content/pdf/bbm%3A978 1 4939 2122 5%2F1 Bbm:978 5/1

User Manual: ://link.springer.com/content/pdf/bbm%3A978-1-4939-2122-5%2F1 R FAQ

Open the PDF directly: View PDF PDF.
Page Count: 188

Download://link.springer.com/content/pdf/bbm%3A978-1-4939-2122-5%2F1 Bbm:978-1-4939-2122-5/1
Open PDF In BrowserView PDF
Appendix A

R

R (R Core Team, 2015) is the lingua franca of data analysis and graphics. In this
book we will learn the fundamentals of the language and immediately use it for the
statistical analysis of data. We will graph both the data and the results of the analyses. We will work with the basic statistical tools (regression, analysis of variance,
and contingency tables) and some more specialized tools. Many of our examples
use the additional functions we have provided in the HH package (Appendix B)
(Heiberger, 2015). The R code for all tables and graphs in the book is included in
the HH package.
In later appendices we will also look at the Rcmdr menu system (Appendix C),
RExcel integration of R with Excel (Appendix D), and the shiny package for integration of R with interactive html web pages (Appendix E).

A.1 Installing R—Initial Installation
R is an open-source publicly licensed software system. R is free under the GPL
(Gnu Public License).
R is available for download on Windows, Macintosh OSX, and Linux computer systems. Start at http://www.R-project.org. Click on “download R” in
the “Getting Started” box, pick a mirror near you, and download and install the most
recent release of R for your operating system.
Once R is running, you will download several necessary contributed packages.
You get them from a running R session while connected to the internet. This
install.packages statement will install all the packages listed and additional
packages that these specified packages need.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

699

A R

700

A.1.1 Packages Needed for This Book—Macintosh and Linux
Start R, then enter
install.packages(c("HH","RcmdrPlugin.HH","RcmdrPlugin.mosaic",
"fortunes","ggplot2","shiny","gridExtra",
"gridBase","Rmpfr","png","XLConnect",
"matrixcalc", "sem", "relimp", "lmtest",
"markdown", "knitr", "effects", "aplpack",
"RODBC", "TeachingDemos",
"gridGraphics", "gridSVG"),
dependencies=TRUE)
##
##
##
##
##
##

This is the sufficient list (as of 16 August 2015) of packages
needed in order to install the HH package. Should
additional dependencies be declared by any of these packages
after that date, the first use of "library(HH)" after the
installation might ask for permission to install some more
packages.

The install.packages command might tell you that it can’t write in a system
directory and will then ask for permission to create a personal library. The question
might be in a message box that is behind other windows. Should the installation
seem to freeze, find the message box and respond “yes” and accept its recommend
directory. It might ask you for a CRAN mirror. Take the mirror from which you
downloaded R.

A.1.2 Packages and Other Software Needed for This
Book—Windows
A.1.2.1 RExcel
If you are running on a Windows machine and have access to Excel, then we recommend that you also install RExcel. The RExcel software provides a seamless
integration of R and Excel. See Appendix D for further information on RExcel,
including the download statements and licensing information. See Section A.1.2.2
for information about using Rcmdr with RExcel.

A.1 Installing R—Initial Installation

701

A.1.2.2 RExcel Users Need to Install Rcmdr as Administrator
Should you choose to install RExcel then you need to install Rcmdr as Administrator. Otherwise you can install Rcmdr as an ordinary user.
Start R as Administrator (on Windows 7 and 8 you need to right-click the R
icon and click the “Run as administrator” item). In R, run the following commands
(again, you must have started R as Administrator to do this)
## Tell Windows that R should have the same access to the
## outside internet that is granted to Internet Explorer.
setInternet2()
install.packages("Rcmdr",
dependencies=TRUE)
Close R with q("no"). Answer with n if it asks
Save workspace image? [y/n/c]:

A.1.2.3 Packages Needed for This Book—Windows
The remaining packages can be installed as an ordinary user. Rcmdr is one of the
dependencies of RcmdrPlugin.HH, so it will be installed by the following statement (unless it was previously installed). Start R, then enter
## Tell Windows that R should have the same access to the
## outside internet that is granted to Internet Explorer.
setInternet2()
install.packages(c("HH","RcmdrPlugin.HH","RcmdrPlugin.mosaic",
"fortunes","ggplot2","shiny","gridExtra",
"gridBase","Rmpfr","png","XLConnect",
"matrixcalc", "sem", "relimp", "lmtest",
"markdown", "knitr", "effects", "aplpack",
"RODBC", "TeachingDemos",
"gridGraphics", "gridSVG"),
dependencies=TRUE)
##
##
##
##
##
##

This is the sufficient list (as of 16 August 2015) of packages
needed in order to install the HH package. Should
additional dependencies be declared by any of these packages
after that date, the first use of "library(HH)" after the
installation might ask for permission to install some more
packages.

702

A R

A.1.2.4 Rtools
Rtools provides all the standard Unix utilities (C and Fortran compilers, commandline editing tools such as grep, awk, diff, and many others) that are not included with the Windows operating system. These utilities are needed in two
circumstances.
1. Should you decide to collect your R functions and datasets into a package, you
will need Rtools to build the package. See Appendix F for more information.
2. Should you need to use the ediff command in Emacs for visual comparison
of two different versions of a file (yesterday’s version and today’s after some
editing, for example), you will need Rtools. See Section M.1.2 for an example
of visual comparison of files.
You may download Rtools from the Windows download page at CRAN. Please
see the references from that page for more details.

A.1.2.5 Windows Potential Complications: Internet, Firewall, and Proxy
When install.packages on a Windows machine gives an Error message that
includes the phrase “unable to connect”, then you are probably working behind
a company firewall. You will need the R statement
setInternet2()
before the install.packages statement. This statement tells the firewall to give
R the same access to the outside internet that is granted to Internet Explorer.
When the install.packages gives a Warning message that says you don’t have
write access to one directory, but it will install the packages in a different directory,
that is normal and the installation is successful.
When the install.packages gives an Error message that says you don’t have
write access and doesn’t offer an alternative, then you will have to try the package
installation as Administrator. Close R, then reopen R by right-clicking the R icon,
and selecting “Run as administrator”.
If this still doesn’t allow the installation, then
1. Run the R line
sessionInfo()
2. Run the install.packages lines.
3. Highlight and pick up the entire contents of the R console and save it in a text
file. Screenshots are usually not helpful.
4. Show your text listing to an R expert.

A.1 Installing R—Initial Installation

703

A.1.3 Installation Problems—Any Operating System
The most likely source of installation problems is settings (no write access to restricted directories on your machine, or system-wide firewalls to protect against
offsite internet problems) that your computer administrator has placed on your
machine.
Check the FAQ (Frequently Asked Questions) files linked to at
http://cran.r-project.org/faqs.html.
For Windows, see also Section A.1.2.5.
If outside help is needed, then save the contents of the R console window to show
to your outside expert. In addition to the lines and results leading to the problem,
including ALL messages that R produces, you must include the line (and its results)
sessionInfo()
in the material you show the expert.
Screenshots are not a good way to capture information. The informative way
to get the contents of the console window is by highlighting the entire window
(including the off-screen part) and saving it in a text file. Show your text listing to
an R expert.

A.1.4 XLConnect: All Operating Systems
The XLConnect package lets you read MS Excel files directly from R on any operating system. You may use an R statement similar to the following
library(XLConnect)
WB <- ## pathname of file with some additional information
loadWorkbook(
## "c:/Users/rmh/MyWorkbook.xlsx" ## rmh pathname in Windows
"~/MyWorkbook.xlsx"
## rmh pathname in Macintosh
)
mydata <- readWorksheet(WB, sheet="Sheet1", region="A1:D11")
If you get an error from library(XLConnect) of the form
Error : .onLoad failed in loadNamespace() for ’rJava’
then you need to install java on your machine from http://java.com. The java
installer will ask you if you want to install ask as your default search provider. You
may deselect both checkboxes to retain your preferred search provider.

704

A R

A.2 Installing R—Updating
R is under constant development with new releases every few months. At this writing (August 2015) the current release is R-3.2.2 (2015-08-14).
See the FAQ for general update information, and in addition the Windows FAQ
or MacOs X FAQ for those operating systems. Links to all three FAQs are available
at http://cran.r-project.org/faqs.html. The FAQ files are included in the
documentation placed on your machine during installation.
The update.packages mechanism works for packages on CRAN. It does not
work for packages downloaded from elsewhere. Specifically, Windows users with
RExcel installed will need to update the RExcel packages by reinstalling them as
described in Section D.1.2.

A.3 Using R
A.3.1 Starting the R Console
In this book our primary access to the R language is from the command line. On
most computer systems clicking the R icon on the desktop will start an R session
in a console window. Other options are to begin within an Emacs window with M-x
R, to start an R-Studio session, to start R at the Unix or MS-DOS command line
prompt, or to start R through one of several other R-aware editors.
With any of these, the R offers a prompt “> ”, the user types a command, R
responds to the user’s command and then offers another prompt “> ”. A very simple
command-line interaction is shown in Table A.1.
Table A.1 Very simple command-line interaction.
> ## Simple R session
> 3 + 4
[1] 7
> pnorm(c(-1.96, -1.645, -0.6745, 0, 0.6745, 1.645, 1.96))
[1] 0.02500 0.04998 0.25000 0.50000 0.75000 0.95002 0.97500
>

A.3 Using R

705

A.3.2 Making the Functions in the HH Package Available
to the Current Session
At the R prompt, enter
library(HH)
This will load the HH package and several others. All HH functions are now available to you at the command line.

A.3.3 Access HH Datasets
All HH datasets are accessed with the R data() function. The first six observations
are displayed with the head() function, for example
data(fat)
head(fat)

A.3.4 Learning the R Language
R is a dialect of the S language. The easiest way to learn it is from the manuals that
are distributed with R in the doc/manual directory; you can find the pathname with
the R call
system.file("../../doc/manual")
or
WindowsPath(system.file("../../doc/manual"))
Open the manual directory with your computer’s tools, and then read the pdf or
html files. Start with the R-intro and the R-FAQ files. With the Windows Rgui,
you can access the manuals from the menu item Help > Manuals (in PDF). From
the Macintosh R.app, you can access the manuals from the menu item R Help.

A Note on Notation: Slashes
Inside R, on any computer system, pathnames always use only the forward
slashes “/”.
In Linux and Macintosh, operating system pathnames use only the forward
slashes “/”.

A R

706

In Windows, operating system pathnames—at the MS-DOS prompt shell CMD,
in the Windows icon Properties windows, and in the Windows Explorer filename entry bar—are written with backslashes “\”. The HH package provides a
convenience function WindowsPath to convert pathnames from the forward slash
notation to the backslash notation.

A.3.5 Duplicating All HH Examples
Script files containing R code for all examples in the book are available for you to
use to duplicate the examples (table and figures) in the book, or to use as templates
for your own analyses. You may open these files in an R-aware editor.
See the discussion in Section B.2 for more details.

A.3.5.1 Linux and Macintosh
The script files for the second edition of this book are in the directory
HHscriptnames()
The script files for the first edition of this book are in the directory
HHscriptnames(edition=1)

A.3.5.2 Windows.
The script files for the second edition of this book are in the directory
WindowsPath(HHscriptnames())
The script files for the first edition of this book are in the directory
WindowsPath(HHscriptnames(edition=1))

A.3.6 Learning the Functions in R
Help on any function is available. For example, to learn about the ancovaplot
function, type
?ancovaplot

A.3 Using R

707

To see a function in action, you can run the examples on the help page. You can
do them one at a time by manually copying the code from the example section of a
help page and pasting it into the R console. You can do them all together with the
example function, for example
example("ancovaplot")
Some functions have demonstration scripts available, for example,
demo("ancova")
The list of demos available for a specific package is available by giving the package
name
demo(package="HH")
The list of all demos available for currently loaded packages is available by
demo()
See ?demo for more information on the demo function.
The demo and example functions have optional arguments ask=FALSE and
echo=TRUE. The default value ask=TRUE means the user has to press the ENTER
key every time a new picture is ready to be drawn. The default echo=FALSE often
has the effect that only the last of a series of lattice or ggplot2 graphs will be displayed. See FAQ 7.22:
7.22 Why do lattice/trellis graphics not work?
The most likely reason is that you forgot to tell R to display the graph. lattice functions
such as xyplot() create a graph object, but do not display it (the same is true of ggplot2
graphics, and trellis graphics in S-Plus). The print() method for the graph object produces the actual display. When you use these functions interactively at the command line,
the result is automatically printed, but in source() or inside your own functions you will
need an explicit print() statement.

A.3.7 Learning the lattice Functions in R
One of the best places to learn the lattice functions is the original trellis documentation: the S-Plus Users Manual (Becker et al., 1996a) and a descriptive paper
with examples (Becker et al., 1996b). Both are available for download.

A.3.8 Graphs in an Interactive Session
We frequently find during an interactive session that we wish to back up and compare our current graph with previous graphs.
For R on the Macintosh using the quartz device, the most recent 16 figures are
available by Command-leftarrow and Command-rightarrow.

A R

708

For R on Windows using the windows device, previous graphs are available
if you turn on the graphical history of your device. This can be done with the
mouse (by clicking History > Recording on the device menu) or by entering the R
command
options(graphics.record = TRUE)
You can now navigate between graphs with the PgUp and PgDn keys.

A.4 S/R Language Style
S is a language. R is a dialect of S. Languages have standard styles in which they
are written. When a language is displayed without paying attention to the style, it
looks unattractive and may be illegible. It may also give valid statements that are
not what the author intended. Read what you turn in before turning it in.
The basic style conventions are simple. They are also self-evident after they have
been pointed out. Look at the examples in the book’s code files in the directory
HHscriptnames()
and in the R manuals.
1. Use the courier font for computer listings.
This is courier.
This is Times Roman.
Notice that displaying program output in a font other than one for which it was
designed destroys the alignment and makes the output illegible. We illustrate the
illegibility of improper font choice in Table A.2 by displaying the first few lines
of the data(tv) dataset from Chapter 4 in both correct and incorrect fonts.
Table A.2 Correct and incorrect alignment of computer listings. The columns in the correct
example are properly aligned. The concept of columns isn’t even visible in the incorrect example.
Courier (with correct alignment of columns)

Times Roman (alignment is lost)

> tv[1:5, 1:3]
life.exp ppl.per.tv ppl.per.phys
Argentina
70.5
4.0
370
Bangladesh
53.5
315.0
6166
Brazil
65.0
4.0
684
Canada
76.5
1.7
449
China
70.0
8.0
643

> tv[1:5, 1:3]
life.exp ppl.per.tv ppl.per.phys
Argentina 70.5 4.0 370
Bangladesh 53.5 315.0 6166
Brazil 65.0 4.0 684
Canada 76.5 1.7 449
China 70.0 8.0 643

A.4 S/R Language Style

709

2. Use sensible spacing to distinguish the words and symbols visually. This convention allows people to read the program.

bad:

abc<-def

good:

abc <- def

no space surrounding the <-

3. Use sensible indentation to display the structure of long statements. Additional
arguments on continuation lines are most easily parsed by people when they
are aligned with the parentheses that define their depth in the set of nested
parentheses.

bad:

names(tv) <- c("life.exp","ppl.per.tv","ppl.per.phys",
"fem.life.exp","male.life.exp")

good: names(tv) <- c("life.exp",
"ppl.per.tv",
"ppl.per.phys",
"fem.life.exp",
"male.life.exp")

Use Emacs (or other R-aware editor) to help with indentation. For example,
open up a new file tmp.r in Emacs (or another editor) and type the above
bad example—in two lines with the indentation exactly as displayed. Emacs in
ESS[S] mode and other R-aware editors will automatically indent it correctly.
4. Use a page width in the Commands window that your word processor and printer
supports. We recommend
options(width=80)
if you work with the natural width of 8.5in×11in paper (“letter” paper in the US.
The rest of the world uses “A4” paper at 210cm×297cm) with 10-pt type. If you
use a word processor that insists on folding lines at some shorter width (72 characters is a common—and inappropriate—default folding width), you must either
take control of your word processor, or tell R to use a shorter width. Table A.3
shows a fragment from an anova output with two different width settings for the
word processor.

A R

710

Table A.3 Legible and illegible printings of the same table. The illegible table was inappropriately
folded by an out-of-control word processor. You, the user, must take control of folding widths.

Legible:
> anova(fat2.lm)
Analysis of Variance Table
Response: bodyfat
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value
Pr(F)
abdomin 1 2440.500 2440.500 101.1718 0.00000000
biceps 1
209.317 209.317
8.6773 0.00513392
Residuals 44 1061.382
24.122

Illegible (folded at 31 characters):
> anova(fat2.lm)
Analysis of Variance Table
Response: bodyfat
Terms added sequentially (first
to last)
Df Sum of Sq Mean Sq
F Value
Pr(F)
abdomin 1 2440.500 2440.500
101.1718 0.00000000
biceps 1
209.317 209.317
8.6773 0.00513392
Residuals 44 1061.382
24.122

5. Reserved names. R has functions with the single-letter names c, s, and t. R
also has many functions whose names are commonly used statistical terms, for
example: mean, median, resid, fitted, data. If you inadvertently name an
object with one of the names used by the system, your object might mask the
system object and strange errors would ensue. Do not use system names for your
variables. You can check with the statement
conflicts(detail=TRUE)

A.5 Getting Help While Learning and Using R

711

A.5 Getting Help While Learning and Using R
Although this section is written in terms of the R email list, its recommendations apply to all situations, in particular, to getting help from your instructor while reading
this book and learning R.
R has an email help list. The archives are available and can be searched. Queries
sent to the help list will be forwarded to several thousand people world-wide and
will be archived. For basic information on the list, read the note that is appended to
the bottom of EVERY email on the R-help list, and follow its links:
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more,
see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and
provide commented, minimal, self-contained, reproducible code.
R-help is a plain text email list. Posting in HTML mangles your code, making it
hard to read. Please send your question in plain text and make the code reproducible.
There are two helpful sites on reproducible code
http://adv-r.had.co.nz/Reproducibility.html
http://stackoverflow.com/questions/5963269/
how-to-make-a-great-r-reproducible-example
When outside help is needed, save the contents of the R console window to show
to your outside expert. In addition to the lines and results leading to the problem,
including ALL messages that R produces, you must include the line (and its results)
sessionInfo()
in the material you show the expert.
Screenshots are not a good way to capture information. The informative way
to get the contents of the console window is by highlighting the text of the entire
window (including the off-screen part) and saving it in a text file. Show your text
listing to an R expert.

A R

712

A.6 R Inexplicable Error Messages—Some Debugging Hints
In general, weird and inexplicable errors mean that there are masked function
names. That’s the easy part. The trick is to find which name. The name conflict
is frequently inside a function that has been called by the function that you called
directly. The general method, which we usually won’t need, is to trace the action
of the function you called, and all the functions it called in turn. See ?trace,
?recover, ?browser, and ?debugger for help on using these functions.
One method we will use is to find all occurrences of our names that might
mask system functions. R provides two functions that help us. See ?find and
?conflicts for further detail.
find: Returns a vector of names, or positions of databases and/or frames that
contain an object.
• This example is problem because the user has used a standard function name
“data” for a different purpose
> args(data)
function (..., list = character(), package = NULL,
lib.loc = NULL, verbose = getOption("verbose"),
envir = .GlobalEnv)
NULL
> data <- data.frame(a=1:3, b=4:6)
> data
a b
1 1 4
2 2 5
3 3 6
> args(data)
NULL
> find("data")
[1] ".GlobalEnv"
"package:utils"
> rm(data)
> args(data)
function (..., list = character(), package = NULL,
lib.loc = NULL, verbose = getOption("verbose"),
envir = .GlobalEnv)
NULL
>
conflicts: This function checks a specified portion of the search list for items
that appear more than once.

A.6 R Inexplicable Error Messages—Some Debugging Hints

713

• The only items we need to worry about are the ones that appear in our working
directory.
> data <- data.frame(a=1:3, b=4:6)
> conflicts(detail=TRUE)
$.GlobalEnv
[1] "data"
$‘package:utils‘
[1] "data"
$‘package:methods‘
[1] "body<-"
"kronecker"
$‘package:base‘
[1] "body<-"
"kronecker"
> rm(data)
Once we have found those names we must assign their value to some other variable name and then remove them from the working directory. In the above example,
we have used the system name "data" for one of our variable names. We must
assign the value to a name without conflict, and then remove the conflicting name.
> data <- data.frame(a=1:3, b=4:6)
> find("data")
[1] ".GlobalEnv"
"package:utils"
> CountingData <- data
> rm(data)
> find("data")
[1] "package:utils"
>

Appendix B

HH

Every graph and table in this book is an example of the types of graphs and analytic
tables that readers can produce for their own data using functions in either base R or
the HH package (Heiberger, 2015). Please see Section A.1 for details on installing
HH and additional packages.
When you see a graph or table you need, open the script file for that chapter and
use the code there on your data. For example, the MMC plot (Mean–mean Multiple
Comparisons plot) is described in Chapter 7, and the first MMC plot in that Chapter
is Figure 7.3. Therefore you can enter at the R prompt:
HHscriptnames(7)
and discover the pathname for the script file. Open that file in your favorite R-aware
editor. Start at the top and enter chunks (that is what a set of code lines is called in
these files which were created directly from the manuscript using the R function
Stangle) from the top until the figure you are looking for appears. That gives the
sequence of code you will need to apply to your own data and model.

B.1 Contents of the HH Package
The HH package contains several types of items.
1. Functions for the types of graphs illustrated in the book. Most of the graphical
functions in HH are built on the trellis objects in the lattice package.
2. R scripts for all figures and tables in the book.
3. R data objects for all datasets used in the book that are not part of base R.
4. Additional R functions, some analysis and some utility, that I like to use.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

715

716

B HH

B.2 R Scripts for all Figures and Tables in the Book
Files containing R scripts for all figures and tables in the book, both Second and
First Editions, are included with the HH package. The details of pathnames to the
script files differ by computer operating systems, and often by individual computer.
To duplicate the figures and tables in the book, open the appropriate script file
in an R-aware editor. Highlight and send over to the R console one chunk at a
time. Each script file is consistent within itself. Code chunks later in a script will
frequently depend on the earlier chunks within the same script having already been
executed.
Second Edition: The R function HHscriptnames displays the full pathnames
of Second Edition script files for your computer. For example, HHscriptnames(7)
displays the full pathname for Chapter 7. The pathname for Chapter 7 relative to the
HH package is HH/scripts/hh2/mcomp.r. Valid values for the chapternumbers
argument for the Second Edition are c(1:18, LETTERS[1:15]).
First Edition: First Edition script file pathnames are similar, for example, the
relative path for Chapter 7 is HH/scripts/hh1/Ch07-mcomp.r. The full pathnames of First Edition files for your computer are displayed by the R statement HHscriptnames(7, edition=1). Valid values for the chapternumbers
argument for the First Edition are c(1:18).
The next few subsections show sample full pathnames of Second Edition script
filenames for several different operating systems.

B.2.1 Macintosh
On Macintosh the full pathname will appear as something like
> HHscriptnames(7)
7 "/Library/Frameworks/R.framework/Versions/3.2/Resources/
library/HH/scripts/hh2/mcomp.R"

B.2.2 Linux
On Linux, the full pathname will appear something like
> HHscriptnames(7)
7 "/home/rmh/R/x86_64-unknown-linux-gnu-library/3.2/HH/
scripts/hh2/mcomp.R"

B.4 HH and S+

717

B.2.3 Windows
On Windows the full pathname will appear as something like
> HHscriptnames(7)
7 "C:/Users/rmh/Documents/R/win-library/3.2/HH/scripts/
hh2/mcomp.R"
You might prefer it to appear with Windows-style path separators (with the escaped
backslash that looks like a double backslash)
> WindowsPath(HHscriptnames(7), display=FALSE)
7 "C:\\Users\\rmh\\Documents\\R\\win-library\\3.2\\HH\
\scripts\\hh2\\mcomp.R"
or unquoted and with the single backslash
> WindowsPath(HHscriptnames(7))
7 C:\Users\rmh\Documents\R\win-library\3.2\HH\scripts\
hh2\mcomp.R
Some of these variants will work with Windows Explorer (depending on which
version of Windows) or your favorite editor, and some won’t.

B.3 Functions in the HH Package
There are many functions in the HH package, and in the rest of R, that you will
need to learn about. The easiest way is to use the documentation that is included
with R. For example, to learn about the linear regression function lm (for “Linear
Model”), just ask R with the simple “?” command:
?lm
and a window will open with the description. Try it now.

B.4 HH and S+
Package version HH 2.1-29 of 2009-05-27 (Heiberger, 2009) is still available at
CSAN. This version of the package is appropriate for the First Edition of the
book. It includes very little of the material developed after the publication of the
First Edition. Once I started using the features of R’s latticeExtra package it no
longer made sense to continue compatibility with S-Plus. HH 2.1-29 doesn’t have
data(); instead it has a datasets subdirectory.

Appendix C

Rcmdr: R Commander

The R Commander, released as the package Rcmdr (Fox, 2005; John Fox et al.,
2015), is a platform-independent basic-statistics GUI (graphical user interface) for
R, based on the tcltk package (part of R).
We illustrate how to use it by reconstructing the two panels of Figure 9.5 directly
from the Rcmdr menu. We load Rcmdr indirectly by explicitly loading our package
RcmdrPlugin.HH (Heiberger and with contributions from Burt Holland, 2015). We
then use the menu to bring in the hardness dataset, compute the quadratic model,
display the squared residuals for the quadratic model (duplicating the left panel of
Figure 9.5), display the squared residuals for the linear model (duplicating the right
panel of Figure 9.5). The linear model was fit implicitly and its summary is not
automatically printed. We show the summary of the quadratic model.
Figures C.1–C.14 illustrate all the steps summarized above.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

719

720

C Rcmdr: R Commander

Fig. C.1 From the *R* buffer (or the R Console) enter library(RcmdrPlugin.HH). This
also loads HH and Rcmdr and several other packages. It also open the Rcmdr window shown in
Figure C.2.

C Rcmdr: R Commander

721

Fig. C.2 The Rcmdr window as it appears when first opened. It shows a menu bar, a tool bar, and
three subwindows.

722

C Rcmdr: R Commander

Fig. C.3 We bring in a dataset by clicking on the Data menu sequence to open Figure C.4.

Fig. C.4 In this menu box we click on the HH package and within it, on the hardness data.

C Rcmdr: R Commander

723

Fig. C.5 The Data set: item in the tool bar now shows hardness as the active dataset. The R
Script subwindow show the R command that was generated by the menu sequence. The Output
subwindow shows the transcript of the R session where the command was executed. The Messages subwindow give information on the dataset that was opened.

724

C Rcmdr: R Commander

Fig. C.6 From the Statistics menu we open the Linear model menu box in Figure C.7.

Fig. C.7 Specify the linear model. The user can enter the model by typing or by clicking on the
menu items.

C Rcmdr: R Commander

725

Fig. C.8 The Linear model menu item wrote the R commands shown in the R Script subwindow and executed them in the Output window. The model is stored in the R "lm" object
named LinearModel.1. The Model: item in the tool bar now shows LinearModel.1 as the
active model.

726

C Rcmdr: R Commander

Fig. C.9 Use the Graphs menu to get to the Squared Residuals. . . (HH) menu box in
Figure C.10.

C Rcmdr: R Commander

727

Fig. C.10 Specify the x-variable, the y-variable, and the active model (LinearModel.1) to get
Figure C.11.

728

C Rcmdr: R Commander

Fig. C.11 This is the squared residuals for the quadratic model (duplicating the right panel of
Figure 9.5).

C Rcmdr: R Commander

729

Fig. C.12 We repeated Figure C.9 to get the menu box again. This time we took the default linear
model to get Figure C.13.

730

C Rcmdr: R Commander

Fig. C.13 This is the squared residuals for the linear model (duplicating the left panel of
Figure 9.5).

C Rcmdr: R Commander

731

Fig. C.14 The Rcmdr window now shows the summary for the quadratic regression and the
specifications for the two graphs in Figures C.11 and C.13.

Appendix D

RExcel: Embedding R inside Excel
on Windows

If you are running on a Windows machine and have access to MS Excel, then we
recommend that you install RExcel (Baier and Neuwirth, 2007; Neuwirth, 2014).
RExcel is free of charge for “single user non-commercial use” with 32-bit Excel.
Any other use will require a license. Please see the license at rcom.univie.ac.at
for details.
RExcel seamlessly integrates the entire set of R’s statistical and graphical methods into Excel, allowing students to focus on statistical methods and concepts and
minimizing the distraction of learning a new programming language. Data can be
transferred between R and Excel “the Excel way” by selecting worksheet ranges
and using Excel menus. RExcel has embedded the Rcmdr menu into the Excel
ribbon. Thus R’s basic statistical functions and selected advanced methods are available from an Excel menu. Almost all R functions can be used as worksheet functions
in Excel. Results of the computations and statistical graphics can be returned back
into Excel worksheet ranges. RExcel allows the use of Excel scroll bars and check
boxes to create and animate R graphics as an interactive analysis tool.
See Heiberger and Neuwirth (2009) for the book R through Excel: A Spreadsheet
Interface for Statistics, Data Analysis, and Graphics. This book is designed as a
computational supplement for any Statistics course.
RExcel works with Excel only on Windows. Excel on the Macintosh uses a
completely different interprocess communications system.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

733

734

D RExcel: Embedding R inside Excel on Windows

D.1 Installing RExcel for Windows
You must have MS Excel (2007, 2010, or 2013) installed on your Windows
machine. You need to purchase Excel separately. Excel 2013 is the current version (in early 2015). RExcel is free of charge for “single user non-commercial use”
with 32-bit Excel. Any other use will require a license.

D.1.1 Install R
Begin by installing R and the necessary packages as described in Section A.1.

D.1.2 Install Two R Packages Needed by RExcel
You will also need two more R packages that must be installed as computer
Administrator.
Start R as Administrator (on Windows 7 and 8 you need to right-click the R
icon and click the “Run as administrator” item). In R, run the following commands
(again, you must have started R as Administrator to do this)
## Tell Windows that R should have the same access to the outside
## internet that is granted to Internet Explorer.
setInternet2()
install.packages(c("rscproxy","rcom"),
repos="http://rcom.univie.ac.at/download",
type="binary",
lib=.Library)
library(rcom)
comRegisterRegistry()
Close R with q("no"). If it asks
Save workspace image? [y/n/c]:
answer with n.

D.1 Installing RExcel for Windows

735

D.1.3 Install RExcel and Related Software
Go to http://rcom.univie.ac.at and click on the Download tab. Download
and execute the following four files (or newer releases if available)
• statconnDCOM3.6-0B2_Noncommercial
• RExcel 3.2.15
• RthroughExcelWorkbooksInstaller_1.2-10.exe
• SWord 1.0-1B1 Noncommercial SWord (Baier, 2014) is an add-in package
for MSword that makes it possible to embed R code in a MSword document.
The R code will be automatically executed and the output from the R code will
be included within the MSword document. SWord is free for non-commercial
use. Any other use will require a license. SWord is a separate program from
RExcel and is not required for RExcel.
These installer .exe files will ask for administrator approval, as they write in
the Program Files directory and write to the Windows registry as part of the
installation. Once they are installed, they run in normal user mode.

D.1.4 Install Rcmdr to Work with RExcel
In order for RExcel to place the Rcmdr menu on the Excel ribbon, it is necessary
that Rcmdr be installed in the C:/Program Files/R/R-x.y.z/library directory and not in the C:/Users/rmh/*/R/win-library/x.y directory. If Rcmdr is
installed in the user directory, it must be removed before reinstalling it as Administrator. Remove it with the remove.packages function using
remove.packages(c("Rcmdr", "RcmdrMisc"))
Then see the installation details in Section A.1.2.2.

D.1.5 Additional Information on Installing RExcel
Additional RExcel installation information is available in the Wiki page at
http://rcom.univie.ac.at

736

D RExcel: Embedding R inside Excel on Windows

D.2 Using RExcel
D.2.1 Automatic Recalculation of an R Function
RExcel places R inside the Excel automatic recalculation model. Figure D.1 by
Heiberger and Neuwirth was originally presented in Robbins et al. (2009) using
Excel 2007. We reproduce it here with Excel 2013.

Fig. D.1 Any R function can be used in Excel with the RExcel worksheet function RApply. The
formula =RApply("pchisq", B1, B2, B3) computes the value of the noncentral distribution
function for the quantile-value, the degrees of freedom, and the noncentrality parameter in cells
B1, B2, and B3, respectively, and returns its value into cell B4. When the value of one of the
arguments, in this example the noncentrality parameter in cell B3, is changed, the value of the
cumulative distribution is automatically updated by Excel to the appropriate new value.

D.2 Using RExcel

737

Fig. D.2 Retrieve the StudentData into Excel. From the RExcel Add-In tab, click
RthroughExcel Worksheets. This brings up the BookFilesTOC worksheet in Figure D.3.

D.2.2 Transferring Data To/From R and Excel
Datasets can be transferred in either direction to/from Excel from/to R. In
Figures D.2–D.4 we bring in a dataset from an Excel worksheet, transfer it to R,
and make it the active dataset for use with Rcmdr.
The StudentData was collected by Erich Neuwirth for over ten years from
students in his classes at the University of Vienna. The StudentData dataset is
included with RExcel.

738

D RExcel: Embedding R inside Excel on Windows

Fig. D.3 Click the StudentData button to bring up the StudentData worksheet in Figure D.4.

D.2 Using RExcel

739

Fig. D.4 Highlight the entire region containing data A1:Q1126. Right-click for the menu and
select Put R DataFrame. Click OK on the “Put dataframe in R” Dialog box. This places the
dataset name StudentData into the Rcmdr’s active Dataset box in Figure D.5.

740

D RExcel: Embedding R inside Excel on Windows

D.2.3 Control of a lattice Plot from an Excel/Rcmdr Menu
The example in Figures D.5–D.8 is originally from Heiberger and Neuwirth (2009)
using Excel 2007. We reproduce it here with Excel 2013. We made it the active
dataset for Rcmdr in Figures D.2–D.4.

Fig. D.5 RExcel has placed the Rcmdr menu onto the Excel ribbon. Click the Graphs tab to
get the menu and then click XY conditioning plot. . . (HH) to get the Dialog box in Figure D.6.
The (HH) in the menu item means the function was added to the Rcmdr menu by our RcmdrPlugin.HH package (Heiberger and with contributions from Burt Holland, 2015).

D.2 Using RExcel

741

Fig. D.6 The user fills in the Dialog box to specify the graph in Figure D.7 and generate the R
commands displayed in Figure D.8.

742

D RExcel: Embedding R inside Excel on Windows

Fig. D.7 The graph is displayed. Shoesize is the student’s shoe size in Paris points (2/3 cm). Size
is the student’s height in cm. SizeFather and SizeMother are the heights of the student’s parents.
Fathers of both male and female students have the same height distribution as the male students.
Mothers of both male and female students have the same height distribution as the female students.

Fig. D.8 The generated R code is displayed.

Appendix E

Shiny: Web-Based Access to R Functions

Shiny (Chang et al., 2015; RStudio, 2015) is an R package that provides an R
language interface for writing interactive web applications. Apps built with shiny
place the power of R behind an interactive webpage that can be used by nonprogrammers. A full tutorial and gallery are available at the Shiny web site.
We have animated several of the graphs in the HH package using shiny.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

743

744

E Shiny: Web-Based Access to R Functions

E.1 NTplot
The NTplot function shows significance levels and power for the normal or tdistributions. Figure E.1, an interactive version of the top panel of the middle section
of Figure 3.20, is specified with
NTplot(mean0=8, mean1=8.411, sd=2, n=64, cex.prob=1.3,
shiny=TRUE)

Fig. E.1 This is an interactive version of Figure 3.20 constructed with the shiny package. Adjusting the n slider at the bottom right can produce all three columns of Figure 3.20. Clicking the 
button below the slider will dynamically move through all values of n from 1 through 150, including the three that are displayed in Figure 3.20. There are additional controls on the Normal and t,
Display Options, and Fonts tabs that will show the power and beta panels.

E.2 bivariateNormal

745

E.2 bivariateNormal
Figure E.2 is an interactive version of Figure 3.9 showing the bivariate normal density in 3D space with various correlations and various viewpoints.
shiny::runApp(system.file("shiny/bivariateNormal",
package="HH"))

Fig. E.2 This is an interactive version of Figure 3.9 constructed with the shiny package. Adjusting
the rho slider changes the correlation between x and y. Adjusting the angle in degrees slider
rotates the figure through all the viewpoint angles shown in Figure 3.10. Both sliders can be made
dynamic by clicking their  buttons.

746

E Shiny: Web-Based Access to R Functions

E.3 bivariateNormalScatterplot
Figure E.3 is a dynamic version of Figure 3.8 specified with
shiny::runApp(system.file("shiny/bivariateNormalScatterplot",
package="HH"))

Fig. E.3 This is an interactive version of Figure 3.8 constructed with the shiny package. Adjusting
the rho slider changes the correlation between x and y. By clicking the  button, the figure will
transition through the panels of Figure 3.8.

E.4 PopulationPyramid

747

E.4 PopulationPyramid
Figure E.4 is an interactive version of Figure 15.19 showing the population pyramid
for the United States annually for the years 1900–1979 specified with
shiny::runApp(system.file("shiny/PopulationPyramid",
package="HH"))

Fig. E.4 This is an interactive version of Figure 15.19 constructed with the shiny package. Adjusting the Year slider changes the year in the range 1900–1970. By clicking the  button, the
figure will dynamically transition through the panels of Figure 15.19.

Appendix F

R Packages

The R program as supplied by R-Core on the CRAN (Comprehensive R Archive
Network) page (CRAN, 2015) consists of the base program and about 30 required
and recommended packages. Everything else is a contributed package. There are
about 6500 contributed packages (April 2015). Our HH is a contributed package.

F.1 What Is a Package?
R packages are extensions to R.
Each package is a collection of functions and datasets designed to work together
for a specific purpose. The HH package is designed to provide computing support
for the techniques discussed in this book . The lattice package (a recommended
package) is designed to provide xyplot and related graphics functions. Most of the
graphics functions in HH are built on the functions in lattice.
Packages consist at a minimum of functions written in R, datasets that can be
brought into the R working directory, and documentation for all functions and
datasets. Some packages include subroutines written in Fortran or C.

F.2 Installing and Loading R Packages
The base and recommended packages are installed on your computer when you
download and install R itself. All other packages must be explicitly installed (downloaded from CRAN and placed into an R-determined location on your computer).

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

749

F R Packages

750

The packages available at CRAN are most easily installed into your computer
with a command of the form
install.packages("MyPackage")
The R GUIs usually have a menu item that constructs this statement for you.
The functions in an installed package are not automatically available when you
start an R session. It is necessary to load them into the current session, usually with
the library function. Most examples in this book require you to enter
library(HH)
(once per R session) before doing anything else. The HH package loads lattice and
several additional packages.
The list of R packages installed on your computer is seen with the R command
library()
The list of R packages loaded into your current R session is seen with the R
command
search()

F.3 Where Are the Packages on Your Computer?
Once a package has been installed on your computer it is kept in a directory (called a
“library” in R terminology). The base and recommended packages are stored under
R itself in directory
system.file("..")
Files inside the installed packages are in an internal format and cannot be read by
an editor directly from the file system.
Contributed packages will usually be installed in a directory in your user space.
In Windows that might be something like
C:/Users/yourloginname/Documents/R/win-library/3.2
or
C:/Users/yourloginname/AppData/Roaming/R/win-library/3.2
On Macintosh it will be something like
/Users/yourloginname/Library/R/3.2/library
You can find out where the installed packages are stored by loading one and then
entering
searchpaths()
(searchpaths() is similar to search() but with the full pathname included in the
output, not just the package name). For example,
library(HH)
searchpaths()

F.5 Writing and Building Your Own Package

751

F.4 Structure of an R Package
The package developer writes individual source files. The R build system (see the
Writing R Extensions manual) has procedures for checking coherency and then for
building the source package and the binaries.
The packages at CRAN are available in three formats. The source packages
(what the package designer wrote and what you should read when you want to read
the code) are stored as packagename.tar.gz files. The binary packages for Windows are stored as packagename.zip files. The binary packages for Macintosh
are stored as packagename.tgz files.

F.5 Writing and Building Your Own Package
At some point in your analysis project you will have accumulated several of your
own functions that you quite frequently use. At that point it will be time to collect
them into a package.
We do not say much here about designing and writing functions in this book.
Begin with An Introduction to R in file
system.file("../../doc/manual/R-intro.pdf")
It includes a chapter “Writing your own functions”.
Nor do we say much here about building a package. The official reference is the
Writing R Extensions manual that also comes with R in file
system.file("../../doc/manual/R-exts.pdf")
When you are ready, begin by looking at the help file
?package.skeleton
to see how the pieces fit together.
It will help to have someone work with you the first time you build a package.
You build the package with the operating system command
R CMD check YourPackageName
and then install it on your own machine with the command
R CMD INSTALL --build YourPackageName
The checking process is very thorough, and gets more thorough at every release
of R. Understanding how to respond to the messages from the check is the specific
place where it will help to have someone already familiar with package building.

752

F R Packages

F.6 Building Your Own Package with Windows
The MS Windows operating system does not include many programs that are central to building R packages. You will need to download and install the most recent
Rtools from CRAN. See Section A.1.2.4 for download information.
You will need to include Rtools in your PATH environment variable to enable the
R CMD check packagename command to work. See “Appendix D The Windows
toolset” in R Installation and Administration manual at
system.file("../../doc/manual/R-admin.pdf")

Appendix G

Computational Precision and Floating-Point
Arithmetic

Computers use floating point arithmetic. The floating point system is not identical to
the real-number system that we (teachers and students) know well, having studied
it from kindergarten onward. In this section we show several examples to illustrate
and emphasize the distinction.
The principal characteristic of real numbers is that we can have as many digits
as we wish. The principal characteristic of floating point numbers is that we are
limited in the number of digits that we can work with. In double-precision IEEE
754 arithmetic, we are limited to exactly 53 binary digits (approximately 16 decimal
digits)
The consequences of the use of floating point numbers are pervasive, and present
even with numbers we normally think of as far from the boundaries. For detailed
information please see FAQ 7.31 in file
system.file("../../doc/FAQ")
The help menus in Rgui in Windows and R.app on Macintosh have direct links to
the FAQ file.

G.1 Examples
Let us start by looking at two simple examples that require basic familiarity with
floating point arithmetic.
1. Why is .9 not recognized to be the same as (.3 + .6)?
Table G.1 shows that .9 is not perceived to have the same value as .3 + .6, when
calculated in floating point (that is, when calculated by a computer). The difference between the two values is not 0, but is instead a number on the order of

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

753

754

G Computational Precision and Floating-Point Arithmetic

machine epsilon (the smallest number  such that 1 +  > 1). In R, the standard
mathematical comparison operators recognize the difference. There is a function
all.equal which tests for near equality. See ?all.equal for details.
Table G.1 Calculations showing that the floating point numbers .9 and .3 + .6 are not stored the
same inside the computer. R comparison operators recognize the numbers as different.
> c(.9, (.3 + .6))
[1] 0.9 0.9
> .9 == (.3 + .6)
[1] FALSE
> .9 - (.3 + .6)
[1] 1.11e-16
> identical(.9, (.3 + .6))
[1] FALSE
> all.equal(.9, (.3 + .6))
[1] TRUE

Table G.2

 √ 2
2  2 in floating point arithmetic inside the computer.

> c(2, sqrt(2)^2)
[1] 2 2
> sqrt(2)^2
[1] 2
> 2 == sqrt(2)^2
[1] FALSE
> 2 - sqrt(2)^2
[1] -4.441e-16
> identical(2, sqrt(2)^2)
[1] FALSE
> all.equal(2, sqrt(2)^2)
[1] TRUE

G.2 Floating Point Numbers in the IEEE 754 Floating-Point Standard

2. Why is

755

 √ 2
2 not recognized to be the same as 2?

 √ 2
Table G.2 shows that the difference between the two values
2 and 2 is not
0, but is instead a number on the order of machine epsilon (the smallest number
 such that 1 +  > 1).

We will pursue these examples further in Section G.7, but first we need to introduce floating point numbers—the number system used inside the computer.

G.2 Floating Point Numbers in the IEEE 754 Floating-Point
Standard
The number system we are most familiar with is the infinite-precision base-10
system. Any number can be represented as the infinite sum
± (a0 × 100 + a1 × 10−1 + a2 × 10−2 + . . .) × 10 p
where p can be any positive integer, and the values ai are digits selected from
decimal digits {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. For example, the decimal number 3.3125
is expressed as
3.3125 = (3 × 100 + 3 × 10−1 + 1 × 10−2 + 2 × 10−3 + 5 × 10−4 ) × 100
= 3.3125 × 1
In this example, there are 4 decimal digits after the radix point. There is no limit to
the number of digits that could have been specified. For decimal numbers the term
decimal point is usually used in preference to the more general term radix point.
Floating point arithmetic in computers uses a finite-precision base-2 (binary) system for representation of numbers. Most computers today use the 53-bit IEEE 754
system, with numbers represented by the finite sum
± (a0 × 20 + a1 × 2−1 + a2 × 2−2 + . . . + a52 × 2− 52) × 2 p
where p is an integer in the range −1022 to 1023 (expressed as decimal numbers),
the values ai are digits selected from {0, 1}, and the subscripts and powers i are decimal numbers selected from {0, 1, . . . , 52}. The decimal number 3.12510 is 11.01012
in binary.
3.312510 = 11.01012 = (1 × 20 + 1 × 2−1 + 0 × 2−2 + 1 × 2−3 + 0 × 2−4 +
1 × 2−5 ) × 210
= 1.101012 × 210

756

G Computational Precision and Floating-Point Arithmetic

This example (in the normalized form 1.101012 × 210 ) has five binary digits (bits)
after the radix point (binary point in this case). There is a maximum of 52 binary
positions after the binary point.
Strings of 0 and 1 are hard for people to read. They are usually collected into
units of 4 bits (called a byte).
The IEEE 754 standard requires the base β = 2 number system with p = 53 base2 digits. Except for 0, the numbers in internal representation are always normalized
with the leading bit always 1. Since it is always 1, there is no need to store it and
only 52 bits are actually needed for 53-bit precision. A string of 0 and 1 is difficult
for humans to read. Therefore every set of 4 bits is represented as a single hexadecimal digit, from the set {0 1 2 3 4 5 6 7 8 9 a b c d e f}, representing the
decimal values {0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15}. The 52 stored bits can be
displayed with 13 hex digits. Since the base is β = 2, the exponent of an IEEE 754
floating point number must be a power of 2. The double-precision computer numbers contain 64 bits, allocated 52 for the significant, 1 for the sign, and 11 for the
exponent. The 11 bits for the exponent can express 211 = 2048 unique values. These
are assigned to range from −2−1022 to 21023 , with the remaining 2 exponent values
used for special cases (zero and the special quantities NaN and ∞).
The number 3.312510 is represented in hexadecimal (base-16) notation as
3.312510 = 3.5016 = (1 × 160 + a16 × 16−1 + 816 × 16−2 ) × 210
= 1.a816 × 210
= 1.101010002 × 210
There are two hex digits after the radix point (binary point, not hex point because
the normalization is by powers of 210 not powers of 1610 ).
The R function sprintf is used to specify the printed format of numbers. The
letter a in format sprintf("%+13.13a", x) tells R to print the numbers in hexadecimal notation. The “13”s say to use 13 hexadecimal digits after the binary point.
See ?sprintf for more detail on the formatting specifications used by the sprintf
function. Several numbers, simple decimals, and simple multiples of powers of 1/2
are shown in Table G.3 in both decimal and binary notation.

G.3 Multiple Precision Floating Point
The R package Rmpfr allows the construction and use of arbitrary precision
floating point numbers. It was designed, and is usually used, for higher-precision
arithmetic—situations where the 53-bit double-precision numbers are not precise

G.4 Binary Format

757

Table G.3 Numbers, some simple integers divided by 10, and some fractions constructed as multiples of powers of 1/2. The i/10 decimal numbers are stored as repeating binaries in the hexadecimal
notation until they run out of digits. There are only 52 bits (binary digits) after the binary point.
For decimal input 0.1 we see that the repeating hex digit is “9” until the last position where it is
rounded up to “a”.

> nums <- c(0, .0625, .1, .3, .3125, .5, .6, (.3 + .6), .9,

1)

> data.frame("decimal-2"=nums,
+
"decimal-17"=format(nums, digits=17),
+
hexadecimal=sprintf("%+13.13a", nums))
decimal.2
decimal.17
hexadecimal
1
0.0000 0.00000000000000000 +0x0.0000000000000p+0
2
0.0625 0.06250000000000000 +0x1.0000000000000p-4
3
0.1000 0.10000000000000001 +0x1.999999999999ap-4
4
0.3000 0.29999999999999999 +0x1.3333333333333p-2
5
0.3125 0.31250000000000000 +0x1.4000000000000p-2
6
0.5000 0.50000000000000000 +0x1.0000000000000p-1
7
0.6000 0.59999999999999998 +0x1.3333333333333p-1
8
0.9000 0.89999999999999991 +0x1.cccccccccccccp-1
9
0.9000 0.90000000000000002 +0x1.ccccccccccccdp-1
10
1.0000 1.00000000000000000 +0x1.0000000000000p+0

enough. In this Appendix we use it for lower-precision arithmetic—four or five significant digits. In this way it will be much easier to illustrate how the behavior of
floating point numbers differs from the behavior of real numbers.

G.4 Binary Format
It is often easier to see the details of the numerical behavior when numbers are displayed in binary, not in the hex format of sprintf("%+13.13a", x). The Rmpfr
package includes a binary display format for numbers. The formatBin function
uses sprintf to construct a hex display format and then modifies it by replacing
each hex character with its 4-bit expansion as shown in Table G.4.
Optionally (with argument scientific=FALSE), all binary numbers can be formatted to show aligned radix points. There is also a formatHex function which is
essentially a wrapper for sprintf. Both functions are used in the examples in this
Appendix. Table G.5 illustrates both functions, including the optional scientific
argument, with a 4-bit arithmetic example.

758

G Computational Precision and Floating-Point Arithmetic

Table G.4 Four-bit expansions for the sixteen hex digits. We show both lowercase [a:f] and uppercase [A:F] for the hex digits.

> Rmpfr:::HextoBin
1
2
3
4
5
6
7
"0000" "0001" "0010" "0011" "0100" "0101" "0110" "0111"
8
9
A
B
C
D
E
F
"1000" "1001" "1010" "1011" "1100" "1101" "1110" "1111"
a
b
c
d
e
f
"1010" "1011" "1100" "1101" "1110" "1111"

G.5 Round to Even
The IEEE 754 standard calls for “rounding ties to even”. The explanation here is
from help(mpfr, package="Rmpfr"):
The round to nearest ("N") mode, the default here, works as in the IEEE 754 standard: in
case the number to be rounded lies exactly in the middle of two representable numbers, it
is rounded to the one with the least significant bit set to zero. For example, the number 5/2,
which is represented by (10.1) in binary, is rounded to (10.0)=2 with a precision of two bits,
and not to (11.0)=3. This rule avoids the drift phenomenon mentioned by Knuth in volume
2 of The Art of Computer Programming (Section 4.2.2).

G.6 Base-10, 2-Digit Arithmetic
Hex numbers are hard to fathom the first time they are seen. We therefore look at a
simple example of finite-precision arithmetic with 2 significant decimal digits.
Calculate the sum of squares of three numbers in 2-digit base-10 arithmetic. For
concreteness, use the example
22 + 112 + 152
This requires rounding to 2 significant digits at every intermediate step. The steps
are easy. Putting your head around the steps is hard.
We rewrite the expression as a fully parenthesized algebraic expression, so we
don’t need to worry about precedence of operators at this step.
((22 ) + (112 )) + (152 )
Now we can evaluate the parenthesized groups from the inside out.

G.6 Base-10, 2-Digit Arithmetic

759

Table G.5 Integers from 0 to 39 stored as 4-bit mpfr numbers. The numbers from 17 to 39 are
rounded to four significant bits. All numbers in the “16” and “24” columns are multiples of 2, and
all numbers in the “32” columns are multiples of 4. The numbers are displayed in decimal, hex,
binary, and binary with aligned radix points. To interpret the aligned binary numbers, replace the
“_” placeholder character with a zero.
> library(Rmpfr)
> FourBits <- mpfr(matrix(0:39, 8, 5), precBits=4)
> dimnames(FourBits) <- list(0:7, c(0,8,16,24,32))
> FourBits
’mpfrMatrix’ of dim(.) = (8, 5) of precision
0
8
16
24
32
0 0.00 8.00 16.0 24.0 32.0
1 1.00 9.00 16.0 24.0 32.0
2 2.00 10.0 18.0 26.0 32.0
3 3.00 11.0 20.0 28.0 36.0
4 4.00 12.0 20.0 28.0 36.0
5 5.00 13.0 20.0 28.0 36.0
6 6.00 14.0 22.0 30.0 40.0
7 7.00 15.0 24.0 32.0 40.0
> formatHex(FourBits)
0
8
0 +0x0.0p+0 +0x1.0p+3
1 +0x1.0p+0 +0x1.2p+3
2 +0x1.0p+1 +0x1.4p+3
3 +0x1.8p+1 +0x1.6p+3
4 +0x1.0p+2 +0x1.8p+3
5 +0x1.4p+2 +0x1.ap+3
6 +0x1.8p+2 +0x1.cp+3
7 +0x1.cp+2 +0x1.ep+3

16
+0x1.0p+4
+0x1.0p+4
+0x1.2p+4
+0x1.4p+4
+0x1.4p+4
+0x1.4p+4
+0x1.6p+4
+0x1.8p+4

> formatBin(FourBits)
0
8
0 +0b0.000p+0 +0b1.000p+3
1 +0b1.000p+0 +0b1.001p+3
2 +0b1.000p+1 +0b1.010p+3
3 +0b1.100p+1 +0b1.011p+3
4 +0b1.000p+2 +0b1.100p+3
5 +0b1.010p+2 +0b1.101p+3
6 +0b1.100p+2 +0b1.110p+3
7 +0b1.110p+2 +0b1.111p+3

24
+0x1.8p+4
+0x1.8p+4
+0x1.ap+4
+0x1.cp+4
+0x1.cp+4
+0x1.cp+4
+0x1.ep+4
+0x1.0p+5

16
+0b1.000p+4
+0b1.000p+4
+0b1.001p+4
+0b1.010p+4
+0b1.010p+4
+0b1.010p+4
+0b1.011p+4
+0b1.100p+4

4

bits

32
+0x1.0p+5
+0x1.0p+5
+0x1.0p+5
+0x1.2p+5
+0x1.2p+5
+0x1.2p+5
+0x1.4p+5
+0x1.4p+5

24
+0b1.100p+4
+0b1.100p+4
+0b1.101p+4
+0b1.110p+4
+0b1.110p+4
+0b1.110p+4
+0b1.111p+4
+0b1.000p+5

> formatBin(FourBits, scientific=FALSE)
0
8
16
0 +0b_____0.000 +0b__1000.___ +0b_1000_.___
1 +0b_____1.000 +0b__1001.___ +0b_1000_.___
2 +0b____10.00_ +0b__1010.___ +0b_1001_.___
3 +0b____11.00_ +0b__1011.___ +0b_1010_.___
4 +0b___100.0__ +0b__1100.___ +0b_1010_.___
5 +0b___101.0__ +0b__1101.___ +0b_1010_.___
6 +0b___110.0__ +0b__1110.___ +0b_1011_.___
7 +0b___111.0__ +0b__1111.___ +0b_1100_.___

32
+0b1.000p+5
+0b1.000p+5
+0b1.000p+5
+0b1.001p+5
+0b1.001p+5
+0b1.001p+5
+0b1.010p+5
+0b1.010p+5

24
+0b_1100_.___
+0b_1100_.___
+0b_1101_.___
+0b_1110_.___
+0b_1110_.___
+0b_1110_.___
+0b_1111_.___
+0b1000__.___

32
+0b1000__.___
+0b1000__.___
+0b1000__.___
+0b1001__.___
+0b1001__.___
+0b1001__.___
+0b1010__.___
+0b1010__.___

760

G Computational Precision and Floating-Point Arithmetic

( (22 ) + (112 )
( (4) + (121)
( 4 + 120
(
124
(
120

)
)
)
)
)

+
+
+
+
+
340

(152 )
(225)
220
220
220

## parenthesized expression
## square each term
## round each term to two significant decimal digits
## calculate the intermediate sum
## round the intermediate sum to two decimal digits
## sum the terms

Compare this to the full precision arithmetic
( (22 ) + (112 ) ) + (152 ) ## parenthesized expression
( 4 + 121 ) + 225 ## square each term
(
125
) + 225 ## calculate the intermediate sum
350
## sum the terms
We see immediately that two-decimal-digit rounding at each stage gives an answer
that is not the same as the one from familiar arithmetic with real numbers.

G.7 Why Is .9 Not Recognized to Be the Same as (.3 + .6)?
We can now continue with the first example from Section G.1. The floating point
binary representation of 0.3 and the floating point representation of 0.6 must be
aligned on the binary point before the addition. When the numbers are aligned by
shifting the smaller number right one position, the last bit of the smaller number has
nowhere to go and is lost. The sum is therefore one bit too small compared to the
floating point binary representation of 0.9. Details are in Table G.6.

G.8 Why Is

 √ 2
2 Not Recognized to Be the Same as 2?

We continue with the second example from Section G.1. The binary representation
 √ 2
inside the machine of the two numbers
2 and 2 is not identical. We see in
Table G.7 that they differ by one bit in the 53rd binary digit.

G.9 zapsmall to Round Small Values to Zero for Display
R provides a function that rounds small values (those close to the machine epsilon)
to zero. We use this function for printing of many tables where we wish to interpret numbers close to machine epsilon as if they were zero. See Table G.8 for an
example.

G.9 zapsmall to Round Small Values to Zero for Display

761

Table G.6 Now let’s add 0.3 and 0.6 in hex:
0.3 +0x1.3333333333333p-2 = +0x0.9999999999999p-1 aligned binary (see below)
0.6 +0x1.3333333333333p-1 = +0x1.3333333333333p-1
---------------------------------------------0.9 add of aligned binary
+0x1.cccccccccccccp-1
0.9 convert from decimal
+0x1.ccccccccccccdp-1
We need to align binary points for addition. The shift is calculated by converting hex to binary, shifting one bit to the right to get the same p-1 exponent, regrouping four bits into hex
characters, and allowing the last bit to fall off:
1.0011 0011 0011 ... 0011 × 2−2 → .1001 1001 1001 ... 1001 | 1/ × 2−1

> nums369 <- c(.3, .6, .3+.6, 9)
> nums369df <+ data.frame("decimal-2"=nums369,
+
"decimal-17"=format(nums369, digits=17),
+
hexadecimal=sprintf("%+13.13a", nums369))
> nums369df[3,1] <- "0.3 + 0.6"
> nums369df
decimal.2
1
0.3
2
0.6
3 0.3 + 0.6
4
9

decimal.17
0.29999999999999999
0.59999999999999998
0.89999999999999991
9.00000000000000000

hexadecimal
+0x1.3333333333333p-2
+0x1.3333333333333p-1
+0x1.cccccccccccccp-1
+0x1.2000000000000p+3

Table G.7 The binary representation of the two numbers

 √ 2
2 and 2 is not identical. They differ

by one bit in the 53rd binary digit.

> sprintf("%+13.13a", c(2, sqrt(2)^2))
[1] "+0x1.0000000000000p+1" "+0x1.0000000000001p+1"

Table G.8 We frequently wish to interpret numbers that are very different in magnitude as if the
smaller one is effectively zero. The display function zapsmall provides that capability.

> c(100, 1e-10)
[1] 1e+02 1e-10
> zapsmall(c(100, 1e-10))
[1] 100
0

762

G Computational Precision and Floating-Point Arithmetic

G.10 Apparent Violation of Elementary Factoring
We show a simple example of disastrous cancellation (loss of high-order digits),
where the floating point statement
a2 − b2  (a + b) × (a − b)
is an inequality, not an equation, for some surprising values of a and b. Table G.9
shows two examples, a decimal example for which the equality holds so we can
use our intuition to see what is happening, and a hex example at the boundary of
rounding so we can see precisely how the equality fails.
Table G.9 Two examples comparing a2 − b2 to (a + b) × (a − b). On the top, the numbers are
decimal a = 101 and b=102 and the equality holds on a machine using IEEE 754 floating point
arithmetic. On the bottom, the numbers are hexadecimal a = 0x8000001 and b = 0x8000002 and
the equality fails to hold on a machine using IEEE 754 floating point arithmetic. The outlined 0 in
the decimal column for a^2 with x=+0x8000000 would have been a 1 if we had 54-bit arithmetic.
Since we have only 53 bits available to store numbers, the 54th bit was rounded to 0 by the Round
to Even rule (see Section G.5). The marker  in the hex column for a^2 with x=+0x8000000
shows that one more hex digit would be needed to indicate the squared value precisely.
Decimal Hex
100 = +0x64
x
a <- x+1
b <- x+2
a^2
b^2
b^2 - a^2
(b+a) * (b-a)

100
101
102
10201
10404
203
203

+0x1.9000000000000p+6
+0x1.9400000000000p+6
+0x1.9800000000000p+6
+0x1.3ec8000000000p+13
+0x1.4520000000000p+13
+0x1.9600000000000p+7
+0x1.9600000000000p+7

Decimal Hex
134217728 = +0x8000000
x
134217728
a <- x+1
134217729
b <- x+2
134217730
a^2
18014398777917440
b^2
18014399046352900
b^2 - a^2
268435460
(b+a) * (b-a)
268435459

+0x1.0000000000000p+27
+0x1.0000002000000p+27
+0x1.0000004000000p+27
+0x1.0000004000000p+54
+0x1.0000008000001p+54
+0x1.0000004000000p+28
+0x1.0000003000000p+28



G.12 Variance Calculations at the Precision Boundary

763

G.11 Variance Calculations
Once we understand disastrous cancellation, we can study algorithms for the calculation of variance. Compare the two common formulas for calculating sample
variance, the two-pass formula and the disastrous one-pass formula.
Two-pass formula
⎞
⎛ n

⎟
⎜⎜⎜
⎜⎜⎝ (xi − x̄)2 ⎟⎟⎟⎟⎠ /(n − 1)
i=1

One-pass formula
⎞
⎞
⎛⎛ n

⎟⎟
⎟
⎜⎜⎜⎜⎜⎜
2⎟
2⎟
⎟
⎜⎜⎝⎜⎜⎝
xi ⎟⎠ − n x̄ ⎟⎟⎟⎠ /(n − 1)
i=1

Table G.10 shows the calculation of the variance by both formulas. For x = (1, 2, 3),
var(x) = 1 by both formulas. For x = (k + 1, k + 2, k + 3), var(x) = 1 by both
formulas for k ≤ 107 . For k = 108 , the one-pass formula gives 0. The one-pass
formula is often shown in introductory books with the name “machine formula”.
The “machine” it is referring to is the desk calculator, not the digital computer.
The one-pass formula gives valid answers for numbers with only a few significant
figures (about half the number of digits for machine precision), and therefore does
not belong in a general algorithm. The name “one-pass” is reflective of the older
computation technology where scalars, not the vector, were the fundamental data
unit. See Section G.14 where we show the one-pass formula written with an explicit
loop on scalars.
We can show what is happening in these two algorithms by looking at the binary
display of the numbers. We do so in Section G.12 with presentations in Tables G.11
and G.12. Table G.11 shows what happens for double precision arithmetic (53 significant bits, approximately 16 significant decimal digits). Table G.12 shows the
same behavior with 5-bit arithmetic (approximately 1.5 significant decimal digits).

G.12 Variance Calculations at the Precision Boundary
Table G.11 shows the calculation of the sample variance for three sequential numbers at the boundary of precision of 53-bit floating point numbers. The numbers in
column “15” fit within 53 bits and their variance is the variance of k + (1, 2, 3) which
is 1. The numbers 1016 + (1, 2, 3) in column “16” in Table G.11 require 54 bits for
precise representation. They are therefore rounded to 1016 + c(0, 2, 4) to fit within
the capabilities of 53-bit floating point numbers. The variance of the numbers in column “16” is calculated as the variance of k + (0, 2, 4) which is 4. When we place the
numbers into a 54-bit representation (not possible with the standard 53-bit floating
point), the calculated variance is the anticipated 1.
Table G.12 shows the calculation of the sample variance for three sequential
numbers at the boundary of precision of 5-bit floating point numbers. The numbers
{33, 34, 35} on the left side need six significant bits to be represented precisely.

764

G Computational Precision and Floating-Point Arithmetic

Table G.10 The one-pass formula fails at x = (108 + 1, 108 + 2, 108 + 3) (about half as many
significant digits as machine precision). The two-pass formula is stable to the limit of machine
precision. The calculated value at the boundary of machine precision for the two-pass formula is
the correctly calculated floating point value. Please see Section G.12 and Tables G.11 and G.12 for
the explanation.

> varone <- function(x) {
+
n <- length(x)
+
xbar <- mean(x)
+
(sum(x^2) - n*xbar^2) / (n-1)
+ }

> vartwo <- function(x) {
+
n <- length(x)
+
xbar <- mean(x)
+
sum((x-xbar)^2) / (n-1)
+ }

> x <- 1:3

> x <- 1:3

> varone(x)
[1] 1

> vartwo(x)
[1] 1

> varone(x+10^7)
[1] 1

> vartwo(x+10^7)
[1] 1

> ## half machine precision
> varone(x+10^8)
[1] 0

> ## half machine precision
> vartwo(x+10^8)
[1] 1

> varone(x+10^15)
[1] 0

> vartwo(x+10^15)
[1] 1

> ## boundary of machine precision
> ##
> varone(x+10^16)
[1] 0

> ## boundary of machine precision
> ## See next table.
> vartwo(x+10^16)
[1] 4

> varone(x+10^17)
[1] 0

> vartwo(x+10^17)
[1] 0

Following the Round to Even rule, they are rounded to {32, 34, 36} in the fivebit representation on the right side. The easiest way to see the rounding is in the
scientific=FALSE binary presentation (the last section on both sides of the Table G.12, repeated in Table G.13).
There is also a third formula, with additional protection against cancellation,
called the “corrected two-pass algorithm”.
y = x − x̄
(y − ȳ)2 /(n − 1)
We define a function in Table G.14 and illustrate its use in a very tight boundary
case (the 54-bit column “16” of Table G.11) in Table G.15.

G.12 Variance Calculations at the Precision Boundary

765

Table G.11 Variance of numbers at the boundary of 53-bit double precision arithmetic. Column
“15” fits within 53 bits and the variance is calculated as 1. Column “16” requires 54 bits for
precise representation. With 53-bit floating point arithmetic the variance is calculated as 4. With
the extended precision to 54 bits, the variance is calculated as 1.
> x <- 1:3; p <- 15:16
> xx <- t(outer(10^p, x, ‘+‘)); dimnames(xx) <- list(x, p)
> print(xx, digits=17)
15
16
1 1000000000000001 10000000000000000
2 1000000000000002 10000000000000002
3 1000000000000003 10000000000000004
> formatHex(xx)
15
1 +0x1.c6bf526340008p+49
2 +0x1.c6bf526340010p+49
3 +0x1.c6bf526340018p+49

16
+0x1.1c37937e08000p+53
+0x1.1c37937e08001p+53
+0x1.1c37937e08002p+53

> var(xx[,"15"])
[1] 1
> var(xx[,"16"])
[1] 4
> x54 <- mpfr(1:3, 54)
> xx54 <- t(outer(10^p, x54, ‘+‘)); dimnames(xx54) <- list(x, p)
> xx54
’mpfrMatrix’ of dim(.) = (3, 2) of precision
15
16
1 1000000000000001.00 10000000000000001.0
2 1000000000000002.00 10000000000000002.0
3 1000000000000003.00 10000000000000003.0

54

bits

> vartwo(xx54[,"16"] - mean(xx54[,"16"]))
1 ’mpfr’ number of precision 54
bits
[1] 1
> ## hex for 54-bit numbers is not currently available from R.
>
> ## We manually constructed it here.
> formatHex(xx54) ## We manually constructed this
15
16
1 +0x1.c6bf5263400080p+49 +0x1.1c37937e080008p+53
2 +0x1.c6bf5263400100p+49 +0x1.1c37937e080010p+53
3 +0x1.c6bf5263400180p+49 +0x1.1c37937e080018p+53

bits

5
+0b1.00001p+5
+0b1.00010p+5
+0b1.00011p+5

53

> formatBin(yy6, scientific=FALSE)
3
4
5
1 +0b__1001.00 +0b_10001.0_ +0b100001.__
2 +0b__1010.00 +0b_10010.0_ +0b100010.__
3 +0b__1011.00 +0b_10011.0_ +0b100011.__

> formatBin(yy6)
3
4
1 +0b1.00100p+3 +0b1.00010p+4
2 +0b1.01000p+3 +0b1.00100p+4
3 +0b1.01100p+3 +0b1.00110p+4

> vartwo(yy6[,"5"])
1 ’mpfr’ number of precision
[1] 1

bits

5
+0b1.0000p+5
+0b1.0001p+5
+0b1.0010p+5

53

> formatBin(yy5, scientific=FALSE)
3
4
5
1 +0b__1001.0 +0b_10001._ +0b10000_._
2 +0b__1010.0 +0b_10010._ +0b10001_._
3 +0b__1011.0 +0b_10011._ +0b10010_._

> formatBin(yy5)
3
4
1 +0b1.0010p+3 +0b1.0001p+4
2 +0b1.0100p+3 +0b1.0010p+4
3 +0b1.0110p+3 +0b1.0011p+4

> vartwo(yy5[,"5"])
1 ’mpfr’ number of precision
[1] 4

(3, 3) of precision

5

> yy5
’mpfrMatrix’ of dim(.) =
3
4
5
1 9.00 17.0 32.0
2 10.0 18.0 34.0
3 11.0 19.0 36.0

bits

> yy6
’mpfrMatrix’ of dim(.) =
3
4
5
1 9.00 17.0 33.0
2 10.0 18.0 34.0
3 11.0 19.0 35.0
6

> yy5 <- mpfr(yy, 5); dimnames(yy5) <- list(y, q)

> yy6 <- mpfr(yy, 6); dimnames(yy6) <- list(y, q)

(3, 3) of precision

> y <- 1:3; q <- 3:5; yy <- t(outer(2^q, y, ‘+‘))

> y <- 1:3; q <- 3:5; yy <- t(outer(2^q, y, ‘+‘))

bits

Table G.12 Numbers with six and five significant bits. The six-bit integers 33 and 35 on the left side cannot be expressed precisely with only five significant
bits. They are rounded to 32 and 36 in the five-bit representation on the right side. The variance of the numbers {33, 34, 35} is 1. The variance of the numbers
{32, 34, 36} is 4. Please see Section G.12 for the discussion of this table. Please see Table G.13 for additional details.

766
G Computational Precision and Floating-Point Arithmetic

G.12 Variance Calculations at the Precision Boundary

767

Table G.13 This table focuses on the last displays in Table G.12. The last three bits in the 6bit display of 33 (001) show a 1 in the “1” position. The number is rounded to the nearest even
number (000) in the “2” digit (with a 0 in the “1” position) and truncated to (00 ) in the 5-bit
display. The last three bits (011) of 35 are rounded to the nearest even number (100) in the “2”
digit and truncated to (10 ) In both values, the resulting “2” digit is (0). The last three bits (010)
of 34 already have a (0) in the “1” digit and therefore no rounding is needed. The sample variance
of {33, 34, 35} is 1. When those numbers are rounded to five-bit binary, they become {32, 34, 36}.
The sample variance of {32, 34, 36} is 4.
6-bit binary

rounded to 5-bit binary
displayed
as 6-bit

truncated equivalent
to 5-bit
decimal

decimal

binary

33

+0b100001.

+0b100000. +0b10000 .

32

34

+0b100010.

+0b100010. +0b10001 .

34

35

+0b100011.

+0b100100. +0b10010 .

36

Table G.14 The corrected two-pass algorithm centers the data by subtracting the mean, and then
uses the two-pass algorithm on the centered data. It helps in some boundary conditions, for example
the one shown in Table G.15.
> vartwoC <- function(x) {
+
vartwo(x-mean(x))
+ }
> x <- 1:3
> vartwoC(x)
[1] 1
> vartwoC(x+10^7)
[1] 1
> ## half machine precision
> vartwoC(x+10^8)
[1] 1
> vartwoC(x+10^15)
[1] 1
> ## boundary of machine precision
> ##
> vartwoC(x+10^16)
[1] 4
> vartwoC(x+10^17)
[1] 0

768

G Computational Precision and Floating-Point Arithmetic

Table G.15 vartwo doesn’t work for some problems on the boundary, such as this example at
the boundary of 54-bit arithmetic. Summing the numbers effectively required one more significant
binary digit. Since there are no more digits available, the data was rounded and the variance is not
what our real-number intuition led us to expect. vartwoC does work.
> vartwo(xx54[,"15"])
1 ’mpfr’ number of precision
[1] 1

54

bits

> ## wrong answer. numbers were shifted one binary position.
> vartwo(xx54[,"16"])
1 ’mpfr’ number of precision 54
bits
[1] 2.5
> ## vartwoC protects against that problem and gets the right answer.
> vartwoC(xx54[,"16"])
1 ’mpfr’ number of precision 54
bits
[1] 1
> sum(xx54[1:2,"16"])
1 ’mpfr’ number of precision
[1] 20000000000000004

54

bits

> ## Adding the first two numbers effectively doubled the numbers which
> ## means the significant bits were shifted one more place to the left.
> ## The first value was rounded up. Looking at just the last three bytes
> ## (where the last three bits are guaranteed 0):
> ##
+0x008p53 + 0x010p53 -> +0x010p53 + 0x010p53 -> +0x020p53
>
> sum((xx54)[1:3,"16"])
## too high
1 ’mpfr’ number of precision 54
bits
[1] 30000000000000008
> sum((xx54)[3:1,"16"])
## too low
1 ’mpfr’ number of precision 54
bits
[1] 30000000000000004
> sum((xx54)[c(1,3,2),"16"]) ## just right
1 ’mpfr’ number of precision 54
bits
[1] 30000000000000006

G.13 Can the Answer to the Calculation be Represented?

769

G.13 Can the Answer to the Calculation be Represented?
Chan et al. (1983) discuss various strategies needed to make sure that the fundamental goal of numerical analysis is achieved:
If the input values can be represented by the computer, and if the answer can be represented
by the computer, then the calculation should get the right answer.

It is very easy to construct an easy sum-of-squares problem for which naive calculations cannot get the right answer. The programmer’s task is to get the right
answer.
The Pythagorean Theorem tells us that z = x2 + y2 will be an integer for several
well known sets of triples {x, y, z}. The triple {3, 4, 5} is probably the best known.
The triple {3k, 4k, 5k} for any k is also a triple which works. Table G.16 shows an
example of k for which naive calculation fails, and for which Mod (modulus), one of
R’s base function, works. The goal is to understand how Mod is written.
Table G.16 The naive square-root-of-the-sum-of-squares algorithm gets the right answer in two
of the three cases shown here. Mod gets the right answer in all three cases.
> x <- 3; y <- 4
> sqrt(x^2 + y^2)
[1] 5
> Mod(x + 1i*y)
[1] 5
> x <- 3e100; y <- 4e100
> sqrt(x^2 + y^2)
[1] 5e+100
> Mod(x + y*1i)
[1] 5e+100
> x <- 3e305; y <- 4e305
> sqrt(x^2 + y^2)
[1] Inf
> Mod(x + y*1i)
[1] 5e+305

770

G Computational Precision and Floating-Point Arithmetic

The problem is that squaring arguments with a large exponent causes floating
point overflow, which R interprets as infinite. The repair, shown in the MyMod function in Table G.17, is to rescale the numbers to a smaller exponent and then take the
square root of the sum of squares. At the end the exponent is restored.
Table G.17 The MyMod function rescales the numbers and gets the right answer in all three cases.
Only the third case is shown here.
> MyMod <- function(x, y) {
+
XYmax <- max(abs(c(x, y)))
+
xx <- x/XYmax
+
yy <- y/XYmax
+
+
result <- sqrt(xx^2 + yy^2)
+
+
result * XYmax
+ }
> x^2
[1] Inf
> y^2
[1] Inf
> MyMod(x, y)
[1] 5e+305

G.14 Explicit Loops
Desk calculator technology had different technical goals than digital computer technology. Entering the data manually multiple times was expensive and to be avoided.
Table G.18 shows a scalar version of the one-pass algorithm for calculating sample
variance. The explicit loop uses each entered value x[i] twice before going on to
the next value. The term one-pass refers
 to the single entry of the data value for
both accumulations (xi ) and (xi2 ) on the scalars in the vector x. Explicit scalar
loops are exceedingly slow in R and are normally avoided. When we use vectorized
operations (as in statements such as sum(x^2)) the looping is implicit at the user
level and is done at machine speeds inside R.
We timed the scalar version of the one-pass algorithm along with the vectorized one-pass and two-pass algorithms. The explicitly looped one-pass algorithm
varoneScalar is much slower than the vectorized algorithms. The vectorized
onepass and twopass algorithms are about equally fast. Only the twopass algorithm gives the correct calculation to the precision of the computer.

G.14 Explicit Loops

771

Table G.18 The one-pass formula written as a scalar loop. This made sense for desk calculators
because the user keyed in each number exactly once. It is about the most inefficient way to write
code for R. In this example it is about 60 times slower than the vectorized algorithms.
> varoneScalar <- function(x) {
+
## This is a pedagogical example.
+
## Do not use this as a model for writing code.
+
n <- length(x)
+
sumx <- 0
+
sumx2 <- 0
+
for (i in 1:n) {
+
sumx <- sumx + x[i]
+
sumx2 <- sumx2 + x[i]^2
+
}
+
(sumx2 - (sumx^2)/n) / (n-1)
+
}
> x <- 1:3
> varoneScalar(x)
[1] 1
> varoneScalar(x+10^7)
[1] 1
> ## half machine precision
> varoneScalar(x+10^8)
[1] 0
> xx <- rnorm(1000)
> ## explicit loops are much slower in R
> system.time(for (j in 1:1000) varoneScalar(xx))
user system elapsed
1.249
0.309
1.483
> system.time(for (j in 1:1000) varone(xx))
user system elapsed
0.020
0.002
0.021
> system.time(for (j in 1:1000) vartwo(xx))
user system elapsed
0.021
0.001
0.022

Appendix H

Other Statistical Software

The statistical analyses described in this book can be calculated with other software
than R.
Readers are welcome to work the examples and exercises in this book using other
software. All datasets used in either the text or exercises are available in ASCII
characters in csv (comma-separated-values) format in the zip file
http://astro.ocis.temple.edu/~rmh/HH2/HH2datasets.zip
The reader must be aware of several issues when using these datasets.
1. All data sets.
The first row of the csv file contains variable names. There is one fewer name
than columns of data with the convention that the initial unnamed column is the
row number or row name. Missing observations are coded NA. Factors are stored
as character strings. When converting them back to factors, verify that the factor
levels are ordered correctly. Names (row names, column names, level names for
factors, and values of character variables) may include blanks and other nonalphanumeric characters.
2. Time Series datasets.
co2, elnino, employM16, nottem, ozone are stored one row per year in
twelve columns named with the month names. Missing observations are coded
NA. When converting this back to a time series in any software system, verify
that the months are identified as a factor in the correct calendar order. Factor
levels may include blanks and other non-alphanumeric characters.
product, tser.mystery.X, tser.mystery.Y, tser.mystery.Z, tsq
are stored as a single column named x. Read the problem description for any
other information.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

773

774

H Other Statistical Software

3. Very Long character strings.
SFF8121 contains very long character strings with embedded newline characters
and with embedded commas. It is the only csv file in this set that has been saved
with double quotes around all character strings.

Appendix I

Mathematics Preliminaries

A certain degree of mathematical maturity is a prerequisite for understanding the
material in this book. Many chapters in this book require a basic understanding of
these areas of mathematics:
• algebra
• differential calculus
• matrix algebra, with special attention devoted to quadratic forms, eigenvalues
and eigenvectors, transformations of coordinate systems, and ellipsoids in matrix
notation
• combinations and permutations
• floating point arithmetic
This appendix provides a brief review of these topics at a level comparable to the
book’s exposition of statistics.

I.1 Algebra Review
We begin with some topics in algebra, focusing on the case of two dimensions. The
labels x and y are given to the horizontal and vertical dimensions, respectively.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

775

776

I Mathematics Preliminaries

I.1.1 Line
The general equation of a straight line is given by y = a + bx, where a and b are
constants with b  ±∞. The line in Figure I.1 intersects the y-axis at y = a and the
x-axis at x = −a/b, and has slope b.
1
y = − 2 + x = a + bx
2
1

+

0

y

−1

+

−2

1

a

2

−3
−2

0

2

4

x
Fig. I.1 Straight line y = 2 + 1/2x with intercepts and slope indicated.

I.1.2 Parabola
The general equation of a parabola having a vertical axis is the quadratic equation
y = ax2 + bx + c for constants a, b, c with a  0. The graph of the parabola opens
upward if a > 0 and attains a minimum when x = −b/2a. The graph opens downward if a < 0 and attains a maximum when x = −b/2a. The quantity d = b2 − 4ac
is called the discriminant. The parabola intersects the horizontal axis at
√
−b ± d
(I.1)
x=
2a
The number of intersections, or real roots, is 2, 1, or 0 according to whether d >, =,
< 0. The parabola intersects the y-axis at y = c. Equation (I.1) is referred to as the
quadratic formula for solving the equation ax2 + bx + c = 0. We illustrate a parabola
opening upward in Figure I.2.

I.1 Algebra Review

777

y = (x + 5)(x − 3) = 1x2 + 2x − 15 = ax2 + bx + c
root1

root2

minimum

10
5
0

y

−5
−10
−15

minimum
−6

−4

−2

0

2

4

x
Fig. I.2 Parabola with x-roots and minimum indicated.

I.1.3 Ellipse
The equation
x2 y2
+
=1
a2 b2
represents an ellipse centered at (0, 0) with major and minor axes parallel to the
coordinate system. When a > b, the semimajor axis has length a and the semiminor
axis has length b. We graph a sample ellipse, with a = 3 and b = 2, in Figure I.3.
An ellipse centered at (μ x , μy ) is obtained by replacing x and y in the above with
x−μ x and y−μy , respectively. Ellipses having axes nonparallel to the coordinate system are important in statistics and will be discussed in Section I.4.13 as an example
of the use of matrix notation.

I.1.4 Simultaneous Equations
A common algebraic problem is the determination of the solution to two (or more)
simultaneous equations. In the case of two linear equations, the number of solutions
may be 0, 1, or ∞. There are no solutions if the equations are contradictory, such as
x + y = 8 and x + y = 9; there are an infinite number of solutions if one equation
is indistinct from the other, for example x + y = 8 and 2x + 2y = 16. When there is
a unique solution, several approaches exist for finding it. One of these is adding a
carefully chosen multiple of one equation to the other equation so as to result in an
easily solved new linear equation involving just one variable. For example, suppose

778

I Mathematics Preliminaries

x2 y2 x2 y2
+ = +
=1
32 22 a2 b2

a
b

2

1

y

0

−1

−2
−3

−2

−1

0

1

2

3

x
2

Fig. I.3 Ellipse

2

x
y
+
=1
32 22

the two equations are x+y = 8 and 2x−3y = 1. Adding three times the first equation
to the second yields 5x + 0y = 25, which implies x = 5 and then y = 3. We illustrate
in Figure I.4.

I.1.5 Exponential and Logarithm Functions
Two additional elementary functions are the exponential function and logarithmic
function. The exponential function y = c1 ec2 x (where c1 , c2 are nonzero constants)
has the property that the rate of change in y in response to a change in x is proportional to the current value of y. The logarithmic function y = c ln(x), x > 0 is
useful in situations when successive changes in x are geometrical (i.e., proportional
to the current value of x). The exponential function and the natural (to the base e)
logarithm are inverse to each other. We illustrate both in Figure I.5.

I.1 Algebra Review

779

x + y = 8, 2x − 3y = 1
4

+ (5, 3)

3
2

=1

y
=
8

2x

1

y
−3

x+

y

0

0

2

4

6

8

x
Fig. I.4 Solution of simultaneous equations at the intersection of the two straight lines described
by the equations.

6
5
4
3

y=

2

dy
dx

= ex

y
1
0

y = ln(x) ;

−1

dy
dx

=

1
x

−2
−3
−3

−2

−1

0

1

2

3

4

5

x
Fig. I.5 Exponential and logarithmic functions with derivatives at x=0.

6

780

I Mathematics Preliminaries

10
9
8
7
6

y

5
4
3
2
1
0
0

1

2

3

4

5

6

7

8

9

10

x
Fig. I.6 Asymptotes of the hyperbola y = 1/x at the horizontal axis as x → ∞ and the vertical
axis as x → 0+ .

I.1.6 Asymptote
An asymptote is a straight line that is gradually approached by a curved line. This
concept is used to describe the ultimate behavior of the curved line. For example in
Figure I.6, the graph of y = 1x has the horizontal axis as its asymptote as x → ∞ and
the vertical axis as its asymptote as x → 0+ .

I.2 Elementary Differential Calculus
If y = f (x) expresses the functional relationship between x and y, the derivative of y
dy
or D x y or f  (x), is a new function of x. For each value
with respect to x, denoted dx

of x, f (x) gives the relative amount that y changes in response to a small change in
x. For example, if y = f (x) = x2 , then it can be shown that f  (x) = 2x. When x = 3,
a small increase in x will beget a sixfold increase in y because f  (3) = 2(3) = 6.
Graphically, f  (x0 ) is the slope of the straight line tangent to f (x) at x = x0 .
If f (x) = a0 + a1 x + a2 x2 + . . . + am xm , an mth -degree polynomial, then f  (x) =
a1 + 2a2 x + 3a3 x2 + . . . + mam xm−1 . This rule can be used to differentiate (i.e., find the
derivative of) many common functions. If f (x) can be expressed as the product of
two functions, say f (x) = g(x) h(x), then its derivative is given by the product rule

I.3 An Application of Differential Calculus

781

f  (x) = g(x) h (x) + g (x) h(x). This can be used, for example, to find the derivative
of (3x2 + 4x + 5)(−8x2 + 7x − 6) without first multiplying the quadratics.
The most important application of f  (x) is in finding relative extrema (i.e., maxima or minima) of f (x). A necessary condition for x0 to be an extremum of f (x) is
that f  (x0 ) = 0. This follows from the interpretation of the derivative as a tangent
slope. Additional investigation is needed to confirm that such an x0 corresponds to
either a maximum or minimum, and then to determine which of the two it is. For
example, if f (x) = x3 − 3x, then f  (x) = 3x2 − 3. Setting f  (x) = 0, we find x = ±1.
x = 1 corresponds to a local minimum and x = −1 corresponds to a local maximum.
As another example, consider f (x) = x3 . While for this function, f  (0) = 0, x = 0 is
neither a relative minimum nor a relative maximum of f (x).

Example of an Optimization Problem
A rectangular cardboard poster is to have 150 square inches for printed matter. It is
to have a 3-inch margin at the top and bottom and a 2-inch margin on each side. Find
the dimensions of the poster so that the amount of cardboard used is minimized.
Solution I.1. Let the vertical dimension of the printed matter be x. Then for the
printed area to be 150, the horizontal dimension of the printed matter is 150
x . Then
allowing for the margins, the vertical and horizontal dimensions of the poster are
x + 6 and 150
x + 4. The product of these dimensions is the area of the poster that we
seek to minimize: a(x) = 174 + 4x + 900
x . Taking the derivative and setting it equal
to zero gives a (x) = 4 − 900x−2 = 0, which leads to the positive solution x = 15.
Thus the minimum-sized poster with required printed area is 21 inches high by 14
inches wide, and its printed area is 15 inches high by 10 inches wide.

I.3 An Application of Differential Calculus
We introduce Newton’s method for solutions of an equation of the form f (x) = 0.
A common application is the need to solve f  (x) = 0 to find extrema, as discussed
in Section I.2. Many equations of this type are readily solvable by successively
moving toward isolation of a lone x on one side of the equation, or via a specialized
technique such as the quadratic formula. In other situations one must employ one
of the number of numerical techniques designed for this purpose, one of which is
Newton’s method.
Newton’s method has the disadvantage of requiring knowledge of the derivative
f  (x), but it will often converge to a solution within a small number of iterations.
As with all procedures for dealing with this problem, one must start with a first

782

I Mathematics Preliminaries

approximation x0 , and if this is “too far” from the solution x∗ , the procedure may
fail to converge.
The idea behind Newton’s method is not difficult to understand. It is based on
the equation of the tangent line to f (x) at x = x0 . If this tangent line intersects the
x-axis at x = x1 , then
f  (x0 ) =

f (x0 )
f (x0 )
→ x1 = x0 − 
x0 − x1
f (x0 )

The iteration then proceeds with
x2 = x1 −

f (x1 )
f  (x1 )

and so on.

An Illustration of Newton’s Method
Consider solving f (x) = x3 − x − 5 = 0, f  (x) = 3x2 − 1, and let x0 = 2. You can
verify with Figure I.7 the following sequence:
i
xi
0 2.00000000
1 1.90909091
2 1.90417486
3 1.90416086

f (xi )
1.00100
4.8810−2
1.3810−4
1.1410−9

I.4 Topics in Matrix Algebra
We next provide an overview of selected topics from matrix algebra that are useful in applied statistics. Not only do matrices (the plural of matrix) allow for
a more concise notation than scalar algebra, but they are an indispensable tool
for communicating statistical findings. Additional material on matrix algebra is
contained in the appendices of most books dealing with regression analysis, linear
models, or multivariate analysis.
A matrix is a rectangular array consisting of r rows and c columns. A vector is a
special type of matrix, having either r or c equal to 1. Data files are often arranged
as a matrix, such that each variable is one column and each observation is one row.
Systems of linear equations may be succinctly written in matrix notation.

I.4 Topics in Matrix Algebra

783

4

f(x) = x3 − x − 5

3

y = − 21 + f′(x0)x

y

2

x (2, 1)

1

+
+
x1 = 21 11 x0 = 2

0
−1

1.8

1.9

2.0

2.1

2.2

x
Fig. I.7 An illustration of Newton’s Method.

Multivariate analysis involves probability distributions of random vectors rather
than scalar random variables. Each component of a random vector is a scalar random variable. The variances and covariances of the components of a random vector
are arranged in a variance–covariance matrix. (Such a symmetric matrix V, also
called covariance matrix or dispersion matrix, has the variances on the main diagonal and the covariance between variables i and j in its row i column j position. See
Section 3.3.5.) The multivariate normal distribution of the random k-vector x with
mean vector μ and covariance matrix V, with notation
x ∼ N(μ, V)

(I.2)

and probability density function
f (x) =

1
 −1
e(x−μ) V (x−μ)/2
(2π)k/2 |V|1/2

(I.3)

is the most important distribution used in multivariate analysis. We give examples
of the bivariate (multivariate with k = 2) normal in Sections 3.3.5 and J.4.2.
Matrices may be used to translate from one coordinate system to another. Orthogonal matrices perform rigid rotations in order to facilitate meaningful interpretation
of results.

784

I Mathematics Preliminaries

I.4.1 Elementary Operations
sum: If two matrices are of the same size, their sum is the new matrix of this size
comprised of the element-wise sums. The difference of two matrices of the same
size is defined similarly.
transpose: If A is an r × c matrix, then its transpose, denoted A , is the c × r matrix
having columns identical to the rows of A and conversely.
inner product: The inner product of two vectors of the same size, say u =
(u1 , u2 , . . . , uk ) and v = (v1 , v2 , . . . , vk ) , is written u v = v u = ki=1 ui vi , i.e.,
the sum of the products of the corresponding elements of the two vectors. The
inner product of a vector with itself yields the sum of squares of the elements of
this vector. A vector u is said to be normalized if u u = 1, i.e., if its (Euclidean)
length equals 1. Two vectors u and v are said to be orthogonal if their inner product is zero: u v = 0.
matrix product: The matrix product AB of an r × c matrix A with an m × n matrix
B is defined when c = m. The element in row i and column j of AB is calculated
as the inner product of the ith row of A and the jth column of B. The condition
c = m assures that the vectors forming the inner products are of the same size.
Matrix addition has the mathematical properties of commutativity and associativity. Matrix multiplication has only associativity. When AB is defined, BA may
have different dimensions from AB or even be undefined.
Matrix multiplication is used when expressing systems of linear equations in
matrix notation. Earlier we considered
the system
  x + y = 8 and 2x − 3y = 1. If


we define the 2 × 1 vectors X = yx and c = 81 and the 2 × 2 matrix A = 12 −31 ,
then this system can be written AX = c. In this context, the matrix A is referred
to as the coefficient matrix.
transpose: The transpose of the product of two matrices is the product of their
transposes in reverse order: (AB) = B A .
square:

A matrix is said to be square if it has the same number of rows as columns.

identity: The n × n identity matrix has ones on its main diagonal positions (those
where the row number equals the column number) and zeros everywhere else.
Thus, for example, the 3 × 3 identity matrix is
⎛1
⎜⎜⎜
I3 = ⎜⎜⎜⎝ 0
0

0
1
0

0 ⎞⎟⎟
⎟
0 ⎟⎟⎟⎠
1

The identity matrix plays the same role in matrix notation as does the number 1
in scalar notation: If I denotes an identity matrix such that the indicated multiplication is defined, then AI = A and IA = A.

I.4 Topics in Matrix Algebra

Jn :

785

We define Jn to be the n × n matrix having all entries equal to 1.

symmetric: A square matrix A is said to be a symmetric matrix if A = A . This
means that row number i corresponds to column number i for all i. Let ai j denote
the element in row i and column j of matrix A. Then if A is square, its trace is
the sum of its main diagonal elements: trace(A) = i aii .
operation count: Numerical analysts count the number of multiplications in an algorithm as an indicator of the costliness of the algorithm. A vector inner product
n
i=1 ai bi , for example, takes n multiplications to complete. There are other operations (indexing, adding) that must also be performed. Rather than report them
explicitly, we say instead that the amount of computation is proportional to the
number of multiplications. We indicate the proportionality by saying the operation count is O(n) (read as “big ‘O’ of n”). Similarly, the operation count for
matrix multiplication is proportional to n3 and is reported as O(n3 ).
determinant: The determinant of a square matrix A, denoted |A|, is a scalar calculated from the elements of A. In the 2 × 2 case where

a
a12
A = 11
a21 a22
the determinant is |A| = a11 a22 − a21 a12 . If A is a square coefficient matrix of a
system of linear equations (thus implying that the system has the same number
of equations as unknowns), then the system has a unique solution if and only
if |A|  0. The determinant has useful mathematical properties, but is totally
impractical from a computational standpoint. It is almost never needed as an
intermediate calculation. There is almost always a cheaper way to calculate the
final answer.
nonsingular: If |A|  0, then A is said to be a nonsingular matrix. There is no
vector v other than v = 0 such that Av ≡ 0.
inverse: A nonsingular matrix has associated with it a unique inverse, denoted
A−1 . The inverse has the property that AA−1 = A−1 A = I. The unique solution to
a system of linear equations AX = c is then X = A−1 c. For example, the inverse
of




1
a11 a12
a22 −a12
A=
is
a21 a22
|A| −a21 a11
provided that |A|  0. Note that this is a mathematical identity. It is not to be
interpreted as an algorithm for calculation of the inverse. As an algorithm it is
very expensive, requiring O(2n3 ) arithmetic operations for an n × n matrix A. An
efficient algorithm requires only O(n3 ) operations.
singular: If |A| = 0, then A is said to be a singular matrix. There exists at least one
vector v other than v = 0 such that Av ≡ 0. A singular matrix does not have an
inverse.

786

I Mathematics Preliminaries

idempotent: A square matrix A is said to be an idempotent matrix if AA = A, i.e.,
A equals the product of A with itself. A simple example is


.5 −.5
−.5 .5

I.4.2 Linear Independence
A matrix X consists of a set of column vectors
X

n×(1+p)

= [1X1 X2 . . . X p ] = [X0 X1 X2 . . . X p ]

The columns are numbered 0, 1, . . . , p.
The matrix X is said to have linearly dependent columns if there exists a nonzero
(1 + p)-vector  such that
X = 0
or, equivalently,
⎛
⎞
⎜⎜⎜
⎟⎟
⎜⎜⎝⎜  j X j ⎟⎟⎟⎟⎠ = (0)
n×1
j

n×1

The matrix X is said to have linearly independent columns if no such vector  exists.
For example, the matrix
⎛
⎞
⎜⎜⎜ 1 1 0 0 0 ⎟⎟⎟
⎜⎜⎜ 1 0 1 0 0 ⎟⎟⎟
⎟⎟
(I.4)
X = ⎜⎜⎜⎜
⎜⎜⎝ 1 0 0 1 0 ⎟⎟⎟⎟⎠
4×(1+4)
10001
has linearly dependent columns because there exists a vector  = (−1 1 1 1 1) such
that X = 0.
The matrix

⎛
⎜⎜⎜ 1
⎜⎜⎜ 1
X(,−1) = ⎜⎜⎜⎜
⎜⎜⎝ 1
4×(1+3)
1

0
1
0
0

0
0
1
0

⎞
0 ⎟⎟
⎟
0 ⎟⎟⎟⎟
⎟
0 ⎟⎟⎟⎟⎠
1

(I.5)

has linearly independent columns because there exists no nonzero vector  such that
X = 0.
The rank of a matrix is the number of linearly independent columns it contains.
Both matrices above, X and X(,−1) in Equations (I.4) and (I.5), have rank 4. For any

I.4 Topics in Matrix Algebra

787

matrix X, rank(X) = rank(X  X). A full-rank matrix is one whose rank is equal to the
minimum of its row and column dimensions, that is,

rank A = min(r, c)
r×c

I.4.3 Rank
The rank of a matrix A is the maximum number of linearly independent rows (equivalently, the maximum number of linearly independent columns) the matrix has. For
example, the matrix
⎛1
⎜⎜⎜
A = ⎜⎜⎜⎝ 1
1

1
2
3

2 ⎞⎟⎟
⎟
3 ⎟⎟⎟⎠
4

has rank 2 since columns 1 and 2 add to column 3, but any two of the columns are
linearly independent of one another (i.e., are not proportional to one another).

I.4.4 Quadratic Forms
A quadratic form is a scalar resulting from the matrix product x Ax, where x is a
k × 1 vector and A is a k × k symmetric matrix. The matrix A is termed the matrix
of the quadratic form. Complicated sums of squares and products as often occur in
an analysis of variance can be written as quadratic forms. If x has a standardized
multivariate normal distribution x ∼ N(0, I) [see Equation (I.2)], then x Ax has a
χ2 -distribution with ν degrees of freedom if and only if A is an idempotent matrix
with rank ν. For example, the numerator of the usual univariate sample variance,
n
1
2


i=1 (xi − x̄) , can be written as x Ax where x = (x1 , x2 , . . . , xn ) and A = In − n Jn .
It can be shown that this matrix A has rank n − 1, the degrees of freedom associated
with the sample variance.
If x is a random vector with expected value μ and covariance matrix V, then the
expected value of the quadratic form x Ax is
E(x Ax) = μ Aμ + trace(AV)
Result (I.6) does not require that x has a multivariate normal distribution.
A square symmetric matrix A is said to be a positive definite (abbreviated p.d.)
matrix if x Ax > 0 for all vectors x other than a vector of zeros. The matrix associated with the quadratic form representation of a sum of squares is always p.d.

788

I Mathematics Preliminaries

I.4.5 Orthogonal Transformations
A matrix M is said to be orthogonal if M  M = I. An example is
⎛ √
⎜⎜⎜ 1/ 3
⎜⎜⎜ √
M = ⎜⎜⎜⎜⎜ 1/ 3
⎜⎜⎝ √
1/ 3

√
1/ 2
√
−1/ 2
0

√ ⎞
1/ 6 ⎟⎟⎟⎟
√ ⎟⎟⎟
1/ 6 ⎟⎟⎟⎟
√ ⎟⎟⎠
−2/ 6

A transformation based on an orthogonal matrix is called an orthogonal transformation. Such transformations are rotations that preserve relative distances: y =
Mx → y y = x M  Mx = x x. Orthogonal transformations are frequently encountered in statistics. A common use of them is to transform from a correlated set
of random variables x to an uncorrelated set y.
The columns of an orthogonal matrix are said to be orthonormal. The columns
are orthogonal to each other, that is M,j M, j = 0 for j  j . In addition, the columns
have been scaled so that M,j M, j = 1.

I.4.6 Orthogonal Basis

If rank A = p < min(r, c) and c ≤ r, then any set of p linearly independent
r×c
columns of A constitutes a basis for A. The set of all vectors that can be expressed
as a linear combination Xv of the columns of A is called the column space of A,
denoted C(A). Therefore, C(A) is completely specified by any basis of A. We say
that the columns of the basis span the column space of A.
An orthogonal basis for A is a basis for A with the property that any two vectors
comprising it are orthogonal. Starting from an arbitrary basis for A, algorithms are
available for constructing an orthogonal basis for A. We show one algorithm, the
Gram–Schmidt process, in Section I.4.8.
A basis set of column vectors for a matrix X is a set of column vectors Ui that
span the same linear space as the original columns Xi . An orthogonal basis is a set
of column vectors that are mutually orthogonal, that is Ui U j = 0 for i  j. An
orthonormal basis
 is an orthogonal basis whose columns have been rescaled to have
norm Ui =

Ui Ui = 1.

I.4 Topics in Matrix Algebra

789

I.4.7 Matrix Factorization—QR
Any rectangular matrix X can be factored into the product of an matrix with orn×m
thogonal columns Q and an upper triangular R
m×m

n×m

X = Q
n×m

R

n×m m×m

The columns of Q span the same column space as the columns of the original
matrix X. This means that any linear combination of the columns of X, say Xv, can
be constructed as a linear combination of the columns of Q. Specifically, using the
associative law, Xv = (QR)v = Q(Rv).
The numerically efficient R function qr is the computational heart of the linear
models and analysis of variance functions. The intermediate matrices Q and R are
usually not explicitly produced. If you wish to see them, use the qr.Q and qr.R
functions. An example showing the QR factorization using the qr function is in
Table I.1.
An expository R function illustrating the construction of the QR factorization is
in Section I.4.8.

I.4.8 Modified Gram–Schmidt (MGS) Algorithm
There are many algorithms available to construct the matrix factorization. We
show one, the Modified Gram–Schmidt (MGS) algorithm Bjork (1967). “Modified”
means that the entire presentation is in terms of the columns Qi of the matrix under
construction. The MGS algorithm is numerically stable when calculated in finite
precision. The original Gram–Schmidt (GS) algorithm, which constructs Q in terms
of the columns Xi of the original matrix, is not numerically stable and should not be
used for computation.
Let Xn×m = [X1 X2 . . . Xm ]. The results of the factorization will be stored in Qn×m
and Rm×m . The columns of X and Q and both the rows and columns of R are numbered 1, . . . , m.

790

I Mathematics Preliminaries

Table I.1 Illustration of QR algorithm to factor X into the product of an orthogonal matrix and an
upper triangular matrix.

> X <- matrix(c(1,3,6,4,2,3,8,6,4,5,3,2), 4, 3)
> X
[1,]
[2,]
[3,]
[4,]

[,1] [,2] [,3]
1
2
4
3
3
5
6
8
3
4
6
2

> crossprod(X)
[,1] [,2] [,3]
[1,]
62
83
45
[2,]
83 113
59
[3,]
45
59
54
> ## use the efficient calculation
> X.qr <- qr(X)
> qr.Q(X.qr) ## display q
[,1]
[,2]
[,3]
[1,] -0.127 0.48139 0.8188
[2,] -0.381 -0.73969 0.4755
[3,] -0.762 -0.02348 -0.3038
[4,] -0.508 0.46965 -0.1057
> qr.R(X.qr) ## display r
[,1]
[,2]
[,3]
[1,] -7.874 -10.541 -5.7150
[2,] 0.000
1.374 -0.9041
[3,] 0.000
0.000 4.5301
> zapsmall(crossprod(qr.Q(X.qr)))
[,1] [,2] [,3]
[1,]
1
0
0
[2,]
0
1
0
[3,]
0
0
1
> crossprod(qr.R(X.qr))
[,1] [,2] [,3]
[1,]
62
83
45
[2,]
83 113
59
[3,]
45
59
54
> qr.X(X.qr) ## reproduce X
[,1] [,2] [,3]
[1,]
1
2
4
[2,]
3
3
5
[3,]
6
8
3
[4,]
4
6
2

## identity

## reproduce crossprod(X)

I.4 Topics in Matrix Algebra

791

We will construct Q and R in steps.
1. Initialize R to 0.
R←0
2. Initialize Q to X.
Q←X
3. Initialize the column counter.
i←1
4. Normalize column Qi .
ri,i ←



Qi Qi

Qi ← Qi /ri,i
If i = m, we are done.
5. For each of the remaining columns Q j , j = i + 1, . . . , m, find the component of
Q j orthogonal to Qi by
ri, j ← Qi Q j
Q j ← Q j − Qi ri, j

6. Update the column counter.
i ← i+1
7. Repeat steps 4–6 until completion.
An R version of this expository algorithm is in Table I.2. An example using the
expository function is in Table I.3.

792

I Mathematics Preliminaries

Table I.2 An expository algorithm for the Modified Gram–Schmidt Algorithm. The function is a
direct translation of the pseudo-code in Section I.4.8.

## modified Gram-Schmidt orthogonalization
mgs <- function(x) {
## modified Gram-Schmidt orthogonalization
## this is an expository algorithm
## this is not an efficient computing algorithm
## q[,j] is the normalized residual from the least squares fit of
## x[,j] on the preceding normalized columns q[,1:(j-1)]
n <- nrow(x)
m <- ncol(x)
q <- x
r <- matrix(0, m, m)
for (i in 1:m) {
r[i,i] <- sqrt(sum(q[,i]^2)) ## length of q[,i]
q[,i] <- q[,i] / r[i,i]
## normalize q[,i]
if (i < m) {
## if we still have columns to go
for (j in (i+1):m) {
r[i,j] <- sum(q[,i] * q[,j]) ## length of projection of q[,j] on q[,i]
q[,j] <- q[,j] - q[,i] * r[i,j] ## remove projection of q[,j] on q[,i]
}
}
}
list(q=q, r=r)
}

I.4 Topics in Matrix Algebra

793

Table I.3 Illustration of the expository algorithm for the Modified Gram–Schmidt Algorithm
shown in Table I.2. These are the same values (up to multiplication by −1) as calculated by the qr
function in Table I.1.

X <- matrix(c(1,3,6,4,2,3,8,6,4,5,3,2), 4, 3)
X
crossprod(X)
## use the expository function defined in the previous chunk
X.mgs <- mgs(X)
X.mgs ## q is orthogonal, r is upper triangular
## These are identical to the results of qr(X)
## up to the sign of the columns of q and the rows of r.
zapsmall(crossprod(X.mgs$q)) ## identity
crossprod(X.mgs$r)
## reproduces crossprod(X)
X.mgs$q %*% X.mgs$r

## reproduces X

I.4.9 Matrix Factorization—Cholesky
Any square positive definite matrix S can be factored into the product of an upper
m×m

triangle matrix R and its transpose
n×m

S = R R
When S has been constructed as the cross product S = X  X of a rectangular matrix
X , then the upper triangular matrix R is the same matrix we get from the QR

n×m

factorization.
S = X  X = (QR) (QR) = R (Q Q)R = R R
The numerically efficient R function chol is illustrated in Table I.4.

I.4.10 Orthogonal Polynomials
Consider the k-vector v = (v1 , v2 , . . . , vk ) , where v1 < v2 < . . . < vk . Construct a
matrix V = [v0 , v1 , v2 , . . . , vk−1 ], where we use the notation v j = (v1j , v2j , . . . , vkj )

794

I Mathematics Preliminaries

Table I.4 Illustration of Cholesky factorization of a square positive definite matrix into an upper
triangular factor and its transpose.

> X <- matrix(c(1,3,6,4,2,3,8,6,4,5,3,2), 4, 3)
> M <- crossprod(X)
> M
[1,]
[2,]
[3,]

[,1] [,2] [,3]
62
83
45
83 113
59
45
59
54

> chol(M)
[,1]
[,2]
[,3]
[1,] 7.874 10.541 5.7150
[2,] 0.000 1.374 -0.9041
[3,] 0.000 0.000 4.5301
> crossprod(chol(M)) ## reproduce M
[,1] [,2] [,3]
[1,]
62
83
45
[2,]
83 113
59
[3,]
45
59
54

An orthogonal basis Q constructed from the matrix V is called a set of orthogonal
polynomials. In the analysis of variance and related techniques, we often construct
dummy variables for ordered factors from a set of contrasts that are orthogonal
polynomials. See Figure 10.4 and the surrounding discussion in Section 10.4 for an
illustration.

I.4.11 Projection Matrices
Given any matrix X = Q
n×m

R the matrix PX = X(X  X)−1 X  = QQ is a projection

n×m m×m

n×n

matrix that projects an n-vector y onto the space spanned by the columns of X, that
is, the product PX y is in the column space C(X). If X has m columns and rank r ≤ m,
then the eigenvalues of PX consist of r 1s and m − r 0s. See Table I.5.

I.4 Topics in Matrix Algebra

795

Table I.5 Projection of a 3-vector onto the space of its first two coordinates.

> X <- matrix(c(3,1,0, 1,2,0, 0,0,0), 3, 3)
> P <- cbind(qr.Q(qr(X))[, 1:2], 0)
> P
[,1]
[,2] [,3]
[1,] -0.9487 0.3162
0
[2,] -0.3162 -0.9487
0
[3,] 0.0000 0.0000
0
> crossprod(P)
[,1] [,2] [,3]
[1,]
1
0
0
[2,]
0
1
0
[3,]
0
0
0
> y <- matrix(1:3)
> y
[1,]
[2,]
[3,]

[,1]
1
2
3

> P %*% y
[,1]
[1,] -0.3162
[2,] -2.2136
[3,] 0.0000
> sqrt(sum(y[1:2,]^2))
[1] 2.236
> sqrt(sum((P %*% y)^2))
[1] 2.236

I.4.12 Geometry of Matrices
We provide some details of the application to two-dimensional geometry. Each twodimensional vector represents a point; alternatively, a directed line segment from the
origin to this point. A 2 × 2 matrix postmultiplied by a vector transforms this point
to another point. Consider the orthogonal matrix


cos(θ) sin(θ)
M=
− sin(θ) cos(θ)

796

I Mathematics Preliminaries

2

v
u

y

1

θ

0

−1

−2
−2

−1

0

1

2

x
Fig. I.8 Rotation of a vector u and its coordinate system into v by angle θ.

and let u be a 2 × 1 vector representing a point in two dimensions. Then v = Mu
produces a new point v which is where u appears in the new coordinate system
formed by rotating the old one θ degrees around the origin. See Figure I.8.
If x and y are each
√ two-dimensional vectors and θ is the angle between them,
then cos(θ) = x y/ x x y y = corr(x, y), the correlation between these two vectors.
Note that if the vectors are orthogonal so that x y = 0, then cos(θ) = 0 and θ = 90◦ .

I.4.13 Eigenvalues and Eigenvectors
Next we study the concepts of eigenvectors and eigenvalues of an n × n symmetric
matrix V.
If Vξ = λξ, where ξ is an n × 1 vector and λ is a scalar, then λ is said to be
an eigenvalue of V with corresponding eigenvector ξ. Without loss of generality,
we can take the eigenvector to be normalized. Geometrically, the matrix V transforms its eigenvectors into multiples of themselves. Any two distinct eigenvectors
are orthogonal: ξi ξ j = 0, i  j. V can be written as its spectral decomposition

V = i λi ξi ξi . V can be written as its eigenvalue factorization V = ΞΛΞ  .
A matrix is nonsingular if and only if it has only nonzero eigenvalues. A matrix
is positive definite if and only if all of its eigenvalues are positive. The eigenvalues
of V −1 are the reciprocals of the eigenvalues of V. The determinant of a matrix

I.4 Topics in Matrix Algebra

797


equals the product of its eigenvalues |V| = λi , and calculating the eigenvalues is
normally the most efficient way to calculate the determinant. The trace of a matrix
equals the sum of its eigenvalues trace(V) = λi .
Consider the problem of choosing x to maximize x V x subject to x x = 1. The
maximum value is the largest eigenvalue of V and is attained when x is the eigenvector corresponding to this eigenvalue. Similarly, x V x is minimized (subject to
x x = 1) at the value of the smallest eigenvalue of V, occurring at the corresponding
eigenvector.
Here is an example of hand calculation of eigenvalues and eigenvectors. Consider
the 2 × 2 matrix

2 1
V=
1 1
Its 2 eigenvalues are the 2 scalar solutions λ to the equation |V − λI| = 0. Taking the
determinant
leads to (2 − λ)(1 − λ) − 1 = 0 =⇒ λ2 − 3λ + 1 = 0 =⇒ λ = (3 ±
√
5)/2, which expands to ≈ 2.618 or ≈ 0.382. (Note that we are explicit about the
approximation to 3 decimal digits. Our usual practice is not to round answers.) The
corresponding to λ ≈ 2.618 is the solution to the equation Vξ ≈
eigenvector ξ = ξξ11
21
2.618ξ. This implies that 2ξ11 + ξ21 ≈ 2.618ξ11 , and coupled with the normalization
2
2
+ ξ21
= 1 we find that ξ11 ≈ .8507 and ξ21 ≈ .5257. The eigenvector
restriction ξ11
corresponding to the other eigenvalue is found similarly.
As a geometric application of eigenvalues, consider the ellipse having equation

(x − μ) V −1 (x − μ) = b. In statistics, this ellipse based on the inverse V −1 is the form
of a confidence ellipse for μ. Let
 λ1 < λ2 be
the eigenvalues of V with corresponding
ξ12
=
,
ξ
normalized eigenvectors ξ1 = ξξ11
2
ξ22 . Then the semimajor axis of this ellipse
21
√
√
has length λ2 b and the semiminor axis has length λ1 b. The angle
 between the
.
extension of the semimajor axis and the horizontal axis is arctan ξξ12
22
2 1
Continuing the example, we calculate the eigenvalues of V = 1 1 in Table I.6
 
and graph the ellipse x V x = x 21 11 x = 1 in Figure I.9.

I.4.14 Singular Value Decomposition
Let M be an arbitrary r × c matrix, U the r × r matrix containing the eigenvectors
of MM  , and W the c × c matrix containing the eigenvectors of M  M. Let Δ be
the r × c matrix having δi j = 0 (i  j). If r ≥ c (the usual case in statistical
applications), define δii = the square root of the eigenvalue of M  M corresponding
to the eigenvector in the ith column of W. If r < c, define δii = the square root of
the eigenvalue of MM  corresponding to the eigenvector in the ith column of U.
Then the singular value decomposition of M is M = UΔW. Note that the number of

798

I Mathematics Preliminaries


Table I.6 Eigenvalues and eigenvectors of V =


2 1
.
1 1

> V <- matrix(c(2, 1, 1, 1), 2, 2)
> V
[1,]
[2,]

[,1] [,2]
2
1
1
1

> eV <- eigen(V)
> eV
$values
[1] 2.618 0.382
$vectors
[,1]
[,2]
[1,] -0.8507 0.5257
[2,] -0.5257 -0.8507

> sqrt(eV$val) ## semimajor and semiminor axis lengths
[1] 1.618 0.618
> ## angle of axes in radians
> atan(c(eV$vec[2,1]/eV$vec[1,1], eV$vec[2,2]/eV$vec[1,2]))
[1] 0.5536 -1.0172
> ## = -pi/2 ## right angle
> diff(atan(c(eV$vec[2,1]/eV$vec[1,1], eV$vec[2,2]/eV$vec[1,2])))
[1] -1.571

nonzero diagonal values in Δ is min(r, c). We show a numerical example in standard
mathematical notation in Table I.7. We show the same example calculated with the
svd function in Tables I.8 and I.9.

I.4 Topics in Matrix Algebra

799

1.0

0.5

y

0.0

λ2

λ1
−0.5

ξ1
ξ2

−1.0
−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

x



2 1
x = 1. We show the unit circle with the normalized eigenvectors,
Fig. I.9 Ellipse x V x = x
1 1
the ellipse, and the semimajor and semiminor axes whose lengths are the eigenvalues multiplied
by the square root of the eigenvectors.
Table I.7 Matrix multiplication of the components of the Singular Value Decomposition of the
matrix M in Table I.9).
⎡
⎤
⎢⎢⎢
⎥⎥⎥
⎢⎢⎢
⎥
⎢⎢⎢ 1 2 4 ⎥⎥⎥⎥⎥
⎥⎥
⎢⎢⎢⎢

M = UΔV = ⎢⎢ 3 3 5 ⎥⎥⎥⎥ =
⎢⎢⎢
⎥
⎢⎢⎢ 6 8 3 ⎥⎥⎥⎥⎥
⎢⎣
⎥⎦
462
⎤
⎡
⎤
⎤⎡
⎥⎥⎥ ⎡
⎢⎢⎢
⎥⎥⎥
⎥⎢
⎢⎢⎢
⎥⎥⎥ ⎢⎢⎢
⎥
⎢⎢⎢ −0.2561 0.6098 −0.6935 ⎥⎥⎥⎥⎥ ⎢⎢⎢⎢⎢
⎥⎥⎥ ⎢⎢⎢
⎥⎥⎥ ⎢⎢⎢ 14.481 0.000 0.000 ⎥⎥⎥ ⎢⎢⎢ −0.5383 −0.7246 −0.4302 ⎥⎥⎥⎥⎥
⎢⎢⎢
⎥
⎢⎢⎢ −0.4102 0.6334 0.5907 ⎥⎥⎥ ⎢⎢⎢
⎥⎥⎥ ⎢⎢⎢
⎥ ⎢ 0.000 4.324 0.000 ⎥⎥⎥ ⎢⎢⎢ −0.2099 −0.3791 0.9012 ⎥⎥⎥⎥⎥
⎢⎢⎢
⎢⎢⎢ −0.7125 −0.3674 0.1755 ⎥⎥⎥⎥⎥ ⎢⎢⎢⎣
⎥ ⎢⎣
⎦⎥
⎦
⎥⎦ 0.000 0.000 0.783
⎢⎣
0.8162 −0.5755 −0.0520
−0.5084 −0.3034 −0.3733

800

I Mathematics Preliminaries

Table I.8 Illustration of singular value decomposition, part I (to be continued in Table I.9).

> M <- matrix(c(1,3,6,4,2,3,8,6,4,5,3,2), 4, 3)
> M
[1,]
[2,]
[3,]
[4,]

[,1] [,2] [,3]
1
2
4
3
3
5
6
8
3
4
6
2

> M.svd <- svd(M)
> M.svd
$d
[1] 14.4806

4.3243

0.7825

$u
[1,]
[2,]
[3,]
[4,]

[,1]
[,2]
[,3]
-0.2561 0.6098 -0.6935
-0.4102 0.6334 0.5907
-0.7125 -0.3674 0.1755
-0.5084 -0.3034 -0.3733

$v
[,1]
[,2]
[,3]
[1,] -0.5383 -0.2099 0.81617
[2,] -0.7246 -0.3791 -0.57547
[3,] -0.4302 0.9012 -0.05198

> zapsmall(crossprod(M.svd$u))
[,1] [,2] [,3]
[1,]
1
0
0
[2,]
0
1
0
[3,]
0
0
1
> zapsmall(crossprod(M.svd$v))
[,1] [,2] [,3]
[1,]
1
0
0
[2,]
0
1
0
[3,]
0
0
1
> M.svd$u %*% diag(M.svd$d) %*% t(M.svd$v)
[,1] [,2] [,3]
[1,]
1
2
4
[2,]
3
3
5
[3,]
6
8
3
[4,]
4
6
2

I.4 Topics in Matrix Algebra

801

Table I.9 Illustration of singular value decomposition, part II (continued from Table I.8). Relation
between singular value decomposition and Eigenvalue decomposition.

> eigen(tcrossprod(M))
$values
[1] 2.097e+02 1.870e+01

6.123e-01 -5.978e-15

$vectors
[,1]
[,2]
[,3]
[,4]
[1,] -0.2561 0.6098 0.6935 -0.2857
[2,] -0.4102 0.6334 -0.5907 0.2857
[3,] -0.7125 -0.3674 -0.1755 -0.5714
[4,] -0.5084 -0.3034 0.3733 0.7143

> eigen(crossprod(M))
$values
[1] 209.6877 18.7000

0.6123

$vectors
[,1]
[,2]
[,3]
[1,] -0.5383 -0.2099 0.81617
[2,] -0.7246 -0.3791 -0.57547
[3,] -0.4302 0.9012 -0.05198

> M.svd$d^2
[1] 209.6877

18.7000

0.6123

I.4.15 Generalized Inverse
For any rectangular matrix X = UΔW  , the Moore–Penrose generalized inverse is
n×m
defined as
X − = WΔ−1 U 
Since Δ = diag(δi ) is a diagonal matrix, its inverse is Δ−1 = diag(δ−1
i ). The definition is extended to the situation when rank(X) < min(n, m) by using 0−1 = 0. See
Table I.10 for an example.
When rank(X) = m = n, hence the inverse exists, the generalized inverse is equal
to the inverse.

802

I Mathematics Preliminaries

Table I.10 Illustration of the generalized inverse.

> M <- matrix(c(1,3,6,4,2,3,8,6,4,5,3,2), 4, 3)
> M
[1,]
[2,]
[3,]
[4,]

[,1] [,2] [,3]
1
2
4
3
3
5
6
8
3
4
6
2

> library(MASS)
> Mi <- ginv(M)
> Mi
[,1]
[,2]
[,3]
[,4]
[1,] -0.7434 0.6006 0.22741 -0.35569
[2,] 0.4694 -0.4694 -0.06122 0.32653
[3,] 0.1808 0.1050 -0.06706 -0.02332
> zapsmall(eigen(M %*% Mi)$value)
[1] 1 1 1 0
> zapsmall(Mi %*% M)
[,1] [,2] [,3]
[1,]
1
0
0
[2,]
0
1
0
[3,]
0
0
1
> M.svd$v %*% diag(1/M.svd$d) %*% t(M.svd$u)
[,1]
[,2]
[,3]
[,4]
[1,] -0.7434 0.6006 0.22741 -0.35569
[2,] 0.4694 -0.4694 -0.06122 0.32653
[3,] 0.1808 0.1050 -0.06706 -0.02332

I.4 Topics in Matrix Algebra

803

I.4.16 Solving Linear Equations
There are three cases.

I.4.16.1 n = m = rank(X)
Given a matrix X and an n-vector y, the solution β of the linear equation
n×m

y = Xβ
is uniquely defined by
β = X −1 y
when X is invertible, that is when n = m = rank(X).

I.4.16.2 n > m = rank(X)
When n > m = rank(X), the linear equation is said to be overdetermined. Some
form of arbitrary constraint is needed to find a solution. The most frequently used
technique is least-squares, a technique in which β̂ is chosen to minimize the norm
of the residual vector.
⎛
⎞2
n ⎜
m


⎟⎟⎟
⎜⎜⎜
2
⎜⎝⎜yi −
min y − Xβ =
xi j β j ⎟⎟⎠⎟
β
i=1

j=1

The solution β̂ is found by solving the related linear equations
X  y = X  X β̂
The solution is often expressed as
β̂ = (X  X)−1 (X  y) = X − y
This is a definition, not an efficient computing algorithm. The primary efficient
algorithm in R is the QR algorithm as defined in qr and related functions, including lm.

804

I Mathematics Preliminaries

I.4.16.3 m > p = rank(X)
When m > p = rank(X), there are an infinite number of solutions to the linear
equation. The singular value decomposition X = UΔW  will have m − p zero values
along the diagonal of Δ. Let β0 be one solution. Then
 
0
βγ = β0 + W
γ
where 0 is a vector of p zeros and γ is any vector of length m − p, is also a solution.

I.5 Combinations and Permutations
I.5.1 Factorial
For a positive integer n, the notation n!, read “n factorial”, is used to indicate the
product of all the integers from 1 through n:
n! = n × (n − 1) × . . . × 1 = n × (n − 1)!
The factorial of zero, 0!, is separately defined to equal 1.
Thus
n
0
1
2
3
4
5
..
.

n
0
1
2
3
4
5
..
.

=
n
=
1
=
1
=
2
=
6
=
24
= 120
..
.
=

=
=
=
=
=
=
=
=

n((n − 1)!)
1
1×1
2×1
3×2
4×6
5 × 24
..
.

I.5.2 Permutations
The notation n P p , read “n permute p”, indicates the number of ways to select p
distinct items from n possible items where two different orderings of the same p
items are considered to be distinct. Equivalently, n P p is the number of distinct ways
of arranging p items from n possible items:

I.6 Exercises

805
n Pp

=

n!
(n − p)!

For example,
5 P3

=

5×4×3×2×1
5!
=
= 5 × 4 × 3 = 60
(5 − 3)!
2×1

I.5.3 Combinations
 
The notation nC p or np , read “n choose p”, indicates the number of ways to select
p distinct items from n possible items, where two different orderings of the same p
items are considered to be the same selection. Equivalently, nC p is the number of
distinct ways of choosing p items from n possible items:
 
n
n!
n Pp
=
=n C p =
p
p ! × (n − p)!
p!
For example,
 
5
5!
5×4×3×2×1
5×4
5C3
=
=
=
= 10
=
3
2!
3!(5 − 3)! (3 × 2 × 1)(2 × 1) 2 × 1

I.6 Exercises
Exercise I.1.
Start from the matrix in Equation (I.6)
⎛1
⎜⎜⎜
A = ⎜⎜⎜⎝ 1
1

1
2
3

2 ⎞⎟⎟
⎟
3 ⎟⎟⎟⎠
4

Give an example of a basis for A. Then give an example of a vector in C(A) and also
a vector not in C(A). Give an example of an orthogonal basis of A, demonstrating
that it is orthogonal.
Verify that Equation (I.6) defines a family of solutions to the set of linear equations
with p = rank(X) < m.

Appendix J

Probability Distributions

We list, with some discussion, several common probability distributions. We illustrate 21 distributions, 20 distributions in the R stats package and one in the HH
package, for which all three functions (d* for density, p* for probability, and q* for
quantile) are available. We also include the Studentized Range distribution for which
only the p* and q* functions are available, and the discrete multinomial and continuous multivariate normal. The d* functions give the density f (x) for continuous
distributions or the discrete density f (i) for discrete distributions. The p* functions
give the cumulative distribution, the probability that an observation is less than or
equal to the value x
⎧ x
⎪
⎪
⎪
f (x) dx for continuous distributions
⎪
⎪
⎪
⎪
⎪
⎨ −∞
F(x) = P(X ≤ x) = ⎪
x
⎪

⎪
⎪
⎪
⎪
⎪
f (i)
for discrete distributions
⎪
⎩
i=−∞

The q* functions give the quantiles F −1 (p), that is the inverse of the probability
function F(x).
In the example illustrations all three functions (d*, p*, and q*) are shown and
evaluated at sample X and for specific values of the parameters. For distributions
with finite support, the entire domain of x is shown. For distributions with infinite
support, the domain of x showing most of the probability is shown.
For the continuous distributions, we show the plot of the density function. The
darker color shows the probability (area) to the left of x, and the lighter color shows
the probability (area) to the right of x. d*(X) gives the height of the density function
at X, p*(X) gives the probability (area) to the left of X, and q*(p) recovers X from
the probability p.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

807

808

J Probability Distributions

For the discrete distributions, we show the plot of the discrete density function.
The darkest color shows the probability at X, the intermediate color shows the probability strictly left of X, and the lightest color shows the probability to the right of
X. d*(X) gives the probability at X, p*(X) gives the probability to the left of and
including X, and q*(p) recovers X from the probability p.
We list the continuous central distributions in Section J.1, the continuous noncentral distributions in Section J.2, and the discrete distributions in Section J.3. Within
each section the distributions are ordered alphabetically.

J.1 Continuous Central Distributions
J.1.1 Beta
dbeta(x, shape1 = 85.5, shape2 = 15.5)
10
8
6
4
2
0
0.0

0.2

0.4

0.6

0.8

1.0

x

>
dbeta(.85, shape1=85.5, shape2=15.5)
[1] 11.22
>
pbeta(.85, shape1=85.5, shape2=15.5)
[1] 0.5131
>
qbeta(0.5131489, shape1=85.5, shape2=15.5)
[1] 0.85

This two-parameter distribution is often used to model phenomena restricted to
the range (0, 1), for example sample proportions. It is used in Section 5.1.2 to construct alternative one-sided confidence intervals on a population proportion.

J.1 Continuous Central Distributions

809

J.1.2 Cauchy
dcauchy(x)
0.3

0.2

0.1

0.0
−20

−10

0

10

20

x

>
dcauchy(1.96)
[1] 0.06574
>
pcauchy(1.96)
[1] 0.8498
>
qcauchy(0.8498286)
[1] 1.96

The Cauchy distribution is the same as the t-distribution with 1 degree of freedom. It’s special feature is that it does not have a finite population mean.

J.1.3 Chi-Square
dchisq(x, df = 10)
0.10
0.08
0.06
0.04
0.02
0.00
0

5

10

15

x

>
dchisq(18.31, df=10)
[1] 0.01547
>
pchisq(18.31, df=10)
[1] 0.95
>
qchisq(0.9500458, df=10)
[1] 18.31

20

25

810

J Probability Distributions

The central χ2 distribution with k degrees of freedom is the distribution of the
sum of squares of k independent standard normal r.v.’s. If k > 2, a χ2 r.v. has a
unimodal, positively skewed PDF starting at zero and asymptotically tapering to
the horizontal axis for large values. The mean of this distribution is k, and k is also
approximately its median if k is large. The r.v. [(n−1)s2 ]/σ2 , where s2 is the variance
of a normal sample, has a χ2 distribution with n − 1 degrees of freedom.
This distribution is used in inferences about the variance (or s.d.) of a single
population and as the approximate distribution of many nonparametric test statistics,
including goodness-of-it tests and tests for association in contingency tables.

J.1.4 Exponential
dexp(x, rate = 0.6)
0.6
0.5
0.4
0.3
0.2
0.1
0.0
0

2

4

6

8

x

>
dexp(1, rate=.6)
[1] 0.3293
>
pexp(1, rate=.6)
[1] 0.4512
>
qexp(0.4511884, rate=.6)
[1] 1

μ is both the mean and standard deviation of this distribution. R parameterizes
the exponential distribution with the rate 1/μ, the reciprocal of the mean μ. Times
between successive Poisson events with mean rate of occurrence μ have the exponential distribution. The exponential distribution is the only distribution with the
“lack of memory” or “lack of deterioration” property, which states that the probability that an exponential random variable exceeds t1 + t2 given that it exceeds t1
equals the probability that it exceeds t2 .

J.1 Continuous Central Distributions

811

J.1.5 F
df(x, df1 = 4, df2 = 20)
0.6
0.4
0.2
0.0
0

1

2

3

4

5

x

>
df(3, df1=4, df2=20)
[1] 0.0469
>
pf(3, df1=4, df2=20)
[1] 0.9568
>
qf(0.956799, df1=4, df2=20)
[1] 3

The F distribution is related to the χ2 distribution. If Ui , for i = {1, 2}, is a χ2 r.v.
with νi degrees of freedom, and if U1 and U2 are independent, then F = U1 /U2 has
an F distribution with ν1 and ν2 df. This distribution is extensively used in problems
involving the comparison of variances of two normal populations or comparisons of
means of two or more normal populations.

J.1.6 Gamma
dgamma(x, shape = 3)
0.25
0.20
0.15
0.10
0.05
0.00
0

2

4

6

x

8

812

J Probability Distributions

>
dgamma(6, shape=3)
[1] 0.04462
>
pgamma(6, shape=3)
[1] 0.938
>
qgamma(0.9380312, shape=3)
[1] 6

From ?dgamma:
The Gamma distribution with parameters ’shape’ = a and ’scale’ = s has density
f (x) = 1/(sa Γ(a))x(a−1) e−(x/s)
for x ≥ 0, a > 0 and s > 0. (Here Γ(a) is the function implemented by R’s gamma() and
defined in its help. Note that a = 0 corresponds to the trivial distribution with all mass at
point 0.)
The mean and variance are E(X) = a × s and Var(X) = a × s2 .

The special case of the gamma distribution with shape=1 is the exponential
distribution.

J.1.7 Log Normal
dlnorm(x)
0.6
0.4
0.2
0.0
0

5

10

x

>
dlnorm(5)
[1] 0.02185
>
plnorm(5)
[1] 0.9462
>
qlnorm(0.9462397)
[1] 5

J.1 Continuous Central Distributions

813

A r.v. X is said to have a lognormal distribution with parameters μ and σ if
Y = ln(X) is N(μ, σ2 ); i.e., if Y is normal, then eY is lognormal. This is a positively
skewed unimodal distribution defined for x > 0. It is commonly used as a good
approximation for positively skewed data, such as a distribution of income.

J.1.8 Logistic
dlogis(x)
0.25
0.20
0.15
0.10
0.05
0.00
−4

−2

0

2

4

x

>
dlogis(2)
[1] 0.105
>
plogis(2)
[1] 0.8808
>
qlogis(0.8807971)
[1] 2

From ?dlogis:
The Logistic distribution
with location=m
and scale=s has distribution function


F(x) = 1/ 1 + exp (−(x − m)/s)
and density


f (x) = (1/s) exp ((x − m)/s) 1 + exp ((x − m)/s) −2 .
It is a long-tailed distribution with mean m and variance (π2 /3)s2 .
qlogis(p) is the same as the well known logit function, logit(p) = log(p/(1 − p)), the
log odds function, and plogis(x) has consequently been called the inverse logit.
The distribution function is a rescaled hyperbolic tangent, plogis(x) = (1 + tanh(x/2)) /2,
and it is called a sigmoid function in contexts such as neural networks.

814

J Probability Distributions

J.1.9 Normal
dnorm(x, m = 0, s = 1)
0.4
0.3
0.2
0.1
0.0
−2

−1

0

1

2

x

>
dnorm(1.645, m=0, s=1)
[1] 0.1031
>
pnorm(1.645, m=0, s=1)
[1] 0.95
>
qnorm(0.95, m=0, s=1)
[1] 1.645

This distribution was introduced in Section 3.4.2. If Z is standard normal N(0, 1),
the standard normal density φ and cumulative distribution Φ functions are
 2
z
1
φ(z) = √ exp
2
2π
 Z
φ(z) dz
Φ(Z) =
−∞

The general density, for random variable x with mean μ and variance σ2 , is


x − μ
(x − μ)2
1
1
exp
f (x | μ, σ2 ) = φ
= √
σ
σ
2σ2
2πσ2
The term probit is an alternate notation for the inverse function Φ−1 .
q = Φ−1 (p) = probit(P)
is the inverse function such that p = Φ(q).

J.1 Continuous Central Distributions

815

J.1.10 Studentized Range Distribution
ptukey(x, nmeans=4, df=12)
1.0
0.8
0.6
0.4
0.2
0.0
0

1

2

3

4

5

6

x

>
ptukey(4.199, nmeans=4, df=12)
[1] 0.95
>
qtukey(0.95, nmeans=4, df=12)
[1] 4.199

This distribution is used in the Tukey multiple comparisons procedure discussed
of
in Section 6.3. Let ȳ(1) and ȳ(a) denote the smallest and largest means of samples
√
size n drawn from a populations having a common variance σ2 , and let s = MSRes
be the estimate of σ calculated from the ANOVA table, for example, Table 6.2. Then
the random variable
Q=

ȳ(a) − ȳ(1)
√
s/ n

has a Studentized range distribution with parameters a and dfRes = a(n − 1). The
Studentized range distribution is defined on the domain 0 ≤ q < ∞. R provides the
ptukey and qtukey functions, but not the density function dtukey. We therefore
show the cumulative probability function instead of the density for the Studentized
range distribution.

816

J Probability Distributions

J.1.11 (Student’s) T
dt(x, df = 4)

0.3
0.2
0.1
0.0
−4

−2

0

2

4

x

>
dt(2, df=4)
[1] 0.06629
>
pt(2, df=4)
[1] 0.9419
>
qt(0.9419417, df=4)
[1] 2

This distribution was introduced in Section 3.4.3.

J.1.12 Uniform
dunif(x)
1.0
0.8
0.6
0.4
0.2
0.0
0.0

0.2

0.4

0.6

0.8

1.0

x

>
dunif(.7)
[1] 1
>
punif(.7)
[1] 0.7
>
qunif(.7)
[1] 0.7

All real numbers between a and b are equally likely. The standard case has a = 0
and b = 1. Hypothesis tests work by mapping an appropriate null distribution to the
uniform (with the appropriate p* function in R).

J.2 Noncentral Continuous Probability Distributions

817

J.1.13 Weibull
dweibull(x, shape = 4)
1.5

1.0

0.5

0.0
0.0

0.5

1.0

1.5

x

>
dweibull(1.3, shape=4)
[1] 0.5052
>
pweibull(1.3, shape=4)
[1] 0.9425
>
qweibull(0.9425075, shape=4)
[1] 1.3

From dweibull:
The Weibull distribution with ’shape’ parameter a and ’scale’ parameter b has density given
by
f (x) = (a/b) (x/b)(a−1) exp(−(x/b)a )
for x > 0. The cumulative distribution function is F(x) = 1 − exp(−(x/b)a ) on x > 0, the
mean is E(X) = bΓ(1 + 1/a), and the Var(X) = b2 Γ(1 + 2/a) − (Γ(1 + 1/a))2 .

J.2 Noncentral Continuous Probability Distributions
In hypothesis testing, except in special cases such as testing with the normal distribution, one deals with a central distribution when the null hypothesis is true and an
analogous noncentral distribution when the null hypothesis is false. Thus calculations of probabilities under the alternative hypothesis, as are required when doing
Type II error analysis and when constructing O.C. (beta curves) and power curves,
necessitate the use of noncentral distributions.
The forms of the t, chi-square, and F distributions we’ve considered thus far
have all been central distributions. For example, if X̄ is the mean and s the standard
deviation of a random
sample of size n from a normal population with mean μ, then
√
t = ( x̄ − μ)/(s/ n) has a central t distribution with n − 1 df. Suppose, however, that
the population mean is instead μ1 , different from μ. Then t above is said to have a
noncentral t distribution with n − 1 df and a noncentrality parameter proportional to

818

J Probability Distributions

((μ − μ1 )/σ)2 . (If μ = μ1 , so that the noncentrality parameter is zero, the noncentral
t distribution reduces to the central t distribution.)
A noncentral chi-square (χ2 ) r.v. is a sum of squares of independent normal r.v.’s
each with s.d. 1 but at least some of which have a nonzero mean. A noncentral F
r.v. is the ratio of a noncentral chi-square r.v. to a central chi-square r.v., where the
two chi-squares are independent.
For tests using the t, chi-square, or F distribution, the power of the test (protection
against Type II errors) is an increasing function of the noncentrality parameter.
Noncentral distributions are specified with one more parameter than their corresponding central distribution. Consequently, tabulations of their cumulative distribution function appear much less frequently than those for central distributions.
Fewer statistical software packages include them. There is no noncentral normal distribution. The distribution under the alternative hypothesis is just an ordinary normal
distribution with a shifted mean.
The R functions for noncentral t, chi-square, and F CDFs are the same as those
for the corresponding central distribution with the addition of an argument for the
noncentrality parameter ncp. The noncentrality parameter defaults to zero (hence to
a central distribution) if it is not specified.
The figures in Sections J.2.1, J.2.2, and J.2.3 show the noncentral distribution
along with the corresponding central distribution. This way it is possible to see that
a positive noncentrality parameter shifts the mode to the right.

J.2 Noncentral Continuous Probability Distributions

819

J.2.1 Chi-Square: Noncentral
dchisq(x, df = 10, ncp = 4)
0.10

non−central
central

0.08
0.06
0.04
0.02
0.00
0

10

20

30

x

>
dchisq(18.31, df=10, ncp=4)
[1] 0.0408
>
pchisq(18.31, df=10, ncp=4)
[1] 0.7852
>
qchisq(0.7852264, df=10, ncp=4)
[1] 18.31

See discussion in Section 14.8.2 and example in Figure D.1.

J.2.2 T: Noncentral
dt(x, df = 4, ncp = 2)
non−central
central

0.3
0.2
0.1
0.0
−2

0

2

4

x

>
dt(2, df=4, ncp=2)
[1] 0.3082
>
pt(2, df=4, ncp=2)
[1] 0.4557
>
qt(0.455672, df=4, ncp=2)
[1] 2

See examples in Figures 3.24, 5.2, and 5.10.

6

8

820

J Probability Distributions

J.2.3 F: Noncentral
df(x, df1 = 4, df2 = 20, ncp = 2)
non−central
central

0.6
0.4
0.2
0.0
0

2

4

6

x

>
df(3, df1=4, df2=20, ncp=2)
[1] 0.1062
>
pf(3, df1=4, df2=20, ncp=2)
[1] 0.871
>
qf(0.8710256, df1=4, df2=20, ncp=2)
[1] 3

See example in Section 6.5 and discussion in Section 14.8.2.

J.3 Discrete Distributions
Discrete distributions are defined to have nonzero values on a set of integers.
F(x) = P(X ≤ x) =

x


f (i)

i=−∞

The inverse functions (the q* functions in R) are sensitive to the precision of the
numerical representation.
Computers use finite precision floating point arithmetic, precise to 53 significant binary digits (bits)—approximately 17 decimal digits. They do not use the real
number system that we are familiar with. Simple decimal repeating fractions such
as these are not stored precisely with finite precision machine arithmetic. All of the
individual values in this example are automatically rounded to 53 bits when they
are entered into the computer. None of them are exactly represented inside the computer. See Appendix G for more on the floating point arithmetic used in computers.
In several of the discrete distribution examples here, it has been necessary to display the p values to 17 decimal digits in order to get the desired answer from the q*
functions.

J.3 Discrete Distributions

821

Here is a simple example, the 6-level discrete uniform (one fair die), to illustrate
the problem. In this example, rounding produces several results from the qdiscunif
function that are one unit off (either too large or too small).
F(i)
rounded to
i f (i) fraction 4 decimal digits
1 1/6 1/6
0.1667
2 1/6 2/6
0.3333
3 1/6 3/6
0.5000
4 1/6 4/6
0.6667
5 1/6 5/6
0.8333
6 1/6 6/6
1.0000

machine precision,
rounded to 53 bits,
≈ 17 decimal digits
0.16666666666666666
0.33333333333333331
0.50000000000000000
0.66666666666666663
0.83333333333333337
1.00000000000000000

> ## this is printing precision, not internal representation
> old.digits <- options(digits=7)
> ddiscunif(1:6, 6)
[1] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
> pdiscunif(1:6, 6)
[1] 0.1666667 0.3333333 0.5000000 0.6666667 0.8333333 1.0000000
> qdiscunif(pdiscunif(1:6, 6), 6)
[1] 1 2 3 4 5 6
> round(pdiscunif(1:6, 6), 4) ## rounded to four decimal digits
[1] 0.1667 0.3333 0.5000 0.6667 0.8333 1.0000
> ## inverse after rounding to four decimal digits
> qdiscunif(round(pdiscunif(1:6, 6), 4), 6)
[1] 1 1 3 4 4 6
> options(old.digits)

822

J Probability Distributions

J.3.1 Discrete Uniform
ddiscunif(x, size = 12)
0.08
0.06
0.04
0.02
0.00
1

2

3

4

5

6

7

8

9

10

11

12

x

>
ddiscunif(6, size=12)
[1] 0.08333
>
pdiscunif(6, size=12)
[1] 0.5
>
qdiscunif(.5, size=12)
[1] 6

In the discrete uniform distribution, the integers from 1 to n are equally likely.
The population mean is (n + 1)/2. The population variance is (n2 − 1)/12. The
distribution function is

J.3.2 Binomial
dbinom(x, size = 15, prob = 0.4)
0.20
0.15
0.10
0.05
0.00
0

1

2

3

4

5

6

7

8

x

9

10

11

12

13

14

15

J.3 Discrete Distributions

823

> ## probability of exactly 6 Heads
> dbinom(6, size=15, prob=.4)
[1] 0.2066
> ## probability of 6 or fewer Heads
> ## extra precision is needed
> print(pbinom(6, size=15, prob=.4), digits=17)
[1] 0.60981315570892769
> ## q, the number for which the probability of seeing
> ## q or fewer Heads is 0.60981315570892769
> qbinom(0.60981315570892769, size=15, prob=.4)
[1] 6

The binomial distribution was introduced in Section 3.4.1. If X has a binomial
distribution with parameters n (n=size, number of coins tossed simultaneously)
and p (p=prob, probability of one coin landing Heads on one toss), then the binomial distribution gives the probability of observing exactly X heads.

J.3.3 Geometric
dgeom(x, prob = 0.5)
0.5
0.4
0.3
0.2
0.1
0.0
0

1

2

3

4

5

x

>
dgeom(3, prob=.5)
[1] 0.0625
>
pgeom(3, prob=.5)
[1] 0.9375
>
qgeom(0.9375, prob=.5)
[1] 3

6

7

8

9

824

J Probability Distributions

From ?dgeom:
The geometric distribution with ’prob’ = p has density
p(x) = p(1 − p) x
for x = 0, 1, 2, ..., 0 < p ≤ 1.
The quantile is defined as the smallest value x such that F(x) ≥ p, where F is the
distribution function.

J.3.4 Hypergeometric
dhyper(x, m = 5, n = 6, k = 7)

0.4
0.3
0.2
0.1
0.0
0

1

2

3

4

5

6

7

x

>
dhyper(3, m=5, n=6, k=7)
[1] 0.4545
>
print(phyper(3, m=5, n=6, k=7), digits=17)
[1] 0.65151515151515149
>
qhyper(0.65151515151515149, m=5, n=6, k=7)
[1] 3

The hypergeometric distribution is used in Chapter 15. We sample n items without replacement from a population of N items comprised of M successes and N − M
failures. Then the number of successes X observed in the population is said to have
a hypergeometric distribution with parameters N, M, and n.

J.3 Discrete Distributions

825

J.3.5 Negative Binomial
dnbinom(x, size = 8.5, prob = 0.4)
0.06
0.04
0.02
0.00
0

5

10

15

20

25

30

x

>
dnbinom(17, size=8.5, prob=.4)
[1] 0.04338
>
print(pnbinom(17, size=8.5, prob=.4), digits=17)
[1] 0.81209497223034977
>
qnbinom(0.81209497223034977, size=8.5, prob=.4)
[1] 17

The negative binomial distribution with size = n and prob = p has density
Γ(x + n) n
p (1 − p) x
Γ(n) x!
for x = 0, 1, 2, . . ., n > 0, and 0 < p ≤ 1.
This represents the number of failures which occur in a sequence of Bernoulli
trials before a target number of successes is reached. The mean is μ = n(1 − p)/p
and variance is n(1 − p)/p2 .

J.3.6 Poisson
dpois(x, lambda = 4)

0.15
0.10
0.05
0.00
0

2

4

6

x

8

10

12

826

J Probability Distributions

>
dpois(7, lambda=4)
[1] 0.05954
>
print(ppois(7, lambda=4), digits=17)
[1] 0.94886638420715264
>
qpois(0.94886638420715264, lambda=4)
[1] 7

Let the random variable X be the number of occurrences of some event that are
observed in a unit of time, volume, area, etc., and let λ be the mean number of
occurrences of the event per unit, assumed to be constant throughout the process that
generates the occurrences. Suppose that the occurrence(s) of the event in any one
unit are independent of the occurrence(s) of the event in any other nonoverlapping
unit. Then X has a Poisson distribution with parameter λ.

J.3.7 Signed Rank
dsignrank(x, n = 5)

0.08
0.06
0.04
0.02
0.00
0

1

2

3

4

5

6

7

8

x

>
dsignrank(11, n=5)
[1] 0.0625
>
psignrank(11, n=5)
[1] 0.8438
>
qsignrank(0.84375, n=5)
[1] 11

See the discussion in Section 16.3.2.

9

10

11

12

13

14

15

J.4 Multivariate Distributions

827

J.3.8 Wilcoxon
dwilcox(x, m = 4, n = 12)

0.04
0.03
0.02
0.01
0.00
0

5

10

15

20

25

30

35

x

>
dwilcox(35, m=4, n=12)
[1] 0.02088
>
print(pwilcox(35, m=4, n=12), digits=17)
[1] 0.9148351648351648
>
qwilcox(0.9148351648351648, m=4, n=12)
[1] 35

See Table 16.7 for an example of a rank sum test.

40

45

50

828

J Probability Distributions

J.4 Multivariate Distributions
J.4.1 Multinomial
dmultinom(x, prob = c(1,2,5))

300
0.029

201

102

003

0.146

0.244

0.002

210

111
0.293

012

0.012

0.117

120

021

0.023

0.117

030

0.016

> ## This example is based on ?dmultinom in R
> ## all possible outcomes of Multinom(N = 3, K = 3)
> X <- t(as.matrix(expand.grid(0:3, 0:3)))
> X <- X[, colSums(X) <= 3]
> X <- rbind(X, 3:3 - colSums(X))
> dimnames(X) <- list(letters[1:3], apply(X, 2, paste, collapse=""))
> Y <- round(apply(X, 2, function(x) dmultinom(x, prob = c(1,2,5))), 3)
> rbind(X, Y)
003
102
a 0.000 1.000
b 0.000 0.000
c 3.000 2.000
Y 0.244 0.146

201
2.000
0.000
1.000
0.029

300
3.000
0.000
0.000
0.002

012
0.000
1.000
2.000
0.293

111
1.000
1.000
1.000
0.117

210
2.000
1.000
0.000
0.012

021
0.000
2.000
1.000
0.117

120
1.000
2.000
0.000
0.023

030
0.000
3.000
0.000
0.016

The (discrete) multinomial distribution is a generalization of the binomial distribution to the case of k > 2 categories. Suppose there are n independent trials, each
of which can result in just one of k possible categories such that p j is the probability
of resulting in the jth of these k categories. (Hence p1 + p2 + . . . + pk = 1.) Let X j
be the number of occurrences in category j. Then the vector (X1 , X2 , ..., Xk ) is said
to have a multinomial distribution with parameters n, p1 , p2 , . . . , pk . Its PMF is

J.4 Multivariate Distributions

829

P(X j = x j | j = 1, . . . , k) =

n!p1x1 p2x2 . . . pkxk
, x1 + x2 + . . . xk = n
x1 !x2 ! . . . xk !

If a proportion p j of a population of customers prefers product number j, j =
1, . . . , k, among k products, then the multinomial distribution provides the probability of observing any particular configuration of preferences among a random sample
of n customers.

J.4.2 Multivariate Normal
Bivariate Normal: dmvnorm(c(−1, −1))

z

column
row

> dmvnorm(c(-1, -1))
[1] 0.05855
> pmvnorm(upper=c(-1, -1))[1]
[1] 0.02517
> qmvnorm(0.02517, mean=c(0,0))$quantile
[1] -1

See Sections 3.3.5 and I.4 for examples.

Appendix K

Working Style

Working style in a computer environment depends on two interrelated concepts:
which programs you use and how you use them.
For statistical analysis we are recommending R as the primary computational
tool. There are other very good programs, and it is imperative that you develop a
working understanding of several of them. See Appendix H for information on how
to use the datasets discussed in this book with other software.
An excellent text editor is an indispensable tool for the statistical analyst. The
editor is the single program in which we spend most of our time. We use it for
looking at raw data, for writing commands in the statistical languages we use, for
reading the output tables produced by our statistical programs, for writing reports,
and for reading and writing correspondence about our studies to our clients, consultants, supervisors, and subordinates. We discuss our requirements for a text editor
in Section K.1. Our personal choice of editor is Emacs (Free Software Foundation,
2015), which we discuss in Appendix M. There are other excellent editors which
satisfy our requirements.
We discuss in Section K.2 the types of interaction that are possible with R.
We recommend in Section K.3 working with files of R commands. We discuss in
Section K.4 our recommendations for organization of the files within the operating
system’s directory structure.

K.1 Text Editor
As we indicated in the Preface on page ix, our goal in teaching statistical languages
is to make the student aware of the capabilities of the language for describing data
and their analyses. The language is approached through the text editor.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

831

832

K Working Style

We distinguish between the concepts of text editing (discussed in this Appendix)
and word processing (discussed in Appendix O). Text editing is moving characters
around on the screen with the expectation that they will stay where you put them.
This is critical when writing computer programs where the physical placement of
lines and characters on the page is part of what the computer program interprets. In
the R language the two layouts of the same characters in Table K.1 have completely
different interpretations.
Word processing is moving sentences, paragraphs, sections, figures, and crossreferences around. A word processor can be used as a text editor by manually turning
off many of the word processing features.
Table K.1 Two different interpretations of the same characters {“3”, “+”, “4”} that depend on their
placement on separate lines. If the first set of input lines were reformatted according to English
language paragraph formatting rules it would be interpreted as if it were the second set, which has
a completely different meaning. This is the simplest possible example of why we make a distinction
between text editing and word processing.

R Input
Two lines

One line

3
+ 4

3 + 4

R Output
> 3
[1] 3
> + 4
[1] 4

> 3 + 4
[1] 7

K.1.1 Requirements for an Editor
These are the requirements we place on any text editor that is to be used for interacting with a computing language:
1. Permit easy modification of computing instructions and facilitate their resubmission for processing
2. Be able to operate on output as well as input
3. Be able to cut, paste, and rearrange text; to search documents for strings of text;
to work with rectangular regions of text
4. Be aware of the language, for example, to illustrate the syntax with appropriate
indentation and with color and font highlighting, to detect syntactic errors in
input code, and to check the spelling of statistical keywords

K.2 Types of interaction with R

833

5. Handle multiple files simultaneously
6. Interact cleanly with the presentation documents (reports and correspondence)
based on the statistical results
7. Check spelling of words in natural languages
8. Permit placement of graphics within text.
9. Permit placement of mathematical expressions in the text.
10. Work with Unicode to give access to all character sets in all human languages.

K.1.2 Choice of Editor
Our preference for an editor is Emacs with ESS, which we discuss in Appendix M.
In Section M.5 we show how Emacs satisfies the requirements in Section K.1.1.
There are many other options for an editor, usually within the context of an Integrated Development Environment (IDE), a software application that provides comprehensive facilities to computer programmers for software development. An IDE
normally consists of a source code editor, build-automation tools, and a debugger.
For more information on IDEs see Wikipedia (2015).
For an annotated list, with links to their webpages, of other editors and IDEs that
work with R, see Grosjean (2012).
We discuss word processors such as MS Word (Microsoft, 2015) in Section O.1.
In Section O.1.1 we show how MS Word satisfies some of the requirements in
Section K.1.

K.2 Types of interaction with R
1. CLI Command Line Interface: The user types R statements and the system responds with requested output. Examples: R in a shell/terminal/CMD window, R
in the *R* buffer in Emacs, R in the Console window in the Rgui.exe for Windows, the R.app for Macintosh, and the JGR package on all operating systems.
2. Menu or Dialog Box: Dropdown lists of command names, often with default
settings for arguments. This allows point and click access to many R functions
with standard settings of options. The Rcmdr (R Commander) package described
in Appendix C is such a system.

834

K Working Style

3. Spreadsheet. The RExcel system described in Appendix D allows access to all R
functions from within the automatic recalculation mode of Excel for Windows.
The Rcmdr interface is incorporated into the Excel menu ribbon.
4. Web-based interface. Technology for embedding R applications within an html
page to provide interactive access to R functions for non-programmers. See Appendix E for a discussion of the shiny system.
5. Document based interface. The end user writes a document (book, paper, report)
in a standard document writing system (LATEX or Word, for example) with embedded R code. Three examples are Sweave (Leisch and R-core, 2014), knitr
(Xie, 2015), and SWord (Baier, 2014). The second edition of HH is written using
Sweave (see help(Sweave, package="utils")). All R code, leading to all
graphs and tables in the book, is included in the LATEX source files for the chapters. The code files for the HH package, located with the HHscriptnames()
function, were pulled from the LATEX files by the Stangle function which is part
of the Sweave framework.
6. GUI Graphical User Interface: anything other than the command line interface.

K.3 Script File
Our personal working style (and the one we recommend to our readers) is to write a
file of commands (a script file with extension .R or .r) that specify the analysis, run
the commands and review the graphs and tables so produced, and then correct and
augment the analysis specifications. We construct the command file interactively.
Initially we write the code to read the data into the program and to prepare several
graphs and the initial tables. If there are errors (in typing or in programming logic),
we correct them in the file and rerun the commands. As we progress and gain insight
into what the data say, we add to the code to produce additional graphs and tables.
When we have finished, we have a file of commands that could be run as a batch job
to produce the complete output. We also have a collection of graphs and tables that
can be stored in additional files.

K.4 Directory Structure
When we have many files, we need to organize them within the directory structure
of our computer’s operating system.

K.4 Directory Structure

835

K.4.1 Directory Structure of This Book
This book is written in LATEX (Lamport, 1994). Our organizational structure uses the
directory structure provided by the computer’s operating systems. For this book we
have a main directory (hh2) and subdirectories for code (containing .R files), transcripts (.Rout) files, and figures (.pdf) files. The main directory contains all the
.tex files (one per chapter plus a master file), our LATEX style file (hh2.sty), and
several other support files. We have a work directory parallel to the hh2 directory
for experiments with R code, for correspondence with our publisher, and for correspondence with the people who were kind enough to read and comment on drafts of
this book.
The R functions that we developed for the book, and all datasets used in the book,
are collected into the HH package (Heiberger, 2015) distributed through CRAN.
The master copy of the R scripts for all figures and tables in the second edition
is included in the *.tex source files for the individual chapters. We use the Sweave
functions included in R to execute the code directly from the *.tex files. When
we are ready to distribute the code, we pull the R code from the *.tex files with
the Stangle function (part of R’s Sweave system) and place them into the HH
package. The script files in the HH package for the book can be located by the
package user with the HHscriptnames() function.
We hope that readers of our book, and more generally users of R, design and
collect their own functions into a personal package (it doesn’t have to be distributed
through CRAN). Once you have more than a few functions that are part of your
working pattern, maintaining them in a package is much simpler than inventing
your own idiosyncratic way of keeping track of what you have. We say a few words
about building a package in Appendix F and refer you to the R document
system.file("../../doc/manual/R-exts.pdf")
for complete information.

K.4.2 Directory Structure for Users of This Book
It is critical to realize that your work is yours. Your work is not part of the computer’s
operating system, nor of the installed software. The operating system, the editor, R,
and other software are kept in system directories. In order to protect the integrity of
the system you will usually not have write access to those locations.
You need a home directory, the one where you keep all your personal subdirectories and files. Your operating system automatically creates a HOME directory for you.
R can find its name with the command Sys.getenv("HOME"). Everything you do
should be in a subdirectory of your home directory.

836

K Working Style

For example, each time we teach a course, we create a directory for that course.
For last year’s course based on this book, we used the directory 8003.f14 directly
under the HOME directory. We have a subdirectory of the course directory for
the syllabus, for each class session, and for student records. We keep handouts,
R scripts, and transcripts of the class R session as files within each class session’s
directory.

K.4.3 Other User Directories
We recommend a separate directory for each project. It will make your life much
easier a year from now when you try to find something.

Appendix L

Writing Style

Reports, including homework exercises, based on computer printout must be typed
correctly. We recommend LATEX (the standard required by many statistics and mathematics journals, and the typesetting package with which we wrote this book). We
do accept other word processing software. Whichever software you use, you must
use it correctly. We discuss in this appendix some of the fundamentals about good
technical writing.

L.1 Typographic Style
Specific style issues that you must be aware of are
1. Fonts: Computer listings in R (and S-Plus and SAS and many other statistical
software systems) are designed for monowidth fonts (such as Courier) where all
characters (including spaces) are equally wide. These listings are unreadable in
Times Roman or any other proportional font. English text looks best in a proportional font (where, for example, the letter “M” is wider than the letter “i”) such
as Times Roman. Table L.1 shows the difference with simple alphabetic text.
Table L.2 shows an example of the issue for computer listings. The Courier rendition is consistent with the design of the output by the program designer. The
Times Roman is exactly the same text dropped into an environment that is incorrectly attempting to space it in accordance with English language typesetting
rules. In our classes, we return UNREAD any papers that use a proportional
font for computer listings.
2. Alignment: Numbers in a column are to be aligned on decimal points. Alignment
makes it possible to visually compare printed numbers in columns. There are two
© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

837

838

L Writing Style

Table L.1 We placed a vertical rule after four characters on each line in both Courier and Times
Roman. The Courier rules are aligned because all characters in Courier are exactly the same width.
The Times Roman rules are not aligned because each character in Times Roman has its own width.
Courier

Times Roman

Wide and narrow.
Letters in each row align

Wide and narrow.
Letters in each row do not align

with letters in the previous row.

with letters in the previous row.

Table L.2 R output displayed correctly in a monowidth font, and incorrectly in a proportional
font. The same text, the output from an R summary.data.frame call, is displayed in both fonts.
The unequal width of the characters in the Times Roman font destroys the vertical alignment that
is necessary for interpretation of this listing.

Courier (correct spacing)
> summary(ex0221)
weight
code
Min.:23.20
1:35
1st Qu.:24.75
2:24
Median:25.70
Mean:25.79
3rd Qu.:26.50
Max.:31.10

Times Roman (incorrect spacing)
> summary(ex0221)
weight code
Min.:23.20 1:35
1st Qu.:24.75 2:24
Median:25.70
Mean:25.79
3rd Qu.:26.50
Max.:31.10

Table L.3 Alignment of decimal points in the numbers on the left makes it easy to compare the
magnitudes of the numbers. Centering of the numbers on the right, and therefore non-alignment of
decimal points, makes it difficult to compare the magnitudes of the numbers.

Correct
123.45
12.34
−4.32
0.12

Wrong
123.45
12.34
−4.32
0.12

reasons for getting it wrong. One is carelessness. The other is blind copying from
a source that gets it wrong. We show an example of both alignments in Table L.3.
3. Minus signs and dashes: There are four distinct concepts that have four different typographical symbols in well-designed fonts. On typewriters all four are

L.1 Typographic Style

839

usually displayed with the symbol “-” that appears on the hyphen key (next to
the number 0). You are expected to know the difference and to use the symbols
correctly.
Table L.4 shows an example of the correct usage of all four symbols and the keys
in LATEX and MS Word.
Table L.4 Correct usage of all four dash-like symbols (- – − —) and the keys to generate them in
LATEX and MS Word.

–
−
—

Symbol

Use

Example

LATEX

MS Word

hyphen
en dash
minus
em dash

compound word
range
negation
apposition

t-test
100–120
−12
punctuation—like this

-$-$
---

ctrl-num Insert-menu/symbol. . . / −
alt-ctrl-num -

The misuse of dashes that touches my (rmh) hot button the most is misuse of
hyphen when minus is meant, for example
correct
+12.2
−12.2

WRONG
+12.2
-12.2

In this wrong usage, the “+” and “-” are not aligned and consequently the decimal
point is not aligned.
4. Right margins and folding: Table L.5 intentionally misuses formatting to illustrate how bad it can look. This usually occurs when the R window width is
wider than your word processor is willing to work with. Verify that you picked
an options()$width consistent with your word processor. You can make the
width narrower in R by using the R command
options(width=72)

5. Quotation marks: Quotation marks in Times Roman are directional. This is
“Times Roman correct” (with a left-sided symbol on the left and a right-sided
symbol on the right). This is ”Times Roman incorrect” (with the right-sided symbol on both sides). In the typewriter font recognized by R, quotation marks are
vertical, not directional. and are the same on both sides. In typewriter font, this
is "typewriter correct" (same non-directional symbol on both sides).

840

L Writing Style

Table L.5 Intentional misuse of formatting. Never turn in anything that looks like these bad
examples. The top section has lines of text that extend beyond the right margin of the page. The
middle section is an R table that has been arbitrarily folded at 49 mono-spaced letter widths. The
bottom section retains the lines, but places them in a proportional font and ignores the alignment
of columns.

Do not allow the right margins of your work to run off
the edge of the page. It is hard to read text that isn’t visible.
Do not allow lines to be arbitrarily folded in a way that destroys the formatting.
This is particularly a problem if you copy output from
the R Console window to an editor in a word-processing mode. Word processors (as distinct from text editors) by default enforce
English-language conventions (such as maximum line
length and proportional font) on code and output text that is designed for column alignment and a fixed-width font. Use an editor,
such as Emacs with ESS, that is aware of the R formatting conventions. Most
word processors do have an option to set sections in
fixed-width Courier font and to give the write control of margins.

⎧
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
Folding
⎪
⎪
⎪
⎪
⎪
makes
⎪
⎪
⎨
this table
⎪
⎪
⎪
⎪
⎪
impossible to ⎪
⎪
⎪
⎪
⎪
⎪
read.
⎪
⎪
⎪
⎪
⎪
⎪
⎩

Column
alignment
ignored.
Table is
unreadable.

⎧
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎨
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎩

Sum of
Squares

Source
re
F Value

DF
Pr > F

Model
65
54.92
Error
28
Corrected Total

2
2649.816730
<.0001
44
1061.382419
46

3711.199149

Sum of
Source DF Squares Mean Square F Value Pr > F
Model 2 2649.816730 1324.908365 54.92 <.0001
Error 44 1061.382419 24.122328
Corrected Total 46 3711.199149

Mean Squa

1324.9083
24.1223

L.2 Graphical Presentation Style

841

L.2 Graphical Presentation Style
Graphs designed for someone to read must be legible. Legibility includes the items
listed in this section and in Chapter 4 (some of which are repeated here).

L.2.1 Resolution
Figure L.1 shows the same graph drawn on a vector graphics device (pdf() in this
example) and a bitmap device (png() in this example).
Vector graphics devices define objects to be drawn in terms of geometrical primitives such as points, lines, curves, and shapes or polygons. Vector graphics can be
magnified without pixalation. Current vector graphics formats are pdf, ps, and wmf.
By contrast bitmap (or raster) graphics devices define objects in terms of the
set of pixels used in one specific magnification. Further magnification gives larger
discrete dots and not smooth objects. Current bitmap formats are png, bmp, tif,
and gif.
Figure L.1 panels a and c are drawn with a vector graphics device. They are
clear and crisp at any magnification (two magnifications are shown here). Figure
L.1 panels b and d are drawn with a bitmapped graphics device. Panel b is not clear,
and the magified version in panel d is granular and fuzzy.

L.2.2 Aspect Ratio
The graphs are initially drawn for the device size they see at the time they are drawn.
The aspect ratio (the ratio of the width to height in graphic units) is set at that time.
Plotting symbols and text are positioned to look right at that initial magnification
with that size device. The graphs in Figure L.1 honor the aspect ratio. Both the x
and y dimensions are scaled identically in those panels.
Changing the aspect ratio after the graph has been drawn interferes with the message of the graph. It is most evident when both axes use the same units, but is visible
even when the units are different.
The graph in Figure L.2 does not honor the aspect ratio, and the graph becomes
very hard to read. The width is stretched to twice its initial size and the height is
left at the original height. As a consequence, the circles used as plotting symbols
are stretched to ellipses. The font used for the labels is stretched to visually change
the shapes of the letters. About half of the zero characters look like circles. The

842

L Writing Style

a. PDF (vector) device

b. PNG (bitmap) device

80

life.exp

75
70
65
60
55

0 100 200 300 400 500 600

ppl.per.tv
c. PDF (vector) device magnified

d. PNG (bitmap) device magnified

60
55

0 100 200

pp
Fig. L.1 Panels a and b were drawn with the same command addressed to different devices. Panel
a uses the pdf vector device and panel b uses the png bitmap device. Even at low magnification
the difference between the two images is clear. The circle glyphs, the text, and the lines are crisp
on the vector device. The circle glyphs, the text, and the lines are granular on the bitmap device.
Panels c and d are the lower left corners of panels a and b magnified by a scale of 2. The vector
display is just as crisp at this magnification. The bitmap display is more granular and fuzzy. Not
even the straight lines are clear in panel d.

L.2 Graphical Presentation Style

843
PNG (bitmap) device stretched

Fig. L.2 The figure here is the same plot shown in Figure L.1 panel b, this time with its xdimension stretched. The circular glyphs are now ellipses. The thin numeral “0” are now circles.
The vertical straight lines are even fuzzier than before.

other half look different because the pixel break points are placed differently on the
underlying letter shapes.
It is possible to damage the aspect ratio even with a good typographic system
(I intentionally did so in Figure L.2 using LATEX). Under normal circumstances in
LATEX the aspect ratio is retained.
It is too easy to damage the aspect ratio with a drag-and-drop editing system
by stretching an image at the top, bottom, or sides of an image. It is also easy to
maintain the aspect ratio when controlling the size of an image, by stretching only
at the corners, and never stretching the top, bottom, or sides of an image.

L.2.3 Other Features
Other features to be aware of were discussed and illustrated in Chapter 4 and are
summarized here.
1. In scatterplot matrices, a NW–SE main diagonal has a consequence of multiple
axes of symmetry that interfere with easy reading. The (SW–NE) main diagonal
(the defaul in splom has a single axis of symmetry both within and between
panels. See the examples and discussion in Section 4.7 for more detail.
2. The panels in a scatterplot matrix should be square to emphasize the equivalence
of rows and columns. The intention of the scatterplot matrix is the comparison of

844

L Writing Style

y ~ x with x ~ y, and maintaining the same scale in both orientations facilitates
the comparison of reflected panels. Compare Figures 4.10 and 4.11.
3. Choice of plotting symbols, open circles, closed circles, other symbols (triangles,
squares, etc), letters. See the itemized discussion in Section 4.1.
4. Choice of colors. See Section 4.10 for discussion of color vision.
5. Location of ticks and labels inside or outside the panels. In scatterplot matrices,
placing ticks and labels inside the main diagonal makes the frees more surface
area for the panels themselves. Compare Figures 4.10 and 4.11.
6. Space between panels. Compare Figures 4.10 and 4.11.

L.3 English Writing Style
1. Check spelling
• Select the right homophone (words that sound alike): brake vs break. Both
spellings will be accepted by a spell-checking program.
• Learn to spell technical words correctly. The following words seem to be particularly liable to misspelling:
– separate: The fourth letter is “a”.
– correlation: The letter “r” is doubled.
– collinear: The letter “l” is doubled.
– stationary: not moving
– stationery: writing paper and envelopes
– symmetric: The letter “m” is doubled.
– asymmetric: The letter “s” is single.
– Tukey: John W. Tukey
– turkey: a bird
• “p-value” is preferred (with p in math italic). “P-value” is not OK (with “P”
in uppercase roman).
• Spell people’s names correctly and with proper capitalization (John W. Tukey,
Dennis Cook).
2. Punctuation.
“.” “:” “,” “;” always touch the preceding character. They always have a space
after them.

L.4 Programming Style and Common Errors

845

L.4 Programming Style and Common Errors
1. Data entry: Use real variable names. The default variable names “X1 ” and “X2 ”
carry no information. Variable names like “height” and “weight” carry information.
2. Data entry: probably you don’t need to do that step. Don’t reenter by hand the
numbers that you already have in machine-readable form.
3. Use dump("varname","") to get ASCII versions of R variables that can be
read back into R with no loss (of digits, labeling, attributes). The output from
the dump can be copied into email and copied back from email. The output from
the simpler print commands will frequently get garbled in email. See Table L.6
for an illustration where the original class of two of the columns is lost when
we neglected to use the dump function.
4. The R function splom() for scatterplot matrices by default gives easy-to-read
plots with a single axis of symmetry over the entire set of square panels. The R
pairs() function (for all pairwise two-variable graphs) by default gives many
conflicting axes of symmetry and rectangular panels. See Figure 4.12 and the
accompanying discussion.
5. Analyze the experiment given you. Don’t ignore the block factor. Usually the
block sum of squares is removed first, before looking at the treatment effects.
In R, this is done by placing the block factor first in the model formula, for example, in Section 12.9 we use aov(plasma ~ id + time) so the sequential
analysis of variance table will read
id
time
Residuals
This way, in non-balanced designs, the sequential sum of squares for the treatment factor (time in this example) is properly adjusted for the blocking factor id.
6. We normally recommend the use of the R command language, not a menu system, when you are learning the techniques. You will get
• much better-looking output
• more control
• the ability to reproduce what you did
7. When GUI point-and-click operations have been used to construct preliminary
graphical (or tabular) views of the data, the commands corresponding to these
operations are frequently displayed. Rcmdr for example displays the generated
commands in its R Script window (see Figure C.14 for an example). These
commands can then be used as components in the construction of more complex
commands needed to produce highly customized graphs.

846

L Writing Style

Table L.6 We construct a data.frame and then display it twice, once by typing the variable name,
the second time by using the dump function. When we re-enter the typed text back to R, we lose the
structure of the data.frame. When we re-enter the dumped structure, we retain the original structure.

> tmp <- data.frame(aa=1:3, bb=factor(4:6), cc=letters[7:9],
+
dd=factor(LETTERS[10:12]), stringsAsFactors=FALSE)
> str(tmp)
’data.frame’: 3 obs. of 4 variables:
$ aa: int 1 2 3
$ bb: Factor w/ 3 levels "4","5","6": 1 2 3
$ cc: chr "g" "h" "i"
$ dd: Factor w/ 3 levels "J","K","L": 1 2 3
> tmp
aa bb cc dd
1 1 4 g J
2 2 5 h K
3 3 6 i L
> dump("tmp", "")
tmp 
+
+
+
+
+

tmp <- read.table(text="
aa bb cc dd
1 1 4 g J
2 2 5 h K
3 3 6 i L
", header=TRUE)

> sapply(tmp, class)
aa
bb
cc
"integer" "integer" "factor"
>
+
+
+
+
+

dd
"factor"

tmp  sapply(tmp, class)
aa
bb
cc
"integer"
"factor" "character"

dd
"factor"

L.5 Presentation of Results

847

8. Store results of an R function call in a variable to permit easy extraction of various displays from the results. For example,
mydata <- data.frame(x=1:6, y=c(1,4,2,3,6,2))
my.lm <- lm( y ~ x , data=mydata)
summary(my.lm) ## summary() method on lm argument
old.mfrow <- par(mfrow=c(2,2)) ## four panels on the graphics device
plot(my.lm)
## plot() method on lm argument
par(old.mfrow)
## restore previous arrangement
coef(my.lm)
## coef() method on lm argument
anova(my.lm)
## anova() method on lm argument
resid(my.lm)
## resid() method on lm argument
predict(my.lm) ## predict() method on lm argument

9. Analysis of Variance requires that the classification factor be declared as a
factor. Otherwise you will get a nonsense analysis. The wrong degrees of
freedom for a treatment effect is usually the indicator that you forgot the
factor(treatment) command in R.
10. The degrees of freedom for a problem always comes from the Residual or
ERROR line of the ANOVA table. In multiple-stratum models, the Residual
line in each stratum provides the comparison value (denominator Mean Square
and degrees of freedom) for effects in that stratum.
11. Please use par(mfrow=c(2,2)) (as illustrated above in item 8) for plotting the
results of an lm() or aov(). That way the plot uses only one piece of paper,
not four.
12. All comparable graphs must be on the same scale—on the same axes is often
better. See Section 17.6.2, especially Figure 17.19, for an example of comparable scaling in the top panels and noncomparable scaling in the bottom panels.
See Figure 4.9 for comparable scaling in Panels a and b and noncomparable
scaling in Panels c, d, and e.

L.5 Presentation of Results
This list is designed for our classroom setting. It is more generally applicable.
1. Use the minus sign “−4” in numbers. We do not accept hyphens “-4”. See
Table L.4.

848

L Writing Style
Normal Q−Q

0.0
−1.0
−2.0

Standardized residuals

1.0

2

4

6

−1.5 −1.0 −0.5

0.0

0.5

1.0

1.5

Theoretical Quantiles
lm(y ~ x)

Fig. L.3 qqplot of a regression of noise. The usual three “largest” residuals are labeled, and none
of the labeled residuals are big.

2. For multiple comparisons we can use the MMC (Section 7.2) default of Tukey
pairwise comparisons. The mmc and mmcplot functions in the HH package are
built on the glht function in the multcomp package.
3. We don’t do multiple comparisons of blocks. We know there are differences.
That is why we chose to use that factor as a block, not as a treatment. See
Section 12.6.
4. Write an experiment description that tells the reader how to reproduce the
experiment.
5. Distinguish between “bigger than the others” and “big”. The plot.lm() labels the three biggest points. It doesn’t care if they are significantly big. In
Figure L.3, for example, the qqplot of a regression of random noise, the usual
three “largest” residuals are labeled, and none of the labeled residuals are big.
6. summary.lm(...) doesn’t usually provide interesting information in designed
experiments. summary.aov(..., split=list()) is frequently interesting.
The ANOVA table in Table 12.13, for example, shows the partition of the 10-df
“strain nested within combination” into two easily interpretable 5-df sums of
squares, “strain within clover” and “strain within clover+alfalfa”. It is easy to
interpret these partitioned sums of squares along with the interaction means at
the bottom of Table 12.12 and in the upper-left panel of Figure 12.12.
Had we used summary.lm we would have gotten information in Table 12.15 on
the regression coefficients for the dummy variables, and we would need to see
the dummy variables in Table 12.14 for the coefficients themselves to make any
sense.
7. Always state the conclusions in the context of the problem.

L.5 Presentation of Results

849

8. Please do not turn in impossible or illegal formatting. For example, the line
breaks in the middle of words or character strings in Table L.7 are unacceptable.
Table L.7 Impossible formatting in R output.

## line break in the middle of a word
Residual standard error: 0.4811 on 10 degrees of fre
dom
## wrong: line break in string
plot(y ~ x, main="abc
def")
plot(y ~ x,
main="abc def")

### correct

9. Do not turn in lists or tables of data values in homework assignments. We know
the data values. In the classroom situation we gave them to you. You may show
a few observations to verify that you have read the data correctly. For example:
data(gunload)
head(gunload)
10. On the other hand, plots of the data are very interesting. We usually expect to
see appropriate plots of the data (scatterplot matrix, interaction plots, sets of
boxplots) and of the analysis.
11. We do not want a cover page for homework. This is our personal style. A classfull of essentially empty cover pages weighs too much and wastes paper.
12. Use spacing for legibility, for example:
abc<--5+3
abc <- -5 + 3

is hard to read.
is easy to read.

13. When you copy output, particularly by mouse from a document in a monowidth
font to one with a default proportionally spaced font, make sure you keep the
original spacing and indentation.
14. Short, complete answers are best.

Appendix M

Accessing R Through a Powerful Editor
—With Emacs and ESS as the Example

This Appendix is a discussion of the use of a powerful editor and programming environment. We use Emacs terminology and examples because we are most familiar
with it—we use Emacs with ESS as our primary editing environment. One of us
(RMH) is a coauthor of ESS.
Much of the discussion applies with only small changes to use of many of the
other high-quality editors. See Grosjean (2012) for an annotated list (with links) of
other editors that are used in programming R.
Emacs (Stallman, 2015) is a mature, powerful, and easily extensible text editing
system freely available under the GNU General Public License for a large number
of platforms, including Linux, Macintosh, and Windows. Emacs shares some features with word processors and, more importantly, shares many characteristics with
operating systems. Most importantly, Emacs can interact with and control other
programs either as subprocesses or as cooperating processes.
The name “Emacs” was originally chosen as an acronym for Editor MACroS.
Richard M. Stallman got a MacArthur genius award in 1990 for the development
of Emacs. Emacs comes from the Free Software Foundation, also known as the
GNU project (GNU is Not Unix).
Emacs provides facilities that go beyond simple insertion and deletion: viewing
two or more files at once (see Figures M.1 and M.2); editing formatted text; visual
comparison of two similar files (Figure M.1); and navigation in units of characters,
words, lines, sentences, paragraphs, and pages. Emacs knows the syntax of each
programming language. It can provide automatic indentation of programs. It can
highlight with fonts or colors specified syntactic characteristics. Emacs is extensible using a dialect of Lisp (Chassell, 1999; Graham, 1996). This means that new
functions, with user interaction, can be written for common and repeated text editing
tasks.
ESS (Mächler et al., 2015; Rossini et al., 2004) extends Emacs to provide a
functional, easily extensible, and uniform interface for multiple statistical packages.
© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

851

852

M Accessing R Through a Powerful Editor—With Emacs and ESS as the Example

One of us (RMH) is a coauthor of ESS. Several of the other coauthors are members of R-Core, the primary authors of R itself. Currently ESS works with R, S+,
SAS, Stata, OpenBUGS/JAGS, and Julia. The online documentation includes an
introduction in file ESS/ess/doc/intro.pdf (an early version of Rossini et al.
(2004)). Online help is available from within Emacs and ESS.

M.1 Emacs Features
M.1.1 Text Editing
Most programming and documentation tasks fall under the realm of text editing.
This work is enhanced by features such as contextual highlighting and recognition
of special reserved words appropriate to the programming language in use. In addition, editor behaviors such as folding, outlining, and bookmarks can assist with
maneuvering around a file. We discuss in Appendix K the set of capabilities we
expect a text editing program to have. Emacs automatically detects mismatched
parentheses and other types of common syntax and typing mistakes.
Typesetting and word processing, which focus on the presentation of a document,
are tasks that are not pure text editing. Emacs shares many features with word
processing programs and cooperates with document preparation systems such as
LATEX (discussed in Section N) and html (discussed in Appendix E).
We strongly recommend that students in our graduate statistics classes use
Emacs as their primary text editor. The primary reason for this recommendation is
that Emacs is the first general editor we know of that fully understands the syntax
and formatting rules for the statistical language R that we use in our courses. Other
editing systems designed to work with R are described and linked to in the webpage
provided by Grosjean (2012). Emacs has many other advantages (listed above), as
evidenced by Richard Stallman having won a MacArthur award in 1992 for developing Emacs.

M.1.2 File Comparison
Visual file comparisons are one of the most powerful capabilities provided by
Emacs. Emacs’ ediff function builds on the standard Unix diff command and
is therefore immediately available for Linux and Macintosh users. Windows users
must first install Rtools and place Rtools in the PATH (see Section F.6).

M.1 Emacs Features

853

Fig. M.1 ediff of two similar files. All mismatches between the two files are detected and
highlighted. The ediff control frame shows that we are currently in the third of three detected
differences between the two files. The matching sections of the third chunk are highlighted in light
pink in the top buffer and light green in the bottom buffer. The mismatching sections of the third
chunk are highlighted in darker pink in the top buffer and darker green in the bottom buffer.

For a simple example, let us compare the current version of our R script for a
homework exercise with the first version we started yesterday. Figure M.1 shows
the comparison.

M.1.3 Buffers
Emacs can work with many multiple files simultaneously. It brings each into a
buffer. A buffer is copy of a file within the Emacs editor. Any editing changes
made to the contents of a buffer are temporary until the buffer is saved back into
the file system. A buffer can hold a file, a directory listing, an interactive session
with the operating system, or an interactive instance of another running program.
Emacs allows you to open and edit an unlimited number of files and login sessions
simultaneously, running each in its own buffer. The files or login sessions can be
local or on another computer, anywhere in the world. You can run simultaneous
multiple sessions. The size of a buffer is limited only by the size of the computer.
One of us (RMH) normally has several buffers visible and frequently has hundreds
of open buffers (several chapters, their code files, their transcript files, the console
buffer for R, help files, directory listings on remote computers, handouts for classes,
and a listing of the currently open buffers).

854

M Accessing R Through a Powerful Editor—With Emacs and ESS as the Example

M.1.4 Shell Mode
Emacs includes a shell mode in which a terminal interaction runs inside an Emacs
buffer. The Unix terminology for the program that runs an interactive command
line session is a “shell”. There are several commonly used shell programs: sh is
the original and most fundamental shell program. Other Unix shell programs are
csh and bash. The MS-DOS prompt window (c:/Windows/System32/cmd.exe)
is the native shell program in MS Windows. We usually use the sh included in
Rtools as our shell under MS Windows.
A terminal interaction running inside an Emacs buffer is much more powerful
than one run in an ordinary terminal emulator window. The entire live login session
inside an Emacs buffer is just another editable buffer (with full search capability).
The only distinction is that both you and the computer program you are working
with can write to the buffer. This is exceedingly important because it means nothing
ever rolls off the top of the screen and gets lost. Just roll it back. The session can be
saved to a file and then is subject to automatic backup to protect you from system
crash or loss of connection to a remote machine.

M.1.5 Controlling Other Programs
A shell running in an Emacs buffer is normally used to run another program (R for
example). Frequently we can drop the intermediate step and have Emacs run the
other program directly. ESS provides that capability for R. The advantage of running
R directly through Emacs is that it becomes possible to design the interactivity that
allows a buffer containing R code to send that code directly to the running R process.
ESS (see Section M.2) builds on shell mode to provide modes for interacting with
statistical processes. The terminal interaction can be local (on the same computer on
which Emacs is running) or remote (anywhere else).

M.2 ESS
Figure M.2 is a screenshot showing an interactive Emacs session. Emacs is a powerful program, hence the figure is quite dense. We discuss many of its components
in the following subsections.
The discussion here is based on Rossini et al. (2004). ESS provides:

Fig. M.2 Interaction with R process through Emacs with ESS. An Emacs frame with three buffers is shown on the left. The top buffer shows the R file
emacsDemo.R that we are executing one line at a time. The cursor is at the end of the line anova(fat.lm), with the closing right paren and its matching left
paren highlighted together. The middle buffer *R* shows the R process with the printed result from the anova(fat.lm) command. The down arrow in the left
margin shows the beginning of the output from the most recently executed line. The bottom buffer *help[R](lm)* shows the help file from the R command
?lm. In the center is the ESS menu. On the right is the graph drawn by the plot(fat.lm, 2) statement. More detail on this figure is presented in Section M.2.

M.2 ESS
855

→

856

M Accessing R Through a Powerful Editor—With Emacs and ESS as the Example

M.2.1 Syntactic Indentation and Color/Font-Based Source
Code Highlighting
The ESS interface includes a description of the syntax and grammar of each statistical language it knows about. This gives ESS the ability to edit the programming language code, often more smoothly than with editors distributed with the languages.
The process of programming code is enhanced as ESS provides the user with a
clear presentation of the code with syntax highlighting to denote assignment, reserved words, strings, and comments. The upper-left buffer (labeled emacsDemo.R)
in Figure M.2 contains an R script file. The mode-line for the buffer is in the “modeline” face indicating that this is the active buffer. The mode-lines for the other, inactive, buffers in this figure are in the lighter-colored “mode-line-inactive” face. We
can tell from the ** in the mode-line that the file is still under construction. Those
two symbols would be replaced by -- once the file is saved.
In order to illustrate syntactic indentation, we show the statement fat.lm  character after the comma, Emacs automatically indented the second line of the statement and placed the cursor aligned with the first character after
the open paren.
The assignment arrow, a keyword in the language, is detected and colored in the
“constant” face. Comments indicated by “##” are colored in the “comment” face.
The cursor, represented by a blinking “|”, is redundantly located by the (11,13)
(indicating row and column in the file) in the mode-line. In this snapshot the cursor immediately follows a closing paren “)”, hence both the closing paren and its
matching opening paren “(” are highlighted in the paren-match face. Mismatched
parens, as in Figure M.3, are shown in the paren-mismatch face.

Fig. M.3 Mismatched parens are shown in a glaring color to help protect the user from many
types of typographical errors.

M.2.2 Partial Code Evaluation
Emacs/ESS can send individual lines, entire function definitions, marked regions,
and whole edited buffers from the window in which the code is displayed for editing to the statistical language/program for execution. Emacs/ESS sends the code directly to the running program and receives the printed output back from the program

M.2 ESS

857

in an editable Emacs buffer (the *R* buffer). This is a major improvement over cutand-paste as it does not require switching buffers or windows.
We show this twice in Figure M.2. The dropdown menu is set to show the various
options on which subset of the buffer will be sent over. The menu also shows the
key-stroke combinations that can be typed directly and thereby avoid using the menu
at all. The cursor is on the line anova(fat.lm) and that line was sent over by this
command.
→

The *R* buffer has just received the command (the hooked down arrow in the
left margin shows the beginning of the output from the most recently executed line).
The command and its output are displayed. In addition to receiving and executing
lines sent over from the script file, the *R* buffer is also an ordinary R console and
the user can type directly into the *R* buffer.
ESS facilitates the editing of files of R scripts by providing a means for loading
and error-checking of small sections of code (as illustrated in Figure M.2). This
allows for source-level debugging of batch files.

M.2.3 Object Name Completion
In addition, for languages in the S family (S developed at Bell Labs, S+ from
TIBCO, and R) ESS provides object-name completion of user- and system-defined
functions and data, and filename completion for files in the computer.

M.2.4 Process Interaction
ESS builds on Emacs’s facilities to interface directly with the running R process.
The output of the package goes directly to the editable *R* text buffer in Emacs.
Emacs has historically referred to processes under its control as “inferior”, accounting for the name inferior ESS (iESS) shown in the mode-line of the *R*
buffer in Figure M.2. This mode allows for command-line editing and for recalling
and searching the history of previously entered commands. Filename completion,
and object-name and function-name completion are available. Transcripts are easily
recorded and can be edited into an ideal activity log, which can then be saved. ESS
intercepts calls to the internal help system for R and displays the help files inside an
Emacs buffer (see the bottom left buffer in Figure M.2).

858

M Accessing R Through a Powerful Editor—With Emacs and ESS as the Example

M.2.5 Interacting with Statistical Programs on Remote Computers
ESS provides the facility to edit and run programs on remote machines in the same
session and with the same simplicity as if they were running on the local machine.

M.2.6 Transcript Editing and Reuse
Once a transcript log is generated, perhaps by saving an iESS buffer, transcript
mode assists with reuse of part or all of the entered commands. It permits editing
and re-evaluating the commands directly from the saved transcript. This is useful
for demonstration of techniques as well as for reconstruction of data analyses. There
currently exist functions within ESS for “cleaning” transcripts from the R, S+, and
SAS languages back to source code by finding all input lines and isolating them
into an input file.
By default transcript files (files with extensions *.Rout *.rt *.Sout *.st)
open into read-only buffers. The intent is to protect the history of your analysis
sequence. If you need to make them writable, use C-x C-q.

M.2.7 Help File Editing (R)
ESS also provides an interface for writing help files for R functions and packages
(see Appendix F). It provides the ability to view and execute embedded R source
code directly from the help file in the same manner as ESS normally handles code
from a source file. ESS Help mode provides syntax highlighting and the ability to
submit code from the help file to a running R process.

M.3 Learning Emacs
There are several ways to learn Emacs. Whichever you choose, it will give you
access to the most comprehensive editing system.

M.3 Learning Emacs

859

M.3.1 GUI (Graphical User Interface)
Emacs provides a GUI (graphical user interface) with dropdown menus (shown
in Figure M.2) or with toolbar icons (not shown here). An excellent guide to menubased Emacs usage, with the delightful title “Emacs has no learning curve: Emacs
and ESS”, is available in Johnson (2015). Detailed help on Emacs is available from
the Help dropdown menu item.

M.3.2 Keyboard Interface
Emacs, in its present incarnation, goes back to 1984. A precursor goes back to
1976. One of us (RMH) started using Emacs about 1990. Emacs was originally
designed in a keyboard-only environment, long before mice and multi-windowed
“desktop” environments. We still prefer the keyboard as it is faster (once learned)
and has more capabilities than the menu-based GUI options. The dropdown menus
themselves show the keyboard equivalents. For keyboard-based usage use one or
more of
1. Tutorial. Enter “C-h t”. Then read the file and follow the instructions. You are
working with your own private copy of the TUTORIAL file, so you can practice
the keystrokes as suggested.
2. Manual. The manual is online in the hyperlinked Info system. It can be accessed
from the menu or by entering “C-h i” to bring up the Info: buffer.
3. Help. You can search within the dropdown Help menu. Or, to find help apropos
a topic, for example to answer the question “How do I save my current editing
buffer to a file?”, enter “C-h a save RET” and get a list of all commands with
save as part of their name. You probably want the command save-buffer and
will see that you can use that command by typing “C-x C-s” or by using the
files pull-down menu.
4. Reference Card to Emacs keystroke commands. The Emacs reference card is
in the directory accessed by “C-h r C-x d ../etc/refcards”. The English
card is in file refcard.pdf. Several other languages are also available as files
with names **-refcard.pdf. Reference cards for other Free Software Foundation programs are also in the same directory.

860

M Accessing R Through a Powerful Editor—With Emacs and ESS as the Example

M.4 Nuisances with Windows and Emacs
When R has been started by an ordinary user, as opposed to Administrator, then
installed packages are placed into a personal library. The R GUI under Windows
uses the directory
"C:/Users/loginname/Documents/R/win-library/x.y"
as the location of your personal library. The *R* buffer inside Emacs uses
"C:/Users/loginname/AppData/Roaming/R/win-library/x.y"
as the location of your personal library. The notation x.y must be replaced by the
first two digits of your current version of R. For example, with R-3.3.0, "x.y"
would become "3.3". Neither will automatically see packages that were installed
into the directory used by the other. You must tell it about the other directory.
You can tell the R GUI to use the additional directory with the statement
.libPaths(
"C:/Users/loginname/AppData/Roaming/R/win-library/x.y")
You can tell *R* to use the additional directory with the statement
.libPaths(
"C:/Users/loginname/Documents/R/win-library/x.y")

M.5 Requirements
Emacs satisfies the requirements detailed in Section K.1.1.
1. ESS provides full interaction between the commands file, the statistical process,
and the transcript.
2. The statistical process runs in an Emacs buffer and is therefore fully searchable
and editable.
3. Cut and paste is standard.
4. Emacs comes with modes specialized for all standard computing languages (C,
C++, Fortran, VBA). ESS provides the modes for R and S+ (and for SAS and
several others). Each mode by default highlights all keywords in its language,
is aware of recommended indentation patterns and other formatting issues, and
has communication with its associated running program.
5. Emacs handles multiple files as part of its basic design.
6. Emacs has a LATEX mode, and therefore provides editing access to the best
mathematical typesetting system currently available anywhere.
7. Emacs has several text modes.
8. Several spell-check programs works with Emacs; we use ispell.

M.5 Requirements

861

9. Emacs permits embedding of graphics directly into the Emacs buffer.
10. Graphics in PDF, PostScript, and bitmap formats can be embedded into LATEX
documents.
11. Emacs includes encoding (Unicode) for all human languages. Enter C-H h to
display the HELLO file, which lists many languages and characters.

Appendix N

LATEX

We used LATEX (Lamport, 1994) as the document preparation system for writing this
book. LATEX knows all the intricacies of mathematical typesetting and makes it very
easy to include figures and tables into the manuscript. LATEX is the standard required
by many statistics and mathematics journals. LATEX can adapt to any standard style,
such as those used by book publishers and journals, or you can write your own.
The LATEX document preparation system is written as a collection of macros for
Donald Knuth’s TEX program (Knuth, 1984). LATEX adds to TEX a collection of
commands that simplify typesetting by letting the user concentrate on the structure
of the text.
TEX is a sophisticated program designed to produce high-quality typesetting,
especially for mathematical text. The TEX system was developed by Donald E.
Knuth at Stanford University. It is now maintained by CTAN, the Comprehensive
TEX Archiving Network (Comprehensive TEX Archiving Network, 2002). There are
several distributions available. We use MikTEX (Schenk, 2001) on Windows and
MacTEX (The TeX Users Group (TUG), 2014) on Macintosh.
The latex function in R (Heiberger and Harrell, 1994; Frank E Harrell et al.,
2014) may be used to prepare formatted typeset tables for a LATEX document. Several
tables in this book (8.1, 9.1, 12.4, 15.5, and 15.8) were prepared in this way.

N.1 Organization Using LATEX
There are several ways to approach LATEX. The way we used for this book was to
write each chapter in its own file, and then combine them into book using the style
file provided by our publisher Springer.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

863

864

N LATEX

N.2 Setting Equations
This is equation 8.1 as we typed it:
\begin{eqnarray*}
y_i = \beta_0 + \beta_1 x_i
\dots,n$}
\end{eqnarray*}

+ \epsilon_i \cond{for $i=1,

This is how it appears when typeset:
yi = β0 + β1 xi + i

for i = 1, . . . , n

N.3 Coordination with R
The Second Edition of this book was written using the Sweave and Stangle functions from the utils package. All R code is included within the LATEX source for the
book. Stangle reads the LATEX input files, isolates from the input files the actual
code that produced the tables and figures, and collects the code into the script files
that are included with the HH package. See help(Sweave, package="utils")
for details on writing using Sweave.

N.4 Global Changes: Specification of Fonts
In LATEX, it is very easy to make global changes to a document style. For example,
placing programs and transcripts into a verbatim environment takes care of the
font. Should we later decide that we would like a different monowidth font, say
“Adobe Courier” instead of the default “Computer Modern Typewriter”, we include
the single statement \renewcommand{\ttdefault}{pcr} and then immediately
ALL instances of the typewriter font will change. We illustrate in Table N.1.
Compare this simple change of a single specification in LATEX to the more laborintensive way of accomplishing the same task in a visual formatting systems (MS
Word, for example). In a visual formatting system, programs and transcripts must
be individually highlighted and then explicitly placed into Courier. Changing ALL
instances of the typewriter font requires manually highlighting and changing EACH
of them individually, one at a time.

N.4 Global Changes: Specification of Fonts

865

Table N.1 Switching the typewriter font (LATEX command \tt) in LATEX from the default “Computer Modern Typewriter” to “Adobe Courier” and back.

This is the default Computer Modern Typewriter font.
%% switch to Adobe Courier
\renewcommand{\ttdefault}{pcr}
This is Adobe Courier font.
%% return to Computer Modern Typewriter
\renewcommand{\ttdefault}{cmtt}
And back to the Computer Modern Typewriter font.

Appendix O

Word Processors and Spreadsheets

Word processing is moving sentences, paragraphs, sections, figures, and crossreferences around. Most word processors can be used as a text editor by manually
turning off many of the word processing features.
Spreadsheet software is used to operate on tables of numbers.
Microsoft Word and Microsoft Excel are the most prevalent word processor and
spreadsheet software systems. Most of what we say here applies to all such systems.

O.1 Microsoft Word
Microsoft Word is probably the most prevalent text editor and word processor today.
MS Word is configured by default as a word processor. It can be used as a text
editor by changing those default settings. The most critical features are the font and
the paragraph reflow. Courier (or another monowidth font in which all letters are
equally wide) should be used for program writing or for summary reports in which
your displayed output from R is included. The software output from R is designed
to look right (alignment and spacing) with monowidth fonts. It looks terrible, to
the point of illegibility, in proportional fonts. See examples in Section A.4 and Appendix L. Other word processor features to turn off are spell checking and syntax
checking, both of which are designed to make sense with English (or another natural
language) but not with programming languages.

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

867

868

O Word Processors and Spreadsheets

If you are using an editor that thinks it knows more than you, be very sure that
the *.r, *.Rout, and *.rt files are displayed in Courier and fit within the margin
settings of the word processor.

O.1.1 Editing Requirements
MS Word satisfies some of the requirements detailed in Section K.1.1.
1. MS Word can edit the commands file. It does not interact directly with the
running statistical process; manual cut-and-paste is required.
2. The output from the statistical process is in a window independent of MS Word.
The output can be picked up and pasted into an MS Word window.
3. Cut and paste is standard.
4. When checking is on, MS Word will by default inappropriately check computer
programs for English syntax and spelling.
5. MS Word handles multiple files as part of its basic design.
6. Reports can be written and output text can be embedded into them;
7. MS Word has syntax and spell-checking facilities limited to the natural language (English in our case).
8. Graphics can be pasted directly into an MS Word document.
9. Mathematical formulas can be entered in an MS Word document.
10. SWord and other connections between R and MS Word.
11. MS Word works with Unicode, providing access to all character sets in all
human languages. On the Insert tab, click Symbols and then More Symbols.
In the font box, find Arial Unicode MS.

O.1.2 SWord
SWord (Baier, 2014) is an add-in package for MS Word that makes it possible to
embed R code in an MS Word document. The R code will be automatically executed
and the output from the R code will be included within the MS Word document.
SWord is free for non-commercial use. Any other use will require a license. Please
see Section D.1.3 for installation information.

O.2 Microsoft Excel

869

O.2 Microsoft Excel
Microsoft Excel is probably the most prevalent spreadsheet program today.
We believe MS Excel is well suited for two tasks: as a small-scale database
management system and as a way of organizing calculations. We do not recommend
Excel for the actual calculations.

O.2.1 Database Management
R (and S+ and SAS) can read and write Excel files. Since many people within an
organization collect and distribute their data in Excel spreadsheets, this is a very
important feature.

O.2.2 Organizing Calculations
R can be connected directly to Excel on Windows via RExcel (Baier and Neuwirth,
2007) using DCOM, Microsoft’s protocol for exchanging information across programs. See Appendix D for further details and for download information. Used in
this way, Excel can be used similarly to the ways that Emacs and MS Word are
used. Excel can be used to control R, for example by putting R commands inside
Excel cells and making them subject to automatic recalculation. Or R can be in
control, and use Excel as one of its subroutines.

O.2.3 Excel as a Statistical Calculator
We believe Excel is usually a poor choice for statistical computations because:
1. As of this writing (Excel 2013 for Windows and Excel 2011 for Macintosh) at
least some of the built-in statistical functions do not include even basic numerical
protection.
Table O.1 and Figure O.1 show the calculation of the variance of three numbers
by R’s var function and Excel’s VAR function. Figure O.1 shows a simple set
of variance calculations where Excel reveals an algorithmic error. We show the
erroneous number as a hex number in Table O.2. For comparison, Table O.1
shows the correct calculations by R.

870

O Word Processors and Spreadsheets

Fig. O.1 Calculation of the variance by Excel for the sequence (10k ) + 1, (10k ) + 1, (10k ) + 1 for
several values of k. The real-valued result for all values of k is exactly 1. The 2.833 × 1022 value
shown for k = 27 is the indication of an erroneous algorithm in Excel. We see this for Excel 2013
on Windows and Excel 2011 on Macintosh. The floating point result for k = 0 : 15 is exactly 1
as shown. The correct floating point result for all k >= 17 is exactly 0. The floating point value 4
for k = 16 is correct. We illustrate the correct floating point behavior in Section G.12. Table O.2
shows the hexadecimal display for the erroneous value.
Table O.1 Calculation of the variance by R for the sequence (10k ) + 1, (10k ) + 1, (10k ) + 1 for
several values of k. The real-valued result for all values of k is exactly 1. R gets the correct floating
point variance for all values of k. The floating point result for k = 0 : 15 is exactly 1. The floating
point value 4 for k = 16 is correct. The floating point result for k >= 17 is exactly 0. We illustrate
the correct floating point behavior in Section G.12.
> k <- c(0, 1, 2, 15, 16, 17, 26, 27, 28)
> cbind(k=k, var=apply(cbind(10^k + 1, 10^k + 2, 10^k + 3), 1, var))
k var
[1,] 0
1
[2,] 1
1
[3,] 2
1
[4,] 15
1
[5,] 16
4
[6,] 17
0
[7,] 26
0
[8,] 27
0
[9,] 28
0

O.2 Microsoft Excel

871

Table O.2 The first statement shows the hex display of the number Excel shows in Figure O.1
as the result of var(1027 + 1, 1027 + 2, 1027 + 3). It is very close to a pretty hex number suggesting
that there is a numeric overflow somewhere in the algorithm. The second statement shows the hex
display of the three input numbers. They are all identical to machine precision.
> sprintf("%+13.13a", 28334198897217900000000)
[1] "+0x1.8000000000007p+74"
> as.matrix(sprintf("%+13.13a", c(10^27 + 1, 10^27 + 2, 10^27 + 3)))
[,1]
[1,] "+0x1.9d971e4fe8402p+89"
[2,] "+0x1.9d971e4fe8402p+89"
[3,] "+0x1.9d971e4fe8402p+89"

Earlier versions of MS Excel got different wrong answers for the variance of
three numbers. Most notably MS Excel 2002 gets the right answer for 107 and
the wrong answer for 108 , suggesting that it was using the numerically unstable
one-pass algorithm. Later releases got different sets of wrong answers.
2. Most add-in packages are not standard and are not powerful. If add-ins are used
along with an introductory textbook, they will most likely be limited in capability
to the level of the text. They are unlikely to be available on computers in a work
situation.
RExcel is an exception. It uses R from the Excel interface and therefore has all
the power and generality of R.

References

A.A. Adish, S.A. Esrey, T.W. Gyorkos, J. Jean-Baptiste, A. Rojhani, Effect of consumption of food cooked in iron pots on iron status and growth of young children:
a randomised trial. Lancet 353(9154), 712–716 (1999)
A. Agresti, Categorical Data Analysis (Wiley, New York, 1990)
A. Agresti, B. Caffo, Simple and effective confidence intervals for proportions and
differences of proportions result from adding two successes and two failures. Am.
Stat. 54(4), 280–288 (2000)
Albuquerque Board of Realtors (1993). URL: http://lib.stat.cmu.edu/
DASL/Stories/homeprice.html
American Statistical Association, Career Center (2015). URL: http://www.
amstat.org/careers
O. Amit, R.M. Heiberger, P.W. Lane, Graphical approaches to the analysis of safety
data from clinical trials. Pharm. Stat. 7(1), 20–35 (2008). URL: http://www3.
interscience.wiley.com/journal/114129388/abstract
A.H. Anderson, E.B. Jensen, G. Schou, Two-way analysis of variance with correlated errors. Int. Stat. Rev. 49, 153–167 (1981)
R.L. Anderson, T.A. Bancroft, Statistical Theory in Research (McGraw-Hill, New
York,1952)
V.L. Anderson, R.A. McLean, Design of Experiments (Marcel Dekker, New
York,1974)
D.F. Andrews, A.M. Herzberg, Data: A Collection of Problems from Many Fields
for the Student and Research Worker (Springer, New York,1985). URL: http://
lib.stat.cmu.edu/datasets/Andrews/
E. Anionwu, D. Watford, M. Brozovic, B. Kirkwood, Sickle cell disease in a British
urban community. Br. Med. J. 282, 283–286 (1981)

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

873

874

References

P.K. Asabere, F.E. Huffman, Negative and positive impacts of golf course proximity
on home prices. Apprais. J. 64(4), 351–355 (1996)
T. Baier, Sword (2014). URL: http://rcom.univie.ac.at
T. Baier, E. Neuwirth, Excel :: Com :: R. Comput. Stand. 22(1), 91–108 (2007)
M.K. Barnett, F.C. Mead, A 24 factorial experiment in four blocks of eight. Appl.
Stat. 5, 122–131 (1956)
R.A. Becker, J.M. Chambers, A.R. Wilks, The S Language; A Programming Environment for Data Analysis and Graphics (Wadsworth & Brooks/Cole, Pacific
Grove, 1988)
R.A. Becker, W.S. Cleveland, S-PLUS Trellis Graphics User’s Manual (1996a).
URL: http://www.stat.purdue.edu/~wsc/papers/trellis.user.pdf
Becker, R.A., W.S. Cleveland, M.-J. Shyu, S.P. Kaluzny, A tour of trellis graphics (1996b). URL: http://www2.research.att.com/areas/stat/doc/95.
12.color.ps
Y.Y.M. Bishop, S.E. Fienberg, P.W. Holland, Discrete Multivariate Analysis (MIT,
Cambridge,1975)
A. Bjork, Solving least squares problems by Gram–Schmidt orthogonalization. BIT
7, 1–21 (1967)
C.I. Bliss, Statistics in Biology (McGraw-Hill, New York, 1967)
C.R. Blyth, On Simpson’s paradox and the sure-thing principle. J. Am. Stat. Assoc.
67, 364–366 (1972)
B.L. Bowerman, R.T. O’Connell, Linear Statistical Models (Duxbury, Belmont,
1990)
G.E.P. Box, D.R. Cox, An analysis of transformations. J. R. Stat. Soc. B 26, 211–
252 (1964)
G.E.P. Box, W.G. Hunter, J.S. Hunter, Statistics for Experimenters (Wiley, New
York, 1978)
G.E.P. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, revised
edn. (Holden-Day, San Francisco, 1976)
R.G. Braungart, Family status, socialization and student politics: a multivariate analysis. Am. J. Sociol. 77, 108–130 (1971)
C. Brewer, Colorbrewer (2002). URL: http://colorbrewer.org
L. Brochard, J. Mancebo, M. Wysocki, F. Lofaso, G. Conti, A. Rauss, G. Simonneau, S. Benito, A. Gasparetto, F. Lemaire, D. Isabey, A. Harf, Noninvasive
ventilation for acute exacerbations of chronic pulmonary disease. N. Engl. J. Med.
333(13), 817–822 (1995)

References

875

D.G. Brooks, S.S. Carroll, W.A. Verdini, Characterizing the domain of a regression
model. Am. Stat. 42, 187–190 (1988)
B.W. Brown Jr., Prediction analyses for binary data, in Biostatistics Casebook, ed.
by R.G. Miller Jr., B. Efron, B.W. Brown Jr., L.E. Moses (Wiley, New York, 1980)
M.B. Brown, A.B. Forsyth, Robust tests for equality of variances. J. Am. Stat.
Assoc. 69, 364–367 (1974)
Bureau of the Census, Statistical Abstract of the United States (U.S. Department of
Commerce, Washington, DC, 2001)
T.T. Cai, One-sided confidence intervals in discrete distributions. J. Stat. Plan.
Inference 131, 63–88 (2003). URL: http://www-stat.wharton.upenn.edu/
~tcai/paper/1sidedCI.pdf
E. Cameron, L. Pauling, Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer.
Proc. Natl. Acad. Sci. USA 75, 4538–4542 (1978)
J.M. Chambers, W.S. Cleveland, B. Kleiner, P.A. Tukey, Graphical Methods for
Data Analysis (Wadsworth, Belmont, 1983)
T.F.C. Chan, G.H. Golub, R.J. LeVeque, Algorithms for computing the sample variance: analysis and recommendations.Am. Stat. 37(3), 242–247 (1983)
W. Chang, J. Cheng, J.J. Allaire, Y. Xie, J. McPherson, shiny: Web Application Framework for R. R Package Version 0.11.1 (2015). URL: http://CRAN.
R-project.org/package=shiny
R. Chassell, Programming in Emacs Lisp: An Introduction. Free Software Foundation, 2nd edn. (1999) URL: https://www.gnu.org/software/emacs/
manual/html_node/eintr/
T.W. Chin, E. Hall, C. Gravelle, J. Speers, The influence of Salk vaccination on the
epidemic pattern and spread of the virus in the community. Am. J. Hyg. 73, 67–94
(1961)
S. Chu, Diamond ring pricing using linear regression. J. Stat. Educ. 4 (1996). URL:
http://www.amstat.org/publications/jse/v4n3/datasets.chu.html
W.S. Cleveland, Visualizing Data (Hobart Press, Summit, 1993)
W.S. Cleveland, R. McGill, Graphical perception: theory, experimentation, and
application to the development of graphical methods. J. Am. Stat. Assoc. 79,
531–554 (1984)
W.G. Cochran, G.M. Cox, Experimental Designs, 2nd edn. (Wiley, New York, 1957)
D.R. Collett, Modelling Binary Data (Chapman & Hall, London, 1991)
Comprehensive TEX Archiving Network CTAN (2002). URL: ftp://metalab.
unc.edu/pub/packages/TeX/index.html

876

References

W.J. Conover, M.E. Johnson, M.M. Johnson, A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding
data. Technometrics 23, 351–361 (1981)
Consumer Reports, Hot dogs, Consumer Reports (1986), pp. 366–367. URL:
http://lib.stat.cmu.edu/DASL/Stories/Hotdogs.html
R.D. Cook, S. Weisberg, Applied Regression Including Computing and Graphics
(Wiley, New York, 1999)
T. Cook, Convictions for Drunkenness (New Society, 1971)
CRAN, Comprehensive R Archive Network (2015). URL: http://CRAN.
R-project.org
Cytel Software Corporation, Logxact Statistical Software: Release 5 (2004). URL:
http://www.cytel.com/LogXact/logxact_brochure.pdf
S.R. Dalal, E.B. Fowlkes, B. Hoadley, Risk analysis of the space shuttle: prechallenger prediction of failure. J. Am. Stat. Assoc. 84, 945–957 (1989)
C. Darwin, The Effect of Cross- and Self-fertilization in the Vegetable Kingdom, 2nd
edn. (John Murray, London, 1876)
Data Archive, J. Stat. Educ. (1997). URL: http://www.amstat.org/
publications/jse/jse_data_archive.html
O.L. Davies, Design and Analysis of Industrial Experiments (Oliver and Boyd, London, 1954)
O.L. Davies, P.L. Goldsmith (eds.), Statistical Methods in Research and Production,
4th edn. (Oliver and Boyd, London, 1972)
M.M. Desu, D. Raghavarao, Nonparametric Statistical Methods for Complete and
Censored Data, 1st edn. (Chapman & Hall, Boca Raton, 2003)
R. Dougherty, A. Wade (2006).URL: http://vischeck.com
D. Edwards, J.J. Berry, The efficiency of simulation-based multiple comparisons.
Biometrics 43, 913–928 (1987)
M.E. Ellis, K.R. Neal, A.K. Webb, Is smoking a risk factor for pneumonia for patients with chickenpox? Br. Med. J. 294, 1002 (1987)
J.D. Emerson, M.A. Stoto, Transforming data, in Understanding Robust and Exploratory Data Analysis, ed. by D.C. Hoaglin, F. Mosteller, J.W. Tukey (Wiley,
New York, 1983)
L.W. Erdman, Studies to determine if antibiosis occurs among Rhizobia: 1. Between
Rhizobium meliloti and Rhizobium trifolii. J. Am. Soc. Agron. 38, 251–258
(1946)
J.L. Fleiss, Statistical Methods for Rates and Proportions, 2nd edn. (Wiley,
New York, 1981)

References

877

Forbes Magazine (1993). URL: http://lib.stat.cmu.edu/DASL/Datafiles/
ceodat.html
R.S. Fouts, Acquisition and testing of gestural signs in four young chimpanzees.
Science 180, 978–980 (1973)
J. Fox, Regression Diagnostics: An Introduction (Sage, Thousand Oaks, 1991)
J. Fox, The R Commander: a basic statistics graphical user interface to R. J. Stat.
Softw. 14(9), 1–42 (2005). URL: http://www.jstatsoft.org/v14/i09
E.H. Frank Jr., with contributions from Charles Dupont, and many others, Hmisc:
Harrell Miscellaneous. R Package Version 3.14-6 (2014). URL: http://CRAN.
R-project.org/package=Hmisc
Free Software Foundation, Emacs (2015). URL: http://www.gnu.org/
software/emacs/
R.J. Freund, R.C. Littell, SAS System for Regression (SAS Institute, Inc., Cary,
1991)
J.H. Goodnight, Tests of hypotheses in fixed effects linear models. Technical Report
R-101 (SAS Institute, Cary, 1978)
P. Graham, ANSI Common Lisp (Prentice Hall, Upper Saddle River, 1996)
M.J. Greenacre, Theory and Applications of Correspondence Analysis (Academic,
New York, 1984)
P. Grosjean, Ide/script editors. Annotated list with links to many editing environments for R. (2012). URL: http://www.sciviews.org/_rgui/
R.F. Gunst, R.L. Mason, Regression Analysis and Its Application: A Data-Oriented
Approach (Marcel Dekker, New York, 1980)
L.C. Hamilton, Saving water: a causal model of household conservation. Sociol.
Perspect. 26(4), 355–374 (1983)
L.C. Hamilton, Regression with Graphics (Brooks-Cole, Belmont, 1992)
D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway, E. Ostrowski, A Handbook of Small
Data Sets (Chapman and Hall, London, 1994)
D.D. Harrison, D.E. Harrison, T.J. Janik, R. Cailliet, J.R. Ferrantelli, J.W. Hass,
B. Holland, Modeling of the sagittal cervical spine as a method to discriminate hypo-lordosis: results of elliptical and circular modeling in 72 asymptomatic
subjects, 52 acute neck pain subjects, and 70 chronic neck pain subjects. Spine
29(22):2485–2492 (2004)
D.E. Harrison, R. Cailliet, D.D. Harrison, T.J. Janik, B. Holland, Changes in sagittal
lumbar configuration with a new method of extension traction combined with
spinal manipulation and its clinical significance: non-randomized clinical control
trial. Arch. Phys. Med. Rehabil. 83(11), 1585–1591 (2002)

878

References

R.M. Heavenrich, J.D. Murrell, K.H. Hellman, Light Duty Automotive Technology
and Fuel Economy Trends through 1991 (U.S. Environmental Protection Agency,
Ann Arbor, 1991)
R.M. Heiberger, Computation for the Analysis of Designed Experiments (Wiley,
New York, 1989)
R.M. Heiberger, HH: Statistical Analysis and Data Display: Heiberger and
Holland. Spotfire S+ Package Version 2.1-29 (2009). URL: http://csan.
insightful.com/PackageDetails.aspx?Package=HH
R.M. Heiberger, HH: Statistical Analysis and Data Display: Heiberger and Holland. R Package Version 3.1-15 (2015). URL: http://CRAN.R-project.org/
package=HH
R.M. Heiberger, F.E. Harrell Jr., Design of object-oriented functions in S for screen
display, interface and control of other programs (SAS and LATEX), and S programming, in Computing Science and Statistics, vol. 26 (1994), pp. 367–371. The
software is available in both S-Plus and R in library(hmisc). It is included in
the S-Plus distribution. It may be downloaded for R from the contrib page of
the R Development Core Team (2004) website
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display: An Intermediate
Course with Examples in S-Plus, R, and SAS, 1st edn. (Springer, New York,
2004)
R.M. Heiberger, B. Holland, Mean–mean multiple comparison displays for families
of linear contrasts. J. Comput. Graph. Stat. 14(4), 937–955 (2006)
R.M. Heiberger, E. Neuwirth, R through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics (Springer, New York, 2009). URL: http://
www.springer.com/978-1-4419-0051-7
R.M. Heiberger, N.B. Robbins, Design of diverging stacked bar charts for Likert
scales and other applications. J. Stat. Softw. 57(5), 1–32 (2014). URL: http://
www.jstatsoft.org/v57/i05/
R.M. Heiberger, P. Teles, Displays for direct comparison of ARIMA models. Am.
Stat. 56, 131–138, 258–260 (2002)
R.M. Heiberger, with contributions from H. Burt, RcmdrPlugin.HH: Rcmdr Support
for the HH Package. R Package Version 1.1-42 (2015). URL: http://CRAN.
R-project.org/package=RcmdrPlugin.HH
C.R. Hicks, K.V. Turner Jr., Fundamental Concepts in Design of Experiments, 5th
edn. (Oxford, New York, 1999)
J.E. Higgens, G.G. Koch, Variable selection and generalized chi-square analysis of
categorical data applied to a large cross-sectional occupational health survey. Int.
Stat. Rev. 45, 51–62 (1977)
D.C. Hoaglin, F. Mosteller, J.W. Tukey (eds.), Understanding Robust and Exploratory Data Analysis (Wiley, New York, 1983)

References

879

Y. Hochberg, A sharper Bonferroni procedure for multiple tests of significance.
Biometrika 75, 800–803 (1988)
Y. Hochberg, A.C. Tamhane, Multiple Comparison Procedures (Wiley, NewYork,
1987)
M. Holzer, Mild therapeutic hypothermia to improve the neurologic outcome after
cardiac arrest. N. Engl. J. Med. 346(8), 549–556 (2002)
D.W. Hosmer, S. Lemeshow, Applied Logistic Regression, 2nd edn. (Wiley, New
York, 2000)
J. Hsu, M. Peruggia, Graphical representations of Tukey’s multiple comparison
method. J. Comput. Graph. Stat. 3, 143–161 (1994)
R. Ihaka, P. Murrell, K. Hornik, J.C. Fisher, A. Zeileis, Colorspace: Color
Space Manipulation. R Package Version 1.2-4 (2013). URL: http://CRAN.
R-project.org/package=colorspace
R.L. Iman, A Data-Based Approach to Statistics (Duxbury, Belmont, 1994)
Insightful Corp., S-Plus Statistical Software: Release 6.1 (2002). URL: http://
www.insightful.com.
F. John et al., R Commander. R Package Version 2.1-6 (2015). URL: http://CRAN.
R-project.org/package=Rcmdr
N.L. Johnson, F.C. Leone, Statistics and Experimental Design in Engineering and
the Physical Sciences, vol. 2 (Wiley, New York, 1967)
P.E. Johnson, Emacs has no learning curve: Emacs and ess (2015). URL: http://
pj.freefaculty.org/guides/Rcourse/emacs-ess/emacs-ess.pdf
P.O. Johnson, F. Tsao, Factorial design and covariance in the study of individual
educational development. Psychometrika 10, 133–162 (1945)
R.W. Johnson, Fitting percentage of body fat to simple body measurements.
J. Stat. Educ. 4(1) (1996). URL: http://www.amstat.org/publications/
jse/archive.htm
D.E. Knuth, The TEXbook (Addison-Wesley, Reading, 1984)
L. Krantz, 1999–2000 Jobs Rated Almanac: The Best and Worst Jobs—250 in All—
Ranked by More Than a Dozen Vital Factors Including Salary, Stress, Benefits
and More (St. Martins Press, New York, 1999). URL: http://www.hallmaps.
com/almanacs_yearbooks/29.shtml
L. Lamport, LATEX: A Document Preparation System: User’s Guide and Reference
Manual (Addison-Wesley, Boston, 1994)
M. Lavine, Problems in extrapolation illustrated with space shuttle o-ring data.
J. Am. Stat. Assoc. 86, 919–921 (1991)
A.J. Lea, New observations on distribution of neoplasms of female breast in certain
European countries. Br. Med. J. 1, 488–490 (1965)

880

References

E.T. Lee, Statistical Methods for Survival Data Analysis (Lifetime Learning Publications, Belmont, 1980)
E. Lehmann, Nonparametrics—Statistical Methods Based on Ranks, revised first
edition (Prentice Hall, New York, 1998)
F. Leisch, R-core, Sweave: automatic generation of reports (2014). URL: https://
stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf
A.Y. Lewin, M.F. Shakun, Policy Sciences, Methodology and Cases (Pergammon, Oxford, 1976). URL: http://lib.stat.cmu.edu/DASL/Stories/
airpollutionfilters.html
R. Likert, A technique for the measurement of attitudes. Arch. Psychol. 140(55),
1–55 (1932)
R.J.A. Little, D.B. Rubin, Statistical Analysis with Missing Data, 2nd edn. (Wiley,
New York, 2002)
J. Longley, An appraisal of least squares programs for the electronic computer from
the point of view of the user. J. Am. Stat. Assoc. 62, 819–841 (1967)
A. Luo, T. Keyes, Second set of results in from the career track member survey, in
Amstat News (American Statistical Association, Arlington, 2005), pp. 14–15
M. Mächler, S. Eglen, R.M. Heiberger, K. Hornik, S.P. Luque, H. Redesting, A.J.
Rossini, R. Sparapani, V. Spinu, ESS (Emacs Speaks Statistics) (2015). URL:
http://ESS.R-project.org/
P. McCullagh, J.A. Nelder, Generalized Linear Models (Chapman and Hall, London, 1983)
C.R. Mehta, N.R. Patel, P. Senchaudhuri, Efficient Monte Carlo methods for conditional logistic regression. J. Am. Stat. Assoc. 95(449), 99–108 (2000)
W.M. Mendenhall, J.T. Parsons, S.P. Stringer, N.J. Cassissi, R.R. Million, T2 oral
tongue carcinoma treated with radiotherapy: analysis of local control and complications. Radiother. Oncol. 16, 275–282 (1989)
D. Meyer, A. Zeileis, K. Hornik, The strucplot framework: visualizing multi-way
contingency tables with vcd. J. Stat. Softw. 17(3), 1–48 (2006). URL: http://
www.jstatsoft.org/v17/i03/
D. Meyer, A. Zeileis, K. Hornik, vcd: Visualizing Categorical Data. R Package Version 1.2-13 (2012). URL: http://CRAN.R-project.org/package=vcd
Microsoft, Inc., Word (2015). URL: https://products.office.com/en-us/
Word
G.A. Milliken, D.E. Johnson, Analysis of Messy Data, vol. I (Wadsworth, Belmont,
1984)
D.C. Montgomery, Design and Analysis of Experiments, 4th edn. (Wiley, New York,
1997)

References

881

D.C. Montgomery, Design and Analysis of Experiments, 5th edn. (Wiley, New York,
2001)
D.S. Moore, G.P. McCabe, Introduction to the Practice of Statistics (Freeman, New
York, 1989)
F. Mosteller, J.W. Tukey, Data Analysis and Regression (Addison-Wesley, Reading,
1977)
J.D. Murray, G. Dunn, P. Williams, A. Tarnopolsky, Factors affecting the consumption of psychotropic drugs. Psychol. Med. 11, 551–560 (1981)
P. Murrell, R Graphics, 2nd edn. (CRC, Boca Raton, 2011). URL: http://www.
taylorandfrancis.com/books/details/9781439831762/
R.H. Myers, Classical and Modern Regression with Applications, Chap. 5 (PWSKent, Boston, 1990), p. 218
S.C. Narula, J.T. Wellington, Prediction, linear regression and the minimum sum of
errors. Technometrics 19, 185–190 (1977)
U.S. Navy, Procedures and Analyses for Staffing Standards Development:
Data/Regression Analysis Handbook (Navy Manpower and Material Analysis
Center, San Diego, 1979)
J.A. Nelder, A reformulation of linear models. J. R. Stat. Soc. 140(1), 48–77 (1977).
doi:10.2307/2344517
J. Neter, M.H. Kutner, C.J. Nachtsheim, W. Wasserman, Applied Linear Statistical
Models, 4th edn. (Irwin, Homewood, 1996)
E. Neuwirth, RColorBrewer: ColorBrewer Palettes. R Package Version 1.0-5
(2011). URL: http://CRAN.R-project.org/package=RColorBrewer
E. Neuwirth, RExcel (2014). URL: http://rcom.univie.ac.at
New Zealand Ministry of Research Science and Technology, Staying in science
(2006). URL: http://www.morst.govt.nz/Documents/publications/
researchreports/Staying-in-Science-summary.pdf
D.F. Nicholls, The analysis of time series—the time domain approach. Aust. J. Stat.
21, 93–120 (1979)
NIST, National Institute of Standards and Technology, Statistical Engineering Division (2002). URL: http://www.itl.nist.gov/div898/software/
dataplot.html/datasets.htm
NIST, National Institute of Standards and Technology, Data set of southern oscillations, in NIST/SEMATECH e-Handbook of Statistical Methods (2005). URL: http://www.itl.nist.gov/div898/handbook/pmc/
section4/pmc4412.htm
Olympic Committee, Salt Lake City 2002 Winter Olympics (2001). URL: http://
www.saltlake2002.com

882

References

R.L. Ott, An Introduction to Statistical Methods and Data Analysis, 4th edn.
(Duxbury, Belmont, 1993)
S.C. Pearce, The Agricultural Field Experiment (Wiley, New York, 1983)
K. Penrose, A. Nelson, A. Fisher, Generalized body composition prediction equation for men using simple measurement techniques (abstract). Med. Sci. Sports
Exerc. 17(2), 189 (1985)
D.H. Peterson (ed.), Aspects of Climate Variability in the Pacific and the Western
Americas. Number 55 in Geophysical Monograph (American Geophysical Union,
Washington, DC, 1990)
R.G. Peterson, Design and Analysis of Experiments (Marcel Dekker, New
York/Basel, 1985)
R Core Team, R: A Language and Environment for Statistical Computing (R
Foundation for Statistical Computing, Vienna, 2015). URL: http://www.
R-project.org/
R Development Core Team, R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2004). URL: http://
www.R-project.org. ISBN 3-900051-00-3
W. Rasband, Imagej 1.46t (2015). URL: http://rsb.info.nih.gov/ij
N.B. Robbins, Creating More Effective Graphs (Chart House, Ramsey, 2005, reissued 2013) [Originally Wiley-Interscience]
N.B. Robbins, R.M. Heiberger, E. Neuwirth, C. Ritter, Professional statistics
and graphics accessible from excel, in Presented at the Joint Statistics Meetings (American Statistical Association, Washington, DC, 2009). URL: http://
rcom.univie.ac.at/papers/handoutJSM2009.pdf
W.S. Robinson, Ecological correlations and the behavior of individuals. Am. Sociol.
Rev. 15, 351–357 (1950)
A.J. Rossini, R.M. Heiberger, R.A. Sparapani, M. Mächler, K. Hornik, Emacs
Speaks Statistics (ESS): a multiplatform, multipackage development environment
for statistical analysis. J. Comput. Graph. Stat. 13(1), 247–261 (2004). URL:
http://dx.doi.org/10.1198/1061860042985
A.J. Rossman, Televisions, physicians, and life expectancy. J. Stat. Educ. (1994).
URL: http://www.amstat.org/publications/jse/archive.htm
RStudio, Shiny: a web application framework for r (2015). URL: http://shiny.
rstudio.com
D. Sarkar, Lattice: Multivariate Data Visualization with R (Springer, New York,
2008). URL: http://lmdvr.r-forge.r-project.org. ISBN 978-0-38775968-5
D. Sarkar, lattice: Lattice Graphics. R Package Version 0.20-29 (2014). URL:
http://CRAN.R-project.org/package=lattice

References

883

D. Sarkar, F. Andrews, latticeExtra: Extra Graphical Utilities Based on Lattice. R Package Version 0.6-26 (2013). URL: http://CRAN.R-project.org/
package=latticeExtra
S. Sarkar, Some probability inequalities for ordered MTP2 random variables: a proof
of the Simes conjecture. Ann. Stat. 26, 494–504 (1998)
SAS Institute, Inc., The four types of estimable functions, in SAS/STAT User’s Guide
(SAS Institute, Inc., Cary, 1999)
C. Schenk, MikTeX (2001). URL: ftp://metalab.unc.edu/pub/packages/
TeX/systems/win32/miktex
S.R. Searle, Linear Models (Wiley, New York, 1971)
H.C. Selvin, Durkheim’s suicide: further thoughts on a methodological classic, in
Émile Durkheim, ed. by R.A. Nisbet (Prentice Hall, Englewood Cliffs, NJ, 1965),
pp. 113–136
R.T. Senie, P.P. Rosen, M.L. Lesser, D.W. Kinne, Breast self-examinations and medical examination relating to breast cancer stage. Am. J. Public Health 71, 583–590
(1981)
N. Shaw, Manual of Meteorology, vol. 1 (Cambridge University Press, Cambridge,
1942)
W.J. Shih, S. Weisberg, Assessing influence in multiple linear regression with incomplete data.Technometrics 28, 231–240 (1986)
J. Simpson, A. Olsen, J.C. Eden, A Bayesian analysis of a multiplicative treatment
effect in weather modification. Technometrics 17, 161–166 (1975)
W. Smith, L. Gonick, The Cartoon Guide to Statistics (HarperCollins, New York,
1993)
G.W. Snedecor, W.G. Cochran, Statistical Methods, 7th edn. (Iowa State University
Press, Ames, 1980)
R.R. Sokal, F.J. Rohlf, Biometry, 2nd edn. (W.H. Freeman, New York, 1981)
R.M. Stallman, Emacs (2015). URL: http://www.gnu.org/software/emacs/
R.G.D. Steel, J.H. Torrie, Principles and Procedures of Statistics, 1st edn.
(McGraw-Hill, Auckland, 1960)
P.H. Sulzberger, The effects of temperature on the strength of wood, plywood and
glued joints. Technical Report (Aeronautical Research Consultative Committee,
Australia, Department of Supply, 1953)
N. Teasdale, C. Bard, J. LaRue, M. Fleury, On the cognitive penetrability of posture
control. Exp. Aging Res. 19, 1–13 (1993)
The TeX Users Group (TUG), The MacTeX-2014 Distribution (2014). URL:
http://tug.org/mactex/

884

References

TIBCO Software Inc., TIBCO Spotfire S+: Release 8.2 (2010). URL: https://
edelivery.tibco.com/storefront/eval/tibco-spotfire-s-/
prod10222.html
TIBCO
Software
Inc.,
The
TIBCO
Enterprise
Runtime
for
R
Engine
(2014).
URL:
http://spotfire.tibco.com/
discover-spotfire/what-does-spotfire-do/predictive-analytics/
tibco-enterprise-runtime-for-r-terr
R. Till, Statistical Methods for the Earth Scientist (Macmillan, London, 1974)
E.R. Tufte, The Visual Display of Quantitative Information, 2nd edn. (Graphics
Press, Cheshire, 2001)
J.W. Tukey, One degree of freedom for nonadditivity. Biometrics 5(3), 232–242
(1949)
W. Vandaele, Participation in illegitimate activities: Erlich revisited, in Deterrence
and Incapacitation, ed. by A. Blumstein, J. Cohen, D. Nagin (National Academy
of Sciences, Washington, DC, 1978), pp. 270–335
P.K. VanVliet, J.M. Gupta, Tham-v-sodium bicarbonate in idiopathic respiratory
distress syndrome. Arch. Dis. Child. 48, 249–255 (1973)
W.N. Venables, Exegeses on linear models, in Proceedings of the S-PLUS Users
Conference, Washington, DC (1998). URL: http://www.stats.ox.ac.uk/
pub/MASS3/Exegeses.pdf
W.N. Venables, B.D. Ripley, Modern Applied Statistics with S-PLUS, 2nd edn.
(Springer, New York, 1997)
H. Wainer, Graphical Tales of Fate and Deception from Napoleon Bonaparte to
Ross Perot (Copernicus Books, New York, 1997)
W.W.S. Wei, Time Series Analysis, Univariate and Multivariate Methods (AddisonWesley, Reading, 1990)
A.M. Weindling, F.M. Bamford, R.A. Whittall, Health of juvenile delinquents. Br.
Med. J. 292, 447 (1986)
S. Weisberg, Applied Linear Regression, 2nd edn. (Wiley, New York, 1985)
I. Westbrooke, Simpson’s paradox: an example in a New Zealand survey of jury
composition. Chance 11, 40–42 (1998)
P.H. Westfall, D. Rom, Bootstrap step-down testing with multivariate location shift
data. Unpublished (1990)
H. Wickham, Ggplot2: Elegant Graphics for Data Analysis (Springer, New York,
2009). URL: http://had.co.nz/ggplot2/book
Wikipedia, Integrated Development Environment (2015). URL: http://en.
wikipedia.org/wiki/Integrated_development_environment
L. Wilkinson, The Grammar of Graphics (Springer, New York, 1999)

References

885

A.F. Williams, Teenage passengers in motor vehicle crashes: a summary of current research. Technical report (Insurance Institute for Highway Safety, Arlington, 2001)
E.J. Williams, Regression Analysis (Wiley, New York, 1959)
N. Woods, P. Fletcher, A. Hughes, Statistics in Language Studies (Cambridge University Press, Cambridge, 1986)
World Almanac and Book of Facts, World Almanac and Book of Facts, 2002 edn.
(World Almanac Books, New York, 2001)
E.L. Wynder, A. Naravvette, G.E. Arostegui, J.L. Llambes, Study of environmental
factors in cancer of the respiratory tract in Cuba. J. Natl. Cancer Inst. 20, 665–673
(1958)
Y. Xie (2015). URL: http://cran.r-project.org/web/packages/knitr/
knitr.pdf
F. Yates, The analysis of multiple classifications with unequal numbers in the different classes. J. Am. Stat. Assoc. 29, 51–56 (1934)
F. Yates, The Design and Analysis of Factorial Experiments (Imperial Bureau of
Soil Science, Harpenden, 1937)

Index of Datasets

A
abc 456
abrasion 310
acacia 573
animal 423
anneal 535
apple 502
B
balance 587
barleyp 421
batch 177
bean 535
blood 191
blyth 548
breast 260
budworm 605
byss 628
C
c3c4 232
catalystm 167, 189, 225
cc176 427
cereals 83
chimp 536
circuit 495
concord 110, 373
crash 524
crime 573
D
darwin 591
diamond 261
display 377
distress 592

draft70mn 83, 108, 191
drunk 539
E
eggs 472
elnino 643
employM16 692
esr 627
F
fabricwear 325, 344
fat 237, 263, 264, 285,
470
feed 418
filmcoat 441
filter 472
furnace 424
G
girlht 261
gunload 448
H
har1 164, 582, 583, 586
har2 165
hardness 110, 277, 373
heartvalve 472
hooppine 424
hospital 373
hotdog 332
houseprice 273, 274
hpErie 312, 343, 375
htwt 316

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

887

888

Index of Datasets

I
icu 625, 627
income 261
intubate 575
ironpot 424
J
jury
K
kidney

575

373

L
lake 260
leukemia 627
lifeins 374
longley 287
lymph 610
M
manhours 313
market 476
mice 192
mileage 374
mortality 573
mpg 424
muscle 260
N
njgolf 88
notch 191
O
oats 482
operator 189
oral 575
ozone 688
P
patient 190
political 575
potency 191
pox 591
product 674
psycho 628
pulmonary 229
pulse 189, 590

R
radioact 536
rent 345
retard 422
rhiz.alfalfa 421
rhiz.clover 421, 468
S
salary 83
salinity 189
salk 563
seeding 591
selfexam 573
shipment 313
sickle 190
skateslc 422
spacshu 595, 598
spindle 476
sprint 313
surface 477
T
tablet1 191, 592
teachers 140
testing 421
tires 437, 474
tongue 625
tser.mystery.X 662
tser.mystery.Y 666
tser.mystery.Z 670
tsq 684
turkey 178, 179, 207, 450, 480
tv 39, 93, 708
U
usair 109, 306
uscrime 109, 313
V
vocab 130, 578, 579
vulcan 473
W
washday 475
water 110, 343
weightloss 201
weld 474
wheat 536
wool 472
workstation 397

Index

Symbols
Φ 51
Φ 70
α 60, 63
β 63
∩ 29–30
χ2 38
∪ 29–30
 285
ε 632
η 37
μ 36, 44
⊥
X 38
ρ 44
P (uppercase ρ) 45
σ 36, 44
σ2 36, 268
t distribution 816
X + 268
= 271
˜ 271
A
absolute-sum-2 scaling 187, 222
accuracy 763
ACF see autocorrelation
acid phosphatase 610
ACM ix
added residual plots 305
added variable plots 301–304, 306, 356,
359
AEdotplot 120
Agresti 126
AIC see Akaike information criterion
Akaike information criterion 298, 309, 639
Akaike, Hirotugu 298

algebra 272, 775
algorithm 785
alias 479, 481, 494
alignment 15, 708, 837
time series 633
alternative hypothesis) 203
analysis of covariance see ANCOVA
analysis of deviance table 605
analysis of variance see ANOVA, 171
analysis of variance table see ANOVA table
analysis with concomitant variables see
ANCOVA
ANCOVA 113, 330–343, 428, 429, 434, 479,
501–512, 516, 520, 521
ancova 503
ANCOVA dummy variables 538
ANCOVA plot 353, 355, 428, 433, 513
ANOVA 167–193, 332, 377–425
computation 425
ANOVA table 168–172, 240, 347, 349, 351,
352, 354, 355, 386
antilogit 593, 594
ARIMA 632, 635
arithmetic 272
ARMA 636
ASCII format 17, 21
aspect ratio 841
asymmetry 104
autocorrelation 636
autoregression (AR) 633
B
backshift operator 632
backslash 705, 706
barchart 567
barplot 525, 532

© Springer Science+Business Media New York 2015
R.M. Heiberger, B. Holland, Statistical Analysis and Data Display,
Springer Texts in Statistics, DOI 10.1007/978-1-4939-2122-5

889

890
baseline 570
batch 834
Bell Labs ix
bell-shaped curve 49
Bernoulli distribution 604
beta curve 64, 65, 69, 817
beta distribution 808
bias 59, 79
binary format 757
binomial distribution 48, 578–579, 822
binomial test 579
binomial, normal approximation 48
bitmap graphics 841
bivariate discrete distribution 539
bivariate normal distribution 45, 46
block 387
blocking factor 387, 388
bmp 841
Bonferroni inequality 200
Bonferroni method 200–201
Box, George E. P. 632
Box–Cox transformations 102, 472, 530
Box–Jenkins method 632
boxplot 42–44, 114, 333, 401, 449, 478
with control of position 114
Brown–Forsyth test 189–190
browser 712
build a package 751, 752
bunching 305
bwplot 437
byssinosis 628
C
Caffo 126
calculus 780, 781
cancellation 762, 763
cancer 609
carryover effect 497
Cartesian product 111–114, 121, 385
case 345, 558
case-control study 558
categorical variable 539
Cauchy distribution 809
cell 383, 539
cell means 383
Central Limit Theorem 54
central limit theorem 55, 56
Chambers, John M. ix
changeover design 497
chi-square analysis 539–545
chi-square distribution 543, 809
Cholesky Factorization 793
class variable 13

Index
cluster random sampling 77
Cochran option 135
Cochran, William G. 562
code 20
coded data 421
coding 13, 207, 315–316, 321, 325, 342,
421
coefficient of determination 241, 242, 244,
256, 257, 298, 309
cohort study 559
collinearity 287–292, 373
color choice 113
color deficient vision 107
color palette 570
color vision 107
ColorBrewer 113
column space 324
combinations 805
command file 834
command language ix
common scaling 621
comparison value 527–529
computational precision 753
computing note 203
concomitant variable 330, 331
concomitant variables, analysis of see
ANCOVA
conditional probability 30
conditional sums of squares 466
conditional tests 464–470
confidence bands 251, 254, 259
logistic regression 598
confidence coefficient 60
confidence interval 58–63
matched pairs of means 139–141
one-sided 61
population mean 123, 129–130
population proportion 126–127
population variance 131
two means 134–136
two proportions 133–134
variance ratio 138
conflicts 712
conflicts, names 712
confounding 479–495, 559
conservative 200
consistency 60
console 704
contingency table 8
contingency tables 539–575
continuity 49
contrast matrix 224, 322
contrast vector 186

Index
contrasts 179, 183–188, 222, 223, 225, 315,
316, 322–327, 419, 420, 451, 622
arbitrary 222–224
orthogonal 181–182, 224–228, 489, 490
controls 558
Cook’s distance 360, 362, 366–368, 372
Cook, R. Dennis 362, 366
correction for continuity 49
correlation 44–47
correlation coefficient 243
Courier 837
court system 66
covariance 44–47
covariance adjustment 331, 336, 338, 511
covariance matrix 44, 45
covariance, analysis of see ANCOVA
covariate 330–332, 335, 336, 343
C p 297–299, 309
C p plot 299, 310
CRAN 749
cross-product ratio 552
crossing 272, 425, 451, 453, 482
crossover design 479, 497–501
cumulative distribution 34, 35
D
Daniel, Cuthbert 297
data
categorical 13
continuous 14
count 13
discrete 14
importing 16
interval 13, 14, 578
missing 17, 21, 317
multivariate 14
ordered 13, 14, 578, 590
ratio 13, 14, 578
rearrangement 18
types of 13
data display 2, 239
data snooping 173
database management system 17
datasets 16
datasets for HH 19–20
debugger 712
debugging 712
dedication v
degrees of freedom 52, 277
deleted predicted value 360
deleted regression coefficients 360
deleted standard deviation 360, 364
deviance 605
DFBETAS 360

891
DFBETAS 362, 369–370
DFFITS 360
DFFITS 362, 367–368
diagnostic
logistic regression 619
diagnostic plot 528, 529
time series 646, 678
diagnostics
time series 638, 652
dichotomous 54, 273, 593, 623, 627
dichotomous response variable 622
differencing 635
direct effect 497
directory structure 834
disastrous cancellation 762, 763
discrete distribution 539
discrete uniform distribution 822
discretization 327
distribution see the name of the particular
distribution
diverging stacked barchart 115, 567
dotplot 522
dummy variable 315–321, 325, 336, 343,
456, 461–465, 534, 537
dummy variables 224
Dunnett procedure 201–206, 513–515
E
ECDF 256
ecological correlation 87, 441, 552
editors 831
efficiency 71, 385, 504
eigenvalues 796
elementary operations 783
Emacs 709, 851–861
buffer 853
shell mode 853
EmacsRot 855
Empirical CDF 256
empirical cumulative distribution plot 257,
258
EMS see expected mean squares, 488
epsilon, machine 754
error see residuals
ESS 851–861
estimate 57
estimation 55–62, 638
estimator 57
point 58–60
exact test 580
Excel, Microsoft 17, 869–871
expectation 36–37
expectation of sum 37, 269
expected mean squares 393

892
expected mean squares 175–176, 192, 386,
393, 448, 450, 451, 488
experimental units 481, 492
exponential distribution 810
externally standardized residuals see
Studentized deleted residuals
extra sum of squares 276
F
F 35
F-distribution 811
F-test 134, 139, 168, 171, 175–177, 186,
188, 240, 242, 590
factor 13, 167, 539
factorial 804
family (of related inferences) 199
familywise error rate 172, 199, 200
FAQ 7.22 707
FAQ 7.31 15, 753
figure 20
find 712
firewall 702
Fisher’s exact test 545–548, 564, 573
Fisher, Ronald A. 545
fitting constants 466, 467
fixed effects 167, 169, 388
fixed factor 169
fixed-width font 840
floating-point arithmetic 753
folding 710, 839
font 708, 837
forecasting 641, 661
foreign 17
formatting 840
forward slash 705, 706
fractional factorial design 479–481, 492–496
fractional replicate 481, 492
full model 274
FWE see familywise error rate
G
Gaussian elimination 268
generalized inverse 268, 801
generalized linear model 595, 604
geometric distribution 823
geometry 270
geometry of matrices 795
gif 841
glm see generalized linear model
Gnu Public License 699
GOF see goodness-of-fit
goodness-of-fit test 148–153, 158–160, 544,
578
portmanteau 639

Index
GPL 699
Gram–Schmidt algorithm 789
grand mean 384
granularity 305
graph 401
graphical design 2, 9, 18, 116–121, 213,
620–622
ACF plot 695
ANCOVA plot 119, 332, 341, 342
ARIMA-trellis plot 119
barplot 525
boxplot 401, 456, 478
common scaling 621
construction 695
interaction plot 119, 385
logistic regression plot 117
MMC plot 119, 212–231
odds-ratio CI plot 121, 557, 558
ODOFFNA plot 119, 529, 530
regression diagnostic plots 350, 354, 371,
372
seasonal time series 650
squared residual plot 120
time series 642–645, 692–696
time series plot 694
graphical display
logistic regression 596–599
graphical user interface 116
graphics 841
GUI see graphical user interface
H
Haenszel, William 559
hat matrix 268–270, 363, 375
Heiberger, Mary Morris v
Helmert contrasts 323
hexadecimal notation 756, 871
HH 16, 715
HH package 705
hhcapture 20
hhpdf 20
HHscriptnames 706, 715, 716
HHscriptnames 20
hierarchical factorial relationship 397
high leverage point 269
higher way designs 427–477
histogram 39–40
Hochberg procedure 201
Holland, Andrew v
Holland, Ben v
Holland, Irene v
Holland, Margaret v
homogeneity of variance 114
hov see variance, homogeneity of

Index

893

Hsu, Jason 217
hypergeometric distribution 48, 545–547,
824
hypothesis test
matched pairs of means 139–141
one-sided 68
population mean 124–126
population proportion 127–129
population variance 131
two means 135–136
two variances 138
two-sided 68
hypothesis testing 62–67, 73, 74
I
identification 637
ill-conditioned data 287
imputation 17–18
indentation 709
independence 30, 33, 35, 542–544
indicator variable see dummy variable
inductive inference 2
inexplicable error messages 712
inexplicable messages 712
influence 350, 351, 367, 372
install 749
install.packages 699
integer scaling 188
interaction 5, 181, 330, 350, 378–385
models without 417–420
interaction plot 379, 385, 394, 409, 410, 420,
422–424, 430, 432, 444, 473, 474, 487,
495, 499, 526, 530, 531
internally standardized residuals see
standardized residuals
internet 702
intersection see ∩
isomeans grid 219
J
Jenkins, Gwilym M.
jitter 598
judicial system 66

632

K
Kolmogorov–Smirnov test 148
Kruskal–Wallis test 590–591
L
l’Hôpital’s rule 104
ladder of powers 104, 530, 532
ladder-of-powers plot 114
lag 632

language
command ix
LATEX 837, 863
LATEX 20
Latin square design 435–440, 474, 481, 492,
497, 498, 500, 504
lattice 517
lattice 707
latticeExtra 115
latticeExtra x, 156, 717
latticeExtra 118
LD50 605
least squares 236, 238, 239, 267, 314
least-squares geometry 263
least-squares plane 264
level 13, 167
leverage 250, 269, 270, 360, 362–364, 366
library, personal 700, 860
likelihood ratio 161
likelihood ratio test 162
likert 567
Likert scale 567
line width 839
linear dependence 322
linear equations 803
linear Identity 193, 194
linear identity 248
linear independence 786
linearly independent 316
link 593
link function 594, 603, 604
Linux 700
lmatPairwise 225
load 749
logistic distribution 813
logistic regression 114, 593–629
logistic regression plot 609
logit 593, 601, 813
logit scale 601
lognormal distribution 812
LogXact 624
longitudinal study 516
Loops 770
lung 628
lymph nodes 609
M
machine epsilon 760
machine epsilon 754
Macintosh 700
MAD 1, 190
main effect 384, 480
Mallows, Colin 297, 298

894
Mann–Whitney test 586–590
Mantel, Nathan 559
Mantel–Haenszel test 559–565, 575
marginal means 383
marginal panels 113
marginality 466
margins 839
matrix
geometry 795
matrix algebra 783
matrix factorization 788
maximum likelihood 161, 236
maximum likelihood estimation 161–162
mean polish 528
mean square
error see mean square, residual
residual 242–244, 247, 248
mean square, residual 297
mean–mean display see MMC plot
median 37, 60
median polish 528
method of fitting constants 466
microplot 430, 431
microplots 120
Microsoft 867
Minitab 773
minus sign 838, 847
missing data 17, 21, see data, missing
missing values 17, 21
mixed model 393–394
MMC 431
MMC plot 175, 187, 206, 210–231, 381,
405, 408, 412–414, 435, 440, 443, 515
MMC, split plot 486, 489, 491
model 57
model formula 271
model specification 271–272, 450, 451,
456–462, 622
mono-width font 840
monowidth font 837
Moore–Penrose generalized inverse 268,
801
mosaic plot 114
moving average (MA) 634
mpfr 756
MSE see mean square, residual
multicollinearity see collinearity
multinomial distribution 828
multiple comparison procedures 172–175,
199–232, 513
multiple precision floating point 756
multiplicity 200, 207–208
multivariate distribution 44, 45
multivariate normal distribution 46, 783, 829

Index
N
NA 21
name conflict 712
name conflicts 712
negative binomial distribution 825
Nelder, John A. 466
nested factorial experiment 448–453
nesting 272, 397–398, 411, 412, 448–453
new observation 247–252
New Zealand 572
Newton’s method 782
NID 169, 235, 356, 383
no-intercept models 261, 281
nominal variable 13
nonadditivity 527
noncentral 71, 74
noncentral chi-square distribution 818, 819
noncentral distribution 817–818
noncentral F distribution 818, 820
noncentral t distribution 817–819
noncentrality parameter 817, 818
nonparametric methods 577–591
nonparametric procedures 169
normal approximation to the binomial 48, 49
normal distribution 49–50, 52, 53, 55, 56,
59, 62–64, 69, 70, 72, 74, 144, 814
normal equations 239, 268
normal probability plot 152–157, 255, 357,
358
normalized scaling 187
notch 43
numerical stability 763
O
O(n) 785
O.C. curve see operating characteristic curve
Object Oriented Programming 196, 197
odds 552–557, 593, 601
odds ratio 552–557, 559, 575, 624
odds scale 601
ODOFFNA 119
odoffna see one degree of freedom for
nonadditivity
one degree of freedom for nonadditivity
524–536
online files 16
OOP 196, 197
operating characteristic curve 63–65, 69, 73,
74, 817
operator symbols 272
optimization 781
order statistics 37
ordered categorical scale 567
ordered factor 326

Index
orientation 478
origin, regression through 261, 281
orthogonal basis set 406
orthogonal basis set 408, 413, 788
orthogonal contrasts see contrasts,
orthogonal, 405, 408, 412, 413
orthogonal matrix 783
orthogonal polynomials 325–330, 342, 793
orthogonal transformation 787
OSX 700
outlier 365, 577–579, 581, 582, 587
P
p-value 67
PACF see autocorrelation
package 749
paired t-test 136, 581
pairwise 225
parameter 2
parameterization 323
parsimony 287, 292
partial F-tests 274–277
partial correlation 303
partial residual plots 301–306, 357, 359
partial residuals 301, 303, 305
pdf 841
permutations 804
personal library 700, 860
Peruggia, Mario 217
placebo 139, 558, 559
plots 481
p.m.f. see probability mass function
png 841
point cloud 264, 270
point estimator 58
Poisson distribution 165, 825
polynomial 325
polynomial contrasts 323, 325–330, 344,
489
polynomial function 632
polynomial model 277–281, 292
pooling 134, 417
population 2
population pyramid 573
portmanteau goodness-of-fit test 639
position 114
power 63, 142, 176, 200, 577, 818
power curve 63–65, 69, 73, 74, 817
power transformations 102
powers, ladder of 104
precision 15–16, 60, 328, 756, 758,
763
precision, computational 753
prediction 285–286

895
prediction bands 251, 254, 259
logistic regression 597, 598, 601
prediction interval 250–253, 260, 285–286
predictor matrix 270
predictor variable 263, 264, 273, 315, 316,
327
Preface vii
presentation of results 847
principle of marginality 466
probability 29–31
probability distribution see the name of the
particular distribution
probability distributions 30–54
probability mass function 33
probability scale 601
probit regression 595
programming style 845
projection matrix 268, 794
proportion 624
proportional font 840
prospective study 558–559
proxy 702
ps 841
psychometric scale 567
Q
QR decomposition 508, 788
quadratic form 787
quadratic identity 193, 194, 248
quadratic model 278–280
quantile plot 152–157, 164
quartiles 42, 43
questionnaire 567
quotation marks 839
R
R 2, 112, 699
R Development Core Team x
R language 708
R2 see coefficient of determination
adjusted 241, 242, 298, 309
r-f spread plot 256
random effcts 383
random effects 167, 173, 175, 382, 386, 388,
393–394, 448
random factor 173
random sample 2
random sampling 75
random shock 632
random variable 29–38
random vector 44, 783
randomization 75
randomization test 580

896

Index

randomized complete block design 388–390,
504
rank 14, 578, 582–591, 787
raster graphics 841
rating scale 567
RCBD see randomized complete block
design
Rcmdr 701, 719
recover 712
reduced model 274
regression analysis
diagnostics 345–375
multiple linear regression 263–310
simple linear regression 235–260
using dummy variables 315–341
regression coefficients 236, 238, 241, 243,
247, 249, 250, 266–268, 275, 311, 315,
321, 324, 325, 347, 349, 351
regression diagnostics 117, 254–257
relative risk 552–557
repeated measures design 497
reshape data 18
residual effect 497
residual effects design 497–501
residual mean square see mean square,
residual
residual plot 197, 254–256, 353, 356–357,
371, 372, 848
residual sum of squares see sum of squares,
residual
residuals 238, 247, 268
Resolution 493
resolution 841
response surface 421
response surface methodology 489
retrospective study 558–559, 573
RExcel 17, 700, 701, 733
rhizobium 400
risk factor 558
Rmpfr 756
root mean square error 240
round to even 758
rounding 15–16, 758
Rtools 702
r.v. see random variable
S
S+ x
S-Plus x, 112
sample 2
sample proportion 593
models 623
sample size 63
sample size determination

142–148

sampling 74–78
sampling distribution 54
SAS 372, 773
Satterthwaite option 136
scaled deviation 543
scaling 222
scatterplot 88–89, 114
scatterplot matrix 89–92, 95–100, 108,
116–117, 237, 259, 270, 273, 274, 276,
293, 307, 308, 317, 346, 348, 350, 356,
385, 518, 519, 609, 615, 617, 618
Scheffé procedure 206–209, 211
screenshots 702, 703, 711
script file 706, 834
s.d. see standard deviation
seasonal models 648–649
semantics 272
sequential sums of squares 466
sequential tests 464–470
setInternet2 702
Shapiro–Wilk test 154, 357
shiny 9, 45, 47, 50, 63, 65, 73, 146, 699,
743–747
sign test 578–582
signed rank distribution 826
significant 67
significant digits 763
simple effects 384, 410–416, 441–447, 474
simple random sampling 75
Simpson’s paradox 441, 548–552, 575
simultaneous confidence intervals 172–173
singular value decomposition 797
skewness 38–39
slash 705, 706
small multiples 114
space shuttle 595
spelling 833
split plot design 479, 481–490
splom see scatterplot matrix
spreadsheet 867
sprintf 756
Minitab 773
squared residual plot 239, 265
stacked barchart 567
standard deviation 36
standard error 54, 240, 245, 251
standard error of estimate 240, 269
standard error of sample mean 78–79
standardized residuals 360, 365–366
Stangle 20, 835
start value 102
stationarity 631
statistic 2

Index
statistical model 57, 169, 267, 311, 382–383,
448, 481
statistically significant 67
statistics 2
statistics profession 3
stem-and-leaf display 41, 317, 318, 579, 581
stepwise regression 292, 293, 297–298
all subsets 297
backward elimination 297
forward selection 297
stochastic process 632
stratified random sampling 76–77
strip label 92
Student’s t distribution 50–52, 816, 817
Studentized deleted residuals 360, 362,
365–367
Studentized Range Distribution 396
Studentized range distribution 172, 396,
815
subplot 481
sufficiency 60
sum contrasts 323, 456–463
sum of random variables 37, 46
sum of squares 268
regression 241, 242
residual 240–242, 244
total 241, 242, 244
sum, expectation of 37
sums of squares 469
surveys 5
Sweave 20, 835
SWord 868
symmetry 98, 100
syntax 272
syntax highlighting 832
systematic random sampling 78
T
t distribution 148
t distribution 50–52
t-test 140, 141, 253, 254, 581,
586
text editors 852, 867
tif 841
time series analysis 631–697
Times Roman 837
total sum of squares see sum of squares,
total
trace 712
trace factor 385
transcript 20
transformation 39, 100–106, 169, 306, 356,
577, 593
treatment 387

897
treatment combinations 383, 492
trellis 707
Trellis paradigm 111–113
Tukey procedure 172–173, 201, 211–212,
216, 338, 378, 380, 381, 395, 403–405,
408, 414, 431, 435, 437–440, 443
Tukey, John 38, 42, 524
Type I error 63–69
Type I Sum of Squares 272, 466–471
Type II error 63–66, 69, 817, 818
Type II Sum of Squares 466–467
Type III Sum of Squares 466
Type III Sum of Squares 380, 382, 466–471
Type IV Sum of Squares 467
typography 837
U
unbalanced design 468
unbalanced sampling 398
unbiased 59
Unicode 833
uniform distribution 816
union see ∪
V
values
missing 17, 21
variable selection 292–301
variance 36–37, 60, 870, 871
accuracy 763
homogeneity of 169, 188–190
variance function 603
variance inflation factor 291–293, 373
variance of median 60
variance stabilization 101, 530
variance, analysis of see ANOVA
variance-covariance matrix see covariance
matrix
vector graphics 841
Venables, William N. 466
VIF see variance inflation factor
vision, color 107
W
weibull distribution 817
weighted squares of means 466, 467
Welch two-sample t-test 136
whole plot 479, 481
whole plot error 482
Wilcoxon distribution 827
Wilcoxon signed-ranks test 582–586
Wilcoxon, Frank 582, 586
Wilk–Shapiro test 154, 357

898
Windows 700, 701, 733, 860
WindowsPath 706
wmf 841
word processing 840
word processing software 710
word processor 867
Word, Microsoft 867, 868
working style 831
write a function 751, 752
writing style 837, 844, 847

Index
X
X + 268
x-factor 385
XLConnect 17, 703
xyplot 117
Y
Yates, Frank
Z
zapsmall

466

760



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.4
Linearized                      : Yes
Language                        : EN
XMP Toolkit                     : Adobe XMP Core 5.2-c001 63.143651, 2012/04/05-09:01:49
Creator Tool                    : LaTeX with hyperref package + hypdvips
Modify Date                     : 2015:12:19 08:49:39+05:30
Create Date                     : 2015:12:16 16:25:03+05:30
Metadata Date                   : 2015:12:19 08:49:39+05:30
Producer                        : Acrobat Distiller 9.0.0 (Windows)
Keywords                        : 
Format                          : application/pdf
Description                     : 
Title                           : 
Creator                         : 
Document ID                     : uuid:1e98c55d-846e-492c-b23b-e4073a615664
Instance ID                     : uuid:b877d85b-cad1-49e6-897b-56251c7749ae
Rendition Class                 : default
Version ID                      : 1
History Action                  : converted, converted, converted
History Instance ID             : uuid:b58faa74-72de-4475-8ff7-3c0831922e79, uuid:35d9e18f-735b-433b-a4c1-2133891c7a71, uuid:f6d54ebf-a890-4cbc-9037-9303081692cf
History Parameters              : converted to PDF/A-1b, converted to PDF/A-1b, converted to PDF/A-1b
History Software Agent          : pdfToolbox, pdfToolbox, pdfToolbox
History When                    : 2015:12:16 16:30:03+05:30, 2015:12:19 02:51:58+05:30, 2015:12:19 08:49:39+05:30
Part                            : 1
Conformance                     : B
Schemas Namespace URI           : http://ns.adobe.com/pdf/1.3/
Schemas Prefix                  : pdf
Schemas Schema                  : Adobe PDF Schema
Schemas Property Category       : internal
Schemas Property Description    : A name object indicating whether the document has been modified to include trapping information
Schemas Property Name           : Trapped
Schemas Property Value Type     : Text
Page Layout                     : SinglePage
Page Mode                       : UseOutlines
Page Count                      : 188
Author                          : 
Subject                         : 
EXIF Metadata provided by EXIF.tools

Navigation menu