A Visual Guide to Stata Graphics

The electronic form of this book is solely for direct use at UCLA and only by faculty, students, and staff of UCLA.
All rights reserved on the copyright page apply to this document and specifically neither the electronic nor
published form of the book may be distributed or reproduced, either electronically or in printed form.

A Visual Guide to Stata Graphics

MICHAEL N. MITCHELL
University of California, Los Angeles

A Stata Press Publication
StataCorp LP
College Station, Texas

Stata Press, 4905 Lakeway Drive, College Station, Texas 77845

Copyright © 2004 by StataCorp LP
All rights reserved
Typeset in LaTeX 2ε
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
ISBN 1-881228-85-1
This book is protected by copyright. All rights are reserved. No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means—electronic,
mechanical, photocopying, recording, or otherwise—without the prior written permission of
StataCorp LP.
Stata is a registered trademark of StataCorp LP. LaTeX 2ε is a trademark of the American Mathematical Society.

Dedication
I would like to dedicate this book to Paul Hoffman. Although he was my supervisor for
the last nine years, it always felt much more like he was a trusted friend always there to
help me do the best work that I could. I am so sorry he had to leave us so soon. In my own
way, I hope that I can give to others the same kinds of things he gave to me. I am really
going to miss you, Paul.

Acknowledgments
Although there is a single name on the cover of this book, many people have helped to
make this book possible. Without them, this book would have remained a dream, and I
could have never shared it with you. I want to thank those people who helped that dream
become the book you are now holding.
I want to thank the warm people at Stata, who were very generous in their assistance
and who always find a way to be friendly and helpful. In particular, I wish to thank Vince
Wiggins for his generosity of time, insightful advice, boundless enthusiasm, and commitment
to help make this book the best that it could be. I am very grateful to Jeff Pitblado, who
created the LaTeX tools that made the layout of this book possible. Without the benefit of
his time and talent, I would still be learning LaTeX instead of writing these acknowledgments.
Also, I would like to thank the Stata technical support team, especially Derek Wagner, for
patiently working with me on my numerous questions. I am also very grateful to John
Williams for his thoroughness and alacrity in editing the book and to Chinh Nguyen for his
creative and clever cover design.
I also want to thank, in alphabetical order, Xiao Chen, Phil Ender, Frauke Kreuter, and
Christine Wells for their support and suggestions.
Last, and certainly not least, I would like to thank the teachers who have added to my
life in very special ways. I have been very fortunate to have been touched by many special
teachers, and I will always be grateful for what they kindly gave to me. I want to thank
(in order of appearance) Larry Grossman, Fred Perske, Rosemary Sheridan, Donald Butler,
Jim Torcivia, Richard O’Connell, Linda Fidell, and Jim Sidanius. These teachers all left me
gifts of knowledge and life lessons that help me every day. Even if they do not all remember
me, I will always remember them.

Contents

Dedication . . . v
Acknowledgments . . . vii
Preface . . . xiii

1 Introduction . . . 1
  1.1 Using this book . . . 1
  1.2 Types of Stata graphs . . . 4
  1.3 Schemes . . . 14
  1.4 Options . . . 20
  1.5 Building graphs . . . 29

2 Twoway graphs . . . 35
  2.1 Scatterplots . . . 35
  2.2 Regression fits and splines . . . 49
  2.3 Regression confidence interval (CI) fits . . . 50
  2.4 Line plots . . . 54
  2.5 Area plots . . . 61
  2.6 Bar plots . . . 62
  2.7 Range plots . . . 64
  2.8 Distribution plots . . . 74
  2.9 Options . . . 82
  2.10 Overlaying plots . . . 87

3 Scatterplot matrix graphs . . . 95
  3.1 Marker options . . . 95
  3.2 Controlling axes . . . 98
  3.3 Matrix options . . . 102
  3.4 Graphing by groups . . . 103

4 Bar graphs . . . 107
  4.1 Y-variables . . . 107
  4.2 Graphing bars over groups . . . 111
  4.3 Options for groups, over options . . . 117
  4.4 Controlling the categorical axis . . . 123
  4.5 Controlling legends . . . 130
  4.6 Controlling the y-axis . . . 143
  4.7 Changing the look of bars, lookofbar options . . . 147
  4.8 Graphing by groups . . . 151

5 Box plots . . . 157
  5.1 Specifying variables and groups, yvars and over . . . 157
  5.2 Options for groups, over options . . . 163
  5.3 Controlling the categorical axis . . . 168
  5.4 Controlling legends . . . 174
  5.5 Controlling the y-axis . . . 179
  5.6 Changing the look of boxes, boxlook options . . . 183
  5.7 Graphing by groups . . . 189

6 Dot plots . . . 193
  6.1 Specifying variables and groups, yvars and over . . . 193
  6.2 Options for groups, over options . . . 198
  6.3 Controlling the categorical axis . . . 202
  6.4 Controlling legends . . . 205
  6.5 Controlling the y-axis . . . 207
  6.6 Changing the look of dot rulers, dotlook options . . . 210
  6.7 Graphing by groups . . . 214

7 Pie graphs . . . 217
  7.1 Types of pie graphs . . . 217
  7.2 Sorting pie slices . . . 219
  7.3 Changing the look of pie slices, colors, and exploding . . . 221
  7.4 Slice labels . . . 224
  7.5 Controlling legends . . . 228
  7.6 Graphing by groups . . . 232

8 Options available for most graphs . . . 235
  8.1 Changing the look of markers . . . 235
  8.2 Creating and controlling marker labels . . . 247
  8.3 Connecting points and markers . . . 250
  8.4 Setting and controlling axis titles . . . 254
  8.5 Setting and controlling axis labels . . . 256
  8.6 Controlling axis scales . . . 265
  8.7 Selecting an axis . . . 269
  8.8 Graphing by groups . . . 272
  8.9 Controlling legends . . . 287
  8.10 Adding text to markers and positions . . . 299
  8.11 More options for text and textboxes . . . 303

9 Standard options available for all graphs . . . 313
  9.1 Creating and controlling titles . . . 313
  9.2 Using schemes to control the look of graphs . . . 318
  9.3 Sizing graphs and their elements . . . 322
  9.4 Changing the look of graph regions . . . 324

10 Styles for changing the look of graphs . . . 327
  10.1 Angles . . . 327
  10.2 Colors . . . 328
  10.3 Clock position . . . 330
  10.4 Compass direction . . . 331
  10.5 Connecting points . . . 332
  10.6 Line patterns . . . 336
  10.7 Line width . . . 337
  10.8 Margins . . . 338
  10.9 Marker size . . . 340
  10.10 Orientation . . . 341
  10.11 Marker symbols . . . 342
  10.12 Text size . . . 344

11 Appendix . . . 345
  11.1 Overview of statistical graph commands, stat graphs . . . 345
  11.2 Common options for statistical graphs, stat graph options . . . 352
  11.3 Saving and combining graphs, save/redisplay/combine . . . 358
  11.4 Putting it all together, more examples . . . 366
  11.5 Common mistakes . . . 376
  11.6 Customizing schemes . . . 379
  11.7 Online supplements . . . 382

Subject index . . . 383

Preface
It is obvious to say that graphics are a visual medium for communication. This book
takes a visual approach to help you learn about how to use Stata graphics. While you can
read this book in a linear fashion or use the table of contents to find what you are seeking, it
is designed to be “thumbed through” and visually scanned. For example, the right margin
of each right page has what I call a Visual Table of Contents to guide you through the
chapters and sections of the book. Generally, each page has three graphs on it, allowing
you to see and compare as many as six graphs at a time on facing pages. For a given graph,
you can see the command that produced it, and next to each graph is some commentary.
But don’t feel compelled to read the commentary; often, it may be sufficient just to see the
graph and the command that made it.
This is an informal book and is written in an informal style. As I write this, I picture
myself sitting at the computer with you, and I am showing you examples that illustrate
how to use Stata graphics. The comments are written very much as if we were sitting down
together and I had a couple of points to make about the graph that I thought you might
find useful. Sometimes, the comments might seem obvious, but since I am not there to hear
your questions, I hope it is comforting to have the obvious stated just in case there was a
bit of doubt.
While this book does not spend much time discussing the syntax of the graph commands
(since you will be able to infer the rules for yourself after seeing a number of examples),
the Intro : Options (20) section discusses some of the unique ways that options are used in
Stata graph commands and compares them to the way that options are used in other Stata
commands.
I strived to find a balance to make this book comprehensive but not overwhelming. As
a result, I have omitted some options I thought would be seldom used. So, just because a
feature is not illustrated in this book, this does not mean that Stata cannot do that task,
and I would refer to [G] graph for more details. I try to include frequent cross-references
to [G] graph; for example, see also [G] axis_options. I view this book as a complement to
the Stata Graphics Reference Manual, and I hope that these cross-references will help you
use these two books in a complementary manner. Note that, whenever you see references to
[G] xyz, you can either find “xyz” in the Stata Graphics Reference Manual or type whelp
xyz within Stata. The manual and the help have the same information, although the help
may be more up to date and allows hyperlinking to related topics.
Each chapter is broken into a number of sections showing different features and options
for the particular kind of graph being discussed in the chapter. The examples illustrate how
these options or features can be used, focusing on examples that isolate these features so you
are not distracted by irrelevant aspects of the Stata command or graph. While this approach
improves the clarity of presentation, it does sacrifice some realism since graphs frequently
have many options used together. To address this, there is a section addressing strategies for
building up more complicated graphs, Intro : Building graphs (29), and a section giving tips
on creating more complicated graphs, Appendix : More examples (366). These sections are
geared to help you see how you can combine options to make more complex and feature-rich
graphs.
While this book is printed in color, this does not mean that it ignores how to create
monochrome (black & white) graphs. Some of the examples are shown using monochrome
graphs illustrating how you can vary colors using multiple shades of gray and how you
can vary other attributes, such as marker symbol and size, line width, and pattern, and so
forth. I have tried to show options that would appeal to those creating color or monochrome
graphs.
The graphs in this book were created using a set of schemes specifically created for this
book. Despite differences in their appearance, all the schemes increase the size of textual
and other elements in the graphs (e.g., titles) to make them more readable, given the small
size of the graphs in this book. You can see more about the schemes in Intro : Schemes
(14) and how to obtain them in Appendix : Online supplements (382). While one purpose of
the different schemes is to aid in your visual enjoyment of the book, they are also used to
illustrate the utility of schemes for setting up the look and default settings for your graphs.
See Appendix : Online supplements (382) for information about how you can obtain these
schemes.
Stata has a number of graph commands for producing special-purpose statistical graphs.
Examples include graphs for examining the distributions of variables (e.g., kdensity, pnorm,
or gladder), regression diagnostic plots (e.g., rvfplot or lvr2plot), survival plots (e.g.,
sts or ltable), time series plots (e.g., ac or pac), and ROC plots (e.g., roctab or lsens). To
cover these graphs in enough detail to add something worthwhile would have expanded the
scope and size of this book and detracted from its utility. Instead, I have included a section,
Appendix : Stat graphs (345), that illustrates a number of these kinds of graphs to help you see
the kinds of graphs these commands create. This is followed by Appendix : Stat graph options
(352), which illustrates how you can customize these kinds of graphs using the options
illustrated in this book.
If I may close on a more personal note, writing this book has been very rewarding and
exciting. While writing, I kept thinking about the kind of book you would want to help you
take full advantage of the powerful, but surprisingly easy to use, features of Stata graphics.
I hope you like it!
Simi Valley, California
February 2004

1 Introduction

This chapter starts off by telling you a little bit about the organization of this book and
giving you tips to help you use it most effectively. The next section gives a brief overview
of the different kinds of Stata graphs we will be examining in this book, followed by an
overview of the different kinds of schemes that will be used for showing the graphs in this
book. The fourth section illustrates the structure of options in Stata graph commands. In
a sense, the second to fourth sections of this chapter are a thumbnail preview of the entire
book, showing the types of graphs covered, how you can control their overall look, and the
general structure of options used within those graphs. By contrast, the final section is about
the process of creating graphs.

1.1 Using this book

I hope that you are eager to start reading this book but will take just a couple of minutes
to read this section to get some suggestions that will make the book more useful to you.
First of all, there are many ways you might read this book, but perhaps I can suggest
some tips:
• Please consider reading this chapter before reading the other chapters, as it provides
key information that will make the rest of the book more understandable.
• While you might read a traditional book cover to cover, this book has been written
so that the chapters stand on their own. You should feel free to dive into any chapter
or section of any chapter.
• Sometimes you might find it useful to visually scan the graphs rather than to read. I
think this is a good way to familiarize yourself with the kinds of features available in
Stata graphs. If a certain feature catches your eye, you can stop and see the command
that made the graph and perhaps even read the text explaining the command.
• Likewise, you might scan a chapter just by looking at the graphs and the part of the
command in red, which is the part of the command we are discussing for that graph.
For example, scanning the chapter on bar charts in this way would quickly familiarize
you with the kinds of features available for bar graphs and show you how to obtain
those features.
As you have probably noticed, the right margin contains what I call the Visual Table
of Contents. I hope you will find it a useful tool for quickly finding the information you
seek. I frequently use the Visual Table of Contents to cross-reference information within
the book. By design, Stata graphs share many features in common. For example, you use
the same kinds of options to control legends across different types of graphs. It would be
repetitive to go into detail about legends for bar charts, box plots, and so on. Within each
kind of graph, legends are briefly described and illustrated, but the details are described in
the Options chapter in the section titled Legend. This is cross-referenced in the book by
saying something like “for more details, see Options : Legend (287)”, which indicates that
you should look to the Visual Table of Contents and thumb to the Options chapter and
then to the Legend section, which begins on page 287.
Sometimes it may take an extra cross-reference to get the information you need. Say
that you want to make the ytitle() large for a bar chart, so you first consult Bar : Y-axis
(143). This gives you some information about using ytitle(), but then that section refers
you to Options : Axis titles (254), where more details about axis titles are described. This
section then refers you to Options : Textboxes (303) for more complete details about options
you can use to control the display of text. That section shows more details but then refers to
Styles : Textsize (344), where all of the possible text sizes are described. I know this sounds
like a lot of jumping around, but I hope that it feels more like drilling down for additional
detail, that you feel you are in control of the level of detail that you want, and that the
Visual Table of Contents eases the process of getting the additional details.
Most pages of this book have three graphs per page, each graph being composed of
the graph itself, the command that produced it, and some descriptive text. An example is
shown below, followed by some points to note.

graph twoway scatter propval100 ownhome, msymbol(Sh)

[Graph: scatterplot of "% homes cost $100K+" (y-axis, 0 to 100) against "% who own home" (x-axis, 40 to 80), with markers drawn as large hollow squares]

In this example, we use the msymbol()
(marker symbol) option to make the
symbols large hollow squares; see
Options : Markers (235) for more details.
Note that the msymbol() option is only
useful for the types of graphs that have
marker symbols, and Stata will ignore
this option if you use it with a
command like the graph twoway
histogram command.
Uses allstates.dta & scheme vg_s2c

• Note that the command itself is displayed in a typewriter font, and the part of
the command we are discussing (i.e., msymbol(Sh)) is shown in red, both in the
command and when referenced in the descriptive text.
• When commands or parts of commands are given in the descriptive text (e.g., graph
twoway histogram), they are displayed in typewriter font.
• Many of the descriptions contain cross-references, for example, Options : Markers (235),
which means to flip to the Options chapter and then to the section Markers. Equivalently, go to page 235.
• The names of some options are shorthand for two or more words that are sometimes
explained; for instance, “we use the msymbol() (marker symbol) option to make . . . ”.
• The descriptive text always concludes by telling you the name of the data file and
scheme used for making the graph. In this case, the data file was allstates.dta, and the
scheme was vg_s2c.scheme. You can read the data file over the Internet by using the
vguse command, a command added to Stata when you install the online supplements;
see Appendix : Online supplements (382). If you are connected to the Internet, and your
Stata is fully up to date, you can simply type vguse allstates to use that file over
the Internet, and you can run the graph command shown to create the graph.
If you want your graphs to look like the ones in the book, you can display them using
the same schemes. See Appendix : Online supplements (382) for information about how to
download the schemes used in this book. Once you have downloaded the schemes, you can
then type the following in the Stata Command window:
. set scheme vg_s2c
. vguse allstates
. graph twoway scatter propval100 ownhome, msymbol(Sh)

After you issue the set scheme vg_s2c command, subsequent graph commands will
show graphs using the vg_s2c scheme. If you prefer, you could add the scheme(vg_s2c)
option to the graph command to specify the scheme used just for that graph; for example,
. graph twoway scatter propval100 ownhome, msymbol(Sh) scheme(vg_s2c)

In general, all commands and options are provided in their complete form. Commands
and options are generally not abbreviated. However, for purposes of typing, you may wish
to use abbreviations. The previous example could have been abbreviated to
. gr tw sc propval100 ownhome, m(Sh)

and even the gr could have been omitted, leaving
. tw sc propval100 ownhome, m(Sh)

The tw could also have been omitted, leaving
. sc propval100 ownhome, m(Sh)

For guidance on appropriate abbreviations, consult [G] graph.
I should note that, while this book is designed for creating graphs in Stata version 8 and
beyond, many of the examples take advantage of numerous enhancements that have been
released as online updates subsequent to the initial version 8 release. As a result, some
features will either look different or may not work at all in Stata 8.0 or 8.1. Therefore, it is
very important that your copy of Stata be fully up to date. Please verify that your copy of
Stata is up to date and obtain any free updates; to do this, enter Stata, type
. update query

and follow the instructions. After the update is complete, you can use the help whatsnew
command to learn about the updates you have just received, as well as prior updates
documenting the evolution of Stata. Because Stata sometimes evolves beyond the printed
manual, you might find that some commands or options are documented via the online help
but not in your manual. For example, graph twoway tsline was released after the printed
manual and, as of the first printing of this book, is only documented via the online help
(help tsline).
What if you are using a newer version of Stata than version 8.2? It is possible that, in
the future, Stata may evolve to make the behavior of some of these commands change. If
this happens, you can use the version command to ask Stata to run the graph commands
as though they were run under version 8.2. For example, if you were running Stata version 9
but wanted a graph command to run as though you were running Stata 8.2, you could type
. version 8.2 : graph twoway scatter propval100 ownhome

and the command would be executed as if you were running version 8.2.

This book has a number of associated online resources. Appendix : Online supplements (382) has more information about these resources and how
to access them. I strongly suggest that you install the online supplements, which make it
easier to run the examples from the book. To install the supplemental programs, schemes,
and help files, just type from within Stata
. net from http://www.stata-press.com/data/vgsg
. net install vgsg

For an overview of what you have installed, type whelp vgsg within Stata. Then, with the
vguse command, you can use any dataset from the book. Likewise, all the custom schemes
used in the book will be installed into your copy of Stata and can be used to display the
graphs, as described earlier in this section.
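
As a rough sketch of the whole workflow, and assuming an Internet connection and that the
supplements install as described above, a first session might look like the following. The
dataset, the vguse command, and the scheme name come from this book; scheme() is a standard
graph option.

```stata
* Install the book's supplements (requires Internet access)
net from http://www.stata-press.com/data/vgsg
net install vgsg

* Load one of the book's datasets and draw a graph with one of its schemes
vguse allstates
twoway scatter propval100 popden, scheme(vg_s2c)
```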

1.2 Types of Stata graphs

Stata has a wide variety of graph types. This section introduces the types of graphs
Stata produces and covers twoway plots (including scatterplots, line plots, fit plots, fit plots
with confidence intervals, area plots, bar plots, range plots, and distribution plots), scatterplot matrices, bar charts, box plots, dot plots, and pie charts. We will start off with a
section showing the variety of twoway plots that can be created with graph twoway. For
this introduction, we have combined them into six families of related plots: scatterplots and
fit plots, line plots, area plots, bar plots, range plots, and distribution plots. We will start
by illustrating scatterplots and fit plots.


graph twoway scatter propval100 popden

Here is a basic scatterplot. The variable propval100 is placed on the y-axis, and popden is
placed on the x-axis. See Twoway : Scatter (35) for more details about these kinds of plots.
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot; y-axis: % homes cost $100K+; x-axis: Pop/10 sq. miles]

twoway scatter propval100 popden

We can start this command with just twoway, and Stata understands that this is shorthand for
graph twoway.
Uses allstates.dta & scheme vg_s2c

[Figure: the same scatterplot; y-axis: % homes cost $100K+; x-axis: Pop/10 sq. miles]

twoway lfit propval100 popden

We can make a linear fit line (lfit) predicting propval100 from popden. See Twoway : Fit (49)
for more information about these kinds of plots.
Uses allstates.dta & scheme vg_s2c

[Figure: linear fit line; y-axis: Fitted values; x-axis: Pop/10 sq. miles]


twoway (scatter propval100 popden) (lfit propval100 popden)

Stata allows us to overlay twoway graphs. In this case, we make a classic plot showing a
scatterplot overlaid with a fit line using the scatter and lfit commands. For more details
about overlaying graphs, see Twoway : Overlaying (87).
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot with overlaid linear fit; x-axis: Pop/10 sq. miles; legend: % homes cost $100K+, Fitted values]

twoway (scatter propval100 popden) (lfit propval100 popden) (qfit propval100 popden)

The ability to combine twoway plots is not limited to just overlaying two plots; we can
overlay multiple plots. Here, we overlay a scatterplot with a linear fit line (lfit) and a
quadratic fit line (qfit).
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot with linear and quadratic fits; x-axis: Pop/10 sq. miles; legend: % homes cost $100K+, Fitted values, Fitted values]

twoway (scatter propval100 popden) (mspline propval100 popden) (fpfit propval100 popden) (mband propval100 popden) (lowess propval100 popden)

Stata has other kinds of fit methods in addition to linear and quadratic fits. This example
includes a median spline (mspline), fractional polynomial fit (fpfit), median band (mband),
and lowess (lowess). For more details, see Twoway : Fit (49).
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot with four overlaid fits; x-axis: Pop/10 sq. miles; legend: % homes cost $100K+, Median spline, predicted propval100, Median bands, lowess propval100 popden]
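
The overlay syntax above can be tried without the book's data. The following is a minimal
sketch that assumes auto.dta (shipped with Stata) as a stand-in for allstates.dta; the
variables mpg and weight belong to that dataset, not to the book's examples.

```stata
* Assumption: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* Scatterplot overlaid with linear and quadratic fit lines
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight)

* The same scatterplot overlaid with a lowess smoother
twoway (scatter mpg weight) (lowess mpg weight)
```

Each parenthesized plot is drawn in the order given, so the fits appear on top of the markers.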


twoway (lfitci propval100 popden) (scatter propval100 popden)

In addition to being able to plot a fit line, we can also plot a linear fit line with a
confidence interval using the lfitci command. We also overlay the linear fit and confidence
interval with a scatterplot. See Twoway : CI fit (50) for more information about fit lines
with confidence intervals.
Uses allstates.dta & scheme vg_s2c

[Figure: linear fit with confidence band and scatterplot; x-axis: Pop/10 sq. miles; legend: 95% CI, Fitted values, % homes cost $100K+]

twoway dropline close tradeday

This dropline graph shows the closing prices of the S&P 500 by trading day for the first 40
days of 2001. A dropline graph is like a scatterplot since each data point is shown with a
marker, but a dropline for each marker is shown as well. For more details, see
Twoway : Scatter (35).
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: dropline plot; y-axis: Closing price; x-axis: Trading day number]

twoway spike close tradeday

Here, we use a spike graph to show the same data as the previous graph. It is like the
dropline plot, but no markers are placed on top of the spikes. For more details, see
Twoway : Scatter (35).
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: spike plot; y-axis: Closing price; x-axis: Trading day number]


twoway dot close tradeday

The dot plot, like the scatter command, shows markers for each data point but also adds a
dotted line for each of the x-values. For more details, see Twoway : Scatter (35).
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: dot plot; y-axis: Closing price; x-axis: Trading day number]

twoway line close tradeday, sort

The line command is used in this example to make a simple line graph. See Twoway : Line (54)
for more details about line graphs.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: line graph; y-axis: Closing price; x-axis: Trading day number]

twoway connected close tradeday, sort

The twoway connected graph is similar to twoway line, except that a symbol is shown for each
data point. For more information, see Twoway : Line (54).
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: connected line graph; y-axis: Closing price; x-axis: Trading day number]
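
The plots above share one syntax, twoway followed by a plot type, a y-variable, and an
x-variable, so switching plot types means changing a single word. Here is a small sketch of
that idea, assuming sp500.dta (shipped with Stata) in place of spjanfeb2001.dta; the variable
tradeday and the graph names g_* are made up for this illustration.

```stata
* Assumption: sp500.dta stands in for the book's spjanfeb2001.dta
sysuse sp500, clear

* Hypothetical trading-day counter (the data are already in date order)
generate tradeday = _n

* Draw the first 40 trading days with each plot type, keeping each graph in memory
foreach type in dropline spike dot line connected {
    twoway `type' close tradeday in 1/40, name(g_`type', replace)
}
```

Afterward, graph display g_spike (for example) redisplays one of the stored graphs.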


twoway tsline close, sort

The tsline (time-series line) command makes a line graph where the x-variable is a date
variable that has previously been declared using tsset; see [TS] tsset. This example shows
the closing price of the S&P 500 by trading date. For more information, see
Twoway : Line (54).
Uses sp2001ts.dta & scheme vg_s2c

[Figure: time-series line graph; y-axis: Closing price; x-axis: Date (1Jan01 to 1Jan02)]

twoway tsrline high low, sort

This command uses tsrline (time series range line) to make a line graph showing the high and
low prices of the S&P 500 by trading date. For more information, see Twoway : Line (54).
Uses sp2001ts.dta & scheme vg_s2c

[Figure: time-series range line graph; y-axis: High price/Low price; x-axis: Date (1Jan01 to 1Jan02)]

twoway area close tradeday, sort

An area plot is similar to a line plot, but the area under the line is shaded. See
Twoway : Area (61) for more information about area plots.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: area plot; y-axis: Closing price; x-axis: Trading day number]
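
Because tsline and tsrline take their x-variable from tsset, the time variable has to be
declared before the graph is drawn. The sketch below shows that sequence, assuming sp500.dta
(shipped with Stata) in place of sp2001ts.dta; its date variable is named date.

```stata
* Assumption: sp500.dta stands in for the book's sp2001ts.dta
sysuse sp500, clear

* Declare the time variable; tsline and tsrline then use it as the x-axis
tsset date

twoway tsline close
twoway tsrline high low
```

Note that no x-variable appears in either graph command.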


twoway bar close tradeday

Here is an example of a twoway bar plot. For each x-value, a bar is shown corresponding to
the height of the y-variable. Note that this shows a continuous x-variable as compared with
the graph bar command, which would be useful when we have a categorical x-variable. See
Twoway : Bar (62) for more details about bar plots.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: twoway bar plot; y-axis: Closing price; x-axis: Trading day number]

twoway rarea high low tradeday, sort

This example illustrates the use of rarea (range area) to graph the high and low prices with
the area filled. If we used rline (range line), the area would not be filled. See
Twoway : Range (64) for more details.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: range area plot; y-axis: High price/Low price; x-axis: Trading day number]

twoway rconnected high low tradeday, sort

The rconnected (range connected) command makes a graph similar to the previous one, except
that a marker is shown at each value of the x-variable and the area in between is not filled.
If we instead used rscatter (range scatter), the points would not be connected. See
Twoway : Range (64) for more details.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: range connected plot; y-axis: High price/Low price; x-axis: Trading day number]
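
Range plots take two y-variables followed by the x-variable. The following sketch tries a few
of them, again assuming sp500.dta (shipped with Stata) in place of spjanfeb2001.dta; the
variable tradeday is made up here to mimic the book's trading-day counter.

```stata
* Assumption: sp500.dta stands in for the book's spjanfeb2001.dta
sysuse sp500, clear
generate tradeday = _n      
keep in 1/40

* Filled band between the daily high and low
twoway rarea high low tradeday, sort

* Markers at the high and low, connected across days
twoway rconnected high low tradeday, sort

* Range plots overlay like any other twoway plot
twoway (rarea high low tradeday, sort) (line close tradeday, sort)
```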


twoway rcap high low tradeday, sort

Here, we use rcap (range cap) to graph the high and low prices with a spike and a cap at each
value of the x-variable. If we used rspike instead, spikes would be displayed but not caps.
If we used rcapsym, the caps would be symbols, and we could modify the symbol. See
Twoway : Range (64) for more details.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: capped spike plot; y-axis: High price/Low price; x-axis: Trading day number]

twoway rbar high low tradeday, sort

Here, we use rbar (range bar) to graph the high and low prices with bars at each value of the
x-variable. See Twoway : Range (64) for more details.
Uses spjanfeb2001.dta & scheme vg_s2c

[Figure: range bar plot; y-axis: High price/Low price; x-axis: Trading day number]

twoway histogram popk, freq

The twoway histogram command can be used to show the distribution of a single variable. It is
often useful when overlaid with other twoway plots; otherwise, the histogram command would be
preferable. See Twoway : Distribution (74) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: histogram; y-axis: Frequency; x-axis: Pop/1,000]


twoway kdensity popk

The twoway kdensity command shows a kernel-density plot and is useful for examining the
distribution of a single variable. It can be overlaid with other twoway plots; otherwise, the
kdensity command would be preferable. See Twoway : Distribution (74) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: kernel-density plot; y-axis: kdensity popk; x-axis: x]

twoway function y=normden(x), range(-4 4)

The twoway function command allows us to graph an arbitrary function over a range of values
we specify. See Twoway : Distribution (74) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: normal density curve; y-axis: y; x-axis: x (from -4 to 4)]

graph matrix propval100 rent700 popden

We can use the graph matrix command to show a scatterplot matrix. See Matrix (95) for more
details.
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot matrix; diagonal panels: % homes cost $100K+, % rents $700+/mo, Pop/10 sq. miles]
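
The distribution plots and the scatterplot matrix can also be tried on a dataset that ships
with Stata. This sketch assumes auto.dta in place of allstates.dta. It also uses normalden(),
the name under which releases after the 8.2 version targeted by this book provide the normal
density function that the example above calls normden().

```stata
* Assumption: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* Histogram (density scale) overlaid with a kernel-density estimate
twoway (histogram mpg) (kdensity mpg)

* An arbitrary function over a chosen range; no dataset variables are involved
twoway function y = normalden(x), range(-4 4)

* Scatterplot matrix of three variables
graph matrix mpg weight length
```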


graph hbar popk, over(division)

The graph hbar (horizontal bar) command is often used to show the values of a continuous
variable broken down by one or more categorical variables. Note that graph hbar is merely a
rotated version of graph bar. See Bar (107) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: horizontal bar chart; bars: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific; axis: mean of popk]

graph hbox popk, over(division)

We can show the previous graph as a box plot using the graph hbox (horizontal box) command.
The graph hbox command is commonly used for showing the distribution of one or more
continuous variables, broken down by one or more categorical variables. Note that graph hbox
is merely a rotated version of graph box. See Box (157) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: horizontal box plot by division (N. Eng. through Pacific); axis: Pop/1,000]

graph dot popk, over(division)

The previous plot could also be shown as a dot plot using graph dot. Dot plots are often used
to show one or more summary statistics for one or more continuous variables, broken down by
one or more categorical variables. See Dot (193) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: dot plot by division (N. Eng. through Pacific); axis: mean of popk]
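
The three commands above share the same layout: a continuous variable followed by over()
naming a categorical variable. A minimal sketch of that pattern follows, assuming auto.dta
(shipped with Stata) in place of allstates.dta, with rep78 and foreign as the categorical
variables.

```stata
* Assumption: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* Same variable and grouping, three presentations
graph hbar mpg, over(rep78)
graph hbox mpg, over(rep78)
graph dot mpg, over(rep78)

* Bars show the mean by default; another statistic can be requested
graph hbar (median) mpg, over(foreign)
```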


graph pie popk, over(region)

The graph pie command can be used to show pie charts. See Pie (217) for more details.
Uses allstates.dta & scheme vg_s2c

[Figure: pie chart; legend: NE, N Cntrl, South, West]

1.3 Schemes

While the previous section was about the different types of graphs Stata can make, this
section is about the different kinds of looks that you can have for Stata graphs. The basic
starting point for the look of a graph is a scheme, which controls just about every aspect
of the look of the graph. A scheme sets the stage for the graph, but you can use options
to override the settings in a scheme. As you might surmise, if you choose (or develop) a
scheme that produces graphs similar to the final graph you wish to make, you can reduce
the need to customize your graphs using options. Here, we give you a basic flavor of what
schemes can do and introduce you to the schemes you will be seeing throughout the book.
See Intro : Using this book (1) for more details about how to select and use schemes and
Appendix : Online supplements (382) for more information about how to download them.
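
To make the relationship between schemes and options concrete, here is a small sketch that
uses only schemes and data shipped with Stata (s1mono, s2color, and auto.dta) rather than the
vg_ schemes and datasets from this book, so it should run without the online supplements.

```stata
* Assumption: auto.dta and the built-in schemes stand in for the book's files
sysuse auto, clear

* Per graph: the scheme() option affects only this graph
twoway scatter mpg weight, scheme(s1mono)

* Per session: later graphs use this scheme until it is changed
set scheme s2color
twoway scatter mpg weight

* An option overrides what the scheme would otherwise choose
twoway scatter mpg weight, msymbol(Oh)
```

The command graph query, schemes lists the schemes installed in your copy of Stata.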


twoway scatter propval100 rent700 ownhome

This scatterplot illustrates the vg_s1c scheme. It is based on the s1color scheme but
increases the sizes of elements in the graph to make them more readable. This scheme is in
color and has a white background, both inside the plot region and in the surrounding area.
Uses allstates.dta & scheme vg_s1c

[Figure: scatterplot; x-axis: % who own home; legend: % homes cost $100K+, % rents $700+/mo]

twoway scatter propval100 rent700 ownhome

This scatterplot is similar to the last one but uses the vg_s1m scheme, the monochrome
equivalent of the vg_s1c scheme. It is based on the s1mono scheme but increases the sizes of
elements in the graph to make them more readable. This scheme is in black and white and has a
white background, both inside the plot region and in the surrounding area.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot; x-axis: % who own home; legend: % homes cost $100K+, % rents $700+/mo]

graph hbox wage, over(grade) asyvar nooutsides legend(rows(2))

This box plot shows an example of the vg_s2c scheme. It is based on the s2color scheme but
increases the sizes of elements in the graph to make them more readable. When we use this
scheme, the plot region has a white background, but the surrounding area (the graph region)
is light blue.
Uses nlsw.dta & scheme vg_s2c

[Figure: horizontal box plot; axis: hourly wage; legend: grades 4 through 18; note: excludes outside values]


graph hbox wage, over(grade) asyvar nooutsides legend(rows(2))

This box plot is similar to the previous one but uses the vg_s2m scheme, the monochrome
equivalent of the vg_s2c scheme. This scheme is based on the s2mono scheme but increases the
sizes of elements in the graph to make them more readable. This scheme is in black and white
and has a white background in the plot region but is light gray in the surrounding graph
region.
Uses nlsw.dta & scheme vg_s2m

[Figure: horizontal box plot; axis: hourly wage; legend: grades 4 through 18; note: excludes outside values]

graph hbar wage, over(occ7, label(nolabels)) blabel(group, position(base))

This horizontal bar chart shows an example of the vg_palec scheme. It is based on the s2color
scheme but makes the colors of the bars/boxes/markers paler by decreasing the intensity of
the colors. As shown in this example, one use of this scheme is to make the colors of the
bars pale enough to include text labels inside of bars.
Uses nlsw.dta & scheme vg_palec

[Figure: horizontal bar chart; bar labels: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; axis: mean of wage]

graph hbar wage, over(occ7, label(nolabels)) blabel(group, position(base))

This example is the same as the last example but uses the vg_palem scheme, the monochrome
equivalent of the vg_palec scheme. This scheme is based on the s2mono scheme but makes the
colors of the bars/boxes/markers paler by decreasing the intensity of the colors.
Uses nlsw.dta & scheme vg_palem

[Figure: horizontal bar chart; bar labels: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; axis: mean of wage]


scatter propval100 rent700 ownhome

This scatterplot illustrates the vg_outc scheme. It is based on the s2color scheme but makes
the fill color of the bars/boxes/markers white, so they appear hollow. The plot region is a
light blue to contrast with the white fill color. In this case, this scheme is useful to help
us see the number of markers present where numerous markers are close or partially
overlapping.
Uses allstates.dta & scheme vg_outc

[Figure: scatterplot with hollow markers; x-axis: % who own home; legend: % homes cost $100K+, % rents $700+/mo]

scatter propval100 rent700 ownhome

This example is similar to the previous one but illustrates the vg_outm scheme, the
monochrome equivalent of the vg_outc scheme. It is based on the s2mono scheme but makes the
fill color of the bars/boxes/markers white, so they appear hollow.
Uses allstates.dta & scheme vg_outm

[Figure: scatterplot with hollow markers; x-axis: % who own home; legend: % homes cost $100K+, % rents $700+/mo]

twoway (scatter ownhome borninstate if stateab=="DC", mlabel(stateab)) (scatter ownhome borninstate), legend(off)

This is an example of the vg_samec scheme, which is based on s2color and makes all of the
markers, lines, bars, etc., the same color, shape, and pattern. Here, the first scatter
command labels Washington, DC. The two scatter commands would normally be drawn in different
colors, but with this scheme, the markers are the same. This scheme has a monochrome
equivalent called vg_samem that is not illustrated.
Uses allstates.dta & scheme vg_samec

[Figure: scatterplot with DC labeled; y-axis: % who own home; x-axis: % born in state of residence]


graph hbar commute, over(division) asyvar

This horizontal bar chart shows an example of the vg_lgndc scheme. It is based on the s2color
scheme but changes the default attributes of the legend, namely, showing the legend in one
column to the left of the plot region, with the key and symbols placed atop each other. This
can be an efficient way to place the legend to the left of the graph. There is also a
vg_lgndm scheme, which is monochrome and is not illustrated here.
Uses allstates.dta & scheme vg_lgndc

[Figure: horizontal bar chart; legend at left: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific; axis: mean of commute]

graph bar commute, over(division) asyvar legend(rows(3))

This bar chart shows an example of the vg_past scheme. It is based on the s2color scheme but
selects subdued pastel colors and provides a sand background for the surrounding graph region
and an eggshell color for the inner plot region and legend area.
Uses allstates.dta & scheme vg_past

[Figure: bar chart; y-axis: mean of commute; legend: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific]

twoway scatter rent700 propval100

This scatterplot shows an example of the vg_rose scheme. It is based on the s2color scheme
but uses a different set of colors, having an eggshell background and a light rose color for
the plot area. The grid lines are omitted by default, and the labels for the y-axis are
horizontal by default.
Uses allstates.dta & scheme vg_rose

[Figure: scatterplot; y-axis: % rents $700+/mo; x-axis: % homes cost $100K+]


graph bar commute, over(division) asyvar legend(rows(3))

This bar chart shows an example of the vg_blue scheme. It is based on the s2color scheme but
uses a set of blue colors, with a light blue background and a light blue-gray color for the
plot area. The grid lines are omitted by default, and the labels for the y-axis are
horizontal by default.
Uses allstates.dta & scheme vg_blue

[Figure: bar chart; y-axis: mean of commute; legend: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific]

graph bar commute, over(division) asyvar legend(rows(3))

This is an example using the vg_teal scheme. This scheme is also based on the s2color scheme
but uses an olive–teal background. It also suppresses the display of grid lines and makes the
labels for the y-axis display horizontally by default.
Uses allstates.dta & scheme vg_teal

[Figure: bar chart; y-axis: mean of commute; legend: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific]

graph bar commute, over(division) asyvar legend(rows(3))

This bar chart shows an example of the vg_brite scheme. It is based on the s2color scheme but
selects a bright set of colors and changes the background to light khaki.
Uses allstates.dta & scheme vg_brite

[Figure: bar chart; y-axis: mean of commute; legend: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific]


This section has just scratched the surface of all there is to know about schemes in Stata,
but I hope that it helps you see how schemes create a starting point for your graph and
that, by choosing a scheme that is most similar to the look you want, you can save time
and effort in customizing your graphs.

1.4 Options

Learning to create effective Stata graphs is ultimately about using options to customize
the look of a graph until you are pleased with it. This section illustrates the general rules
and syntax for Stata graph commands, starting with their general structure, followed by
illustrations showing how options work in the same way across different kinds of commands.
Stata graph options work much like other options in Stata; however, there are additional
features that extend their power and functionality. While we will use the twoway scatter
command for illustration, most of the principles illustrated extend to all kinds of Stata
graph commands.

twoway scatter propval100 rent700

Consider this basic scatterplot. To add a title to this graph, we can use the title() option
as illustrated in the next example.
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot; y-axis: % homes cost $100K+; x-axis: % rents $700+/mo]


twoway scatter propval100 rent700, title("This is a title for the graph")

Just as with any Stata command, the title() option comes after a comma, and in this case, it
contains a quoted string that becomes the title of the graph.
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot titled "This is a title for the graph"; y-axis: % homes cost $100K+; x-axis: % rents $700+/mo]

twoway scatter propval100 rent700, title("This is a title for the graph", box)

Starting with Stata 8, options can have options of their own. Let’s put a box around the
title of the graph. We can use title(, box), placing box as an option within title(). If the
default for the current scheme had included a box, then we could have used the nobox option
to suppress it.
Uses allstates.dta & scheme vg_s2c

[Figure: the same scatterplot with a boxed title; y-axis: % homes cost $100K+; x-axis: % rents $700+/mo]

twoway scatter propval100 rent700, title("This is a title for the graph", box size(small))

Let’s take the last graph and modify the title to make it small. We can add another option to
the title() option by adding the size(small) option. Here, we see that one of the options is
a keyword (box) and that another option allows us to supply a value (size(small)).
Uses allstates.dta & scheme vg_s2c

[Figure: the same scatterplot with a small boxed title; y-axis: % homes cost $100K+; x-axis: % rents $700+/mo]
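
Since options are meant to work the same way across graph commands, the suboptions used with
title() above can be tried on another kind of graph and on other text elements. The sketch
below assumes auto.dta (shipped with Stata) in place of allstates.dta; the title and note
text are made up for illustration.

```stata
* Assumption: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* The same title() suboptions on a different kind of graph
graph hbar mpg, over(foreign) title("Mean mileage by origin", box size(small))

* subtitle() and note() accept the same kinds of suboptions as title()
twoway scatter mpg weight, subtitle("Mileage by weight", box) note("Source: auto.dta", size(small))
```

In each case, the text comes before the inner comma and the suboptions come after it.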


twoway scatter propval100 rent700,
title("This is a title for the graph", box size(small))
msymbol(S)

80
60
40
20
0

% homes cost $100K+

100

This is a title for the graph

0

10

20

30

40

Say that we want the symbols to be
displayed as squares. We can add
another option called msymbol(S) to
indicate that we want the marker
symbol to be displayed as a square (S
for square). Adding one option at a
time is a common way to build a Stata
graph. In the next graph, we will
change gears and start building a new
graph to show other aspects of options.
Uses allstates.dta & scheme vg s2c
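
The steps so far can be collected in a do-file. This is only a sketch: it assumes that allstates.dta is in the current working directory, it leaves out any scheme setting, and it uses /// to continue a command across lines, which is needed in a do-file but not in the commands as printed in the text.

```stata
* Sketch: build the scatterplot one option at a time.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* Step 1: a title, given as a quoted string after the comma
twoway scatter propval100 rent700, ///
    title("This is a title for the graph")

* Step 2: box is a keyword option placed inside title()
twoway scatter propval100 rent700, ///
    title("This is a title for the graph", box)

* Step 3: size() is a second title() option, and it takes a value
twoway scatter propval100 rent700, ///
    title("This is a title for the graph", box size(small))

* Step 4: msymbol(S) is an option of the scatterplot, not of title()
twoway scatter propval100 rent700, ///
    title("This is a title for the graph", box size(small)) ///
    msymbol(S)
```

Running the commands in order redraws the graph after each step, which makes it easy to see what each added option contributes.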

twoway scatter propval100 rent700

[Figure: scatterplot of % homes cost $100K+ against % rents $700+/mo, with the x-axis labeled 0 10 20 30 40]

Let’s return to this simple scatterplot.
Say that we want the labels for the
x-axis to change from 0 10 20 30 40 to
0 5 10 15 20 25 30 35 40.
Uses allstates.dta & scheme vg s2c

twoway scatter propval100 rent700, xlabel(0(5)40)

[Figure: the same scatterplot with the x-axis labeled from 0 to 40 in steps of 5]

Here, we add the xlabel() option to
label the x-axis from 0 to 40,
incrementing by 5. But say that we
want the labels to be displayed larger.
Uses allstates.dta & scheme vg s2c

twoway scatter propval100 rent700, xlabel(0(5)40, labsize(huge))

[Figure: scatterplot of % homes cost $100K+ against % rents $700+/mo, with huge x-axis labels from 0 to 40 in steps of 5]

Here, we add the labsize() (label size)
option to increase the size of the labels
for the x-axis. Say that we were happy
with the original numbering (0 10 20 30
40) but wanted the labels to be huge.
How would we do that?
Uses allstates.dta & scheme vg s2c

twoway scatter propval100 rent700, xlabel(, labsize(huge))

[Figure: the same scatterplot with huge x-axis labels at 0 10 20 30 40]

The xlabel() option we use here
indicates that we are content with the
numbers chosen for the label of the
x-axis because we have nothing before
the comma. After the comma, we add
the labsize() option to increase the
size of the labels for the x-axis.
Uses allstates.dta & scheme vg s2c
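
The three ways of writing xlabel() in these examples can be compared side by side. The sketch below assumes that allstates.dta is in the current working directory and omits the scheme; the commands themselves are the ones shown in the text.

```stata
* Sketch: what goes before and after the comma inside xlabel().
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* Values only: label the x-axis from 0 to 40 in steps of 5
twoway scatter propval100 rent700, xlabel(0(5)40)

* Values, then a comma, then an option that applies to those labels
twoway scatter propval100 rent700, xlabel(0(5)40, labsize(huge))

* Nothing before the comma: keep the default values, change only the size
twoway scatter propval100 rent700, xlabel(, labsize(huge))
```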

Let’s consider some examples using the legend() option to show that some options do
not require or permit the use of commas within them. Also, this allows us to show a case
where you might properly specify an option over and over again.

twoway scatter propval100 rent700 popden

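[Figure: scatterplot of % homes cost $100K+ and % rents $700+/mo against Pop/10 sq. miles, with a legend identifying the two variables]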

Here, we show two y-variables,
propval100 and rent700, graphed
against population density, popden.
Note that Stata has created a legend,
helping us see which symbols
correspond to which variables. We can
use the legend() option to customize
it.
Uses allstates.dta & scheme vg s2c

twoway scatter propval100 rent700 popden, legend(cols(1))

[Figure: the same scatterplot with the legend entries stacked in a single column]

Using the legend(cols(1)) option, we
make the legend display in a single
column. Note that we did not use a
comma because, with the legend()
option, there is no natural default
argument. If we had included a comma
within the legend() option, Stata
would have reported this as an error.
Uses allstates.dta & scheme vg s2c

twoway scatter propval100 rent700 popden,
legend(cols(1) label(1 "Property Value"))

[Figure: the same scatterplot with the first legend entry relabeled "Property Value"]

This example adds another option
within the legend() option, label(),
which changes the label for the first
variable.
Uses allstates.dta & scheme vg s2c
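
As a short sketch of the legend() syntax used so far, the do-file below repeats the two legend() commands. It assumes that allstates.dta is in the current working directory, omits the scheme, and uses /// only because a do-file needs it to continue a line.

```stata
* Sketch: options inside legend() are listed without a comma.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* One option inside legend(): show the keys in a single column
twoway scatter propval100 rent700 popden, legend(cols(1))

* Two options inside legend(), separated only by a space
twoway scatter propval100 rent700 popden, ///
    legend(cols(1) label(1 "Property Value"))
```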

twoway scatter propval100 rent700 popden,
legend(cols(1) label(1 "Property Value") label(2 "Rent"))

[Figure: the same scatterplot against Pop/10 sq. miles, with legend entries "Property Value" and "Rent"]

Here, we add another label() option
for the legend() option, but in this
case, we change the label for the second
variable. Note that we can use the
label() option repeatedly to change
the label for the different variables.
Uses allstates.dta & scheme vg s2c

Finally, let’s consider an example that shows how to use the twoway command to overlay two plots, how each graph can have its own options, and how options can apply to the
overall graph.

twoway (scatter propval100 popden)
(lfit propval100 popden)

[Figure: scatterplot of % homes cost $100K+ against Pop/10 sq. miles with a linear fit line; legend entries "% homes cost $100K+" and "Fitted values"]

Consider this graph, which shows a
scatterplot predicting property value
from population density and shows a
linear fit between these two variables.
Say that we wanted to change the
symbol displayed in the scatterplot and
the thickness of the line for the linear
fit.
Uses allstates.dta & scheme vg s2c

twoway (scatter propval100 popden, msymbol(S))
(lfit propval100 popden, clwidth(vthick))

[Figure: scatterplot of % homes cost $100K+ against Pop/10 sq. miles with square markers and a very thick fit line]

Note that we add the msymbol() option
to the scatter command to change the
symbol to a square, and we add the
clwidth() (connect line width) option
to the lfit command to make the line
very thick. When we overlay two plots,
each plot can have its own options that
operate on its respective parts of the
graph. However, some parts of the
graph are shared, for example, the title.
Uses allstates.dta & scheme vg s2c

twoway (scatter propval100 popden, msymbol(S))
(lfit propval100 popden, clwidth(vthick)),
title("This is the title of the graph")
Note that we add the title() option
to the very end of the command placed
after a comma. That final comma
signals that options concerning the
overall graph are to follow, in this case,
the title() option.
Uses allstates.dta & scheme vg s2c
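
The layout of an overlaid command is easier to see when each plot sits on its own line. The sketch below is the same command written for a do-file, assuming that allstates.dta is in the current working directory and omitting the scheme. The clwidth() option is kept as printed; later Stata releases document this option under the name lwidth().

```stata
* Sketch: per-plot options versus options for the overall graph.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

twoway (scatter propval100 popden, msymbol(S))      /// option for the scatterplot only
       (lfit propval100 popden, clwidth(vthick)),   /// option for the fit line only
       title("This is the title of the graph")      //  option for the graph as a whole
```

Each plot is enclosed in parentheses with its own comma and options, and the comma after the last closing parenthesis introduces the options that belong to the whole graph.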

[Figure: the same overlaid graph, now titled "This is the title of the graph"]

One of the beauties of Stata graph commands is the way that different graph commands
share common options. If we want to customize the display of a legend, we do it using the
same options, whether we are using a bar graph, a box plot, a scatterplot, or any other
kind of Stata graph. Once we learn how to control legends with one type of graph, we have
learned how to control legends for all types of graphs. Let’s look at a couple of examples.

twoway scatter propval100 rent700 popden, legend(position(1))

[Figure: scatterplot of % homes cost $100K+ and % rents $700+/mo against Pop/10 sq. miles, with the legend in the top right corner]

Consider this scatterplot. We have
added a legend() option to make the
legend display in the one o’clock
position on the graph, putting the
legend in the top right corner.
Uses allstates.dta & scheme vg s2c

graph bar propval100 rent700, over(nsw) legend(position(1))

[Figure: bar chart of mean of propval100 and mean of rent700 for North, South, and West, with the legend in the top right corner]

Here, we use the graph bar command,
which is a completely different
command from the previous one. Even
though the graphs are different, the
legend() option we supply is the same
and has the same effect. Many (but not
all) options function in this way,
sharing a common syntax and having
common effects.
Uses allstates.dta & scheme vg s2c

graph matrix propval100 rent700 popden, legend(position(1))

[Figure: scatterplot matrix of % homes cost $100K+, % rents $700+/mo, and Pop/10 sq. miles, drawn without a legend]

Contrast this example with the previous
two. The graph matrix command does
not support the legend() option
because this graph does not need or
produce legends. In the Matrix (95)
chapter, for example, there are no
references to legends, an indication that
this is not a relevant option for this
kind of graph. Note that, even though
we included this additional irrelevant
option, Stata ignored it and produced
an appropriate graph anyway.
Uses allstates.dta & scheme vg s2c

Because legends work the same way with different types of Stata graph commands, we
can save pages by describing legends in detail in one place: Options : Legend (287). However,
it is useful to see examples of legends for each type of graph that uses them. Each chapter,
therefore, provides a brief section describing legends for each type of graph discussed in
that chapter. Likewise, most options are described in detail in Options (235) with a brief
section in every chapter discussing how each option works in specific types of graphs. As
we saw in the case of legends, some options are not appropriate for some types of graphs,
so those options will not be discussed with the commands that do not support them.
While an option like legend() can be used with many, but not all, kinds of Stata graph
commands, other kinds of options can be used with almost every kind of Stata graph.
These are called Standard Options. To help you differentiate these kinds of options, they
are discussed in their own chapter, Standard options (313). Since these options can be used
with most types of graph commands, they are generally not discussed in the chapters about
the different types of graphs, except when their usage interacts with the options illustrated.
For example, subtitle() is a Standard Option, but its behavior takes on a special meaning
when used with the legend() option, so the subtitle() option is discussed in the context of
legends. Consistent with what we have seen before, the syntax of Standard Options follows
the same kinds of rules we have illustrated, and their usage and behavior are uniform across
the many types of Stata graph commands.
To recap, this section was not about any particular options, but about some of the rules
for using these options and how they behave. Some options permit options. In some cases,
you may want to specify only options. Some options allow you to include one or more
options, but no comma is required. When you overlay multiple graphs using twoway, you
may have options that go along with each graph, as well as overall options that appear at
the end of the command. Finally, the syntax of a certain option is the same across the
different graph commands that use the options, but not all options are useful for all kinds
of graph commands.
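
One way to see that an option has the same syntax across graph commands is to store the option text once and reuse it. The sketch below does this with a local macro; the macro name leg is purely illustrative and is not something the text uses, and the do-file assumes that allstates.dta is in the current working directory. Because local macros exist only while a do-file runs, the lines need to be run together.

```stata
* Sketch: one legend() specification shared by two graph commands.
* Assumptions: allstates.dta is in the working directory; the macro
* name "leg" is a hypothetical name chosen for this illustration.
use allstates, clear

* Store the option text once
local leg legend(position(1))

* A twoway scatterplot with the legend at the 1 o'clock position
twoway scatter propval100 rent700 popden, `leg'

* A bar chart with exactly the same legend() option
graph bar propval100 rent700, over(nsw) `leg'
```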

1.5 Building graphs

I have three agendas in writing this section. First, I will show the process of building
complex graphs a little bit at a time. At the same time, I illustrate how to use the resources
of this book to get the bits of information needed to build these graphs. Finally, I show that,
even though a complete Stata graph command might look complicated and overwhelming,
the process of building it slowly is actually very straightforward and logical. Let’s first build
a bar chart that looks at property values broken down by region of the country. Then, we
will modify the legend and bar characteristics, add titles, and so forth.

graph display

[Figure: bar chart of % homes over $100K by Region (N. Eng. through Pacific), with bars for North, South, and West identified by a legend and each bar labeled with its height to two decimal places]

Say that we want to create this graph.
For now, the syntax is concealed, just
showing the graph display command
to show the previously drawn graph. It
might be overwhelming at first to
determine all of the options needed to
make this graph. To ease our task, we
will build it one bit at a time, refining
the graph and fixing any problems we
find.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division)

[Figure: bar chart of mean of propval100, with North, South, and West repeated within every division from N. Eng. to Pacific]

We begin by seeing that this is a bar
chart and look at Bar : Y-variables (107)
and Bar : Over (111). We take our first
step towards making this graph by
making a bar chart showing
propval100 and adding over(nsw) and
over(division) to break the means
down by nsw and division.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill

[Figure: bar chart of mean of propval100 by division, showing only the North, South, or West bar that exists for each division]

The previous graph is not quite what
we want because we see every division
shown with every nsw, but for example,
the Pacific region only appears in the
West. In Bar : Over (111), we see that
we can add the nofill option to show
only the combinations of nsw and
division that exist in the data file.
Next, we will look at the colors of the
bars.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars

[Figure: the same bar chart with North, South, and West shown in different colors and identified in a legend]

The last graph is getting closer, but we
want the bars for North, South, and
West to be displayed in different colors
and labeled with a legend. In
Bar : Y-variables (107), we see that the
asyvars option will accomplish this.
Next, we will change the title for the
y-axis.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K")

[Figure: the same bar chart with the y-axis titled "% homes over $100K"]

Now, we want to put a title on the
y-axis. In Bar : Y-axis (143), we see
examples illustrating the use of
ytitle() for putting a title on the
y-axis. Here, we put a title on the
y-axis, but now we want to change the
labels for the y-axis to go from 0 to 80,
incrementing by 10.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K") ylabel(0(10)80, angle(0))

[Figure: the same bar chart with horizontal y-axis labels from 0 to 80 in steps of 10]

The Bar : Y-axis (143) section also tells
us about the ylabel() option. In
addition to changing the labels, we also
want to change the angle of the labels,
and in that section, we see that we can
use the angle() option to change the
angle of the labels. Now that we have
the y-axis labeled as we wish, let’s next
look at the title for the x-axis.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K") ylabel(0(10)80, angle(0)) b1title(Region)

[Figure: the same bar chart with the title "Region" below the categorical axis]

After having used the ytitle() option
to label the y-axis, we might be
tempted to use the xtitle() option to
label the x-axis, but this axis is a
categorical variable. In Bar : Cat axis
(123), we see that this axis is treated
quite differently because of that. To
put a title below the graph, we use the
b1title() option. Now, let’s turn our
attention to formatting the legend.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K") ylabel(0(10)80, angle(0)) b1title(Region)
legend(rows(1) position(1) ring(0))

[Figure: the same bar chart with a one-row legend (North, South, West) inside the plot region at the top right]

Here, we want to use the legend()
option to make the legend have one row
in the top right corner within the plot
area. In Bar : Legend (130), we see that
the rows(1) option makes the legend
appear in one row and that the
position(1) option puts the legend in
the 1 o’clock position. The ring(0)
option puts the legend inside the plot
region. Next, let’s label the bars.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K") ylabel(0(10)80, angle(0)) b1title(Region)
legend(rows(1) position(1) ring(0)) blabel(bar)

[Figure: the same bar chart with each bar labeled with its unformatted height, for example 66.5667, 28.9125, and 10.0143]

We want each bar to be labeled with
the height of the bar, and Bar : Legend
(130) shows how we can do this. This
section shows how to use the blabel()
(bar label) option to label the bars in
lieu of legends. blabel() also can label
the bars with their height, using
blabel(bar).
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division) nofill asyvars
ytitle("% homes over $100K") ylabel(0(10)80, angle(0)) b1title(Region)
legend(rows(1) position(1) ring(0)) blabel(bar, format(%4.2f))

[Figure: the same bar chart with bar labels shown to two decimal places, for example 66.57, 28.91, and 10.01]

We want the labels for each bar to end
in two decimal places, and we see in
Bar : Legend (130) that we can use the
format() option to format these
numbers as we wish.
Uses allstates.dta & scheme vg past

graph bar propval100, over(nsw) over(division, label(angle(45))) nofill
ytitle("% homes over $100K") ylabel(0(10)80, angle(0)) b1title(Region)
legend(rows(1) position(1) ring(0)) blabel(bar, format(%4.2f)) asyvars

[Figure: the finished bar chart of % homes over $100K by Region, with the division labels drawn at a 45-degree angle]

Finally, in Bar : Cat axis (123), we see
that we can add the label(angle(45))
option to the over() option to specify
that labels for that variable be shown
at a 45-degree angle so they do not
overlap each other.
Uses allstates.dta & scheme vg past

This section has shown that it is not that difficult to create a complex graph by building
it one step at a time. You can use the resources in this book to seek out each piece of
information you need and then put those pieces together the way you want to create your
own graphs. For more information about how to integrate options to create complex Stata
graphs, see Appendix : More examples (366).
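
The finished command is easier to read and to revise when each option sits on its own line with a note about what it does. The do-file sketch below is the same command as the final one in this section, laid out that way. It assumes that allstates.dta is in the current working directory and omits the scheme; the comments after /// are annotations added for this sketch.

```stata
* Sketch: the bar chart from this section, one option per line.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

graph bar propval100,                          /// mean of propval100
    over(nsw)                                  /// bars for North, South, West
    over(division, label(angle(45)))           /// divisions, labels at 45 degrees
    nofill                                     /// only combinations present in the data
    asyvars                                    /// color the bars by nsw and add a legend
    ytitle("% homes over $100K")               /// title for the y-axis
    ylabel(0(10)80, angle(0))                  /// labels 0 to 80 by 10, drawn horizontally
    b1title(Region)                            /// title below the categorical axis
    legend(rows(1) position(1) ring(0))        /// one-row legend inside the plot region
    blabel(bar, format(%4.2f))                 //  bar heights with two decimal places
```

Commenting out or deleting a single line (other than the first) reproduces one of the intermediate graphs, which is a convenient way to retrace the steps.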

2 Twoway graphs

The graph twoway command represents not just a single kind of graph but actually
over thirty different kinds of graphs. Many of these graphs are similar in appearance and
function, so I have grouped them into eight families, which form the first eight sections
of this chapter. These first eight sections, which cover scatterplots to distribution plots,
cover the general features of these plots and briefly mention some important options. These
are followed by a section giving an overview of the options that can be used with twoway
graphs. (For further details about the options that can be used with twoway graphs, see
Options (235) and the sections within that chapter.) The chapter concludes with a section
illustrating how you can overlay twoway graphs. For more details about graph twoway, see
[G] graph twoway.

2.1 Scatterplots

This section covers the use of scatterplots. Because scatterplots are so commonly used,
this section will cover more details about the use of these graphs than subsequent sections.
Also, this section will introduce some of the kinds of options that can be used with many
kinds of twoway plots, with cross-references to Options (235).

graph twoway scatter ownhome propval100

[Figure: scatterplot of % who own home against % homes cost $100K+]

Here is a basic scatterplot. Note that
this command starts with graph
twoway, which indicates that this is a
twoway graph. scatter indicates that
we are creating a twoway scatterplot.
These are followed by the variable to be
placed on the y-axis and then the
variable for the x-axis.
Uses allstates.dta & scheme vg s2c

twoway scatter ownhome propval100

[Figure: scatterplot of % who own home against % homes cost $100K+]

Since it can be cumbersome to type
graph twoway scatter, Stata allows
you to shorten this to twoway scatter.
Uses allstates.dta & scheme vg s2c

scatter ownhome propval100

[Figure: the same scatterplot of % who own home against % homes cost $100K+]

In fact, some graph twoway commands
are so frequently used that Stata
permits you to omit the graph twoway,
as we have done here, and just start the
command with scatter. While this
can save some typing, this can
sometimes conceal the fact that the
command is really a twoway graph and
that these are a special class of graphs.
For clarity, I will generally present
these graphs starting with twoway.
Uses allstates.dta & scheme vg s2c
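
The three spellings of the command can be tried one after another. This sketch assumes that allstates.dta is in the current working directory and omits the scheme; each line should draw the same scatterplot.

```stata
* Sketch: three equivalent ways to request the same scatterplot.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* Full form
graph twoway scatter ownhome propval100

* Shortened form, used for most examples in this chapter
twoway scatter ownhome propval100

* Shortest form, which hides the fact that this is a twoway graph
scatter ownhome propval100
```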

twoway scatter ownhome propval100, msymbol(Sh)

[Figure: the same scatterplot drawn with large, hollow square markers]

You can control the marker symbol with
the msymbol() option. Here, we make
the symbols large, hollow squares. See
Options : Markers (235) for more details
about controlling the marker symbol,
size, and color, and see Styles : Symbols
(342) for the symbols you can select.
Uses allstates.dta & scheme vg s2c

twoway scatter ownhome propval100, mcolor(maroon)

[Figure: scatterplot of % who own home against % homes cost $100K+ with maroon markers]

You can control the marker color with
the mcolor() option. Here, we make
the markers maroon. See Styles : Colors
(328) for other colors you could choose
and also Options : Markers (235) for
more details.
Uses allstates.dta & scheme vg s2c

twoway scatter ownhome propval100, msize(vlarge)

[Figure: the same scatterplot with very large, white-filled markers]

You can control the marker size with
the msize() option. Using
msize(vlarge), we make the markers
very large. Note that we switched to
the vg outc scheme, showing
white-filled markers, which can be
useful when the markers are large. See
Styles : Markersize (340) for other sizes
you could choose and also
Options : Markers (235) for more details.
Uses allstates.dta & scheme vg outc

twoway scatter ownhome propval100 [aweight=rent700], msize(small)

[Figure: the same scatterplot with each marker sized according to rent700]

You can also use a weight variable to
determine the size of the symbols.
Using [aweight=rent700], we size the
symbols according to the proportion of
rents that exceed 700 dollars per
month, allowing us to graph three
variables at once. We add the
msize(small) option to shrink the size
of all the markers so they do not get
too large. See Options : Markers (235)
for more details.
Uses allstates.dta & scheme vg outc
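
The marker options on this page can be run together in a do-file. The sketch below assumes that allstates.dta is in the current working directory and omits the scheme, so the markers will not be white-filled as they are under the book's scheme. The last command, which combines the options in one graph, is an illustration added here and does not appear in the text.

```stata
* Sketch: marker color, marker size, and weighted markers.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* Marker color
twoway scatter ownhome propval100, mcolor(maroon)

* Marker size
twoway scatter ownhome propval100, msize(vlarge)

* Marker size driven by a third variable, scaled down with msize(small)
twoway scatter ownhome propval100 [aweight=rent700], msize(small)

* Illustration only: symbol, size, and color options can be combined
twoway scatter ownhome propval100, msymbol(Sh) msize(vlarge) mcolor(maroon)
```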

twoway scatter ownhome propval100, mlabel(stateab)

[Figure: scatterplot of % who own home against % homes cost $100K+, with each marker labeled with its state abbreviation]

The mlabel(stateab) option can be
used to add a marker label with the
state abbreviation. See
Options : Marker labels (247) for more
details about how you can control the
size, position, color, and angle of
marker labels.
Uses allstates.dta & scheme vg outc

twoway scatter ownhome propval100, mlabel(stateab) mlabsize(vlarge)

[Figure: the same scatterplot with very large state-abbreviation labels]

The mlabsize(vlarge) option controls
the marker label size. Here, we make
the marker label very large.
Uses allstates.dta & scheme vg outc

twoway scatter ownhome propval100, mlabel(stateab) mlabposition(12)

[Figure: the same scatterplot with the state-abbreviation labels placed directly above the markers]

The mlabposition() option controls
the marker label position with respect
to the marker. Here, we place the
marker labels at the 12 o’clock position
with respect to the markers, placing the
labels directly above the points they
label. See Options : Marker labels (247)
for examples illustrating the
mlabvposition() option, which
permits different marker label positions
for different observations.
Uses allstates.dta & scheme vg outc
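
The marker label options on this page follow the same pattern, each one added after mlabel(). The sketch below repeats the three commands for a do-file, assuming that allstates.dta is in the current working directory and omitting the scheme.

```stata
* Sketch: adding marker labels, then changing their size and position.
* Assumption: allstates.dta is in the working directory.
use allstates, clear

* Label each marker with the state abbreviation
twoway scatter ownhome propval100, mlabel(stateab)

* Make the labels very large
twoway scatter ownhome propval100, mlabel(stateab) mlabsize(vlarge)

* Put each label at the 12 o'clock position, directly above its marker
twoway scatter ownhome propval100, mlabel(stateab) mlabposition(12)
```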

twoway scatter ownhome propval100, mlabel(stateab) mlabposition(0) msymbol(i)

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), with state abbreviations shown in place of the markers]

The mlabposition(0) option places the marker label in the center position. To keep it from being obscured by the marker symbol, we also add the msymbol(i) option to make the marker symbol invisible. In effect, the marker symbols have been replaced by the marker labels.
Uses allstates.dta & scheme vg_outc

twoway scatter fv propval100

[Figure: yhat ownhome|propval100 (y-axis) by % homes cost $100K+ (x-axis)]

Say that we ran the following commands:

. regress ownhome propval100
. predict fv

The variable fv represents the fit values, and here we graph fv against propval100. As we expect, all of the points fall along a line, but they are not connected. The next few examples will consider options you can use to connect points; see Options : Connecting (250) for more details. For variety, we have switched to the vg_past scheme.
Uses allstates.dta & scheme vg_past

twoway scatter fv propval100, connect(l) sort

[Figure: yhat ownhome|propval100 (y-axis) by % homes cost $100K+ (x-axis), with the points connected by a line]

We add the connect(l) option to indicate that the points should be connected with a line. We also add the sort option, which is generally recommended when you connect observations and the data are not already sorted on the x-variable.
Uses allstates.dta & scheme vg_past
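The regress and predict steps described above can be collected into a short do-file. This is a minimal sketch that assumes allstates.dta is in the current directory; the variable name fv follows the text.

```stata
* Sketch: create the fit values, then plot them connected by a line
use allstates, clear
regress ownhome propval100
predict fv                     // with no options, predict stores the linear prediction
twoway scatter fv propval100, connect(l) sort
```

Because predict stores one fitted value per observation, fv can be graphed like any other variable in the dataset.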


twoway scatter fv ownhome propval100, connect(l i) sort

[Figure: yhat ownhome|propval100 and % who own home (y-axis) by % homes cost $100K+ (x-axis)]

We can show both the observations and the fit values in one graph. The connect(l i) option specifies that the first y-variable should be connected with straight lines (l for line) and the second y-variable should not be connected (i for invisible connection).
Uses allstates.dta & scheme vg_past

twoway scatter fv ownhome propval100, msymbol(i .) connect(l i) sort

[Figure: yhat ownhome|propval100 and % who own home (y-axis) by % homes cost $100K+ (x-axis)]

The msymbol(i .) option specifies that the first y-variable should not have symbols displayed (i for invisible symbol) and that the second y-variable should have the default symbols displayed.
Uses allstates.dta & scheme vg_past

twoway scatter fv ownhome propval100, msymbol(i .) connect(l i) sort legend(label(1 Pred. Perc. Own))

[Figure: Pred. Perc. Own and % who own home (y-axis) by % homes cost $100K+ (x-axis)]

The legend() option can be used to control the legend. We use the label() option to specify the contents of the first item in the legend. See Options : Legend (287) for more details on legends.
Uses allstates.dta & scheme vg_past


twoway scatter fv ownhome propval100, msymbol(i .) connect(l i) sort legend(label(1 Pred. Perc. Own) order(2 1))

[Figure: % who own home and Pred. Perc. Own (y-axis) by % homes cost $100K+ (x-axis); the legend lists % who own home first]

The order() option can be used to specify the order in which the items in the legend are displayed.
Uses allstates.dta & scheme vg_past

twoway scatter fv ownhome propval100, msymbol(i .) connect(l i) sort legend(label(1 Pred. Perc. Own) order(2 1) cols(1))

[Figure: same graph, with the legend items % who own home and Pred. Perc. Own stacked in one column]

The cols(1) option makes the items in the legend display in a single column.
Uses allstates.dta & scheme vg_past

twoway scatter ownhome propval100, xtitle("Percent homes over $100K") ytitle("Percent who own home")

[Figure: Percent who own home (y-axis) by Percent homes over $100K (x-axis)]

The xtitle() and ytitle() options can be used to specify the titles for the x- and y-axes. See Options : Axis titles (254) for more details about how to control the display of axes. Note that we are now using the vg_s2m scheme, one you might favor for graphs that will be printed in black and white.
Uses allstates.dta & scheme vg_s2m


twoway scatter ownhome propval100, ytitle("Percent who own home", size(huge))

[Figure: Percent who own home (y-axis, huge title) by % homes cost $100K+ (x-axis)]

Here, we use the size(huge) option to make the title on the y-axis huge. For other text sizes you could choose, see Styles : Textsize (344).
Uses allstates.dta & scheme vg_s2m

twoway scatter ownhome propval100, xlabel(0(10)100) ylabel(40(5)80)

[Figure: % who own home (y-axis, labeled 40 to 80 by 5) by % homes cost $100K+ (x-axis, labeled 0 to 100 by 10)]

We use the ylabel() and xlabel() options to control the labeling of the x- and y-axes. We label the x-axis from 0 to 100, incrementing by 10, and the y-axis from 40 to 80, incrementing by 5. See Options : Axis labels (256) for more details on labeling axes.
Uses allstates.dta & scheme vg_s2m
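The start(step)end shorthand used in xlabel(0(10)100) and ylabel(40(5)80) is Stata's numlist notation. As a quick check, which is a sketch and not part of the original example, the numlist command expands such a list and returns the result in r(numlist):

```stata
* Expand the two numlists used for the axis labels above
numlist "0(10)100"
display "`r(numlist)'"
numlist "40(5)80"
display "`r(numlist)'"
```

The first display should list the values 0 through 100 in steps of 10, and the second the values 40 through 80 in steps of 5. These are the positions at which the axis labels are drawn.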

twoway scatter ownhome propval100, xlabel(#10) ylabel(#5)

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis)]

In this example, we use the xlabel(#10) option to ask Stata to use approximately 10 nice labels and the ylabel(#5) option to use approximately 5 nice labels. In this case, our gentle request was observed exactly, but in some cases, Stata will choose somewhat different values to create axis labels it believes are logical.
Uses allstates.dta & scheme vg_s2m


twoway scatter ownhome propval100, xlabel(#10) ylabel(#5, nogrid)

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), drawn without a y-axis grid]

Using the nogrid option, we can suppress the display of the grid. Note that this option is placed within the ylabel() option, thus suppressing the grid for the y-axis. If the grid were absent and we wished to include it, we could add the grid option. (You can also specify grid or nogrid within the xlabel() option to control grids for the x-axis.) For more details, see Options : Axis labels (256).
Uses allstates.dta & scheme vg_s2m

twoway scatter ownhome propval100, xlabel(#10) ylabel(#5, nogrid) yline(55 75, lwidth(thin) lcolor(black) lpattern(dash))

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), with horizontal lines at 55 and 75]

The yline() option is used to add thin, black, dashed lines to the graph where y equals 55 and 75.
Uses allstates.dta & scheme vg_s2m

twoway scatter ownhome propval100, xscale(alt)

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis, drawn at the top of the graph)]

Here, we use the xscale() option to request that the x-axis be placed in its alternate position, in this case at the top instead of at the bottom. To learn more about axis scales, including suppressing, extending, or relocating them, see Options : Axis scales (265).
Uses allstates.dta & scheme vg_s2m


twoway scatter ownhome propval100, by(nsw)

[Figure: panels for North, South, and West, each showing % who own home (y-axis) by % homes cost $100K+ (x-axis); note at bottom left: Graphs by Region North, South, or West]

The by(nsw) option is used here to make separate graphs for states in the North, South, and West. At the bottom left corner, you can see a note that describes how the separate graphs arose. This is based on the variable label for nsw; if this variable had not been labeled, it would have read Graphs by nsw. See Options : By (272) for more details about using the by() option.
Uses allstates.dta & scheme vg_s2m
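Because the note is built from the variable label of nsw, changing that label changes the note. The lines below are a minimal sketch, assuming allstates.dta is in the current directory; the replacement label text is made up for illustration.

```stata
* Sketch: the by() note follows the variable label of the by-variable
use allstates, clear
describe nsw                                  // shows the current variable label
label variable nsw "Region of the country"    // hypothetical label text
twoway scatter ownhome propval100, by(nsw)    // the note should now read: Graphs by Region of the country
```

The new label lasts only for the current session unless the dataset is saved afterward.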

twoway scatter ownhome propval100, by(nsw, total)

[Figure: panels for North, South, West, and Total, each showing % who own home (y-axis) by % homes cost $100K+ (x-axis); note: Graphs by Region North, South, or West]

The total option can be used within the by() option to add an additional graph showing all the observations.
Uses allstates.dta & scheme vg_s2m

twoway scatter ownhome propval100, by(nsw, total compact)

[Figure: panels for North, South, West, and Total, displayed more compactly; % who own home (y-axis) by % homes cost $100K+ (x-axis); note: Graphs by Region North, South, or West]

The compact option can be used to make the graphs display more compactly.
Uses allstates.dta & scheme vg_s2m


twoway scatter ownhome propval100, text(47 62 "Washington, DC")

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), with the text Washington, DC added]

We can use the text() option to add text to the graph. We add text to label the observation belonging to Washington, DC. See Options : Adding text (299) for more information about adding text.
Uses allstates.dta & scheme vg_s2m

twoway scatter ownhome propval100, text(47 62 "Washington, DC", size(large) margin(medsmall) blwidth(vthick) box)

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), with the text Washington, DC enclosed in a box]

Stata gives you considerable control over the display of text you add to the graph, as well as the ability to enclose the text in a box and control the characteristics of the box. See Options : Textboxes (303) for more details.
Uses allstates.dta & scheme vg_s2m

twoway (scatter ownhome propval100) (scatteri 42.6 62.1 "DC")

[Figure: % who own home (y-axis) by % homes cost $100K+ (x-axis), with one added point labeled DC]

This graph uses the scatteri (scatter immediate) command to plot and label a point for Washington, DC. The values 42.6 and 62.1 are the values for ownhome and propval100 for Washington, DC, and are followed by "DC", which acts as a marker label for that point. If we had instead specified (9) "DC", then “DC” would have been plotted at the 9 o’clock position.
Uses allstates.dta & scheme vg_s2m


twoway (scatter ownhome propval100) (scatteri 42.6 62.1 "DC" 55.9 89 (8) "HI"), legend(off)

[Figure: scatterplot with added points labeled DC and HI; no legend]

This graph extends the previous example by adding a second point for Hawaii and providing a position for the marker label for Hawaii, placing it at the 8 o’clock position. In addition, the legend(off) option suppresses the legend. Finally, this graph uses the vg_samec scheme, so the markers created via scatteri look identical to the other markers.
Uses allstates.dta & scheme vg_samec

This section concludes by looking at some additional graph commands that make graphs
similar to twoway scatter, namely, twoway spike, twoway dropline, and twoway dot.
Most of the options we have illustrated before apply to these graphs as well, so they will
not be repeated here. We will switch to using the vg_blue scheme for the rest of the graphs
in this section.

twoway scatter r yhat

[Figure: resid propval100|urban (y-axis) by yhat propval100|urban (x-axis)]

Imagine that we ran a regression predicting propval100 from urban and generated the residual, calling it r, and the predicted value, calling it yhat. Consider this graph using the scatter command to display the residual by the predicted value.
Uses allstates.dta & scheme vg_blue
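The regression described above could be set up along the following lines. This is a sketch under stated assumptions: allstates.dta is in the current directory, and the predictor is stored (or can be abbreviated) under the name urban, as the text gives it. The names r and yhat follow the text.

```stata
* Sketch: create the residual (r) and predicted value (yhat) used in these graphs
use allstates, clear
regress propval100 urban
predict yhat                   // predicted values (linear prediction)
predict r, residuals           // residuals
twoway scatter r yhat
```

Both predict calls must follow the regress command, because they use the most recently fitted model.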


twoway spike r yhat

[Figure: spike plot of resid propval100|urban (y-axis) by yhat propval100|urban (x-axis)]

This same graph could be shown using the spike command. This produces a spike plot, and each spike, by default, originates from 0.
Uses allstates.dta & scheme vg_blue

twoway spike r yhat, blcolor(navy) blwidth(thick)

[Figure: spike plot of resid propval100|urban (y-axis) by yhat propval100|urban (x-axis), with thick navy spikes]

You can use the blcolor() (bar line color) option to set the color of the spikes and the blwidth() (bar line width) option to set the width of the spikes. Here, we make the spikes thick and navy. See Styles : Colors (328) for more details about specifying colors and see Styles : Linewidth (337) for more details about specifying line widths.
Uses allstates.dta & scheme vg_blue

twoway spike r yhat, base(10)

[Figure: spike plot of resid propval100|urban (y-axis) by yhat propval100|urban (x-axis), with spikes drawn from 10]

By default, the base is placed at 0, which is a very logical choice when displaying residuals since our interest is in deviations from 0. For illustration, we use the base(10) option to set the base of the y-axis to be 10, and the spikes are displayed with respect to 10.
Uses allstates.dta & scheme vg_blue


twoway spike r yhat, horizontal xtitle(Title for x-axis) ytitle(Title for y-axis)

[Figure: horizontal spike plot; y-axis titled Title for y-axis, x-axis titled Title for x-axis]

The spike command supports the horizontal option, which swaps the position of the r and yhat variables. Note that the x-axis still remains at the bottom and the y-axis still remains at the left.
Uses allstates.dta & scheme vg_blue

twoway dropline r yhat, msymbol(Oh)

[Figure: dropline plot of resid propval100|urban (y-axis) by yhat propval100|urban (x-axis), with hollow circles]

A twoway dropline plot is much like a spike plot but permits a symbol, as well. It supports the horizontal, base(), blcolor(), and blwidth() options just like twoway spike, so these are not illustrated. But you can use marker symbol options to control the symbol. Here, we add the msymbol(Oh) option to obtain hollow circles as the symbols; see Options : Markers (235) for more details.
Uses allstates.dta & scheme vg_blue

twoway dropline r yhat, msymbol(O) msize(vlarge) mfcolor(gold) mlcolor(olive) mlwidth(thick)

[Figure: dropline plot of resid propval100|urban (y-axis) by yhat propval100|urban (x-axis), with very large circles]

Here, we make the symbols very large circles and use mfcolor() to make the marker fill color gold, mlcolor() to make the marker line color olive, and mlwidth() to make the marker line width thick. For more information, see Options : Markers (235).
Uses allstates.dta & scheme vg_blue


twoway dot close tradeday, msize(large) msymbol(O) mfcolor(eltgreen) mlcolor(emerald) mlwidth(thick)

[Figure: dot plot of Closing price (y-axis) by Trading day number (x-axis)]

The dot command is similar to a scatterplot but shows dotted lines for each value of the x-variable, making it more useful when the x-values are equally spaced. In this example, we look at the closing price of the S&P 500 by trading day and make the markers filled with eltgreen with thick emerald outlines.
Uses spjanfeb2001.dta & scheme vg_blue

2.2 Regression fits and splines

This section focuses on the twoway commands that are used for displaying fit values: lfit, qfit, fpfit, mband, mspline, and lowess. For more information, see [G] graph twoway lfit, [G] graph twoway qfit, [G] graph twoway fpfit, [G] graph twoway mband, [G] graph twoway mspline, and [G] graph twoway lowess. We use the allstates data file, omitting Washington, DC, and show the graphs using the vg_s2c scheme.

twoway (scatter ownhome pcturban80) (lfit ownhome pcturban80)

[Figure: % who own home and Fitted values (y-axis) by Percent urban (x-axis)]

Here, we show a scatterplot of ownhome by pcturban80. In addition, we overlay a linear fit lfit predicting ownhome from pcturban80. See Twoway : Overlaying (87) if you would like more information about overlaying twoway graphs.
Uses allstatesdc.dta & scheme vg_s2c


twoway (scatter ownhome pcturban80) (lfit ownhome pcturban80) (qfit ownhome pcturban80)

[Figure: % who own home and two Fitted values lines (y-axis) by Percent urban (x-axis)]

It is sometimes useful to overlay fit plots to compare the fit values. In this example, we overlay a linear fit lfit and quadratic fit qfit and can see some discrepancies between them.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome pcturban80) (mspline ownhome pcturban80) (fpfit ownhome pcturban80) (lowess ownhome pcturban80)

[Figure: % who own home, Median spline, predicted ownhome, and lowess ownhome pcturban80 (y-axis) by Percent urban (x-axis)]

Stata supports a number of other fit methods. Here, we show an mspline (median spline) overlaid with fpfit (fractional polynomial fit) and lowess. Another similar command, not shown, is mband (median band).
Uses allstatesdc.dta & scheme vg_s2c
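Since mband is mentioned but not shown, here is a minimal sketch of how it could be overlaid in the same way. It assumes allstatesdc.dta is in the current directory; the bands(5) setting is an arbitrary choice for illustration, not a value taken from the book.

```stata
* Sketch: overlay a median-band fit on the scatterplot
use allstatesdc, clear
twoway (scatter ownhome pcturban80) (mband ownhome pcturban80, bands(5))
```

Omitting bands() lets Stata choose the number of bands from the sample size.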

2.3 Regression confidence interval (CI) fits

This section focuses on the twoway commands that are used for displaying confidence
intervals around fit values: lfitci, qfitci, and fpfitci. The options permitted by these
three commands are virtually identical so we will use lfitci to illustrate these options.
(Note, however, that fpfitci does not permit the options stdp, stdf, and stdr.) For
more information, see [G] graph twoway lfitci, [G] graph twoway qfitci, and [G] graph
twoway fpfitci.


twoway (lfitci ownhome pcturban80) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis)]

This graph uses the lfitci command to produce a linear fit with confidence interval. The confidence interval, by default, is computed using the standard error of prediction. We overlay this with a scatterplot.
Uses allstatesdc.dta & scheme vg_rose

twoway (scatter ownhome pcturban80) (lfitci ownhome pcturban80)

[Figure: % who own home, 95% CI, and Fitted values (y-axis) by Percent urban (x-axis)]

This example is the same as the previous example; however, the order of the scatter and lfitci commands is reversed. Note that the order matters since the points that fell within the confidence interval are not displayed because they are masked by the shading of the confidence interval.
Uses allstatesdc.dta & scheme vg_rose

twoway (lfitci ownhome pcturban80, stdf) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis)]

Here, we add the stdf option, which computes the confidence intervals using the standard error of forecast. If samples were drawn repeatedly, this confidence interval would capture 95% of the observations. With 50 observations, we would expect 2 or 3 observations to fall outside of the confidence interval, and this corresponds to the data shown here.
Uses allstatesdc.dta & scheme vg_rose
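The "2 or 3 observations" figure quoted for the stdf interval is 5% of the 50 observations. The sketch below checks that arithmetic and then draws the default (stdp) and stdf intervals side by side for comparison. It assumes allstatesdc.dta is in the current directory; the graph names ci_stdp and ci_stdf and the titles are hypothetical, chosen only for this illustration.

```stata
* Sketch: expected count outside a 95% forecast interval, then stdp versus stdf
use allstatesdc, clear
display 50 * (1 - 0.95)        // about 2.5 observations expected outside the interval
twoway (lfitci ownhome pcturban80, stdp) (scatter ownhome pcturban80), name(ci_stdp, replace) title("stdp")
twoway (lfitci ownhome pcturban80, stdf) (scatter ownhome pcturban80), name(ci_stdf, replace) title("stdf")
graph combine ci_stdp ci_stdf, ycommon
```

The ycommon option puts both panels on the same y-axis scale, so the wider forecast band is easy to compare with the narrower band around the fitted line.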


twoway (lfitci ownhome pcturban80, stdf level(90)) (scatter ownhome pcturban80)

[Figure: 90% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis)]

We can use the level() option to set the confidence level for the confidence interval. Here, we make the confidence level 90%.
Uses allstatesdc.dta & scheme vg_rose

twoway (lfitci ownhome pcturban80, nofit) (scatter ownhome pcturban80)

[Figure: 95% CI and % who own home (y-axis) by Percent urban (x-axis), with no fit line]

We now look at how you can control the display of the fit line. We can use the nofit option to suppress the display of the fit line. Note that we have switched to the vg_brite scheme for a different look for the graphs.
Uses allstatesdc.dta & scheme vg_brite

twoway (lfitci ownhome pcturban80, clpattern(dash) clwidth(thick)) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis), with a thick dashed fit line]

You can supply options like connect(), clpattern() (connect line pattern), clwidth() (connect line width), and clcolor() (connect line color) to control how the fit line will be displayed. Here, we use the clpattern(dash) and clwidth(thick) options to make the fit line dashed and thick. See Options : Connecting (250) for more details.
Uses allstatesdc.dta & scheme vg_brite


twoway (lfitci ownhome pcturban80, bcolor(stone)) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis)]

We use the bcolor(stone) option to change the color of the area and outline of the confidence interval. You can use the options illustrated with twoway rarea to control the display of the area encompassing the confidence interval, namely, bcolor(), bfcolor(), blcolor(), blwidth(), and blpattern(). See Twoway : Range (64) and [G] graph twoway rarea for more details.
Uses allstatesdc.dta & scheme vg_brite

twoway (lfitci ownhome pcturban80, ciplot(rline)) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis), with the interval drawn as two lines]

The ciplot() option can be used to select a different command for displaying the confidence interval. The default command is twoway rarea and can be selected via the ciplot(rarea) option. Here, we use the ciplot(rline) option, which displays the confidence interval as two lines without any filled area. The valid options include rarea, rbar, rspike, rcap, rcapsym, rscatter, rline, and rconnected.
Uses allstatesdc.dta & scheme vg_brite

twoway (lfitci ownhome pcturban80, ciplot(rline) blcolor(green) blpattern(dash) blwidth(thick)) (scatter ownhome pcturban80)

[Figure: 95% CI, Fitted values, and % who own home (y-axis) by Percent urban (x-axis), with green, dashed, thick interval lines]

By choosing the rline command for displaying the confidence interval, we can then use options appropriate for the twoway rline command. Here, we make the line green, dashed, and thick. See Styles : Colors (328), Styles : Linepatterns (336), and Styles : Linewidth (337) for more details about colors, line patterns, and line widths.
Uses allstatesdc.dta & scheme vg_brite


2.4 Line plots

This section focuses on the twoway commands for creating line plots, including the
twoway line and twoway connected commands. The line command is the same as
scatter, except that the points are connected by default and marker symbols are not
permitted, whereas the twoway connected command permits marker symbols. This section also illustrates twoway tsline and twoway tsrline, which are useful for drawing line
plots when the x-variable is a date variable. Since all these commands are related to the
twoway scatter command, they support most of the options you would use with twoway
scatter. For more information, see [G] graph twoway line, [G] graph twoway connected, and help graph tsline.

twoway line close tradeday, sort

[Figure: line plot of Closing price (y-axis) by Trading day number (x-axis)]

Here, we show an example using twoway line showing the closing price across trading days. Note the inclusion of the sort option, which is recommended when you have points connected in a Stata graph.
Uses spjanfeb2001.dta & scheme vg_s2c

twoway line close tradeday, sort clwidth(vthick) clcolor(maroon)

[Figure: line plot of Closing price (y-axis) by Trading day number (x-axis), with a very thick maroon line]

Here, we show options controlling the width and color of the lines. Using clwidth(vthick) (connect line width) and clcolor(maroon) (connect line color), we make the line very thick and maroon. See Options : Connecting (250) for more examples. Note that you cannot use options that control marker symbols with graph twoway line.
Uses spjanfeb2001.dta & scheme vg_s2c


twoway connected close tradeday, sort

[Figure: connected plot of Closing price (y-axis) by Trading day number (x-axis)]

This twoway connected graph is similar to the twoway line graphs we saw before, except that when you use connected, a marker is shown for each data point.
Uses spjanfeb2001.dta & scheme vg_s2c

twoway scatter close tradeday, connect(l) sort

[Figure: Closing price (y-axis) by Trading day number (x-axis), with connected markers]

This graph is identical to the previous graph, except this graph is made with the scatter command using the connect(l) option. This illustrates the convenience of using connected since you do not need to manually specify the connect() option.
Uses spjanfeb2001.dta & scheme vg_s2c

twoway connected close tradeday, sort msymbol(Dh) mcolor(blue) msize(large)

[Figure: connected plot of Closing price (y-axis) by Trading day number (x-axis), with large, blue, hollow diamonds]

We can use marker symbol options, such as msymbol(), mcolor(), and msize() to control the marker symbols. Here, we make the symbols large, blue, hollow diamonds. See Options : Markers (235) for more examples.
Uses spjanfeb2001.dta & scheme vg_s2c


twoway connected close tradeday, sort clcolor(cranberry) clpattern(dash) clwidth(thick)

[Figure: connected plot of Closing price (y-axis) by Trading day number (x-axis), with a thick, dashed, cranberry line]

You can control the look of the lines with connect options such as clwidth(), clcolor(), and clpattern() (connect line pattern). In this example, we make the line cranberry, dashed, and thick. See Options : Connecting (250) for more details on connecting points.
Uses spjanfeb2001.dta & scheme vg_s2c

twoway connected high low tradeday, sort

[Figure: High price and Low price (y-axis) by Trading day number (x-axis)]

You can graph multiple variables at once. In this case, we graph the high and low prices across trading days.
Uses spjanfeb2001.dta & scheme vg_s2c

1200

1250

1300

1350

1400

twoway connected high low tradeday, sort
clwidth(thin thick) msymbol(Oh S)

0

10

20

30

40

When graphing multiple variables, you
can specify connect and marker symbol
options to control each line. In this
case, we use a thin line for the high
price and a thick line for the low price.
We also differentiate the two lines by
using different marker symbols, hollow
circles for the high price and squares for
the low price.
Uses spjanfeb2001.dta & scheme vg s2c

Trading day number
High price

Low price
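As a rough sketch of per-line options when graphing several variables, the following uses sp500.dta, a dataset shipped with Stata, as a stand-in for spjanfeb2001.dta. The tradeday variable is constructed here and is only an assumption about how the book's variable was built. The lwidth() option is the name used in current Stata documentation for what this book calls clwidth().

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta
sysuse sp500, clear
sort date
generate tradeday = _n          // assumed: trading days numbered 1, 2, 3, ...
keep if tradeday <= 40          // keep the first 40 trading days of 2001

* Options list one value per y-variable, in order:
* thin line and hollow circles for high, thick line and squares for low
twoway connected high low tradeday, sort lwidth(thin thick) msymbol(Oh S)
```

Each option takes its values in the same order as the y-variables, so the first value applies to high and the second to low.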


Stata has additional commands for creating line plots where the x-variable is a date variable, namely, twoway tsline and twoway tsrline. The tsline command is similar to the line command, and the tsrline command is similar to the rline command, but both of these ts commands offer extra features that make it easier to reference the x-variable in terms of dates. Note that these commands are not currently documented in [G] graph but are documented via help tsline. We will use the sp2001ts data file, which has the prices for the S&P 500 index for 2001 with the trading date stored as a date variable named date. Before saving the file sp2001ts, the tsset date, daily command was used to tell Stata that the variable date represents the time variable and that it represents daily data.

twoway tsline close

The tsline (time-series line) graph shows the closing price on the y-axis and the date on the x-axis. Note that we did not specify the x-variable in the graph command. Stata knew the variable representing time because we issued the tsset date, daily command before saving the sp2001ts file. Once you save the data file, Stata remembers the time variable, and you do not need to set it again.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date]

twoway tsrline low high

We can also use the tsrline (time-series range) graph to show the low price and high price for each day.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Low price/High price by Date]
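The tsset step described in this section can be sketched as follows. This is only an illustration: it uses sp500.dta, a dataset shipped with Stata that also holds S&P 500 prices for 2001, as a stand-in for sp2001ts.dta, and it assumes that dataset's variable names (date, close, low, high).

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for sp2001ts.dta
sysuse sp500, clear

* Declare date as the time variable and mark the data as daily
tsset date, daily

* No x-variable is given; the time variable declared by tsset is used
twoway tsline close
twoway tsrline low high
```

If the data were saved after the tsset command, the declaration would be stored with the file and would not need to be repeated in a later session.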


twoway tsline close, clwidth(thick) clcolor(navy)

As with twoway line, you can use connect options to control the line. Here, we make the line thick and navy.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date]

twoway tsline close if (date >= mdy(1,1,2001)) & (date <= mdy(3,31,2001))

You can use if to subset cases to graph. Here, we graph the closing prices between January 1, 2001, and March 31, 2001. See the next example for an easier way of doing this.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date, January through March 2001]

twoway tsline close if tin(01jan2001,31mar2001)

When using the tsline command, you can use tin() (time in between) to specify that you want to graph just the cases between January 1, 2001, and March 31, 2001, inclusive.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date, January through March 2001]
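To see that the two ways of subsetting select the same cases, one could count the observations each condition picks up. This is a sketch using sp500.dta (shipped with Stata) in place of sp2001ts.dta; the local macro names n_if and n_tin are made up for this illustration.

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for sp2001ts.dta
sysuse sp500, clear
tsset date, daily               // tin() needs the time variable to be set

* Count cases selected by the explicit date comparison
count if (date >= mdy(1,1,2001)) & (date <= mdy(3,31,2001))
local n_if = r(N)

* Count cases selected by tin(), which includes both endpoints
count if tin(01jan2001, 31mar2001)
local n_tin = r(N)

* The two conditions should pick up the same number of cases
assert `n_if' == `n_tin'

twoway tsline close if tin(01jan2001, 31mar2001)
```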


twoway tsline close, ttitle(Day of Year)

We can use the ttitle() (time title) option to give a title to the time axis. We specify this as ttitle() instead of xtitle() since it refers to the axis with the time variable.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Day of Year]

twoway tsline close, tlabel(01jan2001 31mar2001 30jun2001 30sep2001 01jan2002)

We can use the tlabel() option to label the time points on the time axis. Note that we specified these dates using date literals, and Stata knew how to interpret these and appropriately label the graph with these values.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date, labeled at 1Jan01, 31Mar01, 30Jun01, 30Sep01, and 1Jan02]

twoway tsline close, tlabel(01jan2001 30jun2001 01jan2002) tmlabel(31mar2001 30sep2001)

We can use the tmlabel() option to include minor labels.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date, with major labels at 1Jan01, 30Jun01, and 1Jan02 and minor labels at 31Mar01 and 30Sep01]


twoway tsline close, tlabel(01jan2001 30jun2001 01jan2002) tmtick(31mar2001 30sep2001)

We can use the tmtick() option to include minor ticks instead.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date]

twoway tsline close, tline(01apr2001 01jul2001 01oct2001)

The tline() option can be used to include lines at certain time points. Here, we place lines at the start of the second, third, and fourth quarters.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date]

twoway tsline close, ttext(1035 01apr2001 "Start of Q2", orientation(vertical))

We can use the ttext() option to add text to the graph. The first coordinate refers to the position on the y-axis, and the second coordinate is the position on the time axis in terms of the date.
Uses sp2001ts.dta & scheme vg_s1c

[Graph: Closing price by Date, with the text "Start of Q2" at 1Apr01]
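The time-axis options shown above can be combined in a single command. The following is a sketch under the assumption that sp500.dta (shipped with Stata) is an acceptable stand-in for sp2001ts.dta; the particular dates and the y-position 1035 are taken from the examples above.

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for sp2001ts.dta
sysuse sp500, clear
tsset date, daily

* A date literal corresponds to an ordinary numeric date value
display td(01apr2001)

* Major labels, minor ticks, reference lines, text, and an axis title,
* all given in terms of dates on the time axis
twoway tsline close,                                            ///
    tlabel(01jan2001 01jul2001 01jan2002)                       ///
    tmtick(01apr2001 01oct2001)                                 ///
    tline(01apr2001 01jul2001 01oct2001)                        ///
    ttext(1035 01apr2001 "Start of Q2", orientation(vertical))  ///
    ttitle(Day of Year)
```

The /// at the end of each line continues the command on the next line and works in do-files, not when typing interactively.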


2.5 Area plots

This section illustrates area graphs created with twoway area. These graphs are similar to twoway line graphs, except that the area under the line is shaded. As a result, many of the options that you would use with twoway line are applicable; see Twoway : Line (54) for more details. For even more details, see [G] graph twoway area. We will use the spjanfeb2001 data file, which has the prices for the S&P 500 index for January and February 2001.

twoway area close tradeday, sort

This is an example of a twoway area graph. Because this graph is composed of connected points, the sort option is recommended in case the data are not already sorted by tradeday. If the data are not sorted and the sort option is not specified, the points are connected in the order they appear in the data file, which will generally not produce the graph you desire.
Uses spjanfeb2001.dta & scheme vg_palec

[Graph: Closing price by Trading day number]

twoway area close tradeday, horizontal sort xtitle(Title for x-axis) ytitle(Title for y-axis)

The horizontal option swaps the position of the close and tradeday variables. Note that the x-axis remains at the bottom and the y-axis remains at the left.
Uses spjanfeb2001.dta & scheme vg_palec

[Graph: y-axis titled "Title for y-axis" (trading day), x-axis titled "Title for x-axis" (closing price)]


twoway area close tradeday, sort base(1320.28)

You can use the base() option to indicate a base from which the area is to be shaded. In this example, the base is the closing price on the first trading day, and thus all the subsequent points are a kind of deviation from the first day's closing price.
Uses spjanfeb2001.dta & scheme vg_palec

[Graph: Closing price by Trading day number]
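Rather than typing the first day's closing price by hand, the base could be picked up from the data. This is a sketch, not the book's method: it uses sp500.dta (shipped with Stata) in place of spjanfeb2001.dta, builds a hypothetical tradeday variable, and stores the value in a local macro named first.

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta
sysuse sp500, clear
sort date
generate tradeday = _n          // assumed: trading days numbered 1, 2, 3, ...
keep if tradeday <= 40

* Closing price on the first trading day (data are sorted by date)
local first = close[1]

* Shade the area relative to the first day's closing price
twoway area close tradeday, sort base(`first')
```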

twoway area close tradeday, sort bcolor(emerald)

The bcolor() option sets the color of the shaded area and the line. Here, we make the shaded area and line emerald. Although it is not shown, you can also use the bfcolor() and blcolor() options to control the fill color and line color and the blwidth() option to control the thickness of the outline.
Uses spjanfeb2001.dta & scheme vg_palec

[Graph: Closing price by Trading day number]

2.6 Bar plots

This section illustrates bar graphs created with twoway bar. These graphs show a bar for each x-value, where the height of the bar corresponds to the value of the y-variable. For more details, see [G] graph twoway bar. We will continue to use the spjanfeb2001 data file, which has the prices for the S&P 500 index for January and February 2001, but show the graphs using the vg_s1m scheme. twoway bar is useful for creating bar graphs with overlays of lines, points, or other plot types and can be useful with evenly spaced x-variable data. graph bar is more useful for creating bar graphs with categorical data.


twoway bar close tradeday

Consider this bar chart, which shows the closing prices of the S&P 500 broken down by the trading day of the year.
Uses spjanfeb2001.dta & scheme vg_s1m

[Graph: Closing price by Trading day number]

twoway bar close tradeday, horizontal xtitle(Title for x-axis) ytitle(Title for y-axis)

We can make the close and tradeday variables trade places with the horizontal option. Note that the x-axis still remains at the bottom and the y-axis still remains at the left.
Uses spjanfeb2001.dta & scheme vg_s1m

[Graph: y-axis titled "Title for y-axis" (trading day), x-axis titled "Title for x-axis" (closing price)]

twoway bar close tradeday, base(1200)

Unless we specify otherwise, the base for the bars is the lowest closing price. In this example, the closing price on day 40 was 1239.94, so unless we specify the base() option, the base would be 1239.94. As a result, the bar for day 40 would have zero height. Here, we change the base to 1200 to give this bar a height.
Uses spjanfeb2001.dta & scheme vg_s1m

[Graph: Closing price by Trading day number]
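One way to choose a base that sits below every bar is to compute it from the lowest closing price. The following is a sketch only: sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta, tradeday is a hypothetical constructed variable, and rounding down to a multiple of 50 is just one arbitrary choice.

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta
sysuse sp500, clear
sort date
generate tradeday = _n          // assumed: trading days numbered 1, 2, 3, ...
keep if tradeday <= 40

* Find the lowest closing price and round it down to a multiple of 50
summarize close, meanonly
local base = 50 * floor(r(min) / 50)

* Every bar now extends from a value at or below the lowest price
twoway bar close tradeday, base(`base')
```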


twoway bar close tradeday, barwidth(.7)

Unless otherwise specified, the width of each bar is one x-unit (in this case, one day). By making the width of the bars .7, we can obtain a small gap between the bars.
Uses spjanfeb2001.dta & scheme vg_s1m

[Graph: Closing price by Trading day number]

twoway bar close tradeday, bfcolor(gs15) blcolor(gs5)

We can use the bfcolor() (bar fill color) option to set the color of the inside of the bars and the blcolor() (bar line color) option to set the color of the bar outlines. Here, we make the bars light gray on the inside and dark gray on the outside. See Styles : Colors (328) for more colors you can choose.
Uses spjanfeb2001.dta & scheme vg_s1m

[Graph: Closing price by Trading day number]

2.7 Range plots

This section focuses on twoway commands that display range plots. The major characteristic these graphs share is that, for each x-value, there are two corresponding y-values. A common example is a confidence interval where, for each x-value, there are upper and lower confidence limits. We first show examples of all of these types of graphs and then consider the options that can be used to customize them. For more information, see [G] graph twoway rarea, [G] graph twoway rbar, [G] graph twoway rspike, [G] graph twoway rcap, [G] graph twoway rcapsym, [G] graph twoway rscatter, [G] graph twoway rline, and [G] graph twoway rconnected. We will start by looking at the rconnected, rscatter, rline, and rarea graphs, which use combinations of lines, symbols, and shading to display range plots. These examples use the spjanfeb2001 data file.
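The confidence-interval case mentioned above can be sketched as follows. This example is not from the book: it uses auto.dta (shipped with Stata), computes 95% confidence limits for mean mpg within each level of rep78, and the variable names m, s, n, hi, and lo are made up for the illustration.

```stata
* Sketch only: auto.dta (shipped with Stata), not one of the book's data files
sysuse auto, clear
drop if missing(rep78)

* One row per x-value, holding the mean, standard deviation, and count
collapse (mean) m=mpg (sd) s=mpg (count) n=mpg, by(rep78)

* Upper and lower 95% confidence limits for the mean
generate hi = m + invttail(n - 1, 0.025) * s / sqrt(n)
generate lo = m - invttail(n - 1, 0.025) * s / sqrt(n)

* For each x-value there are two y-values: the upper and lower limits
twoway rcap hi lo rep78 || scatter m rep78
```

The range plot takes its two y-variables first and the x-variable last, the same pattern used by every command in this section.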

twoway rconnected high low tradeday, sort

The rconnected (range connected) graph shows the high and low prices by tradeday, the number of days stocks have been traded in the year. The rconnected plot shows a separate line for the high and low prices, and a marker appears for each x-value. The sort option is recommended because the points are connected by lines; it is needed if the data are not already sorted on tradeday.
Uses spjanfeb2001.dta & scheme vg_rose

[Graph: High price/Low price by Trading day number]

twoway rscatter high low tradeday

The rscatter graph is similar to the rconnected graph, except that lines connecting the symbols are not plotted.
Uses spjanfeb2001.dta & scheme vg_rose

[Graph: High price/Low price by Trading day number]

twoway rline high low tradeday, sort

The rline graph is similar to the rconnected graph, except that symbols are not plotted at each level of x. Note the inclusion of the sort option. This option is recommended because the points are connected by lines; it is needed if the data are not already sorted on tradeday.
Uses spjanfeb2001.dta & scheme vg_rose

[Graph: High price/Low price by Trading day number]


twoway rarea high low tradeday, sort

The rarea graph is similar to the rline graph, except that you can control the fill color of the area between the high and low values.
Uses spjanfeb2001.dta & scheme vg_rose

[Graph: High price/Low price by Trading day number]

Next, we discuss the rcap, rspike, and rcapsym graphs, which use combinations of spikes, caps, and symbols to display range plots. These plots are followed by rbar, which uses bars to display range plots. These next examples are shown using the vg_s2m scheme.

twoway rcap high low tradeday

The rcap graph shows a spike ranging from the low to high values and puts a cap at the top and bottom of each spike.
Uses spjanfeb2001.dta & scheme vg_s2m

[Graph: High price/Low price by Trading day number]


twoway rspike high low tradeday

The rspike graph is similar to the rcap graph, except that no caps are placed on the spikes.
Uses spjanfeb2001.dta & scheme vg_s2m

[Graph: High price/Low price by Trading day number]

twoway rcapsym high low tradeday

The rcapsym graph is similar to the rcap graph, except that instead of caps, symbols are placed at the end of the spikes. You can choose among the same symbols that are available for a scatterplot.
Uses spjanfeb2001.dta & scheme vg_s2m

[Graph: High price/Low price by Trading day number]

twoway rbar high low tradeday

The rbar graph uses bars for each value of x to show the high and low values of y.
Uses spjanfeb2001.dta & scheme vg_s2m

[Graph: High price/Low price by Trading day number]


Let's now consider options you can use with the rconnected, rscatter, rline, and rarea graphs. We will start by looking at rconnected plots since many of the options used in that kind of graph also apply to rscatter, rline, and rarea graphs. These graphs will be shown using the vg_s1c scheme.

twoway rconnected high low tradeday, sort

Here is a general rconnected graph.
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]

twoway rconnected high low tradeday, sort horizontal xtitle(Title for x-axis) ytitle(Title for y-axis)

With the horizontal option, you can swap the axes where high/low and tradeday appear. Note that the x-axis remains at the bottom and the y-axis remains at the left.
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: y-axis titled "Title for y-axis" (trading day), x-axis titled "Title for x-axis" (high and low prices)]


twoway rconnected high low tradeday, sort blwidth(thick) blcolor(dkgreen) blpattern(dash)

You can control the look of the lines with connect options such as connect(), blwidth(), blcolor(), and blpattern(). Here, we make the lines thick, dark green, and dashed. See Options : Connecting (250) for more examples, and see more details in Styles : Connect (332), Styles : Linewidth (337), Styles : Colors (328), and Styles : Linepatterns (336).
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]

twoway rconnected high low tradeday, sort msymbol(Oh) msize(large) mcolor(lavender)

You can control the look of the marker symbols with options such as msymbol(), msize(), and mcolor(). Here, we make the marker symbols large, hollow, lavender circles. For more details about options related to symbols, see Options : Markers (235).
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]

twoway rscatter high low tradeday, sort msymbol(Sh) msize(medium) mlwidth(thick)

The options you can use with rscatter are a subset of those you would use with rconnected; the connecting options are not relevant. Here, we use the marker options to make the symbols medium, hollow squares with thick outlines. For more details about options related to marker symbols, see Options : Markers (235).
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]


twoway rline high low tradeday, sort blwidth(thick) blcolor(blue)

The options you can use with rline are a subset of those you would use with rconnected; namely, the marker symbol and marker label options are not relevant. Here, we show the use of connect options to make the lines thick and blue. For more details about connect options, see Options : Connecting (250).
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]

twoway rarea high low tradeday, sort bcolor(teal)

The rarea graph is similar to the rline graph, but in addition to being able to control the characteristics of the line, you can also control the color of the area between the low and high lines. Here, we use the bcolor() option to make the color of the line and the area teal.
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]

twoway rarea high low tradeday, sort blcolor(emerald) bfcolor(teal) blwidth(thick)

Here, we make the color of the line emerald with the blcolor() option, the fill color teal with the bfcolor() option, and the line thick with the blwidth() option.
Uses spjanfeb2001.dta & scheme vg_s1c

[Graph: High price/Low price by Trading day number]
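For readers working in a later Stata release, the same outline and fill settings can be sketched with the option names used in current documentation: lcolor(), fcolor(), and lwidth() in place of blcolor(), bfcolor(), and blwidth(). As before, sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta, and tradeday is a hypothetical constructed variable.

```stata
* Sketch only: sp500.dta (shipped with Stata) stands in for spjanfeb2001.dta
sysuse sp500, clear
sort date
generate tradeday = _n          // assumed: trading days numbered 1, 2, 3, ...
keep if tradeday <= 40

* Emerald outline, teal fill, thick outline
twoway rarea high low tradeday, sort lcolor(emerald) fcolor(teal) lwidth(thick)
```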


Now, let's look at options that can be used with the rcap, rspike, and rcapsym graphs. The options permitted by the rcap command are similar to the options used with the rspike and rcapsym graphs. For these examples, we will use the vg_s2c scheme.

twoway rcap high low tradeday

Here is an rcap graph with the default options. The rcap command supports the horizontal option, which would make the variables high/low and tradeday swap positions.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: High price/Low price by Trading day number]

twoway rcap high low tradeday, msize(small)

The msize() option is usually used to control the size of a marker and is adapted for this kind of graph to control the size of the cap. In this case, the cap is made small.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: High price/Low price by Trading day number]


twoway rcap high low tradeday, blcolor(cranberry) blwidth(thick)

The blcolor() option is used to control the color of the line, in this case making the line cranberry. The blwidth() option is used to set the width of the line; in this case, the line is made thick. Although it is not shown here, you could also control the pattern of the line with the blpattern() option. See Options : Connecting (250) for more details.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: High price/Low price by Trading day number]

twoway rspike high low tradeday, blcolor(red) blwidth(thin)

The options used for rspike are basically the same as those for rcap, except that the msize() option is not appropriate since there are no markers to size. Here, for example, we use blcolor() and blwidth() to make the lines red and thin.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: High price/Low price by Trading day number]

twoway rcapsym high low tradeday, msymbol(Oh) msize(large)

The options used for rcapsym are basically the same as for rcap, except that you can use marker options to select the marker that goes at the top and bottom of each spike, and you can also use marker label options to label the markers (however, this is probably not very useful and is not illustrated). In this case, we use the msymbol() option to place hollow circles at the end of the spikes and the msize() option to make the symbols large.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: High price/Low price by Trading day number]


We will now explore options that can be used with twoway rbar, and we will switch to using the vg_brite scheme.

twoway rbar high low tradeday

Here is a basic rbar graph with the default options. As with the other graphs in this family, we could have added the horizontal option to switch the position of the high/low and tradeday variables, but this is not shown.
Uses spjanfeb2001.dta & scheme vg_brite

[Graph: High price/Low price by Trading day number]

twoway rbar high low tradeday, barwidth(.7)

The barwidth() option can be used to set the width of the bar. This width is in units of the x-variable. We set the bars to be .7 units wide, so they no longer touch each other.
Uses spjanfeb2001.dta & scheme vg_brite

[Graph: High price/Low price by Trading day number]


twoway rbar high low tradeday, bcolor(sienna)

The bcolor() (bar color) option sets the color of the bar and the outline, making the color sienna.
Uses spjanfeb2001.dta & scheme vg_brite

[Graph: High price/Low price by Trading day number]

twoway rbar high low tradeday, bfcolor(sienna) blcolor(cranberry) blwidth(thick)

With the bfcolor() (bar fill color) option, we set the fill color of the bar to be sienna and then use the blcolor() (bar line color) option to set the color of the outline to be cranberry. We also use the blwidth() (bar line width) option to make the lines surrounding the bars thick.
Uses spjanfeb2001.dta & scheme vg_brite

[Graph: High price/Low price by Trading day number]

2.8 Distribution plots

This section describes the use of twoway histogram and twoway kdensity for showing the distribution of a single variable. This section also shows the use of twoway function for showing the relationship between x and y using a function that you specify. See [G] graph twoway histogram, [G] graph twoway kdensity, and [G] graph twoway function for more information. We will start by showing the twoway histogram command and consider options that allow you to control such things as the number of bins, the width of the bins, and the starting point for the bins. Then, we will show options that control the scaling of the y-axis. The next few graphs use the vg_past scheme.


twoway histogram ttl_exp

We begin by showing a histogram of the variable total work experience. Note that, unlike
many other twoway plots, this command takes only one variable that is graphed on the x-axis.
The y-axis represents the density, such that the sum of the areas of the bars equals 1. If
you are not going to combine this graph with other twoway graphs, the histogram command may
be preferable to twoway histogram.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Density; x-axis: Tot. work exper.]

twoway histogram ttl_exp, bin(10)

We can control the number of bins that are used to display the histogram using the bin()
option. Here, we request that 10 bins be used.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Density; x-axis: Tot. work exper.]

twoway histogram ttl_exp, width(5)

We can control the width of each bar using the width() option. Here, we make each bar 5
units wide. As you might imagine, you can use either the bin() option or the width() option
but not both.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Density; x-axis: Tot. work exper.]
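
The bin() and width() examples above can be tried side by side with a short do-file. This is only a sketch: it assumes the nlsw88 dataset shipped with Stata (loaded with sysuse) as a stand-in for the book's nlsw.dta, since both contain ttl_exp. The last line is expected to stop with an error, because the two options cannot be combined.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Fix the number of bins and let Stata choose the width
twoway histogram ttl_exp, bin(10)

* Fix the width of each bin and let Stata choose the number of bins
twoway histogram ttl_exp, width(5)

* Specifying both should be rejected; capture noisily shows the
* error message without halting the do-file
capture noisily twoway histogram ttl_exp, bin(10) width(5)
```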


twoway histogram ttl_exp, start(-2.5) width(5)

We add the start() option to indicate that we want the lower limit of the first bin to
start at −2.5.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Density; x-axis: Tot. work exper.]

twoway histogram ttl_exp, fraction width(1)

If we use the fraction option, the y-axis is scaled such that the height of each bar is the
probability of falling within the range of x-values represented by the bar. Thus, if we
specify the width of bars to be 1, the sum of the heights of the bars is 1.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Fraction; x-axis: Tot. work exper.]

twoway histogram ttl_exp, percent width(1)

The percent option is similar to the fraction option, except that the y-axis is represented
as a percentage instead of a proportion. If we also specify a bar width of 1, the sum of the
heights of the bars is 100%.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Percent; x-axis: Tot. work exper.]
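
To compare the y-axis scalings directly, the graphs can be stored under names and placed next to each other. The sketch below again assumes nlsw88 as a stand-in for nlsw.dta; the graph names h_density, h_fraction, and h_percent are made up for illustration.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Default density scale: the areas of the bars sum to 1
twoway histogram ttl_exp, width(1) name(h_density, replace)

* Fraction scale: the heights of the bars sum to 1
twoway histogram ttl_exp, width(1) fraction name(h_fraction, replace)

* Percent scale: the heights of the bars sum to 100
twoway histogram ttl_exp, width(1) percent name(h_percent, replace)

* Show the three stored graphs in a single row
graph combine h_density h_fraction h_percent, rows(1)
```

With a bin width of 1 the density and fraction graphs have the same bar heights, because each bar's area equals its height; with any other width only the fraction heights would still sum to 1.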


twoway histogram ttl_exp, frequency width(1)

The frequency option changes the scaling of the y-axis to represent the number of cases that
fall within the range of x-values represented by the bar. If we specify a bar width of 1,
the sum of the heights of the bars equals the number of nonmissing values for ttl_exp.
Uses nlsw.dta & scheme vg_past
[Graph: y-axis: Frequency; x-axis: Tot. work exper.]

Let's now consider options that control the width of the bars and other characteristics
of the bars, such as color. Then, we will show you how to display the graph as a horizontal
histogram and demonstrate options that allow you to treat varname as a discrete variable.
We will use the vg_blue scheme for these graphs.

twoway histogram ttl_exp, gap(20)

The gap() option specifies the gap between each of the bars. The gap is created by reducing
the width of the bars. By default, the gap is 0, meaning that the bars touch exactly and the
bars are reduced by 0%. Here, we reduce the size of the bars by 20%, making a small gap
between the bars.
Uses nlsw.dta & scheme vg_blue
[Graph: y-axis: Density; x-axis: Tot. work exper.]


twoway histogram ttl_exp, gap(99.99)

Here, we reduce the size of the bars by 99.99%, making the bars 0.01% of their normal size.
Uses nlsw.dta & scheme vg_blue
[Graph: y-axis: Density; x-axis: Tot. work exper.]

twoway histogram ttl_exp, barwidth(.5)

Another way you can control the width of the bars is through the barwidth() option. Here, we
indicate that we wish each bar to be .5 x-units wide.
Uses nlsw.dta & scheme vg_blue
[Graph: y-axis: Density; x-axis: Tot. work exper.]

twoway histogram ttl_exp, bfcolor(olive_teal) blcolor(teal)
  blwidth(thick)

We use the bfcolor() (bar fill color) option to make the fill color of the bar olive-teal
and the blcolor() (bar line color) option to make the bar line color teal. The blwidth()
(bar line width) option makes the line around the bar thick. The section Styles : Colors (328)
shows more about colors, and Styles : Linewidth (337) shows more about line widths.
Uses nlsw.dta & scheme vg_blue
[Graph: y-axis: Density; x-axis: Tot. work exper.]


We will now briefly consider some other options that can be used with twoway histogram,
showing how you can swap the position of the x- and y-axes, and the discrete option for
use with discrete variables. We will use the vg_s1m scheme for the next set of graphs.

twoway histogram ttl_exp, horizontal

We can use the horizontal option to swap the position of ttl_exp and its density, making a
horizontal display of the histogram.
Uses nlsw.dta & scheme vg_s1m
[Graph: y-axis: Tot. work exper.; x-axis: Density]

twoway histogram grade, discrete

Here, we use the discrete option to tell Stata that the variable grade is a discrete
variable and can take on only integer values. In this example, each bin has a width of 1,
and the bars are too narrow to be useful.
Uses nlsw.dta & scheme vg_s1m
[Graph: y-axis: Density; x-axis: current grade completed]


twoway histogram grade, discrete width(2)

We add the width() option, and the bars have a width of 2.
Uses nlsw.dta & scheme vg_s1m
[Graph: y-axis: Density; x-axis: current grade completed]

We will now consider kernel-density plots that can be created using twoway kdensity.
For more details, see [G] graph twoway kdensity. As with histograms, if you are not going
to combine the kernel-density plot with other twoway plots, and sometimes even when you
are, the kdensity command is preferable to twoway kdensity. We will explore a handful
of options that are useful for controlling the display of these graphs. These graphs will use
the vg_s2c scheme.

twoway kdensity ttl_exp

Here is a kernel-density plot of total work experience. We could have added the horizontal
option to display the graph as a horizontal plot, but this option is not shown.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: kdensity ttl_exp; x-axis: x]


twoway kdensity ttl_exp, biweight

By default, Stata uses an Epanechnikov kernel for computing the density estimates. Here, we
use the biweight option to use the biweight kernel for computing the densities. Other
kernels include cosine, gauss, parzen, rectangle, and triangle.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: kdensity ttl_exp; x-axis: x]

twoway kdensity ttl_exp, range(0 40)

You can use the range() option to specify the range of the x-values at which the kernel
density is computed and displayed. Here, we expand the range to span from 0 to 40.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: kdensity ttl_exp; x-axis: x]

twoway (histogram ttl_exp, width(1) frequency)
  (kdensity ttl_exp, area(2246))

In this example, we overlay a histogram of ttl_exp, scaling the y-axis as the frequency of
values in each bin. We overlay this with a kdensity plot but want to scale the y-axis in a
commensurate manner. By using the area() option, we can specify that the area under the
kernel density curve should equal 2246, the sample size.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: 0 to 200; legend: Frequency; kdensity ttl_exp, area=2246]
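
Rather than typing the sample size by hand, the value passed to area() can be taken from the data. The following is a sketch under the assumption that nlsw88 (shipped with Stata) matches the book's nlsw.dta closely enough; the local macro name n is arbitrary.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Count the nonmissing values of ttl_exp and keep the result in a local
quietly count if !missing(ttl_exp)
local n = r(N)

* With width(1) the frequency bars have a total area of N x 1, so
* area(`n') puts the kernel density on the same scale as the histogram
twoway (histogram ttl_exp, width(1) frequency) ///
       (kdensity ttl_exp, area(`n'))
```

If a bin width other than 1 were used, the total area of the frequency bars would be the sample size times that width, so the value given to area() would need to be scaled in the same way.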


twoway kdensity ttl_exp, clwidth(thick) clpattern(dash)

We can use options such as clcolor(), clwidth(), and clpattern() to alter the
characteristics of the line. Here, we use the clwidth() and clpattern() options to make the
line thick and dashed. See Styles : Linewidth (337), Styles : Linepatterns (336), and
Styles : Colors (328) for more details.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: kdensity ttl_exp; x-axis: x]

twoway function y=normden(x), range(-4 4)

We conclude by showing how you can use twoway function to graph an arbitrary function. We
graph the function y=normden(x) to show a normal curve. We add the range(-4 4) option to
specify that we want the x-values to range from −4 to 4. Otherwise, the graph would show the
x-values ranging from 0 to 1.
Uses nlsw.dta & scheme vg_s2c
[Graph: y-axis: y; x-axis: x]
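
Because twoway function is a twoway plot type, it can also be overlaid on a histogram, for instance to compare the data with a normal curve that has the same mean and standard deviation. The sketch below makes two assumptions: it uses nlsw88 (shipped with Stata) in place of nlsw.dta, and it uses normalden(), the name and three-argument form found in later Stata releases, rather than the normden() shown above.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Store the sample mean and standard deviation of ttl_exp in locals
quietly summarize ttl_exp
local m = r(mean)
local s = r(sd)

* Overlay a normal density with that mean and sd on the histogram;
* normalden(x, m, s) is assumed to be available (later Stata releases)
twoway (histogram ttl_exp, width(1)) ///
       (function y = normalden(x, `m', `s'), range(0 30))
```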

2.9 Options

This section discusses the use of options with twoway, showing the types of options you
can use. For more details, see Options (235). This section uses the vg_outm scheme for
displaying the graphs.


twoway scatter ownhome propval100

Consider this basic scatterplot. We will use this for illustrating options.
Uses allstates.dta & scheme vg_outm
[Graph: y-axis: % who own home; x-axis: % homes cost $100K+]

twoway scatter ownhome propval100, msymbol(S)

We can use the msymbol() option to control the marker symbols. Here, we use squares as
symbols. See Options : Markers (235) for more details.
Uses allstates.dta & scheme vg_outm
[Graph: y-axis: % who own home; x-axis: % homes cost $100K+]

twoway scatter ownhome propval100, msymbol(S) mlabel(stateab)

We can use the mlabel() option to control the marker labels. Here, we label each of the
markers with the variable stateab showing the two-letter abbreviation for each state next to
each marker. See Options : Marker labels (247) for more information about marker labels.
Uses allstates.dta & scheme vg_outm
[Graph: markers labeled with state abbreviations; y-axis: % who own home; x-axis: % homes cost $100K+]


twoway scatter fv ownhome propval100, connect(l .) sort

Say that we regressed ownhome on propval100 and generated predicted values named fv. Here,
we make a scatterplot and fit line in the same graph using the connect(l .) option to connect
the values of fv but not the values of ownhome. We also add the sort option, which is
generally recommended when using the connect() option. See Options : Connecting (250) for
more details.
Uses allstates.dta & scheme vg_outm
[Graph: legend: yhat ownhome|propval100, % who own home; x-axis: % homes cost $100K+]
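
The regression step that produces fv is described but not shown. A minimal sketch of it follows, assuming the book's allstates.dta is in the current working directory; the name fv is taken from the text above.

```stata
* Assumption: the book's allstates.dta is in the working directory
use allstates, clear

* Regress ownhome on propval100 and save the predicted values as fv
regress ownhome propval100
predict fv

* Connect the fv points with a line (l) and leave ownhome unconnected (.)
twoway scatter fv ownhome propval100, connect(l .) sort
```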

twoway scatter propval100 rent700 ownhome,
  xtitle(Percent of households that own their own home)

We can add a title to the x-axis using the xtitle() option, as illustrated here. See
Options : Axis titles (254) for more details about titles.
Uses allstates.dta & scheme vg_outm
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: Percent of households that own their own home]

twoway scatter propval100 rent700 ownhome, ylabel(0(10)100)

We can label the y-axis from 0 to 100, incrementing by 10, using the ylabel(0(10)100)
option as shown here. See Options : Axis labels (256) for more information about labeling
axes.
Uses allstates.dta & scheme vg_outm
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: % who own home]


twoway scatter propval100 rent700 ownhome,
  ylabel(0(10)100) yscale(alt)

Stata gives you a number of options that you can use to control the axis scale for both the
x- and y-axes. For example, here we use yscale(alt) to move the y-axis to its alternate
position, moving it from the left to the right. See Options : Axis scales (265) for more
details about the options for controlling the axis scales.
Uses allstates.dta & scheme vg_outm
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: % who own home]

twoway (scatter propval100 ownhome)
  (scatter rent700 ownhome, yaxis(2))

In this example, we show propval100 by ownhome and also rent700 by ownhome, but for this
second plot, we place the y-variable on the second y-axis with the yaxis(2) option. See
Options : Axis selection (269) for more information about using and controlling additional
axes.
Uses allstates.dta & scheme vg_outm
[Graph: left y-axis: % homes cost $100K+; right y-axis: % rents $700+/mo; x-axis: % who own home]

twoway scatter propval100 rent700 ownhome,
  ylabel(0(10)100) yscale(alt) by(north)

The by() option allows you to see a graph broken down by one or more by() variables. Here,
we show the graph from above further broken down by whether the state was part of the North,
making two graphs that are combined together into a single graph. The section Options : By
(272) shows more details and examples about the use of the by() option.
Uses allstates.dta & scheme vg_outm
[Graph: panels: S&W, North; legend: % homes cost $100K+, % rents $700+/mo; x-axis: % who own home; note: Graphs by Region North or Not]


twoway scatter propval100 rent700 ownhome, legend(cols(1))

The legend() option allows you to control the contents and display of the legend. Here, we
use the legend(cols(1)) option to indicate that we want the legend to display as a single
column. See Options : Legend (287) for more details about the legend() option.
Uses allstates.dta & scheme vg_outm
[Graph: legend in one column: % homes cost $100K+, % rents $700+/mo; x-axis: % who own home]

twoway scatter propval100 ownhome, text(62 45 "DC")

In this graph, there is a single observation that stands out from the rest. Rather than use
the mlabel() option to label all of the markers, we may want to label just the outlying
point. Here, we use the text() option to add the text DC at the (y,x) coordinates of (62,45),
in effect labeling that point; see Options : Adding text (299) for more details.
Uses allstates.dta & scheme vg_outm
[Graph: y-axis: % homes cost $100K+; x-axis: % who own home; one point labeled DC]

twoway scatter propval100 ownhome,
  title(This is a Title, box bfcolor(dimgray)
  blcolor(black) blwidth(thick))

Most items of text on a Stata graph actually display within a box. We illustrate this with
the title() option showing how we can place a box around this text. We make the background
color of the box light gray and the outline thick and black. These options are described in
more detail in Options : Textboxes (303).
Uses allstates.dta & scheme vg_outm
[Graph: boxed title: This is a Title; y-axis: % homes cost $100K+; x-axis: % who own home]


2.10 Overlaying plots

One of the terrific features of twoway graphs is the ability to overlay them, giving you
the flexibility to create more complex graphs. This section shows two strategies you can
use. The first strategy is graphing multiple y-variables against a single x-variable in a
single twoway command. The second strategy is specifying multiple commands within a single
twoway command, thus overlaying these graphs atop each other. It is also possible to create
separate graphs and glue them together using the graph combine command, which is discussed
in Appendix : Save/Redisplay/Combine (358). We start by illustrating how you can specify
multiple y-variables against a single x-variable using a single twoway command.

twoway scatter propval100 rent700 urban

We can use twoway scatter to graph multiple y-variables against a single x-variable in a
single plot. Here, we show propval100 and rent700 against urban. Note that we are now using
the vg_teal scheme.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: Percent urban 1990]

twoway scatter propval100 rent700 urban, msymbol(Oh t)

The msymbol() option can be used to select the marker symbols for the multiple y-variables.
Here, we plot the variable propval100 with hollow circles, and rent700 is plotted with
triangles.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: Percent urban 1990]


twoway scatter propval100 rent700 urban, mstyle(p2 p8)

The mstyle() (marker style) option can be used to choose among marker styles. These
composite styles set the symbol, size, fill, color, outline color, and outline width for the
markers.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, % rents $700+/mo; x-axis: Percent urban 1990]

twoway line high low close tradeday, sort

We will briefly switch to using the spjanfeb2001 data file. You can also graph multiple
y-variables against a single x-variable with a line graph. This works with twoway line, as
illustrated here, as well as with twoway connected and twoway tsline.
Uses spjanfeb2001.dta & scheme vg_teal
[Graph: legend: High price, Low price, Closing price; x-axis: Trading day number]

twoway line high low close tradeday, sort clwidth(thick thick .)

Here, we use the clwidth() option to change the width of the lines, making the lines for the
high and low prices thick and leaving the line for the closing price at the default width.
Uses spjanfeb2001.dta & scheme vg_teal
[Graph: legend: High price, Low price, Closing price; x-axis: Trading day number]


twoway line high low close tradeday, sort clstyle(p1 p1 p2)

When we graph multiple y-variables, we can use clstyle() (connect line style) to control
many characteristics of the lines at once. Here, we plot the high and low prices with the
same style, p1, and the closing price with a second style, p2.
Uses spjanfeb2001.dta & scheme vg_teal
[Graph: legend: High price, Low price, Closing price; x-axis: Trading day number]

twoway line high low close tradeday, sort clstyle(p1 p1 p2)
  clwidth(thick thick .)

Here, we combine clstyle() and clwidth() to make the lines for the high and low prices the
same style and make them both thick. The third line is drawn with the p2 style, and the
thickness is left at its default value.
Uses spjanfeb2001.dta & scheme vg_teal
[Graph: legend: High price, Low price, Closing price; x-axis: Trading day number]

twoway (scatter propval100 urban) (lfit propval100 urban)

We return to the allstates data file. We can overlay multiple twoway graphs. Here, we show a
common kind of overlay: a scatterplot overlaid with a linear fit between the two variables.
Note that both the scatter command and the lfit command are surrounded by parentheses.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, Fitted values; x-axis: Percent urban 1990]


twoway (scatter propval100 urban) (lfit propval100 urban)
  (qfit propval100 urban)

We can add a quadratic fit to the previous graph by adding a qfit command, so we can compare
a linear fit and quadratic fit to see if there are nonlinearities in the fit. Note that the
legend does not clearly differentiate between the linear and quadratic fit; we will show you
how to modify the legend to label this more clearly below.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, Fitted values, Fitted values; x-axis: Percent urban 1990]

twoway (scatter propval100 urban, msymbol(Oh))
  (lfit propval100 urban, clpattern(dash))
  (qfit propval100 urban, clwidth(thick))

We add the msymbol(Oh) option to the scatter command, placing it after the comma, as it
normally would be placed, but before the closing parenthesis that indicates the end of the
scatter command. We also add the clpattern(dash) option to the lfit command to make the line
dashed and add the clwidth(thick) option to the qfit command to make the line thick.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, Fitted values, Fitted values; x-axis: Percent urban 1990]

twoway (scatter propval100 urban) (lfit propval100 urban)
  (qfit propval100 urban), legend(label(2 Linear Fit) label(3 Quad Fit))

While each graph subcommand can have its own options, some options can apply to the entire
graph. As illustrated here, we add a legend to the graph to clarify the difference in the fit
values, and this option appears following a comma after the closing parenthesis following
the qfit command. The legend() option appears at the end of the command since it applies to
the entire graph.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, Linear Fit, Quad Fit; x-axis: Percent urban 1990]
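
An alternative way to relabel the legend keys is the order() suboption of legend(), which selects the keys to show and can attach text to each one. This is a sketch rather than the book's approach: it assumes allstates.dta is in the working directory, and the label "Observed" is invented for illustration.

```stata
* Assumption: the book's allstates.dta is in the working directory
use allstates, clear

* order() lists the keys by plot number; text after a number relabels it
twoway (scatter propval100 urban) (lfit propval100 urban)   ///
       (qfit propval100 urban),                             ///
       legend(order(1 "Observed" 2 "Linear Fit" 3 "Quad Fit"))
```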


twoway (scatter propval100 urban) (lfit propval100 urban)
  (qfit propval100 urban, legend(label(2 Linear Fit) label(3 Quad Fit)))

We can make the previous graph in a different, but less appropriate, way. The legend()
option is given as an option of the qfit command, not at the very end as in the previous
graph command. But Stata is forgiving of this, and even when such options are inappropriately
given within a particular command, it treats them as though they were given at the end of the
command.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, Linear Fit, Quad Fit; x-axis: Percent urban 1990]

twoway (qfitci propval100 urban) (scatter propval100 urban)

Another common example of overlaying graphs is to overlay a fit line with confidence interval
and a scatterplot.
Uses allstates.dta & scheme vg_teal
[Graph: legend: 95% CI, Fitted values, % homes cost $100K+; x-axis: Percent urban 1990]

twoway (scatter propval100 urban) (qfitci propval100 urban)

However, note the order in which you overlay these two kinds of graphs. In this example, the
qfitci was drawn after the scatter, and as a result, the points are obscured by the
confidence interval.
Uses allstates.dta & scheme vg_teal
[Graph: legend: % homes cost $100K+, 95% CI, Fitted values; x-axis: Percent urban 1990]


twoway (rarea high low date) (spike volmil date)

We now switch to the sp2001ts data file. Here, we overlay the high and low closing prices
with the volume of shares sold. But, since both are placed on the same y-axis, it is
difficult to see the spikes of volmil, volume in millions.
Uses sp2001ts.dta & scheme vg_teal
[Graph: legend: High price/Low price, Volume (millions); x-axis: Date]

twoway (rarea high low date) (spike volmil date, yaxis(2)),
  legend(span)

By placing volmil on the second y-axis using the yaxis(2) option, we can now see the volume,
but it obstructs the stock prices. Note that we added the option legend(span) to allow the
legend to be wider than the plot region of the graph.
Uses sp2001ts.dta & scheme vg_teal
[Graph: left y-axis: High price/Low price; right y-axis: Volume (millions); x-axis: Date]

twoway (rarea high low date) (spike volmil date, yaxis(2)),
  legend(span) yscale(range(500 1400) axis(1)) yscale(range(0 5) axis(2))

We use the yscale() option to modify the range for the first y-axis to lift its range into
the top third of the graph, and another yscale() option to modify the range for the second
y-axis, pushing the stock market volume down to the bottom third.
Uses sp2001ts.dta & scheme vg_teal
[Graph: left y-axis: High price/Low price; right y-axis: Volume (millions); x-axis: Date]


While the previous examples (and other examples in this book) have used the parenthetical
notation for overlaid graphs, Stata also permits double vertical bars (||) for separating
graphs. To illustrate this, some of the graphs from above will be repeated using this
notation. These examples will be shown using the vg_s2m scheme.

twoway scatter propval100 urban || lfit propval100 urban

We switch back to the allstates data file. Here, the || notation is used to separate the
scatter command from the lfit command.
Uses allstates.dta & scheme vg_s2m
[Graph: legend: % homes cost $100K+, Fitted values; x-axis: Percent urban 1990]

twoway scatter propval100 urban || lfit propval100 urban ||
  qfit propval100 urban

Here, we create three overlaid graphs using the || notation.
Uses allstates.dta & scheme vg_s2m
[Graph: legend: % homes cost $100K+, Fitted values, Fitted values; x-axis: Percent urban 1990]


twoway scatter propval100 urban, msymbol(Oh) ||
lfit propval100 urban, clwidth(thick) ||
qfit propval100 urban, clwidth(medium)

This example shows how to use the ||
notation with options for each of the
commands.
Uses allstates.dta & scheme vg_s2m
[Figure: scatterplot of % homes cost $100K+ against Percent urban 1990 with hollow-circle markers and two fitted lines, both labeled Fitted values]

twoway scatter propval100 urban, msymbol(Oh) ||
lfit propval100 urban, clwidth(thick) ||
qfit propval100 urban, clwidth(medium) ||,
legend(label(2 Linear Fit) label(3 Quad Fit))

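This is another example using the ||
notation, in this case illustrating how to
have options on each of the commands,
along with the legend() as an overall
option.
Uses allstates.dta & scheme vg_s2m
[Figure: scatterplot of % homes cost $100K+ against Percent urban 1990, with fitted lines labeled Linear Fit and Quad Fit in the legend]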
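
For comparison, the same overlay can be written with the parenthetical notation used elsewhere in the book. The following is a minimal do-file sketch, assuming the book's allstates.dta is in the current working directory. The /// line continuations are needed only because the command is split across lines in a do-file.

```stata
* Assumes the book's allstates.dta is in the current working directory
use allstates, clear

* Same three overlaid plots as above, with each plot and its own options
* wrapped in parentheses instead of being separated by ||
twoway (scatter propval100 urban, msymbol(Oh))    ///
       (lfit propval100 urban, clwidth(thick))    ///
       (qfit propval100 urban, clwidth(medium)),  ///
       legend(label(2 Linear Fit) label(3 Quad Fit))
```

In both notations, options after a plot's own comma apply to that plot, and options after the final comma apply to the graph as a whole.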

3 Scatterplot matrix graphs

This chapter will explore the use of the graph matrix command for creating scatterplot
matrices among two or more variables. Many of the options that you can use with graph
twoway scatter apply to these kinds of graphs, as well; see Twoway : Scatter (35) and
Options (235) for related information. This chapter illustrates the use of marker options
and marker labels, as well as options for controlling the display of axes. It also includes
options specific to the graph matrix command, as well as the use of the by() option. For
more details about scatterplot matrices, see [G] graph matrix.

3.1 Marker options

This section looks at controlling and labeling the markers in scatterplot matrices. This
section will show how to change the marker symbol, size, and color (both fill and outline color) and how to label the markers. You can label markers using the graph matrix
command just as you could when using the graph twoway scatter command. See also
Options : Markers (235) and Options : Marker labels (247) for more details. These examples
will use the vg_s1m scheme.

graph matrix propval100 ownhome borninstate, msymbol(Oh)
You can control the marker symbol
with the msymbol() (marker symbol)
option. Here, we make the symbols
hollow circles. Other values that we
could specify include D (diamond), T
(triangle), S (square), and X (x). Using
a lowercase letter (d instead of D)
makes the symbol smaller. For circles,
diamonds, triangles, and squares, you
can append an h (e.g., Oh) to indicate
that the symbol should be hollow; see
Styles : Symbols (342) for more
examples.
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with hollow-circle markers]

graph matrix heatdd cooldd tempjan tempjuly, msymbol(p)
When you have a large number of
observations, the msymbol(p) option
can be very useful since it displays a
very small point for each observation
and can help you to see the overall
relationships among the variables.
Here, we switch to the citytemp data
file to illustrate this.
Uses citytemp.dta & scheme vg_s1m
[Figure: scatterplot matrix of Heating degree days, Cooling degree days, Average January temperature, and Average July temperature, drawn with point markers]

graph matrix propval100 ownhome borninstate, msize(vlarge)
The size of the markers can be changed
using the msize() (marker size) option.
Here, we make the markers very large.
Other values we could have chosen
include vtiny, tiny, vsmall, small,
medsmall, medium, medlarge, large,
vlarge, huge, vhuge, and ehuge; see
Styles : Markersize (340) for more details.
We also could have specified the size as
a multiple of the original size of the
marker; e.g., msize(*2) makes the
marker twice as big.
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with very large markers]
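
The relative-size form mentioned above can be tried directly. This is a short sketch, assuming the book's allstates.dta is in the current working directory; the multiplier 2 is only an illustrative choice.

```stata
* Assumes the book's allstates.dta is in the current working directory
use allstates, clear

* msize(*2) draws each marker at twice the size the scheme would otherwise use
graph matrix propval100 ownhome borninstate, msize(*2)
```

A multiplier keeps the size tied to the scheme's default, so the same command scales sensibly if the scheme is changed later.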

graph matrix propval100 ownhome borninstate, mcolor(gs8)
The mcolor() (marker color) option
can be used to control the color of the
symbols. Among the colors you can
choose are 17 gray-scale colors named
gs0 (black) to gs16 (white). We show a
graph using symbols that are in the
middle of this scale using the
mcolor(gs8) option; see Styles : Colors
(328) for more information about
specifying colors.
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with mid-gray markers]

graph matrix propval100 ownhome borninstate, msize(vlarge)
mfcolor(gs13) mlcolor(gs0)
The mfcolor() (marker fill color) and
mlcolor() (marker line color) options
allow you to control the fill color (inside
color) and outline color (periphery
color) of the markers. Below, we make
the fill color light gray by specifying
mfcolor(gs13) and the line color black
by specifying mlcolor(gs0). We use
the msize() option to make the
markers very large to help see the effect
of these options.
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with very large light-gray markers outlined in black]

graph matrix propval100 ownhome borninstate, mlabel(stateab)
We can label the markers using the
mlabel() (marker label) option. In this
example, we label the markers with the
two-letter postal abbreviation by
supplying the option mlabel(stateab).
Even though many of the labels
overlap, the most interesting
observations are those that stand out
and have readable labels, such as DC
and NV. For additional details, see
Options : Marker labels (247).
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with each marker labeled by state abbreviation]

graph matrix propval100 ownhome borninstate, mlabel(stateab)
mlabsize(large)
You can use the mlabsize() (marker
label size) option to control the size of
the marker label. Here, we indicate
that the marker labels should be large.
You can also specify the size of the
marker label as a multiple of the
original size of the marker label; e.g.,
specifying mlabsize(*1.5) would make
the labels 1.5 times their normal size.
Uses allstates.dta & scheme vg_s1m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and % born in state of residence, with large state-abbreviation labels]

3.2 Controlling axes

This section looks at labeling axes in scatterplot matrices. It shows how to label axes of
scatterplots, control the scale of axes, and insert titles along the diagonal. For more details,
see Options : Axis labels (256), Options : Axis scales (265) and [G] axis_options. This section
uses the vg_s2c scheme.

graph matrix urban propval100 borninstate
Let’s look at a scatterplot matrix of
three variables: urban, propval100,
and borninstate.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence]

graph matrix urban propval100 borninstate,
xlabel(30(10)100, axis(1)) ylabel(30(10)100, axis(1))
The way you control the axis labels
with a scatterplot matrix is somewhat
different from other kinds of
graphs. Here, we use the xlabel() and
ylabel() options to label the x- and
y-axes for the first variable, urban,
from 30 to 100 in increments of 10.
This applies to the first variable
because we specified the axis(1)
option.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with Percent urban 1990 labeled 30 to 100 by 10]

graph matrix urban propval100 borninstate,
xlabel(0(20)100, axis(2)) ylabel(0(20)100, axis(2))
We can change the label for the second
variable, propval100, in a similar
manner, but we need to specify
axis(2). In this example, we label the
second variable ranging from 0 to 100
in increments of 20.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with % homes cost $100K+ labeled 0 to 100 by 20]

graph matrix urban propval100 borninstate,
xlabel(0(20)100, axis(1)) ylabel(0(20)100, axis(1))
xlabel(0(20)100, axis(2)) ylabel(0(20)100, axis(2))
xlabel(0(20)100, axis(3)) ylabel(0(20)100, axis(3))
Let’s label all these variables using the
same scale, from 0 to 100 in increments
of 20. As you can see, this involves
quite a bit of typing, applying the
xlabel() and ylabel() for axis(1),
axis(2), and axis(3), which applies
this to the first, second, and third
variables. However, the next example
shows a more efficient way to do this.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with all axes labeled 0 to 100 by 20]

graph matrix urban propval100 borninstate,
maxes(xlabel(0(20)100) ylabel(0(20)100))
Stata has a simpler way of applying the
same labels to all the variables in the
scatterplot matrix by using the maxes()
(multiple axes) option. This example
labels the x- and y-axes from 0 to 100
with increments of 20 for all variables.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with all axes labeled 0 to 100 by 20]
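
To see what maxes() saves, the longer form with a separate axis() option for each variable can be generated in a loop. This is a do-file sketch, assuming the book's allstates.dta is in the current working directory; the local macro name axopts is a hypothetical name chosen for illustration.

```stata
* Assumes the book's allstates.dta is in the current working directory
use allstates, clear

* Build the per-variable label options one axis at a time
local axopts
forvalues i = 1/3 {
    local axopts `axopts' xlabel(0(20)100, axis(`i')) ylabel(0(20)100, axis(`i'))
}

* Should label the axes the same way as maxes(xlabel(0(20)100) ylabel(0(20)100))
graph matrix urban propval100 borninstate, `axopts'
```

The loop is mainly useful when the per-axis settings differ by variable; when every axis gets the same labels, maxes() is the shorter choice.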


graph matrix urban propval100 borninstate,
maxes(xlabel(0(20)100) ylabel(0(20)100))
xlabel(20(20)100, axis(1)) ylabel(20(20)100, axis(1))
You might want to label most of the
variables in the scatterplot matrix the
same way but label one or more
exceptions differently. In this
example, we label all the variables from
0 to 100, incrementing by 20, but then
override the labeling for urban to make
it 20 to 100, incrementing by 20. We do
this by adding additional xlabel() and
ylabel() options that apply just for
axis(1).
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with Percent urban 1990 labeled 20 to 100 and the others 0 to 100]

graph matrix urban propval100 borninstate,
maxes(xlabel(0(20)100) ylabel(0(20)100) xtick(0(10)100) ytick(0(10)100))
Here, we label all of the variables from
0 to 100, in increments of 20, and also
add ticks from 0 to 100, in increments
of 10. Note that the xtick() and
ytick() options work the same way as
the xlabel() and ylabel() options.
We place these options within the
maxes() option, and they apply to all
of the axes. See Options : Axis labels
(256) and Options : Axis scales (265) for
more details.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix of Percent urban 1990, % homes cost $100K+, and % born in state of residence, with labels every 20 and ticks every 10 on all axes]

graph matrix urban propval100 borninstate,
diagonal("% Urban" "% Homes Over $100K" "% Born in State")
When you use twoway scatter, you
can use xtitle() and ytitle() to
control the titles for the axes. By
contrast, when using graph matrix,
you can control the titles that are
displayed along the diagonal with the
diagonal() option. We use the
diagonal() option to change the titles
for all variables.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix with diagonal titles % Urban, % Homes Over $100K, and % Born in State]

graph matrix urban propval100 borninstate,
diagonal("% Urban" . "% Born in State")
We do not have to change all the titles.
If we want to change just some of the
titles, we can place a period (.) for the
labels where we want the label to stay
the same. In this example, we change
the titles for the first and third
variables but leave the second as is.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix with diagonal titles % Urban, % homes cost $100K+, and % Born in State]

graph matrix urban propval100 borninstate,
diagonal("% Urban" . "% Born in State", bfcolor(eggshell))
We can control the display of the text
on the diagonal using textbox options.
For example, we make the background
color of the text area eggshell using the
bfcolor(eggshell) option. See
Options : Textboxes (303) for more
examples of textbox options.
Uses allstates.dta & scheme vg_s2c
[Figure: scatterplot matrix with diagonal titles % Urban, % homes cost $100K+, and % Born in State on an eggshell background]

3.3 Matrix options

This section shows options that you can use to control the look of the scatterplot matrix,
including showing just the lower half of the matrix, jittering markers, and scaling the size
of marker text. For more details, see [G] graph matrix. These graphs use the vg_s2m
scheme.

graph matrix propval100 ownhome region, half
You can use the half option to display
just the lower triangle of the
scatterplot matrix.
Uses allstates.dta & scheme vg_s2m
[Figure: lower half of the scatterplot matrix of % homes cost $100K+, % who own home, and Census region]

graph matrix propval100 ownhome region, jitter(3)
You can use the jitter() option to
add random noise to the points; the
higher the value given, the more
random noise is added. This is
especially useful when numerous
observations have the same (x,y)
values, so that several observations
would otherwise appear as a single point.
Uses allstates.dta & scheme vg_s2m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and Census region, with jittered markers]
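
Because jitter() adds random noise, the marker positions can differ from one run to the next. The sketch below assumes a Stata release in which graph matrix accepts the jitterseed() option (it is listed in [G] graph matrix for current releases and may not exist in the release this book was written for) and that the book's allstates.dta is in the current working directory. The seed value is arbitrary.

```stata
* Assumes the book's allstates.dta is in the current working directory
use allstates, clear

* jitterseed() fixes the random-number seed so the jittered positions repeat
graph matrix propval100 ownhome region, jitter(3) jitterseed(12345)
```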

graph matrix propval100 ownhome region, scale(1.5)
The scale() option can be used to
magnify the contents of the graph,
including the markers, labels, and lines,
but not the overall size of the graph.
Here, we increase the size of these
items, making them 1.5 times their
normal size. Note that, unlike other
similar options, this option does not
take an asterisk preceding the
multiplier; i.e., we specify 1.5 but not
*1.5.
Uses allstates.dta & scheme vg_s2m
[Figure: scatterplot matrix of % homes cost $100K+, % who own home, and Census region, with markers and text enlarged]

3.4 Graphing by groups

This section looks at the use of the by() option for showing separate graphs based on the
levels of a by() variable. For more information, see Options : By (272) and [G] by_option.
This section uses the vg_brite scheme.

graph matrix propval100 ownhome borninstate, by(north)
The by() option can be used with
graph matrix to show separate
scatterplot matrices by a particular
variable. Here, we show separate
scatterplot matrices for households in
northern states and non-northern
states.
Uses allstates.dta & scheme vg_brite
[Figure: separate scatterplot matrices of % homes cost $100K+, % who own home, and % born in state of residence in panels S&W and North; note: Graphs by Region North or Not]

graph matrix propval100 ownhome borninstate, by(north, compact)
To display the graphs closer together,
you can use the compact option.
Uses allstates.dta & scheme vg_brite
[Figure: scatterplot matrices of % homes cost $100K+, % who own home, and % born in state of residence in panels S&W and North, drawn closer together; note: Graphs by Region North or Not]

twoway scatter propval100 ownhome, by(north, compact)
If we compare the previous scatterplot
matrix to this twoway scatterplot, we
see that the compact option does not
make the scatterplot matrix as compact
as it does with a regular twoway
scatter command, which joins the two
graphs on their edges by omitting the
y-labels between the two graphs.
Uses allstates.dta & scheme vg_brite
[Figure: scatterplots of % homes cost $100K+ against % who own home in adjoining panels S&W and North; note: Graphs by Region North or Not]

graph matrix propval100 ownhome borninstate, by(north, compact)
maxes(ylabel(, nolabels))
We can make the graph matrix display
more compactly with the by() option
by using the maxes(ylabel(,
nolabels)) option to suppress the
labels on all of the y-axes. Then, when
we use the compact option, the edges of
the plots are pushed closer together.
Uses allstates.dta & scheme vg_brite
[Figure: scatterplot matrices in panels S&W and North without y-axis labels; note: Graphs by Region North or Not]

graph matrix propval100 ownhome borninstate, by(north, compact scale(*1.3))
maxes(ylabel(, nolabels))
We can use the scale() option to
increase the size of the markers, labels,
and text to make them more readable.
This is especially useful when graphs
get small.
Uses allstates.dta & scheme vg_brite
[Figure: scatterplot matrices in panels S&W and North with enlarged markers and text; note: Graphs by Region North or Not]

4 Bar graphs

This chapter will explore how to create bar charts using the graph bar command. It
will show how you can use graph bar to graph one or more continuous y-variables and how
you can break them down by one or more categorical variables. In addition, this chapter
will illustrate how you can control the display of each of the axes, the legend, and the look of
the bars, and how to use the by() option. We will start this chapter by looking at features
related to graphing one or more y-variables. For this entire chapter, we will use the nlsw
data file.

4.1 Y-variables

A bar chart graphs one or more continuous variables broken down by one or more categorical variables. The continuous variables are graphed on the y-axis and are referred to
as y-variables. This section shows you how to specify the y-variables using the graph bar
command, how to include one or more y-variables, and how to obtain different summary
statistics for the y-variables. For more information, see [G] graph bar. This section begins
using the vg_past scheme.

graph bar ttl_exp
This is probably the most basic bar
chart that you can make (and perhaps
the most boring, as well). It shows the
average total work experience for all
observations in the file. It graphs a
single y-variable using the default
summary statistic, the mean.
Uses nlsw.dta & scheme vg_past
[Figure: bar chart with one bar; y-axis titled mean of ttl_exp]

graph bar prev_exp tenure ttl_exp
You can specify multiple y-variables to
be plotted at one time. Here, we graph
the mean of previous, current, and total
work experience in the same plot. The
bars are plotted touching each other,
and a legend indicates which bar
corresponds to which variable.
Uses nlsw.dta & scheme vg_past
[Figure: bar chart with legend entries mean of prev_exp, mean of tenure, and mean of ttl_exp]

graph bar (median) prev_exp tenure ttl_exp
This graph is much like the last one,
but it shows the median of these
y-variables. Note that we only specified
(median) before prev_exp but it
applied to all the y-variables that
follow. You can summarize the
y-variables using any of the summary
statistics permitted by the collapse
command (e.g., mean, sd, sum, median,
and p10); see [R] collapse.
Uses nlsw.dta & scheme vg_past
[Figure: bar chart with legend entries p 50 of prev_exp, p 50 of tenure, and p 50 of ttl_exp]
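
Since graph bar accepts the same statistics as collapse, one way to check the bar heights is to compute the statistics with collapse itself. This is a small sketch, assuming the book's nlsw.dta is in the current working directory. Note that collapse replaces the data in memory, so the data file must be reloaded afterward.

```stata
* Assumes the book's nlsw.dta is in the current working directory
use nlsw, clear

* Reduce the data to a single observation holding the three medians
collapse (median) prev_exp tenure ttl_exp

* The listed values should correspond to the bar heights in the graph above
list
```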


graph bar (median) prev_exp tenure (mean) ttl_exp
In this example, we get the median of
the first two y-variables and then the
mean of the last y-variable. I don’t
know, however, how often you would do
this.
Uses nlsw.dta & scheme vg_past
[Figure: bar chart with legend entries p 50 of prev_exp, p 50 of tenure, and mean of ttl_exp]

graph bar (mean) meanwage=wage (median) medwage=wage
You can plot different summary
statistics for the same y-variable, but
you must specify a target name for the
statistic being created. Here, we create
meanwage for the mean of wage and
medwage for the median of wage. If we
omitted the meanwage= and medwage=
from this command, Stata would return
an error indicating that the name for
the mean of wage conflicts with the
median of wage.
Uses nlsw.dta & scheme vg_past
[Figure: bar chart with legend entries mean of wage and p 50 of wage]

We now consider a handful of options that are useful when you have multiple y-variables.
These options allow you to display the y-variables as though they were categories of the
same variable, to create stacked bar charts, and to display the y-variables as percentages of
the total y-variables. These options are illustrated in the following graphs using the vg_s1m
scheme.

graph bar prev_exp tenure ttl_exp hours
First, consider this bar chart showing
four y-variables. Each y-variable is
shown with a different colored bar and
with a legend indicating which
y-variable corresponds to which bar.
See the next example for another way
to differentiate these four bars.
Uses nlsw.dta & scheme vg_s1m
[Figure: bar chart with legend entries mean of prev_exp, mean of tenure, mean of ttl_exp, and mean of hours]

graph bar prev_exp tenure ttl_exp hours, ascategory
You can use the ascategory option to
indicate that you want Stata to graph
multiple y-variables using the style that
would be used for the levels of an
over() variable. Comparing this graph
with the previous graph, note how the
bars for the different variables are the
same color and labeled on the x-axis
rather than using a legend.
Uses nlsw.dta & scheme vg_s1m
[Figure: bar chart with x-axis labels mean of prev_exp, mean of tenure, mean of ttl_exp, and mean of hours]

graph bar prev_exp tenure, over(occ5)
Consider this graph, where we show
work experience prior to one’s current
job (prev_exp) and work experience at
one’s current job (tenure) broken down
by occ5. The total of previous and
current work experience represents
total work experience, and you might
want to show each bar as a percent of
total work experience. The next
example shows how you can do that.
Uses nlsw.dta & scheme vg_s1m
[Figure: bar chart over Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other, with legend entries mean of prev_exp and mean of tenure]

graph bar prev_exp tenure, over(occ5) percentages
Here, we show the time worked before
one’s current job, prev_exp, and time
at the current job, tenure, in terms of
their percentage of the total (i.e.,
percentage of total work experience).
We can view the bars in this way using
the percentages option.
Uses nlsw.dta & scheme vg_s1m
[Figure: bar chart over Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other; y-axis titled percent; legend entries mean of prev_exp and mean of tenure]
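
The percentages can also be worked out by hand as a check. The sketch below assumes that percentages is based on each group's summary statistics (here, the means), which is our reading of the option rather than something stated in this section, and that the book's nlsw.dta is in the current working directory. The variable names pct_prev and pct_tenure are hypothetical names chosen for illustration.

```stata
* Assumes the book's nlsw.dta is in the current working directory
use nlsw, clear

* Mean of each y-variable within each occupation group
* (a group with missing occ5, if any, would appear here but not in the graph)
collapse (mean) prev_exp tenure, by(occ5)

* Each mean expressed as a percentage of the sum of the two means
generate pct_prev   = 100 * prev_exp / (prev_exp + tenure)
generate pct_tenure = 100 * tenure   / (prev_exp + tenure)

list occ5 pct_prev pct_tenure
```

Within each occupation, the two percentages add up to 100.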
Prof/Mgmt

Sales

Clerical

mean of prev_exp

Labor/Ops

Other

mean of tenure

The electronic form of this book is solely for direct use at UCLA and only by faculty, students, and staff of UCLA.
All rights reserved on the copyright page apply to this document and specifically neither the electronic nor
published form of the book may be distributed or reproduced, either electronically or in printed form.

i
i

i
i

i

i

i

i
4.2

Graphing bars over groups

111

graph bar prev_exp tenure, over(occ5) stack

[Graph: stacked bars of mean of prev_exp and mean of tenure for each level of occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other); y-axis runs from 0 to 15.]

The stack option shows the y-variables as a stacked bar chart. This allows you to see the mean of each y-variable, as well as the mean of the total of the y-variables.
Uses nlsw.dta & scheme vg_s1m

graph bar prev_exp tenure, over(occ5) percentages stack

[Graph: stacked bars of mean of prev_exp and mean of tenure for each level of occ5; y-axis labeled percent and running from 0 to 100.]

We can also combine the stack and percentages options to create a stacked bar chart in terms of percentages.
Uses nlsw.dta & scheme vg_s1m

4.2 Graphing bars over groups

This section focuses on the use of the over() option for showing bar charts by one or more categorical variables. It illustrates the use of the over() option with a single y-variable and with multiple y-variables. We also look at some basic options, including options for displaying the over() variable as though its levels were multiple y-variables, including missing values on the over() variable, and suppressing empty combinations of multiple over() variables. See the group_options and over_subopts tables of [G] graph bar for more details.


graph hbar wage, over(occ5)

[Graph: horizontal bars of mean of wage for Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other; axis runs from 0 to 10.]

Here, we use the over() option to show the average wages broken down by occupation. Note that we are using graph hbar to produce horizontal, rather than vertical, bar charts.
Uses nlsw.dta & scheme vg_brite

graph hbar wage, over(occ5) over(collgrad)

[Graph: horizontal bars of mean of wage for the five levels of occ5, shown separately for not college grad and college grad; axis runs from 0 to 15.]

Here, we use the over() option twice to show the wages broken down by occupation and whether one graduated from college. Note that the appropriate way to produce this graph is to use two over() options, rather than a single over() option with two variables. As we will see later, each over() can have its own options, allowing you to customize the display of each over() variable.
Uses nlsw.dta & scheme vg_brite

graph hbar wage, over(urban2) over(occ5) over(collgrad)

[Graph: horizontal bars of mean of wage for the five levels of occ5 within not college grad and college grad, with separate bars for Rural and Metro identified in a legend; axis runs from 0 to 15.]

We can even add a third over() option, in this case using over(urban2) to compare those living in rural versus urban areas. Note the change in the look of the graph when we add the third over() variable. This is because Stata is now treating the first over() variable as though it were multiple y-variables. Because of this, you can only specify one y-variable when you have three over() options.
Uses nlsw.dta & scheme vg_brite


Now, let’s look at examples of using multiple y-variables with the over() option. We first consider a simple bar graph with multiple y-variables. These examples will use the vg_lgndc scheme, which places the legend to the left of the graph and displays it in a single, stacked column.

graph hbar prev_exp tenure ttl_exp

[Graph: horizontal bars for mean of prev_exp, mean of tenure, and mean of ttl_exp, identified in a legend; axis runs from 0 to 15.]

This graph shows the overall mean of previous, current, and total work experience.
Uses nlsw.dta & scheme vg_lgndc

graph hbar prev_exp tenure ttl_exp, over(occ5)

[Graph: horizontal bars for mean of prev_exp, mean of tenure, and mean of ttl_exp within each level of occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other); axis runs from 0 to 15.]

We can take the graph from above and break the means down by occupation by adding the over(occ5) option.
Uses nlsw.dta & scheme vg_lgndc


graph hbar prev_exp tenure ttl_exp, over(occ5) over(union)

[Graph: horizontal bars for mean of prev_exp, mean of tenure, and mean of ttl_exp within each level of occ5, shown separately for nonunion and union; axis runs from 0 to 15.]

We can take the previous graph and further break the results down by whether one belongs to a union. Note, however, that we cannot add a third over() option when we have multiple y-variables.
Uses nlsw.dta & scheme vg_lgndc

Now let’s consider options that may be used in combination with the over() option to customize the behavior of the graphs. We show how you can treat the levels of the variable in the first over() option as though they were multiple y-variables and can even graph those levels as percentages or stacked bar charts. You can also request that missing values for the levels of the over() variables be displayed, and you can suppress empty levels when multiple over() options are used. These examples are shown below using the vg_rose scheme.

graph bar wage, over(occ5) over(union)

[Graph: vertical bars of mean of wage for the five levels of occ5 within nonunion and union; y-axis runs from 0 to 10. The occ5 labels Sales and Clerical overlap on the x-axis.]

Consider this graph, where we show wages broken down by occupation and whether one belongs to a union. The labels for the levels of occ5 overlap, but this is mended in the next example.
Uses nlsw.dta & scheme vg_rose


graph bar wage, over(occ5) over(union) asyvars

[Graph: mean of wage for nonunion and union, with the levels of occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) as differently colored bars identified in a legend; y-axis runs from 0 to 10.]

If we add the asyvars option, then the first over() variable (occ5) is graphed as if there were five y-variables corresponding to the five levels of occ5. The levels of occ5 are shown as differently colored bars pushed next to each other and labeled using the legend.
Uses nlsw.dta & scheme vg_rose

graph bar wage, over(occ5) over(union) asyvars percentages

[Graph: same layout as the previous graph, with the y-axis labeled percent of mean of wage and running from 0 to 30.]

With the levels of occ5 considered as y-variables, we can use some of the options that apply when we have multiple y-variables. Here, we request that the values be plotted as percentages.
Uses nlsw.dta & scheme vg_rose

graph bar wage, over(occ5) over(union) asyvars percentages stack

[Graph: one stacked bar each for nonunion and union, with the levels of occ5 as segments; y-axis labeled percent of mean of wage and running from 0 to 100.]

Again, because we are treating the levels of occ5 as though they were multiple y-variables, we can add the stack option to view the graph as a stacked bar chart.
Uses nlsw.dta & scheme vg_rose
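The effect of asyvars can be reproduced with data that ships with Stata. This is a minimal sketch under stated assumptions: the built-in nlsw88 dataset stands in for nlsw.dta, and race (three levels) plays the role of occ5.

```stata
* Stand-in data: nlsw88 ships with Stata; race stands in for the book's occ5
sysuse nlsw88, clear

* Without asyvars: the levels of race are labeled on the categorical axis
graph bar wage, over(race) over(union)

* With asyvars: the levels of race become separately colored bars with a legend
graph bar wage, over(race) over(union) asyvars

* Options meant for multiple y-variables now apply to the levels of race
graph bar wage, over(race) over(union) asyvars percentages stack
```

Only the first over() variable is converted, which is why race, not union, moves into the legend.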


graph hbar wage, over(urban3) over(union)

[Graph: horizontal bars of mean of wage for Rural, Suburb, and Urban within nonunion and union; axis runs from 0 to 10.]

Consider this graph, where we use the over(union) option to compare the mean wages of union workers with nonunion workers. One thing this graph hides is that there are a number of missing values on the variable union.
Uses nlsw.dta & scheme vg_rose

graph hbar wage, over(urban3) over(union) missing

[Graph: same layout as the previous graph, with a third set of bars for Rural, Suburb, and Urban labeled with a single dot (.); axis runs from 0 to 10.]

By adding the missing option, we then see a category for those who are missing on the union variable, shown as the third set of bars. The label for this set of bars is a single dot, which is the Stata indicator of missing values. The section Bar : Cat axis (123) shows how you can give it a more meaningful label.
Uses nlsw.dta & scheme vg_rose

graph bar wage, over(grade) over(collgrad)

[Graph: vertical bars of mean of wage for grades 4 through 18, repeated for not college grad and college grad; y-axis runs from 0 to 20.]

Consider this bar chart, which breaks wages down by two variables: the last grade that one completed and whether one is a college graduate. By default, Stata shows all possible combinations for these two variables. In most cases, all combinations are possible, but not in this case.
Uses nlsw.dta & scheme vg_rose
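One way to see which combinations never occur is to cross-tabulate the two over() variables before graphing. The sketch below assumes the built-in nlsw88 dataset, which also contains grade and collgrad, as a stand-in for nlsw.dta; it then uses the nofill option of graph bar, which omits combinations that have no observations.

```stata
* Stand-in data: nlsw88 ships with Stata and also contains grade and collgrad
sysuse nlsw88, clear

* Cells with a frequency of zero are combinations that do not occur in the data
tabulate grade collgrad

* Default: a slot is reserved for every combination, including the empty ones
graph bar wage, over(grade) over(collgrad)

* nofill: only combinations that occur in the data are drawn
graph bar wage, over(grade) over(collgrad) nofill
```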


graph bar wage, over(grade) over(collgrad) nofill

[Graph: vertical bars of mean of wage by grade within not college grad and college grad, showing only the grades that occur in each group; y-axis runs from 0 to 20.]

If you only want to display the combinations of the over() variables that exist in the data, use the nofill option.
Uses nlsw.dta & scheme vg_rose

4.3 Options for groups, over options

This section considers some of the options that can be used with the over() and yvaroptions() options for customizing the display of the bars. We will focus on controlling the spacing between the bars and the order in which the bars are displayed. Other options that control the display of the x-axis (such as the labels) are covered in Bar : Cat axis (123). For more information on the over() options covered in this section, see the over_subopts table in [G] graph bar.

We first consider options that control the spacing among the bars and switch to the vg_s2m scheme.


graph hbar wage, over(grade4) over(union)

[Graph: horizontal bars of mean of wage for Not HS, HS Grad, Some Coll, and Coll Grad within nonunion and union; axis runs from 0 to 10.]

Consider this graph, where we show the mean wages broken down by grade4 and union. Using graph hbar displays the chart as a horizontal bar chart, which can be useful when you have many categories to compare.
Uses nlsw.dta & scheme vg_s2m

graph hbar wage, over(grade4, gap(*3)) over(union)

[Graph: same bars as the previous graph, drawn thinner and farther apart.]

We can change the gap between the levels of grade4. Here, we make that gap three times as large as it normally would have been. This leads to thinner bars with a greater gap between them.
Uses nlsw.dta & scheme vg_s2m

graph hbar wage, over(grade4, gap(*.3)) over(union)

[Graph: same bars as the first graph, drawn wider and closer together.]

Here, we shrink the gap between the levels of grade4, making the gaps 30% of the size they normally would have been. This leads to wider bars with a smaller gap between them.
Uses nlsw.dta & scheme vg_s2m
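The gap() suboption belongs to a particular over(), so each grouping variable can be spaced independently. The sketch below is an illustration only; it assumes the built-in nlsw88 dataset as a stand-in for nlsw.dta and uses race in place of grade4.

```stata
* Stand-in data: nlsw88 ships with Stata; race stands in for the book's grade4
sysuse nlsw88, clear

* gap(*3): three times the default gap between levels of race (thinner bars)
graph hbar wage, over(race, gap(*3)) over(union)

* gap(*.3): 30% of the default gap (wider bars)
graph hbar wage, over(race, gap(*.3)) over(union)

* Each over() takes its own gap(): tighten race, spread out union
graph hbar wage, over(race, gap(*.2)) over(union, gap(*3))
```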


graph hbar wage, over(grade4, gap(*.2)) over(union, gap(*3))

[Graph: horizontal bars of mean of wage for Not HS, HS Grad, Some Coll, and Coll Grad within nonunion and union; axis runs from 0 to 10.]

We can control the gap with respect to each of the over() variables at the same time. In this example, we make the gap among the grade4 categories smaller (20% of their original size) and the gap between the levels of union larger (three times the normal size).
Uses nlsw.dta & scheme vg_s2m

So far, we have let Stata control the order in which the bars are displayed. By default, the bars formed by over() variables are ordered in ascending sequence according to the values of the over() variable. However, Stata gives you considerable flexibility in the ordering of the bars, as illustrated in the following examples using the vg_s2c scheme.

graph hbar wage, over(occ7, descending)

[Graph: horizontal bars of mean of wage for the seven levels of occ7, running from Other through Prof; axis runs from 0 to 10.]

Consider this graph showing average wages broken down by the seven levels of occupation. The bars are normally ordered by the levels of occ7, going from 1 to 7, where 1 is Prof and 7 is Other. Using the descending option switches the order of the bars. They still are ordered according to the seven levels of occupation, but the bars are ordered going from 7 to 1.
Uses nlsw.dta & scheme vg_s2c


graph hbar wage, over(occ7, sort(occ7alpha))

[Graph: horizontal bars of mean of wage in the order Cler., Labor, Mgmt, Operat., Prof, Sales, Other; axis runs from 0 to 10.]

We might want to put these bars in alphabetical order (but with Other still appearing last). We can do this by recoding occ7 into a new variable (say occ7alpha) such that as occ7alpha goes from 1 to 7, the occupations are alphabetical. We recoded occ7 with these assignments: 4 = 1, 6 = 2, 2 = 3, 5 = 4, 1 = 5, 3 = 6, and 7 = 7; see [R] recode. Then, the sort(occ7alpha) option has the effect of alphabetizing the bars.
Uses nlsw.dta & scheme vg_s2c
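The same recode-then-sort pattern can be sketched on data that ships with Stata. This example assumes the built-in nlsw88 dataset, where race is coded 1 = white, 2 = black, 3 = other; racealpha is a hypothetical helper variable introduced only for illustration. For the book's occ7, the assignments quoted above would correspond to recode occ7 (4=1) (6=2) (2=3) (5=4) (1=5) (3=6) (7=7), generate(occ7alpha).

```stata
* Stand-in data: nlsw88 ships with Stata; race is coded 1=white, 2=black, 3=other
sysuse nlsw88, clear

* Build a sort key: alphabetical order, with "other" kept last
recode race (2=1) (1=2) (3=3), generate(racealpha)

* sort(varname) orders the bars by the values of the sort key
graph hbar wage, over(race, sort(racealpha))
```

The bars keep the value labels of race; racealpha only determines their order.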

graph hbar wage, over(occ7, sort(1))

[Graph: horizontal bars of mean of wage in the order Labor, Operat., Sales, Other, Cler., Prof, Mgmt; axis runs from 0 to 10.]

Here, we sort the bars by their height (in ascending order). The sort(1) option means to sort the bars according to the height of the first y-variable, in this case, the mean of wage.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage, over(occ7, sort(1) descending)

[Graph: horizontal bars of mean of wage in the order Mgmt, Prof, Cler., Other, Sales, Operat., Labor; axis runs from 0 to 10.]

Adding the descending option yields bars in descending order.
Uses nlsw.dta & scheme vg_s2c


graph hbar wage hours, over(occ7, sort(1))

[Graph: horizontal bars of mean of wage and mean of hours for the seven levels of occ7, ordered by mean of wage.]

Here, we plot two y-variables. In addition to wages, we also show the average hours worked per week. Including the sort(1) option sorts the bars according to the mean of wage since that is the first y-variable.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage hours, over(occ7, sort(2))

[Graph: horizontal bars of mean of wage and mean of hours for the seven levels of occ7, ordered by mean of hours.]

Changing sort(1) to sort(2) sorts the bars according to the second y-variable, the mean of hours.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage hours, over(occ7, sort(2)) over(married)

[Graph: horizontal bars of mean of wage and mean of hours. Within single, the order is Labor, Cler., Sales, Operat., Other, Prof, Mgmt; within married, the order is Labor, Cler., Sales, Prof, Other, Operat., Mgmt.]

We can use the sort() option when there are additional over() variables. Here, the sort(2) option orders the bars according to the mean number of hours worked within each level of married.
Uses nlsw.dta & scheme vg_s2c


graph hbar wage hours, over(occ7, sort(2)) over(married, descending)

[Graph: horizontal bars of mean of wage and mean of hours for the seven levels of occ7, with the married group shown first and the single group second; axis runs from 0 to 50.]

Each over() option can have its own separate sorting options. In this example, we add the descending option to the second over() option, and the levels of married are now shown with those who are married appearing first.
Uses nlsw.dta & scheme vg_s2c

graph hbar (sum) wage, over(collgrad) over(occ7) asyvars stack

[Graph: stacked horizontal bars of sum of wage in the order Prof, Mgmt, Sales, Cler., Operat., Labor, Other, with segments for not college grad and college grad; axis runs from 0 to 5,000.]

Say that we were to graph the sum of wage broken down by collgrad and occ7. We further treat the levels of collgrad as y-variables and form a stacked bar chart. We might want to sort these bars based on the sum of wages for each occupation. See the next example for how we can do that.
Uses nlsw.dta & scheme vg_s2c

graph hbar (sum) wage, over(collgrad) over(occ7, sort((sum) wage)) asyvars stack

[Graph: stacked horizontal bars of sum of wage in the order Cler., Operat., Labor, Other, Mgmt, Prof, Sales, with segments for not college grad and college grad; axis runs from 0 to 5,000.]

Here, we add sort((sum) wage) to the over() option for occ7, and then the bars are sorted on the sum of wages at each level of occ7, sorting the bars on their total height.
Uses nlsw.dta & scheme vg_s2c
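Sorting a stacked chart on total bar height can be sketched with data that ships with Stata. The example assumes the built-in nlsw88 dataset as a stand-in for nlsw.dta and uses its occupation variable in place of occ7. As with sort(1) descending earlier, descending is written next to sort() inside over(), not inside sort() itself.

```stata
* Stand-in data: nlsw88 ships with Stata; occupation stands in for the book's occ7
sysuse nlsw88, clear

* Stacked bars of total wages, ordered by each occupation's total
graph hbar (sum) wage, over(collgrad) over(occupation, sort((sum) wage)) asyvars stack

* Same ordering, reversed: descending is a separate suboption of over()
graph hbar (sum) wage, over(collgrad) ///
    over(occupation, sort((sum) wage) descending) asyvars stack
```

The /// at the end of a line continues the command on the next line when the commands are run from a do-file.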


graph hbar (sum) wage, over(collgrad) over(occ7, sort((sum) wage) descending) asyvars stack

[Graph: stacked horizontal bars of sum of wage for the seven levels of occ7, ordered from the largest total to the smallest, with segments for not college grad and college grad; axis runs from 0 to 5,000.]

Here, we add the descending option to change the sort order from highest to lowest. Note the placement of the descending option outside of the sort() option.
Uses nlsw.dta & scheme vg_s2c

4.4 Controlling the categorical axis

This section describes ways that you can label categorical axes. Bar charts are special since their x-axis is formed by categorical variables. This section describes options you can use to customize these categorical axes. For more details, see [G] cat_axis_label_options and [G] cat_axis_line_options.

We will start by exploring how you can change the labels for the bars on the x-axis.

graph bar wage, over(grade6) over(south) asyvars

[Graph: vertical bars of mean of wage for No HS, Some HS, HS Grad, Some Coll, Coll Grad, and Post Grad, identified in a legend, in two groups labeled 0 and 1 on the x-axis; y-axis runs from 0 to 15.]

This bar chart breaks wages down by education level and whether one lives in the South. Adding the asyvars option graphs the levels of education as differently colored bars, as though they were different y-variables. More importantly, note that the variable south is coded 0/1 and has no labels, leaving the x-axis poorly labeled.
Uses nlsw.dta & scheme vg_s2c


graph bar wage, over(grade6) over(south, relabel(1 "N & W" 2 "South")) asyvars

[Graph: vertical bars of mean of wage for the six levels of grade6, identified in a legend, in two groups labeled N & W and South; y-axis runs from 0 to 15.]

The relabel() option is used to change the labels displayed for the levels of south, giving the x-axis more meaningful labels. Note that we wrote relabel(1 "N & W") and not relabel(0 "N & W") since these numbers do not represent the actual levels of south but the ordinal position of the levels, i.e., first and second.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(grade6) over(union, relabel(3 "missing")) missing asyvars

[Graph: vertical bars of mean of wage for the six levels of grade6, identified in a legend, in three groups labeled nonunion, union, and missing; y-axis runs from 0 to 15.]

Consider this example, where we show wages broken down by education and union membership with the missing option to show a separate category for missing values. Normally, the bar for the missing category would be labeled with a dot, but here we add the relabel() option to label that category with the word “missing”.
Uses nlsw.dta & scheme vg_s2c
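Because relabel() counts categories by position rather than by value, it can help to see both cases side by side. The following is a minimal sketch, assuming the built-in nlsw88 dataset as a stand-in for nlsw.dta; in that dataset, south is coded 0/1 and union has missing values.

```stata
* Stand-in data: nlsw88 ships with Stata (south is 0/1; union has missing values)
sysuse nlsw88, clear

* Positions, not values: south == 0 is category 1 and south == 1 is category 2
graph bar wage, over(south, relabel(1 "N & W" 2 "South"))

* With missing, the missing category comes third, after nonunion and union
graph bar wage, over(union, relabel(3 "missing")) missing
```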

graph hbar wage, over(grade6) over(south, relabel(1 "N & W" 2 "South")) over(smsa, relabel(1 "Non Metro" 2 "Metro"))

[Graph: horizontal bars of mean of wage for the six levels of grade6, identified in a legend, within N & W and South, shown separately for Non Metro and Metro; axis runs from 0 to 15.]

This is an example of a bar chart with three over() variables, two of which we relabel. The relabel() option is used to change the labels for the levels of south and smsa. Note that each over() option can have its own relabel() option.
Uses nlsw.dta & scheme vg_s2c


graph hbar prev_exp tenure ttl_exp, ascategory over(age3)

[Graph: horizontal bars labeled mean of prev_exp, mean of tenure, and mean of ttl_exp within the age3 groups 34/37, 38/41, and 42/46; axis runs from 0 to 15.]

This bar chart shows three y-variables, but we use the ascategory option to plot the different y-variables as categorical variables on the x-axis. The default labels on the x-axis are not bad, but we might want to change them.
Uses nlsw.dta & scheme vg_s2c

graph hbar prev_exp tenure ttl_exp, ascategory over(age3) yvaroptions(relabel(1 "Previous Exp" 2 "Current Exp" 3 "Total Exp"))

[Graph: same layout as the previous graph, with the bars labeled Previous Exp, Current Exp, and Total Exp.]

If the three level-of-experience variables were indicated by an over() option, we would use the over(, relabel()) option to change the labels. Instead, since we have treated the multiple y-variables as categories, we then use yvaroptions(relabel()) to modify the labels on the x-axis.
Uses nlsw.dta & scheme vg_s2c

graph hbar prev_exp tenure ttl_exp, ascategory over(age3, relabel(1 "34-37 yrs" 2 "38-41 yrs" 3 "42-46 yrs")) yvaroptions(relabel(1 "Previous Exp" 2 "Current Exp" 3 "Total Exp"))

[Graph: same layout as the previous graph, with the age3 groups labeled 34-37 yrs, 38-41 yrs, and 42-46 yrs.]

This example is similar to the previous example, but we have added a relabel() option within the over() option as well. As before, we use yvaroptions(relabel()) to modify the labels for the multiple y-variables, and then we also use the relabel() option within the over() option to change the labels for age3.
Uses nlsw.dta & scheme vg_s2c
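The division of labor between over(, relabel()) and yvaroptions(relabel()) can be sketched with data that ships with Stata. This example assumes the built-in nlsw88 dataset as a stand-in for nlsw.dta; since that dataset has no prev_exp or age3, it uses two y-variables and collgrad as the grouping variable, and the replacement labels are illustrative choices.

```stata
* Stand-in data: nlsw88 ships with Stata (it has no prev_exp or age3)
sysuse nlsw88, clear

* ascategory puts the y-variables on the categorical axis, so their labels
* are changed with yvaroptions(relabel()); the grouping variable's labels
* are changed with relabel() inside its own over()
graph hbar tenure ttl_exp, ascategory                      ///
    over(collgrad, relabel(1 "No degree" 2 "Degree"))      ///
    yvaroptions(relabel(1 "Current Exp" 2 "Total Exp"))
```

The /// line continuations are for running the command from a do-file; typed interactively, the command would go on a single line.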


graph hbar prev_exp tenure ttl_exp, ascategory xalternate over(age3, relabel(1 "34-37 yrs" 2 "38-41 yrs" 3 "42-46 yrs")) yvaroptions(relabel(1 "Previous Exp" 2 "Current Exp" 3 "Total Exp"))

[Graph: horizontal bars labeled Previous Exp, Current Exp, and Total Exp within 34-37 yrs, 38-41 yrs, and 42-46 yrs, with the category labels on the right side of the graph; axis runs from 0 to 15.]

If we wish, we can move the x-axis to the opposite side of the graph. Here, we add the xalternate option, which moves the labels for the x-axis to the opposite side, in this case from the left to the right. You can also use the yalternate option to move the y-axis to its opposite side.
Uses nlsw.dta & scheme vg_s2c

In the previous examples, we saw that the relabel() option can be used in the over() option to control the labeling of over() variables and can be used within yvaroptions() to control the labeling of multiple y-variables (provided that the ascategory option is used to convert the multiple y-variables into categories). We will further explore other over() options, which can be used with either over() or yvaroptions().

graph bar wage, over(occ7, label(nolabels))

[Graph: vertical bars of mean of wage with no labels under the bars; y-axis runs from 0 to 10.]

We can use the label(nolabels) option to suppress the display of the labels associated with the levels of occ7. The label(nolabels) option is generally not useful alone but is very useful in combination with other means to label the bars. Consider the next example.
Uses nlsw.dta & scheme vg_s2c


graph bar wage, over(occ7, label(nolabels)) blabel(group)

[Graph: vertical bars of mean of wage, each labeled with its occ7 group name (Prof, Mgmt, Sales, Cler., Operat., Labor, Other); y-axis runs from 0 to 10.]

By adding the blabel(group) (bar label) option, the bars are labeled with the name of the group to which the bar belongs. See Bar : Legend (130) for more about blabel().
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(occ7, label(angle(45))) over(collgrad)

[Graph: vertical bars of mean of wage for the seven levels of occ7 within not college grad and college grad, with the occ7 labels rotated 45 degrees; y-axis runs from 0 to 15.]

This graph shows wages broken down by occupation and by whether one graduated from college. The label(angle(45)) option is added to rotate the labels for occupation by 45 degrees. If this had been omitted, the labels would have overlapped each other.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(occ7, label(alternate)) over(collgrad)

[Graph: same bars as the previous graph, with the occ7 labels staggered on two alternating rows; y-axis runs from 0 to 15.]

Compare this graph with the previous example. This example uses the label(alternate) strategy to avoid overlapping by alternating the labels for occupation.
Uses nlsw.dta & scheme vg_s2c
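The two remedies for overlapping labels can be compared on data that ships with Stata. This is a minimal sketch, assuming the built-in nlsw88 dataset as a stand-in for nlsw.dta; its occupation variable has more categories than occ7, which makes the overlap more pronounced.

```stata
* Stand-in data: nlsw88 ships with Stata; occupation stands in for the book's occ7
sysuse nlsw88, clear

* Default: many category labels compete for space along the axis
graph bar wage, over(occupation) over(collgrad)

* Remedy 1: rotate the category labels by 45 degrees
graph bar wage, over(occupation, label(angle(45))) over(collgrad)

* Remedy 2: stagger the category labels on alternating rows
graph bar wage, over(occupation, label(alternate)) over(collgrad)
```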


graph bar wage hours ttl_exp, ascategory over(collgrad) yvaroptions(label(alternate))

[Graph: vertical bars labeled mean of wage, mean of hours, and mean of ttl_exp on alternating rows within not college grad and college grad; y-axis runs from 0 to 40.]

This is another example of using the label(alternate) option, but in this case, it is used in the context of alternating labels created by multiple y-variables converted to categories using the ascategory option. In such a case, the option is specified as yvaroptions(label(alternate)).
Uses nlsw.dta & scheme vg_s2c

graph bar wage hours ttl_exp, ascategory over(union) nolabel

[Graph: vertical bars labeled wage, hours, and ttl_exp within nonunion and union; y-axis runs from 0 to 40.]

If we add the nolabel option, the names of the variables (wage, hours, and ttl_exp) are shown instead of labels such as mean of wage.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage, over(occ5, label(labcolor(green))) over(collgrad, label(labcolor(maroon) labsize(small)))

[Graph: horizontal bars of mean of wage for the five levels of occ5 within not college grad and college grad; axis runs from 0 to 15.]

We can change the color of the labels using the labcolor() option. Here, we make the label for occ5 green and the label for collgrad maroon. We also use labsize(small) to make the labels for collgrad small. See Styles : Colors (328) and Styles : Textsize (344) for more details about other values you could choose.
Uses nlsw.dta & scheme vg_s2c


graph bar wage,
    over(age3, label(ticks tlwidth(thick) tlength(*2) tposition(crossing)))
    over(collgrad)

[Graph: mean of wage (0 to 10) for age3 groups 34/37, 38/41, and 42/46 within not college grad and college grad]

Stata permits you to add ticks using the ticks option. At the same time, we modify the attributes of the ticks, making the tick line width thick, the tick length twice as long as normal, and the tick position crossing the x-axis. See [G] cat_axis_label_options for more details and other options for controlling ticks.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(age3, label(labgap(*5))) over(collgrad)

[Graph: mean of wage (0 to 10) for age3 groups 34/37, 38/41, and 42/46 within not college grad and college grad]

The labgap(*5) option increases the gap between the label and the axis, making the gap between the labels for the levels of age3 and the axis five times its normal size.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(age3) over(collgrad, label(labgap(*5)))

[Graph: mean of wage (0 to 10) for age3 groups 34/37, 38/41, and 42/46 within not college grad and college grad]

We use the label(labgap(*5)) option to control the gap between the labels for age3 and collgrad, making that gap five times the normal size.
Uses nlsw.dta & scheme vg_s2c
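The label() suboptions shown in this section can be combined within a single over() option. The following is only a sketch: it assumes the nlsw88 dataset that ships with Stata (loaded with sysuse) as a stand-in for the book's nlsw.dta, so the variables union and collgrad replace age3, and the multipliers are arbitrary illustrative values.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Inner over(): small labels with ticks twice the normal length
* Outer over(): maroon labels pushed three times farther from the inner labels
graph bar wage,                                          ///
    over(union, label(labsize(small) ticks tlength(*2))) ///
    over(collgrad, label(labcolor(maroon) labgap(*3)))
```

The /// line continuations work in a do-file; in the Command window, the command would be typed on one line.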

graph bar wage, over(age3) over(collgrad, axis(outergap(*20)))

[Graph: mean of wage (0 to 10) for age3 groups 34/37, 38/41, and 42/46 within not college grad and college grad]

The axis(outergap(*20)) option controls the gap between the labels of the x-axis and the outside of the graph. As you can see, this increases the space between the labels for collgrad and the bottom of the graph.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(union) over(grade4) asyvars
    b1title("Education Level in Four Categories")

[Graph: mean of wage (0 to 10) for nonunion and union within Not HS, HS Grad, Some Coll, and Coll Grad; bottom title "Education Level in Four Categories"; legend: nonunion, union]

The b1title() option adds a title to the bottom of the graph, in effect labeling the x-axis. We can add a second title below that using the b2title() option. If we used graph hbar, we could label the left axis using the l1title() and l2title() options.
Uses nlsw.dta & scheme vg_s2c
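As a sketch of how the b1title(), b2title(), and l1title() options fit together, the commands below label the categorical axis of a vertical and a horizontal bar chart. They assume the nlsw88 dataset shipped with Stata in place of the book's nlsw.dta, so collgrad stands in for grade4, and the title text is purely illustrative.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Vertical bars: b1title() and b2title() place titles below the categorical axis
graph bar wage, over(union) over(collgrad) asyvars ///
    b1title("College graduation status")           ///
    b2title("Bars show mean hourly wage")

* Horizontal bars: the categorical axis is on the left, so l1title() labels it
graph hbar wage, over(union) over(collgrad) asyvars ///
    l1title("College graduation status")
```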

4.5 Controlling legends

This section discusses the use of legends for bar charts, emphasizing the features that are unique to bar charts. The section Options : Legend (287) goes into great detail about legends, as does [G] legend_option. Legends can be used for multiple y-variables or when the first over() variable is treated as a y-variable via the asyvars option. See Bar : Y-variables (107) for more information about the use of multiple y-variables and Bar : Over (111) for more examples of treating the first over() variable as a y-variable. Next, we will consider examples that show the different kinds of labels that you can create using the blabel() option. You can create labels that display the name of the y-variable, the name of the first over() group, the height of the bar, or the overall height of the bar (when used with the stack option). These examples begin using the vg_s1c scheme.

graph bar wage hours tenure ttl_exp age

[Graph: bars from 0 to 40; legend: mean of wage, mean of hours, mean of tenure, mean of ttl_exp, mean of age]

Consider this bar graph of five different y-variables. The bars for the different y-variables are shown with different colors, and a legend is used to identify the y-variables.
Uses nlsw.dta & scheme vg_s1c

graph bar wage, over(occ7) asyvars

[Graph: mean of wage (0 to 10); legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

This is another example of where a legend can arise in a Stata bar graph by specifying the asyvars option, which treats an over() variable as though the levels were different y-variables.
Uses nlsw.dta & scheme vg_s1c

Unless otherwise mentioned, the legend options described below work the same regardless of whether the legend was derived from multiple y-variables or from an over() variable that was combined with the asyvars option. These next examples use the vg_s2m scheme.

graph hbar wage hours tenure ttl_exp age, nolabel

[Graph: horizontal bars from 0 to 40; legend: wage, hours, tenure, ttl_exp, age]

The nolabel option only works when you have multiple y-variables. When it is used, the variable names (not the variable labels) are used in the legend. For example, instead of showing the variable label hourly wage, it shows the variable name wage.
Uses nlsw.dta & scheme vg_s2m

graph hbar wage hours tenure ttl_exp age, showyvars

[Graph: horizontal bars from 0 to 40; each bar is labeled on the axis (mean of wage, mean of hours, mean of tenure, mean of ttl_exp, mean of age), and the legend repeats the same labels]

The showyvars option puts the labels on the axis, beside or “under” the bars.
Uses nlsw.dta & scheme vg_s2m

graph bar wage, over(occ7) asyvars showyvars

[Graph: mean of wage (0 to 10); bars are labeled Prof, Mgmt, Sales, Cler., Operat., Labor, and Other on the axis, and the legend repeats the same labels]

Even though the showyvars option sounds like it would work only with multiple y-variables, it also works when you combine the over() and asyvars options. As you can see, the legend is now redundant and could be suppressed.
Uses nlsw.dta & scheme vg_s2m

graph bar wage, over(occ7) asyvars showyvars legend(off)

[Graph: mean of wage (0 to 10); bars are labeled Prof, Mgmt, Sales, Cler., Operat., Labor, and Other on the axis; no legend]

This example is similar to the previous example, but we use the legend(off) option to suppress the display of the legend.
Uses nlsw.dta & scheme vg_s2m

graph bar wage, over(occ7) asyvars legend(label(1 "Professional")
    label(2 "Management"))

[Graph: mean of wage (0 to 10); legend: Professional, Management, Sales, Cler., Operat., Labor, Other]

We can use legend(label()) to change the labels for one or more of the bars in the graph. Here, we change the labels for the first and second bars in the legend. Note that you use a separate label() option for each bar. This is in contrast to the relabel() option, where all of the label assignments were placed in one relabel() option; see Bar : Cat axis (123).
Uses nlsw.dta & scheme vg_s2m

graph bar wage, over(occ7) asyvars legend(rows(2) colfirst)

[Graph: mean of wage (0 to 10); legend in two rows, ordered by column: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

In this example, we use the rows(2) option combined with colfirst to display the legend in two rows and to order the keys by column (instead of the default, which is by row). This yields keys that are more adjacent to the bars that they label.
Uses nlsw.dta & scheme vg_s2m

As you can see, the default placement for the legend is below the x-axis. However, Stata gives you tremendous flexibility in the placement of the legend. We now consider options that control the placement of the legend, along with options useful for controlling the placement of the items within the legend. The following examples use the vg_blue scheme.

graph bar wage, over(occ7) asyvars legend(position(1))

[Graph: mean of wage (0 to 10); legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) at the top right, outside the plot region]

We can use the legend(position(1)) option to place the legend in the top right corner of the graph. The values you supply for position() are like the numbers on a clock face, where 12 o’clock is the top, 6 o’clock is the bottom, and 0 represents the center of the clock face. Specifying 1 o’clock places the legend in the top right; see Styles : Clockpos (330) for more details.
Uses nlsw.dta & scheme vg_blue

graph bar wage, over(occ7) asyvars legend(position(1) ring(0))

[Graph: mean of wage (0 to 10); legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) inside the top right corner of the plot region]

Adding the ring(0) option, we can try to tuck the legend inside the top right corner of the plot area. Think of the ring() option as specifying concentric rings around the graph, where 0 is a position inside the plot region, 1 is just outside the plot region, and increasing values are farther and farther from the center of the plot region. Unfortunately, the legend touches one of the bars, but we will fix that in the next example.
Uses nlsw.dta & scheme vg_blue
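To make the clock-face and ring ideas concrete, here is a minimal sketch that draws the same bars with the legend outside and then inside the plot region. It assumes the nlsw88 dataset shipped with Stata in place of the book's nlsw.dta, so race (three levels) stands in for occ7; the particular clock positions are arbitrary choices for illustration.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Legend at 12 o'clock, outside the plot region, with the keys in one row
graph bar wage, over(race) asyvars legend(position(12) rows(1))

* Legend at 11 o'clock, inside the plot region (ring(0)), in one column
graph bar wage, over(race) asyvars legend(position(11) ring(0) cols(1))
```

With ring(0), the legend may overlap the bars, depending on the data and the scheme.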

graph bar wage, over(occ7) asyvars legend(position(1) ring(0)) exclude0

[Graph: mean of wage by occ7, with the y-axis no longer starting at 0; legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) inside the top right corner of the plot region]

Adding exclude0 no longer forces the y-axis to start at 0 and makes room in the top corner of the plot region for the legend. See Bar : Y-axis (143) for more details about the exclude0 option.
Uses nlsw.dta & scheme vg_blue

graph hbar wage, over(occ7) asyvars legend(cols(1) position(9))

[Graph: horizontal bars of mean of wage (0 to 10); single-column legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) at the left of the graph]

We switch to making this a horizontal bar chart and move the legend using the position(9) option to place the legend in the 9 o’clock position. We also use the cols(1) option to display the legend as a single column.
Uses nlsw.dta & scheme vg_blue

graph hbar wage, over(occ7) asyvars legend(cols(1) position(9) textfirst)

[Graph: horizontal bars of mean of wage (0 to 10); single-column legend at the left, with each label placed before its symbol]

Adding the textfirst option places the description of the key before the symbol in the legend.
Uses nlsw.dta & scheme vg_blue

graph hbar wage, over(occ7) asyvars legend(cols(1) position(9) stack)

[Graph: horizontal bars of mean of wage (0 to 10); single-column legend at the left with each key stacked above its label (Prof, Mgmt, Sales, Cler., Operat., Labor, Other)]

With the stack option, the keys and their labels are placed on top of each other to form an even narrower legend, leaving more room to plot the bars. You have considerable control over the elements within the legend using other options, such as rowgap(), keygap(), symxsize(), symysize(), textwidth(), and symplacement(). See Options : Legend (287) and [G] legend_option for more details.
Uses nlsw.dta & scheme vg_blue

graph hbar wage, over(occ7) asyvars

[Graph: horizontal bars of mean of wage (0 to 10); single-column stacked legend at the left (Prof, Mgmt, Sales, Cler., Operat., Labor, Other)]

This example uses the vg_lgndc scheme (set scheme vg_lgndc). Notice how it positions and customizes the legend, as in the previous example. With this scheme, the legend defaults to the 9 o’clock position, in a single column, with the keys and labels stacked.
Uses nlsw.dta & scheme vg_lgndc
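The fine-tuning options for the legend named above, such as rowgap() and symxsize(), can be added alongside stack. The sketch below assumes the nlsw88 dataset shipped with Stata in place of the book's nlsw.dta, with race standing in for occ7, and the multipliers are arbitrary values chosen only to make the effect visible.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Stacked single-column legend at 9 o'clock, with three times the normal
* gap between rows and key symbols half their normal width
graph hbar wage, over(race) asyvars ///
    legend(cols(1) position(9) stack rowgap(*3) symxsize(*0.5))
```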

Let’s now look at how we can use the blabel() (bar label) option to add labels to the bars. These labels can show the name of the first over() group, the name of the y-variable, or the height of the bar. These options are illustrated below along with other related options you might use in conjunction with blabel() for identifying the bars. These examples begin using the vg_past scheme.

graph bar wage hours tenure, over(collgrad)

[Graph: bars from 0 to 40 within not college grad and college grad; legend: mean of wage, mean of hours, mean of tenure]

Consider this graph, where we look at wage, hours, and tenure broken down by the levels of collgrad. The legend identifies the bars for us. In addition to the legend, Stata offers us other ways we can label these bars, as we shall see in the upcoming examples.
Uses nlsw.dta & scheme vg_past

graph bar wage hours tenure, over(collgrad) blabel(name)

[Graph: bars from 0 to 40 within not college grad and college grad; each bar is labeled mean of wage, mean of hours, or mean of tenure; the legend shows the same labels]

We can add the blabel(name) (bar label) option, and it places labels on each of the bars with the name of the y-variable. Here, each of these labels is preceded with “mean of” since each bar represents the mean of a y-variable.
Uses nlsw.dta & scheme vg_past

graph bar wage hours tenure, over(collgrad) blabel(name) nolabel

[Graph: bars from 0 to 40 within not college grad and college grad; each bar is labeled wage, hours, or tenure; legend: wage, hours, tenure]

If we use the nolabel option, just the name of the y-variable is shown. For example, instead of showing the variable label hourly wage, it shows the variable name wage.
Uses nlsw.dta & scheme vg_past

graph bar wage hours tenure, over(collgrad) blabel(name) nolabel
    legend(off)

[Graph: bars from 0 to 40 within not college grad and college grad; each bar is labeled wage, hours, or tenure; no legend]

In this case, the legend is no longer needed, so we can suppress the display of the legend with the legend(off) option. See Options : Legend (287) for more information about legend options.
Uses nlsw.dta & scheme vg_past

graph bar tenure, over(occ7) exclude0 blabel(group)

[Graph: mean of tenure (3 to 7) by occ7; each bar is labeled Prof, Mgmt, Sales, Cler., Operat., Labor, or Other both at its top and below the axis]

Using the blabel(group) option shows the label for the first over() group at the top of each bar. In this case, the label at the bottom of the bar becomes unnecessary.
Uses nlsw.dta & scheme vg_past

graph bar tenure, over(occ7, label(nolabels)) exclude0
    blabel(group) yscale(range(7.2))

[Graph: mean of tenure (3 to 7) by occ7; each bar is labeled Prof, Mgmt, Sales, Cler., Operat., Labor, or Other at its top only]

We can add the label(nolabels) option to suppress the display of the labels below each bar. Note that we have also used the option yscale(range(7.2)) to provide more room within the plot area to label the bar for the Other category.
Uses nlsw.dta & scheme vg_past

graph bar tenure, over(occ5, label(nolabels)) exclude0 blabel(group)
    yscale(range(7.2)) over(union)

[Graph: mean of tenure (4 to 9) for Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other within nonunion and union; the occ5 labels appear at the top of each bar]

Even if we add a second over() option, the levels of the first over() variable are labeled at the top of each bar due to the blabel() option, and the levels of the second over() variable are labeled, as usual, at the bottom of the bars. Note that the blabel() option does not work this way when you have three over() options or multiple y-variables.
Uses nlsw.dta & scheme vg_past

graph hbar prev_exp tenure ttl_exp, over(grade4) blabel(bar)

[Graph: horizontal bars (0 to 15) within Not HS, HS Grad, Some Coll, and Coll Grad, each bar labeled with its value; legend: mean of prev_exp, mean of tenure, mean of ttl_exp]

Consider this graph showing previous, current, and total work experience broken down by education. In this example, the blabel(bar) option is used to display the bar height (in this case, the mean of each y-variable).
Uses nlsw.dta & scheme vg_past

graph bar (sum) prev_exp tenure, stack over(grade4) blabel(bar)

[Graph: stacked bars (0 to 15,000) for Not HS, HS Grad, Some Coll, and Coll Grad, each segment labeled with its sum; legend: sum of prev_exp, sum of tenure]

Using the (sum) function, this graph shows the sum of experience for all individuals in a grade level before their current job (prev_exp) and the sum of experience for all individuals in a grade level in their current job (tenure) and then uses stack to stack these two totals. With the blabel(bar) option, the bar labels are the sums for each y-variable broken down by grade4.
Uses nlsw.dta & scheme vg_past

graph bar (sum) prev_exp tenure, stack over(grade4) blabel(total)

[Graph: stacked bars (0 to 15,000) for Not HS, HS Grad, Some Coll, and Coll Grad, labeled with cumulative totals; legend: sum of prev_exp, sum of tenure]

As compared with the prior example, this example uses the blabel(total) option to display the results as totals. Now, the labels represent the cumulative total height of the bar.
Uses nlsw.dta & scheme vg_past

We have seen a variety of ways that you can use the blabel() option to label the bars. In addition, Stata offers a variety of options you can use to control the display of these labels. Below, we will consider some of these options that allow you to customize the way these labels are displayed. These examples begin using the vg_palec scheme.

graph hbar hours, over(occ7, label(nolabels)) blabel(group)

[Graph: horizontal bars of mean of hours (0 to 40), labeled Prof, Mgmt, Sales, Cler., Operat., Labor, and Other at the end of each bar]

Consider this graph of the average hours worked by occupation. We add labels of the occupation at the top of each bar but suppress the label at the bottom of each bar. The label for the second bar runs off the right of the graph. Fortunately, Stata offers us a number of options to control where these labels are displayed.
Uses nlsw.dta & scheme vg_palec

graph hbar hours, over(occ7, label(nolabels))
    blabel(group, position(inside))

[Graph: horizontal bars of mean of hours (0 to 40), with the labels Prof, Mgmt, Sales, Cler., Operat., Labor, and Other inside the top of each bar]

With the position(inside) option, we can place the group label inside the bar. By default, inside refers to the very “top” of the bar but on the inside of the bar. Note that, because we chose the vg_palec scheme, the bar colors are pale, so the labels within the bars are readable.
Uses nlsw.dta & scheme vg_palec

graph hbar hours, over(occ7, label(nolabels))
    blabel(group, position(base))

[Graph: horizontal bars of mean of hours (0 to 40), with the labels Prof, Mgmt, Sales, Cler., Operat., Labor, and Other inside each bar at its base]

With the position(base) option, we can place the label inside the bar, but at the base of the bar. You can also specify position(center) to place the label in the center of the bar.
Uses nlsw.dta & scheme vg_palec

graph hbar hours, over(occ7, label(nolabels))
    blabel(group, position(base) gap(*10))

[Graph: horizontal bars of mean of hours (0 to 40), with the labels Prof, Mgmt, Sales, Cler., Operat., Labor, and Other inside each bar, offset from its base]

The gap() option can be used to fine-tune the placement of the label. Here, we position the label at the base but increase the gap between the label and the base to be 10 times its normal size. You can also use the gap() option with position(inside) to position the label with respect to the top of the bar.
Uses nlsw.dta & scheme vg_palec

graph bar hours, over(occ7) blabel(bar, position(outside)) exclude0

[Graph: mean of hours (30 to 45) for Prof, Mgmt, Sales, Cler., Operat., Labor, and Other, with the bar height printed just above each bar]

This graph is similar to the previous ones, but the bars are vertical, and we now are labeling the bars with the height of the bar. The label is placed just outside the bar.
Uses nlsw.dta & scheme vg_palec

graph bar hours, over(occ7, axis(outergap(*5))) asyvars
    blabel(bar, position(base) gap(-4))

[Graph: mean of hours (0 to 40); bar labels below the bars, left to right: 37.7603, 42.9886, 35.9284, 34.9804, 39.3862, 31.9754, 38.6349; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

To put the labels just under the bars, we use position(base) to put the labels at the base but also specify gap(-4) to move the labels below the bars. Adding the axis(outergap(*5)) option (see Bar : Cat axis (123)), we make enough room so the labels do not bump into the legend.
Uses nlsw.dta & scheme vg_palec

graph bar hours, over(occ7) asyvars
    blabel(bar, position(base) box bfcolor(white) size(large) format(%5.2f))

[Graph: mean of hours (0 to 40); boxed bar labels at the base of the bars, left to right: 37.76, 42.99, 35.93, 34.98, 39.39, 31.98, 38.63; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

Here, we show more options that you can use to customize the display of the labels. We add a number of options to place a box around the label, make the background fill color white, increase the size of the text to be large, and display the means with a width of 5 and 2 decimal places. See Options : Textboxes (303) for additional examples of how to use textbox options to control the display of text.
Uses nlsw.dta & scheme vg_palec
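As a compact illustration of how position() and format() interact inside blabel(), the sketch below draws the same bars twice. It assumes the nlsw88 dataset shipped with Stata in place of the book's nlsw.dta, with race standing in for occ7; the display format and the white label color are illustrative choices, not settings taken from the book.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Bar heights printed just outside each bar, with one decimal place
graph bar hours, over(race) blabel(bar, position(outside) format(%4.1f))

* Bar heights printed in the center of each bar, in white text
graph bar hours, over(race) ///
    blabel(bar, position(center) format(%4.1f) color(white))
```

White labels are readable only on dark bars, so the second command suits schemes with saturated bar colors.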

4.6 Controlling the y-axis

This section describes options you can use to control the y-axis in bar charts. To be precise, when Stata refers to the y-axis on a bar chart, it refers to the axis with the continuous variable, whether the left axis when using graph bar or the bottom axis when using graph hbar. This section emphasizes the features that are particularly relevant to bar charts. For more details, see Options : Axis titles (254), Options : Axis labels (256), and Options : Axis scales (265). Also see [G] axis_title_options, [G] axis_label_options, and [G] axis_scale_options. This section uses the vg_s2c scheme.

graph bar wage, over(occ5) over(married) asyvar

[Graph: mean of wage (0 to 15) for Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other within single and married; legend: Prof/Mgmt, Sales, Clerical, Labor/Ops, Other]

Consider this graph showing the mean hourly wage broken down by occupation and marital status.
Uses nlsw.dta & scheme vg_s2c

graph bar wage, over(occ5) over(married) asyvar
    ytitle("Years of experience")

[Graph: same bars, with the y-axis (0 to 15) titled "Years of experience"]

We can use the ytitle() option to add a title to the y-axis. See Options : Axis titles (254) and [G] axis_title_options for more details, but please disregard any references to xtitle() since that option is not valid when using graph bar.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage, over(occ5) over(married) asyvar
    ytitle("Years of" "experience")

[Graph: horizontal bars (0 to 15) for Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other within single and married; two-line y-axis title "Years of" / "experience" at the bottom]

Splitting the title into two separate quoted strings displays the title on separate lines. Note that, when using graph hbar, the title of the y-axis now appears at the bottom.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage, over(occ5) over(married) asyvar
    ytitle("Years of" "experience", size(vlarge) box bexpand)

[Graph: same horizontal bars (0 to 15); the two-line y-axis title is very large and boxed, with the box spanning the width of the plot area]

Because this title is considered to be a textbox, you can use a variety of textbox options to control the look of the title. In this example, the title is made very large with a box around it, and the bexpand (box expand) option makes the box expand to fill the width of the plot area. See Options : Textboxes (303) for additional examples of how to use textbox options to control the display of text.
Uses nlsw.dta & scheme vg_s2c

graph hbar wage, over(occ5) over(married) asyvar
    yline(8 10, lwidth(thick) lcolor(red) lpattern(dash))

[Graph: horizontal bars of mean of wage (0 to 15) for Prof/Mgmt, Sales, Clerical, Labor/Ops, and Other within single and married, with reference lines at 8 and 10]

The yline() option is used to place a thick, red, dashed line on the graph where y equals 8 and 10. Note that this option is still called yline() since the y-axis is the axis with the continuous variable.
Uses nlsw.dta & scheme vg_s2c

graph bar hours, over(occ7) asyvar ylabel(30(5)45)

[Graph: mean of hours by occ7, with y-axis labels at 30, 35, 40, and 45 and the axis still starting at 0; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

We can use the ylabel() option to label the y-axis. In this case, we label the y-axis from 30 to 45 by increments of 5. See Options : Axis labels (256) and [G] axis_label_options for more details. Please disregard any references to xlabel() since that option is not valid when using graph bar. Note that the y-axis still begins at 0. See the following example to see how you can control that.
Uses nlsw.dta & scheme vg_s2c

graph bar hours, over(occ7) asyvar ylabel(30(5)45) exclude0

[Graph: mean of hours by occ7, with the y-axis running from 30 to 45; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

By default, bar charts include 0 on the y-axis, unless you specify the exclude0 option, as we do here.
Uses nlsw.dta & scheme vg_s2c

graph bar hours, over(occ7) asyvar ylabel(30(5)45, angle(0)) exclude0

[Graph: mean of hours by occ7, with horizontal y-axis labels at 30, 35, 40, and 45; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

We can add the angle() option to modify the angle of the y-label, making the labels for the y-axis horizontal (zero degrees).
Uses nlsw.dta & scheme vg_s2c

graph bar hours, over(occ7) asyvar ylabel(30(5)45, nogrid) exclude0

[Graph: mean of hours (30 to 45) by occ7, without grid lines; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

The nogrid option suppresses the display of the grid. Note that this option is placed within the ylabel() option, thus suppressing the grid for the y-axis. (With bar charts, there is never a grid with respect to the x-axis.) If the grid were absent, and we wanted to include it, we could add the grid option. For more details, see Options : Axis labels (256).
Uses nlsw.dta & scheme vg_s2c

graph bar prev_exp tenure, over(occ7) yscale(off)

[Graph: bars for Prof, Mgmt, Sales, Cler., Operat., Labor, and Other with no y-axis displayed; legend: mean of prev_exp, mean of tenure]

If you want to suppress the display of the y-axis entirely, you can use the yscale(off) option. See Options : Axis scales (265) and [G] axis_scale_options for more details. Please disregard any references to xscale() since that option is not valid when using graph bar.
Uses nlsw.dta & scheme vg_s2c

graph bar prev_exp tenure, over(occ7) yalternate

[Graph: bars for Prof, Mgmt, Sales, Cler., Operat., Labor, and Other with the y-axis (0 to 8) on the right side; legend: mean of prev_exp, mean of tenure]

We can use the yalternate option to put the y-axis on the opposite side, in this case on the right side of the graph.
Uses nlsw.dta & scheme vg_s2c
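The y-axis options covered in this section can be combined in a single command. The following sketch assumes the nlsw88 dataset shipped with Stata in place of the book's nlsw.dta, with race standing in for occ7; the label range, title text, and reference-line value are illustrative choices only.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* exclude0 lets the axis start above 0; ylabel() sets horizontal labels
* from 30 to 40 in steps of 2 with no grid; ytitle() gives a two-line
* title; yline() adds a dashed reference line at 36
graph bar hours, over(race)                ///
    exclude0                               ///
    ylabel(30(2)40, angle(0) nogrid)       ///
    ytitle("Mean hours" "worked per week") ///
    yline(36, lpattern(dash))
```

Replacing graph bar with graph hbar keeps every option name unchanged, because the y-axis is always the axis with the continuous variable.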

graph hbar prev_exp tenure, over(occ7) xalternate yreverse

You can reverse the direction of the y-axis with the yreverse option. We combine this with the xalternate option to place the labels for the bars on the alternate (right) side of the graph.
Uses nlsw.dta & scheme vg_s2c

[Figure: horizontal bar chart of mean of prev_exp and mean of tenure by occ7, with the y-axis reversed and the occupation labels on the right.]

4.7 Changing the look of bars, lookofbar_options

This section shows how you can control the look of the bars in your bar charts: the space between the bars, the color of the bars, and the characteristics of the line outlining the bars. For more information, see the lookofbar_options table in [G] graph bar and [G] barlook_options. This section begins using the vg_rose scheme.

graph bar wage hours ttl_exp tenure, over(collgrad)

Consider this bar chart. It shows the mean wages, hours worked per week, total experience, and job tenure broken down by whether one graduated college.
Uses nlsw.dta & scheme vg_rose

[Figure: bar chart of mean of wage, hours, ttl_exp, and tenure for "not college grad" and "college grad"; y-axis 0 to 40.]


graph bar wage hours ttl_exp tenure, over(collgrad) outergap(*15)

We can change the outer gap between the bars and the edge of the plot area with the outergap() option. Here, the gap is fifteen times its normal size. You can also supply values less than 1 to shrink the size of the gap.
Uses nlsw.dta & scheme vg_rose

[Figure: the same bar chart with a much wider gap between the bars and the edges of the plot area.]

graph bar wage hours ttl_exp tenure, over(collgrad) bargap(25)

The bargap() option controls the size of the gap between the bars. The default value is 0, meaning that the bars touch exactly. Here, we make the gap 25% of the width of the bars.
Uses nlsw.dta & scheme vg_rose

[Figure: the same bar chart with a gap between adjacent bars.]

graph bar wage hours ttl_exp tenure, over(collgrad) bargap(-50)

The bargap() option permits negative values to indicate that you want the bars to overlap. Here, we make the bars overlap by 50% of the size of the bars.
Uses nlsw.dta & scheme vg_rose

[Figure: the same bar chart with overlapping bars.]
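To compare a positive and a negative bargap() directly, one option is to store each graph in memory and then place the two side by side. This is only a sketch: it assumes Stata's shipped nlsw88 dataset in place of the book's nlsw.dta, and the graph names gap25 and overlap50 are illustrative.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Bars separated by 25% of a bar width
graph bar wage hours ttl_exp tenure, over(collgrad) bargap(25) ///
    title("bargap(25)") name(gap25, replace)

* Bars overlapping by 50% of a bar width
graph bar wage hours ttl_exp tenure, over(collgrad) bargap(-50) ///
    title("bargap(-50)") name(overlap50, replace)

* Show the two stored graphs next to each other
graph combine gap25 overlap50
```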

graph bar wage hours ttl_exp tenure, over(collgrad) intensity(*.5)

The intensity() option is used to control the intensity of the color within the bars. Here, we request that the color be 50% as intense as it normally would be.
Uses nlsw.dta & scheme vg_rose

[Figure: the same bar chart with paler bar colors.]

graph bar wage hours ttl_exp tenure, over(collgrad) intensity(*1.4)

In this example, we use the intensity() option to make the colors within the bars 1.4 times as intense as they would normally be. Note that Stata also has an option called lintensity() that works the same way but controls the intensity of the line surrounding the bar. (This option is not illustrated.)
Uses nlsw.dta & scheme vg_rose

[Figure: the same bar chart with more intense bar colors.]
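Since lintensity() is not illustrated in the text, here is a minimal sketch of how it might be combined with intensity(). It assumes Stata's shipped nlsw88 dataset in place of the book's nlsw.dta; the multipliers and the graph name lintens are arbitrary choices for illustration.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Paler fill (50% intensity) with a more intense outline (1.4 times)
graph bar wage hours ttl_exp tenure, over(collgrad) ///
    intensity(*.5) lintensity(*1.4) name(lintens, replace)
```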

So far, the options that we have examined determine the look of all of the bars as a group. Using the bar() option, you can control the look of the bars for each y-variable, as illustrated below. These graphs use the vg_s2c scheme.


graph bar wage hours ttl_exp tenure, over(collgrad) bar(1, bcolor(dkgreen))

Here, we use the bar() option to make the color of the first bar dark green. See Styles : Colors (328) for more information about colors you can select.
Uses nlsw.dta & scheme vg_s2c

[Figure: bar chart of mean of wage, hours, ttl_exp, and tenure by collgrad, with the first bar (mean of wage) in dark green.]

graph bar wage hours ttl_exp tenure, over(collgrad) bar(1, bfcolor(ltblue) blcolor(blue) blwidth(vthick))

In this example, we make the fill color of the first bar light blue and the outline very thick and blue. See Styles : Linewidth (337) for more details on controlling the thickness of lines. You could also use the blpattern() option to control the pattern of the line surrounding the bar; see Styles : Linepatterns (336) for more details.
Uses nlsw.dta & scheme vg_s2c

[Figure: the same bar chart with the first bar filled light blue and outlined in thick blue.]
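The bar() option can be repeated, once per y-variable. The sketch below styles the first two bars differently, using the same suboptions as the examples above. It assumes Stata's shipped nlsw88 dataset in place of the book's nlsw.dta; the color choices and the graph name twobars are illustrative only.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* bar(1, ...) styles the bars for wage; bar(2, ...) styles the bars for hours
graph bar wage hours ttl_exp tenure, over(collgrad) ///
    bar(1, bcolor(dkgreen)) ///
    bar(2, bfcolor(ltblue) blcolor(blue)) ///
    name(twobars, replace)
```

Bars without their own bar() option (here, ttl_exp and tenure) keep the look supplied by the scheme.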

graph bar wage hours ttl_exp tenure, over(collgrad)

While you can use the bar() option to control the look of each bar, selecting a different scheme allows you to control the look of all of the bars. For example, this graph is drawn using the vg_palec scheme. See Intro : Schemes (14) for some other schemes you could try and Appendix : Customizing schemes (379) for tips on customizing your own schemes.
Uses nlsw.dta & scheme vg_palec

[Figure: the same bar chart drawn with the vg_palec scheme.]

4.8 Graphing by groups

This section discusses the use of the by() option in combination with graph bar. Normally, you would use the over() option instead of the by() option, but there are cases where the by() option is either necessary or more advantageous. For example, a by() option is useful if you exceed the maximum number of over() options (three if you have a single y-variable or two if you have multiple y-variables). In such cases, the by() option allows you to break your data down by additional categorical variables. Also, by() gives you more flexibility in the placement of the separate panels. For more information about the by() option, see Options : By (272); for more information about the over() option, see Bar : Over (111). These examples are shown using the vg_s1c scheme.

graph bar wage, over(urban2) over(married) over(union)

Consider this bar graph that breaks wages down by three categorical variables. If we wanted to further break this down by another categorical variable, we could not use another over() option since we can have a maximum of three over() options with a single y-variable.
Uses nlsw.dta & scheme vg_s1c

[Figure: bar chart of mean of wage by urban2 (Rural, Metro) within married (single, married) within union (nonunion, union).]

graph bar wage, over(urban2) over(married) over(union) by(collgrad)

If we want to show the previous graph separately by collgrad, we can use the by() option. This gives us two graphs side by side: one for those who are not college graduates and one for college graduates.
Uses nlsw.dta & scheme vg_s1c

[Figure: two panels, "not college grad" and "college grad", each showing mean of wage by urban2, married, and union. Note: Graphs by college graduate.]
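The same pattern (three over() options plus by() for a fourth grouping variable) can be tried without the book's data. The sketch below assumes Stata's shipped nlsw88 dataset, which has married, union, south, and collgrad but not the book's urban2; the graph name byfour is illustrative.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* One y-variable: up to three over() options, so the fourth
* grouping variable goes into by()
graph bar wage, over(married) over(union) over(south) ///
    by(collgrad) name(byfour, replace)
```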


graph bar ttl_exp tenure, over(married) over(urban2)

Consider this bar graph with multiple y-variables broken down by two categorical variables using two over() options. When you have multiple y-variables, you can only have a maximum of two over() options.
Uses nlsw.dta & scheme vg_s1c

[Figure: bar chart of mean of ttl_exp and mean of tenure by married (single, married) within urban2 (Rural, Metro).]

graph bar ttl_exp tenure, over(married) over(urban2) by(union)

If we want to further show the previous graph by another categorical variable, say union, we can use the by() option.
Uses nlsw.dta & scheme vg_s1c

[Figure: two panels, "nonunion" and "union", each showing mean of ttl_exp and mean of tenure by married and urban2. Note: Graphs by union worker.]

graph bar ttl_exp tenure, over(married) over(urban2) by(union, missing)

We can add the missing option to include a panel for the missing values of union.
Uses nlsw.dta & scheme vg_s1c

[Figure: three panels, "nonunion", "union", and "(missing)", each showing mean of ttl_exp and mean of tenure by married and urban2. Note: Graphs by union worker.]

graph bar ttl_exp tenure, over(married) over(urban2) by(union, missing total)

We can add the total option to include a panel for all observations.
Uses nlsw.dta & scheme vg_s1c

[Figure: four panels, "nonunion", "union", "(missing)", and "Total", each showing mean of ttl_exp and mean of tenure by married and urban2. Note: Graphs by union worker.]

graph hbar ttl_exp tenure, over(married) over(urban2) by(union, cols(1))

We remove the total and missing options and flip the graph to make a horizontal bar chart. We then use the cols(1) option to show these graphs in one column. This makes the graph pretty cramped. Let’s explore a number of options we can add to this graph to make it less cramped, adding just a few options at a time.
Uses nlsw.dta & scheme vg_s1c

[Figure: two stacked panels, "nonunion" and "union", of horizontal bars. Note: Graphs by union worker.]

graph hbar ttl_exp tenure, over(married) over(urban2) by(union, cols(1) note(""))

We add the note("") option within the by() option, and that suppresses the note in the left corner, leaving more room for the graph.
Uses nlsw.dta & scheme vg_s1c

[Figure: the same two stacked panels without the "Graphs by union worker" note.]


graph hbar ttl_exp tenure, over(married) over(urban2) by(union, cols(1) note("") legend(position(3)))

We add the legend(position(3)) option to put the legend at the right. Note that this is contained within the by() option because it changes the position of the legend. If we could make the legend narrow (instead of wide), it would work well in this position.
Uses nlsw.dta & scheme vg_s1c

[Figure: two stacked panels, "nonunion" and "union", with the legend (mean of ttl_exp, mean of tenure) at the right.]

graph hbar ttl_exp tenure, over(married) over(urban2) by(union, cols(1) note("") legend(position(3))) legend(cols(1) stack label(1 "Tot Exp") label(2 "Curr Exp"))

We add the legend(cols(1) stack) option to make the legend narrow and the label() option to change the labels in the legend. Note that this legend() option appears outside of the by() option. See Options : By (272) and Options : Legend (287) for more information about the interactions of by() and legend().
Uses nlsw.dta & scheme vg_s1c

[Figure: the same two stacked panels with a narrow one-column legend at the right, labeled "Tot Exp" and "Curr Exp".]

graph hbar ttl_exp tenure, over(married) over(urban2) by(union, cols(1) note("") legend(position(3))) legend(cols(1) stack label(1 "Tot Exp") label(2 "Curr Exp")) subtitle(, position(5) ring(0) nobexpand)

We can add the subtitle() option to position the title for each separate graph in the lower right corner. The position(5) option puts the title in the 5 o’clock position, and the ring(0) option puts the title inside the plot area. The nobexpand (no box expand) option keeps the title from expanding to fill the entire plot area.
Uses nlsw.dta & scheme vg_s1c

[Figure: the same two stacked panels with the panel titles "nonunion" and "union" inside the lower right corner of each plot area.]

graph bar ttl_exp tenure, over(married) over(urban2) by(union collgrad)

You can include multiple variables within the by() option. Here, in addition to breaking these variables down by two over() variables, we break them down by two additional variables using the by(union collgrad) option.
Uses nlsw.dta & scheme vg_s1c

[Figure: four panels, "nonunion, not college grad", "nonunion, college grad", "union, not college grad", and "union, college grad", each showing mean of ttl_exp and mean of tenure by married and urban2. Note: Graphs by union worker and college graduate.]


5 Box plots

A box plot displays box(es) bordered at the 25th and 75th percentiles of the y-variable with a median line at the 50th percentile. Whiskers extend from the box to the upper and lower adjacent values and are capped with an adjacent line. Values exceeding the upper and lower adjacent values are called outside values and are displayed as markers. This chapter starts by showing the use of the over() option to break box plots down by categorical variables and then showing how you can specify multiple y-variables to display plots for multiple variables. Next, we see further options that can be used to customize the display of the over() option, followed by options that control the display of categorical axes. Next, we discuss options for legends, followed by options that control the display of the y-axis. Finally, we cover options that control the look of boxes and the by() option.

5.1 Specifying variables and groups, yvars and over

This section introduces the use of box plots, illustrating the use of the over() option for showing box plots by one or more grouping variables. Next, we give examples showing how you can graph multiple variables at once by specifying additional y-variables, followed by some general options for controlling the display of multiple y-variables and the behavior of over() options. See the group_options table in [G] graph box for more details. This section begins with the vg_s2c scheme.

graph box wage, over(grade4)

This is a box plot of wages broken down by education. The over(grade4) option breaks down wages by education level (in four categories). By default, the separate levels of grade4 are graphed using the same color, and the levels are labeled on the x-axis. The graph shows a large number of outside values that are displayed as markers beyond the whiskers. The following example shows how we can suppress the display of the outside values.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage (0 to 40) for Not HS, HS Grad, Some Coll, and Coll Grad, with many outside values.]
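To make the terms adjacent values and outside values concrete, the quantities can be computed by hand. The sketch below assumes the usual Tukey definition documented in [G] graph box, in which the fences lie 1.5 interquartile ranges beyond the quartiles; that multiplier is not stated in the text above. It also assumes Stata's shipped nlsw88 dataset in place of the book's nlsw.dta, and the scalar names are illustrative.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Quartiles of wage (the edges of the box)
quietly summarize wage, detail
scalar q1 = r(p25)
scalar q3 = r(p75)

* Fences at 1.5 interquartile ranges beyond the box (assumed definition)
scalar lo_fence = scalar(q1) - 1.5*(scalar(q3) - scalar(q1))
scalar hi_fence = scalar(q3) + 1.5*(scalar(q3) - scalar(q1))

* Adjacent values: the most extreme observations still inside the fences
quietly summarize wage if inrange(wage, scalar(lo_fence), scalar(hi_fence))
display "lower adjacent value = " r(min)
display "upper adjacent value = " r(max)

* Outside values: nonmissing observations beyond the fences (drawn as markers)
count if !missing(wage) & !inrange(wage, scalar(lo_fence), scalar(hi_fence))
```

The whiskers end at the two displayed adjacent values, and the final count is the number of markers that nooutsides would suppress for an ungrouped box plot of wage.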


graph box wage, over(grade4) nooutsides

By adding the nooutsides option, we suppress the display of the outside values. Graphs using this option have a note in the bottom left corner indicating that the outside values have been excluded from display in the graph. For most of the graphs in this chapter, there would be a large number of outside values, which would make the graphs very cluttered, so many of the graphs will use the nooutsides option.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage (0 to 20) for Not HS, HS Grad, Some Coll, and Coll Grad. Note: excludes outside values.]

graph box wage, nooutsides over(grade4) over(union)

Here, we add the over(union) option to show wages broken down by education and whether one is a member of a union. Note, however, that the labels for grade4 overlap each other. See the next example for one solution.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage by grade4 within union (nonunion, union); the grade4 labels overlap. Note: excludes outside values.]

graph hbox wage, nooutsides over(grade4) over(union)

Here, we use graph hbox to make a horizontal box plot. Note that this eliminates the overlapping of the labels for grade4. The next example will show another possible solution.
Uses nlsw.dta & scheme vg_s2c

[Figure: horizontal box plots of hourly wage (0 to 20) by grade4 within union. Note: excludes outside values.]

graph box wage, nooutsides over(grade4) over(union) asyvars

Using the asyvars option, the first over() variable, grade4, is treated as though it were multiple y-variables. As a result, the levels of grade4 are shown in multiple colors and labeled via a legend. You can only use asyvars when you have a single y-variable.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage (0 to 20) for nonunion and union, with one box per grade4 level identified in the legend. Note: excludes outside values.]

graph box wage, nooutsides over(grade4) over(union) over(urban2)

In this example, we add a third over() option, in this case comparing people who live in rural and metropolitan areas. Note that the first over() variable, grade4, is now treated as though it were multiple y-variables. Because of this, you can only specify one y-variable when you have three over() options.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage (0 to 25) by union within urban2 (Rural, Metro), with one box per grade4 level identified in the legend. Note: excludes outside values.]

Now, let’s look at examples of using multiple y-variables with the over() option. We first consider a graph with multiple y-variables. These examples use the vg_outc scheme.


graph hbox prev_exp tenure, nooutsides

This graph shows work experience before one’s current job and work experience at one’s current job.
Uses nlsw.dta & scheme vg_outc

[Figure: horizontal box plots (0 to 20) of Prev. work exper. and Curr. work exper. Note: excludes outside values.]

graph hbox prev_exp tenure, nooutsides over(married)

We can further break these variables down by marital status.
Uses nlsw.dta & scheme vg_outc

[Figure: horizontal box plots (0 to 20) of Prev. work exper. and Curr. work exper. for single and married. Note: excludes outside values.]

graph hbox prev_exp tenure, nooutsides over(married) over(union)

We can take the last graph and add another over() option to even further break these variables down by whether one belongs to a union. Note, however, that we cannot add a third over() option when we have multiple y-variables, but we could add the by() option; see Box : By (189).
Uses nlsw.dta & scheme vg_outc

[Figure: horizontal box plots (0 to 25) of Prev. work exper. and Curr. work exper. by married within union. Note: excludes outside values.]

Now, let’s consider options that may be used in combination with the over() option to customize the behavior of the graphs. We show how you can treat the levels of the first over() option as though they were multiple y-variables. You can also request that missing values for the levels of the over() variables be displayed, and you can suppress empty categories when multiple over() options are used. These examples are shown below using the vg_s2m scheme.

graph hbox wage, nooutsides over(grade4) over(union)

Consider this graph where we show wages broken down by education level and whether one belongs to a union.
Uses nlsw.dta & scheme vg_s2m

[Figure: horizontal box plots of hourly wage (0 to 20) by grade4 within union. Note: excludes outside values.]

graph hbox wage, nooutsides over(grade4) over(union) asyvars

If we add the asyvars option, then the first over() variable (grade4) is graphed as if there were four y-variables corresponding to each level of grade4. Each level of grade4 is shown as a differently colored/shaded box and labeled using the legend.
Uses nlsw.dta & scheme vg_s2m

[Figure: horizontal box plots of hourly wage (0 to 20) for nonunion and union, with one box per grade4 level identified in the legend. Note: excludes outside values.]


graph hbox wage, nooutsides over(grade4) over(union) asyvars missing

By adding the missing option to the previous graph, we see a category for those who are missing on the union variable, shown as the third group, which is labeled with a dot to indicate that those values are missing; see Box : Cat axis (168) to see how you could label this differently (e.g., labeling it with the word “Missing”).
Uses nlsw.dta & scheme vg_s2m

[Figure: horizontal box plots of hourly wage (0 to 30) for nonunion, union, and a third group labeled ".", with one box per grade4 level identified in the legend. Note: excludes outside values.]

graph box wage, nooutsides over(grade) over(collgrad)

Consider this box plot that breaks wages down by two variables: the last grade that one completed and whether one is a college graduate. By default, Stata shows all possible combinations for these two variables. In most cases, all combinations are possible, but not in this case.
Uses nlsw.dta & scheme vg_s2m

[Figure: box plots of hourly wage (0 to 30) with grades 4 through 18 repeated under both "not college grad" and "college grad", leaving empty slots. Note: excludes outside values.]
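One quick way to see which combinations of the two over() variables actually occur is a two-way table; cells with no observations correspond to the empty slots in the box plot. This sketch assumes Stata's shipped nlsw88 dataset, which also contains grade and collgrad, in place of the book's nlsw.dta.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Rows are grade levels, columns are collgrad levels;
* cells with no observations are the empty box-plot slots
tabulate grade collgrad
```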

graph box wage, nooutsides over(grade) over(collgrad) nofill

If you only want to display the combinations of the over() variables that exist in the data, then you can use the nofill option.
Uses nlsw.dta & scheme vg_s2m

[Figure: box plots of hourly wage (0 to 30) showing grades 4 through 15 under "not college grad" and grades 13 through 18 under "college grad". Note: excludes outside values.]

5.2 Options for groups, over options

This section considers some of the options that can be used with the over() and yvaroptions() options for customizing the display of the boxes. We will focus on controlling the spacing between the boxes and the order in which the boxes are displayed. Other options that control the display of the x-axis, such as the labels, are covered in Box : Cat axis (168). For more information on the over() options covered in this section, see the over_subopts table in [G] graph box. We begin by considering options that control the spacing among the boxes and use the vg_past scheme.

graph hbox tenure, nooutsides over(occ5) over(collgrad)

Consider this graph that shows box plots of tenure broken down by occ5 and collgrad. We use the nooutsides option to suppress the display of outside values. For the rest of the graphs in this section, there would be a large number of outside values, which would make the graphs very cluttered, so we will include the nooutsides option for each example.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Curr. work exper. (0 to 20) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within collgrad. Note: excludes outside values.]

graph hbox tenure, nooutsides over(occ5, gap(*3)) over(collgrad)

We can change the gap between the levels of occ5. Here, we make that gap three times as large as it normally would be. This leads to narrow boxes with a sizable gap between them.
Uses nlsw.dta & scheme vg_past

[Figure: the same horizontal box plots with narrower boxes and wider gaps between the occ5 levels. Note: excludes outside values.]


graph hbox tenure, nooutsides over(occ5, gap(*.2)) over(collgrad)

Here, we shrink the gap between the levels of occ5, making the gaps 20% of the size they normally would be. This yields boxes that are wider than they normally would be.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Curr. work exper. (0 to 20) by occ5 within collgrad, with wide boxes and small gaps. Note: excludes outside values.]

graph hbox tenure, nooutsides over(occ5, gap(*.4)) over(collgrad, gap(*2))

We can control the gap with respect to each of the over() variables. In this example, we make the gap among the occ5 categories small (40% of their original size) and the gap between the levels of collgrad larger (two times the normal size).
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Curr. work exper. (0 to 20) by occ5 within collgrad, with small gaps within each collgrad group and a larger gap between the two groups. Note: excludes outside values.]

By default, the boxes formed by over() variables are ordered in ascending sequence according to the values of the over() variable. Stata lets us control the order of the boxes: we can put them in descending order, order them according to the values of another variable, or sort them according to their medians. These options are illustrated in the following examples.

5.2

Options for groups, over options

165

Cler.
Sales

Prof
5

10

15

Y-axis
Boxlook options

Labor
Mgmt
Operat.
Prof
Sales

By

Other
0

5

10

15

Curr. work exper.
excludes outside values

Styles

graph hbox tenure, nooutsides over(occ7, sort(1))

Appendix

Here, we sort the variables based on the
median of tenure, yielding boxes with
medians in ascending order. The
sort(1) option sorts the boxes
according to the median of the first
y-variable, meaning to sort on the
median of tenure.
Uses nlsw.dta & scheme vg past

Standard options

25

Cler.

Options

20

Legend

We might want to put these boxes in
alphabetical order, but with Other still
appearing last. We can do this by
recoding occ7 into a new variable (say
occ7alpha) such that, as occ7alpha
goes from 1 to 7, the occupations
alphabetically ordered. We recoded
occ7 with these assignments: 4 = 1,
6 = 2, 2 = 3, 5 = 4, 1 = 5, 3 = 6, and
7 = 7. Then, the sort(occ7alpha)
option alphabetizes the boxes (but with
Other still appearing last).
Uses nlsw.dta & scheme vg past

Pie

25

graph hbox tenure, nooutsides over(occ7, sort(occ7alpha))

Dot

20

excludes outside values

Box

25

Bar

20

Curr. work exper.

Cat axis

0

Matrix

Mgmt

Twoway

Operat.

Introduction

Other
Labor

Over options

Consider this graph showing tenure
broken down by the seven levels of
occupation. The boxes would normally
be ordered by levels of occ7, going from
1 to 7. The descending option
switches the order of the boxes. They
still are ordered according to the seven
levels of occupation, but the boxes are
ordered going from 7 to 1.
Uses nlsw.dta & scheme vg past

Yvars and over

graph hbox tenure, nooutsides over(occ7, descending)

Cler.
Labor
Sales
Operat.
Mgmt
Prof
Other
0

5

10

15

Curr. work exper.
excludes outside values
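
The recoding described for the sort(occ7alpha) example can be sketched as follows. This is only a minimal sketch: it assumes that nlsw.dta is already in memory and that occ7alpha does not yet exist, and it applies the assignments listed in the text using recode with the generate() option.

```stata
* Assumption: nlsw.dta is in memory and occ7alpha is not yet defined.
* Map each occ7 code to its alphabetical position, keeping Other (7) last.
recode occ7 (4=1) (6=2) (2=3) (5=4) (1=5) (3=6) (7=7), generate(occ7alpha)

* Order the boxes by the new variable; the axis labels still come from occ7.
graph hbox tenure, nooutsides over(occ7, sort(occ7alpha))
```

Using generate() leaves occ7 itself unchanged, so the other examples in this section are unaffected.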


graph hbox tenure, nooutsides over(occ7, sort(1) descending)
Adding the descending option yields boxes in descending order, going from highest
median tenure to lowest median tenure.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plot of Curr. work exper. (0 to 25) by occ7, ordered Other, Prof, Mgmt, Operat., Sales, Labor, Cler.; excludes outside values]

graph hbox prev_exp tenure, nooutsides over(occ7)
Here, we plot two y-variables: the number of years of work experience before one’s
current job and the years in one’s current job. Since we have removed any sort()
options, the boxes are sorted according to the values of occ7.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Prev. work exper. and Curr. work exper. (0 to 25) by occ7, ordered Prof, Mgmt, Sales, Cler., Operat., Labor, Other; excludes outside values]

graph hbox prev_exp tenure, nooutsides over(occ7, sort(1))
Adding the sort(1) option now sorts the boxes according to the median of prev_exp
since that is the first y-variable.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Prev. work exper. and Curr. work exper. (0 to 25) by occ7, ordered Other, Operat., Labor, Sales, Cler., Prof, Mgmt; excludes outside values]

graph hbox prev_exp tenure, nooutsides over(occ7, sort(2))
Changing sort(1) to sort(2) then sorts the boxes according to the median of the second
y-variable, tenure.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plots of Prev. work exper. and Curr. work exper. (0 to 25) by occ7; excludes outside values]

graph hbox tenure, nooutsides over(occ7, sort(1)) over(collgrad)
We can use the sort() option when there are additional over() variables. Here, the
boxes are ordered according to the median of tenure across occ7 but within each level
of collgrad.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plot of Curr. work exper. (0 to 25) by occ7 within collgrad. Not college grad: Cler., Labor, Other, Sales, Operat., Mgmt, Prof. College grad: Operat., Labor, Cler., Sales, Prof, Mgmt, Other. Excludes outside values]

graph hbox tenure, nooutsides over(occ7, sort(1)) over(collgrad, descending)
We add the descending option to the second over() option, and the levels of collgrad
are now shown with college graduates appearing first.
Uses nlsw.dta & scheme vg_past

[Figure: horizontal box plot of Curr. work exper. (0 to 25) by occ7 within collgrad, with college grad shown first; excludes outside values]


5.3 Controlling the categorical axis

This section describes ways that you can label categorical axes. Box plots, like bar
charts, differ from other graphs in that their x-axes are represented by categorical
variables. This section describes options you can use to customize these categorical axes.
For more details, see [G] cat_axis_label_options and [G] cat_axis_line_options.
We will start by showing examples of how you can change the labels for the x-axis for
these categorical variables. The next set of examples will use the vg_teal scheme.

graph box wage, nooutsides over(south)
This is an example of a box plot with one over() variable graphing wages broken down
by whether one lives in the South. The variable south is a dummy variable that does not
have any value labels, so the x-axis is not labeled very well. We use the nooutsides
option to suppress the display of outside values. For the rest of the graphs in this
section, there would be a large number of outside values, which would make the graphs
very cluttered, so we will include the nooutsides option for each example.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 20) over south, with x-axis labels 0 and 1; excludes outside values]

graph box wage, nooutsides over(south, relabel(1 "N & W" 2 "South"))
We can use the relabel() option to change the labels displayed for the levels of south,
giving the x-axis more meaningful labels. Note that we wrote relabel(1 "N & W"), not
relabel(0 "N & W"), since these numbers do not represent the actual levels of south but
the ordinal position of the levels, i.e., first and second.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 20) over south, with x-axis labels N & W and South; excludes outside values]
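
An alternative to relabel(), not shown in the text, is to attach value labels to south so that every graph picks them up. The sketch below assumes that nlsw.dta is in memory and that south is coded 0 and 1, as the default axis labels suggest; the label name southlab is hypothetical.

```stata
* Assumption: south is coded 0/1 and currently has no value labels.
* southlab is a hypothetical name chosen for this illustration.
label define southlab 0 "N & W" 1 "South"
label values south southlab

* The categorical axis now shows the value labels without relabel().
graph box wage, nooutsides over(south)
```

Note that value labels refer to the actual values of south (0 and 1), whereas relabel() refers to the ordinal positions of the levels (1 and 2).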

graph box wage, nooutsides over(south, relabel(1 "N & W" 2 "South"))
over(smsa, relabel(1 "Non Metro" 2 "Metro"))
This is an example of a box plot with two over() variables. Here, we use the relabel()
option to change the labels displayed for the levels of south and smsa.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 20) over south (N & W, South) within smsa (Non Metro, Metro); excludes outside values]

graph box prev_exp tenure ttl_exp, nooutsides ascategory
This shows a box plot with multiple y-variables but uses the ascategory option to plot
the different y-variables as if they were categorical variables. The boxes for the
different variables are the same color, and the categories are labeled on the x-axis
rather than with a legend. The default labels on the x-axis are not bad, but we might
want to change them.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) with x-axis labels Prev. work exper., Curr. work exper., and Tot. work exper.; excludes outside values]

graph box prev_exp tenure ttl_exp, nooutsides ascategory
yvaroptions(relabel(1 "Prev Exp" 2 "Curr Exp" 3 "Tot Exp"))
If we had an over() option, we would use the relabel() option within it to change the
labels on the x-axis. But since we have multiple y-variables that we have treated as
categories, we use the yvaroptions(relabel()) option to modify the labels on the x-axis.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) with x-axis labels Prev Exp, Curr Exp, and Tot Exp; excludes outside values]


graph box prev_exp tenure ttl_exp, nooutsides ascategory
over(south, relabel(1 "N & W" 2 "South"))
yvaroptions(relabel(1 "Prev Exp" 2 "Curr Exp" 3 "Tot Exp"))
This example is similar to the previous example, but we have added an over() variable
as well. As before, we use yvaroptions(relabel()) to modify the labels for the multiple
y-variables, and then we also use the relabel() option within the over() option to
change the labels for south.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) labeled Prev Exp, Curr Exp, and Tot Exp within each level of south (N & W, South); excludes outside values]

graph box prev_exp tenure ttl_exp, nooutsides ascategory xalternate
over(south, relabel(1 "N & W" 2 "South"))
yvaroptions(relabel(1 "Prev Exp" 2 "Curr Exp" 3 "Tot Exp"))
We add the xalternate option, which moves the labels for the x-axis to the opposite
side, in this case from the bottom to the top. You can also use the yalternate option to
move the y-axis to its opposite side.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) labeled Prev Exp, Curr Exp, and Tot Exp within each level of south (N & W, South), with the x-axis labels at the top; excludes outside values]

In the examples above, we have seen that, even though the relabel() option is called
an over() option, it can be used within yvaroptions() to control the labeling of multiple
y-variables (provided that the ascategory option is used to convert the multiple y-variables
into categories). We will next explore other over() options, which also can be used with
either over() or yvaroptions(). These examples will use the vg_rose scheme.

graph box wage, nooutsides over(occ7, label(angle(45))) over(collgrad)
In this example, the labels for the levels of occ7 might overlap each other. Using the
label(angle(45)) option makes the angle of the labels for occ7 45 degrees, and they do
not overlap.
Uses nlsw.dta & scheme vg_rose

[Figure: box plot of hourly wage (0 to 25) over occ7 within collgrad (not college grad, college grad), with the occ7 labels at a 45-degree angle; excludes outside values]

graph box wage, nooutsides over(occ7, label(alternate)) over(collgrad)
Another way we can avoid overlapping is by adding the label(alternate) option. As you
can see, the labels alternate in height, avoiding overlapping.
Uses nlsw.dta & scheme vg_rose

[Figure: box plot of hourly wage (0 to 25) over occ7 within collgrad, with the occ7 labels alternating between two rows; excludes outside values]

graph box wage, nooutsides over(occ7, label(labsize(small))) over(collgrad)
We can instead make the size of the labels smaller to make them fit without
overlapping. Here, we make the label size small using the label(labsize(small)) option.
See Styles : Textsize (344) for other values you could choose for labsize().
Uses nlsw.dta & scheme vg_rose

[Figure: box plot of hourly wage (0 to 25) over occ7 (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) within collgrad, with small occ7 labels; excludes outside values]
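
As the text notes, these over() options can also be given within yvaroptions() when ascategory is used. The following is a minimal sketch of that idea, assuming nlsw.dta is in memory; the choice of angle(45) is only for illustration, and the /// line continuation assumes the command is run from a do-file.

```stata
* Assumption: nlsw.dta is in memory; run from a do-file (/// continues the line).
* label(angle(45)) goes inside yvaroptions() because the categories on the
* x-axis come from the y-variables, not from an over() variable.
graph box prev_exp tenure ttl_exp, nooutsides ascategory ///
    yvaroptions(relabel(1 "Prev Exp" 2 "Curr Exp" 3 "Tot Exp") label(angle(45)))
```

Here relabel() and label() sit side by side inside yvaroptions(), just as they would inside over().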


graph hbox wage, nooutsides over(occ5, label(labcolor(maroon)))
over(collgrad)
Using the label(labcolor(maroon)) option, we change the label color for occ5 to maroon.
See Styles : Colors (328) for more details about other colors you could choose.
Uses nlsw.dta & scheme vg_rose

[Figure: horizontal box plot of hourly wage (0 to 25) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within collgrad (not college grad, college grad); excludes outside values]

graph hbox wage, nooutsides
over(occ5, label(ticks tlwidth(thick) tlength(*2) tposition(crossing)))
over(collgrad)
We can use the label(ticks) option to place ticks under each box. We also modify the
attributes of the ticks, making them thick, twice as long as normal, and crossing the
x-axis. See [G] cat_axis_label_options for more details and other options for
controlling ticks.
Uses nlsw.dta & scheme vg_rose

[Figure: horizontal box plot of hourly wage (0 to 25) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within collgrad (not college grad, college grad); excludes outside values]

graph hbox wage, nooutsides over(occ5, label(labgap(*5))) over(collgrad)
The label(labgap(*5)) option controls the gap between the label and the ticks. Here,
we increase the gap between the label for the levels of occ5 and the axis line to five
times its normal size.
Uses nlsw.dta & scheme vg_rose

[Figure: horizontal box plot of hourly wage (0 to 25) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within collgrad (not college grad, college grad); excludes outside values]

graph hbox wage, nooutsides over(occ5) over(collgrad, label(labgap(*7)))
Using the label(labgap(*7)) option, we increase the gap associated with collgrad. This
example makes the gap between collgrad and occ5 seven times its normal size.
Uses nlsw.dta & scheme vg_rose

[Figure: horizontal box plot of hourly wage (0 to 25) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within collgrad (not college grad, college grad); excludes outside values]

graph box wage, nooutsides over(occ7, axis(outergap(*20)))
We use the axis(outergap()) option to increase the gap between the labels of the
x-axis and the outside of the graph. As you can see, this increases the space between
the labels for occ7 and the bottom of the graph.
Uses nlsw.dta & scheme vg_rose

[Figure: box plot of hourly wage (0 to 25) over occ7 (Prof, Mgmt, Sales, Cler., Operat., Labor, Other); excludes outside values]

So far, we have focused on labeling the values on the categorical x-axis, but we have not
yet looked at how to add a title to that axis. We might be tempted to use xtitle(), but
that option is not valid for a categorical axis. Instead, we can use other means for giving
titles to these axes, as illustrated in the examples below using the vg_s1c scheme.


graph box wage, over(grade6) nooutsides
b1title("Level of Education") b2title("in six categories")

In this example, the categorical axis represents the level of education, and we can use
the b1title() and b2title() options to add titles to the bottom of the graph. See
Standard options : Titles (313) for more details.
Uses nlsw.dta & scheme vg_s1c

[Figure: box plot of hourly wage (0 to 20) over grade6 (No HS, Some HS, HS Grad, Some Coll, Coll Grad, Post Grad), with the bottom titles Level of Education and in six categories; excludes outside values]

graph hbox wage, over(grade6) nooutsides
l1title("Level of Education" "in six categories")
By using graph hbox, the categorical axis is now on the left axis, so we use the
l1title() option to add a title to the x-axis. We could also use the l2title() option to
add a second title.
Uses nlsw.dta & scheme vg_s1c

[Figure: horizontal box plot of hourly wage (0 to 20) by grade6 (No HS, Some HS, HS Grad, Some Coll, Coll Grad, Post Grad), with the two-line left title Level of Education, in six categories; excludes outside values]
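
The l2title() variant mentioned in the text might look like the following. This is only a sketch, assuming nlsw.dta is in memory and that the command is run from a do-file (so that /// continues the line); it splits the two-line title across l1title() and l2title() rather than passing two strings to l1title().

```stata
* Assumption: nlsw.dta is in memory; run from a do-file (/// continues the line).
* Each title string gets its own left-title option.
graph hbox wage, over(grade6) nooutsides ///
    l1title("Level of Education") l2title("in six categories")
```

See Standard options : Titles (313) for how the two titles are placed relative to each other.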

5.4 Controlling legends

This section discusses the use of legends for box charts, emphasizing the features that
are unique to box charts. The section Options : Legend (287) goes into great detail about
legends, as does [G] legend_option. Legends can be used for multiple y-variables or when
the first over() variable is treated as a y-variable via the asyvars option. See Box : Yvars
and over (157) for more information about using multiple y-variables and more examples
of treating the first over() variable as a y-variable. These first examples use the vg_brite
scheme.

graph box prev_exp tenure ttl_exp, nooutsides
Consider this box plot of three different variables. These variables are shown with
different colors, and a legend is used to identify the variables. We use the nooutsides
option to suppress the display of outside values. For the rest of the graphs in this
section, there would be a large number of outside values, which would make the graphs
very cluttered, so we will include the nooutsides option for each example.
Uses nlsw.dta & scheme vg_brite

[Figure: box plots (0 to 25) of three variables, with legend keys Prev. work exper., Curr. work exper., and Tot. work exper.; excludes outside values]

graph box wage, nooutsides over(occ7) asyvars
This is another example of where a legend can arise in a Stata box plot by using the
asyvars option, which treats an over() variable as though the levels were different
y-variables.
Uses nlsw.dta & scheme vg_brite

[Figure: box plot of hourly wage (0 to 25) with one box per level of occ7 and legend keys Prof, Mgmt, Sales, Cler., Operat., Labor, Other; excludes outside values]

Unless otherwise mentioned, the legend() options described below work the same
whether the legend was derived from multiple y-variables or from an over() option that
was combined with the asyvars option. These examples use the vg_teal scheme.


graph box prev_exp tenure ttl_exp, nooutsides nolabel
The nolabel option only works when you have multiple y-variables. When this option is
used, the variable names (not the variable labels) are used in the legend. For example,
instead of showing the variable label Prev. work exper., it shows the variable name
prev_exp.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) of three variables, with legend keys prev_exp, tenure, and ttl_exp; excludes outside values]

graph box prev_exp tenure ttl_exp, nooutsides showyvars
The showyvars option puts the labels under the boxes.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) labeled Prev. work exper., Curr. work exper., and Tot. work exper. under the boxes, with the same three labels repeated in the legend; excludes outside values]

graph box prev_exp tenure ttl_exp, nooutsides showyvars legend(off)
One instance when the showyvars option would be useful is when you want separately
colored boxes labeled at the bottom. Here, we use showyvars to show the labels at the
bottom of the boxes and the legend(off) option to suppress the display of the legend.
Uses nlsw.dta & scheme vg_teal

[Figure: box plots (0 to 25) labeled Prev. work exper., Curr. work exper., and Tot. work exper. under the boxes, with no legend; excludes outside values]

graph box wage, nooutsides over(occ7) asyvars showyvars legend(off)
Even though the showyvars option sounds like it would work only with multiple
y-variables, it also works when you combine the over() and asyvars options. As before,
we suppress the legend in this example using the legend(off) option.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 25) with one box per level of occ7, labeled Prof, Mgmt, Sales, Cler., Operat., Labor, Other under the boxes, with no legend; excludes outside values]

graph box wage, nooutsides over(occ7) asyvars
legend(label(1 "Professional") label(2 "Management"))
We use the legend(label()) option to change the labels for the first and second
variables in the legend. Note that you use a separate label() option for each box. This
is in contrast to the relabel() option, where all the label assignments were placed in
one relabel() option; see Box : Cat axis (168).
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 25) with one box per level of occ7 and legend keys Professional, Management, Sales, Cler., Operat., Labor, Other; excludes outside values]

graph box wage, nooutsides over(occ7) asyvars legend(rows(2) colfirst)
In this example, we use the legend(rows(2) colfirst) options to display the legend in
two rows and to order the keys by column (instead of the default, which is by row). This
yields keys that are more adjacent to the boxes that they label.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 25) with one box per level of occ7 and a two-row legend ordered by column; excludes outside values]


graph box wage, nooutsides over(occ7) asyvars
legend(position(1))
We can put the legend up in the top right corner of the graph with the
legend(position(1)) option. The values you supply for position() are like the numbers
on a clock face, where 12 o’clock is the top, 6 o’clock is the bottom, and 0 represents
the center of the clock face; see Styles : Clockpos (330) for more details.
Uses nlsw.dta & scheme vg_teal

[Figure: box plot of hourly wage (0 to 25) with one box per level of occ7 and the legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) at the top right; excludes outside values]

graph hbox wage, nooutsides over(occ7) asyvars
legend(cols(1) position(9))
We switch to making this a horizontal box chart and then move the legend using the
legend(position(9)) option. The legend is now placed in the 9 o’clock position and is
displayed as a single column.
Uses nlsw.dta & scheme vg_teal

[Figure: horizontal box plot of hourly wage (0 to 25) with one box per level of occ7 and a single-column legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) at the left; excludes outside values]

graph hbox wage, nooutsides over(occ7) asyvars
legend(cols(1) position(9) textfirst)
We can add the textfirst option to put the key description before the key in the
legend.
Uses nlsw.dta & scheme vg_teal

[Figure: horizontal box plot of hourly wage (0 to 25) with one box per level of occ7 and a single-column legend at the left, with each label placed before its key; excludes outside values]

graph hbox wage, nooutsides over(occ7) asyvars
legend(cols(1) position(9) stack)
With the stack option, we can place the keys and their labels on top of each other to
form an even more compact column. You have considerable control over the elements
within the legend using other options like rowgap(), keygap(), symxsize(), symysize(),
textwidth(), and symplacement(). See Options : Legend (287) and [G] legend_option for
more details.
Uses nlsw.dta & scheme vg_teal

[Figure: horizontal box plot of hourly wage (0 to 25) with one box per level of occ7 and a stacked single-column legend at the left; excludes outside values]

graph hbox wage, nooutsides over(occ7) asyvars
Switching to the vg_lgndc scheme, by typing set scheme vg_lgndc, positions the legend
at the left in a single column, by default, without the need to specify options.
Uses nlsw.dta & scheme vg_lgndc

[Figure: horizontal box plot of hourly wage (0 to 25) with one box per level of occ7 and a single-column legend (Prof, Mgmt, Sales, Cler., Operat., Labor, Other) at the left; excludes outside values]

5.5 Controlling the y-axis

This section describes options you can use with respect to the y-axis with box charts.
To be precise, when Stata refers to the y-axis on a box chart, it refers to the axis with the
continuous variable, whether the left axis when using graph box or the bottom axis when
using graph hbox. This section emphasizes the features that are particularly relevant to
box charts. For more details, see Options : Axis titles (254), Options : Axis labels (256), and
Options : Axis scales (265). See also [G] axis_title_options, [G] axis_label_options, and
[G] axis_scale_options. These examples are shown using the vg_lgndc scheme, which
places the legend to the left in a single column.


graph box wage, nooutside over(occ5)

Consider this graph showing the hourly wages broken down by occupation. We use the
nooutsides option to suppress the display of outside values. For the rest of the graphs
in this section, there would be a large number of outside values, which would make the
graphs very cluttered, so we will include the nooutsides option for each example.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plot of hourly wage (0 to 20) over occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other); excludes outside values]

graph box prev_exp tenure, nooutside over(occ5)
ytitle("Years of experience")
Looking at previous and current work experience over occupations, we can use the
ytitle() option to add a title to the y-axis. See Options : Axis titles (254) and
[G] axis_title_options for more details, but please disregard any references to
xtitle() there since that option is not valid when using graph box.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of Prev. work exper. and Curr. work exper. (0 to 20) over occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other), with the y-axis title Years of experience; excludes outside values]

graph hbox prev_exp tenure, nooutside over(occ5)
ytitle("Years of" "experience")
In this example, we place the title across two lines by using two separate quoted
strings. Note that, even though we have used graph hbox to place the y-axis on the
bottom axis, we still should use ytitle() to change the title of that axis.
Uses nlsw.dta & scheme vg_lgndc

[Figure: horizontal box plots of Prev. work exper. and Curr. work exper. (0 to 20) by occ5 (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other), with the two-line bottom title Years of, experience; excludes outside values]

The electronic form of this book is solely for direct use at UCLA and only by faculty, students, and staff of UCLA.
All rights reserved on the copyright page apply to this document and specifically neither the electronic nor
published form of the book may be distributed or reproduced, either electronically or in printed form.

i
i

i
i

i

i

i

i
5.5

Controlling the y-axis

181

graph hbox prev_exp tenure, nooutside over(occ5) ytitle("Years of experience", size(vlarge) box bexpand)

Because this title is considered to be a textbox, you can use a variety of textbox options to control the look of the title. This example makes the title very large, surrounds it with a box, and uses the bexpand (box expand) option to stretch the box to fill the width of the plot area. See Options : Textboxes (303) for additional examples of how to use textbox options to control the display of text.
Uses nlsw.dta & scheme vg_lgndc

[Figure: horizontal box plots of previous and current work experience by occupation, with a boxed title "Years of experience"; outside values excluded]

graph box wage, nooutside over(occ5) over(collgrad) asyvar yline(4 12, lwidth(medthick) lcolor(maroon) lpattern(dash))

In this example, we use the yline() option to add a medium-thick, maroon, dashed line to the points in the graph where wages equal 4 and 12. Note that we would still use yline(), even if we used graph hbox, placing the y-axis at the bottom.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with dashed reference lines at 4 and 12; outside values excluded]

graph box wage, nooutside over(occ5) over(collgrad) asyvar ylabel(5(10)25)

We can use the ylabel() option to label the y-axis. In this case, we use labels going from 5 to 25 in increments of 10. Note that the y-axis still starts at 0; we would have to supply the exclude0 option so that 0 is not necessarily the starting point for the y-axis. See Options : Axis labels (256) and [G] axis_label_options for more details. Please disregard any references to xlabel(), since that option is not valid when using graph box.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with y-axis labels at 5, 15, and 25; outside values excluded]
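The y-axis options shown above can be combined in a single command. The following is a minimal sketch, not the book's own example: it assumes the nlsw88 dataset that ships with Stata (loaded with sysuse) in place of nlsw.dta, uses race as a stand-in grouping variable because nlsw88 has no occ5, and relies on the default scheme rather than vg_lgndc.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Label the y-axis from 5 to 25 in steps of 10, add dashed reference
* lines where wage equals 4 and 12, and enlarge the y-axis title
graph box wage, nooutsides over(race) over(collgrad) asyvars ///
    ylabel(5(10)25) ///
    yline(4 12, lwidth(medthick) lcolor(maroon) lpattern(dash)) ///
    ytitle("Hourly wage", size(large))
```

The /// line continuation works in a do-file; when typing in the Command window, enter the command on one line.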


graph box wage, nooutside over(occ5) over(collgrad) asyvar ylabel(5(10)25, angle(0))

We can add the angle(0) option to modify the angle of the y-labels, in this case making them display horizontally.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with horizontal y-axis labels at 5, 15, and 25; outside values excluded]

graph box wage, nooutside over(occ5) over(collgrad) asyvar ylabel(5(10)25, nogrid)

The nogrid option suppresses the display of the grid. Note that this option is placed within the ylabel() option, thus suppressing the grid for the y-axis. (With box plots, there is never a grid with respect to the x-axis.) If the grid were absent and we wanted to include it, we could add the grid option. For more details, see Options : Axis labels (256).
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with no grid lines; outside values excluded]

graph box wage, nooutside over(occ5) over(collgrad) asyvar yscale(off)

We can use yscale(off) to turn off the y-axis. See Options : Axis scales (265) and [G] axis_scale_options for more details. Please disregard any references to xscale(), since that option is not valid when using graph box.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with the y-axis turned off; outside values excluded]


graph box wage, nooutside over(occ5) over(collgrad) asyvar yreverse

You can reverse the direction of the y-axis, in effect turning your boxes upside down, with the yreverse option.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with the y-axis running from 0 at the top to 25 at the bottom; outside values excluded]

graph box wage, nooutside over(occ5) over(collgrad) asyvar yalternate

We can put the y-axis on the opposite side, in this case on the right side of the graph, using the yalternate option.
Uses nlsw.dta & scheme vg_lgndc

[Figure: box plots of hourly wage by occupation and college graduation, with the y-axis on the right side; outside values excluded]
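To compare the effect of yscale(off), yreverse, and yalternate side by side, each variant can be drawn into its own named graph window. This is only a sketch under stated assumptions: it uses Stata's shipped nlsw88 dataset rather than nlsw.dta, race instead of occ5, and the graph names noaxis, reversed, and rightaxis are arbitrary labels chosen here.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Turn the y-axis off entirely
graph box wage, nooutsides over(race) yscale(off) name(noaxis, replace)

* Reverse the y-axis so that 0 is at the top
graph box wage, nooutsides over(race) yreverse name(reversed, replace)

* Move the y-axis to the opposite (right) side
graph box wage, nooutsides over(race) yalternate name(rightaxis, replace)
```

Because each graph is stored under a different name, all three remain available for inspection after the commands finish.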

5.6 Changing the look of boxes, boxlook options

This section shows how you can control the look of the boxes in your box charts: control the space between the boxes, the color of the boxes, and the characteristics of the line outlining the boxes. For more information, see the boxlook_options table in [G] graph box. These examples begin with the vg_blue scheme.


graph box prev_exp tenure ttl_exp, over(collgrad)

Consider this box chart, which shows the distribution of previous work experience, current work experience, and total work experience. These three variables are broken down by whether one graduated college.
Uses nlsw.dta & scheme vg_blue

[Figure: box plots of previous, current, and total work experience by college graduation, including outside values]

graph box prev_exp tenure ttl_exp, nooutsides over(collgrad)

We add the nooutsides option to suppress the display of outside values. We will use this option for most of the graphs in this section.
Uses nlsw.dta & scheme vg_blue

[Figure: box plots of previous, current, and total work experience by college graduation; outside values excluded]

graph box prev_exp tenure ttl_exp, nooutsides over(collgrad) outergap(*5)

We can change the outer gap between the boxes and the edge of the plot area with the outergap() option. Here, the gap is five times its normal size. You could also supply a value less than 1 to shrink the size of the outer gap.
Uses nlsw.dta & scheme vg_blue

[Figure: the same box plots with a wider gap between the outermost boxes and the edges of the plot area; outside values excluded]

graph box prev_exp tenure ttl_exp, nooutsides over(collgrad) boxgap(10)

The boxgap() option controls the size of the gap among the boxes formed by the multiple y-variables. The default value is 33, meaning that the distance between the boxes is 33% of the width of the boxes. Here, we make the gap smaller, making the boxes for the y-variables closer to each other.
Uses nlsw.dta & scheme vg_blue

[Figure: box plots of previous, current, and total work experience by college graduation, with the boxes within each group closer together; outside values excluded]

graph box prev_exp tenure ttl_exp, nooutsides over(collgrad, gap(*3))

Here, we use the gap() option to control the gap between the college graduate group and the noncollege graduate group. Specifying gap(*3) makes this gap three times its default size. See Box : Over options (163) for more information about controlling the gap among boxes created by the over() option.
Uses nlsw.dta & scheme vg_blue

[Figure: the same box plots with a wider gap between the two college graduation groups; outside values excluded]
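The three spacing options act on different gaps, so they can be set together. The sketch below is illustrative only and assumes Stata's shipped nlsw88 dataset in place of nlsw.dta; because nlsw88 has no prev_exp variable, only tenure and ttl_exp are plotted.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* boxgap(10):  gap between the boxes of the two y-variables, as a
*              percentage of box width
* gap(*3):     gap between the over() groups, relative to the default
* outergap(*2): gap between the outermost boxes and the plot edge,
*              relative to the default
graph box tenure ttl_exp, nooutsides over(collgrad, gap(*3)) ///
    boxgap(10) outergap(*2)
```

Note that gap() is a suboption of over(), whereas boxgap() and outergap() are options of the graph box command itself.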

Let’s now look at options that allow us to control the color of the boxes. We will first look at options that control the overall intensity of the color for all the boxes and then show how you can control the color of each box. We will use the vg_s2c scheme for the following examples.


graph box wage, over(occ5) over(collgrad) asyvars nooutsides intensity(*.5)

The intensity() option controls the intensity of the color within the boxes. Here, we request that the color be 50% as intense as it normally would be.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage by occupation and college graduation, with lighter box colors; outside values excluded]

graph box wage, over(occ5) over(collgrad) asyvars nooutsides intensity(*1.5)

In this example, we use the intensity() option to make the colors within the boxes 1.5 times as intense as they would normally be.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage by occupation and college graduation, with more intense box colors; outside values excluded]

graph box wage, over(occ5) over(collgrad) asyvars nooutsides box(1, bcolor(sand))

Here, we add box(1, bcolor(sand)) to make the color of the first box a sand color. See Styles : Colors (328) for more information about colors you can select.
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage by occupation and college graduation, with the first box (Prof/Mgmt) colored sand; outside values excluded]
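The intensity() and box() options can be used in the same command, with intensity() applying to every box and box(#, ...) overriding the color of one box. A minimal sketch, assuming Stata's shipped nlsw88 dataset in place of nlsw.dta, race in place of occ5, and the default scheme rather than vg_s2c:

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Halve the fill intensity of all boxes, then recolor the first box
* (the first level of race, which asyvars treats as the first y-variable)
graph box wage, over(race) over(collgrad) asyvars nooutsides ///
    intensity(*.5) box(1, bcolor(sand))
```

With asyvars, the box numbers in box(#, ...) follow the levels of the first over() variable.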


graph box wage, over(occ5) over(collgrad) asyvars nooutsides box(1, bcolor(sand) blcolor(brown) blwidth(thick))

We add the blcolor() (box line color) and blwidth() (box line width) options to make the outline for the first box brown and thick. Note that, while you can control the color of the boxes and outline characteristics via the box() option, if you want to extensively change these characteristics for many graphs, you might consider making your own scheme. See Intro : Schemes (14) and Appendix : Customizing schemes (379).
Uses nlsw.dta & scheme vg_s2c

[Figure: box plots of hourly wage by occupation and college graduation, with the first box (Prof/Mgmt) colored sand and outlined in thick brown; outside values excluded]

Now, let’s consider options that allow us to control the display of the median, whiskers, caps, and outside markers. These examples use the vg_s1m scheme.

graph box prev_exp tenure ttl_exp, nooutsides medtype(cline) medline(lwidth(thick) lcolor(black))

The medtype(cline) option sets the median type to be a custom line. We then customize the median line using the medline() option to specify that the line width be thick and the line color be black.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with thick black median lines; outside values excluded]


graph box prev_exp tenure ttl_exp, nooutsides medtype(marker) medmarker(msymbol(+) msize(large))

We can use the medtype(marker) option to tell Stata that we want to use a marker symbol to label the median and then use the medmarker() option to control the display of the median marker. In this case, we make the marker symbol a plus sign and make the marker size large.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with each median shown as a large plus sign; outside values excluded]

graph box prev_exp tenure ttl_exp, nooutsides cwhiskers lines(lwidth(thick) lcolor(black))

To customize the whiskers, we need to specify the cwhiskers (customize whiskers) option, and then we can add the lines() option to specify how we want the whiskers customized. In this case, we make the whiskers thick and black.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with thick black whiskers; outside values excluded]

graph box prev_exp tenure ttl_exp, nooutsides alsize(20)

The alsize() (adjacent line size) option allows you to control the size (width) of the adjacent line. By default, the adjacent line is 67% of the width of the box. Here, we make the adjacent line much smaller, 20% of the width of the box.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with narrow adjacent lines; outside values excluded]


graph box prev_exp tenure ttl_exp, nooutsides capsize(5)

The capsize() option allows you to specify the size of the caps (if any) on the adjacent line. The default value is 0, meaning that no cap is displayed. Here, we add a small cap to the adjacent line.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with small caps on the adjacent lines; outside values excluded]

graph box prev_exp tenure ttl_exp, marker(2, msymbol(Oh) msize(vlarge))

The marker() option allows you to control the markers used to display the outside values. You can control this separately for each y-variable. Here, we make the outside values for tenure display as large, hollow circles.
Uses nlsw.dta & scheme vg_s1m

[Figure: box plots of previous, current, and total work experience, with the outside values for current work experience shown as large hollow circles]
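The median, whisker, adjacent-line, cap, and outside-marker options described in this section can be combined. The two commands below are a sketch under assumptions, not the book's examples: they use Stata's shipped nlsw88 dataset in place of nlsw.dta and plot tenure and ttl_exp only, since nlsw88 has no prev_exp variable.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Custom median line, custom whiskers, narrower adjacent lines with caps
graph box tenure ttl_exp, nooutsides ///
    medtype(cline) medline(lwidth(thick) lcolor(black)) ///
    cwhiskers lines(lwidth(medthick) lcolor(black)) ///
    alsize(40) capsize(5)

* Keep the outside values and restyle them for the first y-variable (tenure)
graph box tenure ttl_exp, marker(1, msymbol(Oh) msize(vlarge))
```

The second command omits nooutsides on purpose, because marker() only has a visible effect when outside values are drawn.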

5.7 Graphing by groups

This section discusses the use of the by() option in combination with graph box. Normally, you would use the over() option instead of the by() option, but in some cases the by() option is either necessary or more advantageous. For example, a by() option is useful if you exceed the maximum number of over() options (three if you have a single y-variable or two if you have multiple y-variables). In such cases, the by() option allows you to break your data down by additional categorical variables. Also, by() gives you more flexibility in the placement of the separate panels. For more information about the by() option, see Options : By (272); for more information about the over() option, see Box : Yvars and over (157).


graph hbox wage, nooutsides note("") over(collgrad) over(urban2) over(married)

Consider this box graph, which breaks wages down by three categorical variables. If we wanted to further break this down by another categorical variable, we could not use another over() option since we can have a maximum of three over() options with a single y-variable. We use the nooutsides option to suppress the display of outside values for this graph and the rest of the graphs in this section.
Uses nlsw.dta & scheme vg_s1m

[Figure: horizontal box plots of hourly wage by college graduation, urban status (Rural, Metro), and marital status]

graph hbox wage, nooutsides note("") over(collgrad) over(urban2) over(married) by(union)

If we want to further break wage down by union, we can use the by(union) option to do this. We also add the note("") option to suppress the note saying that the outside values have been omitted.
Uses nlsw.dta & scheme vg_s1m

[Figure: two panels (nonunion, union) of horizontal box plots of hourly wage by college graduation, urban status, and marital status; note "Graphs by union worker"]
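The pattern of three over() options plus by() can be tried with data that ship with Stata. This sketch makes several assumptions: it uses the nlsw88 dataset in place of nlsw.dta, and the smsa variable stands in for urban2, which nlsw88 does not contain.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Three over() options is the maximum with a single y-variable,
* so the fourth grouping variable (union) goes into by()
graph hbox wage, nooutsides note("") ///
    over(collgrad) over(smsa) over(married) by(union)
```

Observations with a missing value for union are left out of the panels unless the missing suboption is added inside by().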

graph hbox prev_exp tenure, nooutsides note("") over(urban2) over(married)

Consider this box graph with multiple y-variables, which breaks them down by two categorical variables using two over() options. When you have multiple y-variables, you can have a maximum of two over() options.
Uses nlsw.dta & scheme vg_s1m

[Figure: horizontal box plots of previous and current work experience by urban status and marital status]


graph box prev_exp tenure, nooutsides note("") over(urban2) over(married) by(union)

If we want to further break these variables down by another categorical variable, say union, we can use the by(union) option. We can include multiple variables within by(), although this can make some very small graphs.
Uses nlsw.dta & scheme vg_s1m

[Figure: two panels (nonunion, union) of box plots of previous and current work experience by urban status and marital status; note "Graphs by union worker"]

graph hbox ttl_exp tenure, nooutsides note("") over(urban2) over(married) by(union, missing)

We can use the missing option to include a panel for the missing values of union.
Uses nlsw.dta & scheme vg_s1m

[Figure: three panels (nonunion, union, and missing) of horizontal box plots of total and current work experience by urban status and marital status; note "Graphs by union worker"]

graph hbox ttl_exp tenure, nooutsides note("") over(urban2) over(married) by(union, total)

We can add the total option to include a panel for all observations.
Uses nlsw.dta & scheme vg_s1m

[Figure: three panels (nonunion, union, and Total) of horizontal box plots of total and current work experience by urban status and marital status; note "Graphs by union worker"]


graph hbox ttl_exp tenure, nooutsides note("") over(urban2) over(married) by(union, total row(1))

Switching to a vertical box chart, we can use the row(1) option to show the multiple graphs in one row.
Uses nlsw.dta & scheme vg_s1m

[Figure: three panels (nonunion, union, Total) in a single row, showing total and current work experience by urban status and marital status; note "Graphs by union worker"]

graph hbox ttl_exp tenure, nooutsides note("") over(urban2) over(married) by(union, cols(1))

Here, we flip the graph back to a horizontal box chart and use the cols(1) option to show both graphs in one column.
Uses nlsw.dta & scheme vg_s1m

[Figure: two panels (nonunion, union) stacked in one column, showing horizontal box plots of total and current work experience by urban status and marital status; note "Graphs by union worker"]

graph hbox ttl_exp tenure, nooutsides note("") over(urban2) over(married) by(union, cols(1) legend(position(9))) legend(cols(1) stack)

To make the last graph more readable, we can add legend(position(9)) within the by() option to put the legend at 9 o’clock and legend(cols(1) stack) to make the legend one stacked column. Adding note("") suppresses the note about outside values being omitted.
Uses nlsw.dta & scheme vg_s1m

[Figure: two panels (nonunion, union) stacked in one column, with a one-column stacked legend (Tot. work exper., Curr. work exper.) at the left; note "Graphs by union worker"]
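The by() suboptions for extra panels and panel layout can be combined in one command. The following is a hedged sketch rather than the book's example: it assumes Stata's shipped nlsw88 dataset in place of nlsw.dta, with smsa standing in for urban2.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* total   adds a panel for all observations
* missing adds a panel for observations with union missing
* cols(1) stacks the panels in a single column
* legend(position(9)) inside by() places the legend at 9 o'clock;
* legend(cols(1) stack) outside by() controls the legend's own layout
graph hbox ttl_exp tenure, nooutsides note("") ///
    over(smsa) over(married) ///
    by(union, total missing cols(1) legend(position(9))) ///
    legend(cols(1) stack)
```

The split between the two legend() options mirrors the text above: placement of the legend belongs inside by(), while its contents and arrangement are set outside.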


6 Dot plots

This chapter discusses the use of dot plots in Stata. We start by showing how you can specify multiple y-variables to display plots for multiple variables and how you can use the over() option to break dot plots down by categorical variables. Then, we discuss over() options that can be used to customize the display of these categorical variables, followed by options concerning the display of categorical axes. Next, we cover options that control legends, followed by options that control the y-axis. Finally, we discuss options that control the look of the lines and dots that form the dot plot and, lastly, the by() option.

6.1 Specifying variables and groups, yvars and over

This section introduces the use of dot plots. It shows how you can use the over() option for displaying dot plots by one or more grouping variables. It then shows how you can specify one or more y-variables in a plot and control the summary statistic used for collapsing the y-variable(s). See the group_options table in [G] graph dot for more details. The graphs in this section begin using the vg_s1c scheme.

graph dot tenure, over(occ7)

Here, we use the over() option to show the average current work experience broken down by occupation. By default, the y-variable (tenure) is placed on the bottom axis and is considered to be the y-axis. Likewise, the levels of occ7 are placed on the left axis and are considered to form the x-axis, or categorical axis.
Uses nlsw.dta & scheme vg_s1c

[Figure: dot plot of mean of tenure for the seven occupations (Prof, Mgmt, Sales, Cler., Operat., Labor, Other)]


graph dot tenure, over(occ7) over(collgrad)

Here, we use a second over() option to show the mean of work experience broken down by occupation and whether one graduated college.
Uses nlsw.dta & scheme vg_s1c

[Figure: dot plot of mean of tenure by occupation, shown separately for noncollege graduates and college graduates]

graph dot tenure, over(occ7) over(collgrad) over(married)

We can add a third over() option, in this case further breaking tenure down by whether one is married. Note that the first over() variable (occ7) is now treated as multiple y-variables. When you use three over() options, the first variable is then treated as multiple y-variables, as though you had specified the asyvars option. This graph can be difficult to read with occ7 forming the multiple y-variables.
Uses nlsw.dta & scheme vg_s1c

[Figure: dot plot of mean of tenure by college graduation and marital status, with the seven occupations shown as separate markers identified in the legend]

graph dot tenure, over(married) over(occ7) over(collgrad)

This graph shows the same data as the last one, except we have switched the order of the over() options, making over(married) come first and thus forming the multiple y-variables. This might be easier to read than the previous graph.
Uses nlsw.dta & scheme vg_s1c

[Figure: dot plot of mean of tenure by occupation and college graduation, with single and married shown as separate markers identified in the legend]
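The effect of the order of the over() options can be explored with data that ship with Stata. This sketch assumes the nlsw88 dataset in place of nlsw.dta and uses race where the book uses occ7, since nlsw88 has no occ7 variable.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* One grouping variable: one line of dots per level of race
graph dot tenure, over(race) name(one_over, replace)

* Three over() options: the first one (married) is treated as though
* asyvars had been specified, so its levels share a line and are
* identified in the legend
graph dot tenure, over(married) over(race) over(collgrad) name(three_over, replace)
```

Placing the variable with the fewest levels first usually keeps the legend short, which is the point the text above makes about over(married).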


Let’s now consider examples with multiple y-variables. These examples are shown using the vg_outc scheme.

graph dot prev_exp tenure, over(occ7)

This graph shows the average previous experience and average current tenure broken down by occupation. While you do not need to use the over() option, omitting it may make a fairly boring graph.
Uses nlsw.dta & scheme vg_outc

[Figure: dot plot of mean of prev_exp and mean of tenure by occupation]

graph dot prev_exp tenure, over(occ7) over(collgrad)

This graph adds whether one is a college graduate as an additional grouping level. Because the command has multiple y-variables, we cannot include another over() option since dot plots support three levels of nesting and the multiple y-variables account for a level.
Uses nlsw.dta & scheme vg_outc

[Figure: dot plot of mean of prev_exp and mean of tenure by occupation, shown separately for noncollege graduates and college graduates]


graph dot (median) prev_exp tenure, over(occ7) over(collgrad)

So far, all the examples we have seen have graphed the mean of the y-variable(s). Here, we preface the y-variables with (median), plotting the median for each y-variable.
Uses nlsw.dta & scheme vg_outc

[Figure: dot plot of p 50 of prev_exp and p 50 of tenure by occupation, shown separately for noncollege graduates and college graduates]

graph dot (p10) p10=wage (p25) p25=wage (p50) p50=wage (p75) p75=wage (p90) p90=wage, over(occ7)

You can request different statistics for the same variable, such as in this example, which shows the 10th, 25th, 50th, 75th, and 90th percentiles of wages broken down by occupation.
Uses nlsw.dta & scheme vg_outc

[Figure: dot plot of the 10th, 25th, 50th, 75th, and 90th percentiles of wage by occupation, with legend entries p 10 of wage through p 90 of wage]
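The statistic prefix and the name=varname form can be tried with data that ship with Stata. The sketch below assumes the nlsw88 dataset in place of nlsw.dta, with race in place of occ7 and ttl_exp in place of prev_exp; the names p10, p50, and p90 are arbitrary labels for the computed statistics.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Plot medians instead of the default means
graph dot (median) ttl_exp tenure, over(race) name(medians, replace)

* Plot several statistics of the same variable; each statistic is
* given its own name with the name=varname form
graph dot (p10) p10=wage (p50) p50=wage (p90) p90=wage, ///
    over(race) name(pctiles, replace)
```

A statistic in parentheses applies to every variable that follows it until another statistic is given.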

Now, let’s consider options that can be used in combination with the over() option to customize the behavior of the graphs. We show how you can treat the levels of the first over() option as though they were multiple y-variables. You can also request that missing values for the levels of the over() variables be displayed, and you can suppress empty categories when multiple over() options are used. These examples are shown below using the vg_s2m scheme.


graph dot tenure, over(collgrad) over(occ7)

Consider this graph, which shows the average current work experience broken down by whether one is a college graduate and by occupation.
Uses nlsw.dta & scheme vg_s2m

[Figure: dot plot of mean of tenure, with separate lines for noncollege graduates and college graduates within each occupation]

graph dot tenure, over(collgrad) over(occ7) asyvars

If we add the asyvars option, the first over() variable (collgrad) is graphed as if there were two y-variables. The two levels of collgrad are shown as different markers on the same line, and they are labeled using the legend.
Uses nlsw.dta & scheme vg_s2m

[Figure: dot plot of mean of tenure by occupation, with noncollege graduates and college graduates shown as different markers on the same line and identified in the legend]

graph dot tenure, over(occ5) over(union) missing

Consider this graph in which we use the over() option to show tenure broken down by occ5 and union. By including the missing option, we then see the category for those who are missing on the union variable, shown as the third group labeled with a dot. See Dot : Cat axis (202) for examples showing how you could change the label (.) to something more meaningful, e.g., “Missing”.
Uses nlsw.dta & scheme vg_s2m

[Figure: dot plot of mean of tenure by occupation (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other) within three groups: nonunion, union, and "." (missing)]


graph dot tenure, over(grade) over(collgrad)

Consider this dot plot, which breaks tenure down by two variables: the last grade that one completed and whether one is a college graduate. By default, Stata shows all possible combinations for these two variables. In most cases, all combinations are possible, but not in this case, and including them has caused the labels for grade to overlap.
Uses nlsw.dta & scheme vg_s2m

[Figure: dot plot of mean of tenure with grades 4 through 18 listed under both noncollege graduates and college graduates, including empty combinations]

graph dot tenure, over(grade) over(collgrad) nofill

If you want to display only the combinations of the over() variables that exist in the data, you can use the nofill option.
Uses nlsw.dta & scheme vg_s2m

[Figure: dot plot of mean of tenure with grades 4 through 15 under noncollege graduates and grades 13 through 18 under college graduates]
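The asyvars, missing, and nofill options can each be tried on data that ship with Stata. This sketch assumes the nlsw88 dataset in place of nlsw.dta and uses race where the book uses occ7 or occ5; the grade, collgrad, and union variables exist in nlsw88 under the same names.

```stata
* Assumption: nlsw88 (shipped with Stata) stands in for the book's nlsw.dta
sysuse nlsw88, clear

* Treat the levels of the first over() variable as separate y-variables
graph dot tenure, over(collgrad) over(race) asyvars name(asy, replace)

* Show a separate group for observations with union missing
graph dot tenure, over(race) over(union) missing name(miss, replace)

* Drop grade-by-collgrad combinations that do not occur in the data
graph dot tenure, over(grade) over(collgrad) nofill name(nofill, replace)
```

Running the third command with and without nofill shows how many empty combinations the default display adds to the categorical axis.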

6.2 Options for groups, over options

This section considers some of the options that can be used with the over() and yvaroptions() options for customizing the display of the markers. We will focus on controlling the spacing between the markers and the order in which the markers are displayed. Other options that control the display of the x-axis (such as the labels) are covered in Dot : Cat axis (202). For more information on the over() options covered in this section, see the over_subopts table in [G] graph dot. We first consider options that control the spacing among the markers and then options that change the order in which the markers are sorted. These examples begin with the vg_blue scheme.


graph dot tenure, over(occ5) over(collgrad)

Consider this graph in which we show a dot plot of tenure broken down by occ5 and collgrad.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure by occupation (Prof/Mgmt, Sales, Clerical, Labor/Ops, Other), shown separately for noncollege graduates and college graduates]

graph dot tenure, over(occ7) over(collgrad, gap(*5))

Suppose that we wanted to make the gap between the levels of collgrad larger. Here, we use the gap(*5) option to make this gap five times as large as it normally would be.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure by the seven occupations, with a wider gap between the noncollege graduate and college graduate groups]

graph dot tenure, over(occ7)

Consider this graph showing tenure broken down by the seven levels of occupation. The markers are ordered by levels of occ7, going from 1 to 7.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure by occupation, ordered Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


Chapter 6. Dot plots

graph dot tenure, over(occ7, descending)
The descending option reverses the order of the markers. They are still ordered according to the seven levels of occupation, but now from 7 to 1.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure (0 to 8) by occ7, in the order Other, Labor, Operat., Cler., Sales, Mgmt, Prof]

graph dot tenure, over(occ7, sort(occ7alpha))
We might want to put these markers in alphabetical order (but with Other appearing last). We can do this by recoding occ7 into a new variable (say occ7alpha), such that, as occ7alpha goes from 1 to 7, the occupations are alphabetical. We recoded occ7 with these assignments: 4 = 1, 6 = 2, 2 = 3, 5 = 4, 1 = 5, 3 = 6, and 7 = 7; see [R] recode. Then, the sort(occ7alpha) option alphabetizes the markers (but with Other still appearing last).
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure (0 to 8) by occ7, in the order Cler., Labor, Mgmt, Operat., Prof, Sales, Other]
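
The recoding described above might be done as in the following sketch. It assumes nlsw.dta is already in memory and that occ7 is coded 1 = Prof through 7 = Other, as the figures suggest; the tabulate step is only a check and is not part of the original example.

```stata
* Sketch: build the alphabetical sort key described in the text.
* Assumes nlsw.dta is in memory and occ7 is coded 1 = Prof, 2 = Mgmt,
* 3 = Sales, 4 = Cler., 5 = Operat., 6 = Labor, 7 = Other.
recode occ7 (4 = 1) (6 = 2) (2 = 3) (5 = 4) (1 = 5) (3 = 6) (7 = 7), generate(occ7alpha)

* Check the mapping before relying on it.
tabulate occ7 occ7alpha

* Markers keep their occ7 labels but are ordered by occ7alpha.
graph dot tenure, over(occ7, sort(occ7alpha))
```

Using generate() leaves occ7 itself unchanged, so the original coding remains available for the other examples.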

graph dot tenure, over(occ7, sort(1))
Here, we sort the markers by the mean of tenure, yielding markers with means in ascending order. The sort(1) option sorts the markers according to the mean of the first y-variable; in this case, tenure is the only y-variable.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure (0 to 8) by occ7, in the order Cler., Labor, Operat., Sales, Mgmt, Prof, Other]


graph dot tenure, over(occ7, sort(1) descending)
Adding the descending option yields markers in descending order, going from highest mean tenure to lowest mean tenure.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure (0 to 8) by occ7, sorted from highest to lowest mean]

graph dot tenure prev_exp, over(occ7, sort(2))
Adding a second y-variable and changing sort(1) to sort(2) sorts the markers according to the second y-variable, the mean of prev_exp.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure and mean of prev_exp by occ7, sorted by mean of prev_exp]

graph dot tenure prev_exp, over(occ7, sort(1)) over(collgrad)
We can use the sort() option when there are additional over() variables. Here, the markers are ordered according to the mean of tenure within each level of collgrad.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure and mean of prev_exp by occ7 within collgrad; not college grad: Cler., Labor, Other, Operat., Sales, Mgmt, Prof; college grad: Labor, Operat., Cler., Sales, Mgmt, Prof, Other]


graph dot tenure prev_exp, over(occ7, sort(1)) over(collgrad, descending)
We add the descending option to the second over() option, and the levels of collgrad are now shown with college graduates appearing first.
Uses nlsw.dta & scheme vg_blue

[Figure: dot plot of mean of tenure and mean of prev_exp (0 to 10) by occ7 within collgrad, with "college grad" shown before "not college grad"]

6.3 Controlling the categorical axis

This section describes ways that you can label the categorical axis in dot plots. Dot plots, like bar and box plots, differ from other plots in that their x-axis is formed by categorical variables. (Remember that Stata calls the axis with the categorical variable(s) the x-axis, even though it may be placed on the left.) For more details, see [G] cat_axis_label_options and [G] cat_axis_line_options. We start by showing how you can change the labels used for the categorical axis. These examples use the vg_past scheme.

graph dot tenure, over(occ7) over(south)

This is an example of a dot plot with two over() variables, graphing the average tenure broken down by occupation and whether one lives in the South. The variable south is a dummy variable that does not have any value labels, so the x-axis is not labeled very well.
Uses nlsw.dta & scheme vg_past

[Figure: dot plot of mean of tenure (0 to 8) by occ7 within south; the two south groups are labeled only 0 and 1]


graph dot wage, over(occ7) over(south, relabel(1 "N & W" 2 "South"))
We can use the relabel() option to change the labels displayed for the levels of south, giving the x-axis more meaningful labels. Note that we wrote relabel(1 "N & W"), not relabel(0 "N & W"), since these numbers do not represent the actual levels of south but the ordinal position of the levels, i.e., first and second.
Uses nlsw.dta & scheme vg_past

[Figure: dot plot of mean of wage (0 to 15) by occ7 within south, with the south groups labeled "N & W" and "South"]

graph dot prev_exp tenure ttl_exp, over(occ5) ascategory
This graph dot command has multiple y-variables but uses the ascategory option to plot the different y-variables as if they were categorical variables. The dots for the different y-variables are plotted on different lines using the same symbol, and each line is labeled on the x-axis rather than in a legend. The default labels on the x-axis are not bad, but we might want to change them.
Uses nlsw.dta & scheme vg_past

[Figure: dot plot (0 to 15) with lines labeled mean of prev_exp, mean of tenure, and mean of ttl_exp within each level of occ5]

graph dot prev_exp tenure ttl_exp, over(occ5) ascategory
yvaroptions(relabel(1 "Previous" 2 "Current" 3 "Total"))
If these labels came from an over() variable, we could use the relabel() option within over() to change them. But since they come from multiple y-variables that we have treated as categories, we use the yvaroptions(relabel()) option to modify the labels on the x-axis.
Uses nlsw.dta & scheme vg_past

[Figure: the same dot plot with the lines labeled Previous, Current, and Total within each level of occ5]
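
As an alternative to relabel() for an over() variable such as south, you could attach value labels to the variable itself, so that every later graph picks them up. This is a sketch rather than the book's approach: it assumes nlsw.dta is in memory and that south is coded 0/1 with no value label attached, and southlbl is a hypothetical label name.

```stata
* Sketch: attach value labels to south instead of using relabel().
* Assumes nlsw.dta is in memory; southlbl is a hypothetical name.
* Unlike relabel(), label define refers to the actual values 0 and 1.
label define southlbl 0 "N & W" 1 "South"
label values south southlbl

* The categorical axis now shows the value labels without relabel().
graph dot wage, over(occ7) over(south)
```

The relabel() option changes only the one graph, whereas value labels change the data in memory, so which is preferable depends on whether you want the labels to persist.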


graph dot prev_exp tenure ttl_exp, ascategory
over(south, relabel(1 "N & W" 2 "South"))
yvaroptions(relabel(1 "Previous" 2 "Current" 3 "Total"))
In this example, we have multiple y-variables that are converted into categorical variables via the ascategory option, as well as an over() variable. The relabel() option within the over() option changes the labels for south, and the relabel() option within yvaroptions() changes the labels for the multiple y-variables.
Uses nlsw.dta & scheme vg_past

[Figure: dot plot (0 to 15) with lines labeled Previous, Current, and Total within "N & W" and "South"]

graph dot prev_exp tenure ttl_exp, ascategory xalternate
over(south, relabel(1 "N & W" 2 "South"))
yvaroptions(relabel(1 "Previous" 2 "Current" 3 "Total"))
We add the xalternate option, which moves the labels for the x-axis to the opposite side, in this case, from the left to the right. We could also use the yalternate option to move the y-axis to its opposite side.
Uses nlsw.dta & scheme vg_past

[Figure: the same dot plot (0 to 15) with the Previous, Current, Total and "N & W", "South" labels on the right side]

graph dot wage, over(occ7)
l1title("Occupations recoded" "into seven categories")

In this example, the categorical axis represents the occupation after recoding it into seven categories. We can use the l1title() option to add a title to the left of the graph labeling this axis. Note that we broke the title into two quoted strings that appear on the graph as two lines. We could also add a second title to the left with l2title(); see Standard options : Titles (313) for more details.
Uses nlsw.dta & scheme vg_past

[Figure: dot plot of mean of wage (0 to 10) by occ7, with the left title "Occupations recoded / into seven categories"]


6.4 Controlling legends

This section discusses the use of legends for dot plots, emphasizing the features that are unique to dot plots. The section Options : Legend (287) goes into great detail about legends, as does [G] legend_option. Legends can be used for multiple y-variables or when the first over() variable is treated as a y-variable via the asyvars option. See Dot : Yvars and over (193) for more information about using multiple y-variables and more examples of treating the first over() variable as a y-variable. The following examples use the vg_rose scheme.

graph dot prev_exp tenure ttl_exp, over(occ7)
Consider this dot plot of three different variables. These variables are shown with different markers, and a legend is used to identify the variables.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot (0 to 15) by occ7; legend: mean of prev_exp, mean of tenure, mean of ttl_exp]

graph dot wage, over(collgrad) over(occ7) asyvars
This is another way a legend can arise in a Stata dot plot: using an over() variable with the asyvars option. Stata treats the levels of the first over() variable as if they were multiple y-variables.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot of mean of wage (0 to 15) by occ7; legend: not college grad, college grad]


Unless otherwise mentioned, the legend options described below work the same whether
the legend was derived from multiple y-variables or from an over() variable that was combined with the asyvars option.

graph dot prev_exp tenure ttl_exp, over(occ7) nolabel
The nolabel option works only when you have multiple y-variables. When this option is used, the variable names (not the variable labels) are used in the legend. For example, instead of showing the variable label Prev. work exper., this option shows the variable name prev_exp.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot (0 to 15) by occ7; legend: prev_exp, tenure, ttl_exp]

graph dot prev_exp tenure ttl_exp, over(occ7)
legend(label(1 "Previous") label(2 "Current") label(3 "Total")
title("Work Experience"))
We use the legend(label()) option to change the labels for the variables in the legend and the title() option to add a title to the legend.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot (0 to 15) by occ7; legend titled "Work Experience" with entries Previous, Current, Total]


graph dot prev_exp tenure ttl_exp, over(occ7)
legend(position(12) rows(1))
We can put the legend at the top of the graph with the legend(position(12)) option. The values you supply for position() are similar to the numbers on a clock face, where 12 o’clock is the top, 6 o’clock is the bottom, and 0 represents the center of the clock face; see Styles : Clockpos (330) for more details. We also add the rows(1) option to make the legend display as one row.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot (0 to 15) by occ7, with the legend (mean of prev_exp, mean of tenure, mean of ttl_exp) in one row at the top]

graph dot prev_exp tenure ttl_exp, over(occ7)
legend(cols(1) position(9))
Here, the legend is moved to the left and displayed in a single column using the legend(cols(1) position(9)) option.
Uses nlsw.dta & scheme vg_rose

[Figure: dot plot (0 to 15) by occ7, with the legend in a single column at the left]

6.5 Controlling the y-axis

This section describes options to customize the y-axis with dot plots. To be precise, when Stata refers to the y-axis on a dot plot, it refers to the axis with the continuous variable, which is placed on the bottom (where the x-axis would traditionally be placed). This section emphasizes the features that are particularly relevant to dot plots. For more details, see Options : Axis titles (254), Options : Axis labels (256), and Options : Axis scales (265). Also, see [G] axis_title_options, [G] axis_label_options, and [G] axis_scale_options. These examples use the vg_teal scheme.


graph dot hours, over(occ7) ytitle("Hours Worked" "Per Week")
Consider this graph showing the mean hours worked per week broken down by occupation. We use the ytitle() option to add a title to the y-axis. We place the title across two lines by using two separate, quoted strings. See Options : Axis titles (254) and [G] axis_title_options for more details, but please disregard any references to xtitle(), since that option is not valid when using graph dot.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of hours (0 to 40) by occ7, with the y-axis title "Hours Worked / Per Week"]

graph dot hours, over(occ7)
ytitle("Hours Worked" "Per Week", bfcolor(eggshell) box bexpand)
Because the title is considered to be a textbox, you can use textbox options as illustrated here to control the look of the title. See Options : Textboxes (303) for additional examples of how to use textbox options to control the display of text.
Uses nlsw.dta & scheme vg_teal

[Figure: the same dot plot (0 to 40) by occ7, with the y-axis title "Hours Worked / Per Week" drawn in a filled box]

graph dot hours, over(occ7)
yline(35 40, lwidth(thin) lcolor(navy) lpattern(dash))
This example uses the yline() option to add thin, navy, dashed lines to the graph where the hours worked equal 35 and 40.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of mean of hours (0 to 40) by occ7, with reference lines at 35 and 40]


graph dot hours, over(occ7) ylabel(30(5)45)
We use the ylabel() option to label the y-axis from 30 to 45 in increments of 5. See Options : Axis labels (256) and [G] axis_label_options for more details. Please disregard any references to xlabel(), since that option is not valid when using graph dot. Note that the y-axis still begins at 0, but see the next example for how you can override this.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of mean of hours by occ7; the y-axis is labeled at 30, 35, 40, and 45 but still begins at 0]

graph dot hours, over(occ7) ylabel(30(5)45) exclude0
When we add the exclude0 option, the dot plot does not automatically begin at 0. In this case, it starts at 30 since that is the value we specified as the starting point in the ylabel() option.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of mean of hours by occ7; the y-axis runs from 30 to 45]

graph dot hours, over(occ7) yscale(off)
We can use the yscale(off) option to turn off the y-axis. See Options : Axis scales (265) and [G] axis_scale_options for more details. Please disregard any references to xscale(), since that option is not valid when using graph dot.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of hours by occ7 with no y-axis displayed]
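
To compare the effect of exclude0 directly, the two versions of the graph can be drawn side by side. The following sketch assumes the nlsw88 dataset that ships with Stata (so it uses race rather than the book's occ7), and the graph names with0 and without0 are arbitrary.

```stata
* Sketch using the nlsw88 data shipped with Stata (an assumption;
* the book's examples use nlsw.dta and occ7 instead).
sysuse nlsw88, clear

* Same axis labels, without and with exclude0; names are arbitrary.
graph dot hours, over(race) ylabel(30(5)45) name(with0, replace)
graph dot hours, over(race) ylabel(30(5)45) exclude0 name(without0, replace)

* Place the two graphs side by side to compare the axis ranges.
graph combine with0 without0
```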


graph dot hours, over(occ7) yalternate
The yalternate option puts the y-axis on the opposite side, in this case on the top of the graph.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of mean of hours (0 to 40) by occ7, with the y-axis at the top]

graph dot hours, over(occ7) yreverse
You can reverse the direction of the y-axis with the yreverse option.
Uses nlsw.dta & scheme vg_teal

[Figure: dot plot of mean of hours by occ7, with the y-axis running from 40 down to 0]

6.6 Changing the look of dot rulers, dotlook options

This section shows how you can control the look of the lines in your dot plots. We show how you can control the space between the lines, the color of the lines, and other characteristics of the lines. For more information, see the linelook_options table in [G] graph dot. These graphs are shown using the vg_s2c scheme.


graph dot prev_exp tenure, over(occ7)
Consider this dot plot showing previous and current work experience broken down by occupation. Each dot plot has a series of small dots that forms a line on which the symbols are plotted.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of prev_exp and mean of tenure (0 to 8) by occ7]

graph dot prev_exp tenure, over(occ7)
ndots(50) dots(msymbol(Oh) msize(medium) mcolor(dkgreen))
By default, each line would be composed of 100 small dots, but here we use the ndots(50) option to display 50 small dots. Further, using the dots() option, the small dots are displayed as medium-sized, dark green, hollow circles. See Styles : Symbols (342), Styles : Markersize (340), and Styles : Colors (328) for more information.
Uses nlsw.dta & scheme vg_s2c

[Figure: the same dot plot (0 to 8) by occ7, with each line drawn as 50 hollow circles]

graph dot prev_exp tenure, over(occ7) linetype(line)
lines(lwidth(thick) lcolor(erose))
Using the linetype(line) option, the dots are instead displayed as lines. Further, we use the lines() option to make the line width thick and the line color rose. We could also add the lpattern() option to control the line pattern. See Styles : Linewidth (337), Styles : Colors (328), and Styles : Linepatterns (336) for more information.
Uses nlsw.dta & scheme vg_s2c

[Figure: the same dot plot (0 to 8) by occ7, with each row of dots replaced by a thick solid line]


graph dot prev_exp tenure, over(occ7) linetype(rectangle)
rwidth(3) rectangles(fcolor(erose) lcolor(maroon))
Here, we change the linetype() to be a rectangle. The rwidth(3) option sets the rectangle width to be three times its normal width. In addition, the rectangles() option is used to customize the rectangles, using the fcolor() (fill color) and lcolor() (line color) options to make each rectangle rose on the inside with a maroon outline.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of prev_exp and mean of tenure (0 to 8) by occ7, with each row of dots replaced by a filled rectangle]

Let’s now look at options that allow us to control the markers and whether the markers
are displayed on the same line.

graph dot prev_exp tenure, over(occ7)
marker(1, msymbol(D) mcolor(teal) msize(large))
Here, we use the marker() option to control the marker used for the first y-variable, making it a large teal-colored diamond. See Options : Markers (235) for more details on how you can control markers.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of prev_exp and mean of tenure (0 to 8) by occ7, with a large diamond marking prev_exp]


graph dot prev_exp tenure, over(occ7)
marker(1, msymbol(d) mfcolor(teal) mlcolor(dkgreen) mlwidth(thick))
marker(2, msymbol(S) mfcolor(ltblue) mlcolor(blue) mlwidth(thick))
In this example, we use two marker() options, so we can control both markers. The first marker is now a diamond with a teal fill and a thick, dark green outline. The second marker is a square, light blue on the inside with a thick blue outline. The section Options : Markers (235) has more details on controlling markers.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of prev_exp and mean of tenure (0 to 8) by occ7, with diamond and square markers]

graph dot prev_exp tenure, over(occ7) linegap(45)
We can use the linegap() option to display the y-variables on different lines and specify the gap between these lines. The default value is 0, meaning that all y-variables are displayed on the same line.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot (0 to 8) by occ7, with mean of prev_exp and mean of tenure on separate lines within each occupation]

graph dot tenure, over(occ5) over(collgrad) over(married)
Consider this graph. Since we have used three over() options, the levels of the first over() variable are displayed as though they were different y-variables. We may want to use the linegap() option to display the different y-variables on different lines to make the graph more readable; see the next example.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of tenure (0 to 10) by collgrad within married (single, married); legend: Prof/Mgmt, Sales, Clerical, Labor/Ops, Other]


graph dot tenure, over(occ5) over(collgrad) over(married) linegap(30)
legend(rows(1) span)
This example is similar to the previous one, but we have added the linegap(30) option to make the levels of occ5 display on separate lines, making the results more readable. We have also added a legend() option to make the legend display in one row and span the width of the graph.
Uses nlsw.dta & scheme vg_s2c

[Figure: dot plot of mean of tenure (0 to 10) by collgrad within married, with each level of occ5 on its own line; one-row legend: Prof/Mgmt, Sales, Clerical, Labor/Ops, Other]

6.7 Graphing by groups

This section discusses the use of the by() option in combination with graph dot. Normally, you would use the over() option instead of the by() option, but in some cases, the by() option is either necessary or more advantageous. For example, a by() option is useful if you exceed the maximum number of over() options (three if you have a single y-variable or two if you have multiple y-variables). In such cases, the by() option allows you to break your data down by additional categorical variables. The by() option also gives you more flexibility in the placement of the separate panels. For more information about the by() option, see Options : By (272), and for more information about the over() option, see Dot : Yvars and over (193). The examples in this section use the vg_s1m scheme.

graph dot wage, over(collgrad) over(occ5) over(urban2)

Consider this dot graph breaking wages down by three categorical variables. If we wanted to break this down further by another categorical variable, we could not use another over() option since we can have a maximum of three over() options with a single y-variable.
Uses nlsw.dta & scheme vg_s1m

[Figure: dot plot of mean of wage (0 to 15) by occ5 within urban2 (Rural, Metro); legend: not college grad, college grad]


graph dot wage, over(collgrad) over(occ5) over(urban2) by(union)
If we want to break wage down further by union, we can use the by(union) option.
Uses nlsw.dta & scheme vg_s1m

[Figure: panels for nonunion and union; each shows mean of wage (0 to 15) by occ5 within urban2 (Rural, Metro); legend: not college grad, college grad; note: Graphs by union worker]

graph dot tenure ttl_exp, over(occ5) over(urban2)
Consider this dot graph with multiple y-variables, broken down by two categorical variables using two over() options. When you have multiple y-variables, you can have a maximum of two over() options.
Uses nlsw.dta & scheme vg_s1m

[Figure: dot plot of mean of tenure and mean of ttl_exp (0 to 15) by occ5 within urban2 (Rural, Metro)]

graph dot tenure ttl_exp, over(occ5) over(urban2)
by(union)
If we want to break these variables down further by another categorical variable, say union, we can use the by(union) option. Although this example shows only a single variable in the by() option, you can specify multiple variables.
Uses nlsw.dta & scheme vg_s1m

[Figure: panels for nonunion and union; each shows mean of tenure and mean of ttl_exp (0 to 15) by occ5 within urban2; note: Graphs by union worker]
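
The text notes that by() accepts multiple variables. A minimal sketch of that, assuming the nlsw88 dataset that ships with Stata (not the book's nlsw.dta, so urban2 and occ5 are unavailable and south and collgrad stand in):

```stata
* Sketch using the nlsw88 data shipped with Stata (an assumption;
* the book's examples use nlsw.dta instead).
sysuse nlsw88, clear

* Two variables in by(): one panel for each combination of
* union and south that occurs in the data.
graph dot tenure ttl_exp, over(collgrad) by(union south)
```

Observations with a missing value on either by() variable are left out unless the missing option is added inside by().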


graph dot ttl_exp tenure, over(married) over(urban2)
by(union, missing)
We can use the missing option to include a panel for the missing values of union. Note that we changed the first over() variable to married to make the example more readable.
Uses nlsw.dta & scheme vg_s1m

[Figure: panels for nonunion, union, and (missing); each shows mean of ttl_exp and mean of tenure (0 to 15) by married (single, married) within urban2 (Rural, Metro); note: Graphs by union worker]

graph dot ttl_exp tenure, over(married) over(collgrad)
by(union, total)
We can add the total option to include a panel for all observations.
Uses nlsw.dta & scheme vg_s1m

[Figure: panels for nonunion, union, and Total; each shows mean of ttl_exp and mean of tenure (0 to 15) by married within collgrad; note: Graphs by union worker]

graph dot ttl exp tenure, over(married) over(collgrad)
by(union, total cols(1))
We can use the cols(1) option to show
the graphs in one column.
Uses nlsw.dta & scheme vg s1m

nonunion
not college grad
college grad

single
married
single
married

union
single
not college grad married
single
college grad married

Total
single
not college grad married
single
college grad married
0

mean of ttl_exp

5

10

15

mean of tenure

Graphs by union worker
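
The by() suboptions shown above can also be tried without the book's data files. The following is a minimal sketch that assumes the nlsw88 dataset shipped with Stata is an acceptable stand-in for nlsw.dta; it has similarly named variables but is not the book's file, so the resulting graphs will differ from the figures.

```stata
* Sketch: by() suboptions with graph dot, using Stata's shipped nlsw88 data
* (an assumed stand-in for the book's nlsw.dta)
sysuse nlsw88, clear

* One panel per level of union, plus a panel for missing values of union
graph dot ttl_exp tenure, over(married) over(collgrad) by(union, missing)

* Add a Total panel and stack all panels in a single column
graph dot ttl_exp tenure, over(married) over(collgrad) by(union, total cols(1))
```

Each command is written on one line so that it can be pasted directly into the Command window.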


Chapter 7. Pie graphs

This chapter discusses the use of pie charts in Stata. We start by illustrating the different
ways that you can create pie charts in Stata and then show how you can sort the slices in
your pie charts. Next, we show how you can customize the display of individual slices, as well
as control the colors of the pie chart. Then, we demonstrate different ways you can label the
pie slices and how you can control the legends for pie charts. Finally, we discuss how to use
the by() option.

7.1 Types of pie graphs

This section describes different ways to produce pie charts using Stata. Stata allows you
to produce pie charts based on multiple y-variables, with each y-variable corresponding to
a slice. You can also create a pie chart based on a single y-variable broken down by a single
over() variable. Finally, you can create a pie chart with no y-variables broken down by
an over() variable, which counts the number of observations by each level of the over()
variable. For more details, see [G] graph pie. This section uses the vg_s1c scheme.

graph pie poplt5 pop5_17 pop18_64 pop65p

In this syntax, you supply multiple y-variables, and each y-variable corresponds to a slice
in the pie. The first y-variable is the population in the state that is younger than 5 years
old, the next the population 5 to 17 years old, the next 18 to 64 years old, and the last 65
years and older. The entire pie would correspond to the sum of all of these variables across
all states. The first slice then corresponds to the percentage of the total population that is
younger than 5 years old.
Uses allstates.dta & scheme vg_s1c

[Figure: pie chart; legend: Pop, < 5 year; Pop, 5 to 17 years; Pop, 18 to 64 years; Pop, 65 and older]


graph pie pop, over(division)

In this syntax, you supply a single y-variable and an over() option. In this case, the
y-variable corresponds to the population of the state, the entire pie corresponds to the entire
population, and each slice corresponds to the percentage of the population for each level of
division.
Uses allstates.dta & scheme vg_s1c

[Figure: pie chart; legend: N. Eng., Mid Atl, E.N.C., W.N.C., S. Atl., E.S.C., W.S.C., Mountain, Pacific]

graph pie, over(occ7)

For this third example, we switch to the nlsw data file. In this syntax, an over() option is
supplied, but no y-variable is supplied (in a sense, the observation itself serves as the
y-variable). This pie chart is much like a visual frequency distribution of occ7, where the size
of each slice corresponds to the proportion of women in each occupation.
Uses nlsw.dta & scheme vg_s1c

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(union) missing

This example shows the proportion of women in union and nonunion jobs. We add the missing
option, and another pie slice is added for the observations in which union is missing.
Uses nlsw.dta & scheme vg_s1c

[Figure: pie chart; legend: nonunion, union, . (missing)]
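
The three syntaxes described in this section can be sketched side by side. The example below assumes the census and nlsw88 datasets shipped with Stata as stand-ins for allstates.dta and nlsw.dta; the variable pop18_64 is a name created here for illustration, because census.dta stores the population 18 and older (pop18p) rather than an 18-to-64 count.

```stata
* Sketch: the three graph pie syntaxes, using datasets shipped with Stata
* (assumed stand-ins for the book's allstates.dta and nlsw.dta)
sysuse census, clear

* Derive a non-overlapping 18-to-64 group; pop18_64 is a name chosen here
generate long pop18_64 = pop18p - pop65p

* 1. Multiple y-variables: one slice per variable
graph pie poplt5 pop5_17 pop18_64 pop65p

* 2. One y-variable with over(): each slice is a region's share of pop
graph pie pop, over(region)

* 3. No y-variable: slices are observation counts per category
sysuse nlsw88, clear
graph pie, over(race)

* Add a slice for the observations in which union is missing
graph pie, over(union) missing
```

The slice sizes will not match the book's figures, since the stand-in datasets contain different observations.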


7.2 Sorting pie slices

This section describes how you can sort and arrange slices in pie charts. For more details,
see [G] graph pie. This section uses the vg_lgndc scheme, which places the legend at the
left in a single column.

graph pie, over(occ7)

Consider this pie chart showing the number of women who work in these seven different
occupations. The slices are ordered according to the levels of occ7 from 1 to 7, rotating
clockwise, starting with the first slice, which is positioned at 90 degrees.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) noclockwise

With the noclockwise option, you can display the slices in counterclockwise order.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) angle0(0)

With the angle0() option, you can set the angle of the line that begins the first pie slice.
Here, we make the first pie slice begin at 0 degrees.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) sort

The sort option sorts the slices according to their size, from smallest to largest.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Cler., Operat., Mgmt, Labor, Other, Prof, Sales]

graph pie, over(occ7) sort descending

Adding the descending option to the sort option orders the slices from largest to smallest.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Sales, Prof, Other, Labor, Mgmt, Operat., Cler.]
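
The ordering options shown so far can be combined in a single command. This is a small sketch, assuming the nlsw88 dataset shipped with Stata and its race variable as stand-ins for nlsw.dta and occ7.

```stata
* Sketch: combining slice-ordering options (nlsw88 and race are stand-ins)
sysuse nlsw88, clear

* Largest slice first, starting at 0 degrees and running counterclockwise
graph pie, over(race) sort descending angle0(0) noclockwise
```

Using a variable with only a few categories keeps the effect of each option easy to see.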


graph pie, over(occ7) sort(occ7alpha)

Say that we wanted to sort the slices (alphabetically) by occupation name. We have created
a new variable, occ7alpha, that is a recoded version of occ7. It is recoded such that, as
occ7alpha goes from 1 to 7, the occupations are alphabetized (except for Other, which is
placed last). We add sort(occ7alpha), and the slices are ordered alphabetically.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; legend: Cler., Labor, Mgmt, Operat., Prof, Sales, Other]

7.3 Changing the look of pie slices, colors, and exploding

This section describes how to change the color of pie slices, explode pie slices, control
the overall intensity of colors, and control the characteristics of lines surrounding the pie
slices. For more details, see [G] graph pie. This section uses the vg_rose scheme.

graph pie, over(occ7)

Consider this pie chart showing the number of women who work in these seven different
occupations. The slices are colored using the colors indicated by the scheme. None of the
slices are exploded, and no lines surround the slices.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) pie(3, explode)

In this example, we use the pie() option to explode the third pie slice, calling attention
to this slice. By default, it is exploded by 3.8 units.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) pie(3, explode(5) color(cyan))

Here, we specify explode(5) to increase the distance this slice is exploded to 5 units. We
also make the third slice cyan to make it more noticeable. See Styles : Colors (328) for
other colors you could choose.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) pie(3, color(cyan) explode(5)) pie(1, color(gold) explode(2.5))

You can use the pie() option repeatedly. Here, we change the color and explode slices 1
and 3.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) intensity(*1.5)

Using the intensity() option, we make the colors of all of the slices 1.5 times their normal
intensity.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) intensity(*.6)

In this example, we make the intensity of the colors 60% of the normal color.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) line(lcolor(sienna) lwidth(thick))

The line() option can be used to change the characteristics of the lines surrounding the
pie slices. Here, we add the lcolor() (line color) and lwidth() (line width) options to make
the line sienna and thick.
Uses nlsw.dta & scheme vg_rose

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]
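
The slice-appearance options in this section can be used together. The sketch below assumes the nlsw88 dataset shipped with Stata, with race standing in for occ7; the particular slice number, colors, and intensity are arbitrary choices made for illustration.

```stata
* Sketch: pie(), intensity(), and line() in one command
* (nlsw88 and race are stand-ins; run from a do-file because of ///)
sysuse nlsw88, clear

* Explode and recolor slice 2, soften all colors, and outline the slices
graph pie, over(race)                      ///
    pie(2, explode(5) color(cyan))         ///
    intensity(*.8)                         ///
    line(lcolor(sienna) lwidth(thick))
```

The /// line continuation works in do-files; in the Command window, type the command on a single line instead.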


7.4 Slice labels

This section describes how you can label the pie slices. For more details, see [G] graph
pie. For this section, we will use the economist scheme.

graph pie, over(occ7) plabel(_all sum)

Consider this pie chart showing the number of women who work in these seven different
occupations. Here, we use the plabel() (pie label) option to label all slices with the sum,
in this case the frequency of women who work in each occupation. Notice how readable the
labels are because of the pale colors of the pie slices selected by the vg_past scheme. Other
schemes with more intense colors would have made these labels hard to read.
Uses nlsw.dta & scheme economist

[Figure: pie chart with slice labels 305, 317, 286, 264, 246, 102, 726; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) plabel(_all percent)

Using the percent option, we can show the percent of women who work in each occupation.
Uses nlsw.dta & scheme economist

[Figure: pie chart with slice labels 13.58%, 12.73%, 14.11%, 11.75%, 10.95%, 4.541%, 32.32%; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) plabel(_all name)

The name option adds a label that is the name of the occupation.
Uses nlsw.dta & scheme economist

[Figure: pie chart with each slice labeled by occupation name; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) plabel(_all name) legend(off)

When the name option is used, the legend is not as necessary and can be suppressed using
the legend(off) option.
Uses nlsw.dta & scheme economist

[Figure: pie chart with each slice labeled by occupation name; no legend]

graph pie, over(occ7) plabel(1 "Prof=14.11") plabel(3 "Sales=32.32%")

The plabel() option can also be used to put any text that you want into all slices or into
individual slices. Here, we add text to the first and third slices.
Uses nlsw.dta & scheme economist

[Figure: pie chart with slice labels Prof=14.11 and Sales=32.32%; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) plabel(_all percent, format("%2.0f"))

When you use plabel() to label slices with a sum or percent, you can use the format()
option to control the format of the numeric values displayed. Here, we display the
percentages as whole numbers.
Uses nlsw.dta & scheme economist

[Figure: pie chart with slice labels 14%, 14%, 13%, 12%, 11%, 5%, 32%; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) plabel(_all percent, gap(-5))

You can use the gap() option to adjust the position of the label with respect to the center
of the pie. A positive number pushes the label away from the center of the pie, and a negative
value pushes the label closer to the center of the pie.
Uses nlsw.dta & scheme economist

[Figure: pie chart with percentage labels moved toward the center; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) plabel(_all percent, size(large) color(maroon))

You can use textbox options to modify the display of the text labeling the pie slices. Here,
we increase the size of the text and change its color to maroon. See Options : Textboxes
(303) for more options you can use.
Uses nlsw.dta & scheme economist

[Figure: pie chart with large maroon percentage labels; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) plabel(_all name, gap(-5)) plabel(1 "32%", gap(5)) legend(off)

We can include multiple plabel() options. In this example, the first plabel() option assigns
the occupation names to all the slices and moves the names 5 units inward. The second
plabel() option assigns the text "32%" to a single slice and displays it 5 more units from the
center. Since the legend was not needed, we suppressed it with the legend(off) option.
Uses nlsw.dta & scheme economist

[Figure: pie chart with occupation names inside the slices and one slice also labeled 32%; no legend]

graph pie, over(occ7) plabel(_all name, gap(-5)) plabel(_all percent, gap(5) format("%2.0f")) legend(off)

Here, we use the plabel() option twice to label the slices with the occupation name and
with the percentage. We use the gap() option to move the names 5 units closer to the center
and to move the percentages 5 units farther from the center.
Uses nlsw.dta & scheme economist

[Figure: pie chart with occupation names and whole-number percentages inside the slices; no legend]

graph pie, over(occ7) ptext(0 30 "This is some text")

The ptext() (pie text) option can be used to add text to the pie chart. Polar coordinates
are used to determine the location of the text by specifying the angle and distance from the
center. Here, the angle is 0, and the distance from the center is 30.
Uses nlsw.dta & scheme economist

[Figure: pie chart with the added text "This is some text"; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]
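
The labeling options above can be collected into one command for experimentation. This sketch assumes the nlsw88 dataset shipped with Stata, with race standing in for occ7; the gap values mirror those used in the examples above.

```stata
* Sketch: names and percentages inside the slices, with the legend off
* (nlsw88 and race are stand-ins; run from a do-file because of ///)
sysuse nlsw88, clear

graph pie, over(race)                                ///
    plabel(_all name, gap(-5))                       ///
    plabel(_all percent, gap(5) format("%2.0f"))     ///
    legend(off)
```

With only three categories in race, one slice may be too small for both labels to fit comfortably; adjusting gap() or dropping one plabel() option is a reasonable response.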


graph pie, over(occ7) ptext(-10 10 "This is some text")

Here, we choose an angle of −10 (putting it 10 degrees below 0) and a distance of 10 units
from the center. Note that the angle determines only the position of the text but not its
actual angle of display, which is controlled with the orientation() option. See the next
example for more details.
Uses nlsw.dta & scheme economist

[Figure: pie chart with the added text "This is some text"; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) ptext(-10 10 "This is some text", orientation(rvertical) placement(s) box margin(medsmall) bfcolor(sand))

Here, we choose an angle of −10 degrees and a distance of 10 units. We also add a number of
textbox options to make the text reverse vertical, place it to the south of the given
coordinates, and enclose it in a box that has a medium-small margin and is filled with a sand
color. For more information on these kinds of textbox options, see Options : Textboxes (303).
Uses nlsw.dta & scheme economist

[Figure: pie chart with the boxed, reverse-vertical text "This is some text"; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

7.5 Controlling legends

This section illustrates some of the options that you can use to control the display of
legends with pie charts. While this section illustrates the use of legends, it emphasizes
options that may be particularly useful with pie charts. See Options : Legend (287) for more
details about legends; those details apply well to pie charts, even if the examples use other
kinds of graphs. Also, see [G] legend_option for more details. We begin this section using
the vg_brite scheme.


graph pie, over(occ7)

Consider this pie graph showing the frequencies of women in these seven occupational
categories.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) legend(label(1 "Professional"))

We can use the legend(label()) option to change the label for the first occupation.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend: Professional, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) legend(title(Occupation))

We can add the title() option to the legend() option to add a title to the legend. We can
also use the subtitle(), note(), and caption() options, much as we would for adding titles
to a graph; see Standard options : Titles (313) for more details.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend titled Occupation: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]


graph pie, over(occ7) legend(title(Occupation, position(6)))

We can use the position() option within the title() option to control the position of the
title. Here, we put the title in the 6 o’clock position, placing it at the bottom of the legend.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend with the title Occupation at the bottom: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) legend(colfirst)

We can use the legend(colfirst) option to order the items in the legend by columns instead
of rows.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend ordered by columns: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) legend(colfirst order(7 6 5 1 2 3 4) holes(1))

The pie wedges rotate clockwise, and here we make the items within the legend rotate in a
similar clockwise fashion, starting from the top right. The order() option puts the items in
the legend in a clockwise order, and the holes(1) option leaves the first position empty.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; legend arranged clockwise with the first position left empty]


graph pie, over(occ7) legend(position(12) rows(2))

We can use the position() option to control the position of the legend, indicating its
position like the numbers on a clock face; see Styles : Clockpos (330). Here, we put the
legend at the 12 o’clock position, placing it at the top of the chart, and also add the rows(2)
option to make the legend display in two rows.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; two-row legend at the top: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) legend(position(9) cols(1) stack)

Here, we use the same options as those in the last example but use them to place the legend
to the left of the graph (in the 9 o’clock position) and make the legend display in a single
column. We also add the stack option to the previous example to stack the symbol and
descriptive text above each other. This makes an even narrower column, leaving more room
for the pie chart.
Uses nlsw.dta & scheme vg_brite

[Figure: pie chart; single-column stacked legend at the left: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7)

Here, we use the vg_lgndc scheme. Using this scheme places the legend to the left in a
single column with the symbol stacked above the description.
Uses nlsw.dta & scheme vg_lgndc

[Figure: pie chart; single-column stacked legend at the left: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]
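
Several of the legend suboptions from this section can be combined inside one legend() option. The sketch below assumes the nlsw88 dataset shipped with Stata, with race standing in for occ7; the legend title text is an arbitrary choice.

```stata
* Sketch: a titled, single-column, stacked legend at the 9 o'clock position
* (nlsw88 and race are stand-ins for the book's nlsw.dta and occ7)
sysuse nlsw88, clear

graph pie, over(race) legend(title(Race) position(9) cols(1) stack)
```

Placing everything in a single legend() option keeps the command short; the suboptions could equally be spread over repeated legend() options.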


7.6 Graphing by groups

This section describes the use of the by() option with pie charts, focusing on features
that are specifically relevant to pie charts. For more details, see Options : By (272) and
[G] by_option.

graph pie, over(occ7)

Here, we see a basic pie chart showing the distribution of occupations.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie chart; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other]

graph pie, over(occ7) by(union)

In this graph, the occupations are broken down by whether one belongs to a union.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; Graphs by union worker]


graph pie, over(occ7) by(union) pie(2, explode)

If we add the pie(2, explode) option, the second slice is exploded in both graphs.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union; legend: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; Graphs by union worker]

graph pie, over(occ7) by(union) sort

Here, we sort the slices from least frequent to most frequent. Note that separate legends
are shown for each chart. This is because the slices can be ordered differently in the two
graphs when sorted. Thus, when you use the sort option for pie charts, Stata shows two
separate legends to ensure proper labeling of the slices.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union, each with its own legend; Graphs by union worker]

graph pie, over(occ7) by(union, legend(off)) plabel(_all name)

Here, we add the plabel() option to label the inside of each slice with the name of the
slice, so the legend is no longer needed. We suppress the legend with the legend(off) option,
which is placed within the by() option because it, in a way, determines the placement of the
legend by turning it off.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union, with slices labeled by occupation name; no legend; Graphs by union worker]


graph pie, over(occ7) by(union, legend(pos(3))) legend(cols(1) stack)

Here, we place the legend to the right using the legend(pos(3)) option. Note that this
option is contained within the by() option because it alters the position of the legend. We
also make the legend a single column with the legend symbols and labels stacked with the
legend(cols(1) stack) option. Note that this option is outside of the by() option since it
does not determine the position of the legend.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union; single-column legend at the right: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; Graphs by union worker]

graph pie, over(occ7) by(union) legend(pos(3) cols(1) stack) sort

This example is similar to the previous example, but we have added the sort option. Note
that, when we add the sort option, we need to move the pos() option from within the by()
option to outside of the by() option. This is an exception to the general rule that legend
options that control the position of the legend are placed within the by() option. Here, we
get the legends that we desire, each to the right of the pie.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in two panels, nonunion and union, each with its own legend at the right; Graphs by union worker]

graph pie, over(occ7) by(urban3, legend(at(4)))

Here, we break down occupation by a three-level variable, leaving a fourth position open. We
can specify the legend(at(4)) option within the by() option to place the legend in the space
in the fourth position, conserving space on the graph.
Uses nlsw.dta & scheme vg_s2c

[Figure: pie charts in three panels, Rural, Suburb, and Urban, with the legend in the fourth position: Prof, Mgmt, Sales, Cler., Operat., Labor, Other; Graphs by Rural vs. Suburb vs. Urban]
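
The rule about where legend() belongs when by() is used can be tried directly. This sketch assumes the nlsw88 dataset shipped with Stata, with race standing in for occ7; it repeats the pattern of the examples above on that stand-in data.

```stata
* Sketch: legend placement with by() (nlsw88 and race are stand-ins)
sysuse nlsw88, clear

* Position goes inside by(); layout of the legend contents goes outside
graph pie, over(race) by(union, legend(pos(3))) legend(cols(1) stack)

* Label the slices directly and turn the legend off inside by()
graph pie, over(race) by(union, legend(off)) plabel(_all name)
```

The second command is often the more readable choice when the panels are small, because no space is spent on a shared legend.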


Chapter 8. Options available for most graphs

This chapter discusses options that are used in many, but not all, kinds of graphs in
Stata, as compared with the Standard options (313) chapter, which covers options that are
standard in all Stata graphs. This chapter goes into greater detail about how to use these
options to customize your graphs. As you can see from the Visual Table of Contents at
the right, this chapter covers markers, connecting, axis titles, labels, scales, selection, using
the by() option, legends, added text, and textboxes. For further details, the examples will
frequently refer to sections of Styles (327) and to [G] graph.

8.1 Changing the look of markers

This section looks at options that we can use for controlling markers. While the examples
in this section focus on twoway scatter, these options apply to any graph where
you have markers and can control them. This section will show how to change the marker
symbol, marker size, and color (both fill and outline color). For more information, see
[G] marker_options. We will start this section using the vg_s2c scheme.

twoway scatter ownhome borninstate

Consider this scatterplot showing the relationship between the percentage of people in a
state who own their home and the percentage of people born in their state of residence. The
markers used in this plot are filled circles.
Uses allstates.dta & scheme vg_s2c

[Figure: scatterplot; y axis: % who own home; x axis: % born in state of residence]


twoway scatter ownhome borninstate, msymbol(S)

We can control the shape of the marker with the msymbol() (marker symbol) option. Here, we make the symbols large squares.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, msymbol(s)

Specifying msymbol(s), which uses a lowercase s, displays smaller squares.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, msymbol(sh)

We can append an h (i.e., msymbol(sh)) to yield hollow squares. In addition to choosing S for large squares and s for small squares, we can specify D (large diamond), T (large triangle), and O (large circle). We can specify lowercase letters to get smaller versions of these symbols and append h for hollow versions.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence]
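The naming pattern described above (uppercase for large, lowercase for small, a trailing h for hollow) can be tried out in one pass. The following is a minimal sketch rather than an example from the book: it assumes the auto.dta dataset that ships with Stata (loaded with sysuse) in place of allstates.dta, and the graph names g1, g2, and so on are arbitrary.

```stata
* Sketch: draw the same scatterplot with several marker symbols.
* Assumes the auto.dta example dataset that ships with Stata.
sysuse auto, clear

local i = 0
foreach sym in O o Oh S s Sh {
    local ++i
    * Keep each graph in memory under an arbitrary name (g1, g2, ...)
    twoway scatter mpg weight, msymbol(`sym') ///
        title("msymbol(`sym')") name(g`i', replace)
}

* Show the six variants side by side for comparison
graph combine g1 g2 g3 g4 g5 g6
```

Because the sketch uses /// line continuation, it is meant to be run from a do-file rather than typed line by line in the Command window.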


twoway scatter ownhome borninstate, msymbol(X)

We can also specify msymbol(X) to use a large X shape for the markers. We could also use a lowercase x for smaller markers. We cannot append an h since we cannot make a hollow X.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, msymbol(+)

Specifying msymbol(+) yields a plus sign shape for the markers. As with the X, we cannot make these hollow, nor is there a symbol for a smaller version of plus signs.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence]

twoway scatter heatdd cooldd, msymbol(p)

Here, we switch to the citytemp data file to illustrate the use of the msymbol(p) option to plot very small points. Although the individual points are hard to see because they are so small, we can see the overall pattern of the data because of the large number of points and the strong trend in the data. See Styles : Symbols (342) for more information about symbols.
Uses citytemp.dta & scheme vg_s2c
[Graph: Heating degree days vs. Cooling degree days]


twoway scatter ownhome propval100 borninstate

Aside from aesthetics, choosing different marker symbols is useful to differentiate multiple markers displayed in the same plot. In this example, we plot two y-variables, and Stata displays both as solid circles, differing in color.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate, msymbol(t Oh)

To further differentiate the symbols, we add the msymbol(t Oh) option to control both markers. Here, we make the first marker a small triangle and the second a larger hollow circle.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate, msymbol(. Oh)

Using the msymbol(. Oh) option, we can leave the first symbol unchanged (as indicated by the dot) and change the second symbol to a hollow circle. We might think that the dot indicates a small point, but a small point is requested with p, as in msymbol(p).
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]


twoway scatter ownhome borninstate, mlabel(stateab) mlabpos(center)

One last marker symbol is i for invisible, which allows us to hide the marker symbol. In this example, we use the mlabel(stateab) (marker label) option to display a marker label with the state abbreviation for each observation and the mlabpos(center) (marker label position) option to center the marker label. However, the marker symbol (the circle) and the marker label (the abbreviation) are right on top of each other.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with state abbreviations printed over the markers]

twoway scatter ownhome borninstate, mlabel(stateab) mlabpos(center) msymbol(i)

If we use msymbol(i) to make the marker symbol invisible, the marker label (the state abbreviation) can be displayed without being obscured by the marker symbol. See Styles : Symbols (342) for more information about symbols.
Uses allstates.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with state abbreviations in place of markers]

So far, we have seen that the msymbol() option can be used to control the marker symbol and, to a certain extent, can be used to control the marker size (e.g., using O yields large circles, and using o yields small circles). As the following examples show, the msize() option can be used to exert more flexible control over the size of the markers. The following examples will use the vg_s1m scheme.


twoway scatter ownhome borninstate, msymbol(+) msize(small)

Previously, we saw that the size of the symbols created using O, D, S, and T could be modified using an uppercase or lowercase letter. Here, we use the msize() (marker size) option to control the size of the marker symbol, making the marker symbol small. Other values we could have chosen include vtiny, tiny, vsmall, small, medsmall, medium, medlarge, large, vlarge, huge, vhuge, and ehuge.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, msymbol(Oh) msize(*2)

We can specify the sizes as multiples of the original size of the marker. In this example, we make the markers twice their original size by specifying msize(*2). Specifying a value less than one reduces the marker size; e.g., msize(*.5) would make the marker half its normal size. See Styles : Markersize (340) for more details.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate [aweight=propval100], msymbol(oh)

Stata even allows us to size the symbols based on the values of another variable in the data file. This allows us, in a sense, to graph three variables at once. Here, we look at the relationship between borninstate and ownhome and then size the markers based on propval100 using [aweight=propval100], weighting the markers by propval100.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence, markers sized by propval100]
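As a quick way to experiment with weighted markers without the book's data files, the following sketch applies the same idea to the auto.dta dataset that ships with Stata. The choice of mpg, weight, and price is an assumption made purely for illustration; only the [aweight=] and msize() mechanics come from the text above.

```stata
* Sketch: size markers by a third variable (assumes the shipped auto.dta).
sysuse auto, clear

* Marker size reflects price; hollow circles keep overlapping points visible
twoway scatter mpg weight [aweight=price], msymbol(Oh)

* Same weighting, but with a smaller overall marker size
twoway scatter mpg weight [aweight=price], msymbol(Oh) msize(small)
```

Comparing the two graphs shows that msize() sets the general scale while the weights still determine the relative sizes.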


twoway scatter ownhome borninstate [aweight=propval100], msymbol(oh) msize(small)

Even if we weight the size of the markers using aweight, we can still control the general size of the markers. Here, we make all markers smaller using the msize(small) option. The markers are smaller than they were previously but are still sized according to the value of propval100.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence, markers sized by propval100]

twoway scatter ownhome borninstate [aweight=propval100], msymbol(oh) msize(large) mlabel(stateab)

We might even try to graph a fourth variable in the plot by using the mlabel() (marker label) option. Here, we try to use the mlabel(stateab) option to label each marker with the abbreviation of the state. However, note that when we add the mlabel() option, the weights no longer affect the size of the markers. See the following example for a solution to this.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence, markers labeled with state abbreviations]

twoway (scatter ownhome borninstate [aweight=propval100], msymbol(oh) msize(large)) (scatter ownhome borninstate, mlabel(stateab) msymbol(i) mlabpos(center))

We can solve the problem from the previous example by overlaying a scatterplot that has the symbols weighted by propval100 with a scatterplot that shows just the marker labels. The second scatterplot uses the mlabel(stateab) msymbol(i) mlabpos(center) options to label the markers with the state abbreviation. See Options : Marker labels (247) for more details.
Uses allstates.dta & scheme vg_s1m
[Graph: % who own home vs. % born in state of residence, weighted markers with centered state abbreviations]
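The overlay technique (one weighted scatterplot for the symbols, a second with invisible symbols for the labels) can be sketched with the auto.dta dataset that ships with Stata. The variables and the restriction to foreign cars are assumptions chosen to keep the plot readable. This sketch writes the centered position as mlabposition(0), the numeric clock position for the center, where the book's example uses mlabpos(center).

```stata
* Sketch: weighted markers plus centered labels via two overlaid plots.
* Assumes the shipped auto.dta; foreign cars only, to limit clutter.
sysuse auto, clear
keep if foreign

twoway (scatter mpg weight [aweight=price], msymbol(Oh))            ///
       (scatter mpg weight, msymbol(i) mlabel(make)                 ///
            mlabposition(0) mlabsize(vsmall)),                      ///
       legend(off)
```

The legend is switched off because the two plots show the same observations and a legend entry for each would add nothing.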


Stata also allows us to control the color of the markers. We can control the overall color of the marker, create a solid color, or make the inner part of the marker one color (called a fill color) and the outline of the marker a different color. We can also vary the thickness of the outline of the marker. These next examples will use the vg_rose scheme.

twoway scatter ownhome borninstate, mcolor(navy)

The mcolor() (marker color) option can be used to control the color of the markers. Here, we make the markers navy blue using the mcolor(navy) option. See Styles : Colors (328) for more information about specifying colors.
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, mfcolor(ltblue) mlcolor(navy)

We can separately control the fill color (inside color) and outline color with the mfcolor() (marker fill color) and mlcolor() (marker line color) options, respectively. Here, we make the fill color light blue by specifying mfcolor(ltblue) and the line color navy by specifying mlcolor(navy).
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]


twoway scatter ownhome borninstate, mlcolor(black)

We can change the line color surrounding the marker with the mlcolor() option. Here, we specify mlcolor(black) to make the line surrounding the markers black.
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, mfcolor(ltblue)

We can also separately control the fill color using the mfcolor() option. If we choose mfcolor(ltblue), the fill color is light blue.
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, mfcolor(eltgreen) mlcolor(dkgreen) mlwidth(vthick)

We can control the width of the line that surrounds the marker using the mlwidth() (marker line width) option. Here, we make the width very thick by specifying mlwidth(vthick). We can also indicate the thickness as a multiple of the original thickness; e.g., mlwidth(*3) indicates the line should be three times as thick as it would normally be. See Styles : Linewidth (337) for more details.
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]
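The fill color, outline color, and outline width options can be combined in a single command. The sketch below is an illustration only: it assumes the auto.dta dataset that ships with Stata, and the particular colors and sizes are arbitrary choices.

```stata
* Sketch: fill color, outline color, and outline width together.
* Assumes the shipped auto.dta; colors and sizes are arbitrary.
sysuse auto, clear

twoway scatter mpg weight, msymbol(O) msize(large) ///
    mfcolor(ltblue) mlcolor(navy) mlwidth(medthick)
```

Using a fill color that differs from the outline color makes the effect of mlwidth() visible, which matters because an outline in the same color as the fill cannot be distinguished from it.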


twoway scatter ownhome borninstate, mlwidth(medthick)

If we do not specify a different color for the line that outlines the marker (e.g., via the mlcolor() option), we may not see any effect from specifying the mlwidth() option. This is because the color of the line surrounding the marker is the same as the fill color, so we cannot see the effect of modifying the width of the line surrounding the marker, as illustrated here.
Uses allstates.dta & scheme vg_rose
[Graph: % who own home vs. % born in state of residence]

So far, we have focused on controlling the individual elements of markers: the marker symbol, color, size, fill color, line color, and so forth. There is another way to change the appearance of a marker, and that is by specifying a marker style. The marker style controls all these attributes at once, and in some situations, it can be more efficient to use a marker style than to control the elements individually, as we will see in the following examples. The next examples will use the vg_s2m scheme.

twoway scatter ownhome borninstate

The marker styles are named/numbered p1 to p15. The markers in this example are displayed using the p1 style because we are plotting only one y-variable and have not specified a marker style.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home vs. % born in state of residence]


twoway scatter ownhome borninstate, mstyle(p1)

Here, we explicitly select the default marker style using the mstyle(p1) (marker style) option, and the markers look identical to the previous graph.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome borninstate, mstyle(p2)

Here, we use mstyle(p2) to explicitly select the p2 style for displaying the markers, and now the markers are different in size, shape, and color. The markers are now larger diamonds that are a middle-level gray color.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate

Now, if we plot two variables, notice how the first variable is plotted using the p1 style and the second variable is plotted using the p2 style. We would have gotten the same result if we had specified the option mstyle(p1 p2).
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]


twoway scatter ownhome propval100 borninstate, mstyle(p1 p10)

In this graph, we use the mstyle(p1 p10) option to request that the first variable be plotted with the p1 style and the second be plotted with the p10 style. A style is just a starting point, and we can use additional options to modify the markers to suit our taste.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate, mstyle(p1 p10) msize(. medium)

Say that in the previous graph you wanted medium-sized triangles. We can add the msize(. medium) option to control the size of the second marker, leaving the first unchanged. So, even though a style chooses a number of characteristics for the markers, we can override them.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate, mstyle(p1 p1) mfcolor(. white)

In this example, we use the p1 style for both the first and second markers, which are small, dark gray, filled circles. If no other options are specified, the markers for the first variable will be identical to those for the second. But by adding the mfcolor(. white) option, we leave the fill color for the first variable alone and change the second to white. This easily gives us solid and white-filled circles for the two markers.
Uses allstates.dta & scheme vg_s2m
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]
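The pattern of choosing a style as a starting point and then overriding one attribute can be sketched with the auto.dta dataset that ships with Stata. The variables are an assumption for illustration, and the exact look of p1 and p2 depends on whichever scheme is active, so the results will not match the book's vg_s2m figures.

```stata
* Sketch: start from a marker style, then override single attributes.
* Assumes the shipped auto.dta; appearance depends on the active scheme.
sysuse auto, clear

* Both y-variables use style p1; only the second gets a white fill
twoway scatter mpg trunk weight, mstyle(p1 p1) mfcolor(. white)

* Default pairing of styles, with only the second marker resized
twoway scatter mpg trunk weight, mstyle(p1 p2) msize(. large)
```

In both commands the dot in the option list leaves the first plot's attribute as the style defines it.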


twoway scatter ownhome propval100 borninstate, msymbol(. Sh)

Another strategy for controlling the marker symbols is choosing or creating a scheme. The vg_samem scheme makes all markers the same size, shape, color, etc., allowing you to customize them all from a common base. Here, we use the vg_samem scheme, making all markers solid, dark gray circles, but use the msymbol(. Sh) option to make the second symbol hollow squares.
Uses allstates.dta & scheme vg_samem
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

twoway scatter ownhome propval100 borninstate

Say that you wanted the markers to be displayed as outlines filled with white. Rather than specifying the mfcolor() option, you could use the vg_outm scheme, as shown here. Even if you overlaid multiple commands, using this scheme would display the markers, by default, as white-filled outlines.
Uses allstates.dta & scheme vg_outm
[Graph: % who own home and % homes cost $100K+ vs. % born in state of residence]

8.2 Creating and controlling marker labels

This section looks at the details of using marker labels. Marker labels can be used to identify the markers with graph twoway but can also be used with other types of graphs, such as graph matrix and graph box, affecting the outside values. You can even use marker labels in lieu of markers. For more information, see [G] marker_label_options. For this section, we will use the vg_s2c scheme and the allstates3 file, which keeps only the states that are in the South, i.e., those for which region is equal to 3.


twoway scatter ownhome borninstate, mlabel(stateab)

Consider this scatterplot showing the relationship between the percentage of people in a state who own their home and the percentage of people born in their state of residence. We might want to be able to identify some of the observations, and we can use the mlabel() (marker label) option to label the markers with the two-letter abbreviation of the state.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, markers labeled with state abbreviations]

twoway scatter ownhome borninstate, mlabel(stateab) mlabpos(12)

In the previous graph, the marker labels were all at the 3 o'clock position with respect to the markers. We can use the mlabpos() (marker label position) option to give the marker labels a different position. In this example, we place the marker labels in the 12 o'clock position above the markers.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, labels above the markers]

twoway scatter ownhome borninstate, mlabel(stateab) mlabvpos(pos)

There are a few markers whose corresponding marker labels overlap each other. The mlabvpos() (marker label variable position) option allows us to assign a different marker label position for each observation via a variable in the data file. The variable pos has a value of 3, except for states AL, MS, AR, and LA, where pos is 9, 12, 12, and 6, respectively. Note how the marker labels are in the 3 o'clock position, except for AL, MS, AR, and LA, whose labels are in the 9, 12, 12, and 6 o'clock positions, respectively.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, label positions set per observation]
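The text describes the pos variable but not how such a variable might be built. A minimal sketch follows, assuming the auto.dta dataset that ships with Stata in place of allstates3.dta; the variable name clock and the two cars singled out are hypothetical choices made only to show the mechanics.

```stata
* Sketch: build a per-observation clock-position variable for labels.
* Assumes the shipped auto.dta; "clock" is a hypothetical variable name.
sysuse auto, clear
keep if foreign

* Default every label to 3 o'clock, then move selected labels
generate byte clock = 3
replace clock = 9  if make == "VW Diesel"
replace clock = 12 if make == "Honda Civic"

twoway scatter mpg weight, mlabel(make) mlabvposition(clock)
```

In practice, one would draw the graph first, note which labels collide, and then add a replace line for each label that needs to move.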


twoway scatter ownhome borninstate, mlabel(stateab) mlabvpos(pos) mlabsize(small)

We can use the mlabsize() (marker label size) option to control the size of the marker labels. In this example, we make the marker labels small. Some of the sizes you could choose include small, medsmall, medium, medlarge, large, and vlarge; see Styles : Textsize (344) for more options.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with small marker labels]

twoway scatter ownhome borninstate, mlabel(stateab) mlabvpos(pos) mlabsize(*.6)

We can also specify the mlabsize() as a relative size, a multiple of the original size. In this example, the labels are .6 times their normal size.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with marker labels at .6 times normal size]

twoway scatter ownhome borninstate, mlabel(stateab) mlabangle(45)

The mlabangle() (marker label angle) option can be used to control the angle of the marker label. 0 degrees indicates horizontal text, 90 degrees vertical text, 180 degrees reverse horizontal text, and 270 degrees reverse vertical text. You can also specify negative degrees (for example, -90 degrees is the same as 270 degrees). See Styles : Angles (327) for more details.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with marker labels at a 45-degree angle]


twoway scatter ownhome borninstate, mlabel(stateab) mlabpos(7) mlabcolor(red)

The mlabcolor() (marker label color) option controls the color of the marker labels. In this example, we make the marker labels red. See Styles : Colors (328) for more details.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with red marker labels]

twoway scatter ownhome borninstate, mlabel(stateab) mlabpos(7) mlabgap(*3)

The mlabgap() (marker label gap) option controls the gap between the marker and the marker label. In this example, we make the gap three times the size that it would normally be. You can also specify a value less than 1 to place the marker label closer to the marker.
Uses allstates3.dta & scheme vg_s2c
[Graph: % who own home vs. % born in state of residence, with a wider gap between markers and labels]
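The marker label options covered in this section can be combined in one command. The following sketch assumes the auto.dta dataset that ships with Stata; the specific position, size, color, gap, and angle are arbitrary values chosen to make each option's effect easy to spot.

```stata
* Sketch: several marker label options in one command.
* Assumes the shipped auto.dta; option values are arbitrary.
sysuse auto, clear
keep if foreign

twoway scatter mpg weight, mlabel(make) mlabposition(12) ///
    mlabsize(small) mlabcolor(red) mlabgap(*2) mlabangle(45)
```

Removing one option at a time and redrawing is a simple way to see what each contributes.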

8.3 Connecting points and markers

Stata supports a variety of methods for connecting points using different values for the connectstyle. These include l to connect with a straight line, L to connect with a straight line only if the current x-value is greater than the prior x-value, J for stairstep, stepstair for step then stair, and i for invisible connections. For the next few examples, let's switch to using the spjanfeb2001 data file, which keeps just the data for January and February of 2001. These examples of connect styles do not demonstrate how you would normally use these styles but illustrate the different ways you can connect points. See [G] connectstyle for more information. For this section, we will use the vg_blue scheme.


twoway scatter close tradeday

Consider this graph, which shows the closing price of the S&P 500 index for January and February of 2001 by tradeday, the trading day numbered from 1 to 40.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Trading day number]

twoway scatter close tradeday, connect(l)

We use connect(l) to connect the points, but this does not lead to the kind of graph we really wanted to create. This is because the observations in the data file are not sorted according to tradeday, yet the observations are connected based on the order in which they appear in the data file.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Trading day number, points connected in data order]

twoway scatter close tradeday, connect(l) sort

If we add the sort option, the observations are connected after sorting them by tradeday, which leads to the kind of graph we wanted to create. Alternatively, we could have typed sort tradeday before drawing the graph, and all ensuing graphs would have been ordered on tradeday, even without the sort option.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Trading day number, points connected in tradeday order]
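The effect of data order on connected points can be reproduced without the book's file. The sketch below assumes the sp500.dta dataset that ships with Stata (daily S&P 500 data for 2001) and rebuilds something like spjanfeb2001 from it. The way tradeday is constructed and the random shuffle are assumptions made for illustration, not the book's own preparation steps.

```stata
* Sketch: why connect(l) needs sorted data.
* Assumes the shipped sp500.dta; tradeday is rebuilt here, not the book's.
sysuse sp500, clear
keep if date <= mdy(2, 28, 2001)

* Number the trading days while the data are still in date order
sort date
generate tradeday = _n

* Shuffle the observations so the file is no longer sorted
set seed 2001
generate double u = runiform()
sort u

* Points are joined in file order: a tangle of lines
twoway scatter close tradeday, connect(l)

* Points are joined in x order: the expected time series
twoway scatter close tradeday, connect(l) sort
```

The sort option affects only the order used for drawing; the order of observations in memory is left as it was.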


twoway scatter close tradeday, connect(J) sort

You would not normally connect observations for this kind of graph using a stairstep pattern. This connection method, obtained by using the connect(J) option, would more normally be used in a graph showing a survival function over time.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Trading day number, stairstep connection]

twoway scatter close tradeday, connect(stepstair) sort

A connection method related to the one above can be obtained using the connect(stepstair) option.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Trading day number, stepstair connection]

twoway scatter close dom, connect(l) sort

Say that we wanted to show the closing price as a function of the day of the month for the two months for which we have data. In this example, we have the variable dom (day of the month) on the x-axis. If we include the sort option, the data are shown as one continuous line, as opposed to having one line for January and a second line for February.
Uses spjanfeb2001.dta & scheme vg_blue
[Graph: Closing price vs. Day of month]

twoway scatter close dom, connect(l) sort(tradeday)

[Graph: Closing price by Day of month]

We need to sort the observations by tradeday, using the sort(tradeday) option. This graph
is almost what we want, but the observation for January 31 is connected to the observation
for February 1.
Uses spjanfeb2001.dta & scheme vg_blue

twoway scatter close dom, connect(L) sort(tradeday)

[Graph: Closing price by Day of month, one line per month]

This graph is what we wanted to create. The connect(L) option avoids the line connecting
January 31 and February 1 because it connects points only as long as dom is increasing.
When dom decreases from 31 to 1, the connect(L) option does not connect those two points.
See Styles : Connect (332) for more details on connect() options.
Uses spjanfeb2001.dta & scheme vg_blue

twoway scatter close tradeday, connect(l) sort
  clcolor(green) clwidth(thick) clpattern(dash)

[Graph: Closing price by Trading day number, connected by a thick, green, dashed line]

The connect() option determines how the markers are connected but not the color, width, or
pattern of the line. Here, we use the clcolor() (connect line color), clwidth() (connect
line width), and clpattern() (connect line pattern) options to make the line green, thick,
and dashed. See Styles : Colors (328), Styles : Linewidth (337), and
Styles : Linepatterns (336) for more information.
Uses spjanfeb2001.dta & scheme vg_blue
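The connection styles described above can be tried without the book's data files. The following is a minimal sketch, assuming the sp500 dataset that ships with Stata (loaded with sysuse) as a stand-in for spjanfeb2001.dta; the variable dom is created here for illustration and is not the book's variable. The /// line continuations assume the commands are run from a do-file. Recent Stata releases document the line options as lcolor(), lwidth(), and lpattern(), which is what this sketch uses in place of the cl*() names shown above.

```stata
* Stand-in data: S&P 500 daily prices for 2001 (assumption: sysuse sp500)
sysuse sp500, clear
keep if date < td(01mar2001)      // keep January and February 2001 only
generate dom = day(date)          // day of the month, 1 to 31

* connect(l) joins every consecutive pair, including the last January
* observation to the first February observation
twoway scatter close dom, connect(l) sort(date)

* connect(L) joins consecutive points only while dom is increasing,
* so the two months are drawn as separate lines
twoway scatter close dom, connect(L) sort(date)

* The look of the connecting line is set separately from connect()
twoway scatter close dom, connect(L) sort(date) ///
    lcolor(green) lwidth(thick) lpattern(dash)
```

Sorting on the date variable rather than on dom is what keeps the January observations ahead of the February ones.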

8.4 Setting and controlling axis titles

This section provides more details about using axis title options to supply titles for
axes. For more information, see [G] axis_title_options. For this section, we will use
the vg_past scheme.

twoway scatter ownhome propval100

[Graph: % who own home by % homes cost $100K+]

Consider this graph of the percentage of home owners by the percentage of homes that cost
over one hundred thousand dollars. The titles of the x- and y-axes are the names of the
variables, unless the variables are labeled, in which case the default title is the
variable label. In this example, the axes are titled with the variable labels.
Uses allstatesdc.dta & scheme vg_past

twoway scatter ownhome propval100,
  ytitle("Percent of households that own their homes")
  xtitle("Percent of homes that cost over $100,000")

[Graph: Percent of households that own their homes by Percent of homes that cost over $100,000]

We can use the xtitle() and ytitle() options to supply our own titles.
Uses allstatesdc.dta & scheme vg_past

twoway scatter ownhome propval100,
  ytitle("Percent of households that own their homes")
  xtitle("Percent of homes that cost over $100,000", size(small) box)

[Graph: Percent of households that own their homes by Percent of homes that cost over $100,000 (x-axis title small and boxed)]

Because an axis title is considered a textbox, you can use textbox options, as illustrated
here, to control the look of the axis title. Here, we add the size() and box options to
xtitle() to make the x-axis title small with a box around it. See Options : Textboxes (303)
for additional examples of how to use textbox options to control the display of text.
Uses allstatesdc.dta & scheme vg_past

twoway scatter ownhome propval100,
  ytitle("Percent of households" "that own their homes")
  xtitle("Percent of homes" "that cost over $100,000")

[Graph: same axes, with each axis title shown on two lines]

In this example, we supply the same titles but divide them into two separate quoted
strings, which then are displayed on separate lines.
Uses allstatesdc.dta & scheme vg_past

twoway scatter ownhome propval100,
  ytitle("1990 Census Data", suffix)
  xtitle("In 1990 dollars", prefix)

[Graph: y-axis titled "% who own home" followed by "1990 Census Data"; x-axis titled "In 1990 dollars" followed by "% homes cost $100K+"]

We can use the prefix and suffix options to add information before or after the existing
title, respectively.
Uses allstatesdc.dta & scheme vg_past
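The axis title options above can also be tried on a dataset that ships with Stata. This is a minimal sketch, assuming the auto dataset (loaded with sysuse) in place of allstatesdc.dta; the title strings are invented for illustration, and the /// continuations assume a do-file.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Two quoted strings give a two-line title; textbox options such as
* size() and box change the look of a title
twoway scatter mpg weight, ///
    ytitle("Mileage" "(miles per gallon)") ///
    xtitle("Vehicle weight in pounds", size(small) box)

* suffix appends a line to the default title (the variable label)
* and prefix places a line before it
twoway scatter mpg weight, ///
    ytitle("1978 automobile data", suffix) ///
    xtitle("Measured in pounds", prefix)
```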

twoway (scatter rent700 ownhome)
  (scatter propval100 ownhome, yaxis(2))

[Graph: % rents $700+/mo (left y-axis) and % homes cost $100K+ (right y-axis) by % who own home]

Consider this overlaid twoway graph. The two y-variables are both scaled in percentages,
but they have different ranges. We use the yaxis(2) option on the second scatter command
to place that variable on the second y-axis, which is then placed on the right side of the
graph.
Uses allstatesdc.dta & scheme vg_past

twoway (scatter rent700 ownhome) (scatter propval100 ownhome, yaxis(2)),
  ytitle("Percent rents over $700", axis(1))
  ytitle("Percent homes over $100,000", axis(2))

[Graph: Percent rents over $700 (left y-axis) and Percent homes over $100,000 (right y-axis) by % who own home]

Now that we have two y-axes, the ytitle() option would change the y-title for the first
y-axis, unless we specify otherwise. In this example, we supply a ytitle() option with the
axis(1) option to indicate that the title belongs to the first y-axis, and a second
ytitle() option using the axis(2) option to indicate that the second title belongs to the
second y-axis.
Uses allstatesdc.dta & scheme vg_past
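As a rough illustration of titling two y-axes, the following sketch assumes the sp500 dataset that ships with Stata rather than allstatesdc.dta, and plots two series with very different scales. The title strings are invented for illustration, and the /// continuations assume a do-file.

```stata
* Stand-in data: S&P 500 daily prices and volume for 2001 (assumption: sysuse sp500)
sysuse sp500, clear

* yaxis(2) moves the second plot to its own y-axis on the right;
* axis(1) and axis(2) say which y-axis each ytitle() belongs to
twoway (line close date) (line volume date, yaxis(2)), ///
    ytitle("Closing price", axis(1)) ///
    ytitle("Trading volume", axis(2))
```

Without the axis() suboptions, both ytitle() options would apply to the first y-axis.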

8.5 Setting and controlling axis labels

This section gives more details about axis labels, including major and minor (numeric)
labels, major and minor tick marks, and grid lines. This section also shows how to control
the appearance of these objects (e.g., size, color, thickness, or angle). For more
information, see [G] axis_label_options. For this section, we will use the vg_s1c scheme.

twoway scatter propval100 faminc

[Graph: % homes cost $100K+ by 1979 Median Family Inc.]

Let's start with a basic graph showing the percent of homes costing over $100,000 by the
median family income.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, xlabel(#10) ylabel(#10)

[Graph: % homes cost $100K+ (labeled 0 to 90 by 10) by 1979 Median Family Inc. (labeled 14000 to 28000 by 2000)]

Using the xlabel(#10) and ylabel(#10) options, we ask for about 10 values to be labeled on
each axis. Stata chose to use 10 values for the y-axis, labeling it from 0 to 90,
incrementing by 10, and 8 values for the x-axis going from 14,000 to 28,000, incrementing
by 2,000. As you can see from this example, sometimes Stata follows your suggestion
exactly, and sometimes it chooses a different number of values to make more logical labels.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(0(10)100)

[Graph: % homes cost $100K+ (labeled 0 to 100 by 10) by 1979 Median Family Inc.]

We can change the major labels for the y-variable to range from 0 to 100, incrementing by
10, using the ylabel(0(10)100) option.
Uses allstatesdc.dta & scheme vg_s1c
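The 0(10)100 notation used in ylabel() is ordinary Stata numlist shorthand (start, increment in parentheses, end). The following sketch expands it with the numlist command and then applies the same idea to a graph, assuming the auto dataset that ships with Stata in place of allstatesdc.dta.

```stata
* Expand the shorthand to see exactly which values will be labeled
numlist "0(10)100"
display "`r(numlist)'"

* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Explicit values on the y-axis; a request for about 10 values on the x-axis
twoway scatter mpg weight, ylabel(0(10)50) xlabel(#10)
```

The #10 form is only a suggestion to Stata, whereas the numlist form labels exactly the values listed.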

twoway scatter propval100 faminc, xlabel(minmax) ylabel(none)

[Graph: % homes cost $100K+ (no labels or ticks) by 1979 Median Family Inc. (labeled 14591 and 28395 only)]

Here, we use the xlabel(minmax) option to label the x-axis only with the minimum and
maximum and use ylabel(none), so that the y-axis will have no major labels or ticks.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ymlabel(10(20)90)

[Graph: % homes cost $100K+ (major labels 0 to 100 by 20, minor labels 10 to 90 by 20) by 1979 Median Family Inc.]

The default graph had major labels for the y-axis at 0, 20, 40, 60, 80, and 100. We could
add minor labels for the y-variable at 10, 30, 50, 70, and 90 using the ymlabel(10(20)90)
option. The m in ymlabel() stands for minor.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ytick(10(10)90)

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with added major ticks on the y-axis]

The default graph had major ticks for the y-axis at 0, 20, 40, 60, 80, and 100. We can add
major ticks ranging from 10 to 90, incrementing by 10, using the ytick(10(10)90) option.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ymtick(10(20)90)

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with minor ticks on the y-axis]

We can use the ymtick() option to add minor ticks to the graph. For example, here we add
minor ticks at 10, 30, 50, 70, and 90. The m in ymtick() stands for minor.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ymtick(##10)

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with 9 minor ticks between major ticks on the y-axis]

The default graph had major labels for the y-axis at 0, 20, 40, 60, 80, and 100. We can
place 9 minor ticks between major ticks with the ymtick(##10) option. Note that the value
of 10 includes the 9 minor ticks plus the 10th major tick.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(0(10)100, noticks)

[Graph: % homes cost $100K+ (labeled 0 to 100 by 10, no ticks) by 1979 Median Family Inc.]

If we wanted to label the y-axis using values ranging from 0 to 100, incrementing by 10
but suppressing the display of ticks, we could use the noticks option.
Uses allstatesdc.dta & scheme vg_s1c
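The tick options above can be sketched on the auto dataset that ships with Stata (an assumption standing in for allstatesdc.dta). The label values are chosen to suit that dataset and are not from the book.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Major labels every 10 mpg; ##5 requests 4 minor ticks between major
* ticks, following the same counting rule as ##10 above
twoway scatter mpg weight, ylabel(10(10)40) ymtick(##5)

* Minor ticks at explicit values instead
twoway scatter mpg weight, ylabel(10(10)40) ymtick(15(10)35)

* Keep the labels but suppress the ticks that go with them
twoway scatter mpg weight, ylabel(10(10)40, noticks)
```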

twoway scatter propval100 faminc, ylabel(, nolabel)

[Graph: % homes cost $100K+ (ticks only, no labels) by 1979 Median Family Inc.]

We could suppress the labels using the nolabel option, and only the ticks would be shown.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 region

[Graph: % homes cost $100K+ by Census region (labeled 1 to 4)]

If a variable has meaningful value labels, we can display the value labels in place of the
values. For example, we can look at propval100 broken down by census region, but we
do not know which regions correspond to the values 1 to 4.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 region, xlabel(, valuelabels)

[Graph: % homes cost $100K+ by Census region (labeled NE, N Cntrl, South, West)]

If we include the xlabel(, valuelabels) option, the value labels are displayed instead,
making the graph much easier to understand.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 region,
  xlabel(1 "NorthEast" 2 "NorthCentral" 3 "South" 4 "West")

[Graph: % homes cost $100K+ by Census region (labeled NorthEast, NorthCentral, South, West)]

If region were not labeled, or if we wanted different labels, we could indicate those
labels using the xlabel() option, as illustrated here.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, xlabel(, format(%8.0gc))

[Graph: % homes cost $100K+ by 1979 Median Family Inc. (labeled 15,000 to 30,000)]

We can change the formatting of the labels using the format() option, just as we would
using a format statement. In this example, we format income using a comma format to make
the larger numbers more readable.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(, angle(0))

[Graph: % homes cost $100K+ (labels shown horizontally) by 1979 Median Family Inc.]

We can change the angles of the labels from their default orientation. By default, the
values on the y-axis are shown at a 90-degree angle, but we can use the
ylabel(, angle(0)) option to display the labels without rotation.
Uses allstatesdc.dta & scheme vg_s1c
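The label text, format, and angle options above can be sketched with the auto dataset that ships with Stata (an assumption; the book uses allstatesdc.dta). In that dataset, foreign is coded 0 and 1 and carries value labels, which makes it a convenient stand-in for region; the replacement label strings are invented for illustration.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Show the variable's own value labels at the values 0 and 1
twoway scatter mpg foreign, xlabel(0 1, valuelabel)

* Or supply the label text directly
twoway scatter mpg foreign, xlabel(0 "Made in U.S." 1 "Imported")

* Comma format on the x-axis labels and unrotated y-axis labels
twoway scatter mpg price, xlabel(, format(%9.0fc)) ylabel(, angle(0))
```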

twoway scatter propval100 faminc, xlabel(15000(1000)30000, angle(45))

[Graph: % homes cost $100K+ by 1979 Median Family Inc. (labeled 15000 to 30000 by 1000, at a 45-degree angle)]

If we label an axis with a large number of values (and especially with wide values), the
labels may crowd each other and overlap. Here, we label the x-axis from 15000 to 30000 in
increments of 1000. To avoid overlapping, we add the angle(45) option to show the labels
at a 45-degree angle.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, xlabel(15000(1000)30000, alternate)

[Graph: % homes cost $100K+ by 1979 Median Family Inc. (labels in two alternating rows: 15000, 17000, ..., 29000 and 16000, 18000, ..., 30000)]

We can also avoid overlapping the axis labels by adding the alternate option to xlabel().
The labels are now displayed in two alternating rows, so they are not crowded or
overlapped.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(0(5)90, labsize(vsmall))

[Graph: % homes cost $100K+ (labeled 0 to 90 by 5, very small labels) by 1979 Median Family Inc.]

We can control the size of labels with the labsize() option. For example, we might want to
label our y-axis from 0 to 90, incrementing by 5. The labels would ordinarily overlap, but
if we add the labsize(vsmall) option, the very small labels no longer overlap.
Uses allstatesdc.dta & scheme vg_s1c
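The three remedies for crowded labels described above can be compared side by side. This is a minimal sketch, assuming the auto dataset that ships with Stata instead of allstatesdc.dta; the label ranges are chosen to fit that dataset.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Many wide labels on the x-axis, rotated 45 degrees to avoid overlap
twoway scatter mpg weight, xlabel(1500(250)5000, angle(45))

* The same labels, staggered in two alternating rows
twoway scatter mpg weight, xlabel(1500(250)5000, alternate)

* Dense y-axis labels made very small
twoway scatter mpg weight, ylabel(10(2)42, labsize(vsmall))
```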

twoway scatter propval100 faminc, ylabel(, labgap(*5))

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with a wider gap between y-axis labels and ticks]

We can control the gap between the label and the tick with the labgap() option. In this
example, we increase the gap between the y-labels and the y-ticks to five times the
original size.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc,
  ylabel(, tlength(*1.5) tlwidth(*3) tposition(crossing))

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with longer, wider ticks crossing the y-axis]

You can control the tick length with the tlength() option, the tick line width with the
tlwidth() option, and the tick position with the tposition() option. In this example, we
make the tick length 1.5 times normal and the width three times normal, with the ticks
crossing the y-axis.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc,
  ytick(0(10)100, tposition(outside))
  ymtick(5(10)95, tposition(inside))

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with major ticks outside and minor ticks inside the plot region]

In this example, we place major ticks from 0 to 100, incrementing by 10, locating the
ticks on the outside of the plot, and place minor ticks from 5 to 95, incrementing by 10,
placing the ticks on the inside of the plot region.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(, nogrid)

[Graph: % homes cost $100K+ by 1979 Median Family Inc., without grid lines]

We can use the grid and nogrid options to display or suppress the display of grid lines
corresponding to the labels and ticks associated with the ylabel(), ymlabel(), ytick(), or
ymtick() options (this also applies to xlabel(), xmlabel(), xtick(), or xmtick()). Say
that we want to suppress the grid on the y-axis. We can do this with the
ylabel(, nogrid) option.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc, ylabel(, grid) xlabel(, grid)

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with grid lines on both axes]

If we want a grid to be displayed for the values that correspond to the ylabel() and the
xlabel() options, we can specify the grid option, as shown in this example. Depending on
the scheme you choose, grids may be included or omitted by default.
Uses allstatesdc.dta & scheme vg_s1c

twoway scatter propval100 faminc,
  ylabel(, grid glwidth(vthin) glcolor(gs10) glpattern(shortdash))

[Graph: % homes cost $100K+ by 1979 Median Family Inc., with very thin, gray, short-dashed grid lines]

You can control the grid line width, grid line color, and grid line pattern with the
glwidth(), glcolor(), and glpattern() options. In this example, we make the grid line very
thin, the color gray (gs10), and the pattern of the lines short dashes. See
Styles : Linewidth (337), Styles : Colors (328), and Styles : Linepatterns (336) for
additional details.
Uses allstatesdc.dta & scheme vg_s1c
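The grid options above can be combined so that major and minor labels get differently styled grid lines. The following is a rough sketch, assuming the auto dataset that ships with Stata instead of allstatesdc.dta; the label values and gray shades are chosen for illustration, and the /// continuations assume a do-file.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Turn the y-axis grid off, whatever the scheme's default is
twoway scatter mpg weight, ylabel(, nogrid)

* Solid, darker grid lines at the major labels and lighter,
* short-dashed grid lines at the minor labels
twoway scatter mpg weight, ///
    ylabel(10(10)40, grid glwidth(vthin) glcolor(gs8) glpattern(solid)) ///
    ymlabel(15(10)35, grid glcolor(gs11) glpattern(shortdash))
```

Specifying grid explicitly makes the result independent of whether the current scheme draws grids by default.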

twoway scatter propval100 faminc,
  ylabel(0(20)100, grid glcolor(gs8) glpattern(solid))
  ymlabel(10(20)90, grid glcolor(gs11) glpattern(shortdash))

[Graph: % homes cost $100K+ (major labels 0 to 100 by 20, minor labels 10 to 90 by 20) by 1979 Median Family Inc., with two styles of grid lines]

We can use different kinds of grid lines for the major and minor axis labels. In this
example, we have a solid, darker gray line for the major axis labels and a lighter gray,
short, dashed line for the minor axis labels. We include the grid option to ensure that
the grid is displayed.
Uses allstatesdc.dta & scheme vg_s1c

8.6 Controlling axis scales

This section provides more details about axis scale options, which allow us to control
whether an axis is displayed, where it is displayed, the direction in which it is
displayed, and the scale of the axis. For more information about these options, see
[G] axis_scale_options. This section begins by using data on the S&P 500 from January 2,
2001, to December 31, 2001, stored in the file sp2001. For simplicity, we will use
tradeday on the x-axis, representing the trading day of the year. For this section, we
will use the vg_s2m scheme.

twoway rspike high low tradeday

[Graph: High price/Low price by Trading day number]

First, consider this rspike graph, which shows the high and low prices across 248 trading
days.
Uses sp2001.dta & scheme vg_s2m

twoway rspike high low tradeday, xscale(off)

[Graph: High price/Low price, with no x-axis displayed]

If we wish, we could remove the display of the x-axis entirely with the xscale(off)
option. Although it is not shown, the same could be done for the y-axis if we were to use
the yscale(off) option. This is not normally an option we would use, but it can be useful
for combining multiple graphs on the same scale without having to show the scale on some
of the graphs.
Uses sp2001.dta & scheme vg_s2m

twoway rspike high low tradeday, xscale(alt)

[Graph: High price/Low price by Trading day number, with the x-axis at the top of the graph]

We could shift the display of the x-axis from the bottom of the graph to the top of the
graph with the xscale(alt) option. Likewise, we could have chosen to supply the
yscale(alt) option to shift the y-axis from the left to the right.
Uses sp2001.dta & scheme vg_s2m

twoway rspike high low tradeday, xscale(reverse)

[Graph: High price/Low price by Trading day number, with the x-axis running from 250 down to 0]

We can reverse the scale of the x-axis by specifying the xscale(reverse) option, as
illustrated here. We can reverse the y-axis by indicating the yscale(reverse) option.
Uses sp2001.dta & scheme vg_s2m

twoway scatter educ popden, xscale(log)

[Graph: average education level by Pop/10 sq. miles (log scale; labels 2000 to 10000 overlap)]

We briefly return to the allstates file to illustrate the xscale(log) option. The
xscale(log) option indicates that the x-axis should be displayed on a log scale. Note that
the labels for the x-axis overlap each other.
Uses allstates.dta & scheme vg_s2m

twoway scatter educ popden, xscale(log)
  xlabel(1 10 100 1000 10000)

[Graph: average education level by Pop/10 sq. miles (log scale; labeled 1, 10, 100, 1000, 10000)]

Here, we use the xlabel() option to change the labels for the x-axis using the values 1,
10, 100, 1000, and 10,000, and you can see how these powers of 10 are more equally spaced,
reflecting the log scale of the x-axis.
Uses allstates.dta & scheme vg_s2m

twoway rspike high low tradeday, xscale(lwidth(thick))

[Graph: High price/Low price by Trading day number, with a thick x-axis line]

We now return to the sp2001 data. You can use the xscale() and yscale() options to control
the axis lines. In this example, we make the x-axis line thick by specifying
xscale(lwidth(thick)).
Uses sp2001.dta & scheme vg_s2m
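The axis scale options above can be tried on the auto dataset that ships with Stata. This is a minimal sketch under that assumption (the book uses allstates.dta and sp2001.dta); the label values are chosen to suit the price variable in that dataset.

```stata
* Stand-in data: 1978 automobile data (assumption: sysuse auto)
sysuse auto, clear

* Log-scaled x-axis with labels chosen by hand so they do not overlap
twoway scatter mpg price, xscale(log) xlabel(3000 5000 10000 15000)

* Reverse the direction of the x-axis
twoway scatter mpg price, xscale(reverse)

* Move the x-axis to the top of the graph and thicken its line
twoway scatter mpg price, xscale(alt lwidth(thick))
```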

twoway rspike high low tradeday, xscale(off noline)

[Graph: High price/Low price, with no x-axis line]

We could suppress the display of the x-axis line completely by using the xscale(noline)
option.
Uses sp2001.dta & scheme vg_s2m

twoway rspike high low tradeday, yscale(range(700 1400))

[Graph: High price/Low price (labeled 900 to 1400) by Trading day number, with the y-axis scale extended down to 700]

The yscale(range()) option can be used to expand the scale of the y-axis without needing
to expand the labels for the axis (as the ylabel() option would). In this example, we have
expanded the range of the y-axis to run from 700 to 1400. However, this example does not
show the real utility of this option. Note that range() can only be used to expand the
scale, not contract it.
Uses sp2001.dta & scheme vg_s2m

twoway (rspike high low tradeday)
  (line volmil tradeday, sort yaxis(2))

[Graph: High price/Low price (left y-axis) and Volume (millions) (right y-axis) by Trading day number]

Consider that, in addition to the spike graph that shows the high and low values for a
given trading day, we wish to see the volume for a given trading day. We can combine the
plots into a single graph, but this is difficult to read because the two plots overlap.
Uses sp2001.dta & scheme vg_s2m
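The point that range() can expand but not contract an axis can be checked directly. The following sketch assumes the sp500 dataset that ships with Stata as a stand-in for sp2001.dta and uses its date variable on the x-axis in place of tradeday.

```stata
* Stand-in data: S&P 500 daily prices for 2001 (assumption: sysuse sp500)
sysuse sp500, clear

* The scale is stretched down to 700, leaving empty space below the spikes
twoway rspike high low date, yscale(range(700 1400))

* A range narrower than the data does not trim the axis; the scale
* still covers all the plotted values
twoway rspike high low date, yscale(range(1100 1200))
```

To show only part of the data, one would restrict the observations (for example, with an if qualifier) rather than rely on range().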

twoway (rspike high low tradeday)
  (line volmil tradeday, sort yaxis(2)),
  yscale(range(700 1400) axis(1)) yscale(range(0 10) axis(2))

[Graph: High price/Low price (left y-axis) in the upper part and Volume (millions) (right y-axis) in the lower part, by Trading day number]

This example shows the utility of the yscale(range()) option. The
yscale(range(700 1400) axis(1)) option sets the range of price to be from 700 to 1400,
shifting that series up to the upper third of the graph. The yscale(range(0 10) axis(2))
option sets the range of volume to occupy the lower third of the graph.
Uses sp2001.dta & scheme vg_s2m

twoway (rspike high low tradeday)
  (line volmil tradeday, sort yaxis(2)),
  yscale(range(700 1400) axis(1)) yscale(range(0 10) axis(2))
  ylabel(1000 1200 1400, axis(1)) ylabel(0 1 2, axis(2))

[Graph: High price/Low price (labeled 1000, 1200, 1400) and Volume (millions) (labeled 0, 1, 2) by Trading day number]

Because we manipulated the scale of the y-axes, the labels were pushed together. We can
add the ylabel(1000 1200 1400, axis(1)) and ylabel(0 1 2, axis(2)) options to the previous
example to make the labels for the y-axes more readable.
Uses sp2001.dta & scheme vg_s2m

8.7 Selecting an axis

This section provides more details about how to select different axes and modify them. By
default, any modifications you make to an axis are applied to the first axis, so you need
to take extra action to modify other axes that you may create. For more information about
these options, see [G] axis_selection_options. For this section, we will use the vg_outc
scheme.

twoway scatter faminc educ

[Graph: 1979 Median Family Inc. by average education level]

This section focuses on the options that we can use to select axes and shows examples of
graphing multiple variables in a single graph. This graph shows the relationship between
one x-variable, educ, and one y-variable, faminc.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ, xaxis(1) yaxis(1))

[Graph: 1979 Median Family Inc. by average education level]

By default, the x-variable is placed on the first x-axis, and the y-variable is placed on
the first y-axis. It is as though you had added the options xaxis(1) and yaxis(1), as
illustrated here. Note that we add parentheses to emphasize that the options xaxis(1) and
yaxis(1) belong to the scatter command and are not general options for the overall graph,
which would appear after the parentheses.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ)
  (scatter workers2 educ)

[Graph: 1979 Median Family Inc. and % HHs with 2+ workers (same y-axis, 0 to 30000) by average education level]

Now let's overlay a second scatterplot showing workers2 by educ, which has the effect of
adding a second variable to the y-axis. Stata assumes that all variables are on the first
(and thus, the same) axis, unless we specify otherwise. As a result, this graph is hard to
read because faminc is scaled very differently from workers2 but scaled on the same axis.
Uses allstatesdc.dta & scheme vg_outc

The electronic form of this book is solely for direct use at UCLA and only by faculty, students, and staff of UCLA.
All rights reserved on the copyright page apply to this document and specifically neither the electronic nor
published form of the book may be distributed or reproduced, either electronically or in printed form.

i
i

i
i

i

i

i

i
8.7

Selecting an axis

271

twoway (scatter faminc educ, yaxis(1))
(scatter workers2 educ, yaxis(2))

[Graph: y-axes "1979 Median Family Inc." and "% HHs with 2+ workers"; x-axis "average education level..."]

Stata permits you to have multiple axes for the x-variables and the y-variables. In this example, we use the yaxis(1) option to place faminc on the first y-axis and the yaxis(2) option to place workers2 on the second y-axis. To make the graph more readable, Stata moved the second y-axis over to the right side. Note that the yaxis(1) option was not needed but was included for clarity.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ)
(scatter workers2 educ, yaxis(2)), ylabel(40(5)80, axis(2))

[Graph: y-axes "1979 Median Family Inc." and "% HHs with 2+ workers"; x-axis "average education level..."]

Suppose you want to label workers2 from 40 to 80 in increments of 5. Since workers2 is on the second y-axis, you would specify ylabel(40(5)80, axis(2)). Without the axis(2) option, Stata would assume that you are referring to the first y-axis and would change the scaling of faminc.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ)
(scatter workers2 educ, yaxis(2) ylabel(40(5)80))

[Graph: y-axes "1979 Median Family Inc." and "% HHs with 2+ workers"; x-axis "average education level..."]

You might be tempted to enter the ylabel() option as an option of the second scatter statement and expect the ylabel() to modify the scaling of workers2. However, we can see in this example that this does not work.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ)
(scatter workers2 educ, yaxis(2)), ylabel(40(5)80, axis(1))

[Graph: y-axes "1979 Median Family Inc." and "% HHs with 2+ workers"; x-axis "average education level..."]

ylabel() is really an overall option. When it is placed among the options of a single plot, Stata is willing to pretend that you specified it globally, as though you had typed ylabel() as a global option, as in this example. To make this clearer, we have added the default axis(1) to ylabel() to illustrate why this usage does not change the second y-axis.
Uses allstatesdc.dta & scheme vg_outc

twoway (scatter faminc educ)
(scatter workers2 educ, yaxis(2)),
ytitle("Family income", axis(1)) ytitle("Two+ workers", axis(2))

[Graph: y-axes titled "Family income" and "Two+ workers"; x-axis "average education level..."; legend "1979 Median Family Inc." and "% HHs with 2+ workers"]

These same rules apply to modifying the axis titles and labeling. In this example, we use the ytitle() option to change the titles for the first and second y-axes.
Uses allstatesdc.dta & scheme vg_outc
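The axis-selection rules in this section can be tried without the book's data. The following is a minimal sketch for a do-file, assuming auto.dta (which ships with Stata) in place of allstatesdc.dta; the variables mpg, price, and weight, the label ranges, and the axis titles are our own illustrative choices, not values from the book.

```stata
* Sketch only: auto.dta stands in for the book's allstatesdc.dta
sysuse auto, clear

* mpg stays on the first y-axis; price is moved to a second y-axis.
* Each overall option names the axis it modifies with axis().
twoway (scatter mpg weight, yaxis(1))       ///
       (scatter price weight, yaxis(2)),    ///
       ylabel(10(10)40, axis(1))            ///
       ylabel(0(5000)15000, axis(2))        ///
       ytitle("Mileage", axis(1))           ///
       ytitle("Price", axis(2))
```

Dropping axis(2) from the second ylabel() would apply those labels to the first y-axis instead, which is the behavior the examples above describe. The /// line continuations work in do-files but not when typing interactively.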

8.8 Graphing by groups

This section provides more details about repeating graphs using the by() option to show separate graphs for each by-group. For more information, see [G] by option. For this section, we will use the vg_brite scheme.

twoway scatter ownhome borninstate

[Graph: y-axis "% who own home"; x-axis "% born in state of residence"]

We start by looking at a scatterplot of ownhome and borninstate, and we see a general positive relationship such that the higher the percentage of those who were born in the state, the higher the percentage of home owners in the state.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north)

[Graph: panels "S&W" and "North"; y-axis "% who own home"; x-axis "% born in state of residence"; note "Graphs by Region North or Not"]

We can use the by(north) option to look at this relationship broken down by whether the state is considered to be in the North.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total)

[Graph: panels "S&W", "North", and "Total"; y-axis "% who own home"; x-axis "% born in state of residence"; note "Graphs by Region North or Not"]

We can use the total option to see the overall relationship for all 50 states, as well as the two plots separately, by the levels of north.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total colfirst)

[Graph: panels "S&W", "North", and "Total", filled down columns first]

We can add the colfirst option to show the graphs going down columns first rather than going across rows first, which is the default.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total holes(2))

[Graph: panels "S&W", "North", and "Total", with the second position left empty]

The holes(2) option leaves the second position empty. Here, we specify a single position to leave empty, but you can specify multiple positions within the holes() option.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total rows(1))

[Graph: panels "S&W", "North", and "Total" in one row]

The rows(1) option indicates that the graph should be displayed in one row.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total cols(1))

[Graph: panels "S&W", "North", and "Total" in one column]

The cols(1) option shows the graph in a single column.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total iscale(*1.5))

[Graph: panels "S&W", "North", and "Total" with enlarged text and markers]

Sometimes when you use the by() option, the graph can become small, making the text and symbols difficult to see. You can use the iscale() option to magnify the size of these elements. In this example, we increase the size of these elements by a factor of 1.5.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total compact)

[Graph: panels "S&W", "North", and "Total" pushed tightly together]

The compact option displays the graph using a compact style, pushing the graphs tightly together. This is almost the same as specifying style(compact).
Uses allstatesdc.dta & scheme vg_brite
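The layout suboptions shown so far can be combined inside a single by() option. The sketch below assumes auto.dta (shipped with Stata) with foreign as the by-variable, standing in for north in allstatesdc.dta; the particular combinations are our own illustration rather than examples from the book.

```stata
* Sketch only: auto.dta and by(foreign) stand in for the book's data
sysuse auto, clear

* One panel per level of foreign plus a Total panel, all in one row
twoway scatter mpg weight, by(foreign, total rows(1))

* The same panels stacked in one column, drawn compactly, with
* text and markers magnified by a factor of 1.5
twoway scatter mpg weight, by(foreign, total cols(1) compact iscale(*1.5))
```

All of these suboptions go inside by(), after the comma, because they control how the set of graphs is arranged rather than how any one graph is drawn.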

twoway scatter ownhome borninstate, by(north, total noedgelabel)

[Graph: panels "S&W", "North", and "Total"; no x-axis under the "North" panel]

The noedgelabel option suppresses the display of the x-axis for the graphs that do not appear on the bottom row, in this case the graph for the North.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, yrescale)

[Graph: panels "S&W" and "North", each with its own y-axis scale]

The yrescale option allows the y-variables to be scaled independently for each by-group.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, xrescale)

[Graph: panels "S&W" and "North", each with its own x-axis scale]

Likewise, the xrescale option allows the x-variable to be scaled differently across all the by-groups.
Uses allstatesdc.dta & scheme vg_brite
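To see the effect of the two rescaling options side by side, the following sketch again assumes auto.dta with by(foreign) in place of the book's allstatesdc.dta and by(north); the variable choices are ours.

```stata
* Sketch only: auto.dta and by(foreign) stand in for the book's data
sysuse auto, clear

* Each panel gets its own y-axis scale; the x-axis scale is shared
twoway scatter mpg weight, by(foreign, yrescale)

* Each panel gets its own x-axis scale; the y-axis scale is shared
twoway scatter mpg weight, by(foreign, xrescale)
```

Independent scales make each panel easier to read on its own but harder to compare with its neighbors, so the choice depends on which comparison matters more.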

twoway scatter ownhome borninstate, by(north, rescale)

[Graph: panels "S&W" and "North", each with its own x- and y-axis scales]

If you want both the x-variable and y-variable to be scaled differently across the by-groups, you can use the rescale option, and both axes are separately rescaled.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, iyaxes)

[Graph: panels "S&W" and "North", each with its own y-axis displayed]

You can use the iyaxes option so the y-axes for each individual graph will be displayed.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, cols(1))

[Graph: panels "S&W" and "North" in one column; x-axis shown only under the bottom panel]

Likewise, the ixaxes option displays the x-axis for all graphs. In this graph, we omit this option: when two graphs are displayed in a single column, Stata draws the top graph without its x-axis.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, ixaxes cols(1))

[Graph: panels "S&W" and "North" in one column, each with its own x-axis]

We now include the ixaxes option and see that the x-axis is now displayed on the top graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total iytitle)

[Graph: panels "S&W", "North", and "Total"; y-title "% who own home" on the panels that have a y-axis]

We can display the title for each y-axis using the iytitle option.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total iyaxes iytitle)

[Graph: panels "S&W", "North", and "Total", each with its own y-axis and y-title "% who own home"]

Note that the y-title is not displayed for the North since the y-axis is omitted for that graph. If we include the iyaxes and iytitle options, the y-axis and y-title are displayed for that graph as well.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, total ixaxes ixtitle)

[Graph: panels "S&W", "North", and "Total", each with its own x-axis and x-title "% born in state of residence"]

Likewise, we can display the x-title on each graph using the ixaxes and ixtitle options.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north) title("My title")

[Graph: panels "S&W" and "North", each titled "My title"]

If we include a title() option alongside by(), but outside it, Stata creates each graph separately using the title we specify, so the title is repeated on every graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, title("My title"))

[Graph: panels "S&W" and "North" under one overall title "My title"]

If we make the title() an option within the by() option, Stata will make this an overall title for the graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, title("By title"))
title("Regular title")

[Graph: overall title "By title"; panels "S&W" and "North", each titled "Regular title"]

This example should help you to understand how these two types of titles work. When the title() is used overall, it applies to all graphs that are created because it is repeated via the by() option. The by(title()) is applied after all smaller graphs are created, providing an overall title for the graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north)
caption("Regular caption")

[Graph: panels "S&W" and "North", each with the caption "Regular caption"]

Stata treats the caption() option the same way that it treats titles. Here, we include an overall caption, which is displayed with each graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, caption("By caption"))

[Graph: panels "S&W" and "North"; one caption "By caption" beneath the full graph]

When we include the caption() inside the by() option, it is displayed as a caption for the full graph.
Uses allstatesdc.dta & scheme vg_brite
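The distinction between titles and captions placed inside and outside by() can be checked in a single command. This is a sketch under the assumption that auto.dta (shipped with Stata) and by(foreign) are acceptable stand-ins for the book's data; the title and caption strings are arbitrary.

```stata
* Sketch only: auto.dta and by(foreign) stand in for the book's data
sysuse auto, clear

* Inside by():  one title and one caption for the combined graph.
* Outside by(): a title and a caption repeated on every panel.
twoway scatter mpg weight,                                          ///
       by(foreign, title("By title") caption("By caption"))        ///
       title("Regular title") caption("Regular caption")
```

Reading the result from the outside in, "By title" and "By caption" should each appear once, while "Regular title" and "Regular caption" should appear on both panels.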

twoway scatter ownhome borninstate, by(north)
subtitle("This is a subtitle")

[Graph: two panels, each headed "This is a subtitle" instead of the by-group name]

Stata treats the subtitle() option differently from the title() and caption() options. Here, we include a subtitle() option, and we see that it has replaced the title above each graph that represented the names of the by-group.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north)
subtitle("Region of state", prefix)

[Graph: panels headed "Region of state" above the by-group names "S&W" and "North"]

We can use the subtitle() option to add more labeling to the by-group names. Here, we use the prefix option to insert text that appears in the subtitle before the name of the by-group.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north)
subtitle("Region of state", suffix)

[Graph: panels headed by the by-group names "S&W" and "North" with "Region of state" beneath]

We can use the suffix option to insert text that appears in the subtitle after the name of the by-group.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north)
subtitle("State's location", prefix)
subtitle("Based on Region", suffix)

[Graph: panels headed "State's location", then the by-group name ("S&W" or "North"), then "Based on Region"]

We can even combine the prefix and suffix options to insert text before and after the label of the by-group.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, subtitle("This is a subtitle"))

[Graph: panels "S&W" and "North" under one overall subtitle "This is a subtitle"]

When used as an option within the by() option, the subtitle() option works just like the title() and caption() options, placing a subtitle on the overall graph.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north) note("Regular note")

[Graph: panels "S&W" and "North", each with the note "Regular note" beneath]

Stata treats the note() option much as it does the title(), caption(), and subtitle() options. Here, we include a note() option and see that it is shown beneath both graphs.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate, by(north, note("By note"))

[Graph: panels "S&W" and "North"; the note "By note" replaces "Graphs by Region North or Not"]

If we include the note() option within the by() option, we see that our note overrides the note that Stata provided to indicate that the graphs were separated by the variable north.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, note("North N=21, Not North N=29", suffix))

[Graph: panels "S&W" and "North"; note "Graphs by Region North or Not" followed by "North N=21, Not North N=29"]

As with the subtitle() option, we can use the prefix or suffix option to add our own text before or after the existing note.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, total) subtitle(, position(11))

[Graph: panels "S&W", "North", and "Total" with the by-group names at the 11 o'clock position]

Previously, we saw that the subtitle() option could be used to modify the by-group names above each graph. We can also use the subtitle(, position()) option to modify the placement of this text. Here, we move the text to appear in the 11 o’clock position.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, total) subtitle(, pos(5) ring(0) nobexpand)

[Graph: panels "S&W", "North", and "Total" with the by-group names inside the bottom right corner of each plot region]

We can place the name of the by-group in the bottom right corner of each graph using the subtitle() option. The options pos(5) and ring(0) move the subtitle to the 5 o’clock position and inside the plot region. The nobexpand (no box expand) option prevents the by-group name from expanding to consume the entire plot region.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, total title("My title", ring(0) position(5)))

[Graph: panels "S&W", "North", and "Total"; "My title" in the bottom right corner, inside the plot region]

We can also use the ring() and pos() options with title(), note(), and caption() to alter their placement. Here, we use position(5) to put the title in the bottom right corner and ring(0) to locate it inside the plot region.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, total title("My title", position(5)))

[Graph: panels "S&W", "North", and "Total"; "My title" at the bottom right, outside the plot region]

The previous graph is repeated with the position(5) option but not the ring(0) option to illustrate the impact of ring(0). Without ring(0), the title is placed outside the plot region.
Uses allstatesdc.dta & scheme vg_brite
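The placement suboptions for by-group names and notes can be sketched together as follows. The example assumes auto.dta (shipped with Stata) and by(foreign) rather than the book's allstatesdc.dta and by(north), and the note text is our own.

```stata
* Sketch only: auto.dta and by(foreign) stand in for the book's data
sysuse auto, clear

* subtitle() outside by() restyles the by-group name on every panel:
* 5 o'clock position, inside the plot region, box not expanded.
* note() inside by() replaces the default "Graphs by ..." note.
twoway scatter mpg weight,                          ///
       by(foreign, total note("By note"))          ///
       subtitle(, pos(5) ring(0) nobexpand)
```

Omitting ring(0) should leave the by-group names outside the plot region, which mirrors the title() comparison in the last two examples.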

twoway scatter ownhome borninstate,
by(north, total) l1title("left title") b1title("bottom title")

[Graph: panels "S&W", "North", and "Total", each with "left title" on the left and "bottom title" at the bottom]

Including the l1title() option adds a title to the left (on the y-axis) of each of the graphs. Likewise, the b1title() option adds a title to the bottom (on the x-axis) of each of the graphs.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter (borninstate propval100 ownhome), by(nsw)
legend(label(1 "Born in state") label(2 "% > 100K"))

[Graph: panels "North", "South", and "West"; x-axis "% who own home"; legend "Born in state" and "% > 100K"; note "Graphs by Region North, South, or West"]

Here, we use the legend() option to change the labels associated with the first two keys. These options modify the contents of the legend, so they should appear outside of the by() option.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter (borninstate propval100 ownhome),
by(nsw, legend(position(12)))

[Graph: panels "North", "South", and "West"; x-axis "% who own home"; legend "% born in state of residence" and "% homes cost $100K+" at the 12 o'clock position]

In this graph, we use the position() option to modify the position of the legend. Such options that modify the position of the legend must be placed as an option within the by() option.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter (borninstate propval100 ownhome),
by(nsw, legend(pos(12)))
legend(label(1 "Born in state") label(2 "% > 100K"))

[Graph: panels "North", "South", and "West"; x-axis "% who own home"; legend "Born in state" and "% > 100K" at the 12 o'clock position; note "Graphs by Region North, South, or West"]

Here, we use both of the options from the previous two graphs, and the legend() option is used twice: inside the by() option to modify the position and outside the by() option to modify its contents. The use of legend() with the by() option is covered more thoroughly in Options : Legend (287).
Uses allstatesdc.dta & scheme vg_brite
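The split between legend position (inside by()) and legend contents (outside by()) can be tried with the sketch below. It assumes auto.dta, shipped with Stata, with two y-variables (mpg and headroom) plotted against weight and by(foreign) as the grouping; these variables and the label text are illustrative choices, not the book's.

```stata
* Sketch only: auto.dta and by(foreign) stand in for the book's data
sysuse auto, clear

* legend() inside by() sets where the legend goes (12 o'clock);
* legend() outside by() sets what the keys say.
twoway scatter mpg headroom weight,                         ///
       by(foreign, legend(position(12)))                    ///
       legend(label(1 "Mileage") label(2 "Headroom"))
```

The reasoning is the same as for titles: anything that belongs to the combined graph goes inside by(), and anything that describes the individual plots goes outside it.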

twoway scatter ownhome borninstate,
by(north, title("% own home" "by % born in state"))
title("Region of state")

[Graph: overall two-line title "% own home" / "by % born in state"; panels "S&W" and "North", each titled "Region of state"]

We can use the title() option on its own to make a title that is displayed with each graph, and the title() option within the by() option to make an overall title.
Uses allstatesdc.dta & scheme vg_brite

twoway scatter ownhome borninstate,
by(north, total rescale ixtitle iytitle b1title("") l1title(""))

[Graph: panels "S&W", "North", and "Total", each with its own scales, x-title "% born in state of residence", and y-title "% who own home"]

Here, we obtain separate graphs for the three groups, using rescale to obtain different x- and y-axis labels and scales, ixtitle and iytitle to title the graphs separately, and b1title() and l1title() to suppress the overall titles for the x- and y-axes.
Uses allstatesdc.dta & scheme vg_brite

8.9 Controlling legends

This section describes legends in more detail. Legends can be useful in a number of situations, and this section shows how to customize them. For more information about legend options, see [G] legend option. Also, for controlling the text and textbox of the legend, see Options : Textboxes (303) and Options : Adding text (299). We will use the vg_s2c scheme.

twoway scatter ownhome propval100 urban

[Graph: x-axis "Percent urban 1990"; legend "% who own home" and "% homes cost $100K+"]

Legends can be created in a variety of ways. For example, here we have two y-variables, ownhome and propval100, on the same plot, and Stata creates a legend labeling the different points. The default legend, in this case, is quite useful.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban)

[Graph: x-axis "Percent urban 1990"; legend "% who own home", "Fitted values", and "Fitted values"]

Legends are also created when you overlay plots. Here, Stata adds a legend entry for each of the overlaid plots. The default legend, in this case, is less useful since both fits are labeled "Fitted values", so it does not help us differentiate between the kinds of fitted values.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban if north==0)
(scatter ownhome urban if north==1)


A third example arises when you overlay
two plots that use if to display the same
variables for different observations.
Here, we show the same scatterplot
separately for states in the North and
for those not in the North. Because both
keys carry the same variable label, the
legend does not help us tell the two
groups apart.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban)


Regardless of the graph command(s)
that generated the legend, it can be
customized the same way. For many of
the examples, we will use this graph for
customizing the legend.
Uses allstatesdc.dta & scheme vg s2c


twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban),
legend(label(1 "% Own home") label(2 "Lin. Fit") label(3 "Quad. Fit"))


You can use the label() option to
assign labels for the keys. Note that
you use a separate label() option for
each key that you wish to modify.
Uses allstatesdc.dta & scheme vg s2c
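
The same relabeling can be tried without the book's data. The following is a minimal sketch, assuming Stata's shipped auto.dta (loaded with sysuse) stands in for allstatesdc.dta; the variables mpg and weight and the label text are substitutions made for illustration and are not part of the original example. In a do-file, /// continues the command on the next line.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta
sysuse auto, clear

* Keys are numbered in the order the plots are overlaid:
* 1 = scatter, 2 = lfit, 3 = qfit
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight), ///
    legend(label(1 "Observed mpg") label(2 "Lin. Fit") label(3 "Quad. Fit"))
```

Each label() suboption names one key by its number, so any key left out keeps its default text.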

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban),
legend(label(2 "Lin. Fit") label(3 "Quad. Fit"))

You can use the label() option to
modify just some of the keys; for
example, here we just modify the
second and third key.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban),
legend(label(1 "%own" "home") label(2 "Lin" "Fit") label(3 "Qd" "Fit"))

You can put the label on multiple lines
by including multiple quoted strings.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban), legend(order(2 3 1))

You can use the order() option to
change the order of the keys in the
legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban), legend(order(2 3))


We can also omit keys from the
order() option to suppress their
display in the legend. Here, we suppress
the display of the first key.
Uses allstatesdc.dta & scheme vg s2c


twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban), legend(order(2 "Lin. fit" 3 "Quad. fit" 1))


You can also insert and replace text for
the keys when using the order()
option. Here, we order the keys 2, 3,
and 1, and at the same time, replace
the text for keys 2 and 3.
Uses allstatesdc.dta & scheme vg s2c
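
As a hedged illustration of reordering and relabeling in one step, the sketch below again assumes the shipped auto.dta in place of allstatesdc.dta; mpg and weight are stand-in variables chosen only for the example.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta
sysuse auto, clear

* order() lists the keys in display order; a quoted string placed
* right after a key number replaces that key's text.
* Key 1 (the scatterplot) is moved last and keeps its default text.
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight), ///
    legend(order(2 "Lin. fit" 3 "Quad. fit" 1))
```

Dropping the trailing 1 from order() would suppress the scatterplot key entirely, as in the earlier order(2 3) example.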


twoway (scatter ownhome urban) (lfit ownhome urban)
(qfit ownhome urban),
legend(order(- "Fitted" 2 "Lin. fit" 3 "Quad. fit" - "Observed" 1))


We use - "Fitted" to insert the word
Fitted and - "Observed" to insert the
word Observed. Due to the
organization of the keys in the legend,
this is hard to follow.
Uses allstatesdc.dta & scheme vg s2c
twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Fitted" 2 "Lin. fit" 3 "Quad. fit" - "Observed" 1)
cols(1))

We can use the cols() option to
display the legend in a single column.
Here, the added text makes more sense,
but the legend uses quite a bit of space.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Fitted" 2 "Lin. fit" 3 "Quad. fit" - "Observed" 1)
rows(3))

We can use the rows() option to
display the legend in three rows. If we
want to display the fitted keys in the
left column and the observed keys in
the right column, we can order the keys
according to columns instead of
according to rows. See the next
example.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Fitted" 2 "Lin. fit" 3 "Quad. fit" - "Observed" 1)
rows(3) colfirst)

Adding the colfirst option displays
the keys in column order instead of row
order, with the Fitted keys in the left
column and the Observed keys in the
right column.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Observed" 1 - "Fitted" 2 "Lin. fit" 3 "Quad. fit")
rows(3) holes(3) colfirst)


This legend is the same as the one in
the previous example but places the
Observed keys in the left column and
the Fitted keys in the right column. To
do this, we changed the order of the
keys but also added the holes(3)
option so that Fitted would be in the
fourth position at the top of the second
column.
Uses allstatesdc.dta & scheme vg s2c
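
The position arithmetic behind holes() can be sketched as follows. This is an illustrative example under the assumption that the shipped auto.dta replaces allstatesdc.dta; the variables are stand-ins, and the comments spell out which grid position each entry is expected to occupy.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta
sysuse auto, clear

* With rows(3) and colfirst, positions 1-3 form the first column and
* positions 4-6 form the second column.
*   position 1: "Observed" heading
*   position 2: key 1 (scatter)
*   position 3: left empty by holes(3)
*   position 4: "Fitted" heading
*   position 5: key 2 (lfit)
*   position 6: key 3 (qfit)
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight), ///
    legend(order(- "Observed" 1 - "Fitted" 2 "Lin. fit" 3 "Quad. fit") ///
           rows(3) holes(3) colfirst)
```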


twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Observed" 1 - " " - "Fitted" 2 "Lin fit" 3 "Qd fit")
rows(3) colfirst)


Referring to the last graph, instead of
using holes(), we can put in a blank
key, - " ", in the order() option,
which pushes the word Fitted over to
the next column.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(- "Observed" 1 - " " - "Fitted" 2 "Lin fit" 3 "Qd fit")
rows(3) colfirst textfirst)


Using the textfirst option, we can
make the text for the key appear first,
followed by the symbol.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear" "Fit" 3 "Quadratic" "Fit")
stack cols(1))

Using the stack option, we can stack
the symbols above the labels. We use
this here to make a tall, narrow legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear" "Fit" 3 "Quadratic" "Fit")
stack cols(1) position(3))

We can use the position() option to
change where the legend is displayed.
Here, we take the narrow legend from
the previous graph and put it to the
right of the graph, making good use of
space.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear" "Fit" 3 "Quadratic" "Fit")
stack cols(1) ring(0) position(7))

We can use the ring(0) option to place
the legend inside the plot area and use
position(7) to put it in the bottom
left corner, using the empty space in
the plot for the legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(1 "% Own Home" 2 "Linear" 3 "Quad")
rows(1) position(12))

Here, we make the legend a thin row
using the rows(1) option and then use
the position(12) option to put it at
the top of the graph.
Uses allstatesdc.dta & scheme vg s2c
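
A one-row legend above the plot can be reproduced with any overlaid graph. The sketch below assumes the shipped auto.dta as a substitute dataset; the variables and key text are illustrative choices rather than the book's.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta
sysuse auto, clear

* rows(1) lays the keys out in a single row;
* position() takes a clock position, so 12 places the legend at the top.
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight), ///
    legend(order(1 "Observed" 2 "Linear" 3 "Quad") rows(1) position(12))
```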


twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(1 "% Own Home" 2 "Linear" 3 "Quad")
rows(1) position(12) bexpand)

We can expand the width of the legend
to the width of the plot area using the
bexpand (box expand) option.
Uses allstatesdc.dta & scheme vg s2c


twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear Fit" 3 "Quadratic Fit")
rows(1) position(12) bexpand span)
If we wanted to expand the legend to
the entire width of the graph area (not
just the plot area), we would add the
span option.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear Fit" 3 "Quadratic Fit")
rows(1) pos(5) title("Legend", position(11)))

We can add a title, subtitle, note, or
caption to the legend using all the
features described in Standard
options : Titles (313). Here, we add a
title() and use the position()
option to position it in the top left
corner. A simple way to get a smaller
title is to use the subtitle() option
instead.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear Fit" 3 "Quadratic Fit")
rows(1) pos(5) subtitle("Legend", box bexpand))

To emphasize all the control we have,
we could put the subtitle for the legend
in a box and use bexpand to make it
expand to the width of the legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(order(2 "Linear Fit" 3 "Quadratic Fit")
rows(1) pos(5) note("Fit obtained with lfit and qfit"))

Here, we use the note() option,
showing that we can even add a note to
the legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(size(medium) color(maroon) bfcolor(eggshell) box)


The legend() option permits us to
supply options that control the display
of the labels for the keys. Here, we
request that those labels be maroon,
medium in size, displayed with an
eggshell background, and surrounded
by a box.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(region(fcolor(dimgray) lcolor(gs8) lwidth(thick) margin(medium)))


The region() option can be used to
control the overall box in which the
legend is placed. Here, we specify the
fill color to be a dim gray, the line color
to be a medium gray
(gs8 = gray scale 8), the line to be
thick, and the margin between the text
and the box to be medium.
Uses allstatesdc.dta & scheme vg s2c
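
To experiment with the legend's outer box on data that ships with Stata, a sketch along these lines may help. It assumes auto.dta as a stand-in dataset and swaps in gs14 (a light gray) for the fill color; both are illustrative substitutions.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta
sysuse auto, clear

* region() styles the box around the whole legend:
*   fcolor() = fill color, lcolor() = outline color,
*   lwidth() = outline thickness, margin() = space between text and box
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight), ///
    legend(region(fcolor(gs14) lcolor(gs8) lwidth(thick) margin(medium)))
```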


twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(rows(1) bmargin(t=10))


We can adjust the margin around the
box of the legend with the bmargin()
option. Here, we use t=10 to make the
margin 10 at the top, increasing the
gap between the legend and the title of
the x-axis.
Uses allstatesdc.dta & scheme vg s2c
twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(symxsize(30) symysize(20))

We can control the width allocated to
symbols with the symxsize() option
and the height with the symysize()
option.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (lfit ownhome urban) (qfit ownhome urban),
legend(colgap(20) rowgap(20))

We can control the space between
columns of the legend with the
colgap() option and the space between
the rows with the rowgap() option.
Note that the rowgap() option does
not affect the border between the top
row and the box or the border between
the bottom row and the box.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (qfit ownhome urban),
by(nsw)

Consider this graph, which shows two
overlaid plots (a scatterplot and a
quadratic fit) drawn separately
by the location of the state. We will
explore how to modify the legend for
this kind of graph.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (qfit ownhome urban),
by(nsw) legend(position(12) label(2 "Quadratic Fit"))

Here, we add a legend() option, but
the position() option does not seem
to have any effect since it does not
move the position of the legend.
Uses allstatesdc.dta & scheme vg s2c


twoway (scatter ownhome urban) (qfit ownhome urban),
by(nsw, legend(position(12))) legend(label(2 "Quadratic Fit"))

The graph command from the last
example did not change the position of
the legend because options for
positioning the legend must be placed
within the by() option. Here, we place
the legend(position()) option within
the by() option, and the legend is now
placed above the graph.
Uses allstatesdc.dta & scheme vg s2c
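
The split between placement options and content options can be sketched as follows, assuming the shipped auto.dta and its foreign variable as stand-ins for allstatesdc.dta and nsw. The key text is likewise illustrative.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesdc.dta,
* and foreign substitutes for the by-variable nsw
sysuse auto, clear

* Legend placement goes inside by(); the key text stays in the outer legend()
twoway (scatter mpg weight) (qfit mpg weight), ///
    by(foreign, legend(position(12))) ///
    legend(label(1 "Observed mpg") label(2 "Quadratic Fit"))
```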


twoway (scatter ownhome urban) (qfit ownhome urban), by(nsw, legend(off))
Likewise, if we wish to turn the legend
off, we must place legend(off) within
the by() option.
Uses allstatesdc.dta & scheme vg s2c

twoway (scatter ownhome urban) (qfit ownhome urban),
by(nsw, legend(at(4))) legend(cols(1))

To place the legend in one of the holes,
we can use the at() option within the
by() option. Here, the legend is placed
inside the fourth position. To display
the legend in one column, we use the
legend(cols(1)) option outside of the
by() option since this does not control
the position of the legend.
Uses allstatesdc.dta & scheme vg_s2c

twoway (scatter ownhome urban) (qfit ownhome urban),
by(nsw, legend(position(center) at(4))) legend(cols(1))

To position the legend, we can add the
position(center) option within the
by() option to make the legend appear
in the center of the fourth position.
Uses allstatesdc.dta & scheme vg_s2c

8.10 Adding text to markers and positions

This section provides more details about the text() option for adding text to a graph.
Although added text can be used in a wide variety of situations, we will focus on how it can
be used to label points and lines and to add descriptive text to your graph. For more information about this option, see [G] added_text_option. To learn more about how the text
can be customized, see Options : Textboxes (303). For this section, we will use the vg_teal
scheme.

twoway scatter ownhome borninstate
In this scatterplot, one point appears to
be an outlier from the rest, but since it
is not labeled, we cannot tell from
which state it originates.
Uses allstatesn.dta & scheme vg teal


scatter ownhome borninstate, mlabel(stateab)
We can use the mlabel(stateab) option to
label all points, which helps us see that
the outlying point comes from
Washington, DC. However, this plot is
rather cluttered by all the labels.
Uses allstatesn.dta & scheme vg_teal

twoway (scatter ownhome borninstate)
(scatter ownhome borninstate if stateab == "DC", mlabel(stateab))
We could repeat a second scatterplot
just to label DC, but this is a bit
cumbersome.
Uses allstatesn.dta & scheme vg teal
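
The overlay approach to labeling a single point can be tried on data that ships with Stata. This sketch assumes auto.dta in place of allstatesn.dta and picks the observation whose make is "VW Diesel" as the point to label; the dataset, variables, and chosen observation are all substitutions for illustration.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesn.dta
sysuse auto, clear

* First plot: all points, unlabeled.
* Second plot: the same variables restricted to one observation,
* drawn again with mlabel() so only that point carries a label.
twoway (scatter mpg weight) ///
       (scatter mpg weight if make == "VW Diesel", mlabel(make)), ///
    legend(off)
```

The legend is turned off here because the second plot exists only to carry the label.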

twoway scatter ownhome borninstate, text(43 40 "DC")

Instead, we can use the text() option
to add text to our graph. Looking at
the values of ownhome and borninstate
for DC, we see that their values are
about 43 and 40, respectively. We use
these as coordinates to label the point,
but the text() option places the label
at the center of the specified y x
coordinate, sitting right over the point.
Uses allstatesn.dta & scheme vg_teal

twoway scatter ownhome borninstate, text(43 40 "DC", placement(ne))

Adding the placement(ne) option
places the label above and to the right
(northeast) of the point. Other options
you could choose include n, ne, e, se, s,
sw, w, nw, and c (center); see
Styles : Compassdir (331) for more
details.
Uses allstatesn.dta & scheme vg_teal

twoway (scatter ownhome borninstate, text(43 40 "DC", placement(e)))
(lfit ownhome borninstate) (lfit ownhome borninstate if stateab !="DC")

Consider this scatterplot showing a
linear fit between the two variables: one
including Washington, DC, and one
omitting Washington, DC. See the next
graph, which uses the text() option to
label the graph instead of the legend.
Uses allstatesn.dta & scheme vg_teal

twoway (scatter ownhome borninstate, text(43 40 "DC", placement(ne)))
(lfit ownhome borninstate) (lfit ownhome borninstate if stateab !="DC",
text(72 50 "Without DC") text(60 50 "With DC")), legend(off)
This graph turns the legend off and
uses the text() option to label each
regression line to indicate which
regression line includes DC and which
excludes DC.
Uses allstatesn.dta & scheme vg teal
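
Rather than reading label coordinates off the graph, they can be computed from the fitted lines. The following is only a sketch under stated assumptions: auto.dta replaces allstatesn.dta, the second fit is restricted to domestic cars (foreign == 0) instead of excluding DC, and the x position 4000 is an arbitrary choice. The local macros must be defined in the same do-file run as the graph command.

```stata
* Assumption: auto.dta (shipped with Stata) substitutes for allstatesn.dta
sysuse auto, clear

* Height of each regression line at weight = 4000 (an arbitrary x position)
regress mpg weight
local y_all = _b[_cons] + _b[weight]*4000
regress mpg weight if foreign == 0
local y_dom = _b[_cons] + _b[weight]*4000

* Label each line where it passes x = 4000; placement(n) and placement(s)
* push the two labels to opposite sides of their coordinates
twoway (scatter mpg weight) (lfit mpg weight) ///
       (lfit mpg weight if foreign == 0), ///
    text(`y_all' 4000 "All cars", placement(n)) ///
    text(`y_dom' 4000 "Domestic only", placement(s)) legend(off)
```

If the two lines lie close together at the chosen x value, the labels may still crowd each other, and a different x position or placement would be needed.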


twoway (scatter ownhome borninstate, text(43 40 "DC", placement(ne)))
(lfit ownhome borninstate) (lfit ownhome borninstate if stateab !="DC",
text(71 50 "Without DC") text(60 50 "With DC")
text(50 70 "Coef with DC .16" "Coef without DC .44")), legend(off)
This graph adds explanatory text
showing the regression coefficient with
and without DC.
Uses allstatesn.dta & scheme vg teal

twoway (scatter ownhome propval100, xaxis(1) mlabel(stateab))
(scatter ownhome borninstate, xaxis(2) mlabel(stateab))

Consider this graph in which we overlay
two scatterplots. We place propval100
on the first x-axis and borninstate on
the second x-axis.
Uses allstatesn.dta & scheme vg_teal

twoway (scatter ownhome propval100, xaxis(1))
(scatter ownhome borninstate, xaxis(2)),
text(43 66 "DC") text(43 42 "DC", xaxis(2))

Rather than labeling all the points, we
can label just the point for DC. We
have to be very careful because we have
two different x-axes. The first text()
option uses the first x-axis, so no
special option is required. The second
text() option uses the second x-axis,
so we must specify the xaxis(2)
option.
Uses allstatesn.dta & scheme vg_teal

8.11 More options for text and textboxes

This section describes more options for modifying textbox elements: titles, captions,
notes, added text, and legends. Technically, all text in a graph is displayed within a textbox.
We can modify the box’s attributes, such as its size and color, the margin around the box,
and the outline; and we can modify the attributes of the text within the box, such as its
size, color, justification, and margin. We sometimes use the box option to see how both
the textbox and its text are being displayed. This helps us to see if we should modify the
attributes of the box containing the text or the text within the box. For more information,
see [G] textbox_options and Options : Adding text (299). In this section, we will begin by
showing examples illustrating how to control the placement, size, color, and orientation of
text. We will begin this section using the vg_s1m scheme.

twoway scatter ownhome borninstate,
text(43 40 "Washington, DC", placement(ne))


Consider this scatterplot, which has a
dramatic outlying point. We have used
the text() option to label that point,
but, perhaps, we might want to control
the size of the text for this label. See
the next example for an illustration of
how to do this.
Uses allstatesn.dta & scheme vg s1m

40

Washington, DC
40

50

60

70

80

% born in state of residence

twoway scatter ownhome borninstate,
text(43 40 "Washington, DC", placement(ne) size(vlarge))

We can alter the size of the text using the size() option. Here, we make the text very large
(vlarge). Other values we could use with the size() option include zero, minuscule,
quarter_tiny, third_tiny, half_tiny, tiny, vsmall, small, medsmall, medium, medlarge, large,
huge, and vhuge; see Styles : Textsize (344) for more details.
Uses allstatesn.dta & scheme vg_s1m

[Figure: the same scatterplot with the "Washington, DC" label drawn in very large text.]
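
As a minimal sketch of the size() option with a different named size, the following uses the auto dataset that ships with Stata rather than the book's allstatesn.dta (an assumption made so the example runs without the book's files). The coordinates, the label text, and the graph name sizedemo are hypothetical choices for illustration only.

```stata
* Load the example dataset that ships with Stata (not the book's data)
sysuse auto, clear

* Place a label at y = 35, x = 4000 in small text;
* name() keeps the graph in memory under a hypothetical name
twoway scatter mpg weight, ///
    text(35 4000 "Small label", placement(ne) size(small)) ///
    name(sizedemo, replace)
```

Swapping size(small) for any other named size gives a quick way to compare how the label scales.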

twoway scatter ownhome borninstate,
text(43 40 "Washington, DC", placement(ne) color(gs9))

We can alter the color of the text using the color() option. Here, we make the text a
middle-level gray. See Styles : Colors (328) for other colors you could select.
Uses allstatesn.dta & scheme vg_s1m

[Figure: the same scatterplot with the "Washington, DC" label drawn in gray.]


twoway scatter ownhome borninstate,
text(43 40 "Washington, DC", placement(ne) orientation(vertical))

We can use the orientation() option to change the direction of the text. The values you can
choose are horizontal for 0 degrees, vertical for 90 degrees, rhorizontal for 180 degrees,
and rvertical for 270 degrees; see Styles : Orientation (341) for more details.
Uses allstatesn.dta & scheme vg_s1m

[Figure: the same scatterplot with the "Washington, DC" label rotated to read vertically.]

This next set of examples considers options for justifying text within a box, sizing the
box, and creating margins around the box. This is followed by options that control margins
within the textbox. This next set of graphs uses the vg_rose scheme.

twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth", box)

Consider this example where we place a title on our graph. To help show how the options work,
we will put a box around the title.
Uses allstatesn.dta & scheme vg_rose

[Figure: scatterplot of % who own home against % born in state of residence, with the boxed two-line title "% who own home by" / "% that reside in state of birth".]


twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth", box
justification(left))

We can left-justify the text using the justification() option. Note that the title is
justified within the textbox, not with respect to the entire graph area.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same scatterplot; the two title lines are left-justified inside the title box.]

twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth", box
bexpand)

If we use the bexpand (box expand) option, the textbox containing the title expands to fill
the width of the plot area.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same scatterplot; the title box now spans the width of the plot area.]

twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth", box
bexpand justification(left))

With the box expanded, the justification(left) option now makes the title flush left with the
plot area.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same scatterplot; the title is flush left within the expanded title box.]


twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth",
box bexpand justification(left) bmargin(medium))

We can change the size of the margin around the outside of the box using the bmargin(medium)
(box margin) option, making the margin a medium size at all four edges: left, right, top, and
bottom.
Uses allstatesn.dta & scheme vg_rose

[Figure: scatterplot of % who own home against % born in state of residence, with a medium margin around the boxed title.]

twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth",
box bexpand justification(left) bmargin(0 0 3 3))

If we wanted the margin for the left and right to be 0 and for the top and bottom to be 3, we
could use the bmargin(0 0 3 3) option. The order of the margins is
bmargin(#left #right #top #bottom).
Uses allstatesn.dta & scheme vg_rose

[Figure: the same scatterplot, with no left or right margin and a margin of 3 above and below the boxed title.]

twoway (scatter ownhome borninstate),
title("% who own home by" "% that reside in state of birth",
box bexpand justification(left) bmargin(b=3))

To make only the bottom margin 3, we could specify bmargin(b=3), where b=3 means to change the
bottom margin to 3. The top, left, bottom, and right margins can be changed individually using
t=, l=, b=, and r=, respectively.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same scatterplot, with a margin of 3 below the boxed title only.]
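
The individual margin expressions can in principle be combined. As a rough sketch, and assuming that bmargin() accepts more than one expression of the form t=, b=, l=, or r= at once (an assumption to check against [G] marginstyle), the following sets only the top and bottom margins around a boxed title. It uses the auto dataset that ships with Stata instead of allstatesn.dta; the title text and the graph name margindemo are hypothetical.

```stata
* Load the example dataset that ships with Stata (not the book's data)
sysuse auto, clear

* Boxed, expanded, left-justified title with a margin of 3
* above and below the box; left and right margins are untouched
twoway (scatter mpg weight), ///
    title("Mileage by" "vehicle weight", ///
        box bexpand justification(left) bmargin(t=3 b=3)) ///
    name(margindemo, replace)
```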


twoway (scatter ownhome borninstate) (lfit ownhome borninstate)
(lfit ownhome borninstate if stateab !="DC",
text(45 70 "Coef with DC .16" "Coef without DC .44", box))

Consider this graph, which uses the text() option to place an annotation in the middle of the
plot region. The text might look better if we increased the margin around the text.
Uses allstatesn.dta & scheme vg_rose

[Figure: scatterplot of % who own home against % born in state of residence with two fitted lines and a boxed annotation reading "Coef with DC .16" / "Coef without DC .44".]

twoway (scatter ownhome borninstate) (lfit ownhome borninstate)
(lfit ownhome borninstate if stateab !="DC",
text(45 70 "Coef with DC .16" "Coef without DC .44", box margin(medium)))

We can expand the margin between the text and the box with the margin() option. Note the
difference between this and the bmargin() option (illustrated previously), which increased the
margin around the outside of the box.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same graph, with a medium margin between the annotation text and its box.]

twoway (scatter ownhome borninstate) (lfit ownhome borninstate)
(lfit ownhome borninstate if stateab !="DC",
text(45 70 "Coef with DC .16" "Coef w/out DC .44", box margin(5 5 2 2)))

As with the bmargin() option, we can more precisely modify the margin around the text. Here,
we use the margin() option to make the size of the margin 5, 5, 2, and 2 for the left, right,
top, and bottom, respectively.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same graph, with wider left and right margins than top and bottom margins inside the annotation box.]


twoway (scatter ownhome borninstate) (lfit ownhome borninstate)
(lfit ownhome borninstate if stateab !="DC",
text(45 70 "Coef with DC .16" "Coef without DC .44", box linegap(4)))

We can change the gap between the lines with the linegap() option. Here, we make the gap
larger than it normally would be. See Styles : Margins (338) for more details.
Uses allstatesn.dta & scheme vg_rose

[Figure: the same graph, with a larger gap between the two lines of the boxed annotation.]

Let’s now consider options that control the color of the textbox and the characteristics
of the outline of the box (including the color, thickness, and pattern). This next set of
graphs uses the vg_past scheme.

twoway (scatter ownhome borninstate),
title("% own home by % reside in state")

Consider this graph with a title at the top.
Uses allstatesn.dta & scheme vg_past

[Figure: scatterplot of % who own home against % born in state of residence, titled "% own home by % reside in state".]


twoway (scatter ownhome borninstate),
title("% own home by % reside in state", box)

We can add the box option now for aesthetic purposes.
Uses allstatesn.dta & scheme vg_past

[Figure: the same scatterplot with a box drawn around the title.]

twoway (scatter ownhome borninstate),
title("% own home by % reside in state",
box bfcolor(ltblue) blcolor(gray) blwidth(thick))

We can change the box fill color with the bfcolor() option, the color of the line around the
box with blcolor(), and the width of the surrounding box line with blwidth(). See
Styles : Colors (328) for other possible values you could use with the bfcolor() and blcolor()
options and Styles : Linewidth (337) for other values you could choose for blwidth().
Uses allstatesn.dta & scheme vg_past

[Figure: the same scatterplot; the title box has a light blue fill and a thick gray outline.]

twoway (scatter ownhome borninstate),
title("% own home by % reside in state",
box bcolor(gold))

We can change the box color with the bcolor() option. Here, we make the fill and outline color
of the title box gold.
Uses allstatesn.dta & scheme vg_past

[Figure: the same scatterplot; the title box is filled and outlined in gold.]


Let’s now use the allstates file and consider some examples in which we use the by()
option to display multiple graphs broken down by the location of the state. We will look at
options for placing and aligning text in graphs that use the by() option. This next set of
graphs uses the vg_s2c scheme.

scatter ownhome borninstate,
by(nsw, title("% own home" "by % born in state"))

Consider this graph in which we use the by() option to show this scatterplot separately for
states in the North, South, and West.
Uses allstates.dta & scheme vg_s2c

[Figure: three scatterplot panels (North, South, West) of % who own home against % born in state of residence, with the overall title "% own home" / "by % born in state" and the note "Graphs by Region North, South, or West".]

scatter ownhome borninstate,
by(nsw, title("% own home" "by % born in state",
ring(0) position(5) box))

Let’s put the title in the open hole in the lower right corner of the graph using the ring(0)
and position(5) options. We include the box option only to show the outline of the textbox,
not for aesthetics.
Uses allstates.dta & scheme vg_s2c

[Figure: the same three panels, with the boxed title placed in the empty lower right cell.]


scatter ownhome borninstate,
by(nsw, title("% own home" "by % born in state",
ring(0) position(5) box width(65) height(40)))

We can make the area for the textbox bigger using the width() and height() options. We choose
values that make the box approximately as tall as the graph for the West and as wide as the
graph for the South.
Uses allstates.dta & scheme vg_s2c

[Figure: the same three panels; the title box in the lower right cell is enlarged to roughly the size of a panel.]

scatter ownhome borninstate,
by(nsw, title("% own home" "by % born in state", ring(0) position(5)
box width(65) height(40) justification(left) alignment(top)))

We can left-justify the text and align it with the top using the justification(left) and
alignment(top) options. These options make the title appear in the top left corner of the
empty hole.
Uses allstates.dta & scheme vg_s2c

[Figure: the same three panels; the title sits in the top left corner of the enlarged box.]

scatter ownhome borninstate,
by(nsw, title("% own home" "by % born in state", ring(0) position(5)
width(65) height(40) justification(left) alignment(top)))

Now that we have aligned the text as we would like, we can take away the box by omitting the
box option.
Uses allstates.dta & scheme vg_s2c

[Figure: the same three panels; the title appears in the empty lower right cell without a box.]


9 Standard options available for all graphs

This chapter discusses a class of options Stata refers to as standard options, because
these options can be used in most graphs. This chapter will begin by discussing options
that allow you to add or change the titles in the graph and then showing you how to use
schemes to control the overall look and style of your graph. Next, we demonstrate options
for controlling the size of the graph and the scale of items within graphs. The chapter
will conclude by illustrating options that allow you to control the colors of the plot region,
the graph region, and the borders that surround these regions. For further details, see
[G] std_options.

9.1 Creating and controlling titles

Titles are useful for providing additional information that explains the contents of a
graph. Stata includes four standard options for adding explanatory text to graphs: title(),
subtitle(), note(), and caption(). This section will illustrate how to use these titles and
how to customize their content and their placement. For further information
about customizing the appearance of such titles (e.g., color, size, orientation, etc.), see
Options : Textboxes (303). For more information about titles, see [G] title_options. This
section uses the vg_s1m scheme.


scatter propval100 ownhome, title("My title")

The title() option adds a title to a graph. Here, we add a simple title to the graph. Although
the title includes quotes, we could have omitted them in this case. Later, we will see
examples where the quotes become very important.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot of % homes cost $100K+ against % who own home, titled "My title".]

scatter propval100 ownhome, title("My title") subtitle("My subtitle")

The subtitle() option adds a subtitle to a graph. The subtitle, by default, appears below the
title in a smaller font.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with the title "My title" and the subtitle "My subtitle".]

scatter propval100 ownhome, subtitle("My smaller title")

We do not have to specify a title() to specify a subtitle(). For example, we might want a
title that is smaller in size than a regular title, so we could specify a subtitle alone.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with only the subtitle "My smaller title".]


scatter propval100 ownhome, caption("My caption") note("My note")

In this example, the caption() option adds a small-sized caption in the bottom left corner,
and the note() option places a smaller-sized note in the bottom left corner. If both options
are specified, the note appears above the caption. We do not need to include both of these
options in the same graph.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot of % homes cost $100K+ against % who own home, with "My note" above "My caption" at the bottom left.]

scatter propval100 ownhome, t1title("My t1title") t2title("My t2title")
b1title("My b1title") b2title("My b2title") l1title("My l1title")
l2title("My l2title") r1title("My r1title") r2title("My r2title")

Although these are not as commonly used, Stata offers a number of additional title options for
titling the top of the graph (t1title() and t2title()), the bottom of the graph (b1title() and
b2title()), the left side of the graph (l1title() and l2title()), and the right side of the
graph (r1title() and r2title()).
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with the eight titles "My t1title", "My t2title", "My b1title", "My b2title", "My l1title", "My l2title", "My r1title", and "My r2title" on the corresponding sides.]

Stata gives you considerable flexibility in the placement of these titles, notes, and
captions, as well as in controlling the size, color, and orientation of the text. This is
illustrated below using the title() option, but the same options apply equally to the
subtitle(), note(), and caption() options.


scatter propval100 ownhome, title("My" "title")

In this example, we use multiple sets of quotes in the title() option to tell Stata that we
want the title to appear on two separate lines.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot of % homes cost $100K+ against % who own home, with the two-line title "My" / "title".]

scatter propval100 ownhome, title(`"A "title" with quotes"')

This example illustrates that we can have quotation marks in the title() option, as long as we
open the title with `" and close it with "'. (The open single quote is often located below the
tilde on your keyboard, and the close single quote is often located below the double quote on
your keyboard.)
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with the title A "title" with quotes.]

scatter propval100 ownhome, title("My title", position(7))

The position() option can be used to change the position of the title. Here, we place the
title in the bottom left corner of the graph by indicating that it should be at the 7 o’clock
position. See Styles : Clockpos (330) for more details.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with "My title" at the bottom left, below the x-axis title.]


scatter propval100 ownhome, title("My title", position(1) ring(0))

As we saw in the last example, we can use the position() option to control the placement of
the title, but this option does not control the distance between the title and center of the
plot region. That is controlled by the ring() option. ring(0) means that the item is inside
the plot region, and higher values for ring() place the item farther away from the plot
region. Imagine concentric rings around the plot area with higher values corresponding to the
rings that are farther from the center.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot of % homes cost $100K+ against % who own home, with "My title" inside the plot region at the upper right.]

scatter propval100 ownhome, title("This is my" "title",
position(11) box)

Because titles, subtitles, notes, and captions are considered textboxes, you can use the
options associated with textboxes to customize their display. Here, we place a box around the
title using the box option. We also use the position(11) option to place the title in the 11
o’clock position.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot with the boxed two-line title "This is my" / "title" at the upper left.]

scatter propval100 ownhome, title("This is my" "title",
position(11) box span)

Here, we add the span option, so the title spans the width of the graph, positioning the title
flush left at the 11 o’clock position. Note that now the title partly obscures the 100
labeling the y-axis.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot; the boxed title sits flush left against the graph edge and overlaps the 100 label on the y-axis.]


scatter propval100 ownhome, title("This is my" "title", box
justification(right))

We can use the justification(right) option to right-justify the text inside the box. Note the
difference between the position() option, which positions the textbox, and the justification()
option, which justifies the text within the textbox.
Uses allstates.dta & scheme vg_s1m

[Figure: scatterplot of % homes cost $100K+ against % who own home; the two title lines are right-justified inside the title box.]

scatter propval100 ownhome, title("This is my" "title", box bexpand)

We can expand the box to fill the width of the plot region using the bexpand option. If we
wanted the box to span the entire width of the graph, we could add the span option (not
shown). There are numerous other textbox options that can be used with titles; see
Options : Textboxes (303) and [G] textbox_options for more details.
Uses allstates.dta & scheme vg_s1m

[Figure: the same scatterplot; the title box is expanded to the width of the plot region.]
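
As a brief sketch pulling the four standard title options together with a few of the textbox options shown in this section, the following uses the auto dataset that ships with Stata in place of allstates.dta. The title text and the graph name titledemo are hypothetical, and the particular mix of options is only one of many reasonable combinations.

```stata
* Load the example dataset that ships with Stata (not the book's data)
sysuse auto, clear

* Title and subtitle at the 11 o'clock position, the title boxed;
* note and caption keep their default placement, caption in small text
scatter mpg weight, ///
    title("Mileage by weight", position(11) box) ///
    subtitle("1978 automobile data", position(11)) ///
    note("My note") caption("My caption", size(small)) ///
    name(titledemo, replace)
```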

9.2 Using schemes to control the look of graphs

Schemes control the overall look of Stata graphs by providing default values for numerous
graph options. You can accept these defaults or override them using graph options. This
section first examines the kinds of schemes available in Stata, then discusses different
methods for selecting schemes, and finally shows how to obtain additional schemes. For more
information about schemes, see [G] schemes. Stata has two basic families of schemes, the s2
family and the s1 family, each sharing similar characteristics. There are also other
specialized schemes, including the sj scheme for making graphs like those in the Stata Journal
and the economist scheme for making graphs like those that appear in The Economist. We will
look at these schemes below.
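
To see which schemes are installed before choosing one, a short sketch follows. It assumes a standard Stata installation; the list printed will vary with the Stata version and with any user-installed schemes.

```stata
* List the names of the schemes Stata can find
graph query, schemes

* Show the name of the scheme currently in effect
display "`c(scheme)'"
```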


twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s2color)

This example uses the scheme(s2color) option to create a graph using the s2color scheme. Using
the scheme() option, we can manually select which scheme to use for displaying the graph we
wish to create. The s2color scheme is the default scheme for Stata graphs.
Uses allstates.dta & scheme s2color

[Figure: scatterplots of % homes cost $100K+ and % rents $700+/mo against Percent urban 1990, each with a fitted line, drawn with the s2color scheme.]

twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s2mono)

The s2mono scheme is a black-and-white version of the s2color scheme. In this example, the
symbols differ in gray scale and size, and the lines differ in their patterns.
Uses allstates.dta & scheme s2mono

[Figure: the same graph drawn with the s2mono scheme.]

twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s2manual)

Here is an example using the s2manual scheme, which is very similar to the s2mono scheme. One
difference is that the lines of the fit values are the same pattern (solid) in this graph, but
they have different patterns when we use s2mono.
Uses allstates.dta & scheme s2manual

[Figure: the same graph drawn with the s2manual scheme.]


twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s1color)

This is an example of a graph using the s1color scheme. Note how the lines and markers are
only differentiated by their color. Both the plot area and the border around the plot are
white. Also, note the absence of grid lines. (Stata also has an s1rcolor scheme, in which the
plot area and border area are black. This is not shown since it would be difficult to read in
print.)
Uses allstates.dta & scheme s1color

[Figure: scatterplots of % homes cost $100K+ and % rents $700+/mo against Percent urban 1990, each with a fitted line, drawn with the s1color scheme.]

twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s1mono)

The s1mono scheme is similar to the s1color scheme in that the plot area and border are white
and the grid is omitted. In a mono scheme, the markers differ in gray scale and size, and the
lines differ in their pattern.
Uses allstates.dta & scheme s1mono

[Figure: the same graph drawn with the s1mono scheme.]

twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(s1manual)

The s1manual scheme is similar to s1mono, but the sizes of the markers and text are increased.
This is useful if you are making a small graph and want these small elements to be magnified
so that they are more easily seen.
Uses allstates.dta & scheme s1manual

[Figure: the same graph drawn with the s1manual scheme.]


twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(sj)

[Graph: % homes cost $100K+ and % rents $700+/mo, each with fitted values, against Percent urban 1990]

The sj scheme is very similar to the s2mono scheme. In fact, a comparison of this graph with
an earlier graph that used the s2mono scheme shows no visible differences. The sj scheme is
based on the s2mono scheme and only alters xsize() and ysize(). See
Appendix : Customizing schemes (379) for more information about how to inspect (and alter)
the contents of graph schemes.
Uses allstates.dta & scheme sj

twoway (scatter propval100 urban) (scatter rent700 urban)
(lfit propval100 urban) (lfit rent700 urban), scheme(economist)

[Graph: % homes cost $100K+ and % rents $700+/mo, each with fitted values, against Percent urban 1990]

The economist scheme is quite different from all the other schemes and is a very good example
of how much can be controlled with a scheme. Using this scheme modifies the colors of the
plot area, border, markers, lines, the position of the y-axis, and the legend. It also
removes the line on the y-axis and changes the angle of the labels on the y-axis.
Uses allstates.dta & scheme economist

As these examples have shown, we can change the scheme of a graph by supplying the
scheme() option on a graph command. If we want to use the same scheme over and over,
we can use the set scheme command to set the default scheme. For example, if we typed
. set scheme economist

the default scheme would become economist until we quit Stata. Or, we could type
. set scheme economist, permanently


With the permanently option, the economist scheme would be our default scheme, even after we quit and start Stata
again. If we will be creating a series of graphs that we want to have a common look, then
schemes are a very powerful tool for accomplishing this. Even though Stata has a variety
of built-in schemes, we may want to obtain other schemes. The findit command can be
used to search for information about schemes and to download schemes that others have
developed. To search for schemes, type
. findit scheme

and Stata will list web pages and packages associated with the word scheme.
See Intro : Schemes (14) for an overview of the schemes used in this book and Appendix : Online supplements (382) for instructions for obtaining the schemes for this book.
Seeing how powerful and flexible schemes are, we might be interested in creating our
own schemes. Stata gives us complete control over creating schemes. The section Appendix : Customizing schemes (379) provides tips for getting started.
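
The difference between the scheme() option and set scheme can be sketched in a short do-file. This is only an illustration under stated assumptions: it uses auto.dta (a dataset that ships with Stata) rather than the book's allstates.dta, and the variables mpg and weight stand in for the book's variables.

```stata
* Sketch only: auto.dta is a stand-in for the book's allstates.dta
sysuse auto, clear

* Apply a scheme to a single graph; the default scheme is unchanged
scatter mpg weight, scheme(s1mono)

* Make s1color the default for the rest of this Stata session
set scheme s1color
scatter mpg weight

* Show the current graphics settings, including the active scheme
query graphics

* List the schemes installed on this machine
graph query, schemes
```

Adding the permanently option to set scheme, as described above, would keep the choice across sessions; it is left out here so that running the sketch does not change any saved settings.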

9.3

Sizing graphs and their elements

This section illustrates how to use the xsize() and ysize() options to control the size
and aspect ratio of graphs. It also illustrates the use of the scale() option for controlling
the size of the text and markers. This section uses the vg_s1c scheme.

scatter propval100 ownhome

[Graph: % homes cost $100K+ against % who own home]

Let’s first consider this graph. The graphs in this book have been sized to be 3 inches wide
by 2 inches tall. Although we do not see it, some graphs are sized via an xsize() and ysize()
option, and some are sized via schemes.
Uses allstates.dta & scheme vg_s1c


scatter propval100 ownhome, xsize(3) ysize(1)

[Graph: % homes cost $100K+ against % who own home]

Here, we make a graph to illustrate how to use xsize() and ysize() to control the aspect
ratio of the graph, as well as the size. Note that when we do this, the size of the graph
will not change on the screen but the aspect ratio will. Although we can size the graph on
the screen, when we export the graph, it will have both the size and aspect ratio we chose
using xsize() and ysize().
Uses allstates.dta & scheme vg_s1c

scatter propval100 ownhome, xsize(2) ysize(2)

[Graph: % homes cost $100K+ against % who own home]

Here, we make just one more graph to illustrate that we can use xsize() and ysize() to
control the aspect ratio of the graph, as well as the size. Here, we make the graph square by
making the graph 2 inches wide by 2 inches tall.
Uses allstates.dta & scheme vg_s1c

scatter propval100 ownhome, scale(1.7)

[Graph: % homes cost $100K+ against % who own home]

In this example, we add the scale(1.7) option to magnify the sizes of the text and markers in
the graph, making them 1.7 times their normal sizes. This can be useful when we make small
graphs and want to increase the sizes of the text and markers to make them easier to see.
Uses allstates.dta & scheme vg_s1c


scatter propval100 ownhome, scale(.5)

[Graph: % homes cost $100K+ against % who own home]

We can also use the scale() option to decrease the size of the text and markers. Here, we
make the size of these elements half their normal size.
Uses allstates.dta & scheme vg_s1c
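
The three sizing options can be combined in a single command. The following is a minimal sketch, assuming auto.dta (shipped with Stata) in place of allstates.dta; the output file name sizing_demo.eps is hypothetical.

```stata
* Sketch only: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* A graph 3 inches wide by 2 inches tall
scatter mpg weight, xsize(3) ysize(2)

* A square 2-by-2 inch graph with text and markers at 1.5 times normal size
scatter mpg weight, xsize(2) ysize(2) scale(1.5)

* Export the graph in memory; the exported file uses the xsize() and ysize()
* chosen above (sizing_demo.eps is a hypothetical file name)
graph export sizing_demo.eps, replace
```

On the screen the second graph may appear no smaller than the first, since xsize() and ysize() change the aspect ratio of the window rather than its size; the requested dimensions take effect in the exported file.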

9.4

Changing the look of graph regions

This section discusses the region options that can be controlled via the plotregion()
and graphregion() options. These allow us to control the color of the plot region and
graph region, as well as the lines that border these regions. For more information, see
[G] region_options. This section uses the vg_s2c scheme.

scatter propval100 ownhome, title("My title")

[Graph titled "My title": % homes cost $100K+ against % who own home]

Consider this scatterplot. In general, Stata sees this graph as having two overall regions.
The area inside the x- and y-axes where the data are plotted is called the plot region. In
this graph, the plot region is white. The area surrounding the plot region, where the axes
and titles are placed, is called the graph region. In this graph, the graph region is shaded
light blue.
Uses allstates.dta & scheme vg_s2c


scatter propval100 ownhome, title("My title") plotregion(color(stone))

[Graph titled "My title": % homes cost $100K+ against % who own home]

Here, we use plotregion(color(stone)) to make the color of the plot region stone. The color()
option controls the color of the plot region.
Uses allstates.dta & scheme vg_s2c

scatter propval100 ownhome, title("My title") plotregion(lcolor(navy)
lwidth(thick) )

[Graph titled "My title": % homes cost $100K+ against % who own home]

In this graph, we put a thick, navy blue line around the plot region using the lcolor() and
lwidth() options. This puts a bit of a frame around the plot region.
Uses allstates.dta & scheme vg_s2c

scatter propval100 ownhome, title("My title") graphregion(color(erose))

[Graph titled "My title": % homes cost $100K+ against % who own home]

Here, we use the graphregion(color(erose)) option to modify the color of the graph region to
be erose, a light rose color. The graph region is the area outside of the plot region where
the titles and axes are displayed.
Uses allstates.dta & scheme vg_s2c


scatter propval100 ownhome, title("My title")
graphregion(ifcolor(erose) fcolor(maroon))

[Graph titled "My title": % homes cost $100K+ against % who own home]

The graph region is actually composed of an inner part and an outer part. Here, we use the
ifcolor(erose) option to make the inner graph region light rose and the fcolor(maroon) option
to make the outer graph region maroon. This has the effect of putting a maroon frame around
the entire graph.
Uses allstates.dta & scheme vg_s2c

scatter propval100 ownhome, title("My title")
graphregion(lcolor(navy) lwidth(vthick))

[Graph titled "My title": % homes cost $100K+ against % who own home]

We can put a somewhat different frame around the graph by altering the size and color of the
line that surrounds the graph region. Using the lcolor(navy) lwidth(vthick) options gives
this graph a very thick, navy blue border.
Uses allstates.dta & scheme vg_s2c

This section omitted numerous options that we could use to control the plot region and
graph region, including further control of the inner and outer regions and further control of
the lines that surround these regions. Stata gives us more control than we generally need,
so rather than covering these options here, I refer you to [G] region_options.
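
The plot-region and graph-region options shown in this section can be used together. Here is a minimal sketch for a do-file, assuming auto.dta (shipped with Stata) instead of allstates.dta; the /// line continuation works in do-files but not at the interactive prompt.

```stata
* Sketch only: auto.dta stands in for the book's allstates.dta
sysuse auto, clear

* Outer graph region maroon, inner graph region light rose,
* plot region filled with stone and framed by a thick navy line
scatter mpg weight, title("My title")              ///
    graphregion(fcolor(maroon) ifcolor(erose))     ///
    plotregion(fcolor(stone) lcolor(navy) lwidth(thick))
```

The sketch uses fcolor() rather than color() inside plotregion() so that the fill color and the outline color are set independently.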


10 Styles for changing the look of graphs

This section focuses on frequently used styles that arise in making graphs, such as
linepatternstyle, linewidthstyle, or markerstyle. The styles are covered in alphabetical order,
providing more details about the values you can choose. Each section refers to the appropriate
section of [G] graph to provide complete details on each style. We begin by using
the allstates file and omitting Washington, DC.

10.1

Angles

An anglestyle specifies the angle for displaying an item (or group of items) in the graph.
Common examples include specifying the angle for marker labels with mlabangle() or the
angle of the labels on the y-axis with ylabel(, angle()). We can specify an anglestyle as
a number of degrees of rotation (negative values are permitted, so for example, −90 can be
used instead of 270). We can also use the keywords horizontal for 0 degrees, vertical
for 90 degrees, rhorizontal for 180 degrees, and rvertical for 270 degrees. See [G] anglestyle
for more information.

scatter workers2 faminc, mlabel(stateab) mlabangle(45)

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with state abbreviations as marker labels]

Here, we use the mlabangle(45) (marker label angle) option to change the angle of the marker
labels to 45 degrees.
Uses allstatesdc.dta & scheme vg_s2c


scatter workers2 faminc, ylabel(, angle(0))

[Graph: % HHs with 2+ workers against 1979 Median Family Inc.]

Here, we change the angle of the labels of the y-axis so that they read horizontally by using
the angle(0) option. We could also have used horizontal to obtain the same effect.
Uses allstatesdc.dta & scheme vg_s2c

scatter workers2 faminc, xlabel(15000(1000)30000, angle(45))

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., x-axis labeled from 15000 to 30000 in steps of 1000]

In this example, we label the x-axis from 15,000 to 30,000 incremented by 1,000. When we have
so many labels, we can use the angle(45) option to display the labels at a 45-degree angle.
Uses allstatesdc.dta & scheme vg_s2c
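
The angle options above can be tried on any dataset. The following is a minimal sketch, assuming auto.dta (shipped with Stata) in place of allstatesdc.dta; the variable make plays the role of stateab, and the restriction to foreign cars is only there to keep the labels legible.

```stata
* Sketch only: auto.dta stands in for the book's allstatesdc.dta
sysuse auto, clear

* Marker labels rotated 45 degrees
scatter mpg weight if foreign, mlabel(make) mlabangle(45)

* Horizontal y-axis labels (keyword form of angle(0)) and
* densely spaced x-axis labels tilted 45 degrees
scatter mpg weight, ylabel(, angle(horizontal)) ///
    xlabel(2000(250)5000, angle(45))
```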

10.2

Colors

A colorstyle allows us to modify the color of an object, be it a title, a marker, a marker
label, a line around a box, a fill color of a box, or practically any other object in graphs.
The two main ways to specify a color are either by giving the name of a color (e.g., red, pink,
teal) or by supplying an RGB value giving the amount of red, green, and blue to be mixed
to form a custom color. See [G] colorstyle for more information.


scatter workers2 faminc, mcolor(gs8)

[Graph: % HHs with 2+ workers against 1979 Median Family Inc.]

The mcolor() (marker color) option is used here to make the marker a middle gray. Stata
provides 17 levels of gray named gs0 to gs16. The darkest is gs0 (a synonym for black), and
the lightest is gs16 (a synonym for white).
Uses allstatesdc.dta & scheme vg_s2c

scatter workers2 faminc, mcolor(lavender)

[Graph: % HHs with 2+ workers against 1979 Median Family Inc.]

Here, we use the mcolor(lavender) option to make the markers lavender, one of the predefined
colors created by Stata. The next example illustrates more of the colors from which you can
choose.
Uses allstatesdc.dta & scheme vg_s2c

vgcolormap, quietly

[Graph: Color Map of Standard Stata Colors, showing black, gs0, gs1, gs2, gs3, gs4, gs5, gs6,
gs7, gs8, gs9, gs10, gs11, gs12, gs13, gs14, gs15, gs16, white, blue, bluishgray, brown,
cranberry, cyan, dimgray, dkgreen, dknavy, dkorange, eggshell, emerald, forest_green, gold,
gray, green, khaki, lavender, lime, ltblue, ltbluishgray, ltkhaki, magenta, maroon, midblue,
midgreen, mint, navy, olive, olive_teal, orange, orange_red, pink, purple, red, sand, sandb,
sienna, stone, teal, yellow, ebg, ebblue, edkblue, eltblue, eltgreen, emidblue, erose]

The vgcolormap command is one that I wrote to show the different standard colors available in
Stata all at once. We simply issue the command vgcolormap, and it creates a scatterplot that
shows the colors we can choose from and their names. See the list of colors available in
[G] colorstyle, and see how to get vgcolormap in Appendix : Online supplements (382).
Uses allstatesdc.dta & scheme s2color


scatter workers2 faminc, mcolor("255 255 0")

[Graph: % HHs with 2+ workers against 1979 Median Family Inc.]

Despite all the standard color choices, we may want to mix our own colors by specifying how
much red, green, and blue that we want mixed together. We can mix between 0 and 255 units of
each color. Mixing 0 units of each yields black, and 255 units of each yields white. Here, we
mix 255 units of red, 255 units of green, and 0 units of blue to get a shade of yellow.
Uses allstatesdc.dta & scheme vg_s2c

scatter workers2 faminc, mcolor("255 150 100")

[Graph: % HHs with 2+ workers against 1979 Median Family Inc.]

By mixing 255 parts red, 150 parts green, and 100 parts blue, we get a peach color. Since
colors for web pages use this same principle of mixing red, green, and blue, we can do a web
search using terms like color mixing html and find numerous web pages to help us find the
right mixture for the colors that we want to make.
Uses allstatesdc.dta & scheme vg_s2c
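
Named colors and RGB triplets can be mixed freely within one graph. The sketch below assumes auto.dta (shipped with Stata) rather than allstatesdc.dta, and it uses the built-in palette color command to display a named color together with its RGB values.

```stata
* Sketch only: auto.dta stands in for the book's allstatesdc.dta
sysuse auto, clear

* One group drawn with a named gray, the other with a custom RGB mix
twoway (scatter mpg weight if !foreign, mcolor(gs8))            ///
       (scatter mpg weight if foreign, mcolor("255 150 100"))

* Display a sample of a named color along with its RGB values
palette color lavender
```

Looking up the RGB values of a named color this way gives a convenient starting point for mixing a nearby custom shade.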

10.3

Clock position

A clock position refers to a location using the numbers on an analog clock to indicate
the location, with 12 o’clock being above the center, 3 o’clock to the right, 6 o’clock below
the center, and 9 o’clock to the left. A value of 0 refers to the center but may not always
be valid. See [G] clockpos for more information.
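
As a quick illustration of clock positions, the sketch below places marker labels at 12 o'clock and then at the center. It assumes auto.dta (shipped with Stata) in place of allstatesdc.dta, with make standing in for the state abbreviations.

```stata
* Sketch only: auto.dta stands in for the book's allstatesdc.dta
sysuse auto, clear

* Labels directly above each marker (12 o'clock)
scatter mpg weight if foreign, mlabel(make) mlabposition(12)

* Labels at the center (position 0), with the marker symbols made
* invisible so that they do not sit on top of the labels
scatter mpg weight if foreign, mlabel(make) mlabposition(0) msymbol(i)
```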


scatter workers2 faminc, mlabel(stateab) mlabposition(5)

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with state abbreviations as marker labels]

In this example, we add marker labels to a scatterplot and use the mlabposition(5) (marker
label position) option to place the marker labels in the 5 o’clock position with respect to
the markers.
Uses allstatesdc.dta & scheme vg_s2c

scatter workers2 faminc, mlabel(stateab) mlabposition(0) msymbol(i)

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with state abbreviations in place of markers]

In this example, we place the marker labels in the center position using the mlabposition(0)
option. We also make the symbols invisible using the msymbol(i) option. Otherwise, the
markers and marker labels would be atop each other.
Uses allstatesdc.dta & scheme vg_s2c

10.4

Compass direction

A compassdirstyle is much like clockpos, but where a clockpos has 12 possible outer positions,
like a clock, the compassdirstyle has only 8 possible outer positions, like the major
labels on a compass (north, neast, east, seast, south, swest, west, and nwest), plus center.
These can be abbreviated as n, ne, e, se, s, sw, w, nw, and c. Stata permits us to use
a clockpos even when a compassdirstyle is called for and makes intuitive translations; for
example, 12 is translated to north, or 2 is translated to neast. See [G] compassdirstyle
for more information.


scatter workers2 faminc, title("Work Status and Income",
ring(0) placement(se))

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with the title "Work Status and Income" inside the plot region at the bottom right]

In this example, we use placement() to position the title in the southeast (bottom right
corner) of the plot region. The ring(0) option moves the title inside the plot region.
Uses allstatesdc.dta & scheme vg_s2c

scatter workers2 faminc, title("Work Status and Income",
ring(0) placement(4))

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with the title "Work Status and Income" inside the plot region at the bottom right]

If we instead specify the placement(4) option (using a clockpos instead of compassdir), Stata
makes a suitable substitution, and the title is placed in the bottom right corner.
Uses allstatesdc.dta & scheme vg_s2c
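
The same idea works for the other corners. The sketch below, which assumes auto.dta (shipped with Stata) and a made-up title, puts the title in the upper right of the plot region, first with the abbreviated compass direction and then with the clock position that the text says Stata translates to neast.

```stata
* Sketch only: auto.dta stands in for the book's allstatesdc.dta
sysuse auto, clear

* Compass direction, abbreviated form of neast
scatter mpg weight, title("Mileage and weight", ring(0) placement(ne))

* Clock position 2, which is translated to neast
scatter mpg weight, title("Mileage and weight", ring(0) placement(2))
```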

10.5

Connecting points

Stata supports a variety of methods for connecting points using different values for the
connectstyle. These include l (lowercase L, as in line) to connect with a straight line, L to
connect with a straight line only if the current x-value is greater than the prior x-value, J
for stairstep, stepstair for step then stair, and i for invisible connections. For the next
few examples, let’s switch to using the spjanfeb2001 data file, keeping only the data for
January and February of 2001. See [G] connectstyle for more information.


scatter close tradeday

[Graph: Closing price against Trading day number]

Here, we make a scatterplot showing the closing price on the y-axis and the trading day
(numbered 1 to 40) on the x-axis. Normally, we would connect these points.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close tradeday, connect(l)

[Graph: Closing price against Trading day number, points connected in data order]

Here, we add the connect(l) option, but this is probably not the kind of graph we wanted to
create. The problem is that the observations are in a random order, but the observations are
connected in the same order as they appear in the data. We really want the points to be
connected based on the order of tradeday.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close tradeday, connect(l) sort

[Graph: Closing price against Trading day number, points connected in order of tradeday]

To fix the previous graph, we can either first use the sort command to sort the data on
tradeday or, as we do here, use the sort option to tell Stata to sort the data on tradeday
before connecting the points. We also could have specified sort(tradeday), and it would have
had the same effect.
Uses spjanfeb2001.dta & scheme vg_s2c


scatter close predclose tradeday

[Graph: Closing price and Lin. Fit close from tradeday against Trading day number]

Say that we used the regress command to predict close from tradeday and generated a predicted
value called predclose. Here, we plot the actual closing prices and the predicted closing
prices.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close predclose tradeday, connect(i l) sort
msymbol(. i)

[Graph: Closing price and Lin. Fit close from tradeday against Trading day number, with the fit shown as a line]

We use the connect(i l) option to connect the predicted values and leave the observed values
unconnected. The i option with close indicates that the closing values are not connected, and
the l (letter l) option indicates that the predclose values should be connected with a
straight line. We also add msymbol(. i) to make the symbols invisible for the fit values.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close tradeday, connect(J) sort

[Graph: Closing price against Trading day number, connected in a stairstep pattern]

In other contexts (such as survival analysis), we might want to connect points using a
stairstep pattern. Here, we connect the observed closing prices with the J option (which can
also be specified as stairstep) to get a stairstep effect.
Uses spjanfeb2001.dta & scheme vg_s2c


scatter close tradeday, connect(stepstair) sort

[Graph: Closing price against Trading day number, connected in a stepstair pattern]

In other contexts, we might want to connect points using a stepstair pattern. Here, we
connect the observed closing prices with the stepstair option to get a stepstair effect.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close dom, connect(l) sort(date)

[Graph: Closing price against Day of month]

Say that we created a variable called dom that represented the day of month and wanted to
graph the closing prices for January and February against the day of the month. Using the
sort(date) option combined with connect(l), we almost get what we want, but we get a line
that swoops back connecting January 31 to Feb 1.
Uses spjanfeb2001.dta & scheme vg_s2c

scatter close dom, connect(L) sort(date)

[Graph: Closing price against Day of month]

This kind of example calls for the connect(L) option, which avoids the line that swoops back
by connecting points with a straight line, except when the x-value (dom) decreases (e.g.,
goes from 31 to 1).
Uses spjanfeb2001.dta & scheme vg_s2c
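
The connect styles in this section can be reproduced approximately without the book's data file. The sketch below is an assumption-laden stand-in: it uses sp500.dta (S&P 500 daily data for 2001, shipped with Stata) and builds tradeday and dom itself, so these variables only mimic the ones in spjanfeb2001.dta.

```stata
* Sketch only: sp500.dta stands in for the book's spjanfeb2001.dta
sysuse sp500, clear
keep if month(date) <= 2          // keep January and February 2001
sort date
generate tradeday = _n            // trading day number, 1, 2, 3, ...
generate dom = day(date)          // day of the month

* Straight lines, connecting points in order of tradeday
scatter close tradeday, connect(l) sort

* Stairstep connection
scatter close tradeday, connect(J) sort

* Connect only while dom is increasing, so no line is drawn back
* from the end of January to the start of February
scatter close dom, connect(L) sort(date)
```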


10.6

Line patterns

We can specify the pattern we want for a line in three ways. We can specify a word that
selects among a set of predefined styles, including solid (solid line), dash (a dashed line),
dot (a dotted line), shortdash (short dashes), longdash (long dashes), and blank (invisible).
There are also combination styles dash_dot, shortdash_dot, and longdash_dot. We
can also use a formula that combines the following five elements in any way that we wish:
l (letter l, solid line), _ (underscore, long dash), - (hyphen, medium dash), . (period, short
dash), and # (small amount of space). We could specify longdash_dot or "_.", and they
would be equivalent. See [G] linepatternstyle for more information.

twoway (line close tradeday, clpattern(solid) sort)
(lfit close tradeday, clpattern(dash))
(lowess close tradeday, clpattern(shortdash_dot))

[Graph: Closing price, Fitted values, and lowess close tradeday against Trading day number]

In this example, we make a line plot and use the clpattern() (connect line pattern) option to
obtain a solid pattern for the observed data, a dash for the linear fit line, and a short
dash and dot line for a lowess fit.
Uses spjanfeb2001.dta & scheme vg_s2c

twoway (line close tradeday, clpattern("l") sort)
(lfit close tradeday, clpattern("._"))
(lowess close tradeday, clpattern("-###"))

[Graph: Closing price, Fitted values, and lowess close tradeday against Trading day number]

We can use the clpattern() option specifying a formula to indicate the pattern for the lines.
Here, we specify a solid line for the line plot, a dot and dash for the lfit plot, and a dash
and three spaces for the lowess fit.
Uses spjanfeb2001.dta & scheme vg_s2c
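
Named patterns and formulas can be mixed in one graph. The sketch below assumes sp500.dta (shipped with Stata) as a stand-in for spjanfeb2001.dta and creates tradeday itself. It keeps the book's clpattern() spelling; recent Stata documentation lists the same option as lpattern(), so substitute that name if clpattern() is not accepted in your release.

```stata
* Sketch only: sp500.dta stands in for the book's spjanfeb2001.dta
sysuse sp500, clear
keep if month(date) <= 2          // keep January and February 2001
sort date
generate tradeday = _n            // trading day number

* A named style, a named combination style, and a formula
* (medium dash, short dash, small space)
twoway (line close tradeday, clpattern(solid) sort)      ///
       (lfit close tradeday, clpattern(longdash_dot))    ///
       (lowess close tradeday, clpattern("-.#"))
```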


1400
1350
1300
1250

40

Closing price

Fitted values

longdash_dot

longdash

shortdash

blank

Margins

shortdash_dot

Appendix

Symbols
Textsize

We can indicate the width of a line in two ways. We can indicate a linewidthstyle,
which allows us to use a word to specify the width of a line, including none (no width,
invisible), vvthin, vthin, thin, medthin, medium, medthick, thick, vthick, vvthick, and
even vvvthick. We can also specify a relativesize, which is a multiple of the line’s normal
thickness (e.g., *2 is twice as thick, or *.7 is .7 times as thick). See [G] linewidthstyle for
more information.

Orientation

Line width

Styles

Markersize

10.7

Standard options

dash_dot

Linewidth

dot

Options

dash

Pie

solid

Linepatterns

Line pattern palette

Dot

palette linepalette

Box

Connect

lowess close tradeday

We can use the built-in Stata command
palette linepalette to view a
variety of line patterns that are
available within Stata to help us choose
a pattern to our liking.
Uses spjanfeb2001.dta & scheme vg s2c

Bar

30

Matrix

20
Trading day number

Compassdir

10

Twoway

Clockpos

0

Introduction

Colors

This example shows other formulas we
could create, including " ##", which
yields long dashes with short and then
long breaks in the middle, and "-.#",
which yields a short dash, a dot, and a
space. Using these formulas, we can
create a wide variety of line patterns for
those instances where we need to
differentiate multiple lines.
Uses spjanfeb2001.dta & scheme vg s2c

Angles

twoway (line close tradeday, clpattern("l") sort)
(lfit close tradeday, clpattern(" ##"))
(lowess close tradeday, clpattern("-.#"))
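As a rough illustration of the two ways to set a line width (a keyword or a relative size), the sketch below again assumes sp500.dta as a stand-in dataset and a hypothetical tradeday counter. The graph query command lists the width keywords available in the installed Stata.

```stata
* Sketch only: sp500.dta ships with Stata; tradeday is an assumed counter.
sysuse sp500, clear
generate tradeday = _n

* List the line width keywords that this copy of Stata knows about.
graph query linewidthstyle

* A keyword width for the line plot and a relative width (*3) for the fit.
* Run from a do-file so that /// continues the command onto the next line.
twoway (line close tradeday, clwidth(medthick) sort) ///
       (lfit close tradeday, clwidth(*3))
```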


twoway (line close tradeday, clwidth(vthick) sort)
(lfit close tradeday, clwidth(thick))
(lowess close tradeday, bwidth(.5) clwidth(thin))

Now, we plot the same three lines but this time differentiate them by line thickness using the clwidth() (connect line width) option.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: Closing price, Fitted values, and lowess close tradeday plotted against Trading day number]

twoway (line close tradeday, clwidth(*4) sort)
(lfit close tradeday, clwidth(*2))
(lowess close tradeday, bwidth(.5) clwidth(*.5))

We could create a similar graph using the clwidth() option and specify the widths as relative sizes, making the line four times as wide for the line plot, two times as wide for the lfit command, and half as wide for the lowess command.
Uses spjanfeb2001.dta & scheme vg_s2c

[Graph: Closing price, Fitted values, and lowess close tradeday plotted against Trading day number]

10.8 Margins

We can specify the size of a margin in three different ways. We can use a word that represents a predefined margin. These include zero, vtiny, tiny, vsmall, small, medsmall, medium, medlarge, large, and vlarge. They also include top_bottom to indicate a medium margin at the top and bottom, and sides to indicate a medium margin at the left and right. A second method is to give four numbers giving the margins at the left, right, bottom, and top. A third method is to use expressions such as b=5 to modify one or more of the margins. These are illustrated below. See [G] marginstyle for more information.
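The three ways of specifying a margin can be sketched side by side. This version assumes the auto.dta dataset shipped with Stata rather than the book's allstatesdc.dta, so the variables mpg and weight are stand-ins.

```stata
* Sketch only: auto.dta ships with Stata and stands in for allstatesdc.dta.
sysuse auto, clear

* 1. A keyword for a predefined margin.
scatter mpg weight, title("Overall title", margin(large) box)

* 2. Four numbers, in the order left, right, bottom, top.
scatter mpg weight, title("Overall title", margin(9 6 3 0) box)

* 3. Expressions that change only the named sides.
scatter mpg weight, title("Overall title", margin(l=9 r=9) box)
```

The box option is included only so that the gap between the title text and its border is visible.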

scatter workers2 faminc, title("Overall title", margin(large) box)

We illustrate the control of margins by adding a title to this scatterplot and putting a box around it. We can then see the effect of the margin() option: the gap between the title and the box changes. Here, we specify a large margin, and the margin on all four sides is now large.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., titled "Overall title"]

scatter workers2 faminc, title("Overall title", margin(top_bottom) box)

Using margin(top_bottom), we obtain a margin that is medium on the top and bottom but zero on the left and right.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., titled "Overall title"]

scatter workers2 faminc, title("Overall title", margin(sides) box)

Using the margin(sides) option, we obtain a margin that is medium on the left and right but zero on the top and bottom.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., titled "Overall title"]


scatter workers2 faminc, title("Overall title", margin(9 6 3 0) box)
In addition to the words describing margins, we can manually specify the margin for the left, right, bottom, and top. In this example, we specify margin(9 6 3 0) and make the margin for the left 9, for the right 6, for the bottom 3, and for the top 0.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., titled "Overall title"]

scatter workers2 faminc, title("Overall title", margin(l=9 r=9) box)

We can also manually change some of the margins without specifying all four margins (as in the previous example). By specifying margin(l=9 r=9), we can make the margin at the left and right 9 units, leaving the top and bottom unchanged. We can specify one or more of the expressions l=, r=, t=, or b= to modify the left, right, top, or bottom margins, respectively.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., titled "Overall title"]

10.9 Marker size

We can control the size of the markers by specifying a markersizestyle or a relativesize.
The markersizestyle is a word that describes the size of a marker, including vtiny, tiny,
vsmall, small, medsmall, medium, medlarge, large, vlarge, huge, vhuge, and ehuge. We
could also specify the sizes as a relativesize, which is either an absolute size or a multiple
of the original size of the marker (e.g., *2 is twice as large, or *.7 is .7 times as large). See
[G] markersizestyle for more information.
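A short sketch of the three kinds of marker size (keyword, multiple, and absolute size) follows. It assumes auto.dta, shipped with Stata, with mpg, trunk, and turn as stand-in y variables; none of these come from the book's examples.

```stata
* Sketch only: auto.dta ships with Stata and stands in for allstatesdc.dta.
sysuse auto, clear

* Keywords: one markersizestyle per y variable.
twoway (scatter mpg trunk turn weight, msize(vsmall medium large))

* Multiples of the normal size.
twoway (scatter mpg trunk turn weight, msize(*.5 *1 *1.5))

* Absolute sizes, given as plain numbers.
twoway (scatter mpg trunk turn weight, msize(1 2 3))
```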

twoway (scatter propval100 rent700 ownhome urban,
msize(vsmall medium large))

Here, we have an overlaid scatterplot where we graph three variables on the y-axis (propval100, rent700, and ownhome) and use the msize(vsmall medium large) option to make the sizes of these markers very small, medium, and large, respectively.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % homes cost $100K+, % rents $700+/mo, and % who own home against Percent urban 1990]

twoway (scatter propval100 rent700 ownhome urban, msize(*.5 *1 *1.5))

We can repeat the previous graph but use relative sizes within the msize() option to control the sizes of the markers, making them, respectively, half the normal size, regular size, and half again the normal size.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % homes cost $100K+, % rents $700+/mo, and % who own home against Percent urban 1990]

10.10 Orientation

An orientationstyle is used to change the orientation of text, such as a y-axis title, an x-axis title, or added text. An orientationstyle is similar to an anglestyle (see Styles : Angles (327)). We can only specify four different orientations using the keywords horizontal for 0 degrees, vertical for 90 degrees, rhorizontal for 180 degrees, and rvertical for 270 degrees. See [G] orientationstyle for more information.
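To see one of the less common orientation keywords in action, here is a small sketch that assumes auto.dta, shipped with Stata; the axis title text is made up for illustration.

```stata
* Sketch only: auto.dta ships with Stata; the title text is illustrative.
sysuse auto, clear

* rvertical rotates the y-axis title to 270 degrees, so it reads top to bottom.
scatter mpg weight, ytitle("Mileage (mpg)", orientation(rvertical))

* vertical rotates the x-axis title to 90 degrees.
scatter mpg weight, xtitle("Weight (lbs.)", orientation(vertical))
```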


scatter workers2 faminc,
ytitle("Family" "Worker" "Status", orientation(horizontal))

This example shows how we can rotate the title for the y-axis using the orientation(horizontal) option to make the title horizontal.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: scatterplot against 1979 Median Family Inc., with the y-axis title "Family Worker Status" displayed horizontally on three lines]

scatter workers2 faminc,
xtitle("Family" "Income", orientation(vertical))

This example shows how we can rotate the title for the x-axis to be vertical using the orientation(vertical) option.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers, with the x-axis title "Family Income" displayed vertically on two lines]

10.11 Marker symbols

Stata allows a wide variety of marker symbols. We can specify O (circle), D (diamond),
T (triangle), S (square), + (plus sign), X (x), p (a tiny point), and i (invisible). We can also
use lowercase letters o, d, t, s, and x to indicate smaller symbols. For circles, diamonds,
triangles, and squares, we can append an h to indicate that the symbol should be displayed
as hollow (e.g., Oh is a hollow circle). See [G] symbolstyle for more information.
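The symbol codes described above can be tried on any dataset. The sketch below assumes auto.dta, shipped with Stata, and uses a hollow diamond, a plus sign, and an X for three stand-in y variables. The palette symbolpalette command then displays the available symbols for reference.

```stata
* Sketch only: auto.dta ships with Stata and stands in for allstatesdc.dta.
sysuse auto, clear

* Dh = hollow diamond, + = plus sign, X = x; one symbol per y variable.
twoway (scatter mpg trunk turn weight, msymbol(Dh + X))

* Display the marker symbols available in this copy of Stata.
palette symbolpalette
```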

twoway (scatter propval100 rent700 ownhome urban, msymbol(S T O))

In this example, we use the msymbol(S T O) (marker symbol) option to plot the three symbols in this graph using squares, triangles, and circles.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % homes cost $100K+, % rents $700+/mo, and % who own home against Percent urban 1990]

twoway (scatter propval100 rent700 ownhome urban, msymbol(Sh Th Oh))

We append an h to each marker symbol to indicate that the symbol should be displayed as hollow.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % homes cost $100K+, % rents $700+/mo, and % who own home against Percent urban 1990]

twoway (scatter propval100 rent700 ownhome urban, msymbol(s t o))

In this example, we use the msymbol(s t o) option to specify small squares, small triangles, and small circles.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % homes cost $100K+, % rents $700+/mo, and % who own home against Percent urban 1990]


10.12 Text size

The textsizestyle is used to control the size of text, either by specifying a keyword that corresponds to a particular size or by specifying a number representing a relative size. The predefined keywords include zero, minuscule, quarter_tiny, third_tiny, half_tiny, tiny, vsmall, small, medsmall, medium, medlarge, large, vlarge, huge, and vhuge. We could also specify the sizes as a relative size, which is a multiple of the original size of the text. See [G] textsizestyle for more information.
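As a minimal sketch of keyword and relative text sizes, the commands below assume auto.dta, shipped with Stata, and label each point with the make variable; the title text is made up for illustration.

```stata
* Sketch only: auto.dta ships with Stata; the title text is illustrative.
sysuse auto, clear

* A keyword size for the marker labels.
scatter mpg weight, mlabel(make) mlabsize(tiny)

* A relative size for the marker labels and a keyword size for the title.
scatter mpg weight, mlabel(make) mlabsize(*.6) ///
    title("Mileage by weight", size(vlarge))
```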

scatter workers2 faminc, mlabel(stateab) mlabsize(small)
This example uses mlabel(stateab) to add marker labels with the state abbreviation labeling each point. We use the mlabsize(small) (marker label size) option to modify the size of the marker labels to make the labels small.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with each point labeled by its state abbreviation]


scatter workers2 faminc, mlabel(stateab) mlabsize(*1.5)

In addition to using the keywords, we can specify a relative size that is a multiple of the normal size. Here, we use the mlabsize(*1.5) option to make the marker labels 1.5 times as large as they would normally be.
Uses allstatesdc.dta & scheme vg_s2c

[Graph: % HHs with 2+ workers against 1979 Median Family Inc., with each point labeled by its state abbreviation in larger text]

11 Appendix

The appendix contains a hodgepodge of material that did not fit well in any of the previous chapters. We begin by illustrating some of the other kinds of graphs Stata can produce that were not covered in this book and how to use the options illustrated in this book to make them. Next, we look at how to save graphs, redisplay graphs, and combine multiple graphs into a single graph. This is followed by a section with more realistic examples that require a combination of multiple options or data manipulation to create the graph. We then review some common mistakes in writing graph commands and show how to fix them, followed by a brief look at creating custom schemes. This chapter and the book conclude by describing the online supplements to the book and how to get them.

11.1 Overview of statistical graph commands, stat graphs

This section illustrates some of the Stata commands for producing specialized statistical graphs. Unlike other sections of this book, this section merely illustrates these kinds of graphs but does not further explain the syntax of the commands used to create them. The graphs are illustrated on the following six pages, with multiple graphs on each page. The title of each graph is the name of the Stata command that produced the graph. We can use the help command to find out more about that command or look up more information in the appropriate Stata manual. The figures are described below.

• Figure 11.1 illustrates a number of graphs used to examine the univariate distribution of variables.

• Figure 11.2 illustrates the gladder and qladder commands, which show the distribution of a variable according to the ladder of powers to help visually identify transformations for achieving normality.

• Figure 11.3 shows a number of graphs that can be used to assess how well your data meet the assumptions of linear regression.

• Figure 11.4 shows some plots that help to illustrate the results of a survival analysis.

• Figure 11.5 shows a number of different plots used to understand the nature of time-series data and to select among different time-series models.

• Figure 11.6 shows plots associated with Receiver Operating Characteristic (ROC) analyses, which can also be used with logistic regression analysis.
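A few of the commands named in these figures can be tried directly. The sketch below is only an illustration: it assumes auto.dta, shipped with Stata, and an arbitrary regression of price on weight and mpg, neither of which is the data or model behind the book's figures.

```stata
* Sketch only: auto.dta ships with Stata; the model is arbitrary.
sysuse auto, clear

* Distribution graphs (compare Figure 11.1).
histogram mpg
kdensity mpg
qnorm mpg

* Ladder-of-powers graph (compare Figure 11.2).
gladder mpg

* Regression diagnostics (compare Figure 11.3); these follow regress.
regress price weight mpg
rvfplot
avplot weight
lvr2plot
```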

[Figure 11.1: Distribution graphs. Panels: histogram, spikeplot, kdensity, symplot, pnorm, and qnorm, each drawn for hourly wage]

[Figure 11.2: Ladder of power graphs. Panels: gladder (histograms by transformation) and qladder (quantile-normal plots by transformation), each drawn for hourly wage with the transformations cubic, square, identity, sqrt, log, 1/sqrt, inverse, 1/square, and 1/cubic]

[Figure 11.3: Regression diagnostics graphs. Panels: rvfplot, rvpplot, lvr2plot, cprplot, acprplot, and avplot]

[Figure 11.4: Survival graphs. Panels: sts graph, by(); stcurve, surv; ltable, graph; stci, graph; stphplot; and stcoxkm]

[Figure 11.5: Time-series graphs. Panels: ac, pac, pergram, cumsp, xcorr, and wntestb]

[Figure 11.6: ROC graphs. Panels: roctab, graph; roccomp, graph; rocplot; lroc; and lsens]


11.2 Common options for statistical graphs, stat graph options

This section illustrates how to use Stata graph options with specialized statistical graph
commands. Many of the examples will assume that we have run the command
. regress propval100 popden pcturban

and will illustrate subsequent commands with options to customize those specialized statistics graphs.
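The same pattern (fit a model, then add ordinary graph options to a specialized graph command) can be sketched without the book's data. The version below assumes auto.dta, shipped with Stata, and an arbitrary regression of price on weight and mpg in place of the propval100 model.

```stata
* Sketch only: auto.dta ships with Stata; the model is arbitrary.
sysuse auto, clear
regress price weight mpg

* Marker options work with lvr2plot just as they do with scatter.
lvr2plot, msymbol(Oh) msize(large) mlabel(make)

* Line options work with kdensity just as they do with twoway line.
kdensity price, clwidth(thick) clpattern(dash)
```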

lvr2plot

Consider this regression analysis, which predicts propval100 from two variables, popden and pcturban. We can use the lvr2plot command to produce a leverage-versus-residual squared plot.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: Leverage against Normalized residual squared]

lvr2plot, msymbol(Oh) msize(vlarge)

We can add options such as msymbol() and msize() to control the display of the markers in the graph. See Options : Markers (235) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: Leverage against Normalized residual squared, drawn with very large hollow circles]

lvr2plot, mlabel(stateab)

We can add the mlabel() option to add marker labels to the graph. We could also add further options to control the size, color, and position of the marker labels; see Options : Marker labels (247) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: Leverage against Normalized residual squared, with each point labeled by its state abbreviation]

kdensity propval100

Consider this kernel-density plot for the variable propval100. We could add options to control the display of the line. See the following example.
Uses allstates.dta & scheme vg_s2c

[Graph: Density against % homes cost $100K+]

kdensity propval100, clwidth(thick) clpattern(dash)

The section Options : Connecting (250) shows a number of options we could add to control the display of the line. Here, we add the clwidth() and clpattern() options to make the line thick and dashed.
Uses allstates.dta & scheme vg_s2c

[Graph: Density against % homes cost $100K+, drawn with a thick dashed line]


avplot popden

Consider this added-variable plot. We can modify the axis titles as illustrated in the following examples.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: e( propval100 | X ) against e( popden | X ); note: coef = .00673009, se = .00120878, t = 5.57]

avplot popden, xtitle("popden adjusted for percent urban")
ytitle("property value adjusted for percent urban")

Here, we use the xtitle() and ytitle() options to change the titles of the x- and y-axes. See Options : Axis titles (254) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: property value adjusted for percent urban against popden adjusted for percent urban; note: coef = .00673009, se = .00120878, t = 5.57]

avplot popden, note("Regression statistics for popden", prefix)

The prefix option can be used with the different title options to add a prefix to an existing title. In the note() option, for example, we add text before the existing note. In this way, we add additional descriptive information to an existing title, subtitle, note, or caption. We could also use the suffix option to add information after an existing title.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: e( propval100 | X ) against e( popden | X ); note: Regression statistics for popden, followed by coef = .00673009, se = .00120878, t = 5.57]
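The prefix and suffix options can be compared with a small sketch. It assumes auto.dta, shipped with Stata, and an arbitrary regression of price on weight and mpg; the note text is made up for illustration.

```stata
* Sketch only: auto.dta ships with Stata; the model and note text are arbitrary.
sysuse auto, clear
regress price weight mpg

* prefix places our text before the note that avplot already supplies.
avplot weight, note("Regression statistics for weight", prefix)

* suffix places our text after the existing note instead.
avplot weight, note("Data: auto.dta", suffix)
```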

avplot popden, xtitle(, size(huge))

We can modify the look of the existing title without changing the text. For example, we add the size(huge) option to make the existing title huge in size. See Options : Axis titles (254) and Options : Textboxes (303) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: e( propval100 | X ) against e( popden | X ), with a huge x-axis title; note: coef = .00673009, se = .00120878, t = 5.57]

rvfplot

Consider this residual-versus-fit plot. We often hope to see an even distribution of points around zero on the y-axis. To help evaluate this distribution, we might want to label the y-axis identically for the values above 0 and below 0.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: Residuals against Fitted values]

rvfplot, ylabel(-60(20)60, nogrid) yline(-20 20)

Here, we add the ylabel() option to label the y-axis from −60 to 60, incrementing by 20, and suppress the grid. Further, we use the yline() option to add a y-line at 20 and −20. For more information about labeling and scaling axes, see Options : Axis labels (256) and Options : Axis scales (265).
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: Residuals against Fitted values, with the y-axis labeled from −60 to 60 and horizontal lines at −20 and 20]


sts graph, by(hormon)
This graph shows survival-time estimates broken down by whether one is in the treatment group or the control group. The legend specifies the groups, but we might want to modify the labels as shown in the next example.
Uses hormone.dta & scheme vg_s2c

[Graph: Kaplan-Meier survival estimates, by hormon, against analysis time; legend: hormon = 0, hormon = 1]

sts graph, by(hormon) legend(label(1 Control) label(2 Treatment))
We can use the legend() option to use different labels within the legend. See Options : Legend (287) for more details.
Uses hormone.dta & scheme vg_s2c

[Graph: Kaplan-Meier survival estimates, by hormon, against analysis time; legend: Control, Treatment]

sts graph, by(hormon) legend(off)
text(.5 800 "Control", box) text(.8 1500 "Treatment", box)
To suppress the display of the legend, we can use the legend(off) option. Instead, we can use the text() option to add text directly to the graph to label the two lines; see Options : Adding text (299) for more information. We also use the box option to surround the text with a box; see Options : Textboxes (303) for more details.
Uses hormone.dta & scheme vg_s2c

[Graph: Kaplan-Meier survival estimates, by hormon, against analysis time; the lines are labeled "Control" and "Treatment" in boxes, with no legend]

11.2 Common options for statistical graphs, stat graph options

avplot popden, title("Added variable plot")
We return to the regression analysis predicting propval100 from two variables,
popden and pcturban. Here, we show an added-variable plot with the title()
option to add a title. We could also add a subtitle(), caption(), or note() to
the graph, as well; see Standard options : Titles (313) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: added-variable plot titled "Added variable plot"; x-axis: e( popden | X ); y-axis: e( propval100 | X ); note: coef = .00673009, se = .00120878, t = 5.57]

avplot popden, note("")
Here, we add the note("") option, which suppresses the display of the note at
the bottom showing the coefficients for the regression model.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: added-variable plot with the note suppressed; x-axis: e( popden | X ); y-axis: e( propval100 | X )]

avplot popden, scheme(economist)
We can change the look of the graph by selecting a different scheme. Here, we
use scheme(economist) to display the graph using the economist scheme. See
Standard options : Schemes (318) for more details.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: added-variable plot drawn with the economist scheme; x-axis: e( popden | X ); y-axis: e( propval100 | X ); note: coef = .00673009, se = .00120878, t = 5.57]

avplot popden, xsize(3) ysize(1) scale(1.3)
The section Standard options : Sizing graphs (322) describes options we can use
to control the size of the graph and the scale of the contents of the graph.
Here, we show the xsize(), ysize(), and scale() options.
Uses allstates.dta & scheme vg_s2c
Before running the graph command, type
reg propval100 popden pcturban

[Graph: wide, short added-variable plot; x-axis: e( popden | X ); y-axis: e( propval100 | X ); note: coef = .00673009, se = .00120878, t = 5.57]

11.3 Saving and combining graphs, save/redisplay/combine

This section shows how to save, redisplay, and combine Stata graphs. We begin by
showing how to save graphs either to disk or in memory. We also show how to redisplay
a saved graph and how to control its look when we redisplay it.
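
As a rough orientation before the individual examples, the save-and-redisplay workflow can be sketched in a few lines. This is only a sketch under stated assumptions: it uses the auto dataset that ships with Stata rather than the book's allstates file, and the replace suboptions are added so the lines can be rerun.

```stata
* Sketch only: auto.dta stands in for allstates.dta (an assumption).
sysuse auto, clear

* Save one graph to disk (hist1.gph) and keep another in memory (hist2).
twoway histogram mpg, saving(hist1, replace)
twoway histogram price, name(hist2, replace)

* Redisplay the disk copy, then the in-memory copy with a new size and scheme.
graph use hist1.gph
graph display hist2, xsize(2) ysize(2) scheme(s1mono)

* List the graphs held in memory and the .gph files in the current directory.
graph dir
```

The disk copy survives after Stata is closed, whereas hist2 exists only for the current session.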

twoway histogram urban, saving(hist1)
Most, if not all, Stata graph commands allow us to use the saving() option to
save the graph as a Stata .gph file. We save this graph, naming it hist1.gph,
and store it in the current directory. We will assume in these examples that
all graphs are stored in the current directory, but we can precede the filename
with a directory name and store it wherever we wish.
Uses allstates.dta & scheme vg_s2c

[Graph: histogram; x-axis: Percent urban 1990; y-axis: Density]


graph use hist1.gph
At a later time (including after quitting and restarting Stata), we can view
the saved graph with the graph use command. If hist1.gph had been stored in a
different directory, we would have to precede it with the directory where it
was saved or use the cd command to change to that directory.
Uses allstates.dta & scheme vg_s2c

[Graph: histogram; x-axis: Percent urban 1990; y-axis: Density]

graph use hist1.gph, scheme(s1mono)
When we view the graph, we can add the scheme() option to view the same graph
using a different scheme. Here, we view the last graph but use the s1mono
scheme.
Uses allstates.dta & scheme s1mono

[Graph: the same histogram drawn with the s1mono scheme; x-axis: Percent urban 1990; y-axis: Density]

twoway histogram propval100, name(hist2)
The name() option is much like the saving() option, except that the graph is
saved in memory instead of on disk. We can then view the graph later within the
same Stata session, but once we quit Stata, the graph in memory will be gone.
Uses allstates.dta & scheme vg_s1c

[Graph: histogram; x-axis: % homes cost $100K+; y-axis: Density]


graph display hist2
The graph display command is similar to the graph use command, except that it
redisplays graphs saved in memory. Here, we redisplay the graph we created with
the name(hist2) option.
Uses allstates.dta & scheme vg_s1c

[Graph: histogram; x-axis: % homes cost $100K+; y-axis: Density]

graph display hist2, xsize(2) ysize(2)
The graph display command allows us to use the xsize() and ysize() options to
change the size and aspect ratio of the graph. Here, we redisplay the graph we
named hist2 and make the graph 2 inches tall by 2 inches wide.
Uses allstates.dta & scheme vg_s1c

[Graph: the same histogram drawn 2 inches by 2 inches; x-axis: % homes cost $100K+; y-axis: Density]

graph display hist2, scheme(s1mono)
We can also use the scheme() option to view the same graph using a different
scheme. Here, we view the previous graph but use the s1mono scheme.
Uses allstates.dta & scheme s1mono

[Graph: the same histogram drawn with the s1mono scheme; x-axis: % homes cost $100K+; y-axis: Density]


Let’s look at some examples to show how to combine graphs once they have been
created and saved. First, we will see how to show two scatterplots side by side
rather than overlaying them.

twoway scatter propval100 urban, name(scat1)
Using the name(scat1) option saves this scatterplot in memory with the name
scat1.
Uses allstates.dta & scheme vg_s2c

[Graph: scatterplot; x-axis: Percent urban 1990; y-axis: % homes cost $100K+]

twoway scatter rent700 urban, name(scat2)
We save this second scatterplot with the name scat2.
Uses allstates.dta & scheme vg_s2c

[Graph: scatterplot; x-axis: Percent urban 1990; y-axis: % rents $700+/mo]


graph combine scat1 scat2
Using the graph combine command, we can see these two scatterplots side by
side. In a sense, the y-axis is on a different scale for these two graphs since
they are different variables. However, in another sense, the scale for the two
y-axes is the same since they are both measured in percents.
Uses allstates.dta & scheme vg_s2c

[Graph: two scatterplots side by side, both against Percent urban 1990; left y-axis: % homes cost $100K+ (0 to 100); right y-axis: % rents $700+/mo (0 to 40)]

graph combine scat1 scat2, ycommon
This graph is the same as the last one, except that the y-axes are placed on a
common scale by using the ycommon option. This makes it easy to compare the two
y-variables by forcing them to be on the same metric. Note that the ycommon
option does not work when the graphs have been made using different kinds of
commands, e.g., graph bar and graph box.
Uses allstates.dta & scheme vg_s2c

[Graph: the same two scatterplots side by side, both y-axes running from 0 to 100; x-axis: Percent urban 1990]
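
The two combinations above can be tried without the book's data files. The following is a minimal sketch, assuming the auto dataset that ships with Stata; the graph names sc1, sc2, and both are illustrative and do not come from the book.

```stata
* Sketch only: auto.dta and the names sc1, sc2, and both are assumptions.
sysuse auto, clear
twoway scatter mpg weight, name(sc1, replace)
twoway scatter trunk weight, name(sc2, replace)

* Side by side, each panel keeping its own y-axis scale.
graph combine sc1 sc2

* The same panels forced onto one y-axis scale, stored in memory as both.
graph combine sc1 sc2, ycommon name(both, replace)
```

Because graph combine accepts name() like other graph commands, the combined graph can itself be redisplayed later with graph display both.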

Let’s look at more detailed examples showing how we can combine graphs and at options
we can use in creating the graphs. The next set of examples uses the sp2001ts data file.


twoway rarea high low date, name(hilo)
We make a graph showing the high and low closing price of the S&P 500 for 2001
and save this graph in memory, naming it hilo.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: range area plot; x-axis: Date (1Jan01 to 1Jan02); y-axis: High price/Low price (900 to 1400)]

twoway spike volmil date, name(vol)
We can make another graph that shows the volume (millions of shares sold per
day) for 2001 and save this graph in memory, naming it vol.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: spike plot; x-axis: Date (1Jan01 to 1Jan02); y-axis: Volume (millions) (.5 to 2.5)]

graph combine hilo vol
We can now use the graph combine command to combine these two graphs into a
single graph. The graphs are displayed as a single row, but say that we would
like to display them in a single column.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: the hilo and vol graphs side by side in one row; x-axis of each: Date]


graph combine hilo vol, cols(1)
Using the cols(1) option, we can display the price above the volume. However,
because the x-axes of these two graphs are scaled the same, we could save space
and remove the x-axis scale from the top graph.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: hilo stacked above vol in one column; each panel has its own Date axis; y-axes: High price/Low price (top), Volume (millions) (bottom)]

twoway rarea high low date, xscale(off) name(hilo, replace)
Here, we use the xscale(off) option to suppress the display of the x-axis,
including the space that would be allocated for the labels. We name this graph
hilo again but need to use the replace option to replace the existing graph
named hilo.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: range area plot with no x-axis shown; y-axis: High price/Low price (900 to 1400)]

graph combine hilo vol, cols(1)
We combine these two graphs; however, we might want to push the graphs a bit
closer together.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: hilo (no x-axis) stacked above vol; x-axis of bottom panel: Date; y-axes: High price/Low price (top), Volume (millions) (bottom)]


graph combine hilo vol, cols(1) imargin(b=1 t=1)
Here, we use the imargin(b=1 t=1) option to make the margins at the top and
bottom of the graphs very small before combining them. However, we might want
the lower graph of volume to be smaller.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: hilo stacked directly above vol with little space between them; x-axis: Date; y-axes: High price/Low price (top), Volume (millions) (bottom)]

twoway spike volmil date, ylabel(1 2) fysize(25) name(vol, replace)
Using the fysize() (force y size) option makes the graph 25% of its normal
size. We use this instead of ysize() because the graph combine command does not
respect the ysize() or xsize() options. For aesthetics, we also reduce the
number of labels. We save this graph in memory, replacing the existing graph
named vol.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: short spike plot; x-axis: Date; y-axis: Volume (millions), labeled at 1 and 2]

graph combine hilo vol, cols(1) imargin(b=1 t=1)
We combine these graphs again, and the combined graph looks pretty good. We
might further tinker with the graph, changing the xtitle() for the volume graph
to be shorter or modifying the xlabel() for the volume graph.
Uses sp2001ts.dta & scheme vg_s2c

[Graph: tall hilo panel stacked above a short vol panel; x-axis: Date; y-axes: High price/Low price (top), Volume (millions) (bottom)]
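
The stacking steps on the last few pages can be collected into one short sketch. It assumes the sp500 dataset that ships with Stata (with variables date, high, low, and volume) in place of the book's sp2001ts file, and the graph names top, bottom, and stacked are illustrative.

```stata
* Sketch only: sp500.dta and the graph names are assumptions, not the book's.
sysuse sp500, clear

* Upper panel: price range, with the x-axis suppressed to save space.
twoway rarea high low date, xscale(off) name(top, replace)

* Lower panel: volume, forced to 25% of its normal height, with few y labels.
twoway spike volume date, ylabel(#2) fysize(25) name(bottom, replace)

* Stack the panels in one column with small top and bottom margins.
graph combine top bottom, cols(1) imargin(b=1 t=1) name(stacked, replace)
```

The fysize() option is set on the component graph rather than on graph combine, which is why the lower panel must be redrawn before combining.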


11.4 Putting it all together, more examples

Most of the examples in this book have focused on the impact of a single option or a
small number of options, using datasets that required no manipulation prior to making the
graph. In reality, many graphs use multiple options together, and some require prior data
management. This section addresses this issue by showing some examples that combine
numerous options and require some data manipulation before making the graph.

twoway (scatter urban pcturban80) (function y=x, range(30 100)),
xtitle(Percent Urban 1980) ytitle(Percent Urban 1990)
legend(order(2 "Line where % Urban 1980 = % Urban 1990") pos(6) ring(0))
This graph shows the percentage of population living in an urban area of a
state in 1990 against that of 1980. If there had been no changes from 1980 to
1990, the values would fall along a 45-degree line, where the value of y equals
the value of x. Overlaying (function y=x), we can see any discrepancies from
1980 to 1990. The range(30 100) option makes the line span from 30 to 100 on
the x-axis.
Uses allstates.dta & scheme vg_s2c

[Graph: scatterplot with a 45-degree reference line; x-axis: Percent Urban 1980; y-axis: Percent Urban 1990; legend inside the plot at the bottom: Line where % Urban 1980 = % Urban 1990]

twoway (lfitci ownhome borninstate) (lfitci ownhome borninstate,
ciplot(rline) blcolor(blue) blwidth(thick) blpattern(dash))
(scatter ownhome borninstate), legend(off) ytitle("% Own Home")
This example shows how we can make a scatterplot, a regression line, and a
confidence interval for the fit shown as an area. We also add a thick, blue,
dashed line showing the upper and lower confidence limits. The first lfitci
makes the fit line and area; the second lfitci makes a thick, blue, dashed
outline for the area; and scatter overlays the scatterplot.
Uses allstates.dta & scheme vg_s2c

[Graph: scatterplot with fit line and shaded confidence band outlined by dashed lines; x-axis: % born in state of residence; y-axis: % Own Home]


twoway scatter ownhome borninstate,
by(nsw, hole(1) title("%Own home by" "%born in St." "by region",
pos(11) ring(0) width(65) height(35) justification(center)
alignment(middle)) note(""))
The hole(1) option leaves the first position empty when creating the graphs,
and the title is placed there using pos(11) and ring(0). We use width() and
height() to adjust the size of the textbox and justification() and alignment()
to center the textbox horizontally and vertically. The note("") option
suppresses the note in the bottom corner of the graph.
Uses allstates.dta & scheme vg_s2c

[Graph: scatterplots in panels North, South, and West, with the title "%Own home by %born in St. by region" in the empty first position; x-axis: % born in state of residence; y-axis: % who own home]

twoway (rspike hi low date) (rcap close close date, msize(medsmall)),
tlabel(08jan2001 01feb2001 21feb2001) legend(off)
Before making this high/low/close graph, we first type tsset date, daily to
tell Stata that date should be treated as a date in the tlabel() option. The
rcap command uses close for both the high and the low values, making the tick
line for the closing price, and the legend(off) option suppresses the legend.
Using the vg_samec scheme makes the spikes and closes the same color.
Uses spjanfeb2001.dta & scheme vg_samec

[Graph: high/low/close plot; x-axis: Date, labeled 08jan2001, 01feb2001, 21feb2001; y-axis: 1200 to 1400]


twoway (rspike hi low date) (rcap close close date, msize(medsmall))
(scatteri 1220 15027 1220 15034, recast(line) clwid(vthick) clcol(red)),
tlabel(08jan2001 01feb2001 21feb2001) legend(off)
This example is the same as above, except that this one uses scatteri to draw a
support-level line. Two y x pairs are given after the scatteri, and the
recast(line) option draws them as a line instead of two points. The x-values
were calculated beforehand using display d(21feb2001) and display d(28feb2001)
to compute the elapsed date values.
Uses spjanfeb2001.dta & scheme vg_samec

[Graph: high/low/close plot with a thick red horizontal line at 1220 from 21feb2001 to 28feb2001; x-axis labeled 08jan2001, 01feb2001, 21feb2001; y-axis: 1200 to 1400]
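
The elapsed-date arithmetic can be checked directly, and the same line can be drawn without typing the numbers by hand. This is a minimal sketch, assuming the sp500 dataset that ships with Stata instead of spjanfeb2001; td() is the date-literal function, and lwidth() and lcolor() are used as line options here in place of the abbreviated clwid() and clcol() above.

```stata
* Elapsed dates count days since 01jan1960.
display td(21feb2001)    // 15027
display td(28feb2001)    // 15034

* Sketch only: sp500.dta stands in for spjanfeb2001.dta (an assumption).
sysuse sp500, clear
twoway (rspike high low date if date <= td(28feb2001))                ///
       (scatteri 1220 `=td(21feb2001)' 1220 `=td(28feb2001)',         ///
            recast(line) lwidth(vthick) lcolor(red)),                  ///
       legend(off)
```

Writing `=td(21feb2001)' lets Stata substitute the elapsed value when the command is read, so the date stays legible in the command itself.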

The rest of the examples in this section involve some data management before we create
the graph. For the next few examples, we use the allstates data file and run a regression
command,
. vguse allstates
. regress ownhome propval100 workers2 urban

and then issue the
. dfbeta

command, creating DFBETAs for each predictor: DFpropval100, DFworkers2, and DFurban,
which are used in the following graph.

twoway dropline DFpropval100 DFworkers2 DFurban statefips,
mlabel(stateab stateab stateab)
In this example, we show each of the DFBETAs as a dropline plot. We add the
mlabel() option to label each point with the state abbreviation.
Uses allstates.dta & scheme vg_s2c

[Graph: dropline plot with every point labeled by state abbreviation; x-axis: state code (0 to 60); y-axis: -1 to 1; legend: Dfbeta propval100, Dfbeta workers2, Dfbeta urban]
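
Readers on a recent Stata release may find that dfbeta names its new variables differently from the DFvarname pattern used above; as far as we are aware, current releases default to names such as _dfbeta_1. The sketch below, which assumes the shipped auto dataset and an illustrative id variable, uses the stub() option so the names are predictable.

```stata
* Sketch only: auto.dta, the model, and the id variable are assumptions.
sysuse auto, clear
regress price mpg weight

* stub(DF) names the new variables DF1, DF2, one per predictor in model order.
dfbeta, stub(DF)
describe DF*

* Index plot of both DFBETAs as droplines against the observation number.
generate id = _n
twoway dropline DF1 DF2 id
```

The variable labels shown by describe indicate which predictor each DFBETA belongs to.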


twoway (dropline DFpropval100 id if abs(DFpropval100)>.25, mlabel(stateab))
(dropline DFworkers2 id if abs(DFworkers2)>.25, mlabel(stateab))
(dropline DFurban id if abs(DFurban)>.25, mlabel(stateab))
This example is similar to the one above but simplifies the graph by showing
only the points where the absolute value of the DFBETA exceeds .25. Note that
we have taken the example from above and converted it into three overlaid
dropline plots, each of which has an if condition.
Uses allstates.dta & scheme vg_s2c

[Graph: dropline plot showing only the labeled states with large DFBETAs (among them AK, CT, DC, HI, MN, NJ, NV, UT); x-axis: id (0 to 50); y-axis: -1 to 1; legend: Dfbeta propval100, Dfbeta workers2, Dfbeta urban]

Before making the next graph, we need to issue three predict commands to
generate variables that contain the Cook’s distance, the studentized residual,
and the leverage based on the previous regression command:

. predict cd, cook
. predict rs, rstudent
. predict l, leverage

We are then ready to run the next graph.

twoway (scatter rs id) (scatter rs id if abs(rs) > 2, mlabel(stateab)),
legend(off)
This graph uses scatter rs id to make an index plot of the studentized
residuals. It also overlays a second scatter command with an if condition
showing only studentized residuals that have an absolute value exceeding 2 and
showing the labels for those observations. Using the vg_samec scheme makes the
markers the same for both scatter commands.
Uses allstates.dta & scheme vg_samec

[Graph: index plot with AK and DC labeled; x-axis: id (0 to 50); y-axis: Studentized residuals (-6 to 2)]


twoway (scatter rs id, text( -3 27 "Possible Outliers", size(vlarge)))
(scatteri -3 18 -4.8 10, recast(line))
(scatteri -3 18 -3 3, recast(line)), legend(off)
This graph is similar to the one above but uses the text() option to add text
to the graph. Two scatteri commands are used to draw a line from the text
Possible Outliers to the markers for those points. The y x coordinates are
given for the starting and ending positions, and recast(line) makes scatteri
behave like a line plot, connecting the points to the text.
Uses allstates.dta & scheme vg_s2c

[Graph: index plot of studentized residuals with the text "Possible Outliers" joined by lines to two outlying markers; x-axis: 0 to 50; y-axis: -6 to 2]

twoway (scatter rs l [aw=cd], msymbol(Oh))
(scatter rs l if cd > .1, msymbol(i) mlabel(stateab) mlabpos(0))
(scatter rs l if cd > .1, msymbol(i) mlabel(cd) mlabpos(6)), legend(off)
This graph shows the leverage-versus-studentized residuals, weighting the
symbols by Cook’s D (cd). We overlay it with a scatterplot showing the marker
labels if cd exceeds .1, with the cd value placed underneath.
Uses allstates.dta & scheme vg_s2c

[Graph: weighted scatterplot with labels CT (.1235647), AK (.1903994), and DC (.6812371); x-axis: Leverage (0 to .4); y-axis: Studentized residuals (-6 to 2)]
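
The predict-then-plot pattern behind these diagnostic graphs can be sketched on data that ships with Stata. Everything below is illustrative: the auto dataset, the model, and the choice to label the three observations with the largest Cook's distance (rather than those above a cutoff) are assumptions made so the sketch always has points to label.

```stata
* Sketch only: auto.dta and the regression model are assumptions.
sysuse auto, clear
regress price mpg weight

* Influence statistics from the last regression.
predict cd, cooksd
predict rs, rstudent
predict l, leverage

* Sort so the three most influential observations come first.
gsort -cd

* Markers sized by Cook's D, with invisible markers carrying the labels.
twoway (scatter rs l [aw=cd], msymbol(Oh))                             ///
       (scatter rs l in 1/3, msymbol(i) mlabel(make) mlabpos(6)),      ///
       legend(off)
```

Using msymbol(i) for the second plot keeps the weighted circles from being overdrawn while still attaching the labels to the same coordinates.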

Imagine that we have a data file called comp2001ts that contains variables representing
the stock prices of four hypothetical companies: pricealpha, pricebeta, pricemu, and
pricesigma, as well as a variable date. To compare the performance of these companies,
let’s make a line plot for each company and stack them. We can do this using twoway
tsline with the by(compname) option, but we first need to reshape the data into a long
format. We do so with the following commands:
. vguse comp2001ts
. reshape long price, i(date) j(compname) string

We now have variables price and compname and can graph the prices by compname.
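
To see what the reshape step does, it can be tried on a toy dataset. The three dates and two companies below are made up for illustration and are not taken from comp2001ts.

```stata
* Sketch only: a tiny made-up wide dataset with two price variables.
clear
input date pricealpha pricebeta
1 10 20
2 11 19
3 12 21
end

* Wide to long: the text after "price" in each variable name becomes compname.
reshape long price, i(date) j(compname) string
list, sepby(date)

* One stacked panel per company, each with its own y-axis scale.
twoway line price date, by(compname, cols(1) yrescale)
```

After reshaping there is one row per date and company, with compname holding the strings alpha and beta.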


twoway tsline price, by(compname, cols(1) yrescale note("") compact)
ylabel(#2, nogrid) subtitle(, pos(5) ring(0) nobexpand nobox color(red))
title(" ", box width(130) height(.001) bcolor(ebblue))
We graph price for the different companies with the by() option. Further,
cols(1) puts the graphs in one column. yrescale and ylabel(#2) allow the y-axes
to be scaled independently and labeled with about 2 values. The subtitle()
option puts the name of the company in the bottom right corner of each graph.
The title() option creates an empty title that is thin, wide, and blue.
Combined with the compact option, the title creates a border between the
graphs.
Uses comp2001ts.dta & scheme vg_s2c

[Graph: four stacked line plots in panels alpha, beta, mu, and sigma; x-axis: Date (1Jan01 to 1Jan02); y-axis: price]

For the next graph, we want to create a bar chart that shows the mean of wages
by occupation with error bars showing a 95% confidence interval for each mean.
To do this, we first collapse the data across the levels of occupation,
creating the mean, standard deviation, and count. Next, we create the variables
wageucl and wagelcl, which are the upper and lower confidence limits, as shown
below.

. vguse nlsw
. collapse (mean) mwage=wage (sd) sdwage=wage (count) nwage=wage, by(occ7)
. generate wageucl = mwage + invttail(nwage,0.025)*sdwage/sqrt(nwage)
. generate wagelcl = mwage - invttail(nwage,0.025)*sdwage/sqrt(nwage)

After this, we are ready to execute the following command:

twoway (bar mwage occ7, barwidth(.5))
(rcap wageucl wagelcl occ7, blwid(medthick) blcolor(navy) msize(large)),
xlabel(1(1)7, valuelabel noticks) xscale(range(.5 7.5))
This bar chart is overlaid with a range plot showing the upper and lower
confidence limits. The xlabel() option labels the values from 1 to 7,
incrementing by 1. The valuelabel option indicates that the value labels for
occ7 will be used to label the x-axis. The xscale() option adds a margin to the
outer bars, and the barwidth() option creates the gap between the bars.
Uses nlsw.dta & scheme vg_s2c

[Graph: bar chart with capped error bars; x-axis: Occupation recoded into 7 categories (Prof, Mgmt, Sales, Cler., Operat., Labor, Other); y-axis: 4 to 12; legend: (mean) wage, wageucl/wagelcl]
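
The collapse-then-plot pattern used for the error-bar chart on this page can be sketched with data that ships with Stata. The auto dataset, the variable names, and the grouping by foreign are assumptions for illustration. The sketch also uses n-1 degrees of freedom in invttail(), a common convention, whereas the commands above pass the count itself.

```stata
* Sketch only: auto.dta and all variable names here are assumptions.
sysuse auto, clear

* One row per group holding the mean, standard deviation, and count.
collapse (mean) mprice=price (sd) sdprice=price (count) nprice=price, by(foreign)

* 95% confidence limits for each group mean.
generate ucl = mprice + invttail(nprice-1, 0.025)*sdprice/sqrt(nprice)
generate lcl = mprice - invttail(nprice-1, 0.025)*sdprice/sqrt(nprice)

* Bars first, then the capped range plot drawn on top of them.
twoway (bar mprice foreign, barwidth(.5))                              ///
       (rcap ucl lcl foreign),                                         ///
       xlabel(0 1, valuelabel noticks) xscale(range(-.5 1.5)) legend(off)
```

Drawing the bars before the range plot keeps the whole error bar visible, since later plots are drawn on top of earlier ones.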


twoway (rcap wageucl wagelcl occ7, blwidth(medthick) msize(large))
(bar mwage occ7, barwidth(.5) bcolor(navy)),
xlabel(1(1)7, valuelabel noticks) xscale(range(.5 7.5))
This graph is similar to the previous graph, except that we have reversed the
order of the commands, placing the rcap command first, followed by the bar
command. As a result, only the top half of the error bar is shown. As in the
previous example, the xlabel() option determines the labels on the x-axis.
Uses nlsw.dta & scheme vg_s2c

[Graph: bar chart with only the upper half of each error bar visible; x-axis: Occupation recoded into 7 categories (Prof, Mgmt, Sales, Cler., Operat., Labor, Other); y-axis: 4 to 12; legend: wageucl/wagelcl, (mean) wage]

Suppose that we wanted to show the mean wages with confidence intervals broken down
by occupation and whether one graduated college. We use the collapse command to create
the mean, standard deviation, and count by the levels of occ7 and collgrad, and then we
create the upper and lower confidence limits. Finally, the separate command makes separate variables for mwage based on whether one graduated college, creating mwage0 (wages for
noncollege grad) and mwage1 (wages for college grad). These commands are shown below,
followed by the command to create the graph.
. vguse nlsw
. collapse (mean) mwage=wage (sd) sdwage=wage
(count) nwage=wage, by(occ7 collgrad)
. generate wageucl = mwage + invttail(nwage,0.025)*sdwage/sqrt(nwage)
. generate wagelcl = mwage - invttail(nwage,0.025)*sdwage/sqrt(nwage)
. separate mwage, by(collgrad)

twoway (line mwage0 mwage1 occ7) (rcap wageucl wagelcl occ7),
xlabel( 1(1)7, valuelabel) xtitle(Occupation) ytitle(Wages)
legend(order(1 "Not College Grad" 2 "College Grad"))
Here, we make a line graph showing the mean wages for the noncollege graduates,
mwage0, and the college graduates, mwage1, by occupation. We overlay that with
a range plot showing the confidence interval. The xlabel() option labels the
x-axis with value labels, and the legend() option labels the legend.
Uses nlsw.dta & scheme vg_s2c

[Graph: two lines with capped confidence intervals; x-axis: Occupation (Prof, Mgmt, Sales, Cler., Operat., Labor, Other); y-axis: Wages (0 to 15); legend: Not College Grad, College Grad]


This next graph shows a kind of scatterplot of the mean and confidence interval for
union and collgrad for each level of occ7. To do this, we collapse the data file by occ7
and use those summary statistics to compute the confidence intervals below, followed by
the command to create the graph.
. vguse nlsw
. collapse (mean) pct_un=un pct_coll=collgrad
(sd) sd_un=union sd_coll=collgrad
(count) ct_un=union ct_coll=collgrad, by(occ7)
. gen lci_un = pct_un - sd_un/sqrt(ct_un)
. gen uci_un = pct_un + sd_un/sqrt(ct_un)
. gen lci_coll = pct_coll - sd_coll/sqrt(ct_coll)
. gen uci_coll = pct_coll + sd_coll/sqrt(ct_coll)

twoway (rcap lci_coll uci_coll pct_un) (rcap lci_un uci_un pct_coll, hor)
(sc pct_coll pct_un, msymbol(i) mlabel(occ7) mlabpos(10) mlabgap(5)),
ylabel(0(.2).7) xtitle(% Union) ytitle(% Coll Grads) legend(off)
title("% Union and % college graduates" "(with CIs) by occupation")

The overlaid rcap commands show the
confidence intervals for both union and
collgrad for each occupation. The
scatter command uses an invisible
marker and labels each occupation at
the 10 o’clock position with a larger
gap than normal.
Uses .dta & scheme vg_s2c

[Figure: "% Union and % college graduates (with CIs) by occupation"; x-axis "% Union" (0 to .4), y-axis "% Coll Grads" (0 to .6); points labeled Prof, Mgmt, Sales, Cler., Operat., Labor, Other, each with horizontal and vertical capped intervals.]

This section concludes with a graph adapted from an example on the Stata web site.
The graph combines numerous tricks, so rather than show it all at once, let’s build it up a
piece at a time. Below is the ultimate graph we would like to create. It shows the population
(in millions) for males and females in 17 different age groups, ranging from “Under 5” up
to “80 to 84”. The blue bar represents the males, and the red bar represents the females.


graph display
This is the graph that we wish to
create. For now, we simply use the
graph display command to display
the graph. Because this is displayed
using the s2color scheme, the size of
the text is not enlarged as in the other
vg schemes, so the text may be hard to
read.
Uses pop2000mf.dta & scheme s2color

[Figure: horizontal bar chart of population in millions (x-axis labeled 12, 8, 4, 4, 8, 12) for 17 age groups from "Under 5" to "80 to 84", with Male bars extending to the left and Female bars extending to the right.]

To build this graph, we first use the data file pop2000mf, which contains 17 observations corresponding to 17 age groups (for example, “Under 5”, “5 to 9”, “10 to 14”, and so
forth). The variables femtotal and maletotal contain the number of females and males in
each age group. After using the file, we create femmil, which is the number of females in
millions, and malmil, which is the number of males in millions, made negative so
that the male (blue) bar will be scaled in the negative direction. We also make a variable
zero, which contains 0 for all observations.
. vguse pop2000mf
. gen femmil = femtotal/1000000
. gen malmil = -maletotal/1000000
. gen zero = 0
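Before graphing, it can help to confirm that the new variables have the signs we expect. The following is only a sketch, assuming pop2000mf has been loaded and the three gen commands above have been run, and that the population totals contain no missing values (an assumption, not something stated in the text).

```stata
* Sanity checks on the variables created above (illustrative sketch)
assert malmil <= 0          // male values were negated
assert femmil >= 0          // female values remain positive
assert zero == 0            // constant used later for the x-position of labels
summarize malmil femmil     // the observed range suggests sensible xlabel() values
```

If any assert fails, Stata stops with an error, which points to a problem in the data or in the gen commands rather than in the graph command.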

We now take the first step in making this graph.

twoway (bar malmil agegrp) (bar femmil agegrp)

This is our first attempt to make this
graph by overlaying the bar chart for
the males with the bar chart for the
females. The agegrp variable ranges
from 1 to 17 and forms the x-axis, but
we can rotate this as shown in the next
example.
Uses pop2000mf.dta & scheme s2color

[Figure: vertical bars for malmil (negative) and femmil (positive); x-axis "Age category" (0 to 20), y-axis −10 to 10; legend: malmil, femmil.]


twoway (bar malmil agegrp, horizontal) (bar femmil agegrp, horizontal)

Adding the horizontal option to each
bar chart, we can see the graph taking
shape. However, we would like the age
categories to appear inside of the red
(female) bars.
Uses pop2000mf.dta & scheme s2color

[Figure: horizontal bars for malmil (negative) and femmil (positive); y-axis "Age category" (0 to 20), x-axis −10 to 10; legend: malmil, femmil.]

twoway (bar malmil agegrp, horizontal) (bar femmil agegrp, horizontal)
(scatter agegrp zero, msymbol(i) mlabel(agegrp) mlabcolor(black))

This scatter command uses agegrp
(ranging from 1–17) as the y-value and
zero (0) for the x-value, leading to the
stack of 17 observations. Using the
msymbol(i) and mlabel() options
suppresses the symbol but displays the
name of the age group from the labeled
value of agegrp. Next, we will fix the
label and title for the x-axis.
Uses pop2000mf.dta & scheme s2color

[Figure: the same horizontal bars with the age-group labels ("Under 5" to "80 to 84") stacked at x = 0; x-axis −10 to 10; legend: malmil, femmil, Age category.]

twoway (bar malmil agegrp, horizontal) (bar femmil agegrp, horizontal)
(scatter agegrp zero, msymbol(i) mlabel(agegrp) mlabcolor(black)),
xlabel(-12 "12" -8 "8" -4 "4" 4 8 12) xtitle("Population in millions")

We use the xlabel() to change −12 to
12, −8 to 8, −4 to 4, and to label the
positive side of the x-axis as 4, 8, and
12. We also add a title for the x-axis.
Next, let’s fix the y-axis and the legend.
Uses pop2000mf.dta & scheme s2color

[Figure: the same graph with the x-axis labeled 12, 8, 4, 4, 8, 12 and titled "Population in millions"; legend: malmil, femmil, Age category.]


twoway (bar malmil agegrp, horizontal) (bar femmil agegrp, horizontal)
(scatter agegrp zero, msymbol(i) mlabel(agegrp) mlabcolor(black)),
xlabel(-12 "12" -8 "8" -4 "4" 4 8 12) xtitle("Population in millions")
yscale(off) ylabel(, nogrid) legend(order(1 "Male" 2 "Female"))
We suppress the display of the y-axis
using the yscale(off) option and
suppress the grid lines with the
ylabel(, nogrid) option. Finally, we
use the legend() option to label the
bars and suppress the display of the
third symbol in the legend.
Uses pop2000mf.dta & scheme s2color

[Figure: the finished graph, with age-group labels from "Under 5" to "80 to 84", x-axis "Population in millions" labeled 12, 8, 4, 4, 8, 12, no y-axis, and a legend showing Male and Female.]

11.5

Common mistakes

This section discusses mistakes that are frequently made when creating Stata graphs.
Using Stata 7 syntax
No matter how long we have been using Stata 8 (or later), we might revert to old
habits and type a graph command in Stata 7 style. Consider this example:
. graph propval100 rent700

Stata replies with this error message:
propval100graph_g.new rent700: class member function not found
r(4023);

Clearly, the easiest solution is to convert the command to the proper Stata 8 syntax.
Commas with graph options
With Stata 8, graph options can accept their own options (sometimes referred to as
suboptions); for example,
. twoway scatter propval100 popden rent700, xtitle("My Title", box)

Note that the xtitle() option allows us to specify the x-title followed by a comma and
a further suboption that places a box around the x-title. If we had been content with the
existing x-title, we could have issued this command:
. twoway scatter propval100 popden rent700, xtitle( , box)


The box option places a box around the title. Note that we place a comma before the
box option. Now, suppose that we are content with the existing legend but wish to make
the legend display in a single column.
. twoway scatter propval100 popden rent700, xtitle( , box) legend(cols(1))

Based on the syntax from the title() option, we might have been tempted to have
typed legend( , cols(1)), but that would have led to an error. Some options, like the
legend() option, simply take a list of options with no comma permitted.
Using options in the wrong context
Consider the example below. Our goal is to move the labels for the x-axis from their
default position at the bottom of the graph to the alternate position at the top of the graph.
. twoway scatter propval100 rent700, xlabel( , alternate)

This command executes, but it does not have the desired effect. Instead, it staggers the
labels of the x-axis, alternating between the upper and lower positions. In this context,
the alternate option means something different than we had intended. What we really
wanted to specify was xscale(alternate):
. twoway scatter propval100 rent700, xscale(alternate)

This command moves the entire scale of the x-axis to the alternate position and has the
desired effect. Another mistake we might have made was to put the alternate option as
an overall option. This command is shown below with the result:
. twoway scatter propval100 rent700, alternate
option alternate not allowed
invalid syntax
r(198);

In this case, we are half-right. There is an option alternate, but we have used it in the
wrong context, yielding the syntax error. In such cases, remember that the option we are
specifying may be right, but we just need to put it into the right context.
Options appear to have no effect
When we add an option to a graph, we generally expect to see the effect of adding the
option. However, sometimes adding an option has no effect. Consider this example:
. twoway scatter propval100 rent700, mlabpos(12)

This command executes, but nothing changes as a result of including the mlabpos(12)
option, which would change the position of the marker labels to the 12 o’clock position.
There are no marker labels in the graph, so adding this option has no effect. We would have
to use the mlabel() option to add marker labels before we saw the effect of this option.
Consider another example, which is a bit more subtle. We would like to make the line
(periphery) of the marker thick. When we run the following command, we do not see any
effect from adding the mlwidth(thick) option:
. twoway scatter propval100 rent700, mlwidth(thick)


The reason for this is that the marker has a line color and a fill color, and by default,
they are the same color, so it is impossible to see the effect of changing the thickness of the
line around the marker. However, if we make the line and fill colors different, as in the
following example, we can see the effect of the mlwidth() option:
. twoway scatter propval100 rent700, mlwidth(thick)
mlcolor(black) mfcolor(gs13)
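Another way to see the marker outline, offered here only as a sketch and not taken from the examples in the text, is to use a hollow marker symbol so that there is no fill to hide the line. This assumes a dataset containing propval100 and rent700 (such as allstates.dta) is already in memory.

```stata
* Illustrative alternative: a hollow circle has no fill color,
* so the thickness of the marker outline is visible directly
twoway scatter propval100 rent700, msymbol(Oh) mlwidth(thick)
```

The choice between the two approaches depends on whether we want filled markers with a contrasting outline or simply an outline.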

Options when using by()
Using the by() option changes the meaning of some options. Consider the following
example:
. twoway scatter propval100 rent700, by(north) title(My title)

We might think that the title() option will provide an overall title for the graph, as it
would when the by() option is not included. However, actually, each graph will have “My
title” as the title; the graph as a whole will not. Instead, to provide an overall title for the
graph, we would specify the command this way:
. twoway scatter propval100 rent700, by(north, title(My title))

When using the legend() option combined with the by() option, we should place
options that affect the position of the legend within the by() option. Consider this example:
. twoway scatter propval100 popden rent700,
by(north, legend(pos(12))) legend(cols(1))

Here, the legend(pos(12)) option controls the position of the legend, placing it at
the 12 o’clock position, so we place it within the by() option. On the other hand, the
legend(cols(1)) option does not affect the position of the legend, so we place it outside
of the by() option. For more details on this, see Options : By (272).
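The two points above can be combined in one command. The sketch below assumes the variables used in the text (propval100, popden, rent700, and north) are in the dataset in memory; the /// line continuations assume the command is run from a do-file rather than typed interactively.

```stata
* Overall title and legend position go inside by();
* the number of legend columns goes outside by()
twoway scatter propval100 popden rent700,          ///
    by(north, title("My title") legend(pos(12)))   ///
    legend(cols(1))
```

Moving title() outside of by() in this command would instead repeat the title on each of the individual graphs, as described above.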
Altering the wrong axis
When we use multiple x- or y-axes, it is easy to modify the wrong axis. Consider this
example:
. twoway (scatter propval100 ownhome)
(scatter rent700 ownhome, yaxis(2) ytitle(Rents over 700))

We might think that the ytitle() option will change the title for the second y-axis,
but it will actually change the first axis. Because ytitle() is an option that concerns the
overall graph, we should place it at the very end of the graph command, as shown below.
. twoway (scatter propval100 ownhome)
(scatter rent700 ownhome, yaxis(2)), ytitle(Rents over 700, axis(2))

Note that we use the axis(2) option to indicate that ytitle() should be modified for the
second y-axis.
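When a graph has two y-axes, it may be clearest to title both axes explicitly so that neither is modified by accident. The following is a sketch in do-file form; the title text for the first axis is made up for illustration, and the variables are assumed to be in the dataset in memory.

```stata
* Title each y-axis explicitly by naming the axis in ytitle()
twoway (scatter propval100 ownhome)                      ///
       (scatter rent700 ownhome, yaxis(2)),              ///
       ytitle("Property values over 100K", axis(1))      ///
       ytitle("Rents over 700", axis(2))
```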


When all else fails
I hope that, by describing these errors, I can help you avoid some common errors. Here
are some additional ideas and resources to help you when you are struggling:
• Build graphs slowly. Rather than trying to make a final graph all at once, try
to build the graph slowly adding, one option at a time. This is illustrated in Intro : Building graphs (29), where we took a complex graph and built it one piece at a
time. Building slowly helps us isolate problems to a particular option, which we can
then further investigate.
• When possible, model graphs from existing examples. This book strives to provide
examples to model from. For additional online examples, see Appendix : Online supplements (382) for the companion web site for the book, which links to additional
examples.
• For more detailed information about the syntax of Stata graphics, see [G] graph.
Please remember that some of the graph commands available in Stata were added
after the printing of [G] graph but are documented via the help graph command.
See also Appendix : Online supplements (382), which has links to the online help that
are organized according to the table of contents of this book.
• Reach out to fellow Stata users, either local friends, friends at Statalist, or friends at
Stata tech support. See http://www.stata.com/support/ for more details.

11.6

Customizing schemes

This section shows how to customize your own schemes. Although schemes can look
complicated, it is possible to easily create some simple schemes on our own. Let’s look at
the vg_lgndc scheme as an example. This scheme is based on the s2color scheme but
changes the legend to display at the 9 o’clock position, in a single column, with the keys
stacked on top of the symbols. Here are the contents of that scheme:
#include s2color // start with the s2color scheme
clockdir legend_position 9         // put the legend in the 9 o’clock position
numstyle legend_cols     1         // make the legend display in 1 column
yesno    legend_stacked  yes       // stack the keys & symbols on top of each other
gsize    legend_key_gap  half_tiny // very, very small gap between key and label
gsize    legend_row_gap  small     // somewhat larger gap between key/label pairs

Rather than creating the vg_lgndc scheme from scratch, which would be very laborious,
we use the #include s2color statement to base this new scheme on the s2color scheme.
The subsequent statements change the position of the legend and the number of columns
in the legend and stack the legend keys and symbols upon each other.
Say that we liked the vg_lgndc scheme but wanted to make our own version in which
the legend is in the 3 o’clock position instead of the 9 o’clock position, naming our version
legend3. To do this, we would start the Stata do-file editor, for example, by typing doedit
and then type the following into it: (Of course, the scheme will work fine if we omit
comments after the double slashes.)

#include s2color // start with the s2color scheme
clockdir legend_position 3         // put the legend in the 3 o’clock position
numstyle legend_cols     1         // make the legend display in 1 column
yesno    legend_stacked  yes       // stack the keys & symbols on top of each other
gsize    legend_key_gap  half_tiny // very, very small gap between key and label
gsize    legend_row_gap  small     // somewhat larger gap between key/label pairs

We can then save the file as scheme-legend3.scheme, and we are ready to use it. We
can then use the scheme(legend3) option at the end of a graph command or type set
scheme legend3, and Stata will use that scheme for displaying our graph. Below, we show
an example using this scheme. (Note that the legend3 scheme is not included among the
downloadable schemes.)
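As an alternative to typing the scheme into the do-file editor, the same file could be written from Stata itself. This is only a sketch: it uses Stata's file commands to write the six scheme entries (named with underscores, as in legend_key_gap) into scheme-legend3.scheme in the current working directory, replacing any existing file of that name.

```stata
* Sketch: write scheme-legend3.scheme into the current working directory
tempname fh
file open `fh' using "scheme-legend3.scheme", write text replace
file write `fh' "#include s2color" _n
file write `fh' "clockdir legend_position 3" _n
file write `fh' "numstyle legend_cols     1" _n
file write `fh' "yesno    legend_stacked  yes" _n
file write `fh' "gsize    legend_key_gap  half_tiny" _n
file write `fh' "gsize    legend_row_gap  small" _n
file close `fh'
type scheme-legend3.scheme      // display the file to confirm its contents
```

Writing the file from a do-file makes the scheme easy to regenerate, for example, when trying out a different legend position.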

twoway (scatter propval100 rent700) (lfit propval100 rent700),
scheme(legend3)

Here, we see an example using our
newly created legend3 scheme, and
indeed, we see the legend in the 3
o’clock position, in a single column,
with the legend stacked.
Uses allstates.dta & scheme legend3

[Figure: scatterplot of "% homes cost $100K+" (y-axis, 0 to 100) against "% rents $700+/mo" (x-axis, 0 to 40) with a line of fitted values; the legend appears at the 3 o’clock position in a single column.]

So far, things are going great. However, note that Stata will only know how to find
the newly created scheme-legend3.scheme while we are working in the directory where we
saved that scheme. If we change to a different directory, Stata will not know where to find
scheme-legend3.scheme. If, however, we save the scheme into our PERSONAL directory,
Stata would know where to find it regardless of the directory we were in. For example, on
my computer, I used the sysdir command, and it showed me the following information:
. sysdir
   STATA:  C:\Stata8\
 UPDATES:  C:\Stata8\ado\updates\
    BASE:  C:\Stata8\ado\base\
    SITE:  C:\Stata8\ado\site\
    PLUS:  c:\ado\plus\
PERSONAL:  c:\ado\personal\
OLDPLACE:  c:\ado\

From this, I know that my PERSONAL directory is located in c:\ado\personal\, so if I
store either .ado files or .scheme files there, Stata will be able to find them. So, if instead

of saving scheme-legend3.scheme into the current directory, we save it into our PERSONAL
directory, Stata will be able to find it. (If we have already saved scheme-legend3.scheme
to the current directory and also save it to the PERSONAL directory, we may want to remove
the copy from the current directory.)
So far, this section has really focused on the mechanics of creating a scheme but has not
said much about the possible content that could be placed inside a scheme. This is beyond
the scope of this little introduction, but here are three other places where you can find this
kind of information:
First, the help for schemes via help schemes will tell us about schemes in general. Also,
help scheme files contains documentation about scheme files and what we can change
using schemes.
Second, looking at other schemes can help us find ideas, for example, the downloaded
schemes for this book (see Appendix : Online supplements (382)). Say that we wanted to look
at the vg_rose scheme. We could type which scheme-vg_rose.scheme, and that would
tell us where that scheme is located. Then, we could use any editor (including the do-file
editor) to view that scheme for ideas.
Third, we can look at the built-in Stata schemes, such as s1color, s2color, or economist.
Looking at these schemes shows us the menu of items that we can fiddle with in our own
schemes, but these schemes should never be modified directly. We can use the strategy
outlined above where we make our own scheme and use #include to read in a scheme, and
then we can add our own statements to modify the scheme as desired.
Schemes that other people have created and the schemes built into Stata will contain
statements that control some aspect of a graph, but we may not know which aspect they
control. For example, in the vg_rose scheme there is the statement
color background eggshell

which obviously controls the color of some kind of background element, but we might not
be sure which element it controls. We can find out by making a copy of the scheme and
then changing eggshell to some other nonsubtle value, such as red, and then make a graph
using this new scheme (using scheme(schemename), not set scheme schemename). The
part of the graph that becomes red will indicate the part that is controlled by the color
background statement.
Of course, we have just scratched the surface of how to create and customize schemes.
However, this should provide the basic tools needed for making a basic scheme, storing
it in the personal directory, and then playing with the scheme. Because schemes are so
powerful, they can appear complicated, but if built slowly and methodically, the process
can be straightforward, logical, and, actually, quite a bit of fun.
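The step of moving a scheme into the PERSONAL directory can also be done from within Stata. The sketch below assumes that scheme-legend3.scheme is in the current working directory and that the PERSONAL directory already exists; it relies on c(sysdir_personal), which holds the PERSONAL path reported by sysdir.

```stata
* Sketch: copy the scheme into PERSONAL, then remove the local copy
copy "scheme-legend3.scheme" "`c(sysdir_personal)'scheme-legend3.scheme", replace
erase "scheme-legend3.scheme"      // optional: drop the copy in the current directory
which scheme-legend3.scheme        // report where Stata now finds the scheme
```

The which command at the end is the same check suggested in this section for locating a scheme file; it should report a path inside the PERSONAL directory.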


11.7

Online supplements

This book has a number of online resources associated with it. I encourage all readers
to take advantage of these online extras by visiting the web site for the book at
http://www.stata-press.com/books/vgsg.html
Resources on the web site include
• Programs and help files. You can easily download and install the programs and help
files associated with this book. To install these programs and help files, just type
. net from http://www.stata-press.com/data/vgsg
. net install vgsg

After installing the programs and help, type whelp vgsg for an overview of what has
been installed.
• Data files. All the data files used in the book are available at the web site for downloading. I encourage you to download the data files used in this book, play with these
examples, and try variations on your own to solidify and extend your understanding.
If you visit the website, you can download and save all the data files at once. You
can quickly download all the datasets into your current working directory from within
Stata by typing
. net from http://www.stata-press.com/data/vgsg
. net get vgsg

If you prefer, you can obtain any of the data files over the Internet with the vguse
command. Each example concludes by indicating the data file and scheme that was
used to make the graph. For example, a graph may conclude by saying
Uses allstates.dta & scheme vg_s2c
This indicates that you can type vguse allstates and Stata will download and use
the data file over the Internet for you (assuming that you have installed the programs).
• Schemes. This book uses a variety of schemes, and when you download the programs
and help files (see above), the schemes used in this book are downloaded as well,
allowing you to use them to reproduce the look of the graphs in this book.
• Hopefully, a very short or empty Errata will be found at the web site. Although I
have tried very hard to make this book true and accurate, I know that some errors
will be found, and they will be listed there.
• Links to the online Stata Graphics Reference Manual, which are organized according
to the structure of the table of contents of this book.
• Other resources that may be placed on the site after this book goes to press, so visit
the site to see what else may appear there.


Subject index
A
ac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
acprplot. . . . . . . . . . . . . . . . . . . . . . . . . . . .348
added-variable plot . . . . . . . . . . see avplot
adjacent lines . . . . . . . . . . . . . . . . see alsize
alsize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
alternate . . . . . . . . . . . . . . . . . . . . . . . . . . 171
alternate axes . . . . . . . . see axes, alternate
angle . . . . . . . . . . . . . . . . . . . . . . . . . . . 327–328
axis labels . . . . . . . . see xlabel() and
ylabel()
label . . . . . 31, 127, 145, 171, 182, 261
marker labels . . . . . . see mlabangle()
angle0(). . . . . . . . . . . . . . . . . . . . . . . . . . . .220
area() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
area graphs . . . . . . . . . . . . see twoway area
color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
horizontal . . . . . . . . . . . . . . . . . . . . . . . . 61
setting the base . . . . . . . . . . . . . . . . . . 62
shading . . . . . . . . . . . . . . . . . . . . . . . . . . 62
sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
ascategory . . 110, 125–126, 169–170, 203
aspect ratio . . . . . . . . . . . . . . . . . . . . . 323–324
asyvars . . 30–33, 115, 122, 123, 131–136,
159, 161–162, 175, 177–179, 194,
197, 205
augmented component-plus-residual plot
. . . . . . . . . see acprplot
avplot . . . . . . . . . . . 348, 354–355, 357–358
aweight . . . . . . . . . . . . . . . . . . . . . . . . . 37, 240
axes
alternate . . . . 126, 127, 146, 171, 204,
210, 266
bar graphs . . . . . . . . 123–130, 143–147
base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
box plots. . . . . . . . . .168–174, 179–183
categorical
bar graphs . . . . . . . . . . . . . . . 123–130
box plots. . . . . . . . . . . . . . . . .168–174
dot plots . . . . . . . . . . . . . . . . . 202–204
titles . . . . . . . . . . . . . . . . . . . . . . . . . . 31

axes, continued
displaying for multiple graphs . . . 277
dot plots . . . . . . . . . . 202–204, 207–210
label gap. . . . . . . . . . . . . . . . . . . . . . . .129
labels . . see xlabel() and ylabel()
lines . . . . . . see xline() and yline()
log scale . . . . . . . . . . . . . . . . . . . . . . . . 267
multiple . . . . . . . . 85, 92, 98–100, 256,
271–272, 303
options . . . . . . . . . . . . . . . . 254–256, 272
reverse scale . . . . . . . . . . . . . . . . . . . . 266
scale . . . see xscale() and yscale()
scaling independently . . . . . . . . . . . 276
selecting. . . . . . . . . . . . . . . . . . . .271–272
size . . . . . . . . . . . . . . . . . . . . . . . . 323–324
suppressing. . . . . . . . . . . .146, 182, 266
titles . . . see xtitle() and ytitle()
axis() . . . . . . . . . . . . . . . . . . . . . . . . . . 99, 256
B
b1title() . . . . . 31–33, 130, 174, 285, 315
b2title() . . . . . . . . . . . . . . . . . 130, 174, 315
bar() . . . . . . . . . . . . . . . . . . . . . . . . . . 149, 150
bcolor() . . . . . . . . . . . . . . . . . . . . . . . 150
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 150
blcolor() . . . . . . . . . . . . . . . . . . . . . . 150
blwidth() . . . . . . . . . . . . . . . . . . . . . . 150
bar graphs . . . see graph bar and twoway
bar
axes . . . . . . . . . . . . see axes, bar graphs
bar height . . . . . . . . . . . . . . . . . . . . . . 139
bar width . . . . . . . . . . . . . . . . . . . . . . . . 64
base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
by() . . . . . . . . . . . . . . . . . . . . . . . 151–155
categorical axes . . . . . . . . . . . . see axes,
categorical, bar graphs
color. . . . . . . . . . . . . . . . . . . .30, 149–150
confidence intervals . . . . . . . . . . . . . 366
descending . . . . . . . . . . . . . . . . . 119–123
excluding missing bars. . .see nofill
fill color . . . . . . . . . . . . . . . . . . . . . . . . . . 64

bar graphs, continued
format . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
gaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
horizontal . . . . . . . . . . see graph hbar
labels . . . . . . . . . . . . . . . . . . . . . . . 32, 136
legend . . . . . . . . . . . . . . . . . . . . . . 130–142
line color . . . . . . . . . . . . . . . . . . . . . . . . . 64
lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
look . . . . . . . . . . . . . . . . . . . . . . . . 147–150
ordering . . . . . . . . . . . . . . . . . . . . 119–123
overlaying. . . . . . . . . . . . . . . . . . . . . . .148
placing labels below bars . . . . . . . . 142
placing labels inside bars . . . . . . . . 141
placing labels outside bars . . . . . . 142
reverse order . . . . . . . . . . . . . . . . . . . . 123
sorting . . . . . . . . . . . . . . . . . . . . . 120–122
stacked . . . . . . . . . . . . . . . . . . . . 111, 115
titles . . . . . . . . . . . . . . . . . . . . . . . 143–144
vertical separators . . . . . . . . . . . . 93–94
y-variables . . . . . . . . . . . . . . . . . 107–111
bargap(). . . . . . . . . . . . . . . . . . . . . . . . . . . .148
barwidth() . . . . . . . . . . . . . . . . . . . 64, 73, 78
base() . . . . . . . . . . . . . . . . . . . . . . . . 47, 62, 63
bcolor() . . . . . . . . . . . . . . . . . . . . . 53, 70, 74
bexpand() . . . . . . . . . . . . . . . . . . . . . . 306–307
bfcolor() . . . . . . . . . . . . . 62, 64, 70, 74, 78
bin() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
bins
lower limit . . . . . . . . . . . . . . . . . . . . . . . 76
number . . . . . . . . . . . . . . . . . . . . . . . . . . 75
biweight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
blabel() . . . . . . . . . . . 32–33, 127, 136–142
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 142
box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
format() . . . . . . . . . . . . . . . . 32–33, 142
gap() . . . . . . . . . . . . . . . . . . . . . . 141–142
position() . . . . . . . . . . . . . . . . 141–142
size() . . . . . . . . . . . . . . . . . . . . . . . . . 142
blcolor() . . . . . 47, 53, 62, 64, 69, 70, 72,
74, 78
blpattern() . . . . . . . . . . . . . . . . . . . . . 53, 69
blwidth() . . . . . 47, 53, 69, 70, 72, 74, 78
box() . . . . . . . . . . . . . . . . . . . . . . . . . . .186–187
bcolor() . . . . . . . . . . . . . . . . . . 186–187
blcolor() . . . . . . . . . . . . . . . . . . . . . . 187
blwidth() . . . . . . . . . . . . . . . . . . . . . . 187
box. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .356
box plots . . . . . . . . . . . . . . . . . see graph box
adjacent lines . . . . . . . . . . . see alsize

box plots, continued
alphabetical order. . . . . . . . . . . . . . .165
axes . . . . . . . . . . . . . see axes, box plots
by() . . . . . . . . . . . . see by(), box plots
categorical axes . . . . . . . . . . . . see axes,
categorical, box plots
descending order . . . . . . . . . . . . . . . . 165
excluding missing categories . . . . . see
nofill
horizontal . . . . . . . . . . see graph hbox
legend . . . . . . . . . . . . . . . . . . . . . . 174–179
lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
look . . . . . . . . . . . . . . . . . . . . . . . . 183–189
median values . . . . . . . see medtype(),
medmarker(), and medline()
ordering . . . . . . . . . . . . . . . . . . . . 165–167
over() . . . . . . . see over(), box plots
patterns . . . . . . . . . . . . . . . . . . . . . . . . 181
sorting . . . . . . . . . . . . . . . . . . . . . . . . . . 165
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
whiskers customized . . . . . . . . . . . . . see
cwhiskers
y-variables . . . . . . . . . . . . . . . . . 157–162
boxgap() . . . . . . . . . . . . . . . . . . . . . . . 185, 186
bubble plots . . . . . . . . . . . . . . . . . 37, 240–241
building a graph . . . . . . . . . . . . . . . . . . 29–33
by() . . . . . . . . 103–105, 191–192, 215–216,
232–234, 272–287, 297–299
alignment() . . . . . . . . . . . . . . . . . . . 312
b1title() . . . . . . . . . . . . . . . . . . . . . . 286
bar graphs . . . . . . . . . . . . . . . . . 151–155
box . . . . . . . . . . . . . . . . . . . . . . . . 311–312
box plots . . . . . . . . . . . . . . . . . . . 189–193
caption() . . . . . . . . . . . . . . . . . . . . . . 280
colfirst . . . . . . . . . . . . . . . . . . . . . . . 274
cols() . . . . . . 153, 192, 216, 275, 277
combining options . . . . . . . . . . . . . . 286
compact . . . . . . . . . . . 44, 104–105, 275
dot plots . . . . . . . . . . . . . . . . . . . 214–217
errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
height() . . . . . . . . . . . . . . . . . . . . . . . 312
holes() . . . . . . . . . . . . . . . . . . . . . . . . 274
iscale() . . . . . . . . . . . . . . . . . . . . . . . 275
ixaxes . . . . . . . . . . . . . . . . . . . . . 278–279
ixtitle . . . . . . . . . . . . . . . . . . . 279, 286
iyaxes . . . . . . . . . . . . . . . . . . . . .277, 278
iytitle . . . . . . . . . . . . . . . . . . . 278, 286
justification() . . . . . . . . . . . . . . . 312
l1title() . . . . . . . . . . . . . . . . . . . . . . 286

by(), continued
legend() . . . 154, 192, 233–234, 285,
286, 297–299
at() . . . . . . . . . . . . . . . . . . . . . 234, 299
position() . . . 154, 192, 234, 285,
286, 299
missing . . . . . . . . . . 152–153, 191, 216
noedgelabel . . . . . . . . . . . . . . . . . . . 276
note() . . . . . . . . . . . . . . . . . . . . .153, 283
suffix . . . . . . . . . . . . . . . . . . . . . . . 283
pie charts . . . . . . . . . . . . . . . . . . 232–235
position() . . . . . . . . . . . . . . . . 311–312
rescale . . . . . . . . . . . . . . . . . . . 277, 286
ring() . . . . . . . . . . . . . . . . . . . . . 311–312
row() . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
rows() . . . . . . . . . . . . . . . . . . . . . . . . . 274
scale() . . . . . . . . . . . . . . . . . . . . . . . . 105
scatterplot matrices . . . . . . . . 103–105,
273–286
sts graph . . . . . . . . . . . . . . . . . . . . . . . 356
subtitle() . . . . . . . . . . . . . . . . . . . . . 282
textboxes. . . . . . . . . . . . . . . . . . .311–312
title. . . . . . . . . . . . . . . . . . . . . . . . . . . . .273
title() . . . . . . . . . . . . . . . 279–280, 286
title . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
position() . . . . . . . . . . . . . . . . . . 284
ring() . . . . . . . . . . . . . . . . . . . . . . . 284
total . . . 44, 153, 191–192, 216, 273,
286
twoway . . . . . . . . . . . . . . . . . . . 44–45, 85
width() . . . . . . . . . . . . . . . . . . . . . . . . 312
xrescale . . . . . . . . . . . . . . . . . . . . . . . 276
yrescale . . . . . . . . . . . . . . . . . . . . . . . 276
C
caps . . . . . . . . . . . . . . . . . . . . . . see capsize()
capsize() . . . . . . . . . . . . . . . . . . . . . . . . . . 189
caption(). . . . . . . . . . . . . . . . . . . . . .280, 315
categorical axes . . . . . see axes, categorical
ciplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
clcolor() . . . . . . . . . . . . 54, 56, 58, 82, 253
clock position . . . . . . . . . . . . . . . . . . . 330–331
clpattern() . . . . . . . . . 52, 56, 82, 90, 253,
336–337, 353
clstyle(). . . . . . . . . . . . . . . . . . . . . . . . . . . .89
clwidth() . . 26, 52, 54, 56, 58, 82, 88–90,
94, 253, 338, 353
color
area graphs . . . . . . . . . . . . . . . . . . . . . . 62

color, continued
axis lines . . . . . . . . . . . . . . . . . . . . . . . . . 43
bar fill . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
bar graphs . . . . . . . . . . . . . . . . . . . . . . 150
bar lines. . . . . . . . . . . . . . . . . . . . . .47, 64
box plots . . . . . . . . . . . . . . . . . . . 186–187
confidence level . . . . . . . . . . . . . . . . . . 52
connecting lines . . . . . . . . . 54, 69, 253
graph region . . . . see graphregion()
histogram bars . . . . . . . . . . . . . . . . . . . 78
intensity . . . . . . . . . . . . . . . . . . . 149, 186
labels . . . . . . . . . . . . . . . . . . . . . . 128, 172
legend . . . . . . . . . . . . . . . . . . . . . . . . . . 296
lines . . . . . . . . . . . . . . . . . . . . 82, 208, 211
marker fill . . . . . . . . . . . . . . . . . . . 48, 242
marker outline . . . . . . . . . . . . . . . . . . 242
marker symbols . . . . . . . . . . . . . . . . . . 37
markers . . . 55, 69, 211–213, 242–244
median line . . . . . . . . . . . . . . . . . . . . . 187
pie charts . . . . . . . . . . . . . . . . . . 221–223
plot region . . . . . . . see plotregion()
schemes . . . . . . . . . . . . . . . . . . . . . . . . . 319
styles. . . . . . . . . . . . . . . . . . . . . . .328–330
textbox . . . . . . . . . . . . . . . . . . . . 304, 310
cols() . . . . . . . . . . . . . . . . . . . . . . . . . 364–365
columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
combining graphs . . . . . . . . . . . . . . . 361–365
commas with graph options. . . . . . . . . .376
compass direction . . . . . . . . . . . . . . . 331–332
component-plus-residual plot . . . . . . . . . see
cprplot
confidence interval
fit (regression predictions) . . . . 50–54
for means and percentiles of survival
time . . . . . . . . . . . . . . . . . . see stci
selecting display command . . . . . . . 53
setting level . . . . . . . . . . . . . . . . . . . . . . 51
confidence level
color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
pattern . . . . . . . . . . . . . . . . . . . . . . . . . . 52
width . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
connect() . . . . . . . 39–41, 55, 84, 251–253,
333–335
connect lines width . . . . . . . . . see clwidth
connected plots . . see twoway connected
connecting
lines . . . . . . . . . . . see lines, connecting
points . . . . . . . . . . . . . . . . . . . . . . . . 39–41
styles. . . . . . . . . . . . . . . . . . . . . . .332–335

correlogram . . . . . . . . . . . . . . see ac and pac
cprplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
cross-correlogram . . . . . . . . . . . . . see xcorr
cumsp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
cumulative spectral distribution graph . .
. . . . . . . . . . . see cumsp
cwhiskers . . . . . . . . . . . . . . . . . . . . . . . . . . 188
D
dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58–59
density . . . . . . . . see kdensity and twoway
kdensity
descending . . . . . . . . . . . . . . . . . . . . 122, 220
diagonal. . . . . . . . . . . . . . . . . . . . . . . . . . . .101
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 101
discrete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
displaying named graphs . . . . . . . . . . . . 359
distribution graphs . . . . . . . . . . . . . . . . . . 346
distribution plots . . . . . . . . . . . . . . . . . . 74–82
dot plots . . . . . see graph dot and twoway
dot
alphabetical order. . . . . . . . . . . . . . .200
axes . . . . . . . . . . . . . see axes, dot plots
by() . . . . . . . . . . . see by(), dot plots
categorical axes . . . . . . . . . . . . see axes,
categorical, dot plots
descending order . . . . . . . . . . . . . . . . 200
excluding missing categories . . . . . see
nofill
legend . . . . . . . . . . . . . . . . . . . . . . 205–207
look . . . . . . . . . . . . . . . . . . . . . . . . 210–214
ordering . . . . . . . . . . . . . . . . . . . . . . . . 200
over() . . . . . . . . see over(), dot plots
reverse order . . . . . . . . . . . . . . . . . . . . 202
sorting . . . . . . . . . . . . . . . . . . . . . . . . . . 200
dots() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
mcolor() . . . . . . . . . . . . . . . . . . . . . . . 211
msize() . . . . . . . . . . . . . . . . . . . . . . . . 211
msymbol() . . . . . . . . . . . . . . . . . . . . . . 211
dropped-line plots. . . . . . . . . . . .see twoway
dropline
E
exclude() . . . . . . . . . . . . . . . . . . . . . . . . . . 135
exclude0 . . . . . . . . . . . . . . . . . . 145–146, 209
exploding pie slices . . . . . . . . . . . . . . . . . . 222

F
fits (regression predictions)
fractional polynomial . . . . see twoway
fpfit and twoway fpfitci
linear . . . . . . . . . see twoway lfit and
twoway lfitci
quadratic . . . . . see twoway qfit and
twoway qfitci
formatting numbers
axis labels . . . . . . . . . . . . . . . . . . . . . . 261
bar graphs . . . . . . . . . . . . . . . . . . . . . . 142
bar labels . . . . . . . . . . . . . . . . . . . . . . . . 32
pie slices . . . . . . . . . . . . . . . . . . . 226–227
forty-five degree lines . . . . . . . . . . . . . . . . 362
fraction() . . . . . . . . . . . . . . . . . . . . . . . . . . 76
fractional polynomial fits . . . . . see twoway
fpfit and twoway fpfitci
frequency() . . . . . . . . . . . . . . . . . . . . . . . . . 77
frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . .81
function, line plot of . . . . . . . . . see twoway
function
fysize(). . . . . . . . . . . . . . . . . . . . . . . . . . . .365
G
gap
between bars . . . . . . . . . . . . . . . . . . . . 148
between bars and edge of plot . . . 148
between boxes . . . . . . . . . . . . . . . . . . 185
between boxes and edge of plot. .184
between columns . . . . . . . . . . . . . . . . 297
between groups . . . . . . . . . . . . . . . . . 185
between labels and outside of graph
. . . . . . . . . . . . . . 173
between labels and ticks . . . . . . . . 172
between lines . . . . . . . . . . . . . . . . . . . 213
between marker and label . . . . . . . 250
between rows . . . . . . . . . . . . . . . . . . . 297
box plots . . . . . . . . . . . . . . . . . . . 163–164
dot plots . . . . . . . . . . . . . . . . . . . . . . . . 199
textboxes . . . . . . . . . . . . . . . . . . . . . . . 309
gap() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77–78
gladder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
glcolor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
graph bar . . . . . . . . . . . 27, 29–33, 107–155
graph box . . . . . . . . . . . . . . . . . . . . . . 157–193
graph combine . . . . . . . . . . . . . . . . . 362–363
graph display . . . . . . . . . . . . . . . . . . 29, 360
graph dot . . . . . . . . . . . . . . . . . . 13, 193–217
graph hbar . . . . . . . . . . . . . 13, 63, 107–155

graph hbox . . . . . . . . . . . . . . . . . 13, 157–193
graph matrix . . . . . . . . . . . . 12, 27, 95–105
graph pie . . . . . . . . . . . . . . . . . . 14, 217–235
graph use . . . . . . . . . . . . . . . . . . . . . . . . . . 359
graphing a function . . . . . . . . . . see twoway
function
graphregion() . . . . . . . . . . . . . . . . . 325–326
color() . . . . . . . . . . . . . . . . . . . . . . . . 325
fcolor() . . . . . . . . . . . . . . . . . . . . . . . 326
ifcolor() . . . . . . . . . . . . . . . . . . . . . . 326
lcolor() . . . . . . . . . . . . . . . . . . . . . . . 326
lwidth() . . . . . . . . . . . . . . . . . . . . . . . 326
graphs, specialized . . . . . . . . see specialized
graphs
grids
displaying . . . . . . . . . . . . . . . . . . 264–265
suppressing. . . . . . . . . . . .146, 182, 264
groups . . . . . . . . . . . . . see by() and over()
H
half . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
height
bar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
histogram bar . . . . see histogram, bar
height
symbol. . . . . . . . . . . . . . . . . . . . . . . . . .297
hi-lo graphs . . . . . . . . . . . . . . see range plots
histogram . . . . . . . . see twoway histogram
bar color . . . . . . . . . . . . . . . . . . . . . . . . . 78
bar height . . . . . . . . . . . . . . . . . . . . 75–77
bar width . . . . . . . . . . . . . . . . . . . . 78, 80
gap between bars . . . . . . . . . . . . . . . . 77
horizontal . . . . . . . . . . . . . . . . . . . . . . . . 79
overlaying . . . . . . . . . . . . . . . . . . . . . . . . 81
horizontal . . . . . . . . 48, 61, 63, 68, 79, 80
I
if . . . . . . . . . . . . . . . . . see samples, selecting
imargin() . . . . . . . . . . . . . . . . . . . . . . . . . . 365
immediate graphs . . . . . . . . . . . . see twoway
scatteri
in . . . . . . . . . . . . . . . . . see samples, selecting
intensity() . . . . . . . . . . . . . . 149, 186, 223
J
jitter(). . . . . . . . . . . . . . . . . . . . . . . . . . . .102
jittering . . . . . . . . see scatterplot matrices,
jittering

justification
textboxes. . . . . . . . . . . . . . . . . . .305–307
titles . . see title(), justification
K
kdensity . . . . . . . . . . . . . . . . . . . . . . . . 12, 353
kernel density . . . . . . . . . see kdensity and
twoway kdensity
horizontal . . . . . . . . . . . . . . . . . . . . . . . . 80
lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
methods . . . . . . . . . . . . . . . . . . . . . . . . . 81
overlaying . . . . . . . . . . . . . . . . . . . . . . . . 81
L
l1title() . . . . . . . 130, 174, 204, 285, 315
l2title() . . . . . . . . . . . . 130, 174, 204, 315
label() . . . . . . . . . . . . . . . . . . . . . . . . 298, 356
labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22–23
alternate . . . . . . . . . . . . . . . . . . . 127, 171
angles . . . . . . . . . . . . . . . see angle, label
axes . . . . see xlabel() and ylabel()
bar graphs . . . . . . . . . . . . . . . . . . 32, 136
changing . . . . 124–126, 133, 168, 177,
203
color . . . . . . . . . . . . . . . . . . . . . . . 128, 172
gap from axis . . . . . . . . . . . . . . 129, 130
gap from outside edge of graph . . 173
gap from ticks . . . . . . . . . . . . . . . . . . 172
legend. . . . . . . . . . . . .see legend, labels
marker symbols . . . . . . . . . . . . . . . 38–39
markers . . . . . . . . . . . . . 83, 97, 247–250
matrix. . . . . . . . . . . . . . . . . . . . . . .98–100
missing values . . . . . . . . . . . . . . . . . . 124
pie charts. . . . . . . . . . . . . .224–228, 233
placing below bars . . . . . . . . . . . . . . 142
placing inside bars . . . . . . . . . . . . . . 141
placing outside bars . . . . . . . . . . . . . 142
points . . . . . . . . . . . . . . . . . . . . . . . . . . 300
position. . . . . . . . . . .140–142, 226, 301
scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
size . . . . . . . . . . . . . . . . 97, 128, 171, 226
suppressing . . . . . . 104, 126, 128, 132,
137, 176–177, 206, 260
ticks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
time series . . . . . . . . . . . . . . . . . . . . . . . 59
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
ladder of power graphs . . . . . . . . . . . . . . 347
legend . . . . . . . . . . . . . . . . . . . . . 23–25, 40–41
bar graphs . . . . . . . . . . . . . . . . . 130–142

legend, continued
box plots . . . . . . . . . . . . . . . . . . . 174–179
columns . . . . . 135, 154, 178, 230, 291
dot plots . . . . . . . . . . . . . . . . . . . 205–207
key . . . . . . . . . . . . . . . . . . . . . . . . 289, 292
labels . . . . . . . . 154, 206, 288–289, 296
margins . . . . . . . . . . . . . . . . . . . . . . . . . 296
options . . . . . . . . . . . . . . . . . . . . . 287–299
overlaid graphs . . . . . . . . . . . . . . . 90–92
pie charts. . . . . . . . . . . . . .228–231, 234
placing within plot regions . . . . . . 134
position . . 31, 134–136, 154, 178, 231,
234, 293–294
rows . . . . . . . . . . 31, 133, 177, 207, 291
stacked . . . . . . . . . . . . . . . . 136, 179, 293
subtitle . . . . . . . . . . . . . . . . . . . . . . . . . 154
suppressing . . . . . . . . . . . . . . . . 133, 225
text. . . . . . . . . . . . . . . . . . . . . . . . . . . . .290
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
twoway . . . . . . . . . . . . . . . . . . . . . . . . . . 86
width . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
legend() . . . 23–25, 27–29, 31–33, 40–41,
46, 86, 90–94, 133–136, 138, 154,
177–179, 206–207, 225, 229–231,
234, 285, 288–299, 356
bexpand . . . . . . . . . . . . . . . . . . . . . . . . 294
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 296
bmargin() . . . . . . . . . . . . . . . . . . . . . . 296
box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
by() . . . . . . . . . . . . . . see by(), legend
colfirst . . . . 133, 177, 230, 291–292
colgap() . . . . . . . . . . . . . . . . . . . . . . . 297
color() . . . . . . . . . . . . . . . . . . . . . . . . 296
cols() . . . . 24, 41, 86, 135, 154, 178,
207, 231, 234, 291, 293, 299
holes() . . . . . . . . . . . . . . . . . . . 230, 292
label() . . . . . . . . . . . . . . . . . . . . . 24–25,
40–41, 90–91, 94, 133, 154, 177,
206, 229, 285–286, 288–289
legend() . . . . . . . . . . . . . . . . . . . . . . . 192
cols() . . . . . . . . . . . . . . . . . . . . . . . 192
note() . . . . . . . . . . . . . . . . . . . . . . . . . 295
order() . . . . . . . . . . . 41, 230, 289–295
position() . . . . . 27, 31–33, 134–136,
178, 207, 293–295, 298
region() . . . . . . . . . . . . . . . . . . . . . . . 296
fcolor() . . . . . . . . . . . . . . . . . . . . . 296
lcolor() . . . . . . . . . . . . . . . . . . . . . 296
lwidth() . . . . . . . . . . . . . . . . . . . . . 296

region(), continued
margin() . . . . . . . . . . . . . . . . . . . . . 296
ring() . . . . . . . . . 31–33, 134–135, 293
rowgap() . . . . . . . . . . . . . . . . . . . . . . . 297
rows() . . . 31–33, 133, 177, 207, 231,
291–296
size() . . . . . . . . . . . . . . . . . . . . . . . . . 296
span . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
stack . . 136, 154, 179, 192, 231, 234,
293
subtitle()
bexpand . . . . . . . . . . . . . . . . . . . . . . 295
box . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
symxsize() . . . . . . . . . . . . . . . . . . . . . 297
symysize() . . . . . . . . . . . . . . . . . . . . . 297
textfirst . . . . . . . . . . . . 135, 178, 292
title() . . . . . . . . . . 206, 229–230, 295
position() . . . . . . . . . 230–231, 295
level() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
leverage-versus-squared-residual plot . . . .
. . . . . . . . . see lvr2plot
life tables for survival data . . . see ltable
line() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
lcolor() . . . . . . . . . . . . . . . . . . . . . . . 223
lwidth() . . . . . . . . . . . . . . . . . . . . . . . 223
line plots . . . . . . . . . . . . . . see twoway line
sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
line, twoway . . . . . . . . . . see twoway line
linear fits . . . . . . . . . . see lfit and lfitci
linear regression diagnostics graphs . . 348
linegap() . . . . . . . . . . . . . . . . . . . . . . 213–214
lines
adjacent . . . . . . . . . . . . . . . . . . . . . . . . 188
axes
color . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
width . . . . . . . . . . . . . . . . . . . . . . . . . . 43
box plots . . . . . . . . . . . . . . . . . . . . . . . 181
color . . . . . . . . . . . . . . 82, 181, 208, 211
connecting . . . 52, 54, 69, 84, 250–253
fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52–53
gap between . . . . . . . . . . . . . . . . . . . . 213
graph region . . . . see graphregion()
median . . . . . . . . . . . . . . . . . . . . . . . . . 187
overlaying . . . . . . . . . . . . . . . . . . . . . . . . 89
patterns . . . . . . . 43, 82, 181, 208, 211,
336–338
plot region . . . . . . . see plotregion()
styles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
textbox outlines . . . . . . . . . . . . 309–310

lines, continued
whiskers . . . . . see box plots, whiskers
width . . . . . . . . . . . . . . . 82, 88, 181, 208
lines() . . . . . . . . . . . . . . . . . . . . . . . . 188, 211
lcolor() . . . . . . . . . . . . . . . . . . 188, 211
lwidth() . . . . . . . . . . . . . . . . . . 188, 211
linetype() . . . . . . . . . . . . . . . . . . . . . 211–212
loading graphs . . . . . . . . . . . see graph use
local linear smooth plots . . . . . see twoway
lowess
lroc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
lsens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
ltable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
lvr2plot . . . . . . . . . . . . . . . . . . 348, 352–353
M
margins
graph region . . . . see graphregion()
legend . . . . . . . . . . . . . . . . . . . . . . . . . . 296
plot region . . . . . . . see plotregion()
styles. . . . . . . . . . . . . . . . . . . . . . .338–340
textboxes . . . . . . . . . . . . . . . . . . . . . . . 308
marker() . . . . . . . . . . . . . . . . . . 189, 212–213
mcolor() . . . . . . . . . . . . . . . . . . . . . . . 212
mfcolor() . . . . . . . . . . . . . . . . . . . . . . 213
mlcolor() . . . . . . . . . . . . . . . . . . . . . . 213
mlwidth() . . . . . . . . . . . . . . . . . . . . . . 213
msize() . . . . . . . . . . . . . . . 189, 212–213
msymbol() . . . . . . . . . . . . 189, 212–213
markers
fill color . . . . . . . . . . . . . . . . . . . . . . . . . . 48
box plots . . . . . . . . . . . . . . . . . . . . . . . 189
color . . . . . . . . 55, 69, 96–97, 211–213,
242–244
displaying for data points . . . . . . . . 55
fill color . . . . . . . . . . . . . . . . . . . . . . . . 242
invisible . . . . . . . . . . . . . . . . . . . . . . . . 239
label gap. . . . . . . . . . . . . . . . . . . . . . . .250
label size . . . . . . . . . . . . . . . . . . . . . . . . . 97
labels . . . . . . . . 38–39, 83, 97, 247–250
line width . . . . . . . . . . . . . . . . . . . . . . . . 48
median line . . . . . . . . . . . . . . . . . . . . . 188
options . . . . . . . . . . . . . . 95–97, 235–250
outline color . . . . . . . . . . . . . . . . . . . . 242
outline width . . . . . . . . . . . . . . . . . . . 243
overlaying . . . . . . . . . . . . . . . . . . . . . . . . 90
plus sign . . . . . . . . . . . . . . . . . . . . . . . . 237
schemes . . . . . . . . . . . . . . . . . . . . . . . . . 247

markers, continued
size . . . 55, 69, 96, 211–213, 240–241,
340–341
squares . . . . . . . . . . . . . . . . . . . . . . . . . 236
styles . . . . . . . . . 88, 244–246, 340–343
symbols . . . . . . . 25–27, 36–38, 83, 87,
95–96, 342–343
width . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
matrix
axis labels . . . . . . . . . . . . . . . . . . . 98–100
scatterplot . . see scatterplot matrices
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
maxes() . . . . . . . . . . . . . . . . . . . 100, 104–105
xlabel() . . . . . . . . . . . . . . . . . . . . . . . 100
xtick() . . . . . . . . . . . . . . . . . . . . . . . . 100
ylabel() . . . . . . . . . . . . . . 100, 104–105
ytick() . . . . . . . . . . . . . . . . . . . . . . . . 100
mcolor() . . . 37, 55, 69, 96, 242, 329–330
mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
median . . . . . . . . . . . . . . . . . . . . . . . . . 108, 196
median band plots . . . . see twoway mband
median line
color . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
markers . . . . . . . . . . . . . . . . . . . . . . . . . 188
width . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
median points . . . . . . . . . . . see medtype(),
medmarker(), and medline()
median spline plots . . . . . . . . . . see twoway
mspline
medline() . . . . . . . . . . . . . . . . . . . . . . . . . . 187
lcolor() . . . . . . . . . . . . . . . . . . . . . . . 187
lwidth() . . . . . . . . . . . . . . . . . . . . . . . 187
medmarker() . . . . . . . . . . . . . . . . . . . . . . . . 188
msize() . . . . . . . . . . . . . . . . . . . . . . . . 188
msymbol() . . . . . . . . . . . . . . . . . . . . . . 188
medtype() . . . . . . . . . . . . . . . . . . . . . . 187–188
mfcolor() . . . . . . 48–49, 97, 242–243, 246
missing . . . . . . . . . 116, 124, 162, 197, 218
mlabangle() . . . . . . . . . . . . . . . . . . . 249, 327
mlabcolor() . . . . . . . . . . . . . . . . . . . . . . . . 250
mlabel() . . . . . . . . 38–39, 83, 97, 239, 241,
248–250, 300, 303, 327, 344, 353
mlabpos() . . . . . . . . . . . . . . . . . . . . . . . . . . 248
mlabgap() . . . . . . . . . . . . . . . . . . . . . . . . . . 250
mlabposition() . . . . 38–39, 239, 241, 331
mlabsize() . . . . . . . . . . . . . 38, 97, 249, 344
mlabvposition() . . . . . . . . . . . . . . . 248–249
mlcolor() . . . . . . . . . . . 48–49, 97, 242–243
mlwidth() . . . . . . . . . . . . . . . 48–49, 243–244

mountain plots . . . . . . . . . see twoway area
msize() . . . . 37–38, 48–49, 55, 69, 71, 72,
96, 240–241, 246, 341, 352
mstyle() . . . . . . . . . . . . . . . . . . . 88, 244–246
msymbol() . . . . . 22, 26, 36, 39–41, 48–49,
55, 56, 69, 72, 83, 87, 90, 94–96,
236–241, 247, 331, 343, 352
multiple axes . . . . . . . . . . see axes, multiple
multiple plots . . . . . . . . . . . . . see overlaying
N
name() . . . . . . . . . . . . . . . . 359, 361, 363–365
naming graphs . . . . . . . . . . . . . . . see name()
ndots() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
noclockwise . . . . . . . . . . . . . . . . . . . . . . . . 219
nofill . . . . . . . . . . . . . 30–33, 117, 162, 198
nofit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
nogrid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
nolabel . . . . . . . . . 128, 132, 137, 176, 206
nooutsides . . . . . . . . . . . . . . . . . . . . . 157–193
note() . . . . . . 190–192, 282, 315, 354, 357
O
options . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20–29
adding text . . . . . . . . . . . . . . . . .299–303
axes . . . . . . . . . . . . . . . . . . . 254–256, 272
labels . . . . . . . . . . . . . . . . . . . . . . . . . 22–23
legend . . . . . . . . . . . . . . . . . . . . . . . . 23–25
marker symbols . . . . . . . . . . . . . . . 25–27
markers . . . . . . . . . . . . . . . . . . . . 235–250
region . . . . . . . . . . . . . . . . . . . . . . 324–326
scatterplot matrices . . . . . . . . 102–103
specialized graphs . . . . . . . . . . 352–358
standard . . . . . . . . . . . . . . . . . . . 324–326
textboxes. . . . . . . . . . . . . . . . . . .303–313
titles . . . . . . . . . . . . . . . . . . . . . . . . . 20–22
using in the wrong context . . . . . . 377
ordering
bars . . . . . . . . see bar graphs, ordering
boxes . . . . . . . . see box plots, ordering
orientation
textboxes . . . . . . . . . . . . . . . . . . . . . . . 305
titles . . . . . . . . . . . . . . . . . . . . . . . 341–342
outergap() . . . . . . . . . . . . . . . . . . . . 148, 184
outside values
color . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
suppressing . . . . . . . . . . . . . . . . . . . . . 180

over() . . . . . . . . . 29–33, 111–150, 157–235
asyvars . . . . . . . . . . . . . . . . . . . . 161–162
axis() . . . . . . . . . . . . . . . . 130, 142, 173
outergap() . . . . . . . . . 130, 142, 173
bar graphs . . . . . . . . 111–123, 151–155
box plots. . . . . . . . . .157–162, 193–202
descending . . . . . . . . . . . 119–121, 123,
165–167, 200–202
display only existing variables . . . 117
dot plots . . . . . . . . . . . . . . . . . . . 193–202
gap() . . . 118–119, 163–164, 185, 199
label() . . 33, 126–130, 138, 171–173
angle() . . . . . . . . . . . . . 33, 127, 171
labcolor(). . . . . . . . . . . . . .128, 172
labgap() . . . . . . . . . . . 129, 172–173
labsize() . . . . . . . . . . . . . . . 128, 171
ticks. . . . . . . . . . . . . . . . . . . .129, 172
tlength() . . . . . . . . . . . . . . . 129, 172
tlwidth() . . . . . . . . . . . . . . . 129, 172
tposition() . . . . . . . . . . . . 129, 172
missing . . . . . . . . . . . . . . . . . . . . . . . . . 116
pie charts . . . . . . . . . . . . . . . . . . . . . . . 218
relabel() . . . . . . . 124–126, 168–170,
203–204
sort() . . 120–123, 165–167, 200–202
sum() . . . . . . . . . . . . . . . . . . . . . . . . 122
y-variables . . . . . . . . . . . . . . . . . 110–111
overlaying . . . . . . . . . . . . . . . . . . 49–50, 87–94
bar graphs . . . . . . . . . . see bar graphs,
overlaying
connected marker plots. . . . . . . . . . .56
fits, CIs, smooths, and scatters . . 49–54, 89–90
histograms . . . . . . . . . . . see histogram,
overlaying
kernel density . . . . see kernel density,
overlaying
legends . . see legend, overlaid graphs
lines . . . . . . . . . . . . see lines, overlaying
markers . . . . . see markers, overlaying
mixed plot types . . . . . . . . . . . 6, 25–26
scatterplots . . . . . . . . see scatterplots,
overlaying
P
pac. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .350
patterns
axis lines . . . . . . . . . . . . . . . . . . . . . . . . . 43
box plots . . . . . . . . . . . . . . . . . . . . . . . 181

confidence level . . . . . . . . . . . . . . . . . . 52
connecting lines . . . . . . . . . 52, 69, 253
lines. . . . . .82, 181, 208, 211, 336–338
percent(). . . . . . . . . . . . . . . . . . . . . . . . . . . .76
percent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
percentages . . . . . . . . . . . . . . . 110–111, 115
pergram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
periodogram . . see pergram and wntestb
pie() . . . . . . . . . . . . . . . . . . . . . . . . . . 222, 233
color() . . . . . . . . . . . . . . . . . . . . . . . . 222
explode . . . . . . . . . . . . . . . . . . . 222, 233
pie charts . . . . . . . . . . . . . . . . see graph pie
adding text . . . . . . . . . . . . . . . . .227–228
angles . . . . . . . . . . . . . . . . . . . . . . . . . . 220
by(). . . . . . . . . . . .see by(), pie charts
color . . . . . . . . . . . . . . . . . . . . . . . 221–223
counterclockwise . . . . . . . . . . . . . . . . 219
descending order . . . . . . . . . . . . . . . . 220
exploding slices . . . . . . . . . . . . .222–223
labels . . . . . . . . . . . . . . . . . .224–228, 233
legend . . . . . . . . . . . . . . . . . 228–231, 234
over() . . . . . . . see over(), pie charts
slices . . . . . . . . . . . . . . . . . . . . . . . 221–223
sorting . . . . . . . . . . . . . . . . 219–221, 233
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
types . . . . . . . . . . . . . . . . . . . . . . . 217–218
plabel() . . . . . . . . . . . . . . . . . . 224–227, 233
color() . . . . . . . . . . . . . . . . . . . . . . . . 226
format() . . . . . . . . . . . . . . . . . . 226–227
gap() . . . . . . . . . . . . . . . . . . . . . . 226–227
legend() . . . . . . . . . . . . . . . . . . . . . . . 227
name . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
size() . . . . . . . . . . . . . . . . . . . . . . . . . 226
plotregion() . . . . . . . . . . . . . . . . . . . . . . . 325
color() . . . . . . . . . . . . . . . . . . . . . . . . 325
lcolor() . . . . . . . . . . . . . . . . . . . . . . . 325
lwidth() . . . . . . . . . . . . . . . . . . . . . . . 325
pnorm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
points, connecting . . . . . . . . see connecting
points
population pyramid . . . . . . . . . . . . . 367–368
position
labels . . . . . . . . . 38, 140–142, 226, 301
legend . . . 31, 134–136, 154, 178, 231,
234, 293–294
marker labels . . . . . . . . . . . . . . . 248–249
standard options . . . . . . . . . . . 330–332
ticks . . . . . . . . . . . . . . . . . . . . . . . 129, 263

titles . . . . . . . . see title(), position
prefix . . . . . . . . . . . . . . . . . . . see titles, prefix
ptext() . . . . . . . . . . . . . . . . . . . . . . . . 227–228
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 228
box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
margin() . . . . . . . . . . . . . . . . . . . . . . . 228
orientation() . . . . . . . . . . . . . . . . . 228
placement() . . . . . . . . . . . . . . . . . . . 228
Q
qladder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
qnorm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
quadratic fits . . . . . . see twoway qfit and
twoway qfitci
R
r1title() . . . . . . . . . . . . . . . . . . . . . . . . . . 315
r2title() . . . . . . . . . . . . . . . . . . . . . . . . . . 315
range() . . . . . . . . . . . . . . . . . . . . . . . . . . 81, 82
range plots
with area shading . . . . . . . see twoway
rarea
with bars . . . . . . . . . . see twoway rbar
with capped spikes . . . . . . see twoway
rcap
with capped spikes and marker symbols . . . . . . . . . . . . . . . . see twoway
rcapsym
with connected lines . . . . . see twoway
rconnected
with lines . . . . . . . . see twoway rline
with markers . . . . . . . . . . . . see twoway
rscatter
with spikes . . . . . . see twoway rspike
rectangles() . . . . . . . . . . . . . . . . . . . . . . . 212
fcolor() . . . . . . . . . . . . . . . . . . . . . . . 212
lcolor() . . . . . . . . . . . . . . . . . . . . . . . 212
reference lines . . . . . . . . see lines, axes and
yline()
region . . . . . . . . . . . . . . . . . . . . . . . . . . 324–326
replace . . . . . . . . . . . . . . . . . . . . . . . . 364, 365
rescheming graphs . . . . . . . . . . see schemes,
rescheming graphs
residual-versus-fit plot . . . . . . see rvfplot
residual-versus-predictor plot . . . . . . . . . see
rvpplot
restoring graphs . . . . . . . . . . see graph use
reusing graphs . . . . . . . . . . . . . . . . . . 359–360

reversing axes . . . . . see axes, reverse scale
ROC analysis . . . . . . . . . . . . . . . . . . . . . . . . 351
roccomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
rocplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
roctab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
rvfplot . . . . . . . . . . . . . . . . . . . . . . . . 348, 355
rvpplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
rwidth(). . . . . . . . . . . . . . . . . . . . . . . . . . . .212
S
samples, selecting . . . . . . . . . . . . . . . . . . . . . 58
saving(). . . . . . . . . . . . . . . . . . . . . . . . . . . .358
saving graphs. . . . . . . . . . . . . . . . . . . . . . . .358
scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
adjusting . . . . . . . . . . . . . . . . . . . 322–324
axes . . . . . . . . . . . . . . . 85, 209, 265–269
labels . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
markers . . . . . . . . . . . . . . . . . . . . . . . . . 105
scale() . . . . . . . . . . . . . . . 103, 323–324, 358
scatter with immediate arguments. . . .see
twoway scatteri
scatter, twoway . . . see twoway scatter
scatterplot matrices. . . . . . . . . . . . . . . . . . .12
by(). .see by(), scatterplot matrices
displaying lower half . . . . . . . . . . . . 102
jittering. . . . . . . . . . . . . . . . . . . . . . . . .102
options . . . . . . . . . . . . . . . . . . . . . 102–103
scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
scatterplots . . . . . . . . see twoway scatter
overlaying . . . . . . . . . . . . . 39–40, 84–86
scheme() . . . . . . . . . . . . . . . . . . . . . . . 359–360
schemes. . . . . . . . . . . . . . . . . .14–20, 318–321
colors . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
customizing . . . . . . . . . . . . . . . . 379–381
economist . . . . . . . . . . . . . . . . . 321, 357
markers . . . . . . . . . . . . . . . . . . . . . . . . . 247
rescheming named graphs . . . . . . . see
scheme()
s1color . . . . . . . . . . . . . . . . . . . . . . . . 320
s1manual . . . . . . . . . . . . . . . . . . . . . . . 320
s1mono . . . . . . . . . . . . . . . . . . . . . . . . . 320
s2color . . . . . . . . . . . . . . . . . . . . . . . . 319
s2manual . . . . . . . . . . . . . . . . . . . . . . . 319
s2mono . . . . . . . . . . . . . . . . . . . . . . . . . 319
sj . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
vg_blue . . . . . . . . . . . . . . . . . . . . . . . . . . 19
vg_brite . . . . . . . . . . . . . . . . . . . . . . . . 19
vg_lgndc . . . . . . . . . . . . . . . . . . . . . . . . 18
vg_outc . . . . . . . . . . . . . . . . . . . . . . . . . . 17

vg_outm . . . . . . . . . . . . . . . . . . . . . . . . . . 17
vg_palec . . . . . . . . . . . . . . . . . . . . . . . . 16
vg_palem . . . . . . . . . . . . . . . . . . . . . . . . 16
vg_past . . . . . . . . . . . . . . . . . . . . . . . . . . 18
vg_rose . . . . . . . . . . . . . . . . . . . . . . . . . . 18
vg_s1c . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
vg_s1m . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
vg_s2c . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
vg_s2m . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
vg_samec . . . . . . . . . . . . . . . . . . . . . . . . 17
vg_teal . . . . . . . . . . . . . . . . . . . . . . . . . . 19
vg_lgndc . . . . . . . . . . . . . . . . . . . . . . . 179
separating graphs . . . . . . . . . . . . . . . . . . . . . 44
shading area graphs . . . . . . . . . . . . . . . . . . 62
showyvars . . . . . . . . . . . . 132–133, 176–177
size
adjacent line . . . . . . . . . . . . . . . . . . . . 188
adjusting . . . . . . . . . . . . . . . . . . . 322–324
axes . . . . . . . . . . . . . . . . . . . . . . . . 323–324
caps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
labels . . . . . . . . . . . . . . 38, 128, 171, 226
marker symbols . . . . . . . . . . . . . . . 37–38
markers . . . . . 55, 69, 96–97, 211–213,
240–241, 340–341
text. . . . . . . . . . . . . . . . . . . . . . . . . . . . .344
textbox . . . . . . . . . . . . . . . . . . . . 144, 304
titles . . . . . . . . . . . . . . . . . . . . . . . 144, 181
slices . . . . . . . . . . . . . . . . . . . . . . . . . . . 221–223
sort . . 39–41, 54–56, 61, 65, 84, 220–221,
233–234, 251–253, 333–335
sorting
area graphs. . . . . . . . .see area graphs,
sorting
box plots . . . . . . see box plots, sorting
line plots . . . . . . see line plots, sorting
pie charts . . . . .see pie charts, sorting
spacing . . . . . . . . . . . . . . . . . . . . . . . . . see gaps
specialized graphs . . . . . . . . . . . . . . . 345–358
spike plots . . . . . . . . . . . . see twoway spike
splines . . . . . . . . . . . . . . . . . . . . . . see mspline
stack . . . . . . . . . . . . . . . . . 111, 115, 122, 139
stacking bars . . . . . . . . . . . . . . . . . . see stack
standard error of forecast . . . . . . . see stdf
standard options . . . . . . . . . . . . . . . . 324–326
standardized normal probability graphs . . . . . . . . . . . . . . . . 346
start() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Stata 7 syntax . . . . . . . . . . . . . . . . . . . . . . . 376

statistical function graphs . . . . see twoway
function
stci . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
stcoxkm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
stcurve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
stdf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
stepstair. . . . . . . . . . . . . . . . . . . . . .252, 335
storing graphs . . . . . . . . . . . see graph save
stphplot. . . . . . . . . . . . . . . . . . . . . . . . . . . .349
strip plots . . . . . . . . . . . . . . . . . . 277–278, 371
sts graph. . . . . . . . . . . . . . . . . . . . . .349, 356
styles . . . . . . . . . . . . . . . . . . . . . . . . . . . 327–345
angles . . . . . . . . . . . . . . . . . . . . . . 327–328
clock position . . . . . . . . . . . . . . 330–331
color . . . . . . . . . . . . . . . . . . . . . . . 328–330
compass direction . . . . . . . . . . 331–332
lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
margins . . . . . . . . . . . . . . . . . . . . 338–340
marker symbols . . . . . . . . . . . . 342–343
markers . . . . . . . 88, 244–246, 340–343
orientation . . . . . . . . . . . . . . . . . 341–342
text size . . . . . . . . . . . . . . . . . . . . . . . . 344
subtitle() . . . . . . . . . . . 154, 281–284, 314
nobexpand . . . . . . . . . . . . . . . . . . . . . . 154
position() . . . . . . . . . . . 154, 283–284
prefix . . . . . . . . . . . . . . . . . . . . . 281–282
ring() . . . . . . . . . . . . . . . . . . . . .154, 284
suffix . . . . . . . . . . . . . . . . . . . . . 281–282
suffix . . . . . . . . . . . . . . . . . . . . see titles, suffix
sum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .139
survival graphs . . . . . . . . . . . . . . . . . . . . . . 349
symbols
height . . . . . . . . . . . . . . . . . . . . . . . . . . 297
margin . . . . . . . . . . . . . . see msymbol()
width . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
symmetry plots . . . . . . . . . . . . . . . . . . . . . . 346
symplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
T
t1title() . . . . . . . . . . . . . . . . . . . . . . . . . . 315
t2title() . . . . . . . . . . . . . . . . . . . . . . . . . . 315
text
adding . . . . . . . . . . . . . . 45, 86, 299–303
legend . . . . . . . . . . . . . . . . . . . . . . . . . . 290
orientation . . . . . . . . . . . . . . . . . . . . . . 101
pie charts . . . . . . . . . . . . . . . . . . 227–228
scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344

text() . . . . . . . . . . . . . 45, 86, 299–313, 356
blwidth() . . . . . . . . . . . . . . . . . . . . . . . 45
box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
color() . . . . . . . . . . . . . . . . . . . . . . . . 304
linegap() . . . . . . . . . . . . . . . . . . . . . . 309
margin() . . . . . . . . . . . . . . . . . . . 45, 308
orientation() . . . . . . . . . . . . . . . . . 305
placement(). . . . . .301–302, 304–305
size() . . . . . . . . . . . . . . . . . . . . . . 45, 304
textboxes . . . . . . . . . . . . . . . . . . . . . . . 303–313
annotations . . . . . . . . . . . . . . . . . . . . . 308
by() . . . . . . . . . . . . see by(), textboxes
color . . . . . . . . . . . . . . . . . . . . . . . 304, 310
interline gaps . . . . . . . . . . . . . . . . . . . 309
justification . . . . . . . . . . . . . . . . 305–307
margins . . . . . . . . . . . . . . . . . . . . . . . . . 308
orientation . . . . . . . . . . . . . . . . . . . . . . 305
outline . . . . . . . . . . . . . . . . . . . . . 309–310
size . . . . . . . . . . . . . . . . . . . . . . . . 144, 304
thickness . . . . . . . . . . . . . . . . . . . . . . see width
ticks
controlling . . . . . . . . . . . . . . . . . 258–265
labels . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
length . . . . . . . . . . . . . . . . . . . . . . . . . . 263
matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 100
position . . . . . . . . . . . . . . . . . . . . . . . . . 263
suppressing . . . . . . . . . . . . . . . . . . . . . 259
time series . . . . . . . . . . . . . . . . . . . . . . . 60
time series
labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
line plots . . . . . . . . . . . . . . . . . . . . . 57–60
minor labels . . . . . . . . . . . . . . . . . . . . . 59
ticks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
time-series line. .see tsline and tsrline
tin() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
title() . . . . 20–22, 26, 86, 279, 309–318,
357
bcolor() . . . . . . . . . . . . . . . . . . . . . . . 310
bexpand . . . . . . . . . . . . . . . . . . . . . . . . 318
bfcolor() . . . . . . . . . . . . . . . . . . 86, 310
blcolor() . . . . . . . . . . . . . . . . . . 86, 310
blwidth() . . . . . . . . . . . . . . . . . . 86, 310
bmargin() . . . . . . . . . . . . . . . . . . . . . . 307
box . . . . 21–22, 86, 305–312, 317–318
justification() . . . . . 306–307, 318
margin() . . . . . . . . . . . . . . . . . . 339–340
nobox . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
placement() . . . . . . . . . . . . . . . . . . . 332

position() . . . . . . . . . . . . . . . . 316–317
ring() . . . . . . . . . . . . . . . . . . . . . . . . . 317
size() . . . . . . . . . . . . . . . . . . . . . . . 21–22
span . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
titles . . . . . . . . . . . . . . . . 20–22, 130, 313–318
axes . . . . . . . . . . . . . . . . . see axes, titles
bar graphs. . . . .see bar graphs, titles
box plots. . . . . . . .see box plots, titles
categorical axes . . . . . . . . . . . . see axes,
categorical, titles
justification . . . . . . . . . . . see title(),
justification
legend . . . . . . . . . . . . . . . . . . . . . . . . . . 295
matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 101
multiple lines . . . . . . . . . . . . . . . . . . . 316
orientation . . . . . . . . . . . . . . . . . 341–342
pie charts . . . . . . . . . . . . . . . . . . . . . . . 229
placing in a box . . . see title(), box
placing inside a plot region . . . . . . see
title(), placement
position . . . . . see title(), position
prefix . . . . . . . . . . . . . 255, 281–282, 354
size . . . . . . . . . . . . . . see title(), size
suffix . . . . . . . . . . . 1, 255, 281–282, 354
time series . . . . . . . . . . . . . . . . . . . . . . . 59
width . . . . . . . . . . . . . . . . . . . . . . 317–318
tlabel() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
tline() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
tmlabel(). . . . . . . . . . . . . . . . . . . . . . . . . . . .59
tmtick() . . . . . . . . . . . . . . . . . . . . . . . . . 54, 60
ttext() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
orientation() . . . . . . . . . . . . . . . . . . 60
ttitle() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
twoway
adding text . . . . . . . . . . . . . . . . . . . . . . 86
by() . . . . . . . . . . . . . . see by(), twoway
graphs . . . . . . . . . . . . . . . . . . . . . . . . 35–94
legend . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
options . . . . . . . . . . . . . . . . . . . . . . . 82–86
overlaying . . . . . . . . . . . . . . . . . . . . 87–94
titles . . . . . . . . . . . . . . . . . . . . . . . . . 26, 86
twoway area . . . . . . . . . . . . . . . . . . . 9, 61–62
twoway bar . . . . . . . . . . . . . . . . . . . 10, 62–64
twoway connected. . . . . . . . . .8, 55–56, 88
twoway dot . . . . . . . . . . . . . . . . . . . . . . . . 8, 49
twoway dropline . . . . . . . . . . . . . . 7, 48–49
twoway fpfit . . . . . . . . . . . . . . . . . . . . . 6, 50
twoway fpfitci . . . . . . . . . . . . . . . . . . 50–54

twoway function . . . . . . . . . . . . . . . . . 12, 82
twoway histogram . . 11, 75–81, 346, 358
twoway kdensity . . . . . . . . 12, 80–82, 346
twoway lfit . . 5–6, 25–26, 49–50, 89–91,
93–94, 287–297
twoway lfitci . . . . . . . . . . . . . . . . . 7, 50–54
twoway line . . . 8, 54–60, 88–89, 268–269
twoway lowess . . . . . . . . . . . . . . . . . . . . 6, 50
twoway mband. . . . . . . . . . . . . . . . . . . . . . .6, 7
twoway mspline . . . . . . . . . . . . . . . . . . . 6, 50
twoway qfit . . . . . . . . 6, 50, 90–91, 93–94,
287–299
twoway qfitci . . . . . . . . . . . . 50–54, 91–92
twoway rarea . . . . . . 10–11, 66, 70, 92–93
twoway rbar . . . . . . . . . . . . . . 11, 67, 73–74
twoway rcap . . . . . . . . . . . . . . 11, 66, 71–72
twoway rcapsym . . . . . . . . . . . . . . 11, 67, 72
twoway rconnected . . . . . . . 10, 65, 68–69
twoway rline . . . . . . . . . . . . . . . . . 10, 65, 70
twoway rscatter . . . . . . . . . . . . . 10, 65, 69
twoway rspike . . . . . . 11, 67, 72, 265–269
twoway scatter . . . . . . 5–7, 20–26, 35–54,
83–94, 104, 235–313
twoway scatteri . . . . . . . . . . . . . . . . . 45–46
twoway spike . . . . . . 7, 47–48, 92–93, 346
twoway tsline . . . . . . . . . . . . . 9, 57–60, 88
twoway tsrline . . . . . . . . . . . . . . . . . . . 9, 57
types of graphs . . . . . . . . . . . . . . . . . . . . . 4–14
U
using graphs . . . . . . . . . . . . . see graph use
W
whiskers . . . . . . . . . see box plots, whiskers
width
axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
bars. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .64
box plot lines. . . . . . . . . . . . . . .185–187
confidence level . . . . . . . . . . . . . . . . . . 52
connecting lines . . . . . . 52, 54, 69, 253
histogram bars. . . . . . . . . . . .75, 78, 80
legend . . . . . . . . . . . . . . . . . . . . . . . . . . 294
lines . . . . . . . . . . . . . . . . 82, 88, 181, 208
marker outline . . . . . . . . . . . . . . . . . . 243
markers . . . . . . . . . . . . . . . . . . . . . . . . . . 69
median line . . . . . . . . . . . . . . . . . . . . . 187
symbols . . . . . . . . . . . . . . . . . . . . . . . . . 297
ticks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
titles . . . . . . . . . . . . . . . . . . . . . . . 317–318

width() . . . . . . . . . . . . . . . . . . . 75–77, 80, 81
wntestb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
X
xalternate. . . . . . . . . . . . . . . .126, 170, 204
xaxis() . . . . . . . . . . . . . . . . . . . . . . . . 270, 303
xcorr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
xlabel() . . . . . . . . . 22–23, 42–43, 98–100,
257–258, 260–262, 267, 328
alternate . . . . . . . . . . . . . . . . . . . . . . 262
angle() . . . . . . . . . . . . . . . . . . . . . . . . 262
axis() . . . . . . . . . . . . . . . . . . . . . . 98–100
format() . . . . . . . . . . . . . . . . . . . . . . . 261
grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
labsize() . . . . . . . . . . . . . . . . . . . . . . . 23
nogrid . . . . . . . . . . . . . . . . . . . . . . . . . 264
valuelabels . . . . . . . . . . . . . . . . . . . 260
xline() . . . . . . . . . . . . . . . . . . see lines, axes
xscale(). . . . . . . . . . . . . . .43, 266–268, 364
lwidth() . . . . . . . . . . . . . . . . . . . . . . . 267
xsize() . . . . . . . . . . . . . . . . . . . 323, 358, 360
xtitle() . . . . . . 41, 84, 254–256, 354–355
box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
orientation() . . . . . . . . . . . . . . . . . 342
prefix . . . . . . . . . . . . . . . . . . . . . . . . . 255
size() . . . . . . . . . . . . . . . . . . . . .255, 355
Y
y-variables
bar graphs . . . . . . . . . . see bar graphs,
y-variables
multiple . . . . . . . . . . . . . . . . . . . . 109–111
yalternate. . . . . . . . . . . . . . . .146, 183, 210
yaxis() . . . . . . . . . 85, 92–93, 256, 270–272
ycommon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
ylabel() . . 31–33, 42–43, 84–86, 98–100,
145–146, 181–182, 209, 257–265,
269, 271–272, 328, 355, 365
angle() . . . 31–33, 145, 182, 261, 328
axis() . . . . . . . . . . . . . 98–100, 271–272
glcolor . . . . . . . . . . . . . . . . . . . . . . . . 264
glpattern . . . . . . . . . . . . . . . . . 264–265
glwidth . . . . . . . . . . . . . . . . . . . . . . . . 264
grid. . . . . . . . . . . . . . . . . . . .43, 264–265
labgap() . . . . . . . . . . . . . . . . . . . . . . . 263
labsize() . . . . . . . . . . . . . . . . . . . . . . 262
nogrid . . . . . . . . . . . . 43, 146, 182, 264
nolabel . . . . . . . . . . . . . . . . . . . . . . . . 260
noticks . . . . . . . . . . . . . . . . . . . . . . . . 259

tlength() . . . . . . . . . . . . . . . . . . . . . . 263
tlwidth() . . . . . . . . . . . . . . . . . . . . . . 263
tposition() . . . . . . . . . . . . . . . . . . . 263
yline() . . . . . . . . . . . 43, 144, 181, 208, 355
lcolor() . . . . . . . . . . 43, 144, 181, 208
lpattern() . . . . . . . 43, 144, 181, 208
lwidth() . . . . . . . . . . 43, 144, 181, 208
ymlabel(). . . . . . . . . . . . . . . . . . . . . .258, 265
glcolor() . . . . . . . . . . . . . . . . . . . . . . 265
glpattern() . . . . . . . . . . . . . . . . . . . 265
grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
ymtick() . . . . . . . . . . . . . . . . . . . . . . . 259, 263
tposition() . . . . . . . . . . . . . . . . . . . 263
yreverse . . . . . . . . . . . . . . . . . . 147, 183, 210
yscale() . . . . . 85, 92, 138, 146, 182, 209,
268–269
axis() . . . . . . . . . . . . . . . . . . . . . . 92, 269
range() . . . . . . . . . . . 92, 138, 268–269
ysize() . . . . . . . . . . . . . . . . . . . 323, 358, 360
ytick() . . . . . . . . . . . . . . . . . . . . . . . . 258, 263
tposition() . . . . . . . . . . . . . . . . . . . 263
ytitle() . . . 30–33, 41–42, 144, 180–181,
254–256, 272, 354
axis() . . . . . . . . . . . . . . . . . . . . . . . . . 272
bexpand . . . . . . . . . . . . . . . 144, 181, 208
bfcolor() . . . . . . . . . . . . . . . . . . . . . . 208
box . . . . . . . . . . . . . . . . . . . 144, 181, 208
orientation() . . . . . . . . . . . . . . . . . 342
size() . . . . . . . . . . . . . . . . . 42, 144, 181
suffix . . . . . . . . . . . . . . . . . . . . . . . . . 255
ytitle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
yvaroptions() . . 125–126, 128, 163–167,
169–170, 203–204
label() . . . . . . . . . . . . . . . . . . . . . . . . 128
relabel() . . . . . . . 125–126, 169–170,
203–204
