The Tangled Web: A Guide to Securing Modern Web Applications

Page Count: 324

PRAISE FOR THE TANGLED WEB
“Thorough and comprehensive coverage from one of the foremost experts
in browser security.”
—TAVIS ORMANDY, GOOGLE INC.
“A must-read for anyone who values their security and privacy online.”
—COLLIN JACKSON, RESEARCHER AT THE CARNEGIE MELLON WEB
SECURITY GROUP
“Perhaps the most thorough and insightful treatise on the state of security
for web-driven technologies to date. A must have!”
—MARK DOWD, AZIMUTH SECURITY, AUTHOR OF THE ART OF SOFTWARE
SECURITY ASSESSMENT

PRAISE FOR SILENCE ON THE WIRE BY MICHAL ZALEWSKI
“One of the most innovative and original computing books available.”
—RICHARD BEJTLICH, TAOSECURITY
“For the pure information security specialist this book is pure gold.”
—MITCH TULLOCH, WINDOWS SECURITY
“Zalewski’s explanations make it clear that he’s tops in the industry.”
—COMPUTERWORLD
“The amount of detail is stunning for such a small volume and the examples
are amazing. . . . You will definitely think different after reading this title.”
—(IN)SECURE MAGAZINE
“Totally rises head and shoulders above other such security-related titles.”
—LINUX USER & DEVELOPER

THE TANGLED WEB
A Guide to Securing
Modern Web Applications

by Michal Zalewski

San Francisco

THE TANGLED WEB. Copyright © 2012 by Michal Zalewski.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior
written permission of the copyright owner and the publisher.
15 14 13 12 11

123456789

ISBN-10: 1-59327-388-6
ISBN-13: 978-1-59327-388-0
Publisher: William Pollock
Production Editor: Serena Yang
Cover Illustration: Hugh D’Andrade
Interior Design: Octopod Studios
Developmental Editor: William Pollock
Technical Reviewer: Chris Evans
Copyeditor: Paula L. Fleming
Compositor: Serena Yang
Proofreader: Ward Webber
Indexer: Nancy Guenther
For information on book distributors or translations, please contact No Starch Press, Inc. directly:
No Starch Press, Inc.
38 Ringold Street, San Francisco, CA 94103
phone: 415.863.9900; fax: 415.863.9950; info@nostarch.com; www.nostarch.com
Library of Congress Cataloging-in-Publication Data
Zalewski, Michal.
The tangled Web : a guide to securing modern Web applications / Michal Zalewski.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-1-59327-388-0 (pbk.)
ISBN-10: 1-59327-388-6 (pbk.)
1. Computer networks--Security measures. 2. Browsers (Computer programs) 3. Computer security. I. Title.
TK5105.59.Z354 2011
005.8--dc23
2011039636

No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. “The Book of” is
a trademark of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks
of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we
are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of
infringement of the trademark.
The information in this book is distributed on an “As Is” basis, without warranty. While every precaution has been
taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any
person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the
information contained in it.

For my son

BRIEF CONTENTS

Preface .......................................................................................................................xvii
Chapter 1: Security in the World of Web Applications ........................................................1

PART I: ANATOMY OF THE WEB ............................................................................ 21
Chapter 2: It Starts with a URL ........................................................................................23
Chapter 3: Hypertext Transfer Protocol ............................................................................41
Chapter 4: Hypertext Markup Language ......................................................................... 69
Chapter 5: Cascading Style Sheets .................................................................................87
Chapter 6: Browser-Side Scripts ......................................................................................95
Chapter 7: Non-HTML Document Types .........................................................................117
Chapter 8: Content Rendering with Browser Plug-ins........................................................127

PART II: BROWSER SECURITY FEATURES ............................................................... 139
Chapter 9: Content Isolation Logic ................................................................................141
Chapter 10: Origin Inheritance.....................................................................................165
Chapter 11: Life Outside Same-Origin Rules ...................................................................173
Chapter 12: Other Security Boundaries .........................................................................187

Chapter 13: Content Recognition Mechanisms................................................................197
Chapter 14: Dealing with Rogue Scripts ........................................................................213
Chapter 15: Extrinsic Site Privileges ..............................................................................225

PART III: A GLIMPSE OF THINGS TO COME ........................................................... 233
Chapter 16: New and Upcoming Security Features .........................................................235
Chapter 17: Other Browser Mechanisms of Note ............................................................255
Chapter 18: Common Web Vulnerabilities.....................................................................261
Epilogue ....................................................................................................................267
Notes ........................................................................................................................269
Index .........................................................................................................................283

CONTENTS IN DETAIL
PREFACE

xvii

Acknowledgments ................................................................................................... xix

1
SECURITY IN THE WORLD OF WEB APPLICATIONS

1

Information Security in a Nutshell ................................................................................ 1
Flirting with Formal Solutions ......................................................................... 2
Enter Risk Management................................................................................. 4
Enlightenment Through Taxonomy .................................................................. 6
Toward Practical Approaches ........................................................................ 7
A Brief History of the Web ......................................................................................... 8
Tales of the Stone Age: 1945 to 1994 ........................................................... 8
The First Browser Wars: 1995 to 1999 ........................................................ 10
The Boring Period: 2000 to 2003 ................................................................ 11
Web 2.0 and the Second Browser Wars: 2004 and Beyond .......................... 12
The Evolution of a Threat.......................................................................................... 14
The User as a Security Flaw......................................................................... 14
The Cloud, or the Joys of Communal Living.................................................... 15
Nonconvergence of Visions ......................................................................... 15
Cross-Browser Interactions: Synergy in Failure ............................................... 16
The Breakdown of the Client-Server Divide .................................................... 17

PART I: ANATOMY OF THE WEB

21

2
IT STARTS WITH A URL

23

Uniform Resource Locator Structure............................................................................ 24
Scheme Name ........................................................................................... 24
Indicator of a Hierarchical URL .................................................................... 25
Credentials to Access the Resource............................................................... 26
Server Address .......................................................................................... 26
Server Port ................................................................................................ 27
Hierarchical File Path.................................................................................. 27
Query String.............................................................................................. 28
Fragment ID............................................................................................... 28
Putting It All Together Again ........................................................................ 29
Reserved Characters and Percent Encoding ................................................................ 31
Handling of Non-US-ASCII Text.................................................................... 32
Common URL Schemes and Their Function.................................................................. 36
Browser-Supported, Document-Fetching Protocols ........................................... 36
Protocols Claimed by Third-Party Applications and Plug-ins.............................. 36
Nonencapsulating Pseudo-Protocols.............................................................. 37
Encapsulating Pseudo-Protocols .................................................................... 37
Closing Note on Scheme Detection .............................................................. 38

Resolution of Relative URLs ....................................................................................... 38
Security Engineering Cheat Sheet.............................................................................. 40
When Constructing Brand-New URLs Based on User Input ............................... 40
When Designing URL Input Filters ................................................................. 40
When Decoding Parameters Received Through URLs ...................................... 40

3
HYPERTEXT TRANSFER PROTOCOL

41

Basic Syntax of HTTP Traffic ..................................................................................... 42
The Consequences of Supporting HTTP/0.9 .................................................. 44
Newline Handling Quirks............................................................................ 45
Proxy Requests........................................................................................... 46
Resolution of Duplicate or Conflicting Headers............................................... 47
Semicolon-Delimited Header Values.............................................................. 48
Header Character Set and Encoding Schemes ............................................... 49
Referer Header Behavior ............................................................................. 51
HTTP Request Types ................................................................................................. 52
GET.......................................................................................................... 52
POST ........................................................................................................ 52
HEAD ....................................................................................................... 53
OPTIONS.................................................................................................. 53
PUT .......................................................................................................... 53
DELETE ..................................................................................................... 53
TRACE ...................................................................................................... 53
CONNECT ............................................................................................... 54
Other HTTP Methods .................................................................................. 54
Server Response Codes............................................................................................ 54
200–299: Success ..................................................................................... 54
300–399: Redirection and Other Status Messages......................................... 55
400–499: Client-Side Error ......................................................................... 55
500–599: Server-Side Error ........................................................................ 56
Consistency of HTTP Code Signaling ............................................................ 56
Keepalive Sessions .................................................................................................. 56
Chunked Data Transfers ........................................................................................... 57
Caching Behavior ................................................................................................... 58
HTTP Cookie Semantics............................................................................................ 60
HTTP Authentication................................................................................................. 62
Protocol-Level Encryption and Client Certificates .......................................................... 64
Extended Validation Certificates................................................................... 65
Error-Handling Rules ................................................................................... 65
Security Engineering Cheat Sheet.............................................................................. 67
When Handling User-Controlled Filenames in Content-Disposition Headers ....... 67
When Putting User Data in HTTP Cookies...................................................... 67
When Sending User-Controlled Location Headers .......................................... 67
When Sending User-Controlled Redirect Headers........................................... 67
When Constructing Other Types of User-Controlled Requests or Responses........ 67

4
HYPERTEXT MARKUP LANGUAGE

69

Basic Concepts Behind HTML Documents ................................................................... 70
Document Parsing Modes............................................................................ 71
The Battle over Semantics ............................................................................ 72
Understanding HTML Parser Behavior ........................................................................ 73
Interactions Between Multiple Tags ............................................................... 74
Explicit and Implicit Conditionals.................................................................. 75
HTML Parsing Survival Tips.......................................................................... 76
Entity Encoding ....................................................................................................... 76
HTTP/HTML Integration Semantics............................................................................. 78
Hyperlinking and Content Inclusion ........................................................................... 79
Plain Links ................................................................................................. 79
Forms and Form-Triggered Requests.............................................................. 80
Frames...................................................................................................... 82
Type-Specific Content Inclusion .................................................................... 82
A Note on Cross-Site Request Forgery........................................................... 84
Security Engineering Cheat Sheet.............................................................................. 85
Good Engineering Hygiene for All HTML Documents ...................................... 85
When Generating HTML Documents with Attacker-Controlled Bits .................... 85
When Converting HTML to Plaintext ............................................................. 85
When Writing a Markup Filter for User Content ............................................. 86

5
CASCADING STYLE SHEETS

87

Basic CSS Syntax .................................................................................................... 88
Property Definitions .................................................................................... 89
@ Directives and XBL Bindings ..................................................................... 89
Interactions with HTML ................................................................................ 90
Parser Resynchronization Risks.................................................................................. 90
Character Encoding................................................................................................. 91
Security Engineering Cheat Sheet.............................................................................. 93
When Loading Remote Stylesheets ............................................................... 93
When Putting Attacker-Controlled Values into CSS ......................................... 93
When Filtering User-Supplied CSS................................................................ 93
When Allowing User-Specified Class Values on HTML Markup ........................ 93

6
BROWSER-SIDE SCRIPTS

95

Basic Characteristics of JavaScript............................................................................. 96
Script Processing Model .............................................................................. 97
Execution Ordering Control ....................................................................... 100
Code and Object Inspection Capabilities .................................................... 101
Modifying the Runtime Environment ............................................................ 102
JavaScript Object Notation and Other Data Serializations ............................ 104
E4X and Other Syntax Extensions............................................................... 106

Standard Object Hierarchy .................................................................................... 107
The Document Object Model ..................................................................... 109
Access to Other Documents ....................................................................... 111
Script Character Encoding...................................................................................... 112
Code Inclusion Modes and Nesting Risks ................................................................. 113
The Living Dead: Visual Basic ................................................................................. 114
Security Engineering Cheat Sheet............................................................................ 115
When Loading Remote Scripts ................................................................... 115
When Parsing JSON Received from the Server ............................................ 115
When Putting User-Supplied Data Inside JavaScript Blocks ............................ 115
When Interacting with Browser Objects on the Client Side ............................ 115
If You Want to Allow User-Controlled Scripts on Your Page ........................... 116

7
NON-HTML DOCUMENT TYPES

117

Plaintext Files ........................................................................................................ 117
Bitmap Images ...................................................................................................... 118
Audio and Video .................................................................................................. 119
XML-Based Documents ........................................................................................... 119
Generic XML View ................................................................................... 120
Scalable Vector Graphics.......................................................................... 121
Mathematical Markup Language................................................................ 122
XML User Interface Language..................................................................... 122
Wireless Markup Language....................................................................... 123
RSS and Atom Feeds ................................................................................ 123
A Note on Nonrenderable File Types ...................................................................... 124
Security Engineering Cheat Sheet............................................................................ 125
When Hosting XML-Based Document Formats .............................................. 125
On All Non-HTML Document Types............................................................. 125

8
CONTENT RENDERING WITH BROWSER PLUG-INS

127

Invoking a Plug-in.................................................................................................. 128
The Perils of Plug-in Content-Type Handling ................................................. 129
Document Rendering Helpers.................................................................................. 130
Plug-in-Based Application Frameworks ..................................................................... 131
Adobe Flash ............................................................................................ 132
Microsoft Silverlight .................................................................................. 134
Sun Java ................................................................................................. 134
XML Browser Applications (XBAP) .............................................................. 135
ActiveX Controls.................................................................................................... 136
Living with Other Plug-ins ....................................................................................... 137
Security Engineering Cheat Sheet............................................................................ 138
When Serving Plug-in-Handled Files ........................................................... 138
When Embedding Plug-in-Handled Files ...................................................... 138
If You Want to Write a New Browser Plug-in or ActiveX Component .............. 138

PART II: BROWSER SECURITY FEATURES

139

9
CONTENT ISOLATION LOGIC

141

Same-Origin Policy for the Document Object Model .................................................. 142
document.domain .................................................................................... 143
postMessage(...) ...................................................................................... 144
Interactions with Browser Credentials.......................................................... 145
Same-Origin Policy for XMLHttpRequest ................................................................... 146
Same-Origin Policy for Web Storage ....................................................................... 148
Security Policy for Cookies ..................................................................................... 149
Impact of Cookies on the Same-Origin Policy.............................................. 150
Problems with Domain Restrictions.............................................................. 151
The Unusual Danger of “localhost” ............................................................. 152
Cookies and “Legitimate” DNS Hijacking.................................................... 153
Plug-in Security Rules ............................................................................................. 153
Adobe Flash ............................................................................................ 154
Microsoft Silverlight .................................................................................. 157
Java ....................................................................................................... 157
Coping with Ambiguous or Unexpected Origins ....................................................... 158
IP Addresses ............................................................................................ 158
Hostnames with Extra Periods .................................................................... 159
Non–Fully Qualified Hostnames ................................................................. 159
Local Files ............................................................................................... 159
Pseudo-URLs ............................................................................................ 161
Browser Extensions and UI ........................................................................ 161
Other Uses of Origins ............................................................................................ 161
Security Engineering Cheat Sheet............................................................................ 162
Good Security Policy Hygiene for All Websites ............................................ 162
When Relying on HTTP Cookies for Authentication ....................................... 162
When Arranging Cross-Domain Communications in JavaScript ...................... 162
When Embedding Plug-in-Handled Active Content from Third Parties .............. 162
When Hosting Your Own Plug-in-Executed Content....................................... 163
When Writing Browser Extensions ............................................................. 163

10
ORIGIN INHERITANCE

165

Origin Inheritance for about:blank .......................................................................... 166
Inheritance for data: URLs....................................................................................... 167
Inheritance for javascript: and vbscript: URLs ............................................................ 169
A Note on Restricted Pseudo-URLs ........................................................................... 170
Security Engineering Cheat Sheet............................................................................ 172

11
LIFE OUTSIDE SAME-ORIGIN RULES

173

Window and Frame Interactions ............................................................................. 174
Changing the Location of Existing Documents .............................................. 174
Unsolicited Framing.................................................................................. 178
Cross-Domain Content Inclusion .............................................................................. 181
A Note on Cross-Origin Subresources......................................................... 183
Privacy-Related Side Channels ................................................................................ 184
Other SOP Loopholes and Their Uses ...................................................................... 185
Security Engineering Cheat Sheet............................................................................ 186
Good Security Hygiene for All Websites ..................................................... 186
When Including Cross-Domain Resources .................................................... 186
When Arranging Cross-Domain Communications in JavaScript ...................... 186

12
OTHER SECURITY BOUNDARIES

187

Navigation to Sensitive Schemes............................................................................. 188
Access to Internal Networks.................................................................................... 189
Prohibited Ports ..................................................................................................... 190
Limitations on Third-Party Cookies............................................................................ 192
Security Engineering Cheat Sheet............................................................................ 195
When Building Web Applications on Internal Networks................................ 195
When Launching Non-HTTP Services, Particularly on Nonstandard Ports ......... 195
When Using Third-Party Cookies for Gadgets or Sandboxed Content ............. 195

13
CONTENT RECOGNITION MECHANISMS

197

Document Type Detection Logic............................................................................... 198
Malformed MIME Types ............................................................................ 199
Special Content-Type Values...................................................................... 200
Unrecognized Content Type ...................................................................... 202
Defensive Uses of Content-Disposition ......................................................... 203
Content Directives on Subresources ............................................................ 204
Downloaded Files and Other Non-HTTP Content ......................................... 205
Character Set Handling ......................................................................................... 206
Byte Order Marks .................................................................................... 208
Character Set Inheritance and Override ...................................................... 209
Markup-Controlled Charset on Subresources................................................ 209
Detection for Non-HTTP Files...................................................................... 210
Security Engineering Cheat Sheet............................................................................ 212
Good Security Practices for All Websites..................................................... 212
When Generating Documents with Partly Attacker-Controlled Contents ........... 212
When Hosting User-Generated Files ........................................................... 212

14
DEALING WITH ROGUE SCRIPTS

213

Denial-of-Service Attacks ........................................................................................ 214
Execution Time and Memory Use Restrictions ............................................... 215
Connection Limits ..................................................................................... 216
Pop-Up Filtering ....................................................................................... 217
Dialog Use Restrictions.............................................................................. 218
Window-Positioning and Appearance Problems ........................................................ 219
Timing Attacks on User Interfaces ............................................................................ 222

Security Engineering Cheat Sheet............................................................................ 224
When Permitting User-Created 

In traditional HTML documents, this tag puts the parser in one of the
special parsing modes, and all text between the opening and the closing tag
will simply be ignored in frame-aware browsers. In legacy browsers that do
not understand 


NOTE In the HTML markup provided in this example, and when creating new windows or
frames in general, about:blank can be omitted. Browsers default to this value when no
other URL is specified by the creator of the parent document.

In every browser, most types of navigation to about:blank result in the creation of a new document that inherits its SOP origin from the page that initiated the navigation. The inherited origin is reflected in the document.domain
property of the new JavaScript execution context, and DOM access to or
from any other origins is not permitted.
This simple formula holds true for navigation actions such as clicking a
link, submitting a form, creating a new frame or a window from a script, or
programmatically navigating an existing document. That said, there are exceptions, the most notable of which are several special, user-controlled navigation
methods. These include manually entering about:blank in the address bar, following a bookmark, or performing a gesture reserved for opening a link in a
new window or a tab.* These actions will result in a document that occupies a
unique synthetic origin and that can’t be accessed by any other page.
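
The rules in the two preceding paragraphs can be summarized in a short model. This is only a sketch of the behavior as described in the text, not browser code: the function name aboutBlankOrigin, the navigation labels, and the "unique:" origin strings are all hypothetical, and redirect-based navigation is deliberately left out.

```javascript
// Hypothetical model of the about:blank rules described above; not a browser API.
// The navigation labels are illustrative names, not real DOM constants.
const INHERITING_NAVIGATIONS = new Set([
  "link-click",
  "form-submit",
  "script-new-window",
  "script-new-frame",
  "script-navigate-existing",
]);

const USER_CONTROLLED_NAVIGATIONS = new Set([
  "address-bar-entry",
  "bookmark",
  "open-in-new-window-gesture",
]);

let uniqueCounter = 0;

function aboutBlankOrigin(navigationType, initiatorOrigin) {
  if (INHERITING_NAVIGATIONS.has(navigationType)) {
    // The new document inherits its origin from the initiating page.
    return initiatorOrigin;
  }
  if (USER_CONTROLLED_NAVIGATIONS.has(navigationType)) {
    // Each such document gets its own synthetic origin that matches nothing else.
    uniqueCounter += 1;
    return "unique:" + uniqueCounter;
  }
  throw new Error("navigation type not covered by this sketch: " + navigationType);
}
```

Under these assumptions, two about:blank documents opened from bookmarks never share an origin with each other or with the page that was previously displayed, whereas a script-created frame shares the origin of its creator.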
Another special case is the loading of a normal server-supplied document that subsequently redirects to about:blank using Location or Refresh. In
Firefox and WebKit-based browsers, such redirection results in a unique, nonaccessible origin, similar to the scenario outlined in the previous paragraph.
In Internet Explorer, on the other hand, the resulting document will be
* This is usually accomplished by holding CTRL or SHIFT while clicking on a link, or by right-clicking the mouse to access a contextual menu and then selecting the appropriate option.

166

Chapter 10

accessible by the parent page if the redirection occurs inside an 

In this scenario, there is no compelling reason for a data: URL to behave
differently than about:blank. In reality, however, it will behave differently in
some browsers and therefore must be used with care.


WebKit browsers: In Chrome and Safari, all data: documents are given a
unique, nonaccessible origin and do not inherit from the parent at all.

Firefox: In Firefox, the origin for data: documents is inherited from the
navigating context, similar to about:blank. However, unlike with about:blank,
manually entering data: URLs or opening bookmarked ones results in
the new document inheriting origin from the page on which the navigation occurred.

Opera: As of this writing, a shared “empty” origin is used for all data:
URLs, which is accessible by the parent document. This approach is
unsafe, as it may allow cross-domain access to frames created by unrelated pages, as shown in Figure 10-1. (I reported this behavior to Opera,
and it likely will be amended soon.)

Internet Explorer: data: URLs are not supported in Internet Explorer
versions prior to 8. The scheme is supported only for select types of subresources in Internet Explorer 8 and 9 and can’t be used for navigation.

Table 10-2 summarizes the current behavior of data: URLs.

Table 10-2: Origin Inheritance for data: URLs

Type of navigation (columns): New page | Existing non-same-origin page | Location redirect | Refresh redirect | URL entry or gesture
(Cells that span several columns appear once, in column order.)

Internet Explorer 6/7: (Not supported)
Internet Explorer 8/9: (Not supported for navigation)
Firefox: Inherited from caller | Unique origin | Inherited from previous page
All WebKit: Unique origin | (Denied) | Unique origin | Unique origin
Opera: Shared origin (This is a bug!) | (Denied) | Inherited from parent

Figure 10-1: Access between data: URLs in Opera
(Figure labels: top-level document fuzzybunnies.com; frame data:text/html,...; frame bunnyoutlet.com; frame data:text/html,...; "Cross-domain DOM access possible.")

Inheritance for javascript: and vbscript: URLs
Scripting-related pseudo-URLs, such as javascript:, are a very curious mechanism. Using them to load some types of subresources will lead to code execution in the context of the document that attempts to load such a subresource
(subject to some inconsistent restrictions, as discussed in Chapter 4). An
example of this may be


More interestingly (and far less obviously) than the creation of new
subresources, navigating existing windows or frames to javascript: URLs will
cause the inlined JavaScript code to execute in the context of the navigated
page (and not the navigating document!)—even if the URL is entered manually or loaded from a bookmark.
Given this behavior, it is obviously very unsafe to allow one document
to navigate any other non-same-origin context to a javascript: URL, as it
would enable the circumvention of all other content-isolation mechanisms: Just load fuzzybunnies.com in a frame, and then navigate that frame
to javascript:do_evil_stuff() and call it a day. Consequently, such navigation
is prohibited in all browsers except for Firefox. Firefox appears to permit it
for some reason, but it changes the semantics in a sneaky way. When the
origin of the caller and the navigation target do not match, it executes the
javascript: payload in a special null origin, which lacks its own DOM or any of
the browser-supplied I/O functions registered (thus permitting only purely
algorithmic operations to occur).
Origin Inheritance

169

The cross-origin case is dangerous, but its same-origin equivalent is not:
Within a single origin, any content is free to navigate itself or its peers to
javascript: URLs of its own volition. In this case, the javascript: scheme is honored when following links, submitting forms, calling location.assign(...), and
so on. In WebKit and Opera, Refresh redirection to javascript: will work as well;
other browsers reject such navigation due to vague and probably misplaced
script-injection concerns.
The handling of scripting URLs is outlined in Table 10-3.
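Table 10-3: Origin Inheritance for Scripting URLs

Browser | New page | Existing same-origin page | Existing non-same-origin page | Location redirect | Refresh redirect | URL entry or gesture
Internet Explorer | Inherited from caller | Inherited from navigated page | (Denied) | (Denied) | (Denied) | Inherited from navigated page
Firefox | Inherited from caller | Inherited from navigated page | Null context | (Denied) | (Denied) | Inherited from navigated page
All WebKit | Inherited from caller | Inherited from navigated page | (Denied) | (Denied) | Inherited from navigated page | Inherited from navigated page
Opera | Inherited from caller | Inherited from navigated page | (Denied) | (Denied) | Inherited from navigated page | Inherited from navigated page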
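
The contents of Table 10-3 can also be written down as a small lookup, which makes the per-browser differences easier to spot. This is a sketch only: the browser and navigation keys are hypothetical labels, and the assumption that the shared outcomes apply to every browser is read from the table together with the surrounding prose.

```javascript
// Hypothetical encoding of Table 10-3; the keys are illustrative labels,
// not identifiers from any real API.
const DENIED = "(Denied)";
const FROM_CALLER = "Inherited from caller";
const FROM_NAVIGATED = "Inherited from navigated page";

// Outcomes that the table lists as identical in every browser.
const COMMON_OUTCOMES = {
  "new-page": FROM_CALLER,
  "existing-same-origin": FROM_NAVIGATED,
  "location-redirect": DENIED,
  "url-entry-or-gesture": FROM_NAVIGATED,
};

// Outcomes that differ from browser to browser.
const BROWSER_OUTCOMES = {
  "internet-explorer": { "existing-non-same-origin": DENIED, "refresh-redirect": DENIED },
  "firefox": { "existing-non-same-origin": "Null context", "refresh-redirect": DENIED },
  "webkit": { "existing-non-same-origin": DENIED, "refresh-redirect": FROM_NAVIGATED },
  "opera": { "existing-non-same-origin": DENIED, "refresh-redirect": FROM_NAVIGATED },
};

function scriptingUrlOutcome(browser, navigation) {
  const specific = BROWSER_OUTCOMES[browser];
  if (!specific) {
    throw new Error("unknown browser: " + browser);
  }
  const outcome = specific[navigation] ?? COMMON_OUTCOMES[navigation];
  if (outcome === undefined) {
    throw new Error("unknown navigation type: " + navigation);
  }
  return outcome;
}
```

Only two columns vary: navigation of an existing non-same-origin page (where Firefox alone substitutes a null context) and Refresh redirection (honored only by WebKit and Opera).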

On top of these fascinating semantics, there is yet another twist unique
to the javascript: scheme: In some cases, the handling of such script-containing
URLs involves a second step. Specifically, if the supplied code evaluates properly, and the value of the last statement is nonvoid and can be converted to a
string, this string will be interpreted as an HTML document and will replace
the navigated page (inheriting origin from the caller). The logic governing
this curious behavior is very similar to that influencing the behavior of data:
URLs. An example of such a document-replacing expression is this:
javascript:"2 + 2 = " + (2+2) + ""
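
The second step can be sketched as follows. This is a hedged model of the decision logic only, runnable outside a browser: the function name evaluateJavascriptUrl and the shape of its return value are hypothetical, the code is evaluated in the global scope rather than in a navigated page, and only an undefined result is treated as void.

```javascript
// Hypothetical model of the two-step javascript: URL handling described above.
// A real browser evaluates the code in the navigated page and performs the
// document replacement itself; this sketch only reports what would happen.
function evaluateJavascriptUrl(url) {
  const prefix = "javascript:";
  if (!url.toLowerCase().startsWith(prefix)) {
    throw new Error("not a javascript: URL");
  }
  const code = url.slice(prefix.length);

  // Indirect eval runs the code in the global scope and returns the value of
  // the last statement, which is what the second step inspects.
  const value = (0, eval)(code);

  if (value === undefined) {
    // Void result: the current document stays in place.
    return { replacesDocument: false, html: null };
  }
  try {
    // A result that converts to a string becomes the new HTML document.
    return { replacesDocument: true, html: String(value) };
  } catch (conversionError) {
    return { replacesDocument: false, html: null };
  }
}

const replaced = evaluateJavascriptUrl('javascript:"2 + 2 = " + (2+2) + ""');
const kept = evaluateJavascriptUrl("javascript:void(0)");
console.log(replaced.html);          // prints: 2 + 2 = 4
console.log(kept.replacesDocument);  // prints: false
```

The void(0) case illustrates why that idiom is commonly appended to javascript: links: it forces a void result so that the page is not replaced.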

A Note on Restricted Pseudo-URLs
The somewhat quirky behaviors of the three aforementioned classes of
URLs (about:blank, javascript:, and data:) are all that most websites need to
be concerned with. Nevertheless, browsers use a range of other documents
with no inherent, clearly defined origin (e.g., about:config in Firefox, a privileged JavaScript page that can be used to tweak the browser’s various under-the-hood settings, or chrome://downloads in Chrome, which lists the recently
downloaded documents with links to open any of them). These documents
are a continued source of security problems, even if they are not reachable
directly from the Internet.

Because of the incompatibility of these URLs with the boundaries controlled by the same-origin policy, special care must be taken to make sure
that these URLs are sufficiently isolated from other content whenever they
are loaded in the browser as a result of user action or some other indirect
browser-level process. An interesting case illustrating the risk is a 2010 bug
in the way Firefox handled about:neterror.2 Whenever Firefox can’t correctly
retrieve a document from a remote server (a condition that is usually easy
to trigger with a carefully crafted link), it puts the destination URL in the
address bar but loads about:neterror in place of the document body. Unfortunately, due to a minor oversight, this special error page would be same-origin
with any about:blank document opened by any Internet-originating content,
thereby permitting the attacker to inject arbitrary content into the
about:neterror window while preserving the displayed destination URL.
The moral of this story? Avoid the urge to gamble with the same-origin
policy; instead, play along with it. Note that making about:neterror a hierarchical URL, instead of trying to keep track of synthetic origins, would have prevented the bug.


Security Engineering Cheat Sheet
Because of their incompatibility with the same-origin policy, data:, javascript:, and implicit
or explicit about:blank URLs should be used with care. When performance is not critical, it is
preferable to seed new frames and windows by pointing them to a server-supplied blank document with a definite origin first.
Keep in mind that data: and javascript: URLs are not a drop-in replacement for about:blank,
and they should be used only when absolutely necessary. In particular, it is currently unsafe to
assume that data: windows can’t be accessed across domains.


LIFE OUTSIDE
SAME-ORIGIN RULES

The same-origin policy is the most important mechanism we have to keep hostile web applications at bay,
but it’s also an imperfect one. Although it is meant to
offer a robust degree of separation between any two
different and clearly identifiable content sources, it
often fails at this task.
To understand this disconnect, recall that contrary to what common
sense may imply, the same-origin policy was never meant to be all-inclusive.
Its initial focus, the DOM hierarchy (that is, just the document object exposed
to JavaScript code), left many of the peripheral JavaScript features completely
exposed to cross-domain manipulation, necessitating ad hoc fixes. For example, a few years after the inception of SOP, vendors realized that allowing third-party documents to tweak the location.host property of an unrelated window is
a bad idea and that such an operation could send potentially sensitive data
present in other URL segments to an attacker-specified site. The policy has
subsequently been extended to at least partly protect this and a couple of
other sensitive objects, but in some less clear-cut cases, awkward loopholes
remain.
The other problem is that many cross-domain interactions happen
completely outside of JavaScript and its object hierarchy. Actions such as
loading third-party images or stylesheets are deeply rooted in the design of
HTML and do not depend on scripting in any meaningful way. (In principle,
it would be possible to retrofit them with origin-based security controls, but
doing so would interfere with existing websites. Plus, some think that such a
decision would go against the design principles that made the Web what it is;
they believe that the ability to freely cross-reference content should not be
infringed upon.)
In light of this, it seems prudent to explore the boundaries of the same-origin policy and learn about the rich life that web applications can lead outside its confines. We begin with document navigation, a mechanism that at
first seems strikingly simple but that is really anything but.

Window and Frame Interactions
On the Web, the ability to steer the browser from one website to another
is taken for granted. Some of the common methods of achieving such navigation are discussed throughout Part I of this book; the most notable of
these are HTML links, forms, and frames; HTTP redirects; and JavaScript
window.open(...) and location.* calls.
Actions such as pointing a newly opened window to an off-domain URL
or specifying the src parameter of a frame are intuitive and require no further review. But when we look at the ability of one page to navigate another,
existing document—well, the reign of intuition comes to a sudden end.

Changing the Location of Existing Documents
In the simple days before the advent of HTML frames, only one document
could occupy a given browser window, and only that single window would be
under the document’s control. Frames changed this paradigm, however, permitting several different and completely separate documents to be spliced
into a single logical view, coexisting within a common region of the screen.
The introduction of the mechanism also necessitated another step: To sanely
implement certain frame-based websites, any of the component documents
displayed in a window needed the ability to navigate its neighboring frames
or perhaps the top-level document itself. (For example, imagine a two-frame
page with a table of contents on the left and the actual chapter on the right.
Clicking a chapter name in the left pane should navigate the chapter in the
right pane, and nothing else.)
The mechanism devised for this last purpose is fairly simple: One can
specify the target parameter on links or forms, or provide the
name of a window to the JavaScript method known as window.open(...), in

174

Chapter 11

order to navigate any other, previously named document view. In the mid-1990s, when this functionality first debuted, there seemed to be no need to
incorporate any particular security checks into this logic; any page could navigate any other named window or a frame displayed by the browser to a new
location at will.
To understand the consequences of this design, it is important to pause
for a moment and examine the circumstances under which a particular document may obtain a name to begin with. For frames, the story is simple: In
order to reference a frame easily on the embedding page, virtually all frames
have a name attribute (and some browsers, such as Chrome, also look at id).
Browser windows, on the other hand, are typically anonymous (that is, their
window.name property is an empty string), unless created programmatically;
in the latter case, the name is specified by whoever creates the view. Anonymous windows do not necessarily stay anonymous, however. If a rogue application is displayed in such a window even briefly, it may set the window.name
property to any value, and this effect will persist.
The aforementioned ability to target windows and frames by name is not
the only way to navigate them; JavaScript programs that hold window handles
pointing to other documents may directly invoke certain DOM methods without knowing the name of their target at all. Attacker-supplied code will not
normally hold handles to completely unrelated windows, but it can traverse
properties such as opener, top, parent, or frames[] in order to locate even distant
relatives within the same navigation flow. An example of such a far-reaching
lookup (and subsequently, navigation) is
opener.opener.frames[2].location.assign("http://www.bunnyoutlet.com/");

These two lookup techniques are not mutually exclusive: JavaScript
programs can first obtain the handle of an unrelated but named window
through window.open(...) and then traverse the opener or frames[] properties
of that context in order to reach its interesting relatives nearby.
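
The reach of such traversal can be illustrated with a small model that runs outside a browser. The window objects here are hypothetical stand-ins that carry only the properties named in the text (opener, parent, top, and frames), and the scenario assumes that the fuzzybunnies.com window opened the window now showing bunnyoutlet.com.

```javascript
// Hypothetical stand-ins for window objects; only the properties named in the
// text (opener, parent, top, frames) are modeled.
function makeWindow(label) {
  const win = { label, opener: null, frames: [] };
  win.parent = win; // a top-level window is its own parent and top
  win.top = win;
  return win;
}

function addFrame(parentWin, label) {
  const frame = makeWindow(label);
  frame.parent = parentWin;
  frame.top = parentWin.top;
  parentWin.frames.push(frame);
  return frame;
}

// Walks the opener, parent, top, and frames[] links breadth-first and returns
// the labels of every window reachable from the starting handle.
function reachableWindows(start) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const win = queue.shift();
    const neighbors = [win.opener, win.parent, win.top, ...win.frames];
    for (const next of neighbors) {
      if (next && !seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return [...seen].map((win) => win.label);
}

const bank = makeWindow("fuzzybunnies.com");
const login = addFrame(bank, "login.fuzzybunnies.com");
const popup = makeWindow("bunnyoutlet.com");
popup.opener = bank;

const reachable = reachableWindows(popup);
console.log(reachable.join(", "));
// prints: bunnyoutlet.com, fuzzybunnies.com, login.fuzzybunnies.com
```

Starting from nothing but its own window, the script in the second window reaches the login frame through opener and frames[], without ever knowing that frame's name.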
Once a suitable handle is looked up in any fashion, the originating context can leverage one of several DOM methods and properties in order to
change the address of the document displayed in that view. In every contemporary browser, calling the handle’s location.replace(...) method, or assigning a
value to its location or location.href properties, should do the
trick. Amusingly, due to random implementation quirks, other theoretically
equivalent approaches (such as invoking location.assign(...) or
window.open(..., "_self") through the same handle) may be hit-and-miss.
Okay, so it may be possible to navigate unrelated documents to new
locations—but let’s see what could possibly go wrong.
Frame Hijacking Risks
The ability for one domain to navigate windows created by other sites, or
ones that are simply no longer same-origin with their creator, is usually not
a grave concern. This laid-back design may be an annoyance and may pose
some minor, speculative phishing risk,* but in the grand scheme of things, it
is neither a very pronounced issue nor a particularly distinctive one. This is,
perhaps, the reason why the original authors of the relevant APIs have not
given the entire mechanism too much thought.
Alas, the concept of HTML frames alters the picture profoundly: Any
application that relies on frames to build a trusted user interface is at an obvious risk if an unrelated site is permitted to hijack such UI elements without
leaving any trace of the attack in the address bar! Figure 11-1 shows one such
plausible attack scenario.
Figure 11-1: A historically permitted, dangerous frame navigation scenario: The window
on the right is opened at the same time as a banking website and is actively subverting it.
(Figure labels: a Bunny Browser 2000 window at https://fuzzybunnies.com, "Welcome to Fuzzy Bunnies Online Banking and BBQ!", with a frame login.fuzzybunnies.com containing Login and Password fields; a second Bunny Browser 2000 window at http://bunnyoutlet.com; annotation "Login frame can be navigated to an attacker-supplied URL.")

Georgi Guninski, one of the pioneering browser security researchers,
realized as early as 1999 that by permitting unconstrained frame navigation,
we were headed for some serious trouble. Following his reports, vendors
attempted to roll out frame navigation restrictions mid-2000.1 Their implementation constrained all cross-frame navigation to the scope of a single
window, preventing malicious web pages from interfering with any other
simultaneously opened browser sessions.
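
The restriction described above amounts to a one-line check, sketched here with hypothetical stand-in objects. The names makeView and canNavigateFrame are illustrative, the sketch covers frame navigation only, and it makes no claim about how any vendor actually implemented the rule.

```javascript
// Hypothetical model of the single-window restriction described above.
// makeView and canNavigateFrame are illustrative names, not browser APIs.
function makeView(label, parentView) {
  const view = { label, parent: null, top: null };
  view.parent = parentView || view;               // top-level views are their own parent
  view.top = parentView ? parentView.top : view;  // frames share the top of their parent
  return view;
}

// A source document may navigate a target frame only if both live in the
// same top-level browser window.
function canNavigateFrame(source, target) {
  return source.top === target.top;
}

const bankWindow = makeView("https://fuzzybunnies.com");
const loginFrame = makeView("login.fuzzybunnies.com", bankWindow);
const attackerWindow = makeView("http://bunnyoutlet.com");

console.log(canNavigateFrame(bankWindow, loginFrame));     // prints: true
console.log(canNavigateFrame(attackerWindow, loginFrame)); // prints: false
```

Note that the check says nothing about which documents inside the same window may navigate one another; any frame sharing the top-level window still passes.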
Surprisingly, even this simple policy proved difficult to implement
correctly. It was only in 2008 that Firefox eliminated this class of problems,2
while Microsoft essentially ignored the problem until 2006. Still, these setbacks aside, we should be fine—right?
Frame Descendant Policy and Cross-Domain Communications
The simple security restriction discussed in the previous section was not,
in fact, enough. The reason was a new class of web applications, sometimes
known as mashups, that combined data from various sources to enable users
to personalize their working environment and process data in innovative ways.
Unfortunately for browser vendors, such web applications frequently relied
on third-party gadgets loaded through